public inbox for [email protected]
Re: generic plans and "initial" pruning
29+ messages / 5 participants
* Re: generic plans and "initial" pruning
@ 2024-08-15 15:34 Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Robert Haas @ 2024-08-15 15:34 UTC
To: Amit Langote <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; Jacob Champion <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Thu, Aug 15, 2024 at 8:57 AM Amit Langote <[email protected]> wrote:
> TBH, it's more of a hunch that people who are not involved in this
> development might find the new reality, whereby the execution is not
> racefree until ExecutorRun(), hard to reason about.
I'm confused by what you mean here by "racefree". A race means
multiple sessions are doing stuff at the same time and the result
depends on who does what first, but the executor stuff is all
backend-private. Heavyweight locks are not backend-private, but those
would be taken in ExecutorStart(), not ExecutorRun(), IIUC.
> With the patch, CreateQueryDesc() and ExecutorStart() are moved to
> PortalStart() so that QueryDescs including the PlanState trees for all
> queries are built before any is run. Why? So that if ExecutorStart()
> fails for any query in the list, we can simply throw out the QueryDesc
> and the PlanState trees of the previous queries (NOT run them) and ask
> plancache for a new CachedPlan for the list of queries. We don't have
> a way to ask plancache.c to replan only a given query in the list.
I agree that moving this from PortalRun() to PortalStart() seems like
a bad idea, especially in view of what you write below.
> * There's no longer CCI() between queries in PortalRunMulti() because
> the snapshots in each query's QueryDesc must have been adjusted to
> reflect the correct command counter. I've checked but can't really be
> sure if the value in the snapshot is all anyone ever uses if they want
> to know the current value of the command counter.
I don't think anything stops somebody wanting to look at the current
value of the command counter. I also don't think you can remove the
CommandCounterIncrement() calls between successive queries, because
then they won't see the effects of earlier calls. So this sounds
broken to me.
Also keep in mind that one of the queries could call a function which
does something that bumps the command counter again. I'm not sure if
that creates its own hazard separate from the lack of CCIs, or
whether it's just another part of that same issue. But you can't
assume that each query's snapshot should have a command counter value
one more than the previous query.
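To make the visibility concern concrete, here is a toy model in Python. It is not PostgreSQL code: ToyTransaction, run_q1_then_q2 and the "row is visible only if its cmin is less than the snapshot's command ID" rule are simplified stand-ins for how a transaction sees its own earlier commands, assumed here purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTransaction:
    """Toy stand-in for one transaction's command counter; not PostgreSQL code."""
    current_cid: int = 0
    rows: list = field(default_factory=list)  # (value, cmin) pairs

    def command_counter_increment(self) -> None:
        self.current_cid += 1

    def insert(self, value: str) -> None:
        # Stamp the row with the command ID that created it.
        self.rows.append((value, self.current_cid))

    def visible(self, curcid: int) -> list:
        # A snapshot sees only rows written by strictly earlier commands.
        return [value for value, cmin in self.rows if cmin < curcid]


def run_q1_then_q2(with_cci: bool, bumps_inside_q1: int = 0):
    """Return (Q1's command ID, Q2's command ID, rows Q2 can see)."""
    txn = ToyTransaction()
    q1_cid = txn.current_cid
    txn.insert("row from Q1")
    for _ in range(bumps_inside_q1):
        # A function called by Q1 runs its own command.
        txn.command_counter_increment()
        txn.insert("row from function")
    if with_cci:
        txn.command_counter_increment()
    q2_cid = txn.current_cid
    return q1_cid, q2_cid, txn.visible(q2_cid)
```

In this model, run_q1_then_q2(with_cci=False) leaves Q2 unable to see Q1's row, and run_q1_then_q2(with_cci=True, bumps_inside_q1=2) gives Q2 a command ID three higher than Q1's, so a snapshot pre-stamped with "Q1's value plus one" would miss the rows written by the function.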
While this all seems bad for the partially-initialized-execution-tree
approach, I wonder if you don't have problems here with the other
design, too. Let's say you've the multi-query case and there are 2
queries. The first one (Q1) is SELECT mysterious_function() and the
second one (Q2) is SELECT * FROM range_partitioned_table WHERE
key_column = 42. What if mysterious_function() performs DDL on
range_partitioned_table? I haven't tested this so maybe there are
things going on here that prevent trouble, but it seems like executing
Q1 can easily invalidate the plan for Q2. And then it seems like
you're basically back to the same problem.
> > > 3. The need to add *back* the fields to store the RT indexes of
> > > relations that are not looked at by ExecInitNode() traversal such as
> > > root partitioned tables and non-leaf partitions.
> >
> > I don't remember exactly why we removed those or what the benefit was,
> > so I'm not sure how big of a problem it is if we have to put them
> > back.
>
> We removed those in commit 52ed730d511b after commit f2343653f5b2
> removed redundant execution-time locking of non-leaf relations. So we
> removed them because we realized that execution time locking is
> unnecessary given that AcquireExecutorLocks() exists and now we want
> to add them back because we'd like to get rid of
> AcquireExecutorLocks(). :-)
My bias is to believe that getting rid of AcquireExecutorLocks() is
probably the right thing to do, but that's not a strongly-held
position and I could be totally wrong about it. The thing is, though,
that AcquireExecutorLocks() is fundamentally stupid, and it's hard to
see how it can ever be any smarter. If we want to make smarter
decisions about what to lock, it seems reasonable to me to think that
the locking code needs to be closer to code that can evaluate
expressions and prune partitions and stuff like that.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
@ 2024-08-16 12:35 ` Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Amit Langote @ 2024-08-16 12:35 UTC
To: Robert Haas <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Fri, Aug 16, 2024 at 12:35 AM Robert Haas <[email protected]> wrote:
> On Thu, Aug 15, 2024 at 8:57 AM Amit Langote <[email protected]> wrote:
> > TBH, it's more of a hunch that people who are not involved in this
> > development might find the new reality, whereby the execution is not
> > racefree until ExecutorRun(), hard to reason about.
>
> I'm confused by what you mean here by "racefree". A race means
> multiple sessions are doing stuff at the same time and the result
> depends on who does what first, but the executor stuff is all
> backend-private. Heavyweight locks are not backend-private, but those
> would be taken in ExecutorStart(), not ExecutorRun(), IIUC.
Sorry, yes, I meant ExecutorStart(). A backend that wants to execute
a plan tree from a CachedPlan is in a race with other backends that
might modify tables before ExecutorStart() takes the remaining locks.
That race window is bigger when it is ExecutorStart() that will take
the locks, and I don't mean in terms of timing, but in terms of the
other code that can run in between GetCachedPlan() returning a
partially valid plan and ExecutorStart() taking the remaining locks,
depending on the calling module.
> > With the patch, CreateQueryDesc() and ExecutorStart() are moved to
> > PortalStart() so that QueryDescs including the PlanState trees for all
> > queries are built before any is run. Why? So that if ExecutorStart()
> > fails for any query in the list, we can simply throw out the QueryDesc
> > and the PlanState trees of the previous queries (NOT run them) and ask
> > plancache for a new CachedPlan for the list of queries. We don't have
> > a way to ask plancache.c to replan only a given query in the list.
>
> I agree that moving this from PortalRun() to PortalStart() seems like
> a bad idea, especially in view of what you write below.
>
> > * There's no longer CCI() between queries in PortalRunMulti() because
> > the snapshots in each query's QueryDesc must have been adjusted to
> > reflect the correct command counter. I've checked but can't really be
> > sure if the value in the snapshot is all anyone ever uses if they want
> > to know the current value of the command counter.
>
> I don't think anything stops somebody wanting to look at the current
> value of the command counter. I also don't think you can remove the
> CommandCounterIncrement() calls between successive queries, because
> then they won't see the effects of earlier calls. So this sounds
> broken to me.
I suppose you mean CCI between "running" (calling ExecutorRun on)
successive queries. Then the patch is indeed broken. If we're to
make that right, the number of CCIs for the multi-query portals will
have to double given the separation of ExecutorStart() and
ExecutorRun() phases.
> Also keep in mind that one of the queries could call a function which
> does something that bumps the command counter again. I'm not sure if
> that creates its own hazard separate from the lack of CCIs, or
> whether it's just another part of that same issue. But you can't
> assume that each query's snapshot should have a command counter value
> one more than the previous query.
>
> While this all seems bad for the partially-initialized-execution-tree
> approach, I wonder if you don't have problems here with the other
> design, too. Let's say you've the multi-query case and there are 2
> queries. The first one (Q1) is SELECT mysterious_function() and the
> second one (Q2) is SELECT * FROM range_partitioned_table WHERE
> key_column = 42. What if mysterious_function() performs DDL on
> range_partitioned_table? I haven't tested this so maybe there are
> things going on here that prevent trouble, but it seems like executing
> Q1 can easily invalidate the plan for Q2. And then it seems like
> you're basically back to the same problem.
A rule (but not views AFAICS) can lead to the multi-query case (there
might be other ways). I tried the following, and, yes, the plan for
the query queued by the rule is broken by the execution of that for
the 1st query:
create table foo (a int);
create table bar (a int);
create or replace function foo_trig_func () returns trigger as $$
begin drop table bar cascade; return new.*; end; $$ language plpgsql;
create trigger foo_trig before insert on foo execute function foo_trig_func();
create rule insert_foo AS ON insert TO foo do also insert into bar
values (new.*);
set plan_cache_mode to force_generic_plan ;
prepare q as insert into foo values (1);
execute q;
NOTICE: drop cascades to rule insert_foo on table foo
ERROR: relation with OID 16418 does not exist
The ERROR comes from trying to run (actually "initialize") the cached
plan for `insert into bar values (new.*);` which is due to the rule.
Though, it doesn't have to be a cached plan for the breakage to
happen. You can see the same error without the prepared statement:
insert into foo values (1);
NOTICE: drop cascades to rule insert_foo on table foo
ERROR: relation with OID 16418 does not exist
Another example:
create or replace function foo_trig_func () returns trigger as $$
begin alter table bar add b int; return new.*; end; $$ language
plpgsql;
execute q;
ERROR: table row type and query-specified row type do not match
DETAIL: Query has too few columns.
insert into foo values (1);
ERROR: table row type and query-specified row type do not match
DETAIL: Query has too few columns.
This time the error occurs in ExecModifyTable(), so when "running" the
plan, but again the code that's throwing the error is just "lazy"
initialization of the ProjectionInfo when inserting into bar.
So it is possible for the executor to try to run a plan that has
become invalid since it was created, so...
> > > > 3. The need to add *back* the fields to store the RT indexes of
> > > > relations that are not looked at by ExecInitNode() traversal such as
> > > > root partitioned tables and non-leaf partitions.
> > >
> > > I don't remember exactly why we removed those or what the benefit was,
> > > so I'm not sure how big of a problem it is if we have to put them
> > > back.
> >
> > We removed those in commit 52ed730d511b after commit f2343653f5b2
> > removed redundant execution-time locking of non-leaf relations. So we
> > removed them because we realized that execution time locking is
> > unnecessary given that AcquireExecutorLocks() exists and now we want
> > to add them back because we'd like to get rid of
> > AcquireExecutorLocks(). :-)
>
> My bias is to believe that getting rid of AcquireExecutorLocks() is
> probably the right thing to do, but that's not a strongly-held
> position and I could be totally wrong about it. The thing is, though,
> that AcquireExecutorLocks() is fundamentally stupid, and it's hard to
> see how it can ever be any smarter. If we want to make smarter
> decisions about what to lock, it seems reasonable to me to think that
> the locking code needs to be closer to code that can evaluate
> expressions and prune partitions and stuff like that.
One perhaps crazy idea [1]:
What if we remove AcquireExecutorLocks() and move the responsibility
of taking the remaining necessary locks into the executor (those on
any inheritance children that are added during planning and thus not
accounted for by AcquirePlannerLocks()), like the patch already does,
but don't make it also check if the plan has become invalid, which it
can't do anyway unless it's from a CachedPlan. That means we instead
let the executor throw any errors that occur when trying to either
initialize or run the plan because of the changes that have occurred to
the objects referenced in the plan, like what is happening in the above
example. If that case is going to be rare anyway, why spend energy on
checking the validity and replanning, especially if that's not an easy
thing to do, as we're finding out? In the above example, we could say
that it's a user error to create a rule like that, so it should not
happen in practice, but when it does, the executor seems to deal with
it correctly by refusing to execute a broken plan. Perhaps it's more
worthwhile to make the executor behave correctly in the face of plan
invalidation than teach the rest of the system to deal with the
executor throwing its hands up when it runs into an invalid plan?
Again, I think this may be a crazy line of thinking but just wanted to
get it out there.
--
Thanks, Amit Langote
[1] I recall Michael Paquier mentioning something like this to me once
when I was describing this patch and thread to him.
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
@ 2024-08-19 16:39 ` Robert Haas <[email protected]>
2024-08-19 16:54 ` Re: generic plans and "initial" pruning Tom Lane <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
0 siblings, 2 replies; 29+ messages in thread
From: Robert Haas @ 2024-08-19 16:39 UTC
To: Amit Langote <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Fri, Aug 16, 2024 at 8:36 AM Amit Langote <[email protected]> wrote:
> So it is possible for the executor to try to run a plan that has
> become invalid since it was created, so...
I'm not sure what the "so what" here is.
> One perhaps crazy idea [1]:
>
> What if we remove AcquireExecutorLocks() and move the responsibility
> of taking the remaining necessary locks into the executor (those on
> any inheritance children that are added during planning and thus not
> accounted for by AcquirePlannerLocks()), like the patch already does,
> but don't make it also check if the plan has become invalid, which it
> can't do anyway unless it's from a CachedPlan. That means we instead
> let the executor throw any errors that occur when trying to either
> initialize or run the plan because of the changes that have occurred to
> the objects referenced in the plan, like what is happening in the above
> example. If that case is going to be rare anyway, why spend energy on
> checking the validity and replanning, especially if that's not an easy
> thing to do, as we're finding out? In the above example, we could say
> that it's a user error to create a rule like that, so it should not
> happen in practice, but when it does, the executor seems to deal with
> it correctly by refusing to execute a broken plan. Perhaps it's more
> worthwhile to make the executor behave correctly in the face of plan
> invalidation than teach the rest of the system to deal with the
> executor throwing its hands up when it runs into an invalid plan?
> Again, I think this may be a crazy line of thinking but just wanted to
> get it out there.
I don't know whether this is crazy or not. I think there are two
issues. One, the set of checks that we have right now might not be
complete, and we might just not have realized that because it happens
infrequently enough that we haven't found all the bugs. If that's so,
then a change like this could be a good thing, because it might force
us to fix stuff we should be fixing anyway. I have a feeling that some
of the checks you hit there were added as bug fixes long after the
code was written originally, so my confidence that we don't have more
bugs isn't especially high.
And two, it matters a lot how frequent the errors will be in practice.
I think we normally try to replan rather than let a stale plan be used
because we want to not fail, because users don't like failure. If the
design you propose here would make failures more (or less) frequent,
then that's a problem (or awesome).
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
@ 2024-08-19 16:54 ` Tom Lane <[email protected]>
2024-08-19 17:38 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
1 sibling, 1 reply; 29+ messages in thread
From: Tom Lane @ 2024-08-19 16:54 UTC
To: Robert Haas <[email protected]>; +Cc: Amit Langote <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>
Robert Haas <[email protected]> writes:
> On Fri, Aug 16, 2024 at 8:36 AM Amit Langote <[email protected]> wrote:
>> So it is possible for the executor to try to run a plan that has
>> become invalid since it was created, so...
> I'm not sure what the "so what" here is.
The fact that there are holes in our protections against that doesn't
make it a good idea to walk away from the protections. That path
leads to crashes and data corruption and unhappy users.
What the examples here are showing is that AcquireExecutorLocks
is incomplete because it only provides defenses against DDL
initiated by other sessions, not by our own session. We have
CheckTableNotInUse but I'm not sure if it could be applied here.
We certainly aren't calling that in anywhere near as systematic
a way as we have for acquiring locks.
Maybe we should rethink the principle that a session's locks
never conflict against itself, although I fear that might be
a nasty can of worms.
Could it work to do CheckTableNotInUse when acquiring an
exclusive table lock? I don't doubt that we'd have to fix some
code paths, but if the damage isn't extensive then that
might offer a more nearly bulletproof approach.
regards, tom lane
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-19 16:54 ` Re: generic plans and "initial" pruning Tom Lane <[email protected]>
@ 2024-08-19 17:38 ` Robert Haas <[email protected]>
2024-08-19 17:52 ` Re: generic plans and "initial" pruning Tom Lane <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Robert Haas @ 2024-08-19 17:38 UTC
To: Tom Lane <[email protected]>; +Cc: Amit Langote <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>
On Mon, Aug 19, 2024 at 12:54 PM Tom Lane <[email protected]> wrote:
> What the examples here are showing is that AcquireExecutorLocks
> is incomplete because it only provides defenses against DDL
> initiated by other sessions, not by our own session. We have
> CheckTableNotInUse but I'm not sure if it could be applied here.
> We certainly aren't calling that in anywhere near as systematic
> a way as we have for acquiring locks.
>
> Maybe we should rethink the principle that a session's locks
> never conflict against itself, although I fear that might be
> a nasty can of worms.
It might not be that bad. It could replace the CheckTableNotInUse()
protections that we have today but maybe cover more cases, and it
could do so without needing any changes to the shared lock manager.
Say every time you start a query you give that query an ID number, and
all locks taken by that query are tagged with that ID number in the
local lock table, and maybe some flags indicating why the lock was
taken. When a new lock acquisition comes along you can say "oh, this
lock was previously taken so that we could do thus-and-so" and then
use that to fail with the appropriate error message. That seems like
it might be more powerful than the refcnt check within
CheckTableNotInUse().
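A rough Python sketch of that bookkeeping, just to show the shape of the idea. Everything in it is hypothetical: ToyLocalLockTable, its method names, the "purpose" string standing in for the flags, and the error text are invented for illustration and do not correspond to anything in the lock manager.

```python
class ToyLocalLockTable:
    """Toy backend-local lock table that remembers which query took each lock."""

    def __init__(self):
        self._holders = {}        # relation name -> list of (query_id, purpose)
        self._next_query_id = 1

    def start_query(self) -> int:
        # Hand out a fresh ID for every query this backend starts.
        query_id = self._next_query_id
        self._next_query_id += 1
        return query_id

    def acquire(self, query_id, relation, purpose, exclusive=False):
        if exclusive:
            # Unlike a bare reference count, we can say who holds it and why.
            for holder_id, why in self._holders.get(relation, []):
                if holder_id != query_id:
                    raise RuntimeError(
                        f'cannot {purpose} on "{relation}": it is in use by '
                        f"query {holder_id} ({why})"
                    )
        self._holders.setdefault(relation, []).append((query_id, purpose))

    def end_query(self, query_id):
        # Drop every entry tagged with the finished query's ID.
        for relation in list(self._holders):
            kept = [h for h in self._holders[relation] if h[0] != query_id]
            if kept:
                self._holders[relation] = kept
            else:
                del self._holders[relation]
```

With this, an outer INSERT that has locked bar for a rule action would cause a nested DROP TABLE bar, running under a different query ID, to fail with a message naming the outer query, and the same DROP would succeed once the outer query has ended.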
But that seems somewhat incidental to what this thread is about. IIUC,
Amit's original design involved having the plan cache call some new
executor function to do partition pruning before lock acquisition, and
then passing that data structure around, including back to the
executor, so that we didn't repeat the pruning we already did, which
would be a bad thing to do not only because it would incur CPU cost
but also because really bad things would happen if we got a different
answer the second time. IIUC, you didn't think that was going to work
out nicely, and suggested instead moving the pruning+locking to
ExecutorStart() time. But now Amit is finding problems with that
approach, because by the time we reach PortalRun() for the
PORTAL_MULTI_QUERY case, it's too late to replan, because we can't ask
the plancache to replan just one query from the list; and if we try to
fix that by moving ExecutorStart() to PortalStart(), then there are
other problems. Do you have a view on what the way forward might be?
This thread has gotten a tad depressing, honestly. All of the opinions
about what we ought to do seem to be based on the firm conviction that
X or Y or Z will not work, rather than on the confidence that A or B
or C will work. Yet I'm inclined to believe this problem is solvable.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-19 16:54 ` Re: generic plans and "initial" pruning Tom Lane <[email protected]>
2024-08-19 17:38 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
@ 2024-08-19 17:52 ` Tom Lane <[email protected]>
2024-08-19 18:20 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Tom Lane @ 2024-08-19 17:52 UTC
To: Robert Haas <[email protected]>; +Cc: Amit Langote <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>
Robert Haas <[email protected]> writes:
> But that seems somewhat incidental to what this thread is about.
Perhaps. But if we're running into issues related to that, it might
be good to set aside the long-term goal for a bit and come up with
a cleaner answer for intra-session locking. That could allow the
pruning problem to be solved more cleanly in turn, and it'd be
an improvement even if not.
> Do you have a view on what the way forward might be?
I'm fresh out of ideas at the moment, other than having a hope that
divide-and-conquer (ie, solving subproblems first) might pay off.
> This thread has gotten a tad depressing, honestly. All of the opinions
> about what we ought to do seem to be based on the firm conviction that
> X or Y or Z will not work, rather than on the confidence that A or B
> or C will work. Yet I'm inclined to believe this problem is solvable.
Yeah. We are working in an extremely not-green field here, which
means it's a lot easier to see pre-existing reasons why X will not
work than to have confidence that it will work. But hey, if this
were easy then we'd have done it already.
regards, tom lane
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-19 16:54 ` Re: generic plans and "initial" pruning Tom Lane <[email protected]>
2024-08-19 17:38 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-19 17:52 ` Re: generic plans and "initial" pruning Tom Lane <[email protected]>
@ 2024-08-19 18:20 ` Robert Haas <[email protected]>
2024-08-20 13:14 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Robert Haas @ 2024-08-19 18:20 UTC
To: Tom Lane <[email protected]>; +Cc: Amit Langote <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>
On Mon, Aug 19, 2024 at 1:52 PM Tom Lane <[email protected]> wrote:
> Robert Haas <[email protected]> writes:
> > But that seems somewhat incidental to what this thread is about.
>
> Perhaps. But if we're running into issues related to that, it might
> be good to set aside the long-term goal for a bit and come up with
> a cleaner answer for intra-session locking. That could allow the
> pruning problem to be solved more cleanly in turn, and it'd be
> an improvement even if not.
Maybe, but the pieces aren't quite coming together for me. Solving
this would mean that if we execute a stale plan, we'd be more likely
to get a good error and less likely to get a bad, nasty-looking
internal error, or a crash. That's good on its own terms, but we don't
really want user queries to produce errors at all, so I don't think
we'd feel any more free to rearrange the order of operations than we
do today.
> > Do you have a view on what the way forward might be?
>
> I'm fresh out of ideas at the moment, other than having a hope that
> divide-and-conquer (ie, solving subproblems first) might pay off.
Fair enough, but why do you think that the original approach of
creating a data structure from within the plan cache mechanism
(probably via a call into some new executor entrypoint) and then
feeding that through to ExecutorRun() time can't work? Is it possible
you latched onto some non-optimal decisions that the early versions of
the patch made, rather than there being a fundamental problem with the
concept?
I actually thought the do-it-at-executorstart-time approach sounded
pretty good, even though we might have to abandon planstate tree
initialization partway through, right up until Amit started talking
about moving ExecutorStart() from PortalRun() to PortalStart(), which
I have a feeling is going to create a bigger problem than we can
solve. I think if we want to save that approach, we should try to
figure out if we can teach the plancache to replan one query from a
list without replanning the others, which seems like it might allow us
to keep the order of major operations unchanged. Otherwise, it makes
sense to me to have another go at the other approach, at least to make
sure we understand clearly why it can't work.
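For illustration only, this is roughly what "replan one query from a list" would mean, as a toy in Python. ToyCachedPlanList, invalidate() and get_plan() are hypothetical names; as noted upthread, plancache.c has no such per-query interface today, and the "planner" here is just a callable supplied by the caller.

```python
class ToyCachedPlanList:
    """Toy cache holding one plan per query, with per-entry validity flags."""

    def __init__(self, queries, planner):
        self.queries = list(queries)
        self.planner = planner
        # Plan everything up front, as a cached plan for a query list would.
        self.plans = [planner(query) for query in self.queries]
        self.valid = [True] * len(self.queries)

    def invalidate(self, index: int) -> None:
        # Mark a single entry stale, e.g. after DDL on a table it references.
        self.valid[index] = False

    def get_plan(self, index: int):
        # Replan only the stale entry; the other plans are left untouched.
        if not self.valid[index]:
            self.plans[index] = self.planner(self.queries[index])
            self.valid[index] = True
        return self.plans[index]


def count_planner_calls():
    """Return (planner, calls), where calls records each query the planner saw."""
    calls = []

    def planner(query):
        calls.append(query)
        return {"query": query, "version": len(calls)}

    return planner, calls
```

The point of the sketch is only the order of operations: Q1 can be started and run with its original plan, and Q2 is replanned at the moment it is about to start, without throwing away Q1's plan or execution state.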
> Yeah. We are working in an extremely not-green field here, which
> means it's a lot easier to see pre-existing reasons why X will not
> work than to have confidence that it will work. But hey, if this
> were easy then we'd have done it already.
Yeah, true.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-19 16:54 ` Re: generic plans and "initial" pruning Tom Lane <[email protected]>
2024-08-19 17:38 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-19 17:52 ` Re: generic plans and "initial" pruning Tom Lane <[email protected]>
2024-08-19 18:20 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
@ 2024-08-20 13:14 ` Amit Langote <[email protected]>
0 siblings, 0 replies; 29+ messages in thread
From: Amit Langote @ 2024-08-20 13:14 UTC
To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>
On Tue, Aug 20, 2024 at 3:21 AM Robert Haas <[email protected]> wrote:
> On Mon, Aug 19, 2024 at 1:52 PM Tom Lane <[email protected]> wrote:
> > Robert Haas <[email protected]> writes:
> > > But that seems somewhat incidental to what this thread is about.
> >
> > Perhaps. But if we're running into issues related to that, it might
> > be good to set aside the long-term goal for a bit and come up with
> > a cleaner answer for intra-session locking. That could allow the
> > pruning problem to be solved more cleanly in turn, and it'd be
> > an improvement even if not.
>
> Maybe, but the pieces aren't quite coming together for me. Solving
> this would mean that if we execute a stale plan, we'd be more likely
> to get a good error and less likely to get a bad, nasty-looking
> internal error, or a crash. That's good on its own terms, but we don't
> really want user queries to produce errors at all, so I don't think
> we'd feel any more free to rearrange the order of operations than we
> do today.
Yeah, it's unclear whether executing a potentially stale plan is an
acceptable tradeoff compared to replanning, especially if it occurs
rarely. Personally, I would prefer that it is.
> > > Do you have a view on what the way forward might be?
> >
> > I'm fresh out of ideas at the moment, other than having a hope that
> > divide-and-conquer (ie, solving subproblems first) might pay off.
>
> Fair enough, but why do you think that the original approach of
> creating a data structure from within the plan cache mechanism
> (probably via a call into some new executor entrypoint) and then
> feeding that through to ExecutorRun() time can't work?
That would be ExecutorStart(). The data structure need not be
referenced after ExecInitNode().
> Is it possible
> you latched onto some non-optimal decisions that the early versions of
> the patch made, rather than there being a fundamental problem with the
> concept?
>
> I actually thought the do-it-at-executorstart-time approach sounded
> pretty good, even though we might have to abandon planstate tree
> initialization partway through, right up until Amit started talking
> about moving ExecutorStart() from PortalRun() to PortalStart(), which
> I have a feeling is going to create a bigger problem than we can
> solve. I think if we want to save that approach, we should try to
> figure out if we can teach the plancache to replan one query from a
> list without replanning the others, which seems like it might allow us
> to keep the order of major operations unchanged. Otherwise, it makes
> sense to me to have another go at the other approach, at least to make
> sure we understand clearly why it can't work.
+1
--
Thanks, Amit Langote
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
@ 2024-08-20 13:00 ` Amit Langote <[email protected]>
2024-08-20 14:53 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
1 sibling, 1 reply; 29+ messages in thread
From: Amit Langote @ 2024-08-20 13:00 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Tue, Aug 20, 2024 at 1:39 AM Robert Haas <[email protected]> wrote:
> On Fri, Aug 16, 2024 at 8:36 AM Amit Langote <[email protected]> wrote:
> > So it is possible for the executor to try to run a plan that has
> > become invalid since it was created, so...
>
> I'm not sure what the "so what" here is.
I meant that if the executor has to deal with broken plans anyway, we
might as well lean into that fact instead of special-casing only the
cached plan case. Yes, I understand that that's not a good
justification.
> > One perhaps crazy idea [1]:
> >
> > What if we remove AcquireExecutorLocks() and move the responsibility
> > of taking the remaining necessary locks into the executor (those on
> > any inheritance children that are added during planning and thus not
> > accounted for by AcquirePlannerLocks()), like the patch already does,
> > but don't make it also check if the plan has become invalid, which it
> > can't do anyway unless it's from a CachedPlan. That means we instead
> > let the executor throw any errors that occur when trying to either
> > initialize the plan because of the changes that have occurred to the
> > objects referenced in the plan, like what is happening in the above
> > example. If that case is going to be rare anyway, why spend energy on
> > checking the validity and replan, especially if that's not an easy
> > thing to do as we're finding out. In the above example, we could say
> > that it's a user error to create a rule like that, so it should not
> > happen in practice, but when it does, the executor seems to deal with
> > it correctly by refusing to execute a broken plan. Perhaps it's more
> > worthwhile to make the executor behave correctly in face of plan
> > invalidation than teach the rest of the system to deal with the
> > executor throwing its hands up when it runs into an invalid plan?
> > Again, I think this may be a crazy line of thinking but just wanted to
> > get it out there.
>
> I don't know whether this is crazy or not. I think there are two
> issues. One, the set of checks that we have right now might not be
> complete, and we might just not have realized that because it happens
> infrequently enough that we haven't found all the bugs. If that's so,
> then a change like this could be a good thing, because it might force
> us to fix stuff we should be fixing anyway. I have a feeling that some
> of the checks you hit there were added as bug fixes long after the
> code was written originally, so my confidence that we don't have more
> bugs isn't especially high.
This makes sense.
> And two, it matters a lot how frequent the errors will be in practice.
> I think we normally try to replan rather than let a stale plan be used
> because we want to not fail, because users don't like failure. If the
> design you propose here would make failures more (or less) frequent,
> then that's a problem (or awesome).
I think we'd modify plancache.c to postpone the locking of only
prunable relations (i.e., partitions), so we're looking at only a
handful of concurrent modifications that are going to cause execution
errors. That's because we disallow many DDL modifications of
partitions unless they are done via recursion from the parent, so the
space of errors in practice would be smaller compared to if we were to
postpone *all* cached plan locks to ExecInitNode() time. DROP INDEX
a_partition_only_index comes to mind as something that might cause an
error. I've not tested if other partition-only constraints can cause
unsafe behaviors.
Perhaps, we can add the check for CachedPlan.is_valid after every
table_open() and index_open() in the executor that takes a lock or at
all the places we discussed previously and throw the error (say:
"cached plan is no longer valid") if it's false. That's better than
soldiering ahead with initialization / execution and running into some
random error, but still a loss in terms of user
experience because we're adding a new failure mode, however rare.
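
To make the open-then-check idea concrete, here is a toy sketch in plain
C. It is not PostgreSQL code: ToyCachedPlan, ToyRelation and
toy_open_and_check() are made-up stand-ins, and the concurrent_ddl flag
merely simulates an invalidation that arrives while the lock is being
acquired. The point is only the ordering: take the lock first, then look
at the validity flag, and report a dedicated status rather than carrying
on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins; none of these are the real PostgreSQL structs. */
typedef struct ToyCachedPlan
{
    bool        is_valid;
} ToyCachedPlan;

typedef struct ToyRelation
{
    int         relid;
    bool        locked;
} ToyRelation;

typedef enum
{
    TOY_OPEN_OK,
    TOY_OPEN_PLAN_INVALID
} ToyOpenStatus;

/*
 * Lock the relation, then check whether the cached plan is still valid.
 * concurrent_ddl simulates an invalidation being processed while the lock
 * is taken.  A NULL cplan means the plan did not come from the plan cache,
 * so there is nothing to check.
 */
static ToyOpenStatus
toy_open_and_check(ToyCachedPlan *cplan, ToyRelation *rel, bool concurrent_ddl)
{
    rel->locked = true;

    if (cplan != NULL && concurrent_ddl)
        cplan->is_valid = false;

    if (cplan != NULL && !cplan->is_valid)
        return TOY_OPEN_PLAN_INVALID;   /* caller would raise the error */

    return TOY_OPEN_OK;
}
```

A caller would turn TOY_OPEN_PLAN_INVALID into the "cached plan is no
longer valid" error instead of continuing with initialization.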
--
Thanks, Amit Langote
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
@ 2024-08-20 14:53 ` Robert Haas <[email protected]>
2024-08-21 12:45 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Robert Haas @ 2024-08-20 14:53 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Tue, Aug 20, 2024 at 9:00 AM Amit Langote <[email protected]> wrote:
> I think we'd modify plancache.c to postpone the locking of only
> prunable relations (i.e., partitions), so we're looking at only a
> handful of concurrent modifications that are going to cause execution
> errors. That's because we disallow many DDL modifications of
> partitions unless they are done via recursion from the parent, so the
> space of errors in practice would be smaller compared to if we were to
> postpone *all* cached plan locks to ExecInitNode() time. DROP INDEX
> a_partition_only_index comes to mind as something that might cause an
> error. I've not tested if other partition-only constraints can cause
> unsafe behaviors.
This seems like a valid point to some extent, but in other contexts
we've had discussions about how we don't actually guarantee all that
much uniformity between a partitioned table and its partitions, and
it's been questioned whether we made the right decisions there. So I'm
not entirely sure that the surface area for problems here will be as
narrow as you're hoping -- I think we'd need to go through all of the
ALTER TABLE variants and think it through. But maybe the problems
aren't that bad.
It does seem like constraints can change the plan. Imagine the
partition had a CHECK(false) constraint before and now doesn't, or
something.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-20 14:53 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
@ 2024-08-21 12:45 ` Amit Langote <[email protected]>
2024-08-21 13:10 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Amit Langote @ 2024-08-21 12:45 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Tue, Aug 20, 2024 at 11:53 PM Robert Haas <[email protected]> wrote:
> On Tue, Aug 20, 2024 at 9:00 AM Amit Langote <[email protected]> wrote:
> > I think we'd modify plancache.c to postpone the locking of only
> > prunable relations (i.e., partitions), so we're looking at only a
> > handful of concurrent modifications that are going to cause execution
> > errors. That's because we disallow many DDL modifications of
> > partitions unless they are done via recursion from the parent, so the
> > space of errors in practice would be smaller compared to if we were to
> > postpone *all* cached plan locks to ExecInitNode() time. DROP INDEX
> > a_partition_only_index comes to mind as something that might cause an
> > error. I've not tested if other partition-only constraints can cause
> > unsafe behaviors.
>
> This seems like a valid point to some extent, but in other contexts
> we've had discussions about how we don't actually guarantee all that
> much uniformity between a partitioned table and its partitions, and
> it's been questioned whether we made the right decisions there. So I'm
> not entirely sure that the surface area for problems here will be as
> narrow as you're hoping -- I think we'd need to go through all of the
> ALTER TABLE variants and think it through. But maybe the problems
> aren't that bad.
Many changeable properties that are reflected in the RelationData of a
partition after getting the lock on it seem to cause no issues as long
as the executor code only looks at RelationData, which is true for
most Scan nodes. It also seems true for ModifyTable which looks into
RelationData for relation properties relevant to insert/deletes.
The two things that don't cope are:
* Index Scan nodes with concurrent DROP INDEX of partition-only indexes.
* Concurrent DROP CONSTRAINT of partition-only CHECK and NOT NULL
constraints, which can lead to incorrect results, as described below.
> It does seem like constraints can change the plan. Imagine the
> partition had a CHECK(false) constraint before and now doesn't, or
> something.
Yeah, if the CHECK constraint gets dropped concurrently, any new rows
that got added after that will not be returned by executing a stale
cached plan, because the plan would have been created based on the
assumption that such rows shouldn't be there due to the CHECK
constraint. We currently don't explicitly check that the constraints
that were used during planning still exist before executing the plan.
Overall, I'm starting to feel less enthused by the idea of throwing an
error in the executor due to known and unknown hazards of trying to
execute a stale plan. Even if we made a note in the docs of such
hazards, any users who run into these rare errors are likely to head
to -bugs or -hackers anyway.
Tom said we should perhaps look at the hazards caused by intra-session
locking, but we'd still be left with the hazards of missing indexes and
constraints, AFAICS, due to DROP from other sessions.
So, the options:
* The replanning aspect of the lock-in-the-executor design would be
simpler if a CachedPlan contained the plan for a single query rather
than a list of queries, as previously mentioned. This is particularly
due to the requirements of the PORTAL_MULTI_QUERY case. However, this
option might be impractical.
* Polish the patch for the old design of doing the initial pruning
before AcquireExecutorLocks() and focus on hashing out any bugs and
issues of that design.
--
Thanks, Amit Langote
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-20 14:53 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-21 12:45 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
@ 2024-08-21 13:10 ` Robert Haas <[email protected]>
2024-08-23 12:48 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Robert Haas @ 2024-08-21 13:10 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Wed, Aug 21, 2024 at 8:45 AM Amit Langote <[email protected]> wrote:
> * The replanning aspect of the lock-in-the-executor design would be
> simpler if a CachedPlan contained the plan for a single query rather
> than a list of queries, as previously mentioned. This is particularly
> due to the requirements of the PORTAL_MULTI_QUERY case. However, this
> option might be impractical.
It might be, but maybe it would be worth a try? I mean,
GetCachedPlan() seems to just call pg_plan_queries() which just loops
over the list of query trees and does the same thing for each one. If
we wanted to replan a single query, why couldn't we do
fake_querytree_list = list_make1(list_nth(querytree_list, n)) and then
call pg_plan_queries(fake_querytree_list)? Or something equivalent to
that. We could have a new GetCachedSinglePlan(cplan, n) to do this.
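
As a rough illustration of the one-element-list trick, here is a toy
model in plain C. Nothing below is the real planner API: ToyQuery,
ToyPlan, toy_plan_queries() (standing in for pg_plan_queries()) and
toy_replan_single() are hypothetical names, and the "generation" counter
exists only so the result can be inspected. The new plan is written to a
separate output slot, so the original plan array is left untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a query tree and its plan. */
typedef struct ToyQuery
{
    int         id;
} ToyQuery;

typedef struct ToyPlan
{
    int         query_id;
    int         generation;     /* which planning pass produced this plan */
} ToyPlan;

/* Plan every query in the array, doing the same thing for each one. */
static void
toy_plan_queries(const ToyQuery *queries, size_t nqueries,
                 ToyPlan *plans_out, int generation)
{
    for (size_t i = 0; i < nqueries; i++)
    {
        plans_out[i].query_id = queries[i].id;
        plans_out[i].generation = generation;
    }
}

/*
 * Replan only the n'th query by handing the planner a one-element "list".
 * Returns 0 on success, -1 if n is out of range.
 */
static int
toy_replan_single(const ToyQuery *queries, size_t nqueries, size_t n,
                  int generation, ToyPlan *single_out)
{
    if (n >= nqueries)
        return -1;

    toy_plan_queries(&queries[n], 1, single_out, generation);
    return 0;
}
```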
> * Polish the patch for the old design of doing the initial pruning
> before AcquireExecutorLocks() and focus on hashing out any bugs and
> issues of that design.
That's also an option. It probably has issues too, but I don't know
what they are exactly.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-20 14:53 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-21 12:45 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-21 13:10 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
@ 2024-08-23 12:48 ` Amit Langote <[email protected]>
2024-08-29 13:34 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Amit Langote @ 2024-08-23 12:48 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Wed, Aug 21, 2024 at 10:10 PM Robert Haas <[email protected]> wrote:
> On Wed, Aug 21, 2024 at 8:45 AM Amit Langote <[email protected]> wrote:
> > * The replanning aspect of the lock-in-the-executor design would be
> > simpler if a CachedPlan contained the plan for a single query rather
> > than a list of queries, as previously mentioned. This is particularly
> > due to the requirements of the PORTAL_MULTI_QUERY case. However, this
> > option might be impractical.
>
> It might be, but maybe it would be worth a try? I mean,
> GetCachedPlan() seems to just call pg_plan_queries() which just loops
> over the list of query trees and does the same thing for each one. If
> we wanted to replan a single query, why couldn't we do
> fake_querytree_list = list_make1(list_nth(querytree_list, n)) and then
> call pg_plan_queries(fake_querytree_list)? Or something equivalent to
> that. We could have a new GetCachedSinglePlan(cplan, n) to do this.
I've been hacking to prototype this, and it's showing promise. It
helps make the replan loop at the call sites that start the executor
with an invalidatable plan more localized and less prone to
action-at-a-distance issues. However, the interface and contract of
the new function in my prototype are pretty specialized for the replan
loop in this context—meaning it's not as general-purpose as
GetCachedPlan(). Essentially, what you get when you call it is a
'throwaway' CachedPlan containing only the plan for the query that
failed during ExecutorStart(), not a plan integrated into the original
CachedPlanSource's stmt_list. A call site entering the replan loop
will retry the execution with that throwaway plan, release it once
done, and resume looping over the plans in the original list. The
invalid plan that remains in the original list will be discarded and
replanned in the next call to GetCachedPlan() using the same
CachedPlanSource. While that may sound undesirable, I'm inclined to
think it's not something that needs optimization, given that we're
expecting this code path to be taken rarely.
I'll post a version of a revamped locks-in-the-executor patch set
using the above function after debugging some more.
--
Thanks, Amit Langote
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-20 14:53 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-21 12:45 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-21 13:10 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-23 12:48 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
@ 2024-08-29 13:34 ` Amit Langote <[email protected]>
2024-08-31 12:30 ` Re: generic plans and "initial" pruning Junwang Zhao <[email protected]>
2024-09-17 12:57 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
0 siblings, 2 replies; 29+ messages in thread
From: Amit Langote @ 2024-08-29 13:34 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Fri, Aug 23, 2024 at 9:48 PM Amit Langote <[email protected]> wrote:
> On Wed, Aug 21, 2024 at 10:10 PM Robert Haas <[email protected]> wrote:
> > On Wed, Aug 21, 2024 at 8:45 AM Amit Langote <[email protected]> wrote:
> > > * The replanning aspect of the lock-in-the-executor design would be
> > > simpler if a CachedPlan contained the plan for a single query rather
> > > than a list of queries, as previously mentioned. This is particularly
> > > due to the requirements of the PORTAL_MULTI_QUERY case. However, this
> > > option might be impractical.
> >
> > It might be, but maybe it would be worth a try? I mean,
> > GetCachedPlan() seems to just call pg_plan_queries() which just loops
> > over the list of query trees and does the same thing for each one. If
> > we wanted to replan a single query, why couldn't we do
> > fake_querytree_list = list_make1(list_nth(querytree_list, n)) and then
> > call pg_plan_queries(fake_querytree_list)? Or something equivalent to
> > that. We could have a new GetCachedSinglePlan(cplan, n) to do this.
>
> I've been hacking to prototype this, and it's showing promise. It
> helps make the replan loop at the call sites that start the executor
> with an invalidatable plan more localized and less prone to
> action-at-a-distance issues. However, the interface and contract of
> the new function in my prototype are pretty specialized for the replan
> loop in this context—meaning it's not as general-purpose as
> GetCachedPlan(). Essentially, what you get when you call it is a
> 'throwaway' CachedPlan containing only the plan for the query that
> failed during ExecutorStart(), not a plan integrated into the original
> CachedPlanSource's stmt_list. A call site entering the replan loop
> will retry the execution with that throwaway plan, release it once
> done, and resume looping over the plans in the original list. The
> invalid plan that remains in the original list will be discarded and
> replanned in the next call to GetCachedPlan() using the same
> CachedPlanSource. While that may sound undesirable, I'm inclined to
> think it's not something that needs optimization, given that we're
> expecting this code path to be taken rarely.
>
> I'll post a version of a revamped locks-in-the-executor patch set
> using the above function after debugging some more.
Here it is.
0001 implements changes to defer the locking of runtime-prunable
relations to the executor. The new design introduces a bitmapset
field in PlannedStmt to distinguish at runtime between relations that
are prunable, whose locking can be deferred until ExecInitNode(), and
those that are not and must be locked in advance. The set of prunable
relations can be constructed by looking at all the PartitionPruneInfos
in the plan and checking which are subject to "initial" pruning steps.
The set of unprunable relations is obtained by subtracting those from
the set of all RT indexes. This design gets rid of one annoying
aspect of the old design which was the need to add specialized fields
to store the RT indexes of partitioned relations that are not
otherwise referenced in the plan tree. That was necessary because in
the old design, I had removed the function AcquireExecutorLocks()
altogether to defer the locking of all child relations to execution.
In the new design such relations are still locked by
AcquireExecutorLocks().
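
The set arithmetic described above can be sketched with a toy bitmask in
plain C. This is only an illustration: ToyRelids is a 64-bit stand-in
for a bitmapset (so it handles RT indexes 1..63 only), and toy_add(),
toy_is_member(), toy_all_relids() and toy_unprunable() are made-up
helpers, not functions from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bit i set means RT index i is a member; valid for RT indexes 1..63. */
typedef uint64_t ToyRelids;

static ToyRelids
toy_add(ToyRelids set, int rti)
{
    return set | (UINT64_C(1) << rti);
}

static int
toy_is_member(ToyRelids set, int rti)
{
    return (int) ((set >> rti) & 1);
}

/* The set of all RT indexes 1..nrels (RT indexes are 1-based). */
static ToyRelids
toy_all_relids(int nrels)
{
    ToyRelids   set = 0;

    for (int rti = 1; rti <= nrels; rti++)
        set = toy_add(set, rti);
    return set;
}

/* Unprunable relations = all RT indexes minus the prunable ones. */
static ToyRelids
toy_unprunable(int nrels, ToyRelids prunable)
{
    return toy_all_relids(nrels) & ~prunable;
}
```

In this model, the members of the unprunable set are the ones that must
be locked up front, while locking of the prunable ones can wait.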
0002 is the old patch to make ExecEndNode() robust against partially
initialized PlanState nodes by adding NULL checks.
0003 is the patch to add changes to deal with the CachedPlan becoming
invalid before the deferred locks on prunable relations are taken.
I've moved the replan loop into a new wrapper-over-ExecutorStart()
function instead of having the same logic at multiple sites. The
replan logic uses the GetSingleCachedPlan() described in the quoted
text. The callers of the new ExecutorStart()-wrapper, which I've
dubbed ExecutorStartExt(), need to pass the CachedPlanSource and a
query_index, which is the index of the query being executed in the
list CachedPlanSource.query_list. They are needed by
GetSingleCachedPlan(). The changes outside the executor are pretty
minimal in this design and all the difficulties of having to loop back
to GetCachedPlan() are now gone. I like how this turned out.
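
For readers skimming the thread, the control flow of such a wrapper can
be modelled roughly as below. This is a toy in plain C, not the code in
0003: ToyPlan, ToyPlanSource, toy_executor_start(),
toy_get_single_plan() and toy_executor_start_ext() are hypothetical
names, and the invalidations_pending counter simply simulates how many
times a concurrent change invalidates the plan while deferred locks are
being taken.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ToyPlan
{
    int         generation;     /* which planning pass produced this plan */
    bool        valid;
} ToyPlan;

typedef struct ToyPlanSource
{
    int         next_generation;        /* generation of the next replan */
    int         invalidations_pending;  /* simulated concurrent changes */
} ToyPlanSource;

/*
 * Stand-in for executor startup.  Taking the deferred locks may reveal
 * that the plan is stale; in that case initialization is abandoned and
 * false is returned.
 */
static bool
toy_executor_start(ToyPlanSource *src, ToyPlan *plan)
{
    if (src->invalidations_pending > 0)
    {
        src->invalidations_pending--;
        plan->valid = false;
        return false;
    }
    return plan->valid;
}

/* Stand-in for building a fresh, transient plan for a single query. */
static ToyPlan
toy_get_single_plan(ToyPlanSource *src)
{
    ToyPlan     plan = {src->next_generation++, true};

    return plan;
}

/*
 * The wrapper: retry startup with a freshly built plan until it succeeds.
 * Returns the number of replans that were needed.
 */
static int
toy_executor_start_ext(ToyPlanSource *src, ToyPlan *plan)
{
    int         replans = 0;

    while (!toy_executor_start(src, plan))
    {
        *plan = toy_get_single_plan(src);
        replans++;
    }
    return replans;
}
```

The property being illustrated is that the retry is confined to the
wrapper, so callers see a single call that either starts a valid plan or
keeps replanning that one query.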
One idea that I think might be worth trying to reduce the footprint of
0003 is to try to lock the prunable relations in a step of InitPlan()
separate from ExecInitNode(), which can be implemented by doing the
initial runtime pruning in that separate step. That way, we'll have
all the necessary locks before calling ExecInitNode() and so we don't
need to sprinkle the CachedPlanStillValid() checks all over the place
and worry about missed checks and dealing with partially initialized
PlanState trees.
--
Thanks, Amit Langote
Attachments:
[application/octet-stream] v51-0003-Handle-CachedPlan-invalidation-in-the-executor.patch (84.7K, 2-v51-0003-Handle-CachedPlan-invalidation-in-the-executor.patch)
From 887627ec4455a70a716ce56f386f71df953cdf64 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 22 Aug 2024 19:38:13 +0900
Subject: [PATCH v51 3/3] Handle CachedPlan invalidation in the executor
This commit makes changes to handle cases where a cached plan
becomes invalid before deferred locks on prunable relations are taken.
* Add checks at various points in ExecutorStart() and its called
functions to determine if the plan becomes invalid. If detected,
the function and its callers return immediately. A previous commit
ensures any partially initialized PlanState tree objects are cleaned
up appropriately.
* Introduce ExecutorStartExt(), a wrapper over ExecutorStart(), to
handle cases where plan initialization is aborted due to invalidation.
ExecutorStartExt() creates a new transient CachedPlan if needed and
retries execution. This new entry point is only required for sites
using plancache.c. It requires passing the QueryDesc, eflags,
CachedPlanSource, and query_index (index in CachedPlanSource.query_list).
* Add GetSingleCachedPlan() in plancache.c to create a transient
CachedPlan for a specified query in the given CachedPlanSource.
This also adds isolation tests using the delay_execution test module
to verify scenarios where a CachedPlan becomes invalid before the
deferred locks are taken.
All ExecutorStart_hook implementations now must add the following
block after the ExecutorStart() call to ensure it doesn't work with an
invalid plan:
    /* The plan may have become invalid during ExecutorStart() */
    if (!ExecPlanStillValid(queryDesc->estate))
        return;
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
contrib/auto_explain/auto_explain.c | 4 +
.../pg_stat_statements/pg_stat_statements.c | 4 +
contrib/postgres_fdw/postgres_fdw.c | 36 +++-
src/backend/commands/explain.c | 8 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 10 +-
src/backend/commands/trigger.c | 14 ++
src/backend/executor/README | 32 +++-
src/backend/executor/execMain.c | 91 ++++++++-
src/backend/executor/execParallel.c | 4 +-
src/backend/executor/execPartition.c | 10 +
src/backend/executor/execProcnode.c | 7 +
src/backend/executor/execUtils.c | 42 ++++-
src/backend/executor/nodeAgg.c | 2 +
src/backend/executor/nodeAppend.c | 12 +-
src/backend/executor/nodeBitmapAnd.c | 2 +
src/backend/executor/nodeBitmapHeapscan.c | 4 +
src/backend/executor/nodeBitmapIndexscan.c | 6 +-
src/backend/executor/nodeBitmapOr.c | 2 +
src/backend/executor/nodeCustom.c | 2 +
src/backend/executor/nodeForeignscan.c | 4 +
src/backend/executor/nodeGather.c | 2 +
src/backend/executor/nodeGatherMerge.c | 2 +
src/backend/executor/nodeGroup.c | 2 +
src/backend/executor/nodeHash.c | 2 +
src/backend/executor/nodeHashjoin.c | 4 +
src/backend/executor/nodeIncrementalSort.c | 2 +
src/backend/executor/nodeIndexonlyscan.c | 7 +-
src/backend/executor/nodeIndexscan.c | 8 +-
src/backend/executor/nodeLimit.c | 2 +
src/backend/executor/nodeLockRows.c | 2 +
src/backend/executor/nodeMaterial.c | 2 +
src/backend/executor/nodeMemoize.c | 2 +
src/backend/executor/nodeMergeAppend.c | 6 +-
src/backend/executor/nodeMergejoin.c | 4 +
src/backend/executor/nodeModifyTable.c | 13 ++
src/backend/executor/nodeNestloop.c | 4 +
src/backend/executor/nodeProjectSet.c | 2 +
src/backend/executor/nodeRecursiveunion.c | 4 +
src/backend/executor/nodeResult.c | 2 +
src/backend/executor/nodeSamplescan.c | 3 +
src/backend/executor/nodeSeqscan.c | 3 +
src/backend/executor/nodeSetOp.c | 2 +
src/backend/executor/nodeSort.c | 2 +
src/backend/executor/nodeSubqueryscan.c | 2 +
src/backend/executor/nodeTidrangescan.c | 2 +
src/backend/executor/nodeTidscan.c | 2 +
src/backend/executor/nodeUnique.c | 2 +
src/backend/executor/nodeWindowAgg.c | 2 +
src/backend/executor/spi.c | 19 +-
src/backend/tcop/postgres.c | 4 +-
src/backend/tcop/pquery.c | 31 +++-
src/backend/utils/cache/plancache.c | 50 +++++
src/backend/utils/mmgr/portalmem.c | 4 +-
src/include/commands/explain.h | 1 +
src/include/commands/trigger.h | 1 +
src/include/executor/execdesc.h | 1 +
src/include/executor/executor.h | 18 ++
src/include/nodes/execnodes.h | 1 +
src/include/utils/plancache.h | 18 ++
src/include/utils/portal.h | 4 +-
src/test/modules/delay_execution/Makefile | 3 +-
.../modules/delay_execution/delay_execution.c | 63 ++++++-
.../expected/cached-plan-inval.out | 175 ++++++++++++++++++
src/test/modules/delay_execution/meson.build | 1 +
.../specs/cached-plan-inval.spec | 65 +++++++
66 files changed, 790 insertions(+), 58 deletions(-)
create mode 100644 src/test/modules/delay_execution/expected/cached-plan-inval.out
create mode 100644 src/test/modules/delay_execution/specs/cached-plan-inval.spec
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 677c135f59..3675ce9a88 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -300,6 +300,10 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
if (auto_explain_enabled())
{
/*
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 362d222f63..98a328b79f 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -992,6 +992,10 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
/*
* If query has queryId zero, don't track it. This prevents double
* counting of optimizable statements that are directly contained in
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index adc62576d1..65f4ffe5ee 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2144,7 +2144,11 @@ postgresEndForeignModify(EState *estate,
{
PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
- /* If fmstate is NULL, we are in EXPLAIN; nothing to do */
+ /*
+ * fmstate could be NULL under two conditions: during an EXPLAIN
+ * operation or if BeginForeignModify() hasn't been invoked.
+ * In either case, no action is required.
+ */
if (fmstate == NULL)
return;
@@ -2650,8 +2654,9 @@ postgresBeginDirectModify(ForeignScanState *node, int eflags)
{
ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;
EState *estate = node->ss.ps.state;
+ Relation rel = node->ss.ss_currentRelation;
PgFdwDirectModifyState *dmstate;
- Index rtindex;
+ Index rtindex = node->resultRelInfo->ri_RangeTableIndex;
Oid userid;
ForeignTable *table;
UserMapping *user;
@@ -2663,24 +2668,32 @@ postgresBeginDirectModify(ForeignScanState *node, int eflags)
if (eflags & EXEC_FLAG_EXPLAIN_ONLY)
return;
+ /*
+ * Open the foreign table using the RT index given in the ResultRelInfo if
+ * the ScanState doesn't provide it. If the plan becomes invalid as a
+ * result of taking a lock in ExecOpenScanRelation(), do nothing, in which
+ * case node->fdw_state remains NULL.
+ */
+ if (rel == NULL)
+ {
+ Assert(fsplan->scan.scanrelid == 0);
+ rel = ExecOpenScanRelation(estate, rtindex, eflags);
+ if (unlikely(rel == NULL || !ExecPlanStillValid(estate)))
+ return;
+ }
+
/*
* We'll save private state in node->fdw_state.
*/
dmstate = (PgFdwDirectModifyState *) palloc0(sizeof(PgFdwDirectModifyState));
node->fdw_state = (void *) dmstate;
+ dmstate->rel = rel;
/*
* Identify which user to do the remote access as. This should match what
* ExecCheckPermissions() does.
*/
userid = OidIsValid(fsplan->checkAsUser) ? fsplan->checkAsUser : GetUserId();
-
- /* Get info about foreign table. */
- rtindex = node->resultRelInfo->ri_RangeTableIndex;
- if (fsplan->scan.scanrelid == 0)
- dmstate->rel = ExecOpenScanRelation(estate, rtindex, eflags);
- else
- dmstate->rel = node->ss.ss_currentRelation;
table = GetForeignTable(RelationGetRelid(dmstate->rel));
user = GetUserMapping(userid, table->serverid);
@@ -2811,7 +2824,10 @@ postgresEndDirectModify(ForeignScanState *node)
{
PgFdwDirectModifyState *dmstate = (PgFdwDirectModifyState *) node->fdw_state;
- /* if dmstate is NULL, we are in EXPLAIN; nothing to do */
+ /*
+ * Nothing to do if dmstate is NULL, either because we are in EXPLAIN or
+ * dmstate wasn't initialized due to aborted plan initialization.
+ */
if (dmstate == NULL)
return;
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index a83ea07db1..a7643360a7 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -507,7 +507,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, NULL, -1, into, es, queryString, params,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -616,6 +617,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
*/
void
ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int query_index,
IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
@@ -686,8 +688,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
if (into)
eflags |= GetIntoRelEFlags(into);
- /* call ExecutorStart to prepare the plan for execution */
- ExecutorStart(queryDesc, eflags);
+ /* Call ExecutorStartExt to prepare the plan for execution. */
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
/* Execute the plan for statistics if asked for */
if (es->analyze)
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 4f6acf6719..4b1503c05e 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ NULL,
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 311b9ebd5b..4cd79a6e3a 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -202,7 +202,8 @@ ExecuteQuery(ParseState *pstate,
query_string,
entry->plansource->commandTag,
plan_list,
- cplan);
+ cplan,
+ entry->plansource);
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
@@ -583,6 +584,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
MemoryContextCounters mem_counters;
MemoryContext planner_ctx = NULL;
MemoryContext saved_ctx = NULL;
+ int i = 0;
if (es->memory)
{
@@ -655,8 +657,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
- queryEnv,
+ ExplainOnePlan(pstmt, cplan, entry->plansource, i,
+ into, es, query_string, paramLI, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
@@ -668,6 +670,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Separate plans with an appropriate separator */
if (lnext(plan_list, p) != NULL)
ExplainSeparatePlans(es);
+
+ i++;
}
if (estate)
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 170360edda..91e4b821a0 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -5119,6 +5119,20 @@ AfterTriggerEndQuery(EState *estate)
afterTriggers.query_depth--;
}
+/* ----------
+ * AfterTriggerAbortQuery()
+ *
+ * Called by ExecutorEnd() if the query execution was aborted due to the
+ * plan becoming invalid during initialization.
+ * ----------
+ */
+void
+AfterTriggerAbortQuery(void)
+{
+ /* Revert the actions of AfterTriggerBeginQuery(). */
+ afterTriggers.query_depth--;
+}
+
/*
* AfterTriggerFreeQuery
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be61..e583df5be0 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -280,6 +280,28 @@ are typically reset to empty once per tuple. Per-tuple contexts are usually
associated with ExprContexts, and commonly each PlanState node has its own
ExprContext to evaluate its qual and targetlist expressions in.
+Relation Locking
+----------------
+
+Typically, when the executor initializes a plan tree for execution, it doesn't
+lock non-index relations if the plan tree is freshly generated and not derived
+from a CachedPlan. This is because such locks have already been established
+during the query's parsing, rewriting, and planning phases. However, with a
+cached plan tree, some relations may remain unlocked. The function
+AcquireExecutorLocks() only locks unprunable relations in the plan, deferring
+the locking of prunable ones to executor initialization. This avoids
+unnecessary locking of relations that will be pruned during "initial" runtime
+pruning in the ExecInitNode() routine of nodes containing the pruning info.
+
+This approach creates a window where a cached plan tree with child tables
+could become outdated if another backend modifies these tables before
+ExecInitNode() locks them. As a result, the executor has the added duty to
+verify the plan tree's validity whenever it locks a child table after
+execution-initialization-pruning. This validation is done by checking the
+CachedPlan.is_valid attribute. If the plan tree is outdated (is_valid=false),
+the executor halts further initialization, cleans up the partially initialized
+PlanState tree, and retries execution after creating a new transient
+CachedPlan.
Query Processing Control Flow
-----------------------------
@@ -288,7 +310,7 @@ This is a sketch of control flow for full query processing:
CreateQueryDesc
- ExecutorStart
+ ExecutorStart or ExecutorStartExt
CreateExecutorState
creates per-query context
switch to per-query context to run ExecInitNode
@@ -316,7 +338,13 @@ This is a sketch of control flow for full query processing:
FreeQueryDesc
-Per above comments, it's not really critical for ExecEndNode to free any
+As mentioned in the "Relation Locking" section, if the plan tree is found to
+be stale during one of the recursive calls of ExecInitNode() after taking a
+lock on a child table, control is immediately returned to
+ExecutorStartExt(), which will create a new plan tree and perform the
+steps starting from CreateExecutorState() again.
+
+Per above comments, it's not really critical for ExecEndPlan to free any
memory; it'll all go away in FreeExecutorState anyway. However, we do need to
be careful to close relations, drop buffer pins, etc, so we do need to scan
the plan state tree to find these sorts of resources.
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 0f6dbd1e2b..92e0c9af9e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -58,6 +58,7 @@
#include "utils/backend_status.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
+#include "utils/plancache.h"
#include "utils/rls.h"
#include "utils/snapmgr.h"
@@ -133,6 +134,52 @@ ExecutorStart(QueryDesc *queryDesc, int eflags)
standard_ExecutorStart(queryDesc, eflags);
}
+/*
+ * A variant of ExecutorStart() that handles cleanup and replanning if the
+ * input CachedPlan becomes invalid due to locks being taken during
+ * ExecutorStartInternal(). If that happens, a new CachedPlan is created
+ * only for the query at index 'query_index' in plansource->query_list, and
+ * it is released separately from the original CachedPlan.
+ */
+void
+ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource,
+ int query_index)
+{
+ if (queryDesc->cplan == NULL)
+ ExecutorStart(queryDesc, eflags);
+ else
+ {
+ while (1)
+ {
+ ExecutorStart(queryDesc, eflags);
+ if (!CachedPlanStillValid(queryDesc->cplan))
+ {
+ CachedPlan *cplan_new;
+
+ /*
+ * Mark execution as aborted to ensure that AFTER trigger
+ * state is properly reset.
+ */
+ queryDesc->estate->es_aborted = true;
+
+ ExecutorEnd(queryDesc);
+
+ cplan_new = GetSingleCachedPlan(plansource, query_index,
+ queryDesc->params,
+ queryDesc->queryEnv);
+ Assert(list_length(cplan_new->stmt_list) == 1);
+ queryDesc->cplan = cplan_new;
+ queryDesc->release_cplan = true;
+ queryDesc->plannedstmt = linitial_node(PlannedStmt,
+ cplan_new->stmt_list);
+ }
+ else
+ break;
+ }
+ }
+}
+
void
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
@@ -316,6 +363,7 @@ standard_ExecutorRun(QueryDesc *queryDesc,
estate = queryDesc->estate;
Assert(estate != NULL);
+ Assert(!estate->es_aborted);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/* caller must ensure the query's snapshot is active */
@@ -422,8 +470,11 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
Assert(estate != NULL);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
- /* This should be run once and only once per Executor instance */
- Assert(!estate->es_finished);
+ /*
+ * This should be run once and only once per Executor instance and never
+ * if the execution was aborted.
+ */
+ Assert(!estate->es_finished && !estate->es_aborted);
/* Switch into per-query memory context */
oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -482,11 +533,10 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
Assert(estate != NULL);
/*
- * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This
- * Assert is needed because ExecutorFinish is new as of 9.1, and callers
- * might forget to call it.
+ * Check that ExecutorFinish was called, unless in EXPLAIN-only mode or if
+ * execution was aborted.
*/
- Assert(estate->es_finished ||
+ Assert(estate->es_finished || estate->es_aborted ||
(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/*
@@ -500,6 +550,14 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
UnregisterSnapshot(estate->es_snapshot);
UnregisterSnapshot(estate->es_crosscheck_snapshot);
+ /*
+ * Reset AFTER trigger module if the query execution was aborted.
+ */
+ if (estate->es_aborted &&
+ !(estate->es_top_eflags &
+ (EXEC_FLAG_SKIP_TRIGGERS | EXEC_FLAG_EXPLAIN_ONLY)))
+ AfterTriggerAbortQuery();
+
/*
* Must switch out of context before destroying it
*/
@@ -832,7 +890,6 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
PreventCommandIfParallelMode(CreateCommandName((Node *) plannedstmt));
}
-
/* ----------------------------------------------------------------
* InitPlan
*
@@ -897,6 +954,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
case ROW_MARK_KEYSHARE:
case ROW_MARK_REFERENCE:
relation = ExecGetRangeTableRelation(estate, rc->rti);
+ if (unlikely(relation == NULL ||
+ !ExecPlanStillValid(estate)))
+ return;
break;
case ROW_MARK_COPY:
/* no physical table access is required */
@@ -967,6 +1027,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_subplanstates = lappend(estate->es_subplanstates,
subplanstate);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return;
i++;
}
@@ -977,6 +1039,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
* processing tuples.
*/
planstate = ExecInitNode(plan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return;
/*
* Get the tuple descriptor describing the type of tuples to return.
@@ -2858,6 +2922,7 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
rcestate->es_rowmarks = parentestate->es_rowmarks;
rcestate->es_rteperminfos = parentestate->es_rteperminfos;
rcestate->es_plannedstmt = parentestate->es_plannedstmt;
+ rcestate->es_cachedplan = parentestate->es_cachedplan;
rcestate->es_junkFilter = parentestate->es_junkFilter;
rcestate->es_output_cid = parentestate->es_output_cid;
rcestate->es_queryEnv = parentestate->es_queryEnv;
@@ -2936,6 +3001,14 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
subplanstate = ExecInitNode(subplan, rcestate, 0);
rcestate->es_subplanstates = lappend(rcestate->es_subplanstates,
subplanstate);
+
+ /*
+ * All necessary locks should have been taken when initializing the
+ * parent's copy of subplanstate, so the CachedPlan, if any, should
+ * not have become invalid during the above ExecInitNode().
+ */
+ if (!ExecPlanStillValid(rcestate))
+ elog(ERROR, "unexpected failure to initialize subplan in EvalPlanQualStart()");
}
/*
@@ -2977,6 +3050,10 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
*/
epqstate->recheckplanstate = ExecInitNode(planTree, rcestate, 0);
+ /* See the comment above. */
+ if (!ExecPlanStillValid(rcestate))
+ elog(ERROR, "unexpected failure to initialize main plantree in EvalPlanQualStart()");
+
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 03b48e12b4..2017433c64 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1263,9 +1263,7 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
* if it should take locks on certain relations, but paraller workers
* always take locks anyway.
*/
- return CreateQueryDesc(pstmt,
- NULL,
- queryString,
+ return CreateQueryDesc(pstmt, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
}
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7651886229..38cd97b59c 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1794,6 +1794,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
* maps will be needed for subsequent execution pruning passes.
+ *
+ * Returns NULL if the plan has become invalid after taking the locks to
+ * create the PartitionPruneState in CreatePartitionPruneState().
*/
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
@@ -1809,6 +1812,8 @@ ExecInitPartitionPruning(PlanState *planstate,
/* Create the working data structure for pruning */
prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ if (!ExecPlanStillValid(estate))
+ return NULL;
/*
* Perform an initial partition prune pass, if required.
@@ -1860,6 +1865,9 @@ ExecInitPartitionPruning(PlanState *planstate,
* stored in each PartitionedRelPruningData can be re-used each time we
* re-evaluate which partitions match the pruning steps provided in each
* PartitionedRelPruneInfo.
+ *
+ * Returns NULL if the plan has become invalid after taking a lock to create
+ * a PartitionedRelPruningData.
*/
static PartitionPruneState *
CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
@@ -1935,6 +1943,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* duration of this executor run.
*/
partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+ if (unlikely(partrel == NULL || !ExecPlanStillValid(estate)))
+ return NULL;
partkey = RelationGetPartitionKey(partrel);
partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
partrel);
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index 34f28dfece..7689d34dd0 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -136,6 +136,10 @@ static bool ExecShutdownNode_walker(PlanState *node, void *context);
* 'eflags' is a bitwise OR of flag bits described in executor.h
*
* Returns a PlanState node corresponding to the given Plan node.
+ *
+ * Callers should check upon return that ExecPlanStillValid(estate)
+ * returns true before continuing with their processing, because the
+ * returned PlanState might be only partially valid otherwise.
* ------------------------------------------------------------------------
*/
PlanState *
@@ -388,6 +392,9 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
break;
}
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return result;
+
ExecSetExecProcNode(result, result->ExecProcNode);
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 6dfd5a26b7..39b388e6b4 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -146,6 +146,7 @@ CreateExecutorState(void)
estate->es_top_eflags = 0;
estate->es_instrument = 0;
estate->es_finished = false;
+ estate->es_aborted = false;
estate->es_exprcontexts = NIL;
@@ -691,6 +692,8 @@ ExecRelationIsTargetRelation(EState *estate, Index scanrelid)
*
* Open the heap relation to be scanned by a base-level scan plan node.
* This should be called during the node's ExecInit routine.
+ *
+ * NULL is returned if the relation is found to have been dropped.
* ----------------------------------------------------------------
*/
Relation
@@ -700,6 +703,8 @@ ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags)
/* Open the relation. */
rel = ExecGetRangeTableRelation(estate, scanrelid);
+ if (unlikely(rel == NULL || !ExecPlanStillValid(estate)))
+ return rel;
/*
* Complain if we're attempting a scan of an unscannable relation, except
@@ -717,6 +722,26 @@ ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags)
return rel;
}
+/* ----------------------------------------------------------------
+ * ExecOpenScanIndexRelation
+ *
+ * Open the index relation to be scanned by an index scan plan node.
+ * This should be called during the node's ExecInit routine.
+ * ----------------------------------------------------------------
+ */
+Relation
+ExecOpenScanIndexRelation(EState *estate, Oid indexid, int lockmode)
+{
+ Relation rel;
+
+ /* Open the index. */
+ rel = index_open(indexid, lockmode);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ elog(DEBUG2, "CachedPlan invalidated on locking index %u", indexid);
+
+ return rel;
+}
+
/*
* ExecInitRangeTable
* Set up executor's range-table-related data
@@ -776,8 +801,12 @@ ExecShouldLockRelation(EState *estate, Index rtindex)
* ExecGetRangeTableRelation
* Open the Relation for a range table entry, if not already done
*
- * The Relations will be closed again in ExecEndPlan().
+ * The Relations will be closed in ExecEndPlan().
+ *
+ * The returned value may be NULL if the relation is a prunable relation
+ * that has not been locked and may have been concurrently dropped.
*/
+
Relation
ExecGetRangeTableRelation(EState *estate, Index rti)
{
@@ -820,8 +849,14 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
* that of a prunable relation and we're running a cached generic
* plan. AcquireExecutorLocks() of plancache.c would have locked
* only the unprunable relations in the plan tree.
+ *
+ * Note that we use try_table_open() here, because without a lock
+ * held on the relation, it may have disappeared from under us.
*/
- rel = table_open(rte->relid, rte->rellockmode);
+ rel = try_table_open(rte->relid, rte->rellockmode);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ elog(DEBUG2, "CachedPlan invalidated on locking relation %u",
+ rte->relid);
}
estate->es_relations[rti - 1] = rel;
@@ -845,6 +880,9 @@ ExecInitResultRelation(EState *estate, ResultRelInfo *resultRelInfo,
Relation resultRelationDesc;
resultRelationDesc = ExecGetRangeTableRelation(estate, rti);
+ if (unlikely(resultRelationDesc == NULL ||
+ !ExecPlanStillValid(estate)))
+ return;
InitResultRelInfo(resultRelInfo,
resultRelationDesc,
rti,
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 0dfba5ca16..8c40d8c520 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -3303,6 +3303,8 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
eflags &= ~EXEC_FLAG_REWIND;
outerPlan = outerPlan(node);
outerPlanState(aggstate) = ExecInitNode(outerPlan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return aggstate;
/*
* initialize source tuple type.
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 86d75b1a7e..3c82a1ceab 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -147,6 +147,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
list_length(node->appendplans),
node->part_prune_info,
&validsubplans);
+ if (!ExecPlanStillValid(estate))
+ return appendstate;
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -185,8 +187,10 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->ps.resultopsset = true;
appendstate->ps.resultopsfixed = false;
- appendplanstates = (PlanState **) palloc(nplans *
- sizeof(PlanState *));
+ appendplanstates = (PlanState **) palloc0(nplans *
+ sizeof(PlanState *));
+ appendstate->appendplans = appendplanstates;
+ appendstate->as_nplans = nplans;
/*
* call ExecInitNode on each of the valid plans to be executed and save
@@ -221,11 +225,11 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
firstvalid = j;
appendplanstates[j++] = ExecInitNode(initNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return appendstate;
}
appendstate->as_first_partial_plan = firstvalid;
- appendstate->appendplans = appendplanstates;
- appendstate->as_nplans = nplans;
/* Initialize async state */
appendstate->as_asyncplans = asyncplans;
diff --git a/src/backend/executor/nodeBitmapAnd.c b/src/backend/executor/nodeBitmapAnd.c
index ae391222bf..168c440692 100644
--- a/src/backend/executor/nodeBitmapAnd.c
+++ b/src/backend/executor/nodeBitmapAnd.c
@@ -89,6 +89,8 @@ ExecInitBitmapAnd(BitmapAnd *node, EState *estate, int eflags)
{
initNode = (Plan *) lfirst(l);
bitmapplanstates[i] = ExecInitNode(initNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return bitmapandstate;
i++;
}
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index 19f18ab817..b13cae1cbb 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -754,11 +754,15 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
* open the scan relation
*/
currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+ if (unlikely(currentRelation == NULL || !ExecPlanStillValid(estate)))
+ return scanstate;
/*
* initialize child nodes
*/
outerPlanState(scanstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return scanstate;
/*
* get the scan type from the relation descriptor.
diff --git a/src/backend/executor/nodeBitmapIndexscan.c b/src/backend/executor/nodeBitmapIndexscan.c
index 4669e8d0ce..f04a53e9be 100644
--- a/src/backend/executor/nodeBitmapIndexscan.c
+++ b/src/backend/executor/nodeBitmapIndexscan.c
@@ -252,7 +252,11 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
/* Open the index relation. */
lockmode = exec_rt_fetch(node->scan.scanrelid, estate)->rellockmode;
- indexstate->biss_RelationDesc = index_open(node->indexid, lockmode);
+ indexstate->biss_RelationDesc = ExecOpenScanIndexRelation(estate,
+ node->indexid,
+ lockmode);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return indexstate;
/*
* Initialize index-specific scan state
diff --git a/src/backend/executor/nodeBitmapOr.c b/src/backend/executor/nodeBitmapOr.c
index de439235d2..980b68dd82 100644
--- a/src/backend/executor/nodeBitmapOr.c
+++ b/src/backend/executor/nodeBitmapOr.c
@@ -90,6 +90,8 @@ ExecInitBitmapOr(BitmapOr *node, EState *estate, int eflags)
{
initNode = (Plan *) lfirst(l);
bitmapplanstates[i] = ExecInitNode(initNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return bitmaporstate;
i++;
}
diff --git a/src/backend/executor/nodeCustom.c b/src/backend/executor/nodeCustom.c
index e559cd2346..2a7c5dccd8 100644
--- a/src/backend/executor/nodeCustom.c
+++ b/src/backend/executor/nodeCustom.c
@@ -58,6 +58,8 @@ ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
if (scanrelid > 0)
{
scan_rel = ExecOpenScanRelation(estate, scanrelid, eflags);
+ if (unlikely(scan_rel == NULL || !ExecPlanStillValid(estate)))
+ return css;
css->ss.ss_currentRelation = scan_rel;
}
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index 1357ccf3c9..90d5878ae3 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -172,6 +172,8 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
if (scanrelid > 0)
{
currentRelation = ExecOpenScanRelation(estate, scanrelid, eflags);
+ if (unlikely(currentRelation == NULL || !ExecPlanStillValid(estate)))
+ return scanstate;
scanstate->ss.ss_currentRelation = currentRelation;
fdwroutine = GetFdwRoutineForRelation(currentRelation, true);
}
@@ -263,6 +265,8 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
if (outerPlan(node))
outerPlanState(scanstate) =
ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return scanstate;
/*
* Tell the FDW to initialize the scan.
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index cae5ea1f92..67548aa7ba 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -84,6 +84,8 @@ ExecInitGather(Gather *node, EState *estate, int eflags)
*/
outerNode = outerPlan(node);
outerPlanState(gatherstate) = ExecInitNode(outerNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return gatherstate;
tupDesc = ExecGetResultType(outerPlanState(gatherstate));
/*
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index b36cd89e7d..cf0e074359 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -103,6 +103,8 @@ ExecInitGatherMerge(GatherMerge *node, EState *estate, int eflags)
*/
outerNode = outerPlan(node);
outerPlanState(gm_state) = ExecInitNode(outerNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return gm_state;
/*
* Leader may access ExecProcNode result directly (if
diff --git a/src/backend/executor/nodeGroup.c b/src/backend/executor/nodeGroup.c
index 807429e504..6d0fd9e7b4 100644
--- a/src/backend/executor/nodeGroup.c
+++ b/src/backend/executor/nodeGroup.c
@@ -184,6 +184,8 @@ ExecInitGroup(Group *node, EState *estate, int eflags)
* initialize child nodes
*/
outerPlanState(grpstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return grpstate;
/*
* Initialize scan slot and type.
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index a913d5b50c..e71d131d18 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -396,6 +396,8 @@ ExecInitHash(Hash *node, EState *estate, int eflags)
* initialize child nodes
*/
outerPlanState(hashstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return hashstate;
/*
* initialize our result slot and type. No need to build projection
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 901c9e9be7..3c870de1c5 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -758,8 +758,12 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
hashNode = (Hash *) innerPlan(node);
outerPlanState(hjstate) = ExecInitNode(outerNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return hjstate;
outerDesc = ExecGetResultType(outerPlanState(hjstate));
innerPlanState(hjstate) = ExecInitNode((Plan *) hashNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return hjstate;
innerDesc = ExecGetResultType(innerPlanState(hjstate));
/*
diff --git a/src/backend/executor/nodeIncrementalSort.c b/src/backend/executor/nodeIncrementalSort.c
index 010bcfafa8..af723ea755 100644
--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -1040,6 +1040,8 @@ ExecInitIncrementalSort(IncrementalSort *node, EState *estate, int eflags)
* nodes may be able to do something more useful.
*/
outerPlanState(incrsortstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return incrsortstate;
/*
* Initialize scan slot and type.
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index 481d479760..0fba8f7d5a 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -531,6 +531,8 @@ ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags)
* open the scan relation
*/
currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+ if (unlikely(currentRelation == NULL || !ExecPlanStillValid(estate)))
+ return indexstate;
indexstate->ss.ss_currentRelation = currentRelation;
indexstate->ss.ss_currentScanDesc = NULL; /* no heap scan here */
@@ -583,9 +585,12 @@ ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags)
/* Open the index relation. */
lockmode = exec_rt_fetch(node->scan.scanrelid, estate)->rellockmode;
- indexRelation = index_open(node->indexid, lockmode);
+ indexRelation = ExecOpenScanIndexRelation(estate, node->indexid, lockmode);
indexstate->ioss_RelationDesc = indexRelation;
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return indexstate;
+
/*
* Initialize index-specific scan state
*/
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index a8172d8b82..db28aeb3d6 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -907,6 +907,8 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
* open the scan relation
*/
currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+ if (unlikely(currentRelation == NULL || !ExecPlanStillValid(estate)))
+ return indexstate;
indexstate->ss.ss_currentRelation = currentRelation;
indexstate->ss.ss_currentScanDesc = NULL; /* no heap scan here */
@@ -951,7 +953,11 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
/* Open the index relation. */
lockmode = exec_rt_fetch(node->scan.scanrelid, estate)->rellockmode;
- indexstate->iss_RelationDesc = index_open(node->indexid, lockmode);
+ indexstate->iss_RelationDesc = ExecOpenScanIndexRelation(estate,
+ node->indexid,
+ lockmode);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return indexstate;
/*
* Initialize index-specific scan state
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index eb7b6e52be..369c904577 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -475,6 +475,8 @@ ExecInitLimit(Limit *node, EState *estate, int eflags)
*/
outerPlan = outerPlan(node);
outerPlanState(limitstate) = ExecInitNode(outerPlan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return limitstate;
/*
* initialize child expressions
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 0d3489195b..9077858413 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -322,6 +322,8 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
* then initialize outer plan
*/
outerPlanState(lrstate) = ExecInitNode(outerPlan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return lrstate;
/* node returns unmodified slots from the outer plan */
lrstate->ps.resultopsset = true;
diff --git a/src/backend/executor/nodeMaterial.c b/src/backend/executor/nodeMaterial.c
index 883e3f3933..972962d44d 100644
--- a/src/backend/executor/nodeMaterial.c
+++ b/src/backend/executor/nodeMaterial.c
@@ -214,6 +214,8 @@ ExecInitMaterial(Material *node, EState *estate, int eflags)
outerPlan = outerPlan(node);
outerPlanState(matstate) = ExecInitNode(outerPlan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return matstate;
/*
* Initialize result type and slot. No need to initialize projection info
diff --git a/src/backend/executor/nodeMemoize.c b/src/backend/executor/nodeMemoize.c
index 690dee1daa..6aaab743b5 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -973,6 +973,8 @@ ExecInitMemoize(Memoize *node, EState *estate, int eflags)
outerNode = outerPlan(node);
outerPlanState(mstate) = ExecInitNode(outerNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mstate;
/*
* Initialize return slot and type. No need to initialize projection info
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 3236444cf1..a82f0a71a0 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -95,6 +95,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
list_length(node->mergeplans),
node->part_prune_info,
&validsubplans);
+ if (!ExecPlanStillValid(estate))
+ return mergestate;
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -120,7 +122,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ms_prune_state = NULL;
}
- mergeplanstates = (PlanState **) palloc(nplans * sizeof(PlanState *));
+ mergeplanstates = (PlanState **) palloc0(nplans * sizeof(PlanState *));
mergestate->mergeplans = mergeplanstates;
mergestate->ms_nplans = nplans;
@@ -151,6 +153,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
Plan *initNode = (Plan *) list_nth(node->mergeplans, i);
mergeplanstates[j++] = ExecInitNode(initNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mergestate;
}
mergestate->ps.ps_ProjInfo = NULL;
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 926e631d88..53cb1ff207 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -1490,11 +1490,15 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
mergestate->mj_SkipMarkRestore = node->skip_mark_restore;
outerPlanState(mergestate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mergestate;
outerDesc = ExecGetResultType(outerPlanState(mergestate));
innerPlanState(mergestate) = ExecInitNode(innerPlan(node), estate,
mergestate->mj_SkipMarkRestore ?
eflags :
(eflags | EXEC_FLAG_MARK));
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mergestate;
innerDesc = ExecGetResultType(innerPlanState(mergestate));
/*
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 9e56f9c36c..8debfbd3ec 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4277,6 +4277,13 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
linitial_int(node->resultRelations));
}
+ /*
+ * ExecInitResultRelation() may have returned without initializing
+ * rootResultRelInfo if the plan got invalidated, so check.
+ */
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mtstate;
+
/* set up epqstate with dummy subplan data for the moment */
EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL,
node->epqParam, node->resultRelations);
@@ -4309,6 +4316,10 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
{
ExecInitResultRelation(estate, resultRelInfo, resultRelation);
+ /* See the comment above. */
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mtstate;
+
/*
* For child result relations, store the root result relation
* pointer. We do so for the convenience of places that want to
@@ -4335,6 +4346,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* Now we may initialize the subplan.
*/
outerPlanState(mtstate) = ExecInitNode(subplan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mtstate;
/*
* Do additional per-result-relation initialization.
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index 01f3d56a3b..34eafbb6e0 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -294,11 +294,15 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
* values.
*/
outerPlanState(nlstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return nlstate;
if (node->nestParams == NIL)
eflags |= EXEC_FLAG_REWIND;
else
eflags &= ~EXEC_FLAG_REWIND;
innerPlanState(nlstate) = ExecInitNode(innerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return nlstate;
/*
* Initialize result slot, type and projection.
diff --git a/src/backend/executor/nodeProjectSet.c b/src/backend/executor/nodeProjectSet.c
index ca9a5e2ed2..f834499479 100644
--- a/src/backend/executor/nodeProjectSet.c
+++ b/src/backend/executor/nodeProjectSet.c
@@ -254,6 +254,8 @@ ExecInitProjectSet(ProjectSet *node, EState *estate, int eflags)
* initialize child nodes
*/
outerPlanState(state) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return state;
/*
* we don't use inner plan
diff --git a/src/backend/executor/nodeRecursiveunion.c b/src/backend/executor/nodeRecursiveunion.c
index 7680142c7b..5dd3285c41 100644
--- a/src/backend/executor/nodeRecursiveunion.c
+++ b/src/backend/executor/nodeRecursiveunion.c
@@ -244,7 +244,11 @@ ExecInitRecursiveUnion(RecursiveUnion *node, EState *estate, int eflags)
* initialize child nodes
*/
outerPlanState(rustate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return rustate;
innerPlanState(rustate) = ExecInitNode(innerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return rustate;
/*
* If hashing, precompute fmgr lookup data for inner loop, and create the
diff --git a/src/backend/executor/nodeResult.c b/src/backend/executor/nodeResult.c
index e3cfc9b772..7d7c2aa786 100644
--- a/src/backend/executor/nodeResult.c
+++ b/src/backend/executor/nodeResult.c
@@ -207,6 +207,8 @@ ExecInitResult(Result *node, EState *estate, int eflags)
* initialize child nodes
*/
outerPlanState(resstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return resstate;
/*
* we don't use inner plan
diff --git a/src/backend/executor/nodeSamplescan.c b/src/backend/executor/nodeSamplescan.c
index 6ab91001bc..3afdaeecd7 100644
--- a/src/backend/executor/nodeSamplescan.c
+++ b/src/backend/executor/nodeSamplescan.c
@@ -121,6 +121,9 @@ ExecInitSampleScan(SampleScan *node, EState *estate, int eflags)
ExecOpenScanRelation(estate,
node->scan.scanrelid,
eflags);
+ if (unlikely(scanstate->ss.ss_currentRelation == NULL ||
+ !ExecPlanStillValid(estate)))
+ return scanstate;
/* we won't set up the HeapScanDesc till later */
scanstate->ss.ss_currentScanDesc = NULL;
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index b052775e5b..f7fb64a4a2 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -153,6 +153,9 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
ExecOpenScanRelation(estate,
node->scan.scanrelid,
eflags);
+ if (unlikely(scanstate->ss.ss_currentRelation == NULL ||
+ !ExecPlanStillValid(estate)))
+ return scanstate;
/* and create slot with the appropriate rowtype */
ExecInitScanTupleSlot(estate, &scanstate->ss,
diff --git a/src/backend/executor/nodeSetOp.c b/src/backend/executor/nodeSetOp.c
index fe34b2134f..2231d8b82f 100644
--- a/src/backend/executor/nodeSetOp.c
+++ b/src/backend/executor/nodeSetOp.c
@@ -528,6 +528,8 @@ ExecInitSetOp(SetOp *node, EState *estate, int eflags)
if (node->strategy == SETOP_HASHED)
eflags &= ~EXEC_FLAG_REWIND;
outerPlanState(setopstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return setopstate;
outerDesc = ExecGetResultType(outerPlanState(setopstate));
/*
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index af852464d0..fb76e4c01b 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -263,6 +263,8 @@ ExecInitSort(Sort *node, EState *estate, int eflags)
eflags &= ~(EXEC_FLAG_REWIND | EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK);
outerPlanState(sortstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return sortstate;
/*
* Initialize scan slot and type.
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 0b2612183a..b5b538fa91 100644
--- a/src/backend/executor/nodeSubqueryscan.c
+++ b/src/backend/executor/nodeSubqueryscan.c
@@ -124,6 +124,8 @@ ExecInitSubqueryScan(SubqueryScan *node, EState *estate, int eflags)
* initialize subquery
*/
subquerystate->subplan = ExecInitNode(node->subplan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return subquerystate;
/*
* Initialize scan slot and type (needed by ExecAssignScanProjectionInfo)
diff --git a/src/backend/executor/nodeTidrangescan.c b/src/backend/executor/nodeTidrangescan.c
index 702ee884d2..a76836d021 100644
--- a/src/backend/executor/nodeTidrangescan.c
+++ b/src/backend/executor/nodeTidrangescan.c
@@ -377,6 +377,8 @@ ExecInitTidRangeScan(TidRangeScan *node, EState *estate, int eflags)
* open the scan relation
*/
currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+ if (unlikely(currentRelation == NULL || !ExecPlanStillValid(estate)))
+ return tidrangestate;
tidrangestate->ss.ss_currentRelation = currentRelation;
tidrangestate->ss.ss_currentScanDesc = NULL; /* no table scan here */
diff --git a/src/backend/executor/nodeTidscan.c b/src/backend/executor/nodeTidscan.c
index f375951699..088babf572 100644
--- a/src/backend/executor/nodeTidscan.c
+++ b/src/backend/executor/nodeTidscan.c
@@ -522,6 +522,8 @@ ExecInitTidScan(TidScan *node, EState *estate, int eflags)
* open the scan relation
*/
currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+ if (unlikely(currentRelation == NULL || !ExecPlanStillValid(estate)))
+ return tidstate;
tidstate->ss.ss_currentRelation = currentRelation;
tidstate->ss.ss_currentScanDesc = NULL; /* no heap scan here */
diff --git a/src/backend/executor/nodeUnique.c b/src/backend/executor/nodeUnique.c
index b82d0e9ad5..cb46b2d5d0 100644
--- a/src/backend/executor/nodeUnique.c
+++ b/src/backend/executor/nodeUnique.c
@@ -135,6 +135,8 @@ ExecInitUnique(Unique *node, EState *estate, int eflags)
* then initialize outer plan
*/
outerPlanState(uniquestate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return uniquestate;
/*
* Initialize result slot and type. Unique nodes do no projections, so
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 561d7e731d..1b96f51fe8 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2464,6 +2464,8 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
*/
outerPlan = outerPlan(node);
outerPlanState(winstate) = ExecInitNode(outerPlan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return winstate;
/*
* initialize source tuple type (which is also the tuple type that we'll
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 902793b02b..b754827013 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -70,7 +70,8 @@ static int _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
-static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount);
+static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index);
static void _SPI_error_callback(void *arg);
@@ -1682,7 +1683,8 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
- cplan);
+ cplan,
+ plansource);
/*
* Set up options for portal. Default SCROLL type is chosen the same way
@@ -2494,6 +2496,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
ListCell *lc2;
+ int i = 0;
spicallbackarg.query = plansource->query_string;
@@ -2691,8 +2694,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
options->params,
_SPI_current->queryEnv,
0);
- res = _SPI_pquery(qdesc, fire_triggers,
- canSetTag ? options->tcount : 0);
+
+ res = _SPI_pquery(qdesc, fire_triggers, canSetTag ? options->tcount : 0,
+ plansource, i);
FreeQueryDesc(qdesc);
}
else
@@ -2789,6 +2793,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
my_res = res;
goto fail;
}
+
+ i++;
}
/* Done with this plan, so release refcount */
@@ -2866,7 +2872,8 @@ _SPI_convert_params(int nargs, Oid *argtypes,
}
static int
-_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
+_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index)
{
int operation = queryDesc->operation;
int eflags;
@@ -2922,7 +2929,7 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
else
eflags = EXEC_FLAG_SKIP_TRIGGERS;
- ExecutorStart(queryDesc, eflags);
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
ExecutorRun(queryDesc, ForwardScanDirection, tcount, true);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8bc6bea113..ccbc27b575 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1237,6 +1237,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ NULL,
NULL);
/*
@@ -2027,7 +2028,8 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
- cplan);
+ cplan,
+ psrc);
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 6e8f6b1b8f..d9ae60579b 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -19,6 +19,7 @@
#include "access/xact.h"
#include "commands/prepare.h"
+#include "executor/execdesc.h"
#include "executor/tstoreReceiver.h"
#include "miscadmin.h"
#include "pg_trace.h"
@@ -37,6 +38,8 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -80,6 +83,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
+ qd->release_cplan = false;
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -114,6 +118,13 @@ FreeQueryDesc(QueryDesc *qdesc)
UnregisterSnapshot(qdesc->snapshot);
UnregisterSnapshot(qdesc->crosscheck_snapshot);
+ /*
+ * Release CachedPlan if requested. The CachedPlan is not associated with
+ * a ResourceOwner when release_cplan is true; see ExecutorStartExt().
+ */
+ if (qdesc->release_cplan)
+ ReleaseCachedPlan(qdesc->cplan, NULL);
+
/* Only the QueryDesc itself need be freed */
pfree(qdesc);
}
@@ -126,6 +137,8 @@ FreeQueryDesc(QueryDesc *qdesc)
*
* plan: the plan tree for the query
* cplan: CachedPlan supplying the plan
+ * plansource: CachedPlanSource supplying the cplan
+ * query_index: index of the query in plansource->query_list
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -139,6 +152,8 @@ FreeQueryDesc(QueryDesc *qdesc)
static void
ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -157,7 +172,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Call ExecutorStart to prepare the plan for execution
*/
- ExecutorStart(queryDesc, 0);
+ ExecutorStartExt(queryDesc, 0, plansource, query_index);
/*
* Run the plan to completion.
@@ -518,9 +533,12 @@ PortalStart(Portal portal, ParamListInfo params,
myeflags = eflags;
/*
- * Call ExecutorStart to prepare the plan for execution
+ * Call ExecutorStartExt() to prepare the plan for execution. If
+ * the portal is using a cached plan, it may get invalidated
+ * during plan initialization, in which case a new one is
+ * created and saved in the QueryDesc.
*/
- ExecutorStart(queryDesc, myeflags);
+ ExecutorStartExt(queryDesc, myeflags, portal->plansource, 0);
/*
* This tells PortalCleanup to shut down the executor
@@ -1201,6 +1219,7 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
+ int i = 0;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1283,6 +1302,8 @@ PortalRunMulti(Portal portal,
/* statement can set tag string */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1293,6 +1314,8 @@ PortalRunMulti(Portal portal,
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1357,6 +1380,8 @@ PortalRunMulti(Portal portal,
*/
if (lnext(portal->stmts, stmtlist_item) != NULL)
CommandCounterIncrement();
+
+ i++;
}
/* Pop the snapshot if we pushed one. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 6d2e385fe8..6ae05175c6 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -1279,6 +1279,56 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
return plan;
}
+/*
+ * Check the validity of, and replan, only the query at the given 0-based index
+ * in the provided CachedPlanSource.
+ *
+ * Returns a CachedPlan for that specific query. The CachedPlan is not saved in
+ * the CachedPlanSource, so it is the caller's responsibility to free it by
+ * eventually calling ReleaseCachedPlan() on it.
+ */
+CachedPlan *
+GetSingleCachedPlan(CachedPlanSource *plansource, int query_index,
+ ParamListInfo boundParams, QueryEnvironment *queryEnv)
+{
+ List *query_list = plansource->query_list;
+ List *query_list_new;
+ CachedPlan *plan = plansource->gplan,
+ *newplan;
+ double generic_cost = plansource->generic_cost;
+ double total_custom_cost = plansource->total_custom_cost;
+
+ if (plan == NULL || plan->is_valid)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context");
+
+ /*
+ * Create a new plan for the nth query after revalidating it.
+ *
+ * Temporarily reset gplan to ensure that the CachedPlan that it's pointing
+ * to is not released, because the caller might still need it.
+ */
+ query_list_new = list_make1(list_nth(plansource->query_list, query_index));
+ plansource->query_list = query_list_new;
+ plansource->gplan = NULL;
+ newplan = GetCachedPlan(plansource, boundParams, NULL, queryEnv);
+ plansource->gplan = plan;
+
+ /* Restore original query_list. */
+ plansource->query_list = query_list;
+ list_free(query_list_new);
+
+ /*
+ * Restore the original plan costs. The values after the GetCachedPlan()
+ * call represent the cost of only the nth query, whereas the original
+ * values represent the cumulative costs for all queries in
+ * plansource->query_list.
+ */
+ plansource->generic_cost = generic_cost;
+ plansource->total_custom_cost = total_custom_cost;
+
+ return newplan;
+}
+
/*
* ReleaseCachedPlan: release active use of a cached plan.
*
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 4a24613537..bf70fd4ce7 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,7 +284,8 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan)
+ CachedPlan *cplan,
+ CachedPlanSource *plansource)
{
Assert(PortalIsValid(portal));
Assert(portal->status == PORTAL_NEW);
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->commandTag = commandTag;
portal->stmts = stmts;
portal->cplan = cplan;
+ portal->plansource = plansource;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index bf326eeb70..652e1afbf7 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -102,6 +102,7 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ParamListInfo params, QueryEnvironment *queryEnv);
extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int plan_index,
IntoClause *into, ExplainState *es,
const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index 8a5a9fe642..db21561c8c 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -258,6 +258,7 @@ extern void ExecASTruncateTriggers(EState *estate,
extern void AfterTriggerBeginXact(void);
extern void AfterTriggerBeginQuery(void);
extern void AfterTriggerEndQuery(EState *estate);
+extern void AfterTriggerAbortQuery(void);
extern void AfterTriggerFireDeferred(void);
extern void AfterTriggerEndXact(bool isCommit);
extern void AfterTriggerBeginSubXact(void);
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0e7245435d..c6ad8fece7 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -36,6 +36,7 @@ typedef struct QueryDesc
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
+ bool release_cplan; /* Should FreeQueryDesc() release cplan? */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 69c3ebff00..ce2447a8cf 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -19,6 +19,7 @@
#include "nodes/lockoptions.h"
#include "nodes/parsenodes.h"
#include "utils/memutils.h"
+#include "utils/plancache.h"
/*
@@ -198,6 +199,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
* prototypes from functions in execMain.c
*/
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern void ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource, int query_index);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count, bool execute_once);
@@ -261,6 +264,20 @@ extern void ExecEndNode(PlanState *node);
extern void ExecShutdownNode(PlanState *node);
extern void ExecSetTupleBound(int64 tuples_needed, PlanState *child_node);
+/*
+ * Is the CachedPlan in es_cachedplan still valid?
+ *
+ * Called at various points during ExecutorStart() because invalidation
+ * messages that affect the plan might be received after locks have been
+ * taken on runtime-prunable relations. The caller should take appropriate
+ * action if the plan has become invalid.
+ */
+static inline bool
+ExecPlanStillValid(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? true :
+ CachedPlanStillValid(estate->es_cachedplan);
+}
/* ----------------------------------------------------------------
* ExecProcNode
@@ -589,6 +606,7 @@ extern void ExecCreateScanSlotFromOuterPlan(EState *estate,
extern bool ExecRelationIsTargetRelation(EState *estate, Index scanrelid);
extern Relation ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags);
+extern Relation ExecOpenScanIndexRelation(EState *estate, Oid indexid, int lockmode);
extern void ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos);
extern void ExecCloseRangeTableRelations(EState *estate);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index ee089505a0..2a8e5bd784 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -680,6 +680,7 @@ typedef struct EState
int es_top_eflags; /* eflags passed to ExecutorStart */
int es_instrument; /* OR of InstrumentOption flags */
bool es_finished; /* true when ExecutorFinish is done */
+ bool es_aborted; /* true when execution was aborted */
List *es_exprcontexts; /* List of ExprContexts within EState */
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0b5ee007ca..f3ecbd279b 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -224,6 +224,11 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
QueryEnvironment *queryEnv);
+extern CachedPlan *GetSingleCachedPlan(CachedPlanSource *plansource,
+ int query_index,
+ ParamListInfo boundParams,
+ QueryEnvironment *queryEnv);
+
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
@@ -245,4 +250,17 @@ CachedPlanRequiresLocking(CachedPlan *cplan)
return cplan->is_generic;
}
+/*
+ * CachedPlanStillValid
+ * Returns whether a cached generic plan is still valid.
+ *
+ * Invoked by the executor to check if the plan has not been invalidated after
+ * taking locks during the initialization of the plan.
+ */
+static inline bool
+CachedPlanStillValid(CachedPlan *cplan)
+{
+ return cplan->is_valid;
+}
+
#endif /* PLANCACHE_H */
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 29f49829f2..58c3828d2c 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ CachedPlanSource *plansource; /* CachedPlanSource, for cplan */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -241,7 +242,8 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan);
+ CachedPlan *cplan,
+ CachedPlanSource *plansource);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
diff --git a/src/test/modules/delay_execution/Makefile b/src/test/modules/delay_execution/Makefile
index 70f24e846d..3eeb097fde 100644
--- a/src/test/modules/delay_execution/Makefile
+++ b/src/test/modules/delay_execution/Makefile
@@ -8,7 +8,8 @@ OBJS = \
delay_execution.o
ISOLATION = partition-addition \
- partition-removal-1
+ partition-removal-1 \
+ cached-plan-inval
ifdef USE_PGXS
PG_CONFIG = pg_config
diff --git a/src/test/modules/delay_execution/delay_execution.c b/src/test/modules/delay_execution/delay_execution.c
index 155c8a8d55..0b5f317cd1 100644
--- a/src/test/modules/delay_execution/delay_execution.c
+++ b/src/test/modules/delay_execution/delay_execution.c
@@ -1,14 +1,18 @@
/*-------------------------------------------------------------------------
*
* delay_execution.c
- * Test module to allow delay between parsing and execution of a query.
+ * Test module to introduce delay at various points during execution of a
+ * query to test that execution proceeds safely in light of concurrent
+ * changes.
*
* The delay is implemented by taking and immediately releasing a specified
* advisory lock. If another process has previously taken that lock, the
* current process will be blocked until the lock is released; otherwise,
* there's no effect. This allows an isolationtester script to reliably
- * test behaviors where some specified action happens in another backend
- * between parsing and execution of any desired query.
+ * test behaviors where some specified action happens in another backend in
+ * a couple of cases: 1) between parsing and execution of any desired query
+ * when using the planner_hook, 2) between RevalidateCachedQuery() and
+ * ExecutorStart() when using the ExecutorStart_hook.
*
* Copyright (c) 2020-2024, PostgreSQL Global Development Group
*
@@ -22,6 +26,7 @@
#include <limits.h>
+#include "executor/executor.h"
#include "optimizer/planner.h"
#include "utils/builtins.h"
#include "utils/guc.h"
@@ -32,9 +37,11 @@ PG_MODULE_MAGIC;
/* GUC: advisory lock ID to use. Zero disables the feature. */
static int post_planning_lock_id = 0;
+static int executor_start_lock_id = 0;
-/* Save previous planner hook user to be a good citizen */
+/* Save previous hook users to be a good citizen */
static planner_hook_type prev_planner_hook = NULL;
+static ExecutorStart_hook_type prev_ExecutorStart_hook = NULL;
/* planner_hook function to provide the desired delay */
@@ -70,11 +77,41 @@ delay_execution_planner(Query *parse, const char *query_string,
return result;
}
+/* ExecutorStart_hook function to provide the desired delay */
+static void
+delay_execution_ExecutorStart(QueryDesc *queryDesc, int eflags)
+{
+ /* If enabled, delay by taking and releasing the specified lock */
+ if (executor_start_lock_id != 0)
+ {
+ DirectFunctionCall1(pg_advisory_lock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+ DirectFunctionCall1(pg_advisory_unlock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+
+ /*
+ * Ensure that we notice any pending invalidations, since the advisory
+ * lock functions don't do this.
+ */
+ AcceptInvalidationMessages();
+ }
+
+ /* Now start the executor, possibly via a previous hook user */
+ if (prev_ExecutorStart_hook)
+ prev_ExecutorStart_hook(queryDesc, eflags);
+ else
+ standard_ExecutorStart(queryDesc, eflags);
+
+ if (executor_start_lock_id != 0)
+ elog(NOTICE, "Finished ExecutorStart(): CachedPlan is %s",
+ CachedPlanStillValid(queryDesc->cplan) ? "valid" : "not valid");
+}
+
/* Module load function */
void
_PG_init(void)
{
- /* Set up the GUC to control which lock is used */
+ /* Set up GUCs to control which lock is used */
DefineCustomIntVariable("delay_execution.post_planning_lock_id",
"Sets the advisory lock ID to be locked/unlocked after planning.",
"Zero disables the delay.",
@@ -86,10 +123,22 @@ _PG_init(void)
NULL,
NULL,
NULL);
-
+ DefineCustomIntVariable("delay_execution.executor_start_lock_id",
+ "Sets the advisory lock ID to be locked/unlocked before starting execution.",
+ "Zero disables the delay.",
+ &executor_start_lock_id,
+ 0,
+ 0, INT_MAX,
+ PGC_USERSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
MarkGUCPrefixReserved("delay_execution");
- /* Install our hook */
+ /* Install our hooks. */
prev_planner_hook = planner_hook;
planner_hook = delay_execution_planner;
+ prev_ExecutorStart_hook = ExecutorStart_hook;
+ ExecutorStart_hook = delay_execution_ExecutorStart;
}
diff --git a/src/test/modules/delay_execution/expected/cached-plan-inval.out b/src/test/modules/delay_execution/expected/cached-plan-inval.out
new file mode 100644
index 0000000000..e8efb6d9d9
--- /dev/null
+++ b/src/test/modules/delay_execution/expected/cached-plan-inval.out
@@ -0,0 +1,175 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1prep s2lock s1exec s2dropi s2unlock
+step s1prep: SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1);
+QUERY PLAN
+------------------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = $1)
+(7 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+-------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Filter: (a = $1)
+(5 rows)
+
+
+starting permutation: s1prep2 s2lock s1exec2 s2dropi s2unlock
+step s1prep2: SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(11 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec2: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec2: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(6 rows)
+
+
+starting permutation: s1prep3 s2lock s1exec3 s2dropi s2unlock
+step s1prep3: SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ Update on foo3 foo
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(27 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec3: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec3: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ Update on foo3 foo
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(17 rows)
+
diff --git a/src/test/modules/delay_execution/meson.build b/src/test/modules/delay_execution/meson.build
index 41f3ac0b89..5a70b183d0 100644
--- a/src/test/modules/delay_execution/meson.build
+++ b/src/test/modules/delay_execution/meson.build
@@ -24,6 +24,7 @@ tests += {
'specs': [
'partition-addition',
'partition-removal-1',
+ 'cached-plan-inval',
],
},
}
diff --git a/src/test/modules/delay_execution/specs/cached-plan-inval.spec b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
new file mode 100644
index 0000000000..5b1f72b4a8
--- /dev/null
+++ b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
@@ -0,0 +1,65 @@
+# Test to check that invalidation of cached generic plans during ExecutorStart
+# correctly triggers replanning and re-execution.
+
+setup
+{
+ CREATE TABLE foo (a int, b text) PARTITION BY LIST(a);
+ CREATE TABLE foo12 PARTITION OF foo FOR VALUES IN (1, 2) PARTITION BY LIST (a);
+ CREATE TABLE foo12_1 PARTITION OF foo12 FOR VALUES IN (1);
+ CREATE TABLE foo12_2 PARTITION OF foo12 FOR VALUES IN (2);
+ CREATE INDEX foo12_1_a ON foo12_1 (a);
+ CREATE TABLE foo3 PARTITION OF foo FOR VALUES IN (3);
+ CREATE VIEW foov AS SELECT * FROM foo;
+ CREATE FUNCTION one () RETURNS int AS $$ BEGIN RETURN 1; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE FUNCTION two () RETURNS int AS $$ BEGIN RETURN 2; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE RULE update_foo AS ON UPDATE TO foo DO ALSO SELECT 1;
+}
+
+teardown
+{
+ DROP VIEW foov;
+ DROP RULE update_foo ON foo;
+ DROP TABLE foo;
+ DROP FUNCTION one(), two();
+}
+
+session "s1"
+# Append with run-time pruning
+step "s1prep" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+
+# Another case with Append with run-time pruning
+step "s1prep2" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+
+# Case with a rule adding another query
+step "s1prep3" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+# Executes a generic plan
+step "s1exec" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+step "s1exec2" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+step "s1exec3" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+session "s2"
+step "s2lock" { SELECT pg_advisory_lock(12345); }
+step "s2unlock" { SELECT pg_advisory_unlock(12345); }
+step "s2dropi" { DROP INDEX foo12_1_a; }
+
+# While "s1exec", etc. wait to acquire the advisory lock, "s2dropi" is able to
+# drop the index being used in the cached plan. When "s1exec" is then
+# unblocked and initializes the cached plan for execution, it detects the
+# concurrent index drop and causes the cached plan to be discarded and
+# recreated without the index.
+permutation "s1prep" "s2lock" "s1exec" "s2dropi" "s2unlock"
+permutation "s1prep2" "s2lock" "s1exec2" "s2dropi" "s2unlock"
+permutation "s1prep3" "s2lock" "s1exec3" "s2dropi" "s2unlock"
--
2.43.0
[application/octet-stream] v51-0001-Defer-locking-of-runtime-prunable-relations-to-e.patch (31.1K, 3-v51-0001-Defer-locking-of-runtime-prunable-relations-to-e.patch)
From d766e737ade779de3da4addbf71a05bb2a74ab75 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 7 Aug 2024 18:25:51 +0900
Subject: [PATCH v51 1/3] Defer locking of runtime-prunable relations to executor

When preparing a cached plan for execution, plancache.c locks the
relations contained in the plan's range table to ensure it is safe for
execution. However, this simplistic approach, implemented in
AcquireExecutorLocks(), results in unnecessarily locking relations that
might be pruned during "initial" runtime pruning.

To optimize this, the locking is now deferred for relations that are
subject to "initial" runtime pruning. The planner now provides a set
of "unprunable" relations, available through the new
PlannedStmt.unprunableRelids field. AcquireExecutorLocks() will now
only lock those relations.

PlannedStmt.unprunableRelids is populated by subtracting the set of
initially prunable relids from the set of all RT indexes. The prunable
relids set is constructed by examining all PartitionPruneInfos during
set_plan_refs() and storing the RT indexes of partitions subject to
"initial" pruning steps. While at it, some duplicated code in
set_append_references() and set_mergeappend_references() that
constructs the prunable relids set has been refactored into a common
function.
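For readers following along, here is a minimal standalone sketch of the set subtraction described above. It is not the patch's code: RelidMask, relid_range(), compute_unprunable() and relid_is_member() are hypothetical names, and a 64-bit mask stands in for PostgreSQL's Bitmapset, so it only handles range tables with fewer than 64 entries.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for a Bitmapset of RT indexes: bit i set means
 * RT index i is a member.  RT indexes start at 1, so bit 0 is never used.
 */
typedef uint64_t RelidMask;

/* Build the set {lower, ..., upper}; empty if upper < lower. */
static RelidMask
relid_range(int lower, int upper)
{
    RelidMask mask = 0;

    assert(lower >= 1 && upper < 64);
    for (int i = lower; i <= upper; i++)
        mask |= (RelidMask) 1 << i;
    return mask;
}

/*
 * unprunable = (all RT indexes 1..rtable_length) minus prunable.
 * This mirrors the subtraction the commit message describes.
 */
static RelidMask
compute_unprunable(int rtable_length, RelidMask prunable)
{
    return relid_range(1, rtable_length) & ~prunable;
}

/* Membership test for a single RT index. */
static int
relid_is_member(RelidMask mask, int rtindex)
{
    return (int) ((mask >> rtindex) & 1);
}
```

With five range table entries and RT indexes 3 and 4 marked prunable, compute_unprunable() leaves indexes 1, 2 and 5, which in the patch's terms are the only ones AcquireExecutorLocks() would lock up front.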

To enable the executor to determine whether the plan tree it's
executing is a cached one, the CachedPlan is now made available via
the QueryDesc. The executor can call CachedPlanRequiresLocking(),
which returns true if the CachedPlan is a reusable generic plan that
might contain relations needing to be locked. If so, the executor
will lock any relation that is not in PlannedStmt.unprunableRelids.
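The decision described in this paragraph can be sketched as a small standalone function. This is only an illustration under stated assumptions: ToyCachedPlan, ToyEState and toy_should_lock_relation() are hypothetical names, the is_generic flag is assumed to be what "reusable generic plan" boils down to, and a 64-bit mask again stands in for the unprunableRelids Bitmapset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down view of a cached plan. */
typedef struct ToyCachedPlan
{
    bool        is_generic;     /* assumed: true for a reusable generic plan */
} ToyCachedPlan;

/* Hypothetical, trimmed-down view of the executor state. */
typedef struct ToyEState
{
    const ToyCachedPlan *cachedplan;    /* NULL if not running a cached plan */
    uint64_t    unprunable_relids;      /* bit i set: RT index i was locked
                                         * before execution started */
} ToyEState;

/*
 * Return true if the executor itself must lock the relation at rtindex:
 * the plan is a cached generic plan and the relation was not already
 * locked up front as an unprunable relation.
 */
static bool
toy_should_lock_relation(const ToyEState *estate, int rtindex)
{
    assert(rtindex >= 1 && rtindex < 64);

    if (estate->cachedplan == NULL)
        return false;           /* not a cached plan: locks already held */
    if ((estate->unprunable_relids >> rtindex) & 1)
        return false;           /* unprunable: locked before execution */
    return estate->cachedplan->is_generic;
}
```

The point of the ordering is that the cheap "is this a cached plan at all" test comes first, so the common non-cached path pays nothing beyond a NULL check.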

Finally, an Assert has been added in ExecCheckPermissions() to ensure
that all relations whose permissions are checked have been properly
locked. This helps catch any accidental omission of relations from the
unprunableRelids set that should have their permissions checked.

This deferment introduces a window in which prunable relations may be
altered by concurrent DDL, potentially causing the plan to become
invalid. As a result, the executor might attempt to run an invalid plan,
leading to errors such as being unable to locate a partition-only index
during ExecInitIndexScan(). Future commits will introduce changes to
ready the executor to check plan validity during ExecutorStart() and
retry with a newly created plan if the original one becomes invalid
after taking deferred locks.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 ++--
src/backend/commands/extension.c | 1 +
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 3 +-
src/backend/executor/execMain.c | 18 ++++++++
src/backend/executor/execParallel.c | 9 +++-
src/backend/executor/execUtils.c | 30 +++++++++++++-
src/backend/executor/functions.c | 1 +
src/backend/executor/spi.c | 1 +
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 62 +++++++++++++++-------------
src/backend/partitioning/partprune.c | 24 ++++++++++-
src/backend/storage/lmgr/lmgr.c | 1 +
src/backend/tcop/pquery.c | 10 ++++-
src/backend/utils/cache/lsyscache.c | 1 -
src/backend/utils/cache/plancache.c | 25 +++++++----
src/include/commands/explain.h | 5 ++-
src/include/executor/execdesc.h | 2 +
src/include/nodes/execnodes.h | 2 +
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 11 +++++
src/include/utils/plancache.h | 10 +++++
24 files changed, 186 insertions(+), 51 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 91de442f43..db976f928a 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -552,7 +552,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 0b629b1f79..57a3375cad 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -324,7 +324,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 11df4a04d4..a83ea07db1 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -507,7 +507,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -615,7 +615,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage,
@@ -671,7 +672,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, cplan, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1643c8c69a..3f7f4306fe 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -798,6 +798,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 91f0fd6ea3..a7a79583ec 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -438,7 +438,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 07257d4db9..311b9ebd5b 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -655,7 +655,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
+ ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 29e186fa73..271f9d93fc 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -52,6 +52,7 @@
#include "miscadmin.h"
#include "parser/parse_relation.h"
#include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
#include "tcop/utility.h"
#include "utils/acl.h"
#include "utils/backend_status.h"
@@ -597,6 +598,21 @@ ExecCheckPermissions(List *rangeTable, List *rteperminfos,
(rte->rtekind == RTE_SUBQUERY &&
rte->relkind == RELKIND_VIEW));
+ /*
+ * Ensure that we have at least an AccessShareLock on relations
+ * whose permissions need to be checked.
+ *
+ * Skip this check in a parallel worker because locks won't be
+ * taken until ExecInitNode() performs plan initialization.
+ *
+ * XXX: ExecCheckPermissions() in a parallel worker may be
+ * redundant with the checks done in the leader process, so this
+ * should be reviewed to ensure it’s necessary.
+ */
+ Assert(IsParallelWorker() ||
+ CheckRelationOidLockedByMe(rte->relid, AccessShareLock,
+ true));
+
(void) getRTEPermissionInfo(rteperminfos, rte);
/* Many-to-one mapping not allowed */
Assert(!bms_is_member(rte->perminfoindex, indexset));
@@ -829,6 +845,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ CachedPlan *cachedplan = queryDesc->cplan;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -848,6 +865,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_cachedplan = cachedplan;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index bfb3419efb..03b48e12b4 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1256,8 +1256,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
- /* Create a QueryDesc for the query. */
+ /*
+ * Create a QueryDesc for the query. We pass NULL for cachedplan, because
+ * we don't have a pointer to the CachedPlan in the leader's process. It's
+ * fine because the only reason the executor needs to see it is to decide
+ * if it should take locks on certain relations, but parallel workers
+ * always take locks anyway.
+ */
return CreateQueryDesc(pstmt,
+ NULL,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 5737f9f4eb..6dfd5a26b7 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -752,6 +752,26 @@ ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos)
estate->es_rowmarks = NULL;
}
+/*
+ * ExecShouldLockRelation
+ * Determine if the relation should be locked.
+ *
+ * The relation does not need to be locked if we are not running a cached
+ * plan or if it has already been locked as an unprunable relation.
+ *
+ * Lock the relation if it might be one of the prunable relations mentioned
+ * in the cached plan.
+ */
+static bool
+ExecShouldLockRelation(EState *estate, Index rtindex)
+{
+ if (estate->es_cachedplan == NULL ||
+ bms_is_member(rtindex, estate->es_plannedstmt->unprunableRelids))
+ return false;
+
+ return CachedPlanRequiresLocking(estate->es_cachedplan);
+}
+
/*
* ExecGetRangeTableRelation
* Open the Relation for a range table entry, if not already done
@@ -773,7 +793,7 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
Assert(rte->rtekind == RTE_RELATION);
- if (!IsParallelWorker())
+ if (!IsParallelWorker() && !ExecShouldLockRelation(estate, rti))
{
/*
* In a normal query, we should already have the appropriate lock,
@@ -789,9 +809,17 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
else
{
/*
+ * Lock relation either if we are a parallel worker or if
+ * ExecShouldLockRelation() says we should.
+ *
* If we are a parallel worker, we need to obtain our own local
* lock on the relation. This ensures sane behavior in case the
* parent process exits before we do.
+ *
+ * ExecShouldLockRelation() would return true if the RT index is
+ * that of a prunable relation and we're running a cached generic
+ * plan. AcquireExecutorLocks() of plancache.c would have locked
+ * only the unprunable relations in the plan tree.
*/
rel = table_open(rte->relid, rte->rellockmode);
}
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 692854e2b3..6f6f45e0ad 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -840,6 +840,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
dest = None_Receiver;
es->qd = CreateQueryDesc(es->stmt,
+ NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index d6516b1bca..902793b02b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2684,6 +2684,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
snap = InvalidSnapshot;
qdesc = CreateQueryDesc(stmt,
+ cplan,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b5827d3980..cb9b6f0147 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -546,6 +546,8 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->rtable = glob->finalrtable;
+ result->unprunableRelids = bms_difference(bms_add_range(NULL, 1, list_length(result->rtable)),
+ glob->prunableRelids);
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 7aed84584c..b6be0e5730 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -154,6 +154,9 @@ static Plan *set_append_references(PlannerInfo *root,
static Plan *set_mergeappend_references(PlannerInfo *root,
MergeAppend *mplan,
int rtoffset);
+static void set_part_prune_references(PartitionPruneInfo *pinfo,
+ PlannerGlobal *glob,
+ int rtoffset);
static void set_hash_references(PlannerInfo *root, Plan *plan, int rtoffset);
static Relids offset_relid_set(Relids relids, int rtoffset);
static Node *fix_scan_expr(PlannerInfo *root, Node *node,
@@ -1783,20 +1786,8 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ set_part_prune_references(aplan->part_prune_info, root->glob,
+ rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1859,20 +1850,8 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ set_part_prune_references(mplan->part_prune_info, root->glob,
+ rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
@@ -1881,6 +1860,33 @@ set_mergeappend_references(PlannerInfo *root,
return (Plan *) mplan;
}
+/*
+ * Updates RT indexes in PartitionedRelPruneInfos contained in pinfo and adds
+ * the RT indexes of "prunable" relations into glob->prunableRelids.
+ */
+static void
+set_part_prune_references(PartitionPruneInfo *pinfo, PlannerGlobal *glob,
+ int rtoffset)
+{
+ ListCell *l;
+
+ foreach(l, pinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+
+ prelinfo->rtindex += rtoffset;
+ if (prelinfo->initial_pruning_steps != NIL)
+ glob->prunableRelids = bms_add_members(glob->prunableRelids,
+ prelinfo->present_part_rtis);
+ }
+ }
+}
+
/*
* set_hash_references
* Do set_plan_references processing on a Hash node
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9a1a7faac7..8e27e35df2 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -634,6 +634,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
PartitionedRelPruneInfo *pinfo = lfirst(lc);
RelOptInfo *subpart = find_base_rel(root, pinfo->rtindex);
Bitmapset *present_parts;
+ Bitmapset *present_part_rtis;
int nparts = subpart->nparts;
int *subplan_map;
int *subpart_map;
@@ -650,7 +651,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
- present_parts = NULL;
+ present_parts = present_part_rtis = NULL;
i = -1;
while ((i = bms_next_member(subpart->live_parts, i)) >= 0)
@@ -664,15 +665,35 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+
+ /*
+ * Track the RT indexes of partitions to ensure they are included
+ * in the prunableRelids set of relations that are locked during
+ * execution. This ensures that if the plan is cached, these
+ * partitions are locked when the plan is reused.
+ *
+ * Partitions without a subplan and sub-partitioned partitions
+ * where none of the sub-partitions have a subplan due to
+ * constraint exclusion are not included in this set. Instead,
+ * they are added to the unprunableRelids set, and the relations
+ * in this set are locked by AcquireExecutorLocks() before
+ * executing a cached plan.
+ */
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
+ present_part_rtis = bms_add_member(present_part_rtis,
+ partrel->relid);
/* Record finding this subplan */
subplansfound = bms_add_member(subplansfound, subplanidx);
}
else if (subpartidx >= 0)
+ {
present_parts = bms_add_member(present_parts, i);
+ present_part_rtis = bms_add_member(present_part_rtis,
+ partrel->relid);
+ }
}
/*
@@ -684,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Record the maps and other information. */
pinfo->present_parts = present_parts;
+ pinfo->present_part_rtis = present_part_rtis;
pinfo->nparts = nparts;
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index 094522acb4..a1c89f5d72 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -26,6 +26,7 @@
#include "storage/procarray.h"
#include "storage/sinvaladt.h"
#include "utils/inval.h"
+#include "utils/lsyscache.h"
/*
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a1f8d03db1..6e8f6b1b8f 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -36,6 +36,7 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +66,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +79,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * cplan: CachedPlan supplying the plan
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, cplan, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -493,6 +498,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->cplan,
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1276,6 +1282,7 @@ PortalRunMulti(Portal portal,
{
/* statement can set tag string */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1285,6 +1292,7 @@ PortalRunMulti(Portal portal,
{
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 48a280d089..f647821382 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -2113,7 +2113,6 @@ get_rel_relam(Oid relid)
return result;
}
-
/* ---------- TRANSFORM CACHE ---------- */
Oid
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5af1a168ec..6d2e385fe8 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -104,7 +104,8 @@ static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
static bool CheckCachedPlan(CachedPlanSource *plansource);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
@@ -904,7 +905,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic)
{
CachedPlan *plan;
List *plist;
@@ -1026,6 +1028,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->refcount = 0;
plan->context = plan_context;
plan->is_oneshot = plansource->is_oneshot;
+ plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
@@ -1196,7 +1199,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv, true);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1241,7 +1244,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, false);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1387,8 +1390,8 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
}
/*
- * Reject if AcquireExecutorLocks would have anything to do. This is
- * probably unnecessary given the previous check, but let's be safe.
+ * Reject if there are any lockable relations. This is probably
+ * unnecessary given the previous check, but let's be safe.
*/
foreach(lc, plan->stmt_list)
{
@@ -1776,7 +1779,7 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ int rtindex;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1794,9 +1797,13 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ rtindex = -1;
+ while ((rtindex = bms_next_member(plannedstmt->unprunableRelids,
+ rtindex)) >= 0)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ RangeTblEntry *rte = list_nth_node(RangeTblEntry,
+ plannedstmt->rtable,
+ rtindex - 1);
if (!(rte->rtekind == RTE_RELATION ||
(rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid))))
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9b8b351d9a..bf326eeb70 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -101,8 +101,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
- ExplainState *es, const char *queryString,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
+ const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
const BufferUsage *bufusage,
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0a7274e26c..0e7245435d 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index af7d8fd1e7..ee089505a0 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -42,6 +42,7 @@
#include "storage/condition_variable.h"
#include "utils/hsearch.h"
#include "utils/queryenvironment.h"
+#include "utils/plancache.h"
#include "utils/reltrigger.h"
#include "utils/sharedtuplestore.h"
#include "utils/snapshot.h"
@@ -633,6 +634,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ CachedPlan *es_cachedplan; /* CachedPlan supplying the plannedstmt */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 540d021592..2466157b25 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -116,6 +116,12 @@ typedef struct PlannerGlobal
/* "flat" rangetable for executor */
List *finalrtable;
+ /*
+ * RT indexes of relations subject to removal from the plan due to runtime
+ * pruning at plan initialization time
+ */
+ Bitmapset *prunableRelids;
+
/* "flat" list of RTEPermissionInfos */
List *finalrteperminfos;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 62cd6a6666..ae608812f1 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -71,6 +71,10 @@ typedef struct PlannedStmt
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *unprunableRelids; /* RT indexes of relations that are not
+ * subject to runtime pruning; for
+ * AcquireExecutorLocks() */
+
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
@@ -1459,6 +1463,13 @@ typedef struct PartitionedRelPruneInfo
/* Indexes of all partitions which subplans or subparts are present for */
Bitmapset *present_parts;
+ /*
+ * RT indexes of all partitions which subplans or subparts are present
+ * for; only used during planning to help in the construction of
+ * PlannerGlobal.prunableRelids.
+ */
+ Bitmapset *present_part_rtis;
+
/* Length of the following arrays: */
int nparts;
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a90dfdf906..0b5ee007ca 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -149,6 +149,7 @@ typedef struct CachedPlan
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
bool is_oneshot; /* is it a "oneshot" plan? */
+ bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
Oid planRoleId; /* Role ID the plan was created for */
@@ -235,4 +236,13 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
extern CachedExpression *GetCachedExpression(Node *expr);
extern void FreeCachedExpression(CachedExpression *cexpr);
+/*
+ * CachedPlanRequiresLocking: should the executor acquire locks?
+ */
+static inline bool
+CachedPlanRequiresLocking(CachedPlan *cplan)
+{
+ return cplan->is_generic;
+}
+
#endif /* PLANCACHE_H */
--
2.43.0
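For readers following the AcquireExecutorLocks() hunk in the patch above: the new loop visits only the RT indexes recorded in unprunableRelids, and because RT indexes are 1-based while list offsets are 0-based, it fetches the entry at rtindex - 1. The following is a minimal standalone sketch of that iteration pattern only. It assumes a plain 64-bit mask in place of PostgreSQL's Bitmapset, and demo_next_member() and demo_collect_offsets() are hypothetical helpers, not functions from the patch or from PostgreSQL.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for bms_next_member(), operating on a 64-bit mask
 * instead of a Bitmapset.  Returns the smallest member greater than prev,
 * or -2 when there are no more members.
 */
static int
demo_next_member(uint64_t set, int prev)
{
    for (int i = prev + 1; i < 64; i++)
    {
        if (set & (UINT64_C(1) << i))
            return i;
    }
    return -2;
}

/*
 * Walk the members of "unprunable" (1-based RT indexes) and record the
 * 0-based range table offset for each one.  Returns the number of entries
 * written to offsets[].
 */
static int
demo_collect_offsets(uint64_t unprunable, int rtable_len, int *offsets)
{
    int n = 0;
    int rtindex = -1;

    while ((rtindex = demo_next_member(unprunable, rtindex)) >= 0)
    {
        /* RT index 0 is never valid; indexes run from 1 to rtable_len. */
        assert(rtindex >= 1 && rtindex <= rtable_len);
        offsets[n++] = rtindex - 1;
    }
    return n;
}
```

With RT indexes 1 and 4 set in the mask, the sketch records offsets 0 and 3; relations whose indexes are absent from the set are never visited, which corresponds to skipping prunable relations when taking locks.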
[application/octet-stream] v51-0002-Assorted-tightening-in-various-ExecEnd-routines.patch (31.3K, 4-v51-0002-Assorted-tightening-in-various-ExecEnd-routines.patch)
From 509bdce6a875278385f47ba9184774bc9e57fb8b Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 28 Sep 2023 16:56:29 +0900
Subject: [PATCH v51 2/3] Assorted tightening in various ExecEnd*() routines
This includes adding NULLness checks on pointers before cleaning them
up. Many ExecEnd*() routines already perform this check, but a few
are missing them. These NULLness checks might seem redundant as
things stand since the ExecEnd*() routines operate under the
assumption that their matching ExecInit*() routine would have fully
executed, ensuring pointers are set. However, that assumption seems a
bit shaky in the face of future changes.
This also adds a guard at the beginning of EvalPlanQualEnd() to return
early if the EPQState does not appear to have been initialized. That
case can happen if the corresponding ExecInit*() routine returned
early without calling EvalPlanQualInit().
While at it, this commit ensures that pointers are consistently set
to NULL after cleanup in all ExecEnd*() routines.
Finally, for enhanced consistency, the format of NULLness checks has
been standardized to "if (pointer != NULL)", replacing the previous
"if (pointer)" style.
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
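The cleanup convention described in the commit message (check the pointer against NULL, release it, then reset it to NULL) can be illustrated in isolation. This is a minimal sketch under stated assumptions: DemoNodeState, demo_init(), demo_release() and demo_end() are hypothetical names invented for illustration and do not appear in PostgreSQL. It shows that an end routine written this way tolerates an init routine that returned early, and that a second call is a harmless no-op.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a PlanState-like node; not a PostgreSQL type. */
typedef struct DemoNodeState
{
    int *scan_buf;      /* allocated first by demo_init() */
    int *sort_buf;      /* allocated only if demo_init() runs to completion */
} DemoNodeState;

/* Counts releases so that the behavior can be observed from outside. */
static int demo_release_count = 0;

static void
demo_release(int *buf)
{
    assert(buf != NULL);
    free(buf);
    demo_release_count++;
}

/* Init routine that may return early, leaving later fields NULL. */
static void
demo_init(DemoNodeState *node, int stop_early)
{
    node->scan_buf = NULL;
    node->sort_buf = NULL;

    node->scan_buf = malloc(sizeof(int));
    if (stop_early)
        return;
    node->sort_buf = malloc(sizeof(int));
}

/* End routine in the "if (pointer != NULL)" plus reset-to-NULL style. */
static void
demo_end(DemoNodeState *node)
{
    if (node->scan_buf != NULL)
    {
        demo_release(node->scan_buf);
        node->scan_buf = NULL;
    }
    if (node->sort_buf != NULL)
    {
        demo_release(node->sort_buf);
        node->sort_buf = NULL;
    }
}
```

Calling demo_end() after a demo_init() that stopped early releases only the one buffer that was set up, and calling it again releases nothing, because every pointer was reset after cleanup.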
src/backend/executor/execMain.c | 4 ++
src/backend/executor/nodeAgg.c | 27 +++++++++----
src/backend/executor/nodeAppend.c | 3 ++
src/backend/executor/nodeBitmapAnd.c | 4 +-
src/backend/executor/nodeBitmapHeapscan.c | 46 ++++++++++++++--------
src/backend/executor/nodeBitmapIndexscan.c | 23 ++++++-----
src/backend/executor/nodeBitmapOr.c | 4 +-
src/backend/executor/nodeCtescan.c | 3 +-
src/backend/executor/nodeForeignscan.c | 17 ++++----
src/backend/executor/nodeGather.c | 1 +
src/backend/executor/nodeGatherMerge.c | 1 +
src/backend/executor/nodeGroup.c | 6 +--
src/backend/executor/nodeHash.c | 6 +--
src/backend/executor/nodeHashjoin.c | 4 +-
src/backend/executor/nodeIncrementalSort.c | 13 +++++-
src/backend/executor/nodeIndexonlyscan.c | 25 ++++++------
src/backend/executor/nodeIndexscan.c | 23 ++++++-----
src/backend/executor/nodeLimit.c | 1 +
src/backend/executor/nodeLockRows.c | 1 +
src/backend/executor/nodeMaterial.c | 5 ++-
src/backend/executor/nodeMemoize.c | 7 +++-
src/backend/executor/nodeMergeAppend.c | 3 ++
src/backend/executor/nodeMergejoin.c | 2 +
src/backend/executor/nodeModifyTable.c | 11 +++++-
src/backend/executor/nodeNestloop.c | 2 +
src/backend/executor/nodeProjectSet.c | 1 +
src/backend/executor/nodeRecursiveunion.c | 24 +++++++++--
src/backend/executor/nodeResult.c | 1 +
src/backend/executor/nodeSamplescan.c | 7 +++-
src/backend/executor/nodeSeqscan.c | 16 +++-----
src/backend/executor/nodeSetOp.c | 6 ++-
src/backend/executor/nodeSort.c | 5 ++-
src/backend/executor/nodeSubqueryscan.c | 1 +
src/backend/executor/nodeTableFuncscan.c | 4 +-
src/backend/executor/nodeTidrangescan.c | 12 ++++--
src/backend/executor/nodeTidscan.c | 8 +++-
src/backend/executor/nodeUnique.c | 1 +
src/backend/executor/nodeWindowAgg.c | 41 +++++++++++++------
38 files changed, 246 insertions(+), 123 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 271f9d93fc..0f6dbd1e2b 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -2999,6 +2999,10 @@ EvalPlanQualEnd(EPQState *epqstate)
MemoryContext oldcontext;
ListCell *l;
+ /* Nothing to do if no EvalPlanQualInit() was done to begin with. */
+ if (epqstate->parentestate == NULL)
+ return;
+
rtsize = epqstate->parentestate->es_range_table_size;
/*
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 53ead77ece..0dfba5ca16 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -4303,7 +4303,6 @@ GetAggInitVal(Datum textInitVal, Oid transtype)
void
ExecEndAgg(AggState *node)
{
- PlanState *outerPlan;
int transno;
int numGroupingSets = Max(node->maxsets, 1);
int setno;
@@ -4313,7 +4312,7 @@ ExecEndAgg(AggState *node)
* worker back into shared memory so that it can be picked up by the main
* process to report in EXPLAIN ANALYZE.
*/
- if (node->shared_info && IsParallelWorker())
+ if (node->shared_info != NULL && IsParallelWorker())
{
AggregateInstrumentation *si;
@@ -4326,10 +4325,16 @@ ExecEndAgg(AggState *node)
/* Make sure we have closed any open tuplesorts */
- if (node->sort_in)
+ if (node->sort_in != NULL)
+ {
tuplesort_end(node->sort_in);
- if (node->sort_out)
+ node->sort_in = NULL;
+ }
+ if (node->sort_out != NULL)
+ {
tuplesort_end(node->sort_out);
+ node->sort_out = NULL;
+ }
hashagg_reset_spill_state(node);
@@ -4345,19 +4350,25 @@ ExecEndAgg(AggState *node)
for (setno = 0; setno < numGroupingSets; setno++)
{
- if (pertrans->sortstates[setno])
+ if (pertrans->sortstates[setno] != NULL)
tuplesort_end(pertrans->sortstates[setno]);
}
}
/* And ensure any agg shutdown callbacks have been called */
for (setno = 0; setno < numGroupingSets; setno++)
+ {
ReScanExprContext(node->aggcontexts[setno]);
- if (node->hashcontext)
+ node->aggcontexts[setno] = NULL;
+ }
+ if (node->hashcontext != NULL)
+ {
ReScanExprContext(node->hashcontext);
+ node->hashcontext = NULL;
+ }
- outerPlan = outerPlanState(node);
- ExecEndNode(outerPlan);
+ ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
void
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index ca0f54d676..86d75b1a7e 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -399,7 +399,10 @@ ExecEndAppend(AppendState *node)
* shut down each of the subscans
*/
for (i = 0; i < nplans; i++)
+ {
ExecEndNode(appendplans[i]);
+ appendplans[i] = NULL;
+ }
}
void
diff --git a/src/backend/executor/nodeBitmapAnd.c b/src/backend/executor/nodeBitmapAnd.c
index 9c9c666872..ae391222bf 100644
--- a/src/backend/executor/nodeBitmapAnd.c
+++ b/src/backend/executor/nodeBitmapAnd.c
@@ -192,8 +192,8 @@ ExecEndBitmapAnd(BitmapAndState *node)
*/
for (i = 0; i < nplans; i++)
{
- if (bitmapplans[i])
- ExecEndNode(bitmapplans[i]);
+ ExecEndNode(bitmapplans[i]);
+ bitmapplans[i] = NULL;
}
}
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index 3c63bdd93d..19f18ab817 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -625,8 +625,6 @@ ExecReScanBitmapHeapScan(BitmapHeapScanState *node)
void
ExecEndBitmapHeapScan(BitmapHeapScanState *node)
{
- TableScanDesc scanDesc;
-
/*
* When ending a parallel worker, copy the statistics gathered by the
* worker back into shared memory so that it can be picked up by the main
@@ -650,38 +648,54 @@ ExecEndBitmapHeapScan(BitmapHeapScanState *node)
si->lossy_pages += node->stats.lossy_pages;
}
- /*
- * extract information from the node
- */
- scanDesc = node->ss.ss_currentScanDesc;
-
/*
* close down subplans
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
/*
* release bitmaps and buffers if any
*/
- if (node->tbmiterator)
+ if (node->tbmiterator != NULL)
+ {
tbm_end_iterate(node->tbmiterator);
- if (node->prefetch_iterator)
+ node->tbmiterator = NULL;
+ }
+ if (node->prefetch_iterator != NULL)
+ {
tbm_end_iterate(node->prefetch_iterator);
- if (node->tbm)
+ node->prefetch_iterator = NULL;
+ }
+ if (node->tbm != NULL)
+ {
tbm_free(node->tbm);
- if (node->shared_tbmiterator)
+ node->tbm = NULL;
+ }
+ if (node->shared_tbmiterator != NULL)
+ {
tbm_end_shared_iterate(node->shared_tbmiterator);
- if (node->shared_prefetch_iterator)
+ node->shared_tbmiterator = NULL;
+ }
+ if (node->shared_prefetch_iterator != NULL)
+ {
tbm_end_shared_iterate(node->shared_prefetch_iterator);
+ node->shared_prefetch_iterator = NULL;
+ }
if (node->pvmbuffer != InvalidBuffer)
+ {
ReleaseBuffer(node->pvmbuffer);
+ node->pvmbuffer = InvalidBuffer;
+ }
/*
- * close heap scan
+ * close heap scan (no-op if we didn't start it)
*/
- if (scanDesc)
- table_endscan(scanDesc);
-
+ if (node->ss.ss_currentScanDesc != NULL)
+ {
+ table_endscan(node->ss.ss_currentScanDesc);
+ node->ss.ss_currentScanDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeBitmapIndexscan.c b/src/backend/executor/nodeBitmapIndexscan.c
index 6df8e17ec8..4669e8d0ce 100644
--- a/src/backend/executor/nodeBitmapIndexscan.c
+++ b/src/backend/executor/nodeBitmapIndexscan.c
@@ -174,22 +174,21 @@ ExecReScanBitmapIndexScan(BitmapIndexScanState *node)
void
ExecEndBitmapIndexScan(BitmapIndexScanState *node)
{
- Relation indexRelationDesc;
- IndexScanDesc indexScanDesc;
-
- /*
- * extract information from the node
- */
- indexRelationDesc = node->biss_RelationDesc;
- indexScanDesc = node->biss_ScanDesc;
+ /* close the scan (no-op if we didn't start it) */
+ if (node->biss_ScanDesc != NULL)
+ {
+ index_endscan(node->biss_ScanDesc);
+ node->biss_ScanDesc = NULL;
+ }
/*
* close the index relation (no-op if we didn't open it)
*/
- if (indexScanDesc)
- index_endscan(indexScanDesc);
- if (indexRelationDesc)
- index_close(indexRelationDesc, NoLock);
+ if (node->biss_RelationDesc != NULL)
+ {
+ index_close(node->biss_RelationDesc, NoLock);
+ node->biss_RelationDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeBitmapOr.c b/src/backend/executor/nodeBitmapOr.c
index 7029536c64..de439235d2 100644
--- a/src/backend/executor/nodeBitmapOr.c
+++ b/src/backend/executor/nodeBitmapOr.c
@@ -210,8 +210,8 @@ ExecEndBitmapOr(BitmapOrState *node)
*/
for (i = 0; i < nplans; i++)
{
- if (bitmapplans[i])
- ExecEndNode(bitmapplans[i]);
+ ExecEndNode(bitmapplans[i]);
+ bitmapplans[i] = NULL;
}
}
diff --git a/src/backend/executor/nodeCtescan.c b/src/backend/executor/nodeCtescan.c
index 8081eed887..7cea943988 100644
--- a/src/backend/executor/nodeCtescan.c
+++ b/src/backend/executor/nodeCtescan.c
@@ -290,10 +290,11 @@ ExecEndCteScan(CteScanState *node)
/*
* If I am the leader, free the tuplestore.
*/
- if (node->leader == node)
+ if (node->leader != NULL && node->leader == node)
{
tuplestore_end(node->cte_table);
node->cte_table = NULL;
+ node->leader = NULL;
}
}
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index fe4ae55c0f..1357ccf3c9 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -300,17 +300,20 @@ ExecEndForeignScan(ForeignScanState *node)
EState *estate = node->ss.ps.state;
/* Let the FDW shut down */
- if (plan->operation != CMD_SELECT)
+ if (node->fdwroutine != NULL)
{
- if (estate->es_epq_active == NULL)
- node->fdwroutine->EndDirectModify(node);
+ if (plan->operation != CMD_SELECT)
+ {
+ if (estate->es_epq_active == NULL)
+ node->fdwroutine->EndDirectModify(node);
+ }
+ else
+ node->fdwroutine->EndForeignScan(node);
}
- else
- node->fdwroutine->EndForeignScan(node);
/* Shut down any outer plan. */
- if (outerPlanState(node))
- ExecEndNode(outerPlanState(node));
+ ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index 5d4ffe989c..cae5ea1f92 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -244,6 +244,7 @@ void
ExecEndGather(GatherState *node)
{
ExecEndNode(outerPlanState(node)); /* let children clean up first */
+ outerPlanState(node) = NULL;
ExecShutdownGather(node);
}
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index 45f6017c29..b36cd89e7d 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -284,6 +284,7 @@ void
ExecEndGatherMerge(GatherMergeState *node)
{
ExecEndNode(outerPlanState(node)); /* let children clean up first */
+ outerPlanState(node) = NULL;
ExecShutdownGatherMerge(node);
}
diff --git a/src/backend/executor/nodeGroup.c b/src/backend/executor/nodeGroup.c
index da32bec181..807429e504 100644
--- a/src/backend/executor/nodeGroup.c
+++ b/src/backend/executor/nodeGroup.c
@@ -225,10 +225,8 @@ ExecInitGroup(Group *node, EState *estate, int eflags)
void
ExecEndGroup(GroupState *node)
{
- PlanState *outerPlan;
-
- outerPlan = outerPlanState(node);
- ExecEndNode(outerPlan);
+ ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
void
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 570a90ebe1..a913d5b50c 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -427,13 +427,11 @@ ExecInitHash(Hash *node, EState *estate, int eflags)
void
ExecEndHash(HashState *node)
{
- PlanState *outerPlan;
-
/*
* shut down the subplan
*/
- outerPlan = outerPlanState(node);
- ExecEndNode(outerPlan);
+ ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 2f7170604d..901c9e9be7 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -950,7 +950,7 @@ ExecEndHashJoin(HashJoinState *node)
/*
* Free hash table
*/
- if (node->hj_HashTable)
+ if (node->hj_HashTable != NULL)
{
ExecHashTableDestroy(node->hj_HashTable);
node->hj_HashTable = NULL;
@@ -960,7 +960,9 @@ ExecEndHashJoin(HashJoinState *node)
* clean up subtrees
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
ExecEndNode(innerPlanState(node));
+ innerPlanState(node) = NULL;
}
/*
diff --git a/src/backend/executor/nodeIncrementalSort.c b/src/backend/executor/nodeIncrementalSort.c
index 2ce5ed5ec8..010bcfafa8 100644
--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -1078,8 +1078,16 @@ ExecEndIncrementalSort(IncrementalSortState *node)
{
SO_printf("ExecEndIncrementalSort: shutting down sort node\n");
- ExecDropSingleTupleTableSlot(node->group_pivot);
- ExecDropSingleTupleTableSlot(node->transfer_tuple);
+ if (node->group_pivot != NULL)
+ {
+ ExecDropSingleTupleTableSlot(node->group_pivot);
+ node->group_pivot = NULL;
+ }
+ if (node->transfer_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(node->transfer_tuple);
+ node->transfer_tuple = NULL;
+ }
/*
* Release tuplesort resources.
@@ -1099,6 +1107,7 @@ ExecEndIncrementalSort(IncrementalSortState *node)
* Shut down the subplan.
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
SO_printf("ExecEndIncrementalSort: sort node shutdown\n");
}
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index 612c673895..481d479760 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -397,15 +397,6 @@ ExecReScanIndexOnlyScan(IndexOnlyScanState *node)
void
ExecEndIndexOnlyScan(IndexOnlyScanState *node)
{
- Relation indexRelationDesc;
- IndexScanDesc indexScanDesc;
-
- /*
- * extract information from the node
- */
- indexRelationDesc = node->ioss_RelationDesc;
- indexScanDesc = node->ioss_ScanDesc;
-
/* Release VM buffer pin, if any. */
if (node->ioss_VMBuffer != InvalidBuffer)
{
@@ -413,13 +404,21 @@ ExecEndIndexOnlyScan(IndexOnlyScanState *node)
node->ioss_VMBuffer = InvalidBuffer;
}
+ /* close the scan (no-op if we didn't start it) */
+ if (node->ioss_ScanDesc != NULL)
+ {
+ index_endscan(node->ioss_ScanDesc);
+ node->ioss_ScanDesc = NULL;
+ }
+
/*
* close the index relation (no-op if we didn't open it)
*/
- if (indexScanDesc)
- index_endscan(indexScanDesc);
- if (indexRelationDesc)
- index_close(indexRelationDesc, NoLock);
+ if (node->ioss_RelationDesc != NULL)
+ {
+ index_close(node->ioss_RelationDesc, NoLock);
+ node->ioss_RelationDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index 8000feff4c..a8172d8b82 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -784,22 +784,21 @@ ExecIndexAdvanceArrayKeys(IndexArrayKeyInfo *arrayKeys, int numArrayKeys)
void
ExecEndIndexScan(IndexScanState *node)
{
- Relation indexRelationDesc;
- IndexScanDesc indexScanDesc;
-
- /*
- * extract information from the node
- */
- indexRelationDesc = node->iss_RelationDesc;
- indexScanDesc = node->iss_ScanDesc;
+ /* close the scan (no-op if we didn't start it) */
+ if (node->iss_ScanDesc != NULL)
+ {
+ index_endscan(node->iss_ScanDesc);
+ node->iss_ScanDesc = NULL;
+ }
/*
* close the index relation (no-op if we didn't open it)
*/
- if (indexScanDesc)
- index_endscan(indexScanDesc);
- if (indexRelationDesc)
- index_close(indexRelationDesc, NoLock);
+ if (node->iss_RelationDesc != NULL)
+ {
+ index_close(node->iss_RelationDesc, NoLock);
+ node->iss_RelationDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index e6f1fb1562..eb7b6e52be 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -534,6 +534,7 @@ void
ExecEndLimit(LimitState *node)
{
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 41754ddfea..0d3489195b 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -387,6 +387,7 @@ ExecEndLockRows(LockRowsState *node)
/* We may have shut down EPQ already, but no harm in another call */
EvalPlanQualEnd(&node->lr_epqstate);
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
diff --git a/src/backend/executor/nodeMaterial.c b/src/backend/executor/nodeMaterial.c
index 22e1787fbd..883e3f3933 100644
--- a/src/backend/executor/nodeMaterial.c
+++ b/src/backend/executor/nodeMaterial.c
@@ -243,13 +243,16 @@ ExecEndMaterial(MaterialState *node)
* Release tuplestore resources
*/
if (node->tuplestorestate != NULL)
+ {
tuplestore_end(node->tuplestorestate);
- node->tuplestorestate = NULL;
+ node->tuplestorestate = NULL;
+ }
/*
* shut down the subplan
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeMemoize.c b/src/backend/executor/nodeMemoize.c
index df8e3fff08..690dee1daa 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -1128,12 +1128,17 @@ ExecEndMemoize(MemoizeState *node)
}
/* Remove the cache context */
- MemoryContextDelete(node->tableContext);
+ if (node->tableContext != NULL)
+ {
+ MemoryContextDelete(node->tableContext);
+ node->tableContext = NULL;
+ }
/*
* shut down the subplan
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
void
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index e1b9b984a7..3236444cf1 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -333,7 +333,10 @@ ExecEndMergeAppend(MergeAppendState *node)
* shut down each of the subscans
*/
for (i = 0; i < nplans; i++)
+ {
ExecEndNode(mergeplans[i]);
+ mergeplans[i] = NULL;
+ }
}
void
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 29c54fcd75..926e631d88 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -1647,7 +1647,9 @@ ExecEndMergeJoin(MergeJoinState *node)
* shut down the subplans
*/
ExecEndNode(innerPlanState(node));
+ innerPlanState(node) = NULL;
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
MJ1_printf("ExecEndMergeJoin: %s\n",
"node processing ended");
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 8bf4c80d4a..9e56f9c36c 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4724,7 +4724,9 @@ ExecEndModifyTable(ModifyTableState *node)
for (j = 0; j < resultRelInfo->ri_NumSlotsInitialized; j++)
{
ExecDropSingleTupleTableSlot(resultRelInfo->ri_Slots[j]);
+ resultRelInfo->ri_Slots[j] = NULL;
ExecDropSingleTupleTableSlot(resultRelInfo->ri_PlanSlots[j]);
+ resultRelInfo->ri_PlanSlots[j] = NULL;
}
}
@@ -4732,12 +4734,16 @@ ExecEndModifyTable(ModifyTableState *node)
* Close all the partitioned tables, leaf partitions, and their indices
* and release the slot used for tuple routing, if set.
*/
- if (node->mt_partition_tuple_routing)
+ if (node->mt_partition_tuple_routing != NULL)
{
ExecCleanupTupleRouting(node, node->mt_partition_tuple_routing);
+ node->mt_partition_tuple_routing = NULL;
- if (node->mt_root_tuple_slot)
+ if (node->mt_root_tuple_slot != NULL)
+ {
ExecDropSingleTupleTableSlot(node->mt_root_tuple_slot);
+ node->mt_root_tuple_slot = NULL;
+ }
}
/*
@@ -4749,6 +4755,7 @@ ExecEndModifyTable(ModifyTableState *node)
* shut down subplan
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
void
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index 7f4bf6c4db..01f3d56a3b 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -367,7 +367,9 @@ ExecEndNestLoop(NestLoopState *node)
* close down subplans
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
ExecEndNode(innerPlanState(node));
+ innerPlanState(node) = NULL;
NL1_printf("ExecEndNestLoop: %s\n",
"node processing ended");
diff --git a/src/backend/executor/nodeProjectSet.c b/src/backend/executor/nodeProjectSet.c
index e483730015..ca9a5e2ed2 100644
--- a/src/backend/executor/nodeProjectSet.c
+++ b/src/backend/executor/nodeProjectSet.c
@@ -331,6 +331,7 @@ ExecEndProjectSet(ProjectSetState *node)
* shut down subplans
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
void
diff --git a/src/backend/executor/nodeRecursiveunion.c b/src/backend/executor/nodeRecursiveunion.c
index c7f8a19fa4..7680142c7b 100644
--- a/src/backend/executor/nodeRecursiveunion.c
+++ b/src/backend/executor/nodeRecursiveunion.c
@@ -272,20 +272,36 @@ void
ExecEndRecursiveUnion(RecursiveUnionState *node)
{
/* Release tuplestores */
- tuplestore_end(node->working_table);
- tuplestore_end(node->intermediate_table);
+ if (node->working_table != NULL)
+ {
+ tuplestore_end(node->working_table);
+ node->working_table = NULL;
+ }
+ if (node->intermediate_table != NULL)
+ {
+ tuplestore_end(node->intermediate_table);
+ node->intermediate_table = NULL;
+ }
/* free subsidiary stuff including hashtable */
- if (node->tempContext)
+ if (node->tempContext != NULL)
+ {
MemoryContextDelete(node->tempContext);
- if (node->tableContext)
+ node->tempContext = NULL;
+ }
+ if (node->tableContext != NULL)
+ {
MemoryContextDelete(node->tableContext);
+ node->tableContext = NULL;
+ }
/*
* close down subplans
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
ExecEndNode(innerPlanState(node));
+ innerPlanState(node) = NULL;
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeResult.c b/src/backend/executor/nodeResult.c
index 348361e7f4..e3cfc9b772 100644
--- a/src/backend/executor/nodeResult.c
+++ b/src/backend/executor/nodeResult.c
@@ -243,6 +243,7 @@ ExecEndResult(ResultState *node)
* shut down subplans
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
void
diff --git a/src/backend/executor/nodeSamplescan.c b/src/backend/executor/nodeSamplescan.c
index 714b076e64..6ab91001bc 100644
--- a/src/backend/executor/nodeSamplescan.c
+++ b/src/backend/executor/nodeSamplescan.c
@@ -181,14 +181,17 @@ ExecEndSampleScan(SampleScanState *node)
/*
* Tell sampling function that we finished the scan.
*/
- if (node->tsmroutine->EndSampleScan)
+ if (node->tsmroutine != NULL && node->tsmroutine->EndSampleScan)
node->tsmroutine->EndSampleScan(node);
/*
- * close heap scan
+ * close heap scan (no-op if we didn't start it)
*/
if (node->ss.ss_currentScanDesc)
+ {
table_endscan(node->ss.ss_currentScanDesc);
+ node->ss.ss_currentScanDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 7cb12a11c2..b052775e5b 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -183,18 +183,14 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
void
ExecEndSeqScan(SeqScanState *node)
{
- TableScanDesc scanDesc;
-
- /*
- * get information from node
- */
- scanDesc = node->ss.ss_currentScanDesc;
-
/*
- * close heap scan
+ * close heap scan (no-op if we didn't start it)
*/
- if (scanDesc != NULL)
- table_endscan(scanDesc);
+ if (node->ss.ss_currentScanDesc != NULL)
+ {
+ table_endscan(node->ss.ss_currentScanDesc);
+ node->ss.ss_currentScanDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeSetOp.c b/src/backend/executor/nodeSetOp.c
index a8ac68b482..fe34b2134f 100644
--- a/src/backend/executor/nodeSetOp.c
+++ b/src/backend/executor/nodeSetOp.c
@@ -583,10 +583,14 @@ void
ExecEndSetOp(SetOpState *node)
{
/* free subsidiary stuff including hashtable */
- if (node->tableContext)
+ if (node->tableContext != NULL)
+ {
MemoryContextDelete(node->tableContext);
+ node->tableContext = NULL;
+ }
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index 3fc925d7b4..af852464d0 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -307,13 +307,16 @@ ExecEndSort(SortState *node)
* Release tuplesort resources
*/
if (node->tuplesortstate != NULL)
+ {
tuplesort_end((Tuplesortstate *) node->tuplesortstate);
- node->tuplesortstate = NULL;
+ node->tuplesortstate = NULL;
+ }
/*
* shut down the subplan
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
SO1_printf("ExecEndSort: %s\n",
"sort node shutdown");
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 782097eaf2..0b2612183a 100644
--- a/src/backend/executor/nodeSubqueryscan.c
+++ b/src/backend/executor/nodeSubqueryscan.c
@@ -171,6 +171,7 @@ ExecEndSubqueryScan(SubqueryScanState *node)
* close down subquery
*/
ExecEndNode(node->subplan);
+ node->subplan = NULL;
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeTableFuncscan.c b/src/backend/executor/nodeTableFuncscan.c
index f483221bb8..778d25d511 100644
--- a/src/backend/executor/nodeTableFuncscan.c
+++ b/src/backend/executor/nodeTableFuncscan.c
@@ -223,8 +223,10 @@ ExecEndTableFuncScan(TableFuncScanState *node)
* Release tuplestore resources
*/
if (node->tupstore != NULL)
+ {
tuplestore_end(node->tupstore);
- node->tupstore = NULL;
+ node->tupstore = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeTidrangescan.c b/src/backend/executor/nodeTidrangescan.c
index 9aa7683d7e..702ee884d2 100644
--- a/src/backend/executor/nodeTidrangescan.c
+++ b/src/backend/executor/nodeTidrangescan.c
@@ -326,10 +326,14 @@ ExecReScanTidRangeScan(TidRangeScanState *node)
void
ExecEndTidRangeScan(TidRangeScanState *node)
{
- TableScanDesc scan = node->ss.ss_currentScanDesc;
-
- if (scan != NULL)
- table_endscan(scan);
+ /*
+ * close heap scan (no-op if we didn't start it)
+ */
+ if (node->ss.ss_currentScanDesc != NULL)
+ {
+ table_endscan(node->ss.ss_currentScanDesc);
+ node->ss.ss_currentScanDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeTidscan.c b/src/backend/executor/nodeTidscan.c
index 864a9013b6..f375951699 100644
--- a/src/backend/executor/nodeTidscan.c
+++ b/src/backend/executor/nodeTidscan.c
@@ -469,8 +469,14 @@ ExecReScanTidScan(TidScanState *node)
void
ExecEndTidScan(TidScanState *node)
{
- if (node->ss.ss_currentScanDesc)
+ /*
+ * close heap scan (no-op if we didn't start it)
+ */
+ if (node->ss.ss_currentScanDesc != NULL)
+ {
table_endscan(node->ss.ss_currentScanDesc);
+ node->ss.ss_currentScanDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeUnique.c b/src/backend/executor/nodeUnique.c
index a125923e93..b82d0e9ad5 100644
--- a/src/backend/executor/nodeUnique.c
+++ b/src/backend/executor/nodeUnique.c
@@ -168,6 +168,7 @@ void
ExecEndUnique(UniqueState *node)
{
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 3221fa1522..561d7e731d 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -1351,11 +1351,14 @@ release_partition(WindowAggState *winstate)
* any aggregate temp data). We don't rely on retail pfree because some
* aggregates might have allocated data we don't have direct pointers to.
*/
- MemoryContextReset(winstate->partcontext);
- MemoryContextReset(winstate->aggcontext);
+ if (winstate->partcontext != NULL)
+ MemoryContextReset(winstate->partcontext);
+ if (winstate->aggcontext != NULL)
+ MemoryContextReset(winstate->aggcontext);
for (i = 0; i < winstate->numaggs; i++)
{
- if (winstate->peragg[i].aggcontext != winstate->aggcontext)
+ if (winstate->peragg[i].aggcontext != NULL &&
+ winstate->peragg[i].aggcontext != winstate->aggcontext)
MemoryContextReset(winstate->peragg[i].aggcontext);
}
@@ -2681,24 +2684,40 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
void
ExecEndWindowAgg(WindowAggState *node)
{
- PlanState *outerPlan;
int i;
release_partition(node);
for (i = 0; i < node->numaggs; i++)
{
- if (node->peragg[i].aggcontext != node->aggcontext)
+ if (node->peragg[i].aggcontext != NULL &&
+ node->peragg[i].aggcontext != node->aggcontext)
MemoryContextDelete(node->peragg[i].aggcontext);
}
- MemoryContextDelete(node->partcontext);
- MemoryContextDelete(node->aggcontext);
+ if (node->partcontext != NULL)
+ {
+ MemoryContextDelete(node->partcontext);
+ node->partcontext = NULL;
+ }
+ if (node->aggcontext != NULL)
+ {
+ MemoryContextDelete(node->aggcontext);
+ node->aggcontext = NULL;
+ }
- pfree(node->perfunc);
- pfree(node->peragg);
+ if (node->perfunc != NULL)
+ {
+ pfree(node->perfunc);
+ node->perfunc = NULL;
+ }
+ if (node->peragg != NULL)
+ {
+ pfree(node->peragg);
+ node->peragg = NULL;
+ }
- outerPlan = outerPlanState(node);
- ExecEndNode(outerPlan);
+ ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
/* -----------------
--
2.43.0
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-20 14:53 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-21 12:45 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-21 13:10 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-23 12:48 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-29 13:34 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
@ 2024-08-31 12:30 ` Junwang Zhao <[email protected]>
2024-09-02 08:19 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
1 sibling, 1 reply; 29+ messages in thread
From: Junwang Zhao @ 2024-08-31 12:30 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
Hi,
On Thu, Aug 29, 2024 at 9:34 PM Amit Langote <[email protected]> wrote:
>
> On Fri, Aug 23, 2024 at 9:48 PM Amit Langote <[email protected]> wrote:
> > On Wed, Aug 21, 2024 at 10:10 PM Robert Haas <[email protected]> wrote:
> > > On Wed, Aug 21, 2024 at 8:45 AM Amit Langote <[email protected]> wrote:
> > > > * The replanning aspect of the lock-in-the-executor design would be
> > > > simpler if a CachedPlan contained the plan for a single query rather
> > > > than a list of queries, as previously mentioned. This is particularly
> > > > due to the requirements of the PORTAL_MULTI_QUERY case. However, this
> > > > option might be impractical.
> > >
> > > It might be, but maybe it would be worth a try? I mean,
> > > GetCachedPlan() seems to just call pg_plan_queries() which just loops
> > > over the list of query trees and does the same thing for each one. If
> > > we wanted to replan a single query, why couldn't we do
> > > fake_querytree_list = list_make1(list_nth(querytree_list, n)) and then
> > > call pg_plan_queries(fake_querytree_list)? Or something equivalent to
> > > that. We could have a new GetCachedSinglePlan(cplan, n) to do this.
> >
> > I've been hacking to prototype this, and it's showing promise. It
> > helps make the replan loop at the call sites that start the executor
> > with an invalidatable plan more localized and less prone to
> > action-at-a-distance issues. However, the interface and contract of
> > the new function in my prototype are pretty specialized for the replan
> > loop in this context—meaning it's not as general-purpose as
> > GetCachedPlan(). Essentially, what you get when you call it is a
> > 'throwaway' CachedPlan containing only the plan for the query that
> > failed during ExecutorStart(), not a plan integrated into the original
> > CachedPlanSource's stmt_list. A call site entering the replan loop
> > will retry the execution with that throwaway plan, release it once
> > done, and resume looping over the plans in the original list. The
> > invalid plan that remains in the original list will be discarded and
> > replanned in the next call to GetCachedPlan() using the same
> > CachedPlanSource. While that may sound undesirable, I'm inclined to
> > think it's not something that needs optimization, given that we're
> > expecting this code path to be taken rarely.
> >
> > I'll post a version of a revamped locks-in-the-executor patch set
> > using the above function after debugging some more.
>
> Here it is.
>
> 0001 implements changes to defer the locking of runtime-prunable
> relations to the executor. The new design introduces a bitmapset
> field in PlannedStmt to distinguish at runtime between relations that
> are prunable whose locking can be deferred until ExecInitNode() and
> those that are not and must be locked in advance. The set of prunable
> relations can be constructed by looking at all the PartitionPruneInfos
> in the plan and checking which are subject to "initial" pruning steps.
> The set of unprunable relations is obtained by subtracting those from
> the set of all RT indexes. This design gets rid of one annoying
> aspect of the old design which was the need to add specialized fields
> to store the RT indexes of partitioned relations that are not
> otherwise referenced in the plan tree. That was necessary because in
> the old design, I had removed the function AcquireExecutorLocks()
> altogether to defer the locking of all child relations to execution.
> In the new design such relations are still locked by
> AcquireExecutorLocks().
>
> 0002 is the old patch to make ExecEndNode() robust against partially
> initialized PlanState nodes by adding NULL checks.
>
> 0003 is the patch to add changes to deal with the CachedPlan becoming
> invalid before the deferred locks on prunable relations are taken.
> I've moved the replan loop into a new wrapper-over-ExecutorStart()
> function instead of having the same logic at multiple sites. The
> replan logic uses the GetSingleCachedPlan() described in the quoted
> text. The callers of the new ExecutorStart()-wrapper, which I've
> dubbed ExecutorStartExt(), need to pass the CachedPlanSource and a
> query_index, which is the index of the query being executed in the
> list CachedPlanSource.query_list. They are needed by
> GetSingleCachedPlan(). The changes outside the executor are pretty
> minimal in this design and all the difficulties of having to loop back
> to GetCachedPlan() are now gone. I like how this turned out.
>
> One idea that I think might be worth trying to reduce the footprint of
> 0003 is to try to lock the prunable relations in a step of InitPlan()
> separate from ExecInitNode(), which can be implemented by doing the
> initial runtime pruning in that separate step. That way, we'll have
> all the necessary locks before calling ExecInitNode() and so we don't
> need to sprinkle the CachedPlanStillValid() checks all over the place
> and worry about missed checks and dealing with partially initialized
> PlanState trees.
>
> --
> Thanks, Amit Langote
@@ -1241,7 +1244,7 @@ GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, true);
Is the *true* here a typo? Seems it should be *false* for custom plan?
--
Regards
Junwang Zhao
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-20 14:53 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-21 12:45 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-21 13:10 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-23 12:48 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-29 13:34 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-31 12:30 ` Re: generic plans and "initial" pruning Junwang Zhao <[email protected]>
@ 2024-09-02 08:19 ` Amit Langote <[email protected]>
2024-09-05 09:55 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Amit Langote @ 2024-09-02 08:19 UTC (permalink / raw)
To: Junwang Zhao <[email protected]>; +Cc: Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Sat, Aug 31, 2024 at 9:30 PM Junwang Zhao <[email protected]> wrote:
> @@ -1241,7 +1244,7 @@ GetCachedPlan(CachedPlanSource *plansource,
> ParamListInfo boundParams,
> if (customplan)
> {
> /* Build a custom plan */
> - plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
> + plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, true);
>
> Is the *true* here a typo? Seems it should be *false* for custom plan?
That's correct, thanks for catching that. Will fix.
--
Thanks, Amit Langote
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-20 14:53 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-21 12:45 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-21 13:10 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-23 12:48 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-29 13:34 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-31 12:30 ` Re: generic plans and "initial" pruning Junwang Zhao <[email protected]>
2024-09-02 08:19 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
@ 2024-09-05 09:55 ` Amit Langote <[email protected]>
0 siblings, 0 replies; 29+ messages in thread
From: Amit Langote @ 2024-09-05 09:55 UTC (permalink / raw)
To: Junwang Zhao <[email protected]>; +Cc: Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Mon, Sep 2, 2024 at 5:19 PM Amit Langote <[email protected]> wrote:
> On Sat, Aug 31, 2024 at 9:30 PM Junwang Zhao <[email protected]> wrote:
> > @@ -1241,7 +1244,7 @@ GetCachedPlan(CachedPlanSource *plansource,
> > ParamListInfo boundParams,
> > if (customplan)
> > {
> > /* Build a custom plan */
> > - plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
> > + plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, true);
> >
> > Is the *true* here a typo? Seems it should be *false* for custom plan?
>
> That's correct, thanks for catching that. Will fix.
Done.
I've also rewritten the new GetSingleCachedPlan() function in 0003.
The most glaring bug in the previous version was that the transient
CachedPlan it creates could not be seen by PlanCacheRelCallback() and
related functions, because it was intentionally not linked to the
CachedPlanSource. As a result, the CachedPlan would not be invalidated
even if some prunable relation changed before it was locked during
ExecutorStart(). I've added a new list, standalone_plan_list, to hold
these plans and changed the inval callback functions to invalidate any
plans contained in it.
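To illustrate the idea only (this is not the code in 0003), here is a
minimal self-contained sketch of a registry of standalone plans that an
invalidation callback can walk. ToyPlan and all toy_* names are
hypothetical stand-ins; the real patch works on CachedPlan and the
plancache.c callbacks, whose details are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a CachedPlan; NOT the real PostgreSQL struct. */
typedef struct ToyPlan
{
    unsigned int relid;         /* the single relation this toy plan depends on */
    bool        is_valid;       /* cleared by the invalidation callback */
    struct ToyPlan *next;       /* link in the standalone list */
} ToyPlan;

/* Hypothetical analogue of standalone_plan_list: plans with no owning source. */
static ToyPlan *toy_standalone_plans = NULL;

/* Make a transient plan visible to the invalidation callback. */
static void
toy_register_standalone(ToyPlan *plan)
{
    assert(plan != NULL);
    plan->is_valid = true;
    plan->next = toy_standalone_plans;
    toy_standalone_plans = plan;
}

/* Remove a plan from the list once the caller is done with it. */
static void
toy_unregister_standalone(ToyPlan *plan)
{
    ToyPlan   **link = &toy_standalone_plans;

    while (*link != NULL && *link != plan)
        link = &(*link)->next;
    if (*link != NULL)
    {
        *link = plan->next;
        plan->next = NULL;
    }
}

/*
 * Toy relation-invalidation callback.  relid == 0 means "invalidate
 * everything".  Returns the number of plans newly marked invalid.
 */
static int
toy_rel_callback(unsigned int relid)
{
    int         ninvalidated = 0;

    for (ToyPlan *p = toy_standalone_plans; p != NULL; p = p->next)
    {
        if (p->is_valid && (relid == 0 || p->relid == relid))
        {
            p->is_valid = false;
            ninvalidated++;
        }
    }
    return ninvalidated;
}
```

The point of the sketch is only that a plan which is not reachable from
any list the callback walks can never be marked invalid, which is the bug
described above.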
Another thing I found out through testing is that CachedPlanSource can
have become invalid since leaving GetCachedPlan() (actually even
before returning from that function) because of
PlanCacheSysCallback(), which drops/invalidates *all* plans when a
syscache is invalidated. There are comments in plancache.c (see
BuildCachedPlan()) saying that such invalidations are, in theory,
false positives, but that gave me pause nonetheless.
Finally, instead of calling GetCachedPlan() from GetSingleCachedPlan()
to create a plan for only the query whose plan got invalidated, which
required a bunch of care to ensure that the CachedPlanSource is not
overwritten with the information about this single-query planning,
I've made GetSingleCachedPlan() create the PlannedStmt and the
detached CachedPlan object on its own, borrowing the minimal necessary
code from BuildCachedPlan() to do so.
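For readers following along, the retry behaviour that the
ExecutorStart() wrapper is meant to provide can be modelled roughly as
below. This is a toy sketch under stated assumptions, not the patch:
ToyCatalog, ToyPlan and the toy_* functions are hypothetical, and
"validity" is reduced to comparing a version counter.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy catalog state; pending_ddl counts concurrent changes yet to land. */
typedef struct ToyCatalog
{
    int         version;
    int         pending_ddl;
} ToyCatalog;

/* Toy plan: remembers which catalog version it was planned against. */
typedef struct ToyPlan
{
    int         planned_version;
} ToyPlan;

/* Hypothetical stand-in for planning just the one query that failed. */
static ToyPlan
toy_plan_single_query(const ToyCatalog *cat)
{
    ToyPlan     plan = {cat->version};

    return plan;
}

/*
 * Stand-in for executor startup.  Taking the deferred locks is the moment
 * at which a concurrent change becomes visible; returns true if the plan
 * is still valid afterwards.
 */
static bool
toy_executor_start(ToyCatalog *cat, const ToyPlan *plan)
{
    if (cat->pending_ddl > 0)
    {
        cat->pending_ddl--;
        cat->version++;
    }
    return plan->planned_version == cat->version;
}

/*
 * Wrapper with a local replan loop: retry startup with a freshly built
 * single-query plan until it survives.  Returns the number of replans.
 */
static int
toy_executor_start_ext(ToyCatalog *cat, ToyPlan *plan)
{
    int         nreplans = 0;

    while (!toy_executor_start(cat, plan))
    {
        *plan = toy_plan_single_query(cat);
        nreplans++;
    }
    return nreplans;
}
```

In this model the caller never has to go back to the multi-query plan
list; the loop stays local to the startup wrapper.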
--
Thanks, Amit Langote
Attachments:
[application/octet-stream] v52-0001-Defer-locking-of-runtime-prunable-relations-to-e.patch (32.8K, 2-v52-0001-Defer-locking-of-runtime-prunable-relations-to-e.patch)
From 74906c4bbc42b362f7a5608774af68615a299912 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 7 Aug 2024 18:25:51 +0900
Subject: [PATCH v52 1/3] Defer locking of runtime-prunable relations to
executor
When preparing a cached plan for execution, plancache.c locks the
relations contained in the plan's range table to ensure it is safe for
execution. However, this simplistic approach, implemented in
AcquireExecutorLocks(), results in unnecessarily locking relations that
might be pruned during "initial" runtime pruning.
To optimize this, the locking is now deferred for relations that are
subject to "initial" runtime pruning. The planner now provides a set
of "unprunable" relations, available through the new
PlannedStmt.unprunableRelids field. AcquireExecutorLocks() will now
only lock those relations.
PlannedStmt.unprunableRelids is populated by subtracting the set of
initially prunable relids from the set of all RT indexes. The prunable
relids set is constructed by examining all PartitionPruneInfos during
set_plan_refs() and storing the RT indexes of partitions subject to
"initial" pruning steps. While at it, some duplicated code in
set_append_references() and set_mergeappend_references() that
constructs the prunable relids set has been refactored into a common
function.
To enable the executor to determine whether the plan tree it's
executing is a cached one, the CachedPlan is now made available via
the QueryDesc. The executor can call CachedPlanRequiresLocking(),
which returns true if the CachedPlan is a reusable generic plan that
might contain relations needing to be locked. If so, the executor
will lock any relation that is not in PlannedStmt.unprunableRelids.
Finally, an Assert has been added in ExecCheckPermissions() to ensure
that all relations whose permissions are checked have been properly
locked. This helps catch any accidental omission of relations from the
unprunableRelids set that should have their permissions checked.
This deferment introduces a window in which prunable relations may be
altered by concurrent DDL, potentially causing the plan to become
invalid. As a result, the executor might attempt to run an invalid plan,
leading to errors such as being unable to locate a partition-only index
during ExecInitIndexScan(). Future commits will introduce changes to
ready the executor to check plan validity during ExecutorStart() and
retry with a newly created plan if the original one becomes invalid
after taking deferred locks.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 ++--
src/backend/commands/extension.c | 1 +
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 3 +-
src/backend/executor/execMain.c | 18 ++++++++
src/backend/executor/execParallel.c | 9 +++-
src/backend/executor/execUtils.c | 30 +++++++++++++-
src/backend/executor/functions.c | 1 +
src/backend/executor/spi.c | 1 +
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 62 +++++++++++++++-------------
src/backend/partitioning/partprune.c | 24 ++++++++++-
src/backend/storage/lmgr/lmgr.c | 1 +
src/backend/tcop/pquery.c | 10 ++++-
src/backend/utils/cache/lsyscache.c | 1 -
src/backend/utils/cache/plancache.c | 40 +++++++++++-------
src/include/commands/explain.h | 5 ++-
src/include/executor/execdesc.h | 2 +
src/include/nodes/execnodes.h | 2 +
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 11 +++++
src/include/utils/plancache.h | 10 +++++
24 files changed, 195 insertions(+), 57 deletions(-)
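Note for readers (not part of the patch): the set arithmetic and the
locking decision described in the commit message can be sketched with a
plain bitmask. This is a toy model that assumes at most 63 range table
entries; ToyRelids and the toy_* functions are hypothetical names, while
the actual patch uses Bitmapset operations (bms_add_range(),
bms_difference(), bms_is_member()) as seen in the planner.c and
execUtils.c hunks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit i set means RT index i is a member; valid indexes are 1..63. */
typedef uint64_t ToyRelids;

/* Build the set {lower, ..., upper}; empty if upper < lower. */
static ToyRelids
toy_add_range(int lower, int upper)
{
    ToyRelids   set = 0;

    assert(lower >= 1 && upper <= 63);
    for (int i = lower; i <= upper; i++)
        set |= (ToyRelids) 1 << i;
    return set;
}

/* unprunable = (all RT indexes) minus (initially prunable RT indexes) */
static ToyRelids
toy_unprunable_relids(int rtable_length, ToyRelids prunable)
{
    return toy_add_range(1, rtable_length) & ~prunable;
}

/*
 * Mirror of the decision in ExecShouldLockRelation(): no lock is needed
 * when there is no cached plan or the relation is unprunable (already
 * locked up front); otherwise it depends on whether the cached plan is
 * one that requires locking.
 */
static bool
toy_should_lock(bool have_cached_plan, bool plan_requires_locking,
                ToyRelids unprunable, int rtindex)
{
    if (!have_cached_plan || ((unprunable >> rtindex) & 1))
        return false;
    return plan_requires_locking;
}
```

With four range table entries of which 3 and 4 are prunable, only 1 and 2
end up in the unprunable set, so only 3 and 4 are candidates for deferred
locking.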
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 91de442f43..db976f928a 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -552,7 +552,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 0b629b1f79..57a3375cad 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -324,7 +324,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 11df4a04d4..a83ea07db1 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -507,7 +507,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -615,7 +615,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage,
@@ -671,7 +672,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, cplan, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 1643c8c69a..3f7f4306fe 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -798,6 +798,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 91f0fd6ea3..a7a79583ec 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -438,7 +438,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 07257d4db9..311b9ebd5b 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -655,7 +655,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
+ ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 29e186fa73..271f9d93fc 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -52,6 +52,7 @@
#include "miscadmin.h"
#include "parser/parse_relation.h"
#include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
#include "tcop/utility.h"
#include "utils/acl.h"
#include "utils/backend_status.h"
@@ -597,6 +598,21 @@ ExecCheckPermissions(List *rangeTable, List *rteperminfos,
(rte->rtekind == RTE_SUBQUERY &&
rte->relkind == RELKIND_VIEW));
+ /*
+ * Ensure that we have at least an AccessShareLock on relations
+ * whose permissions need to be checked.
+ *
+ * Skip this check in a parallel worker because locks won't be
+ * taken until ExecInitNode() performs plan initialization.
+ *
+ * XXX: ExecCheckPermissions() in a parallel worker may be
+ * redundant with the checks done in the leader process, so this
+ * should be reviewed to ensure it’s necessary.
+ */
+ Assert(IsParallelWorker() ||
+ CheckRelationOidLockedByMe(rte->relid, AccessShareLock,
+ true));
+
(void) getRTEPermissionInfo(rteperminfos, rte);
/* Many-to-one mapping not allowed */
Assert(!bms_is_member(rte->perminfoindex, indexset));
@@ -829,6 +845,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ CachedPlan *cachedplan = queryDesc->cplan;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -848,6 +865,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_cachedplan = cachedplan;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index bfb3419efb..03b48e12b4 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1256,8 +1256,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
- /* Create a QueryDesc for the query. */
+ /*
+ * Create a QueryDesc for the query. We pass NULL for cachedplan, because
+ * we don't have a pointer to the CachedPlan in the leader's process. It's
+ * fine because the only reason the executor needs to see it is to decide
+ * if it should take locks on certain relations, but parallel workers
+ * always take locks anyway.
+ */
return CreateQueryDesc(pstmt,
+ NULL,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 5737f9f4eb..6dfd5a26b7 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -752,6 +752,26 @@ ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos)
estate->es_rowmarks = NULL;
}
+/*
+ * ExecShouldLockRelation
+ * Determine if the relation should be locked.
+ *
+ * The relation does not need to be locked if we are not running a cached
+ * plan or if it has already been locked as an unprunable relation.
+ *
+ * Lock the relation if it might be one of the prunable relations mentioned
+ * in the cached plan.
+ */
+static bool
+ExecShouldLockRelation(EState *estate, Index rtindex)
+{
+ if (estate->es_cachedplan == NULL ||
+ bms_is_member(rtindex, estate->es_plannedstmt->unprunableRelids))
+ return false;
+
+ return CachedPlanRequiresLocking(estate->es_cachedplan);
+}
+
/*
* ExecGetRangeTableRelation
* Open the Relation for a range table entry, if not already done
@@ -773,7 +793,7 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
Assert(rte->rtekind == RTE_RELATION);
- if (!IsParallelWorker())
+ if (!IsParallelWorker() && !ExecShouldLockRelation(estate, rti))
{
/*
* In a normal query, we should already have the appropriate lock,
@@ -789,9 +809,17 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
else
{
/*
+ * Lock relation either if we are a parallel worker or if
+ * ExecShouldLockRelation() says we should.
+ *
* If we are a parallel worker, we need to obtain our own local
* lock on the relation. This ensures sane behavior in case the
* parent process exits before we do.
+ *
+ * ExecShouldLockRelation() would return true if the RT index is
+ * that of a prunable relation and we're running a cached generic
+ * plan. AcquireExecutorLocks() of plancache.c would have locked
+ * only the unprunable relations in the plan tree.
*/
rel = table_open(rte->relid, rte->rellockmode);
}
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 692854e2b3..6f6f45e0ad 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -840,6 +840,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
dest = None_Receiver;
es->qd = CreateQueryDesc(es->stmt,
+ NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index d6516b1bca..902793b02b 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2684,6 +2684,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
snap = InvalidSnapshot;
qdesc = CreateQueryDesc(stmt,
+ cplan,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b5827d3980..cb9b6f0147 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -546,6 +546,8 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
result->rtable = glob->finalrtable;
+ result->unprunableRelids = bms_difference(bms_add_range(NULL, 1, list_length(result->rtable)),
+ glob->prunableRelids);
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 7aed84584c..b6be0e5730 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -154,6 +154,9 @@ static Plan *set_append_references(PlannerInfo *root,
static Plan *set_mergeappend_references(PlannerInfo *root,
MergeAppend *mplan,
int rtoffset);
+static void set_part_prune_references(PartitionPruneInfo *pinfo,
+ PlannerGlobal *glob,
+ int rtoffset);
static void set_hash_references(PlannerInfo *root, Plan *plan, int rtoffset);
static Relids offset_relid_set(Relids relids, int rtoffset);
static Node *fix_scan_expr(PlannerInfo *root, Node *node,
@@ -1783,20 +1786,8 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ set_part_prune_references(aplan->part_prune_info, root->glob,
+ rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1859,20 +1850,8 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ set_part_prune_references(mplan->part_prune_info, root->glob,
+ rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
@@ -1881,6 +1860,33 @@ set_mergeappend_references(PlannerInfo *root,
return (Plan *) mplan;
}
+/*
+ * Updates RT indexes in PartitionedRelPruneInfos contained in pinfo and adds
+ * the RT indexes of "prunable" relations into glob->prunableRelids.
+ */
+static void
+set_part_prune_references(PartitionPruneInfo *pinfo, PlannerGlobal *glob,
+ int rtoffset)
+{
+ ListCell *l;
+
+ foreach(l, pinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+
+ prelinfo->rtindex += rtoffset;
+ if (prelinfo->initial_pruning_steps != NIL)
+ glob->prunableRelids = bms_add_members(glob->prunableRelids,
+ prelinfo->present_part_rtis);
+ }
+ }
+}
+
/*
* set_hash_references
* Do set_plan_references processing on a Hash node
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9a1a7faac7..8e27e35df2 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -634,6 +634,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
PartitionedRelPruneInfo *pinfo = lfirst(lc);
RelOptInfo *subpart = find_base_rel(root, pinfo->rtindex);
Bitmapset *present_parts;
+ Bitmapset *present_part_rtis;
int nparts = subpart->nparts;
int *subplan_map;
int *subpart_map;
@@ -650,7 +651,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
- present_parts = NULL;
+ present_parts = present_part_rtis = NULL;
i = -1;
while ((i = bms_next_member(subpart->live_parts, i)) >= 0)
@@ -664,15 +665,35 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+
+ /*
+ * Track the RT indexes of partitions to ensure they are included
+ * in the prunableRelids set of relations that are locked during
+ * execution. This ensures that if the plan is cached, these
+ * partitions are locked when the plan is reused.
+ *
+ * Partitions without a subplan and sub-partitioned partitions
+ * where none of the sub-partitions have a subplan due to
+ * constraint exclusion are not included in this set. Instead,
+ * they are added to the unprunableRelids set, and the relations
+ * in this set are locked by AcquireExecutorLocks() before
+ * executing a cached plan.
+ */
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
+ present_part_rtis = bms_add_member(present_part_rtis,
+ partrel->relid);
/* Record finding this subplan */
subplansfound = bms_add_member(subplansfound, subplanidx);
}
else if (subpartidx >= 0)
+ {
present_parts = bms_add_member(present_parts, i);
+ present_part_rtis = bms_add_member(present_part_rtis,
+ partrel->relid);
+ }
}
/*
@@ -684,6 +705,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Record the maps and other information. */
pinfo->present_parts = present_parts;
+ pinfo->present_part_rtis = present_part_rtis;
pinfo->nparts = nparts;
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
diff --git a/src/backend/storage/lmgr/lmgr.c b/src/backend/storage/lmgr/lmgr.c
index 094522acb4..a1c89f5d72 100644
--- a/src/backend/storage/lmgr/lmgr.c
+++ b/src/backend/storage/lmgr/lmgr.c
@@ -26,6 +26,7 @@
#include "storage/procarray.h"
#include "storage/sinvaladt.h"
#include "utils/inval.h"
+#include "utils/lsyscache.h"
/*
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a1f8d03db1..6e8f6b1b8f 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -36,6 +36,7 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +66,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +79,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * cplan: CachedPlan supplying the plan
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, cplan, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -493,6 +498,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->cplan,
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1276,6 +1282,7 @@ PortalRunMulti(Portal portal,
{
/* statement can set tag string */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1285,6 +1292,7 @@ PortalRunMulti(Portal portal,
{
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/lsyscache.c b/src/backend/utils/cache/lsyscache.c
index 48a280d089..f647821382 100644
--- a/src/backend/utils/cache/lsyscache.c
+++ b/src/backend/utils/cache/lsyscache.c
@@ -2113,7 +2113,6 @@ get_rel_relam(Oid relid)
return result;
}
-
/* ---------- TRANSFORM CACHE ---------- */
Oid
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5af1a168ec..5b75dadf13 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -104,7 +104,8 @@ static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
static bool CheckCachedPlan(CachedPlanSource *plansource);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
@@ -815,8 +816,11 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
* Caller must have already called RevalidateCachedQuery to verify that the
* querytree is up to date.
*
- * On a "true" return, we have acquired the locks needed to run the plan.
- * (We must do this for the "true" result to be race-condition-free.)
+ * On a "true" return, we have acquired locks on the "unprunableRelids" set
+ * for all plans in plansource->stmt_list. The plans are not completely
+ * race-condition-free until the executor takes locks on the set of prunable
+ * relations that survive initial runtime pruning during executor
+ * initialization.
*/
static bool
CheckCachedPlan(CachedPlanSource *plansource)
@@ -893,10 +897,10 @@ CheckCachedPlan(CachedPlanSource *plansource)
* or it can be set to NIL if we need to re-copy the plansource's query_list.
*
* To build a generic, parameter-value-independent plan, pass NULL for
- * boundParams. To build a custom plan, pass the actual parameter values via
- * boundParams. For best effect, the PARAM_FLAG_CONST flag should be set on
- * each parameter value; otherwise the planner will treat the value as a
- * hint rather than a hard constant.
+ * boundParams, and true for generic. To build a custom plan, pass the actual
+ * parameter values via boundParams, and false for generic. For best effect,
+ * the PARAM_FLAG_CONST flag should be set on each parameter value; otherwise
+ * the planner will treat the value as a hint rather than a hard constant.
*
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
@@ -904,7 +908,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic)
{
CachedPlan *plan;
List *plist;
@@ -1026,6 +1031,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->refcount = 0;
plan->context = plan_context;
plan->is_oneshot = plansource->is_oneshot;
+ plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
@@ -1196,7 +1202,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv, true);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1241,7 +1247,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, false);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1387,8 +1393,8 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
}
/*
- * Reject if AcquireExecutorLocks would have anything to do. This is
- * probably unnecessary given the previous check, but let's be safe.
+ * Reject if there are any lockable relations. This is probably
+ * unnecessary given the previous check, but let's be safe.
*/
foreach(lc, plan->stmt_list)
{
@@ -1776,7 +1782,7 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ int rtindex;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1794,9 +1800,13 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ rtindex = -1;
+ while ((rtindex = bms_next_member(plannedstmt->unprunableRelids,
+ rtindex)) >= 0)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ RangeTblEntry *rte = list_nth_node(RangeTblEntry,
+ plannedstmt->rtable,
+ rtindex - 1);
if (!(rte->rtekind == RTE_RELATION ||
(rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid))))
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 9b8b351d9a..bf326eeb70 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -101,8 +101,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
- ExplainState *es, const char *queryString,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
+ const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
const BufferUsage *bufusage,
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0a7274e26c..0e7245435d 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index af7d8fd1e7..ee089505a0 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -42,6 +42,7 @@
#include "storage/condition_variable.h"
#include "utils/hsearch.h"
#include "utils/queryenvironment.h"
+#include "utils/plancache.h"
#include "utils/reltrigger.h"
#include "utils/sharedtuplestore.h"
#include "utils/snapshot.h"
@@ -633,6 +634,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ CachedPlan *es_cachedplan; /* CachedPlan supplying the plannedstmt */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 540d021592..2466157b25 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -116,6 +116,12 @@ typedef struct PlannerGlobal
/* "flat" rangetable for executor */
List *finalrtable;
+ /*
+ * RT indexes of relations subject to removal from the plan due to runtime
+ * pruning at plan initialization time
+ */
+ Bitmapset *prunableRelids;
+
/* "flat" list of RTEPermissionInfos */
List *finalrteperminfos;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 62cd6a6666..ae608812f1 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -71,6 +71,10 @@ typedef struct PlannedStmt
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *unprunableRelids; /* RT indexes of relations that are not
+ * subject to runtime pruning; for
+ * AcquireExecutorLocks() */
+
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
@@ -1459,6 +1463,13 @@ typedef struct PartitionedRelPruneInfo
/* Indexes of all partitions which subplans or subparts are present for */
Bitmapset *present_parts;
+ /*
+ * RT indexes of all partitions which subplans or subparts are present
+ * for; only used during planning to help in the construction of
+ * PlannerGlobal.prunableRelids.
+ */
+ Bitmapset *present_part_rtis;
+
/* Length of the following arrays: */
int nparts;
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a90dfdf906..0b5ee007ca 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -149,6 +149,7 @@ typedef struct CachedPlan
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
bool is_oneshot; /* is it a "oneshot" plan? */
+ bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
Oid planRoleId; /* Role ID the plan was created for */
@@ -235,4 +236,13 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
extern CachedExpression *GetCachedExpression(Node *expr);
extern void FreeCachedExpression(CachedExpression *cexpr);
+/*
+ * CachedPlanRequiresLocking: should the executor acquire locks?
+ */
+static inline bool
+CachedPlanRequiresLocking(CachedPlan *cplan)
+{
+ return cplan->is_generic;
+}
+
#endif /* PLANCACHE_H */
--
2.43.0
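To make the lock-splitting arithmetic in the patch above concrete, here is a minimal, self-contained sketch of how unprunableRelids falls out of the range table length and the prunable set, and how an AcquireExecutorLocks()-style loop would walk the result. This is not PostgreSQL code: RelidSet is a hypothetical 64-bit stand-in for Bitmapset, and relids_range(), relids_difference(), relids_next_member() and count_upfront_locks() are illustrative names that only mimic bms_add_range(), bms_difference() and bms_next_member().

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

/* Build the set {lo, ..., hi}; mimics bms_add_range(NULL, lo, hi). */
static RelidSet
relids_range(int lo, int hi)
{
    RelidSet    s = 0;

    assert(lo >= 0 && hi < 64);
    for (int i = lo; i <= hi; i++)
        s |= (RelidSet) 1 << i;
    return s;
}

/* Members of a that are not in b; mimics bms_difference(a, b). */
static RelidSet
relids_difference(RelidSet a, RelidSet b)
{
    return a & ~b;
}

/* Smallest member greater than prev, or -1 if none; mimics bms_next_member(). */
static int
relids_next_member(RelidSet s, int prev)
{
    for (int i = prev + 1; i < 64; i++)
    {
        if (s & ((RelidSet) 1 << i))
            return i;
    }
    return -1;
}

/*
 * Count the relations that would be locked up front for a cached plan:
 * every RT index in 1..rtable_length that is not in the prunable set.
 * A real caller would fetch the range table entry at rtindex - 1, since
 * RT indexes are 1-based while the list is 0-based.
 */
static int
count_upfront_locks(int rtable_length, RelidSet prunable)
{
    RelidSet    unprunable = relids_difference(relids_range(1, rtable_length),
                                               prunable);
    int         nlocks = 0;
    int         rtindex = -1;

    while ((rtindex = relids_next_member(unprunable, rtindex)) >= 0)
        nlocks++;
    return nlocks;
}
```

With a five-entry range table and RT indexes 3 and 4 marked prunable, the loop visits indexes 1, 2 and 5. The two prunable entries would be left for the executor to lock once initial pruning has decided which of them survive.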
[application/octet-stream] v52-0002-Assorted-tightening-in-various-ExecEnd-routines.patch (31.3K, 3-v52-0002-Assorted-tightening-in-various-ExecEnd-routines.patch)
From a4077c425a1874036d2937bb93a6116cfd3640cb Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 28 Sep 2023 16:56:29 +0900
Subject: [PATCH v52 2/3] Assorted tightening in various ExecEnd*() routines
This includes adding NULLness checks on pointers before cleaning them
up. Many ExecEnd*() routines already perform this check, but a few
are missing it. These NULLness checks might seem redundant as
things stand since the ExecEnd*() routines operate under the
assumption that their matching ExecInit*() routine would have fully
executed, ensuring pointers are set. However, that assumption seems a
bit shaky in the face of future changes.
This also adds a guard at the beginning of EvalPlanQualEnd() to return
early if the EPQState does not appear to have been initialized. That
case can happen if the corresponding ExecInit*() routine returned
early without calling EvalPlanQualInit().
While at it, this commit ensures that pointers are consistently set
to NULL after cleanup in all ExecEnd*() routines.
Finally, for enhanced consistency, the format of NULLness checks has
been standardized to "if (pointer != NULL)", replacing the previous
"if (pointer)" style.
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
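The cleanup convention described in this commit message (check the pointer, release the resource, then reset the pointer to NULL) can be sketched outside the executor as follows. This is only an illustration under stated assumptions: DemoScanState, demo_release() and demo_end_scan() are hypothetical names, and heap allocations stand in for the scan and relation descriptors that the real ExecEnd*() routines close.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical node state; either pointer may be NULL if init returned early. */
typedef struct DemoScanState
{
    int        *scan_desc;      /* stands in for a scan descriptor */
    int        *rel_desc;       /* stands in for an open relation */
} DemoScanState;

/* Counts releases so that callers can observe that nothing is freed twice. */
static int  demo_release_count = 0;

static void
demo_release(int *resource)
{
    free(resource);
    demo_release_count++;
}

/*
 * End routine written in the style the commit message describes: every
 * cleanup is guarded by an explicit "!= NULL" check and the pointer is
 * reset afterwards, so a partially initialized node is handled and a
 * repeated call is a no-op.
 */
static void
demo_end_scan(DemoScanState *node)
{
    assert(node != NULL);

    /* close the scan (no-op if we didn't start it) */
    if (node->scan_desc != NULL)
    {
        demo_release(node->scan_desc);
        node->scan_desc = NULL;
    }

    /* close the relation (no-op if we didn't open it) */
    if (node->rel_desc != NULL)
    {
        demo_release(node->rel_desc);
        node->rel_desc = NULL;
    }
}
```

Calling demo_end_scan() on a node whose rel_desc was never set releases only scan_desc, and a second call releases nothing, which is the property a partially initialized plan tree would seem to need.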
---
src/backend/executor/execMain.c | 4 ++
src/backend/executor/nodeAgg.c | 27 +++++++++----
src/backend/executor/nodeAppend.c | 3 ++
src/backend/executor/nodeBitmapAnd.c | 4 +-
src/backend/executor/nodeBitmapHeapscan.c | 46 ++++++++++++++--------
src/backend/executor/nodeBitmapIndexscan.c | 23 ++++++-----
src/backend/executor/nodeBitmapOr.c | 4 +-
src/backend/executor/nodeCtescan.c | 3 +-
src/backend/executor/nodeForeignscan.c | 17 ++++----
src/backend/executor/nodeGather.c | 1 +
src/backend/executor/nodeGatherMerge.c | 1 +
src/backend/executor/nodeGroup.c | 6 +--
src/backend/executor/nodeHash.c | 6 +--
src/backend/executor/nodeHashjoin.c | 4 +-
src/backend/executor/nodeIncrementalSort.c | 13 +++++-
src/backend/executor/nodeIndexonlyscan.c | 25 ++++++------
src/backend/executor/nodeIndexscan.c | 23 ++++++-----
src/backend/executor/nodeLimit.c | 1 +
src/backend/executor/nodeLockRows.c | 1 +
src/backend/executor/nodeMaterial.c | 5 ++-
src/backend/executor/nodeMemoize.c | 7 +++-
src/backend/executor/nodeMergeAppend.c | 3 ++
src/backend/executor/nodeMergejoin.c | 2 +
src/backend/executor/nodeModifyTable.c | 11 +++++-
src/backend/executor/nodeNestloop.c | 2 +
src/backend/executor/nodeProjectSet.c | 1 +
src/backend/executor/nodeRecursiveunion.c | 24 +++++++++--
src/backend/executor/nodeResult.c | 1 +
src/backend/executor/nodeSamplescan.c | 7 +++-
src/backend/executor/nodeSeqscan.c | 16 +++-----
src/backend/executor/nodeSetOp.c | 6 ++-
src/backend/executor/nodeSort.c | 5 ++-
src/backend/executor/nodeSubqueryscan.c | 1 +
src/backend/executor/nodeTableFuncscan.c | 4 +-
src/backend/executor/nodeTidrangescan.c | 12 ++++--
src/backend/executor/nodeTidscan.c | 8 +++-
src/backend/executor/nodeUnique.c | 1 +
src/backend/executor/nodeWindowAgg.c | 41 +++++++++++++------
38 files changed, 246 insertions(+), 123 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 271f9d93fc..0f6dbd1e2b 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -2999,6 +2999,10 @@ EvalPlanQualEnd(EPQState *epqstate)
MemoryContext oldcontext;
ListCell *l;
+ /* Nothing to do if no EvalPlanQualInit() was done to begin with. */
+ if (epqstate->parentestate == NULL)
+ return;
+
rtsize = epqstate->parentestate->es_range_table_size;
/*
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 53ead77ece..0dfba5ca16 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -4303,7 +4303,6 @@ GetAggInitVal(Datum textInitVal, Oid transtype)
void
ExecEndAgg(AggState *node)
{
- PlanState *outerPlan;
int transno;
int numGroupingSets = Max(node->maxsets, 1);
int setno;
@@ -4313,7 +4312,7 @@ ExecEndAgg(AggState *node)
* worker back into shared memory so that it can be picked up by the main
* process to report in EXPLAIN ANALYZE.
*/
- if (node->shared_info && IsParallelWorker())
+ if (node->shared_info != NULL && IsParallelWorker())
{
AggregateInstrumentation *si;
@@ -4326,10 +4325,16 @@ ExecEndAgg(AggState *node)
/* Make sure we have closed any open tuplesorts */
- if (node->sort_in)
+ if (node->sort_in != NULL)
+ {
tuplesort_end(node->sort_in);
- if (node->sort_out)
+ node->sort_in = NULL;
+ }
+ if (node->sort_out != NULL)
+ {
tuplesort_end(node->sort_out);
+ node->sort_out = NULL;
+ }
hashagg_reset_spill_state(node);
@@ -4345,19 +4350,25 @@ ExecEndAgg(AggState *node)
for (setno = 0; setno < numGroupingSets; setno++)
{
- if (pertrans->sortstates[setno])
+ if (pertrans->sortstates[setno] != NULL)
tuplesort_end(pertrans->sortstates[setno]);
}
}
/* And ensure any agg shutdown callbacks have been called */
for (setno = 0; setno < numGroupingSets; setno++)
+ {
ReScanExprContext(node->aggcontexts[setno]);
- if (node->hashcontext)
+ node->aggcontexts[setno] = NULL;
+ }
+ if (node->hashcontext != NULL)
+ {
ReScanExprContext(node->hashcontext);
+ node->hashcontext = NULL;
+ }
- outerPlan = outerPlanState(node);
- ExecEndNode(outerPlan);
+ ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
void
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index ca0f54d676..86d75b1a7e 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -399,7 +399,10 @@ ExecEndAppend(AppendState *node)
* shut down each of the subscans
*/
for (i = 0; i < nplans; i++)
+ {
ExecEndNode(appendplans[i]);
+ appendplans[i] = NULL;
+ }
}
void
diff --git a/src/backend/executor/nodeBitmapAnd.c b/src/backend/executor/nodeBitmapAnd.c
index 9c9c666872..ae391222bf 100644
--- a/src/backend/executor/nodeBitmapAnd.c
+++ b/src/backend/executor/nodeBitmapAnd.c
@@ -192,8 +192,8 @@ ExecEndBitmapAnd(BitmapAndState *node)
*/
for (i = 0; i < nplans; i++)
{
- if (bitmapplans[i])
- ExecEndNode(bitmapplans[i]);
+ ExecEndNode(bitmapplans[i]);
+ bitmapplans[i] = NULL;
}
}
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index 3c63bdd93d..19f18ab817 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -625,8 +625,6 @@ ExecReScanBitmapHeapScan(BitmapHeapScanState *node)
void
ExecEndBitmapHeapScan(BitmapHeapScanState *node)
{
- TableScanDesc scanDesc;
-
/*
* When ending a parallel worker, copy the statistics gathered by the
* worker back into shared memory so that it can be picked up by the main
@@ -650,38 +648,54 @@ ExecEndBitmapHeapScan(BitmapHeapScanState *node)
si->lossy_pages += node->stats.lossy_pages;
}
- /*
- * extract information from the node
- */
- scanDesc = node->ss.ss_currentScanDesc;
-
/*
* close down subplans
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
/*
* release bitmaps and buffers if any
*/
- if (node->tbmiterator)
+ if (node->tbmiterator != NULL)
+ {
tbm_end_iterate(node->tbmiterator);
- if (node->prefetch_iterator)
+ node->tbmiterator = NULL;
+ }
+ if (node->prefetch_iterator != NULL)
+ {
tbm_end_iterate(node->prefetch_iterator);
- if (node->tbm)
+ node->prefetch_iterator = NULL;
+ }
+ if (node->tbm != NULL)
+ {
tbm_free(node->tbm);
- if (node->shared_tbmiterator)
+ node->tbm = NULL;
+ }
+ if (node->shared_tbmiterator != NULL)
+ {
tbm_end_shared_iterate(node->shared_tbmiterator);
- if (node->shared_prefetch_iterator)
+ node->shared_tbmiterator = NULL;
+ }
+ if (node->shared_prefetch_iterator != NULL)
+ {
tbm_end_shared_iterate(node->shared_prefetch_iterator);
+ node->shared_prefetch_iterator = NULL;
+ }
if (node->pvmbuffer != InvalidBuffer)
+ {
ReleaseBuffer(node->pvmbuffer);
+ node->pvmbuffer = InvalidBuffer;
+ }
/*
- * close heap scan
+ * close heap scan (no-op if we didn't start it)
*/
- if (scanDesc)
- table_endscan(scanDesc);
-
+ if (node->ss.ss_currentScanDesc != NULL)
+ {
+ table_endscan(node->ss.ss_currentScanDesc);
+ node->ss.ss_currentScanDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeBitmapIndexscan.c b/src/backend/executor/nodeBitmapIndexscan.c
index 6df8e17ec8..4669e8d0ce 100644
--- a/src/backend/executor/nodeBitmapIndexscan.c
+++ b/src/backend/executor/nodeBitmapIndexscan.c
@@ -174,22 +174,21 @@ ExecReScanBitmapIndexScan(BitmapIndexScanState *node)
void
ExecEndBitmapIndexScan(BitmapIndexScanState *node)
{
- Relation indexRelationDesc;
- IndexScanDesc indexScanDesc;
-
- /*
- * extract information from the node
- */
- indexRelationDesc = node->biss_RelationDesc;
- indexScanDesc = node->biss_ScanDesc;
+ /* close the scan (no-op if we didn't start it) */
+ if (node->biss_ScanDesc != NULL)
+ {
+ index_endscan(node->biss_ScanDesc);
+ node->biss_ScanDesc = NULL;
+ }
/*
* close the index relation (no-op if we didn't open it)
*/
- if (indexScanDesc)
- index_endscan(indexScanDesc);
- if (indexRelationDesc)
- index_close(indexRelationDesc, NoLock);
+ if (node->biss_RelationDesc != NULL)
+ {
+ index_close(node->biss_RelationDesc, NoLock);
+ node->biss_RelationDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeBitmapOr.c b/src/backend/executor/nodeBitmapOr.c
index 7029536c64..de439235d2 100644
--- a/src/backend/executor/nodeBitmapOr.c
+++ b/src/backend/executor/nodeBitmapOr.c
@@ -210,8 +210,8 @@ ExecEndBitmapOr(BitmapOrState *node)
*/
for (i = 0; i < nplans; i++)
{
- if (bitmapplans[i])
- ExecEndNode(bitmapplans[i]);
+ ExecEndNode(bitmapplans[i]);
+ bitmapplans[i] = NULL;
}
}
diff --git a/src/backend/executor/nodeCtescan.c b/src/backend/executor/nodeCtescan.c
index 8081eed887..7cea943988 100644
--- a/src/backend/executor/nodeCtescan.c
+++ b/src/backend/executor/nodeCtescan.c
@@ -290,10 +290,11 @@ ExecEndCteScan(CteScanState *node)
/*
* If I am the leader, free the tuplestore.
*/
- if (node->leader == node)
+ if (node->leader != NULL && node->leader == node)
{
tuplestore_end(node->cte_table);
node->cte_table = NULL;
+ node->leader = NULL;
}
}
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index fe4ae55c0f..1357ccf3c9 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -300,17 +300,20 @@ ExecEndForeignScan(ForeignScanState *node)
EState *estate = node->ss.ps.state;
/* Let the FDW shut down */
- if (plan->operation != CMD_SELECT)
+ if (node->fdwroutine != NULL)
{
- if (estate->es_epq_active == NULL)
- node->fdwroutine->EndDirectModify(node);
+ if (plan->operation != CMD_SELECT)
+ {
+ if (estate->es_epq_active == NULL)
+ node->fdwroutine->EndDirectModify(node);
+ }
+ else
+ node->fdwroutine->EndForeignScan(node);
}
- else
- node->fdwroutine->EndForeignScan(node);
/* Shut down any outer plan. */
- if (outerPlanState(node))
- ExecEndNode(outerPlanState(node));
+ ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index 5d4ffe989c..cae5ea1f92 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -244,6 +244,7 @@ void
ExecEndGather(GatherState *node)
{
ExecEndNode(outerPlanState(node)); /* let children clean up first */
+ outerPlanState(node) = NULL;
ExecShutdownGather(node);
}
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index 45f6017c29..b36cd89e7d 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -284,6 +284,7 @@ void
ExecEndGatherMerge(GatherMergeState *node)
{
ExecEndNode(outerPlanState(node)); /* let children clean up first */
+ outerPlanState(node) = NULL;
ExecShutdownGatherMerge(node);
}
diff --git a/src/backend/executor/nodeGroup.c b/src/backend/executor/nodeGroup.c
index da32bec181..807429e504 100644
--- a/src/backend/executor/nodeGroup.c
+++ b/src/backend/executor/nodeGroup.c
@@ -225,10 +225,8 @@ ExecInitGroup(Group *node, EState *estate, int eflags)
void
ExecEndGroup(GroupState *node)
{
- PlanState *outerPlan;
-
- outerPlan = outerPlanState(node);
- ExecEndNode(outerPlan);
+ ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
void
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index 570a90ebe1..a913d5b50c 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -427,13 +427,11 @@ ExecInitHash(Hash *node, EState *estate, int eflags)
void
ExecEndHash(HashState *node)
{
- PlanState *outerPlan;
-
/*
* shut down the subplan
*/
- outerPlan = outerPlanState(node);
- ExecEndNode(outerPlan);
+ ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 2f7170604d..901c9e9be7 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -950,7 +950,7 @@ ExecEndHashJoin(HashJoinState *node)
/*
* Free hash table
*/
- if (node->hj_HashTable)
+ if (node->hj_HashTable != NULL)
{
ExecHashTableDestroy(node->hj_HashTable);
node->hj_HashTable = NULL;
@@ -960,7 +960,9 @@ ExecEndHashJoin(HashJoinState *node)
* clean up subtrees
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
ExecEndNode(innerPlanState(node));
+ innerPlanState(node) = NULL;
}
/*
diff --git a/src/backend/executor/nodeIncrementalSort.c b/src/backend/executor/nodeIncrementalSort.c
index 2ce5ed5ec8..010bcfafa8 100644
--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -1078,8 +1078,16 @@ ExecEndIncrementalSort(IncrementalSortState *node)
{
SO_printf("ExecEndIncrementalSort: shutting down sort node\n");
- ExecDropSingleTupleTableSlot(node->group_pivot);
- ExecDropSingleTupleTableSlot(node->transfer_tuple);
+ if (node->group_pivot != NULL)
+ {
+ ExecDropSingleTupleTableSlot(node->group_pivot);
+ node->group_pivot = NULL;
+ }
+ if (node->transfer_tuple != NULL)
+ {
+ ExecDropSingleTupleTableSlot(node->transfer_tuple);
+ node->transfer_tuple = NULL;
+ }
/*
* Release tuplesort resources.
@@ -1099,6 +1107,7 @@ ExecEndIncrementalSort(IncrementalSortState *node)
* Shut down the subplan.
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
SO_printf("ExecEndIncrementalSort: sort node shutdown\n");
}
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index 612c673895..481d479760 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -397,15 +397,6 @@ ExecReScanIndexOnlyScan(IndexOnlyScanState *node)
void
ExecEndIndexOnlyScan(IndexOnlyScanState *node)
{
- Relation indexRelationDesc;
- IndexScanDesc indexScanDesc;
-
- /*
- * extract information from the node
- */
- indexRelationDesc = node->ioss_RelationDesc;
- indexScanDesc = node->ioss_ScanDesc;
-
/* Release VM buffer pin, if any. */
if (node->ioss_VMBuffer != InvalidBuffer)
{
@@ -413,13 +404,21 @@ ExecEndIndexOnlyScan(IndexOnlyScanState *node)
node->ioss_VMBuffer = InvalidBuffer;
}
+ /* close the scan (no-op if we didn't start it) */
+ if (node->ioss_ScanDesc != NULL)
+ {
+ index_endscan(node->ioss_ScanDesc);
+ node->ioss_ScanDesc = NULL;
+ }
+
/*
* close the index relation (no-op if we didn't open it)
*/
- if (indexScanDesc)
- index_endscan(indexScanDesc);
- if (indexRelationDesc)
- index_close(indexRelationDesc, NoLock);
+ if (node->ioss_RelationDesc != NULL)
+ {
+ index_close(node->ioss_RelationDesc, NoLock);
+ node->ioss_RelationDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index 8000feff4c..a8172d8b82 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -784,22 +784,21 @@ ExecIndexAdvanceArrayKeys(IndexArrayKeyInfo *arrayKeys, int numArrayKeys)
void
ExecEndIndexScan(IndexScanState *node)
{
- Relation indexRelationDesc;
- IndexScanDesc indexScanDesc;
-
- /*
- * extract information from the node
- */
- indexRelationDesc = node->iss_RelationDesc;
- indexScanDesc = node->iss_ScanDesc;
+ /* close the scan (no-op if we didn't start it) */
+ if (node->iss_ScanDesc != NULL)
+ {
+ index_endscan(node->iss_ScanDesc);
+ node->iss_ScanDesc = NULL;
+ }
/*
* close the index relation (no-op if we didn't open it)
*/
- if (indexScanDesc)
- index_endscan(indexScanDesc);
- if (indexRelationDesc)
- index_close(indexRelationDesc, NoLock);
+ if (node->iss_RelationDesc != NULL)
+ {
+ index_close(node->iss_RelationDesc, NoLock);
+ node->iss_RelationDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index e6f1fb1562..eb7b6e52be 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -534,6 +534,7 @@ void
ExecEndLimit(LimitState *node)
{
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 41754ddfea..0d3489195b 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -387,6 +387,7 @@ ExecEndLockRows(LockRowsState *node)
/* We may have shut down EPQ already, but no harm in another call */
EvalPlanQualEnd(&node->lr_epqstate);
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
diff --git a/src/backend/executor/nodeMaterial.c b/src/backend/executor/nodeMaterial.c
index 22e1787fbd..883e3f3933 100644
--- a/src/backend/executor/nodeMaterial.c
+++ b/src/backend/executor/nodeMaterial.c
@@ -243,13 +243,16 @@ ExecEndMaterial(MaterialState *node)
* Release tuplestore resources
*/
if (node->tuplestorestate != NULL)
+ {
tuplestore_end(node->tuplestorestate);
- node->tuplestorestate = NULL;
+ node->tuplestorestate = NULL;
+ }
/*
* shut down the subplan
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeMemoize.c b/src/backend/executor/nodeMemoize.c
index df8e3fff08..690dee1daa 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -1128,12 +1128,17 @@ ExecEndMemoize(MemoizeState *node)
}
/* Remove the cache context */
- MemoryContextDelete(node->tableContext);
+ if (node->tableContext != NULL)
+ {
+ MemoryContextDelete(node->tableContext);
+ node->tableContext = NULL;
+ }
/*
* shut down the subplan
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
void
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index e1b9b984a7..3236444cf1 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -333,7 +333,10 @@ ExecEndMergeAppend(MergeAppendState *node)
* shut down each of the subscans
*/
for (i = 0; i < nplans; i++)
+ {
ExecEndNode(mergeplans[i]);
+ mergeplans[i] = NULL;
+ }
}
void
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 29c54fcd75..926e631d88 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -1647,7 +1647,9 @@ ExecEndMergeJoin(MergeJoinState *node)
* shut down the subplans
*/
ExecEndNode(innerPlanState(node));
+ innerPlanState(node) = NULL;
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
MJ1_printf("ExecEndMergeJoin: %s\n",
"node processing ended");
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 8bf4c80d4a..9e56f9c36c 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4724,7 +4724,9 @@ ExecEndModifyTable(ModifyTableState *node)
for (j = 0; j < resultRelInfo->ri_NumSlotsInitialized; j++)
{
ExecDropSingleTupleTableSlot(resultRelInfo->ri_Slots[j]);
+ resultRelInfo->ri_Slots[j] = NULL;
ExecDropSingleTupleTableSlot(resultRelInfo->ri_PlanSlots[j]);
+ resultRelInfo->ri_PlanSlots[j] = NULL;
}
}
@@ -4732,12 +4734,16 @@ ExecEndModifyTable(ModifyTableState *node)
* Close all the partitioned tables, leaf partitions, and their indices
* and release the slot used for tuple routing, if set.
*/
- if (node->mt_partition_tuple_routing)
+ if (node->mt_partition_tuple_routing != NULL)
{
ExecCleanupTupleRouting(node, node->mt_partition_tuple_routing);
+ node->mt_partition_tuple_routing = NULL;
- if (node->mt_root_tuple_slot)
+ if (node->mt_root_tuple_slot != NULL)
+ {
ExecDropSingleTupleTableSlot(node->mt_root_tuple_slot);
+ node->mt_root_tuple_slot = NULL;
+ }
}
/*
@@ -4749,6 +4755,7 @@ ExecEndModifyTable(ModifyTableState *node)
* shut down subplan
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
void
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index 7f4bf6c4db..01f3d56a3b 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -367,7 +367,9 @@ ExecEndNestLoop(NestLoopState *node)
* close down subplans
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
ExecEndNode(innerPlanState(node));
+ innerPlanState(node) = NULL;
NL1_printf("ExecEndNestLoop: %s\n",
"node processing ended");
diff --git a/src/backend/executor/nodeProjectSet.c b/src/backend/executor/nodeProjectSet.c
index e483730015..ca9a5e2ed2 100644
--- a/src/backend/executor/nodeProjectSet.c
+++ b/src/backend/executor/nodeProjectSet.c
@@ -331,6 +331,7 @@ ExecEndProjectSet(ProjectSetState *node)
* shut down subplans
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
void
diff --git a/src/backend/executor/nodeRecursiveunion.c b/src/backend/executor/nodeRecursiveunion.c
index c7f8a19fa4..7680142c7b 100644
--- a/src/backend/executor/nodeRecursiveunion.c
+++ b/src/backend/executor/nodeRecursiveunion.c
@@ -272,20 +272,36 @@ void
ExecEndRecursiveUnion(RecursiveUnionState *node)
{
/* Release tuplestores */
- tuplestore_end(node->working_table);
- tuplestore_end(node->intermediate_table);
+ if (node->working_table != NULL)
+ {
+ tuplestore_end(node->working_table);
+ node->working_table = NULL;
+ }
+ if (node->intermediate_table != NULL)
+ {
+ tuplestore_end(node->intermediate_table);
+ node->intermediate_table = NULL;
+ }
/* free subsidiary stuff including hashtable */
- if (node->tempContext)
+ if (node->tempContext != NULL)
+ {
MemoryContextDelete(node->tempContext);
- if (node->tableContext)
+ node->tempContext = NULL;
+ }
+ if (node->tableContext != NULL)
+ {
MemoryContextDelete(node->tableContext);
+ node->tableContext = NULL;
+ }
/*
* close down subplans
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
ExecEndNode(innerPlanState(node));
+ innerPlanState(node) = NULL;
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeResult.c b/src/backend/executor/nodeResult.c
index 348361e7f4..e3cfc9b772 100644
--- a/src/backend/executor/nodeResult.c
+++ b/src/backend/executor/nodeResult.c
@@ -243,6 +243,7 @@ ExecEndResult(ResultState *node)
* shut down subplans
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
void
diff --git a/src/backend/executor/nodeSamplescan.c b/src/backend/executor/nodeSamplescan.c
index 714b076e64..6ab91001bc 100644
--- a/src/backend/executor/nodeSamplescan.c
+++ b/src/backend/executor/nodeSamplescan.c
@@ -181,14 +181,17 @@ ExecEndSampleScan(SampleScanState *node)
/*
* Tell sampling function that we finished the scan.
*/
- if (node->tsmroutine->EndSampleScan)
+ if (node->tsmroutine != NULL && node->tsmroutine->EndSampleScan)
node->tsmroutine->EndSampleScan(node);
/*
- * close heap scan
+ * close heap scan (no-op if we didn't start it)
*/
if (node->ss.ss_currentScanDesc)
+ {
table_endscan(node->ss.ss_currentScanDesc);
+ node->ss.ss_currentScanDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index 7cb12a11c2..b052775e5b 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -183,18 +183,14 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
void
ExecEndSeqScan(SeqScanState *node)
{
- TableScanDesc scanDesc;
-
- /*
- * get information from node
- */
- scanDesc = node->ss.ss_currentScanDesc;
-
/*
- * close heap scan
+ * close heap scan (no-op if we didn't start it)
*/
- if (scanDesc != NULL)
- table_endscan(scanDesc);
+ if (node->ss.ss_currentScanDesc != NULL)
+ {
+ table_endscan(node->ss.ss_currentScanDesc);
+ node->ss.ss_currentScanDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeSetOp.c b/src/backend/executor/nodeSetOp.c
index a8ac68b482..fe34b2134f 100644
--- a/src/backend/executor/nodeSetOp.c
+++ b/src/backend/executor/nodeSetOp.c
@@ -583,10 +583,14 @@ void
ExecEndSetOp(SetOpState *node)
{
/* free subsidiary stuff including hashtable */
- if (node->tableContext)
+ if (node->tableContext != NULL)
+ {
MemoryContextDelete(node->tableContext);
+ node->tableContext = NULL;
+ }
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index 3fc925d7b4..af852464d0 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -307,13 +307,16 @@ ExecEndSort(SortState *node)
* Release tuplesort resources
*/
if (node->tuplesortstate != NULL)
+ {
tuplesort_end((Tuplesortstate *) node->tuplesortstate);
- node->tuplesortstate = NULL;
+ node->tuplesortstate = NULL;
+ }
/*
* shut down the subplan
*/
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
SO1_printf("ExecEndSort: %s\n",
"sort node shutdown");
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 782097eaf2..0b2612183a 100644
--- a/src/backend/executor/nodeSubqueryscan.c
+++ b/src/backend/executor/nodeSubqueryscan.c
@@ -171,6 +171,7 @@ ExecEndSubqueryScan(SubqueryScanState *node)
* close down subquery
*/
ExecEndNode(node->subplan);
+ node->subplan = NULL;
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeTableFuncscan.c b/src/backend/executor/nodeTableFuncscan.c
index f483221bb8..778d25d511 100644
--- a/src/backend/executor/nodeTableFuncscan.c
+++ b/src/backend/executor/nodeTableFuncscan.c
@@ -223,8 +223,10 @@ ExecEndTableFuncScan(TableFuncScanState *node)
* Release tuplestore resources
*/
if (node->tupstore != NULL)
+ {
tuplestore_end(node->tupstore);
- node->tupstore = NULL;
+ node->tupstore = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeTidrangescan.c b/src/backend/executor/nodeTidrangescan.c
index 9aa7683d7e..702ee884d2 100644
--- a/src/backend/executor/nodeTidrangescan.c
+++ b/src/backend/executor/nodeTidrangescan.c
@@ -326,10 +326,14 @@ ExecReScanTidRangeScan(TidRangeScanState *node)
void
ExecEndTidRangeScan(TidRangeScanState *node)
{
- TableScanDesc scan = node->ss.ss_currentScanDesc;
-
- if (scan != NULL)
- table_endscan(scan);
+ /*
+ * close heap scan (no-op if we didn't start it)
+ */
+ if (node->ss.ss_currentScanDesc != NULL)
+ {
+ table_endscan(node->ss.ss_currentScanDesc);
+ node->ss.ss_currentScanDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeTidscan.c b/src/backend/executor/nodeTidscan.c
index 864a9013b6..f375951699 100644
--- a/src/backend/executor/nodeTidscan.c
+++ b/src/backend/executor/nodeTidscan.c
@@ -469,8 +469,14 @@ ExecReScanTidScan(TidScanState *node)
void
ExecEndTidScan(TidScanState *node)
{
- if (node->ss.ss_currentScanDesc)
+ /*
+ * close heap scan (no-op if we didn't start it)
+ */
+ if (node->ss.ss_currentScanDesc != NULL)
+ {
table_endscan(node->ss.ss_currentScanDesc);
+ node->ss.ss_currentScanDesc = NULL;
+ }
}
/* ----------------------------------------------------------------
diff --git a/src/backend/executor/nodeUnique.c b/src/backend/executor/nodeUnique.c
index a125923e93..b82d0e9ad5 100644
--- a/src/backend/executor/nodeUnique.c
+++ b/src/backend/executor/nodeUnique.c
@@ -168,6 +168,7 @@ void
ExecEndUnique(UniqueState *node)
{
ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 3221fa1522..561d7e731d 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -1351,11 +1351,14 @@ release_partition(WindowAggState *winstate)
* any aggregate temp data). We don't rely on retail pfree because some
* aggregates might have allocated data we don't have direct pointers to.
*/
- MemoryContextReset(winstate->partcontext);
- MemoryContextReset(winstate->aggcontext);
+ if (winstate->partcontext != NULL)
+ MemoryContextReset(winstate->partcontext);
+ if (winstate->aggcontext != NULL)
+ MemoryContextReset(winstate->aggcontext);
for (i = 0; i < winstate->numaggs; i++)
{
- if (winstate->peragg[i].aggcontext != winstate->aggcontext)
+ if (winstate->peragg[i].aggcontext != NULL &&
+ winstate->peragg[i].aggcontext != winstate->aggcontext)
MemoryContextReset(winstate->peragg[i].aggcontext);
}
@@ -2681,24 +2684,40 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
void
ExecEndWindowAgg(WindowAggState *node)
{
- PlanState *outerPlan;
int i;
release_partition(node);
for (i = 0; i < node->numaggs; i++)
{
- if (node->peragg[i].aggcontext != node->aggcontext)
+ if (node->peragg[i].aggcontext != NULL &&
+ node->peragg[i].aggcontext != node->aggcontext)
MemoryContextDelete(node->peragg[i].aggcontext);
}
- MemoryContextDelete(node->partcontext);
- MemoryContextDelete(node->aggcontext);
+ if (node->partcontext != NULL)
+ {
+ MemoryContextDelete(node->partcontext);
+ node->partcontext = NULL;
+ }
+ if (node->aggcontext != NULL)
+ {
+ MemoryContextDelete(node->aggcontext);
+ node->aggcontext = NULL;
+ }
- pfree(node->perfunc);
- pfree(node->peragg);
+ if (node->perfunc != NULL)
+ {
+ pfree(node->perfunc);
+ node->perfunc = NULL;
+ }
+ if (node->peragg != NULL)
+ {
+ pfree(node->peragg);
+ node->peragg = NULL;
+ }
- outerPlan = outerPlanState(node);
- ExecEndNode(outerPlan);
+ ExecEndNode(outerPlanState(node));
+ outerPlanState(node) = NULL;
}
/* -----------------
--
2.43.0
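For readers skimming the hunks above: they all apply one pattern, which is to test a resource pointer, release it, and then reset the pointer, so that an ExecEnd* routine is safe to run on a node that was only partially initialized or has already been cleaned up. The following is a minimal standalone sketch of that pattern only. DemoNode, demo_end_node() and demo_release_count are hypothetical stand-ins invented for illustration and are not PostgreSQL definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a PlanState node; not a PostgreSQL type. */
typedef struct DemoNode
{
    int        *scan;           /* stands in for a scan descriptor */
    struct DemoNode *outer;     /* stands in for outerPlanState(node) */
} DemoNode;

/* Counts how many resources were actually released (illustration only). */
static int  demo_release_count = 0;

/*
 * Release whatever the node holds.  Safe to call on a NULL node, on a node
 * whose resources were never set up, and on a node that was already cleaned
 * up.  The node structs themselves are owned by the caller.
 */
static void
demo_end_node(DemoNode *node)
{
    if (node == NULL)
        return;

    /* Release the resource only if it was acquired, then forget it. */
    if (node->scan != NULL)
    {
        free(node->scan);
        node->scan = NULL;
        demo_release_count++;
    }

    /* Shut down the child and clear the link so a second call is a no-op. */
    demo_end_node(node->outer);
    node->outer = NULL;
}
```

Calling demo_end_node() twice on the same tree releases each resource exactly once, which is the property the NULL checks and resets in the patch appear to be after.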
[application/octet-stream] v52-0003-Handle-CachedPlan-invalidation-in-the-executor.patch (92.1K, 4-v52-0003-Handle-CachedPlan-invalidation-in-the-executor.patch)
From 533abeac5857c4ac2950c8eb1699485b46cce9c7 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 22 Aug 2024 19:38:13 +0900
Subject: [PATCH v52 3/3] Handle CachedPlan invalidation in the executor
This commit makes changes to handle cases where a cached plan
becomes invalid before deferred locks on prunable relations are taken.
* Add checks at various points in ExecutorStart() and its called
functions to determine if the plan becomes invalid. If detected,
the function and its callers return immediately. A previous commit
ensures any partially initialized PlanState tree objects are cleaned
up appropriately.
* Introduce ExecutorStartExt(), a wrapper over ExecutorStart(), to
handle cases where plan initialization is aborted due to invalidation.
ExecutorStartExt() creates a new transient CachedPlan if needed and
retries execution. This new entry point is only required for sites
using plancache.c. It requires passing the QueryDesc, eflags,
CachedPlanSource, and query_index (index in CachedPlanSource.query_list).
* Add GetSingleCachedPlan() in plancache.c to create a transient
CachedPlan for a specified query in the given CachedPlanSource.
Such CachedPlans are tracked in a separate global list for the
plancache invalidation callbacks to check.
This also adds isolation tests using the delay_execution test module
to verify scenarios where a CachedPlan becomes invalid before the
deferred locks are taken.
All ExecutorStart_hook implementations now must add the following
block after the ExecutorStart() call to ensure it doesn't work with an
invalid plan:
/* The plan may have become invalid during ExecutorStart() */
if (!ExecPlanStillValid(queryDesc->estate))
return;
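To make the control flow described above easier to follow, here is a rough standalone model of it: a start routine that may give up when the plan is invalidated, a hook that returns early in that case, and a wrapper that replaces the plan and retries. It is only a sketch under simplifying assumptions. Every name in it (DemoPlan, DemoQueryDesc, demo_standard_start(), demo_start_hook(), demo_start_with_retry() and the demo_* counters) is hypothetical and stands in for the real executor and plancache machinery, which this block does not use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these are PostgreSQL definitions. */
typedef struct DemoPlan
{
    bool        is_valid;
} DemoPlan;

typedef struct DemoQueryDesc
{
    DemoPlan   *plan;
    bool        initialized;
} DemoQueryDesc;

static int  demo_invalidations_pending = 0; /* simulated concurrent DDL */
static int  demo_hook_work_done = 0;        /* hook's post-start work */
static DemoPlan demo_fresh_plan;            /* stands in for a replanned query */

/* Models plan initialization that stops once the plan is seen to be invalid. */
static void
demo_standard_start(DemoQueryDesc *qd)
{
    if (demo_invalidations_pending > 0)
    {
        demo_invalidations_pending--;
        qd->plan->is_valid = false;
        return;                 /* initialization abandoned */
    }
    qd->initialized = true;
}

/* Models a start hook: do nothing further if the plan became invalid. */
static void
demo_start_hook(DemoQueryDesc *qd)
{
    demo_standard_start(qd);

    /* The plan may have become invalid during the start call. */
    if (!qd->plan->is_valid)
        return;

    demo_hook_work_done++;
}

/* Models the retry wrapper; returns the number of attempts made. */
static int
demo_start_with_retry(DemoQueryDesc *qd)
{
    int         attempts = 0;

    for (;;)
    {
        attempts++;
        demo_start_hook(qd);
        if (qd->plan->is_valid)
            return attempts;

        /* Undo the partial initialization, then install a fresh plan. */
        qd->initialized = false;
        demo_fresh_plan.is_valid = true;
        qd->plan = &demo_fresh_plan;
    }
}
```

In this model the hook's own work runs only on the attempt that succeeds, which is why the early return matters: without it, the hook would act on a QueryDesc whose initialization was abandoned.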
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
contrib/auto_explain/auto_explain.c | 4 +
.../pg_stat_statements/pg_stat_statements.c | 4 +
contrib/postgres_fdw/postgres_fdw.c | 36 ++-
src/backend/commands/explain.c | 8 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 10 +-
src/backend/commands/trigger.c | 14 ++
src/backend/executor/README | 32 ++-
src/backend/executor/execMain.c | 99 ++++++++-
src/backend/executor/execParallel.c | 4 +-
src/backend/executor/execPartition.c | 10 +
src/backend/executor/execProcnode.c | 7 +
src/backend/executor/execUtils.c | 42 +++-
src/backend/executor/nodeAgg.c | 2 +
src/backend/executor/nodeAppend.c | 12 +-
src/backend/executor/nodeBitmapAnd.c | 2 +
src/backend/executor/nodeBitmapHeapscan.c | 4 +
src/backend/executor/nodeBitmapIndexscan.c | 6 +-
src/backend/executor/nodeBitmapOr.c | 2 +
src/backend/executor/nodeCustom.c | 2 +
src/backend/executor/nodeForeignscan.c | 4 +
src/backend/executor/nodeGather.c | 2 +
src/backend/executor/nodeGatherMerge.c | 2 +
src/backend/executor/nodeGroup.c | 2 +
src/backend/executor/nodeHash.c | 2 +
src/backend/executor/nodeHashjoin.c | 4 +
src/backend/executor/nodeIncrementalSort.c | 2 +
src/backend/executor/nodeIndexonlyscan.c | 7 +-
src/backend/executor/nodeIndexscan.c | 8 +-
src/backend/executor/nodeLimit.c | 2 +
src/backend/executor/nodeLockRows.c | 2 +
src/backend/executor/nodeMaterial.c | 2 +
src/backend/executor/nodeMemoize.c | 2 +
src/backend/executor/nodeMergeAppend.c | 6 +-
src/backend/executor/nodeMergejoin.c | 4 +
src/backend/executor/nodeModifyTable.c | 13 ++
src/backend/executor/nodeNestloop.c | 4 +
src/backend/executor/nodeProjectSet.c | 2 +
src/backend/executor/nodeRecursiveunion.c | 4 +
src/backend/executor/nodeResult.c | 2 +
src/backend/executor/nodeSamplescan.c | 3 +
src/backend/executor/nodeSeqscan.c | 3 +
src/backend/executor/nodeSetOp.c | 2 +
src/backend/executor/nodeSort.c | 2 +
src/backend/executor/nodeSubqueryscan.c | 2 +
src/backend/executor/nodeTidrangescan.c | 2 +
src/backend/executor/nodeTidscan.c | 2 +
src/backend/executor/nodeUnique.c | 2 +
src/backend/executor/nodeWindowAgg.c | 2 +
src/backend/executor/spi.c | 19 +-
src/backend/tcop/postgres.c | 4 +-
src/backend/tcop/pquery.c | 31 ++-
src/backend/utils/cache/plancache.c | 206 ++++++++++++++++++
src/backend/utils/mmgr/portalmem.c | 4 +-
src/include/commands/explain.h | 1 +
src/include/commands/trigger.h | 1 +
src/include/executor/execdesc.h | 1 +
src/include/executor/executor.h | 18 ++
src/include/nodes/execnodes.h | 1 +
src/include/utils/plancache.h | 26 +++
src/include/utils/portal.h | 4 +-
src/test/modules/delay_execution/Makefile | 3 +-
.../modules/delay_execution/delay_execution.c | 63 +++++-
.../expected/cached-plan-inval.out | 175 +++++++++++++++
src/test/modules/delay_execution/meson.build | 1 +
.../specs/cached-plan-inval.spec | 65 ++++++
66 files changed, 962 insertions(+), 58 deletions(-)
create mode 100644 src/test/modules/delay_execution/expected/cached-plan-inval.out
create mode 100644 src/test/modules/delay_execution/specs/cached-plan-inval.spec
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 677c135f59..3675ce9a88 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -300,6 +300,10 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
if (auto_explain_enabled())
{
/*
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 362d222f63..98a328b79f 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -992,6 +992,10 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
/*
* If query has queryId zero, don't track it. This prevents double
* counting of optimizable statements that are directly contained in
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index adc62576d1..65f4ffe5ee 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -2144,7 +2144,11 @@ postgresEndForeignModify(EState *estate,
{
PgFdwModifyState *fmstate = (PgFdwModifyState *) resultRelInfo->ri_FdwState;
- /* If fmstate is NULL, we are in EXPLAIN; nothing to do */
+ /*
+ * fmstate could be NULL under two conditions: during an EXPLAIN
+ * operation or if BeginForeignModify() hasn't been invoked.
+ * In either case, no action is required.
+ */
if (fmstate == NULL)
return;
@@ -2650,8 +2654,9 @@ postgresBeginDirectModify(ForeignScanState *node, int eflags)
{
ForeignScan *fsplan = (ForeignScan *) node->ss.ps.plan;
EState *estate = node->ss.ps.state;
+ Relation rel = node->ss.ss_currentRelation;
PgFdwDirectModifyState *dmstate;
- Index rtindex;
+ Index rtindex = node->resultRelInfo->ri_RangeTableIndex;
Oid userid;
ForeignTable *table;
UserMapping *user;
@@ -2663,24 +2668,32 @@ postgresBeginDirectModify(ForeignScanState *node, int eflags)
if (eflags & EXEC_FLAG_EXPLAIN_ONLY)
return;
+ /*
+ * Open the foreign table using the RT index given in the ResultRelInfo if
+ * the ScanState doesn't provide it. If the plan becomes invalid as a
+ * result of taking a lock in ExecOpenScanRelation(), do nothing, in which
+ * case node->fdw_state remains NULL.
+ */
+ if (rel == NULL)
+ {
+ Assert(fsplan->scan.scanrelid == 0);
+ rel = ExecOpenScanRelation(estate, rtindex, eflags);
+ if (unlikely(rel == NULL || !ExecPlanStillValid(estate)))
+ return;
+ }
+
/*
* We'll save private state in node->fdw_state.
*/
dmstate = (PgFdwDirectModifyState *) palloc0(sizeof(PgFdwDirectModifyState));
node->fdw_state = (void *) dmstate;
+ dmstate->rel = rel;
/*
* Identify which user to do the remote access as. This should match what
* ExecCheckPermissions() does.
*/
userid = OidIsValid(fsplan->checkAsUser) ? fsplan->checkAsUser : GetUserId();
-
- /* Get info about foreign table. */
- rtindex = node->resultRelInfo->ri_RangeTableIndex;
- if (fsplan->scan.scanrelid == 0)
- dmstate->rel = ExecOpenScanRelation(estate, rtindex, eflags);
- else
- dmstate->rel = node->ss.ss_currentRelation;
table = GetForeignTable(RelationGetRelid(dmstate->rel));
user = GetUserMapping(userid, table->serverid);
@@ -2811,7 +2824,10 @@ postgresEndDirectModify(ForeignScanState *node)
{
PgFdwDirectModifyState *dmstate = (PgFdwDirectModifyState *) node->fdw_state;
- /* if dmstate is NULL, we are in EXPLAIN; nothing to do */
+ /*
+ * Nothing to do if dmstate is NULL, either because we are in EXPLAIN or
+ * dmstate wasn't initialized due to aborted plan initialization.
+ */
if (dmstate == NULL)
return;
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index a83ea07db1..a7643360a7 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -507,7 +507,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, NULL, -1, into, es, queryString, params,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -616,6 +617,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
*/
void
ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int query_index,
IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
@@ -686,8 +688,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
if (into)
eflags |= GetIntoRelEFlags(into);
- /* call ExecutorStart to prepare the plan for execution */
- ExecutorStart(queryDesc, eflags);
+ /* Call ExecutorStartExt to prepare the plan for execution. */
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
/* Execute the plan for statistics if asked for */
if (es->analyze)
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 4f6acf6719..4b1503c05e 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ NULL,
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 311b9ebd5b..4cd79a6e3a 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -202,7 +202,8 @@ ExecuteQuery(ParseState *pstate,
query_string,
entry->plansource->commandTag,
plan_list,
- cplan);
+ cplan,
+ entry->plansource);
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
@@ -583,6 +584,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
MemoryContextCounters mem_counters;
MemoryContext planner_ctx = NULL;
MemoryContext saved_ctx = NULL;
+ int i = 0;
if (es->memory)
{
@@ -655,8 +657,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
- queryEnv,
+ ExplainOnePlan(pstmt, cplan, entry->plansource, i,
+ into, es, query_string, paramLI, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
@@ -668,6 +670,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Separate plans with an appropriate separator */
if (lnext(plan_list, p) != NULL)
ExplainSeparatePlans(es);
+
+ i++;
}
if (estate)
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 170360edda..91e4b821a0 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -5119,6 +5119,20 @@ AfterTriggerEndQuery(EState *estate)
afterTriggers.query_depth--;
}
+/* ----------
+ * AfterTriggerAbortQuery()
+ *
+ * Called by ExecutorEnd() if the query execution was aborted due to the
+ * plan becoming invalid during initialization.
+ * ----------
+ */
+void
+AfterTriggerAbortQuery(void)
+{
+ /* Revert the actions of AfterTriggerBeginQuery(). */
+ afterTriggers.query_depth--;
+}
+
/*
* AfterTriggerFreeQuery
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be61..e583df5be0 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -280,6 +280,28 @@ are typically reset to empty once per tuple. Per-tuple contexts are usually
associated with ExprContexts, and commonly each PlanState node has its own
ExprContext to evaluate its qual and targetlist expressions in.
+Relation Locking
+----------------
+
+Typically, when the executor initializes a plan tree for execution, it doesn't
+lock non-index relations if the plan tree is freshly generated and not derived
+from a CachedPlan. This is because such locks have already been established
+during the query's parsing, rewriting, and planning phases. However, with a
+cached plan tree, some relations may remain unlocked. The function
+AcquireExecutorLocks() only locks unprunable relations in the plan, deferring
+the locking of prunable ones to executor initialization. This avoids
+unnecessary locking of relations that will be pruned during "initial" runtime
+pruning in the ExecInitNode() routine of nodes containing the pruning info.
+
+This approach creates a window where a cached plan tree with child tables
+could become outdated if another backend modifies these tables before
+ExecInitNode() locks them. As a result, the executor has the added duty to
+verify the plan tree's validity whenever it locks a child table after
+execution-initialization-pruning. This validation is done by checking the
+CachedPlan.is_valid attribute. If the plan tree is outdated (is_valid=false),
+the executor halts further initialization, cleans up the partially initialized
+PlanState tree, and retries execution after creating a new transient
+CachedPlan.
Query Processing Control Flow
-----------------------------
@@ -288,7 +310,7 @@ This is a sketch of control flow for full query processing:
CreateQueryDesc
- ExecutorStart
+ ExecutorStart or ExecutorStartExt
CreateExecutorState
creates per-query context
switch to per-query context to run ExecInitNode
@@ -316,7 +338,13 @@ This is a sketch of control flow for full query processing:
FreeQueryDesc
-Per above comments, it's not really critical for ExecEndNode to free any
+As mentioned in the "Relation Locking" section, if the plan tree is found to
+be stale during one of the recursive calls of ExecInitNode() after taking a
+lock on a child table, the control is immediately returned to
+ExecutorStartExt(), which will create a new plan tree and perform the
+steps starting from CreateExecutorState() again.
+
+Per above comments, it's not really critical for ExecEndPlan to free any
memory; it'll all go away in FreeExecutorState anyway. However, we do need to
be careful to close relations, drop buffer pins, etc, so we do need to scan
the plan state tree to find these sorts of resources.
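
The README text above describes a validate-and-retry protocol: initialization locks child tables one at a time, rechecks the plan's validity after each lock, and on failure discards the partial state and starts over with a fresh plan. The following is a minimal standalone sketch of that control flow only. Every name in it (ToyPlan, ToyCatalog, toy_lock_child, toy_init, toy_replan, toy_start) is hypothetical and stands in for the real CachedPlan, lock manager, ExecInitNode() and ExecutorStartExt() machinery; it is not the patch's code.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a cached plan; not the PostgreSQL CachedPlan. */
typedef struct ToyPlan
{
    bool is_valid;      /* cleared when a concurrent change is noticed */
    int  generation;    /* bumped each time the plan is rebuilt */
} ToyPlan;

/* Hypothetical stand-in for "other backends changed something". */
typedef struct ToyCatalog
{
    int pending_invalidations;  /* changes not yet noticed by this session */
} ToyCatalog;

/*
 * Taking a lock is the point at which a pending change becomes visible,
 * so this is where the toy plan can be marked stale.
 */
void
toy_lock_child(ToyCatalog *cat, ToyPlan *plan)
{
    if (cat->pending_invalidations > 0)
    {
        cat->pending_invalidations--;
        plan->is_valid = false;
    }
}

/*
 * Lock each child in turn and recheck validity after every lock.
 * Returns false as soon as the plan is found to be stale.
 */
bool
toy_init(ToyCatalog *cat, ToyPlan *plan, int nchildren)
{
    for (int i = 0; i < nchildren; i++)
    {
        toy_lock_child(cat, plan);
        if (!plan->is_valid)
            return false;   /* stop; the caller cleans up and retries */
    }
    return true;
}

/* Build a replacement plan; here that only bumps the generation. */
ToyPlan
toy_replan(const ToyPlan *old)
{
    ToyPlan fresh = {true, old->generation + 1};

    return fresh;
}

/*
 * Retry loop: keep initializing until a plan survives all of its locks.
 * Returns the number of attempts that were needed.
 */
int
toy_start(ToyCatalog *cat, ToyPlan *plan, int nchildren)
{
    int attempts = 0;

    for (;;)
    {
        attempts++;
        if (toy_init(cat, plan, nchildren))
            return attempts;
        /* A real executor would tear down the partial state here. */
        *plan = toy_replan(plan);
    }
}
```

In this toy model the loop ends because each retry consumes one pending change. The sketch makes no claim about how termination is argued for the real ExecutorStartExt().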
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 0f6dbd1e2b..000d02a337 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -58,6 +58,7 @@
#include "utils/backend_status.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
+#include "utils/plancache.h"
#include "utils/rls.h"
#include "utils/snapmgr.h"
@@ -133,6 +134,60 @@ ExecutorStart(QueryDesc *queryDesc, int eflags)
standard_ExecutorStart(queryDesc, eflags);
}
+/*
+ * A variant of ExecutorStart() that handles cleanup and replanning if the
+ * input CachedPlan becomes invalid due to locks being taken during
+ * ExecutorStartInternal(). If that happens, a new CachedPlan is created
+ * only for the query at index 'query_index' in plansource->query_list,
+ * which is released separately from the original CachedPlan.
+ */
+void
+ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource,
+ int query_index)
+{
+ if (queryDesc->cplan == NULL)
+ {
+ ExecutorStart(queryDesc, eflags);
+ return;
+ }
+
+ while (1)
+ {
+ ExecutorStart(queryDesc, eflags);
+ if (!CachedPlanValid(queryDesc->cplan))
+ {
+ CachedPlan *cplan;
+
+ /*
+ * The plan got invalidated, so try with a new updated plan.
+ *
+ * But first undo what ExecutorStart() would've done. Mark
+ * execution as aborted to ensure that AFTER trigger state is
+ * properly reset.
+ */
+ queryDesc->estate->es_aborted = true;
+ ExecutorEnd(queryDesc);
+
+ cplan = GetSingleCachedPlan(plansource, query_index,
+ queryDesc->queryEnv);
+
+ /*
+ * Install the new transient cplan into the QueryDesc replacing
+ * the old one so that executor initialization code can see it.
+ * Mark it as in use by us and ask FreeQueryDesc() to release it.
+ */
+ cplan->refcount = 1;
+ queryDesc->cplan = cplan;
+ queryDesc->cplan_release = true;
+ queryDesc->plannedstmt = linitial_node(PlannedStmt,
+ queryDesc->cplan->stmt_list);
+ }
+ else
+ break; /* ExecutorStart() succeeded! */
+ }
+}
+
void
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
@@ -316,6 +371,7 @@ standard_ExecutorRun(QueryDesc *queryDesc,
estate = queryDesc->estate;
Assert(estate != NULL);
+ Assert(!estate->es_aborted);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/* caller must ensure the query's snapshot is active */
@@ -422,8 +478,11 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
Assert(estate != NULL);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
- /* This should be run once and only once per Executor instance */
- Assert(!estate->es_finished);
+ /*
+ * This should be run once and only once per Executor instance and never
+ * if the execution was aborted.
+ */
+ Assert(!estate->es_finished && !estate->es_aborted);
/* Switch into per-query memory context */
oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -482,11 +541,10 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
Assert(estate != NULL);
/*
- * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This
- * Assert is needed because ExecutorFinish is new as of 9.1, and callers
- * might forget to call it.
+ * Check that ExecutorFinish was called, unless in EXPLAIN-only mode or if
+ * execution was aborted.
*/
- Assert(estate->es_finished ||
+ Assert(estate->es_finished || estate->es_aborted ||
(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/*
@@ -500,6 +558,14 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
UnregisterSnapshot(estate->es_snapshot);
UnregisterSnapshot(estate->es_crosscheck_snapshot);
+ /*
+ * Reset AFTER trigger module if the query execution was aborted.
+ */
+ if (estate->es_aborted &&
+ !(estate->es_top_eflags &
+ (EXEC_FLAG_SKIP_TRIGGERS | EXEC_FLAG_EXPLAIN_ONLY)))
+ AfterTriggerAbortQuery();
+
/*
* Must switch out of context before destroying it
*/
@@ -832,7 +898,6 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
PreventCommandIfParallelMode(CreateCommandName((Node *) plannedstmt));
}
-
/* ----------------------------------------------------------------
* InitPlan
*
@@ -897,6 +962,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
case ROW_MARK_KEYSHARE:
case ROW_MARK_REFERENCE:
relation = ExecGetRangeTableRelation(estate, rc->rti);
+ if (unlikely(relation == NULL ||
+ !ExecPlanStillValid(estate)))
+ return;
break;
case ROW_MARK_COPY:
/* no physical table access is required */
@@ -967,6 +1035,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_subplanstates = lappend(estate->es_subplanstates,
subplanstate);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return;
i++;
}
@@ -977,6 +1047,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
* processing tuples.
*/
planstate = ExecInitNode(plan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return;
/*
* Get the tuple descriptor describing the type of tuples to return.
@@ -2858,6 +2930,7 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
rcestate->es_rowmarks = parentestate->es_rowmarks;
rcestate->es_rteperminfos = parentestate->es_rteperminfos;
rcestate->es_plannedstmt = parentestate->es_plannedstmt;
+ rcestate->es_cachedplan = parentestate->es_cachedplan;
rcestate->es_junkFilter = parentestate->es_junkFilter;
rcestate->es_output_cid = parentestate->es_output_cid;
rcestate->es_queryEnv = parentestate->es_queryEnv;
@@ -2936,6 +3009,14 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
subplanstate = ExecInitNode(subplan, rcestate, 0);
rcestate->es_subplanstates = lappend(rcestate->es_subplanstates,
subplanstate);
+
+ /*
+ * All necessary locks should have been taken when initializing the
+ * parent's copy of subplanstate, so the CachedPlan, if any, should
+ * not have become invalid during the above ExecInitNode().
+ */
+ if (!ExecPlanStillValid(rcestate))
+ elog(ERROR, "unexpected failure to initialize subplan in EvalPlanQualStart()");
}
/*
@@ -2977,6 +3058,10 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
*/
epqstate->recheckplanstate = ExecInitNode(planTree, rcestate, 0);
+ /* See the comment above. */
+ if (!ExecPlanStillValid(rcestate))
+ elog(ERROR, "unexpected failure to initialize main plantree in EvalPlanQualStart()");
+
MemoryContextSwitchTo(oldcontext);
}
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index 03b48e12b4..2017433c64 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1263,9 +1263,7 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
* if it should take locks on certain relations, but parallel workers
* always take locks anyway.
*/
- return CreateQueryDesc(pstmt,
- NULL,
- queryString,
+ return CreateQueryDesc(pstmt, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
}
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7651886229..38cd97b59c 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1794,6 +1794,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* If subplans are indeed pruned, subplan_map arrays contained in the returned
* PartitionPruneState are re-sequenced to not count those, though only if the
* maps will be needed for subsequent execution pruning passes.
+ *
+ * Returns NULL if the plan has become invalid after taking the locks to
+ * create the PartitionPruneState in CreatePartitionPruneState().
*/
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
@@ -1809,6 +1812,8 @@ ExecInitPartitionPruning(PlanState *planstate,
/* Create the working data structure for pruning */
prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ if (!ExecPlanStillValid(estate))
+ return NULL;
/*
* Perform an initial partition prune pass, if required.
@@ -1860,6 +1865,9 @@ ExecInitPartitionPruning(PlanState *planstate,
* stored in each PartitionedRelPruningData can be re-used each time we
* re-evaluate which partitions match the pruning steps provided in each
* PartitionedRelPruneInfo.
+ *
+ * Returns NULL if the plan has become invalid after taking a lock to create
+ * a PartitionedRelPruningData.
*/
static PartitionPruneState *
CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
@@ -1935,6 +1943,8 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* duration of this executor run.
*/
partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
+ if (unlikely(partrel == NULL || !ExecPlanStillValid(estate)))
+ return NULL;
partkey = RelationGetPartitionKey(partrel);
partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
partrel);
diff --git a/src/backend/executor/execProcnode.c b/src/backend/executor/execProcnode.c
index 34f28dfece..7689d34dd0 100644
--- a/src/backend/executor/execProcnode.c
+++ b/src/backend/executor/execProcnode.c
@@ -136,6 +136,10 @@ static bool ExecShutdownNode_walker(PlanState *node, void *context);
* 'eflags' is a bitwise OR of flag bits described in executor.h
*
* Returns a PlanState node corresponding to the given Plan node.
+ *
+ * Callers should check upon return that ExecPlanStillValid(estate)
+ * returns true before continuing with their processing, because the
+ * returned PlanState might be only partially valid otherwise.
* ------------------------------------------------------------------------
*/
PlanState *
@@ -388,6 +392,9 @@ ExecInitNode(Plan *node, EState *estate, int eflags)
break;
}
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return result;
+
ExecSetExecProcNode(result, result->ExecProcNode);
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 6dfd5a26b7..39b388e6b4 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -146,6 +146,7 @@ CreateExecutorState(void)
estate->es_top_eflags = 0;
estate->es_instrument = 0;
estate->es_finished = false;
+ estate->es_aborted = false;
estate->es_exprcontexts = NIL;
@@ -691,6 +692,8 @@ ExecRelationIsTargetRelation(EState *estate, Index scanrelid)
*
* Open the heap relation to be scanned by a base-level scan plan node.
* This should be called during the node's ExecInit routine.
+ *
+ * NULL is returned if the relation is found to have been dropped.
* ----------------------------------------------------------------
*/
Relation
@@ -700,6 +703,8 @@ ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags)
/* Open the relation. */
rel = ExecGetRangeTableRelation(estate, scanrelid);
+ if (unlikely(rel == NULL || !ExecPlanStillValid(estate)))
+ return rel;
/*
* Complain if we're attempting a scan of an unscannable relation, except
@@ -717,6 +722,26 @@ ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags)
return rel;
}
+/* ----------------------------------------------------------------
+ * ExecOpenScanIndexRelation
+ *
+ * Open the index relation to be scanned by an index scan plan node.
+ * This should be called during the node's ExecInit routine.
+ * ----------------------------------------------------------------
+ */
+Relation
+ExecOpenScanIndexRelation(EState *estate, Oid indexid, int lockmode)
+{
+ Relation rel;
+
+ /* Open the index. */
+ rel = index_open(indexid, lockmode);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ elog(DEBUG2, "CachedPlan invalidated on locking index %u", indexid);
+
+ return rel;
+}
+
/*
* ExecInitRangeTable
* Set up executor's range-table-related data
@@ -776,8 +801,12 @@ ExecShouldLockRelation(EState *estate, Index rtindex)
* ExecGetRangeTableRelation
* Open the Relation for a range table entry, if not already done
*
- * The Relations will be closed again in ExecEndPlan().
+ * The Relations will be closed in ExecEndPlan().
+ *
+ * The returned value may be NULL if the relation is a prunable relation
+ * that has not been locked and may have been concurrently dropped.
*/
+
Relation
ExecGetRangeTableRelation(EState *estate, Index rti)
{
@@ -820,8 +849,14 @@ ExecGetRangeTableRelation(EState *estate, Index rti)
* that of a prunable relation and we're running a cached generic
* plan. AcquireExecutorLocks() of plancache.c would have locked
* only the unprunable relations in the plan tree.
+ *
+ * Note that we use try_table_open() here, because without a lock
+ * held on the relation, it may have disappeared from under us.
*/
- rel = table_open(rte->relid, rte->rellockmode);
+ rel = try_table_open(rte->relid, rte->rellockmode);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ elog(DEBUG2, "CachedPlan invalidated on locking relation %u",
+ rte->relid);
}
estate->es_relations[rti - 1] = rel;
@@ -845,6 +880,9 @@ ExecInitResultRelation(EState *estate, ResultRelInfo *resultRelInfo,
Relation resultRelationDesc;
resultRelationDesc = ExecGetRangeTableRelation(estate, rti);
+ if (unlikely(resultRelationDesc == NULL ||
+ !ExecPlanStillValid(estate)))
+ return;
InitResultRelInfo(resultRelInfo,
resultRelationDesc,
rti,
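
The execUtils.c hunks above switch to a "try" style open that can return NULL, and each caller bails out when the relation is missing or the plan is no longer valid. A small standalone sketch of that calling convention follows. ToyRel, ToyState, toy_try_open and toy_init_scan are hypothetical names used only for illustration; they do not correspond to try_table_open() or ExecOpenScanRelation() beyond the general shape of the check.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical catalog entry; 'dropped' models a concurrent DROP. */
typedef struct ToyRel
{
    unsigned oid;
    bool     dropped;
} ToyRel;

/* Hypothetical executor state carrying only the validity flag. */
typedef struct ToyState
{
    bool plan_valid;
} ToyState;

/*
 * "Try" open: report a missing or dropped relation by returning NULL
 * instead of raising an error, leaving the decision to the caller.
 */
ToyRel *
toy_try_open(ToyRel *catalog, size_t n, unsigned oid)
{
    for (size_t i = 0; i < n; i++)
    {
        if (catalog[i].oid == oid && !catalog[i].dropped)
            return &catalog[i];
    }
    return NULL;
}

/*
 * Caller-side check: continue with scan setup only if the relation was
 * opened and the plan is still considered valid.
 */
bool
toy_init_scan(ToyState *st, ToyRel *catalog, size_t n, unsigned oid)
{
    ToyRel *rel = toy_try_open(catalog, n, oid);

    if (rel == NULL || !st->plan_valid)
        return false;   /* give up early; nothing else is touched */

    /* ... the rest of the scan setup would use 'rel' here ... */
    return true;
}
```

The point of the combined test is that either condition alone is enough to stop: a NULL relation must never be dereferenced, and a non-NULL relation is still not safe to build on once the plan has been invalidated.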
diff --git a/src/backend/executor/nodeAgg.c b/src/backend/executor/nodeAgg.c
index 0dfba5ca16..8c40d8c520 100644
--- a/src/backend/executor/nodeAgg.c
+++ b/src/backend/executor/nodeAgg.c
@@ -3303,6 +3303,8 @@ ExecInitAgg(Agg *node, EState *estate, int eflags)
eflags &= ~EXEC_FLAG_REWIND;
outerPlan = outerPlan(node);
outerPlanState(aggstate) = ExecInitNode(outerPlan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return aggstate;
/*
* initialize source tuple type.
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index 86d75b1a7e..3c82a1ceab 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -147,6 +147,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
list_length(node->appendplans),
node->part_prune_info,
&validsubplans);
+ if (!ExecPlanStillValid(estate))
+ return appendstate;
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -185,8 +187,10 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->ps.resultopsset = true;
appendstate->ps.resultopsfixed = false;
- appendplanstates = (PlanState **) palloc(nplans *
- sizeof(PlanState *));
+ appendplanstates = (PlanState **) palloc0(nplans *
+ sizeof(PlanState *));
+ appendstate->appendplans = appendplanstates;
+ appendstate->as_nplans = nplans;
/*
* call ExecInitNode on each of the valid plans to be executed and save
@@ -221,11 +225,11 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
firstvalid = j;
appendplanstates[j++] = ExecInitNode(initNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return appendstate;
}
appendstate->as_first_partial_plan = firstvalid;
- appendstate->appendplans = appendplanstates;
- appendstate->as_nplans = nplans;
/* Initialize async state */
appendstate->as_asyncplans = asyncplans;
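
The nodeAppend.c hunks above zero the child array (palloc0 instead of palloc), store it in the node before the initialization loop, and return early when the plan becomes invalid. One plausible reading, offered here as an assumption rather than something the patch states, is that cleanup code can then walk the array after an early return and skip the slots that were never filled. A standalone sketch of that pattern, using hypothetical names (ToyChild, ToyAppend, toy_append_init, toy_append_end) and plain calloc/malloc in place of PostgreSQL's memory contexts:

```c
#include <stdlib.h>

/* Hypothetical child state. */
typedef struct ToyChild
{
    int id;
} ToyChild;

/* Hypothetical parent node owning an array of child pointers. */
typedef struct ToyAppend
{
    ToyChild **children;
    int        nchildren;
} ToyAppend;

/*
 * Initialize up to 'nchildren' children.  'fail_at' simulates the plan
 * becoming invalid just before that child is initialized (-1 = never).
 * The array is zeroed and attached to the node BEFORE the loop, so an
 * early return leaves the untouched slots as NULL.
 */
ToyAppend *
toy_append_init(int nchildren, int fail_at)
{
    ToyAppend *node = calloc(1, sizeof(ToyAppend));

    if (node == NULL)
        return NULL;

    node->children = calloc((size_t) (nchildren > 0 ? nchildren : 1),
                            sizeof(ToyChild *));
    if (node->children == NULL)
    {
        free(node);
        return NULL;
    }
    node->nchildren = nchildren;

    for (int i = 0; i < nchildren; i++)
    {
        if (i == fail_at)
            return node;    /* early return; later slots stay NULL */

        node->children[i] = malloc(sizeof(ToyChild));
        if (node->children[i] == NULL)
            return node;
        node->children[i]->id = i;
    }
    return node;
}

/*
 * Cleanup that tolerates a partially initialized node by skipping NULL
 * slots.  Returns how many children were actually released.
 */
int
toy_append_end(ToyAppend *node)
{
    int freed = 0;

    for (int i = 0; i < node->nchildren; i++)
    {
        if (node->children[i] != NULL)
        {
            free(node->children[i]);
            freed++;
        }
    }
    free(node->children);
    free(node);
    return freed;
}
```

Had the array been left uninitialized, or attached to the node only after the loop, the cleanup routine could not tell filled slots from garbage after an early return.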
diff --git a/src/backend/executor/nodeBitmapAnd.c b/src/backend/executor/nodeBitmapAnd.c
index ae391222bf..168c440692 100644
--- a/src/backend/executor/nodeBitmapAnd.c
+++ b/src/backend/executor/nodeBitmapAnd.c
@@ -89,6 +89,8 @@ ExecInitBitmapAnd(BitmapAnd *node, EState *estate, int eflags)
{
initNode = (Plan *) lfirst(l);
bitmapplanstates[i] = ExecInitNode(initNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return bitmapandstate;
i++;
}
diff --git a/src/backend/executor/nodeBitmapHeapscan.c b/src/backend/executor/nodeBitmapHeapscan.c
index 19f18ab817..b13cae1cbb 100644
--- a/src/backend/executor/nodeBitmapHeapscan.c
+++ b/src/backend/executor/nodeBitmapHeapscan.c
@@ -754,11 +754,15 @@ ExecInitBitmapHeapScan(BitmapHeapScan *node, EState *estate, int eflags)
* open the scan relation
*/
currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+ if (unlikely(currentRelation == NULL || !ExecPlanStillValid(estate)))
+ return scanstate;
/*
* initialize child nodes
*/
outerPlanState(scanstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return scanstate;
/*
* get the scan type from the relation descriptor.
diff --git a/src/backend/executor/nodeBitmapIndexscan.c b/src/backend/executor/nodeBitmapIndexscan.c
index 4669e8d0ce..f04a53e9be 100644
--- a/src/backend/executor/nodeBitmapIndexscan.c
+++ b/src/backend/executor/nodeBitmapIndexscan.c
@@ -252,7 +252,11 @@ ExecInitBitmapIndexScan(BitmapIndexScan *node, EState *estate, int eflags)
/* Open the index relation. */
lockmode = exec_rt_fetch(node->scan.scanrelid, estate)->rellockmode;
- indexstate->biss_RelationDesc = index_open(node->indexid, lockmode);
+ indexstate->biss_RelationDesc = ExecOpenScanIndexRelation(estate,
+ node->indexid,
+ lockmode);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return indexstate;
/*
* Initialize index-specific scan state
diff --git a/src/backend/executor/nodeBitmapOr.c b/src/backend/executor/nodeBitmapOr.c
index de439235d2..980b68dd82 100644
--- a/src/backend/executor/nodeBitmapOr.c
+++ b/src/backend/executor/nodeBitmapOr.c
@@ -90,6 +90,8 @@ ExecInitBitmapOr(BitmapOr *node, EState *estate, int eflags)
{
initNode = (Plan *) lfirst(l);
bitmapplanstates[i] = ExecInitNode(initNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return bitmaporstate;
i++;
}
diff --git a/src/backend/executor/nodeCustom.c b/src/backend/executor/nodeCustom.c
index e559cd2346..2a7c5dccd8 100644
--- a/src/backend/executor/nodeCustom.c
+++ b/src/backend/executor/nodeCustom.c
@@ -58,6 +58,8 @@ ExecInitCustomScan(CustomScan *cscan, EState *estate, int eflags)
if (scanrelid > 0)
{
scan_rel = ExecOpenScanRelation(estate, scanrelid, eflags);
+ if (unlikely(scan_rel == NULL || !ExecPlanStillValid(estate)))
+ return css;
css->ss.ss_currentRelation = scan_rel;
}
diff --git a/src/backend/executor/nodeForeignscan.c b/src/backend/executor/nodeForeignscan.c
index 1357ccf3c9..90d5878ae3 100644
--- a/src/backend/executor/nodeForeignscan.c
+++ b/src/backend/executor/nodeForeignscan.c
@@ -172,6 +172,8 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
if (scanrelid > 0)
{
currentRelation = ExecOpenScanRelation(estate, scanrelid, eflags);
+ if (unlikely(currentRelation == NULL || !ExecPlanStillValid(estate)))
+ return scanstate;
scanstate->ss.ss_currentRelation = currentRelation;
fdwroutine = GetFdwRoutineForRelation(currentRelation, true);
}
@@ -263,6 +265,8 @@ ExecInitForeignScan(ForeignScan *node, EState *estate, int eflags)
if (outerPlan(node))
outerPlanState(scanstate) =
ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return scanstate;
/*
* Tell the FDW to initialize the scan.
diff --git a/src/backend/executor/nodeGather.c b/src/backend/executor/nodeGather.c
index cae5ea1f92..67548aa7ba 100644
--- a/src/backend/executor/nodeGather.c
+++ b/src/backend/executor/nodeGather.c
@@ -84,6 +84,8 @@ ExecInitGather(Gather *node, EState *estate, int eflags)
*/
outerNode = outerPlan(node);
outerPlanState(gatherstate) = ExecInitNode(outerNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return gatherstate;
tupDesc = ExecGetResultType(outerPlanState(gatherstate));
/*
diff --git a/src/backend/executor/nodeGatherMerge.c b/src/backend/executor/nodeGatherMerge.c
index b36cd89e7d..cf0e074359 100644
--- a/src/backend/executor/nodeGatherMerge.c
+++ b/src/backend/executor/nodeGatherMerge.c
@@ -103,6 +103,8 @@ ExecInitGatherMerge(GatherMerge *node, EState *estate, int eflags)
*/
outerNode = outerPlan(node);
outerPlanState(gm_state) = ExecInitNode(outerNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return gm_state;
/*
* Leader may access ExecProcNode result directly (if
diff --git a/src/backend/executor/nodeGroup.c b/src/backend/executor/nodeGroup.c
index 807429e504..6d0fd9e7b4 100644
--- a/src/backend/executor/nodeGroup.c
+++ b/src/backend/executor/nodeGroup.c
@@ -184,6 +184,8 @@ ExecInitGroup(Group *node, EState *estate, int eflags)
* initialize child nodes
*/
outerPlanState(grpstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return grpstate;
/*
* Initialize scan slot and type.
diff --git a/src/backend/executor/nodeHash.c b/src/backend/executor/nodeHash.c
index a913d5b50c..e71d131d18 100644
--- a/src/backend/executor/nodeHash.c
+++ b/src/backend/executor/nodeHash.c
@@ -396,6 +396,8 @@ ExecInitHash(Hash *node, EState *estate, int eflags)
* initialize child nodes
*/
outerPlanState(hashstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return hashstate;
/*
* initialize our result slot and type. No need to build projection
diff --git a/src/backend/executor/nodeHashjoin.c b/src/backend/executor/nodeHashjoin.c
index 901c9e9be7..3c870de1c5 100644
--- a/src/backend/executor/nodeHashjoin.c
+++ b/src/backend/executor/nodeHashjoin.c
@@ -758,8 +758,12 @@ ExecInitHashJoin(HashJoin *node, EState *estate, int eflags)
hashNode = (Hash *) innerPlan(node);
outerPlanState(hjstate) = ExecInitNode(outerNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return hjstate;
outerDesc = ExecGetResultType(outerPlanState(hjstate));
innerPlanState(hjstate) = ExecInitNode((Plan *) hashNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return hjstate;
innerDesc = ExecGetResultType(innerPlanState(hjstate));
/*
diff --git a/src/backend/executor/nodeIncrementalSort.c b/src/backend/executor/nodeIncrementalSort.c
index 010bcfafa8..af723ea755 100644
--- a/src/backend/executor/nodeIncrementalSort.c
+++ b/src/backend/executor/nodeIncrementalSort.c
@@ -1040,6 +1040,8 @@ ExecInitIncrementalSort(IncrementalSort *node, EState *estate, int eflags)
* nodes may be able to do something more useful.
*/
outerPlanState(incrsortstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return incrsortstate;
/*
* Initialize scan slot and type.
diff --git a/src/backend/executor/nodeIndexonlyscan.c b/src/backend/executor/nodeIndexonlyscan.c
index 481d479760..0fba8f7d5a 100644
--- a/src/backend/executor/nodeIndexonlyscan.c
+++ b/src/backend/executor/nodeIndexonlyscan.c
@@ -531,6 +531,8 @@ ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags)
* open the scan relation
*/
currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+ if (unlikely(currentRelation == NULL || !ExecPlanStillValid(estate)))
+ return indexstate;
indexstate->ss.ss_currentRelation = currentRelation;
indexstate->ss.ss_currentScanDesc = NULL; /* no heap scan here */
@@ -583,9 +585,12 @@ ExecInitIndexOnlyScan(IndexOnlyScan *node, EState *estate, int eflags)
/* Open the index relation. */
lockmode = exec_rt_fetch(node->scan.scanrelid, estate)->rellockmode;
- indexRelation = index_open(node->indexid, lockmode);
+ indexRelation = ExecOpenScanIndexRelation(estate, node->indexid, lockmode);
indexstate->ioss_RelationDesc = indexRelation;
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return indexstate;
+
/*
* Initialize index-specific scan state
*/
diff --git a/src/backend/executor/nodeIndexscan.c b/src/backend/executor/nodeIndexscan.c
index a8172d8b82..db28aeb3d6 100644
--- a/src/backend/executor/nodeIndexscan.c
+++ b/src/backend/executor/nodeIndexscan.c
@@ -907,6 +907,8 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
* open the scan relation
*/
currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+ if (unlikely(currentRelation == NULL || !ExecPlanStillValid(estate)))
+ return indexstate;
indexstate->ss.ss_currentRelation = currentRelation;
indexstate->ss.ss_currentScanDesc = NULL; /* no heap scan here */
@@ -951,7 +953,11 @@ ExecInitIndexScan(IndexScan *node, EState *estate, int eflags)
/* Open the index relation. */
lockmode = exec_rt_fetch(node->scan.scanrelid, estate)->rellockmode;
- indexstate->iss_RelationDesc = index_open(node->indexid, lockmode);
+ indexstate->iss_RelationDesc = ExecOpenScanIndexRelation(estate,
+ node->indexid,
+ lockmode);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return indexstate;
/*
* Initialize index-specific scan state
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index eb7b6e52be..369c904577 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -475,6 +475,8 @@ ExecInitLimit(Limit *node, EState *estate, int eflags)
*/
outerPlan = outerPlan(node);
outerPlanState(limitstate) = ExecInitNode(outerPlan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return limitstate;
/*
* initialize child expressions
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 0d3489195b..9077858413 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -322,6 +322,8 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
* then initialize outer plan
*/
outerPlanState(lrstate) = ExecInitNode(outerPlan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return lrstate;
/* node returns unmodified slots from the outer plan */
lrstate->ps.resultopsset = true;
diff --git a/src/backend/executor/nodeMaterial.c b/src/backend/executor/nodeMaterial.c
index 883e3f3933..972962d44d 100644
--- a/src/backend/executor/nodeMaterial.c
+++ b/src/backend/executor/nodeMaterial.c
@@ -214,6 +214,8 @@ ExecInitMaterial(Material *node, EState *estate, int eflags)
outerPlan = outerPlan(node);
outerPlanState(matstate) = ExecInitNode(outerPlan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return matstate;
/*
* Initialize result type and slot. No need to initialize projection info
diff --git a/src/backend/executor/nodeMemoize.c b/src/backend/executor/nodeMemoize.c
index 690dee1daa..6aaab743b5 100644
--- a/src/backend/executor/nodeMemoize.c
+++ b/src/backend/executor/nodeMemoize.c
@@ -973,6 +973,8 @@ ExecInitMemoize(Memoize *node, EState *estate, int eflags)
outerNode = outerPlan(node);
outerPlanState(mstate) = ExecInitNode(outerNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mstate;
/*
* Initialize return slot and type. No need to initialize projection info
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 3236444cf1..a82f0a71a0 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -95,6 +95,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
list_length(node->mergeplans),
node->part_prune_info,
&validsubplans);
+ if (!ExecPlanStillValid(estate))
+ return mergestate;
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
@@ -120,7 +122,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ms_prune_state = NULL;
}
- mergeplanstates = (PlanState **) palloc(nplans * sizeof(PlanState *));
+ mergeplanstates = (PlanState **) palloc0(nplans * sizeof(PlanState *));
mergestate->mergeplans = mergeplanstates;
mergestate->ms_nplans = nplans;
@@ -151,6 +153,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
Plan *initNode = (Plan *) list_nth(node->mergeplans, i);
mergeplanstates[j++] = ExecInitNode(initNode, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mergestate;
}
mergestate->ps.ps_ProjInfo = NULL;
diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index 926e631d88..53cb1ff207 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -1490,11 +1490,15 @@ ExecInitMergeJoin(MergeJoin *node, EState *estate, int eflags)
mergestate->mj_SkipMarkRestore = node->skip_mark_restore;
outerPlanState(mergestate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mergestate;
outerDesc = ExecGetResultType(outerPlanState(mergestate));
innerPlanState(mergestate) = ExecInitNode(innerPlan(node), estate,
mergestate->mj_SkipMarkRestore ?
eflags :
(eflags | EXEC_FLAG_MARK));
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mergestate;
innerDesc = ExecGetResultType(innerPlanState(mergestate));
/*
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 9e56f9c36c..8debfbd3ec 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4277,6 +4277,13 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
linitial_int(node->resultRelations));
}
+ /*
+ * ExecInitResultRelation() may have returned without initializing
+ * rootResultRelInfo if the plan got invalidated, so check.
+ */
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mtstate;
+
/* set up epqstate with dummy subplan data for the moment */
EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL,
node->epqParam, node->resultRelations);
@@ -4309,6 +4316,10 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
{
ExecInitResultRelation(estate, resultRelInfo, resultRelation);
+ /* See the comment above. */
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mtstate;
+
/*
* For child result relations, store the root result relation
* pointer. We do so for the convenience of places that want to
@@ -4335,6 +4346,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* Now we may initialize the subplan.
*/
outerPlanState(mtstate) = ExecInitNode(subplan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return mtstate;
/*
* Do additional per-result-relation initialization.
diff --git a/src/backend/executor/nodeNestloop.c b/src/backend/executor/nodeNestloop.c
index 01f3d56a3b..34eafbb6e0 100644
--- a/src/backend/executor/nodeNestloop.c
+++ b/src/backend/executor/nodeNestloop.c
@@ -294,11 +294,15 @@ ExecInitNestLoop(NestLoop *node, EState *estate, int eflags)
* values.
*/
outerPlanState(nlstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return nlstate;
if (node->nestParams == NIL)
eflags |= EXEC_FLAG_REWIND;
else
eflags &= ~EXEC_FLAG_REWIND;
innerPlanState(nlstate) = ExecInitNode(innerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return nlstate;
/*
* Initialize result slot, type and projection.
diff --git a/src/backend/executor/nodeProjectSet.c b/src/backend/executor/nodeProjectSet.c
index ca9a5e2ed2..f834499479 100644
--- a/src/backend/executor/nodeProjectSet.c
+++ b/src/backend/executor/nodeProjectSet.c
@@ -254,6 +254,8 @@ ExecInitProjectSet(ProjectSet *node, EState *estate, int eflags)
* initialize child nodes
*/
outerPlanState(state) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return state;
/*
* we don't use inner plan
diff --git a/src/backend/executor/nodeRecursiveunion.c b/src/backend/executor/nodeRecursiveunion.c
index 7680142c7b..5dd3285c41 100644
--- a/src/backend/executor/nodeRecursiveunion.c
+++ b/src/backend/executor/nodeRecursiveunion.c
@@ -244,7 +244,11 @@ ExecInitRecursiveUnion(RecursiveUnion *node, EState *estate, int eflags)
* initialize child nodes
*/
outerPlanState(rustate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return rustate;
innerPlanState(rustate) = ExecInitNode(innerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return rustate;
/*
* If hashing, precompute fmgr lookup data for inner loop, and create the
diff --git a/src/backend/executor/nodeResult.c b/src/backend/executor/nodeResult.c
index e3cfc9b772..7d7c2aa786 100644
--- a/src/backend/executor/nodeResult.c
+++ b/src/backend/executor/nodeResult.c
@@ -207,6 +207,8 @@ ExecInitResult(Result *node, EState *estate, int eflags)
* initialize child nodes
*/
outerPlanState(resstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return resstate;
/*
* we don't use inner plan
diff --git a/src/backend/executor/nodeSamplescan.c b/src/backend/executor/nodeSamplescan.c
index 6ab91001bc..3afdaeecd7 100644
--- a/src/backend/executor/nodeSamplescan.c
+++ b/src/backend/executor/nodeSamplescan.c
@@ -121,6 +121,9 @@ ExecInitSampleScan(SampleScan *node, EState *estate, int eflags)
ExecOpenScanRelation(estate,
node->scan.scanrelid,
eflags);
+ if (unlikely(scanstate->ss.ss_currentRelation == NULL ||
+ !ExecPlanStillValid(estate)))
+ return scanstate;
/* we won't set up the HeapScanDesc till later */
scanstate->ss.ss_currentScanDesc = NULL;
diff --git a/src/backend/executor/nodeSeqscan.c b/src/backend/executor/nodeSeqscan.c
index b052775e5b..f7fb64a4a2 100644
--- a/src/backend/executor/nodeSeqscan.c
+++ b/src/backend/executor/nodeSeqscan.c
@@ -153,6 +153,9 @@ ExecInitSeqScan(SeqScan *node, EState *estate, int eflags)
ExecOpenScanRelation(estate,
node->scan.scanrelid,
eflags);
+ if (unlikely(scanstate->ss.ss_currentRelation == NULL ||
+ !ExecPlanStillValid(estate)))
+ return scanstate;
/* and create slot with the appropriate rowtype */
ExecInitScanTupleSlot(estate, &scanstate->ss,
diff --git a/src/backend/executor/nodeSetOp.c b/src/backend/executor/nodeSetOp.c
index fe34b2134f..2231d8b82f 100644
--- a/src/backend/executor/nodeSetOp.c
+++ b/src/backend/executor/nodeSetOp.c
@@ -528,6 +528,8 @@ ExecInitSetOp(SetOp *node, EState *estate, int eflags)
if (node->strategy == SETOP_HASHED)
eflags &= ~EXEC_FLAG_REWIND;
outerPlanState(setopstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return setopstate;
outerDesc = ExecGetResultType(outerPlanState(setopstate));
/*
diff --git a/src/backend/executor/nodeSort.c b/src/backend/executor/nodeSort.c
index af852464d0..fb76e4c01b 100644
--- a/src/backend/executor/nodeSort.c
+++ b/src/backend/executor/nodeSort.c
@@ -263,6 +263,8 @@ ExecInitSort(Sort *node, EState *estate, int eflags)
eflags &= ~(EXEC_FLAG_REWIND | EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK);
outerPlanState(sortstate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return sortstate;
/*
* Initialize scan slot and type.
diff --git a/src/backend/executor/nodeSubqueryscan.c b/src/backend/executor/nodeSubqueryscan.c
index 0b2612183a..b5b538fa91 100644
--- a/src/backend/executor/nodeSubqueryscan.c
+++ b/src/backend/executor/nodeSubqueryscan.c
@@ -124,6 +124,8 @@ ExecInitSubqueryScan(SubqueryScan *node, EState *estate, int eflags)
* initialize subquery
*/
subquerystate->subplan = ExecInitNode(node->subplan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return subquerystate;
/*
* Initialize scan slot and type (needed by ExecAssignScanProjectionInfo)
diff --git a/src/backend/executor/nodeTidrangescan.c b/src/backend/executor/nodeTidrangescan.c
index 702ee884d2..a76836d021 100644
--- a/src/backend/executor/nodeTidrangescan.c
+++ b/src/backend/executor/nodeTidrangescan.c
@@ -377,6 +377,8 @@ ExecInitTidRangeScan(TidRangeScan *node, EState *estate, int eflags)
* open the scan relation
*/
currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+ if (unlikely(currentRelation == NULL || !ExecPlanStillValid(estate)))
+ return tidrangestate;
tidrangestate->ss.ss_currentRelation = currentRelation;
tidrangestate->ss.ss_currentScanDesc = NULL; /* no table scan here */
diff --git a/src/backend/executor/nodeTidscan.c b/src/backend/executor/nodeTidscan.c
index f375951699..088babf572 100644
--- a/src/backend/executor/nodeTidscan.c
+++ b/src/backend/executor/nodeTidscan.c
@@ -522,6 +522,8 @@ ExecInitTidScan(TidScan *node, EState *estate, int eflags)
* open the scan relation
*/
currentRelation = ExecOpenScanRelation(estate, node->scan.scanrelid, eflags);
+ if (unlikely(currentRelation == NULL || !ExecPlanStillValid(estate)))
+ return tidstate;
tidstate->ss.ss_currentRelation = currentRelation;
tidstate->ss.ss_currentScanDesc = NULL; /* no heap scan here */
diff --git a/src/backend/executor/nodeUnique.c b/src/backend/executor/nodeUnique.c
index b82d0e9ad5..cb46b2d5d0 100644
--- a/src/backend/executor/nodeUnique.c
+++ b/src/backend/executor/nodeUnique.c
@@ -135,6 +135,8 @@ ExecInitUnique(Unique *node, EState *estate, int eflags)
* then initialize outer plan
*/
outerPlanState(uniquestate) = ExecInitNode(outerPlan(node), estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return uniquestate;
/*
* Initialize result slot and type. Unique nodes do no projections, so
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 561d7e731d..1b96f51fe8 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -2464,6 +2464,8 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
*/
outerPlan = outerPlan(node);
outerPlanState(winstate) = ExecInitNode(outerPlan, estate, eflags);
+ if (unlikely(!ExecPlanStillValid(estate)))
+ return winstate;
/*
* initialize source tuple type (which is also the tuple type that we'll
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 902793b02b..b754827013 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -70,7 +70,8 @@ static int _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
-static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount);
+static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index);
static void _SPI_error_callback(void *arg);
@@ -1682,7 +1683,8 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
- cplan);
+ cplan,
+ plansource);
/*
* Set up options for portal. Default SCROLL type is chosen the same way
@@ -2494,6 +2496,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
ListCell *lc2;
+ int i = 0;
spicallbackarg.query = plansource->query_string;
@@ -2691,8 +2694,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
options->params,
_SPI_current->queryEnv,
0);
- res = _SPI_pquery(qdesc, fire_triggers,
- canSetTag ? options->tcount : 0);
+
+ res = _SPI_pquery(qdesc, fire_triggers, canSetTag ? options->tcount : 0,
+ plansource, i);
FreeQueryDesc(qdesc);
}
else
@@ -2789,6 +2793,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
my_res = res;
goto fail;
}
+
+ i++;
}
/* Done with this plan, so release refcount */
@@ -2866,7 +2872,8 @@ _SPI_convert_params(int nargs, Oid *argtypes,
}
static int
-_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
+_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index)
{
int operation = queryDesc->operation;
int eflags;
@@ -2922,7 +2929,7 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
else
eflags = EXEC_FLAG_SKIP_TRIGGERS;
- ExecutorStart(queryDesc, eflags);
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
ExecutorRun(queryDesc, ForwardScanDirection, tcount, true);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8bc6bea113..ccbc27b575 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1237,6 +1237,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ NULL,
NULL);
/*
@@ -2027,7 +2028,8 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
- cplan);
+ cplan,
+ psrc);
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 6e8f6b1b8f..c2ebddaa84 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -19,6 +19,7 @@
#include "access/xact.h"
#include "commands/prepare.h"
+#include "executor/execdesc.h"
#include "executor/tstoreReceiver.h"
#include "miscadmin.h"
#include "pg_trace.h"
@@ -37,6 +38,8 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -80,6 +83,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
+ qd->cplan_release = false;
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -114,6 +118,13 @@ FreeQueryDesc(QueryDesc *qdesc)
UnregisterSnapshot(qdesc->snapshot);
UnregisterSnapshot(qdesc->crosscheck_snapshot);
+ /*
+ * Release CachedPlan if requested. The CachedPlan is not associated with
+ * a ResourceOwner when cplan_release is true; see ExecutorStartExt().
+ */
+ if (qdesc->cplan_release)
+ ReleaseCachedPlan(qdesc->cplan, NULL);
+
/* Only the QueryDesc itself need be freed */
pfree(qdesc);
}
@@ -126,6 +137,8 @@ FreeQueryDesc(QueryDesc *qdesc)
*
* plan: the plan tree for the query
* cplan: CachedPlan supplying the plan
+ * plansource: CachedPlanSource supplying the cplan
+ * query_index: index of the query in plansource->query_list
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -139,6 +152,8 @@ FreeQueryDesc(QueryDesc *qdesc)
static void
ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -157,7 +172,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Call ExecutorStart to prepare the plan for execution
*/
- ExecutorStart(queryDesc, 0);
+ ExecutorStartExt(queryDesc, 0, plansource, query_index);
/*
* Run the plan to completion.
@@ -518,9 +533,12 @@ PortalStart(Portal portal, ParamListInfo params,
myeflags = eflags;
/*
- * Call ExecutorStart to prepare the plan for execution
+ * Call ExecutorStartExt() to prepare the plan for execution. If
+ * the portal is using a cached plan, it may get invalidated
+ * during plan initialization, in which case a new one is
+ * created and saved in the QueryDesc.
*/
- ExecutorStart(queryDesc, myeflags);
+ ExecutorStartExt(queryDesc, myeflags, portal->plansource, 0);
/*
* This tells PortalCleanup to shut down the executor
@@ -1201,6 +1219,7 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
+ int i = 0;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1283,6 +1302,8 @@ PortalRunMulti(Portal portal,
/* statement can set tag string */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1293,6 +1314,8 @@ PortalRunMulti(Portal portal,
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1357,6 +1380,8 @@ PortalRunMulti(Portal portal,
*/
if (lnext(portal->stmts, stmtlist_item) != NULL)
CommandCounterIncrement();
+
+ i++;
}
/* Pop the snapshot if we pushed one. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5b75dadf13..d33f871ea2 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -94,6 +94,14 @@
*/
static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list);
+/*
+ * Head of the backend's list of "standalone" CachedPlans that are not
+ * associated with a CachedPlanSource, created by GetSingleCachedPlan() for
+ * transient use by the executor in certain scenarios where they're needed
+ * only for one execution of the plan.
+ */
+static dlist_head standalone_plan_list = DLIST_STATIC_INIT(standalone_plan_list);
+
/*
* This is the head of the backend's list of CachedExpressions.
*/
@@ -905,6 +913,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * Note: When changing this, you should also look at GetSingleCachedPlan().
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
@@ -1034,6 +1044,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
+ plan->is_standalone = false;
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1282,6 +1293,121 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
return plan;
}
+/*
+ * Create a fresh CachedPlan for the query_index'th query in the provided
+ * CachedPlanSource.
+ *
+ * The created CachedPlan is standalone, meaning it is not tracked in the
+ * CachedPlanSource. The CachedPlan and its plan trees are allocated in a
+ * child context of the caller's memory context. The caller must ensure they
+ * remain valid until execution is complete, after which the plan should be
+ * released by calling ReleaseCachedPlan().
+ *
+ * This function primarily supports ExecutorStartExt(), which handles cases
+ * where the original generic CachedPlan becomes invalid after prunable
+ * relations are locked.
+ */
+CachedPlan *
+GetSingleCachedPlan(CachedPlanSource *plansource, int query_index,
+ QueryEnvironment *queryEnv)
+{
+ List *query_list = plansource->query_list,
+ *plan_list;
+ CachedPlan *plan = plansource->gplan;
+ MemoryContext oldcxt = CurrentMemoryContext,
+ plan_context;
+ PlannedStmt *plannedstmt;
+
+ Assert(ActiveSnapshotSet());
+
+ /* Sanity checks */
+ if (plan == NULL)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context: plansource->gplan is NULL");
+ else if (plan->is_valid)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context: plansource->gplan->is_valid");
+
+ /*
+ * The plansource might have become invalid since GetCachedPlan(). See the
+ * comment in BuildCachedPlan() for details on why this might happen.
+ *
+ * The risk is greater here because this function is called from the
+ * executor, meaning much more processing may have occurred compared to
+ * when BuildCachedPlan() is called from GetCachedPlan().
+ */
+ if (!plansource->is_valid)
+ query_list = RevalidateCachedQuery(plansource, queryEnv);
+ Assert(query_list != NIL);
+
+ /*
+ * Build a new generic plan for the query_index'th query, but make a copy
+ * to be scribbled on by the planner
+ */
+ query_list = list_make1(copyObject(list_nth_node(Query, query_list,
+ query_index)));
+ plan_list = pg_plan_queries(query_list, plansource->query_string,
+ plansource->cursor_options, NULL);
+
+ list_free_deep(query_list);
+
+ /*
+ * Make a dedicated memory context for the CachedPlan and its subsidiary
+ * data so that it can be freed by ReleaseCachedPlan(), which is called
+ * from FreeQueryDesc().
+ */
+ plan_context = AllocSetContextCreate(CurrentMemoryContext,
+ "Standalone CachedPlan",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextCopyAndSetIdentifier(plan_context, plansource->query_string);
+
+ /*
+ * Copy plan into the new context.
+ */
+ MemoryContextSwitchTo(plan_context);
+ plan_list = copyObject(plan_list);
+
+ /*
+ * Create and fill the CachedPlan struct within the new context.
+ */
+ plan = (CachedPlan *) palloc(sizeof(CachedPlan));
+ plan->magic = CACHEDPLAN_MAGIC;
+ plan->stmt_list = plan_list;
+
+ plan->planRoleId = GetUserId();
+ Assert(list_length(plan_list) == 1);
+ plannedstmt = linitial_node(PlannedStmt, plan_list);
+
+ /*
+ * CachedPlan is dependent on role either if RLS affected the rewrite
+ * phase or if a role dependency was injected during planning. And it's
+ * transient if any plan is marked so.
+ */
+ plan->dependsOnRole = plansource->dependsOnRLS || plannedstmt->dependsOnRole;
+ if (plannedstmt->transientPlan)
+ {
+ Assert(TransactionIdIsNormal(TransactionXmin));
+ plan->saved_xmin = TransactionXmin;
+ }
+ else
+ plan->saved_xmin = InvalidTransactionId;
+ plan->refcount = 0;
+ plan->context = plan_context;
+ plan->is_oneshot = false;
+ plan->is_generic = true;
+ plan->is_saved = false;
+ plan->is_valid = true;
+ plan->is_standalone = true;
+ plan->generation = 1;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * Add the entry to the global list of "standalone" cached plans. It is
+ * removed from the list by ReleaseCachedPlan().
+ */
+ dlist_push_tail(&standalone_plan_list, &plan->node);
+
+ return plan;
+}
+
/*
* ReleaseCachedPlan: release active use of a cached plan.
*
@@ -1309,6 +1435,10 @@ ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner)
/* Mark it no longer valid */
plan->magic = 0;
+ /* Remove from the global list if we are a standalone plan. */
+ if (plan->is_standalone)
+ dlist_delete(&plan->node);
+
/* One-shot plans do not own their context, so we can't free them */
if (!plan->is_oneshot)
MemoryContextDelete(plan->context);
@@ -2066,6 +2196,33 @@ PlanCacheRelCallback(Datum arg, Oid relid)
cexpr->is_valid = false;
}
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ if (cplan->is_valid)
+ {
+ ListCell *lc;
+
+ foreach(lc, cplan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue; /* Ignore utility statements */
+ if ((relid == InvalidOid) ? plannedstmt->relationOids != NIL :
+ list_member_oid(plannedstmt->relationOids, relid))
+ cplan->is_valid = false;
+ if (!cplan->is_valid)
+ break; /* out of stmt_list scan */
+ }
+ }
+ }
}
/*
@@ -2176,6 +2333,44 @@ PlanCacheObjectCallback(Datum arg, int cacheid, uint32 hashvalue)
}
}
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ if (cplan->is_valid)
+ {
+ ListCell *lc;
+
+ foreach(lc, cplan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+ ListCell *lc3;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue; /* Ignore utility statements */
+ foreach(lc3, plannedstmt->invalItems)
+ {
+ PlanInvalItem *item = (PlanInvalItem *) lfirst(lc3);
+
+ if (item->cacheId != cacheid)
+ continue;
+ if (hashvalue == 0 ||
+ item->hashValue == hashvalue)
+ {
+ cplan->is_valid = false;
+ break; /* out of invalItems scan */
+ }
+ }
+ if (!cplan->is_valid)
+ break; /* out of stmt_list scan */
+ }
+ }
+ }
}
/*
@@ -2235,6 +2430,17 @@ ResetPlanCache(void)
cexpr->is_valid = false;
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ cplan->is_valid = false;
+ }
}
/*
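The relcache callback hunk above invalidates standalone plans with a single ternary test. A rough standalone sketch of that rule follows; it is not PostgreSQL code. ToyOid, ToyStandalonePlan, toy_depends_on() and toy_rel_callback() are hypothetical names, a plain linked list stands in for the dlist, and an OID array stands in for plannedstmt->relationOids. The only thing it tries to show is that an invalid relid (0 here) flushes every plan that references any relation, while a specific relid flushes only the plans that list it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the real Oid / CachedPlan definitions. */
typedef unsigned int ToyOid;
#define TOY_INVALID_OID 0u

typedef struct ToyStandalonePlan
{
    const ToyOid *relation_oids;    /* relations the plan depends on */
    size_t      nrelations;
    bool        is_valid;
    struct ToyStandalonePlan *next; /* simple list link instead of dlist_node */
} ToyStandalonePlan;

/* Does the plan list relid among its dependencies? */
static bool
toy_depends_on(const ToyStandalonePlan *plan, ToyOid relid)
{
    assert(plan != NULL);
    for (size_t i = 0; i < plan->nrelations; i++)
    {
        if (plan->relation_oids[i] == relid)
            return true;
    }
    return false;
}

/*
 * Mark plans invalid the way the hunk above does: an invalid relid means
 * "any relation", so every plan with at least one dependency is flushed;
 * otherwise only plans that mention relid are flushed.
 */
static void
toy_rel_callback(ToyStandalonePlan *head, ToyOid relid)
{
    for (ToyStandalonePlan *p = head; p != NULL; p = p->next)
    {
        if (!p->is_valid)
            continue;           /* already invalid, nothing to do */
        if ((relid == TOY_INVALID_OID) ? p->nrelations > 0
                                       : toy_depends_on(p, relid))
            p->is_valid = false;
    }
}
```

A plan with no relation dependencies survives even the "flush all" case, which mirrors the relationOids != NIL test in the patch.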
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 4a24613537..bf70fd4ce7 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,7 +284,8 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan)
+ CachedPlan *cplan,
+ CachedPlanSource *plansource)
{
Assert(PortalIsValid(portal));
Assert(portal->status == PORTAL_NEW);
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->commandTag = commandTag;
portal->stmts = stmts;
portal->cplan = cplan;
+ portal->plansource = plansource;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index bf326eeb70..652e1afbf7 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -102,6 +102,7 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ParamListInfo params, QueryEnvironment *queryEnv);
extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int plan_index,
IntoClause *into, ExplainState *es,
const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index 8a5a9fe642..db21561c8c 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -258,6 +258,7 @@ extern void ExecASTruncateTriggers(EState *estate,
extern void AfterTriggerBeginXact(void);
extern void AfterTriggerBeginQuery(void);
extern void AfterTriggerEndQuery(EState *estate);
+extern void AfterTriggerAbortQuery(void);
extern void AfterTriggerFireDeferred(void);
extern void AfterTriggerEndXact(bool isCommit);
extern void AfterTriggerBeginSubXact(void);
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0e7245435d..f6cb6479c0 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -36,6 +36,7 @@ typedef struct QueryDesc
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
+ bool cplan_release; /* Should FreeQueryDesc() release cplan? */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 69c3ebff00..084d8d5d91 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -19,6 +19,7 @@
#include "nodes/lockoptions.h"
#include "nodes/parsenodes.h"
#include "utils/memutils.h"
+#include "utils/plancache.h"
/*
@@ -198,6 +199,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
* prototypes from functions in execMain.c
*/
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern void ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource, int query_index);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count, bool execute_once);
@@ -261,6 +264,20 @@ extern void ExecEndNode(PlanState *node);
extern void ExecShutdownNode(PlanState *node);
extern void ExecSetTupleBound(int64 tuples_needed, PlanState *child_node);
+/*
+ * Is the CachedPlan in es_cachedplan still valid?
+ *
+ * Called at various points during ExecutorStart() because invalidation
+ * messages that affect the plan might be received after locks have been
+ * taken on runtime-prunable relations. The caller should take appropriate
+ * action if the plan has become invalid.
+ */
+static inline bool
+ExecPlanStillValid(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? true :
+ CachedPlanValid(estate->es_cachedplan);
+}
/* ----------------------------------------------------------------
* ExecProcNode
@@ -589,6 +606,7 @@ extern void ExecCreateScanSlotFromOuterPlan(EState *estate,
extern bool ExecRelationIsTargetRelation(EState *estate, Index scanrelid);
extern Relation ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags);
+extern Relation ExecOpenScanIndexRelation(EState *estate, Oid indexid, int lockmode);
extern void ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos);
extern void ExecCloseRangeTableRelations(EState *estate);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index ee089505a0..2a8e5bd784 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -680,6 +680,7 @@ typedef struct EState
int es_top_eflags; /* eflags passed to ExecutorStart */
int es_instrument; /* OR of InstrumentOption flags */
bool es_finished; /* true when ExecutorFinish is done */
+ bool es_aborted; /* true when execution was aborted */
List *es_exprcontexts; /* List of ExprContexts within EState */
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0b5ee007ca..154f68f671 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -18,6 +18,7 @@
#include "access/tupdesc.h"
#include "lib/ilist.h"
#include "nodes/params.h"
+#include "nodes/parsenodes.h"
#include "tcop/cmdtag.h"
#include "utils/queryenvironment.h"
#include "utils/resowner.h"
@@ -152,6 +153,8 @@ typedef struct CachedPlan
bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
+ bool is_standalone; /* is it not associated with a
+ * CachedPlanSource? */
Oid planRoleId; /* Role ID the plan was created for */
bool dependsOnRole; /* is plan specific to that role? */
TransactionId saved_xmin; /* if valid, replan when TransactionXmin
@@ -159,6 +162,12 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+
+ /*
+ * If the plan is not associated with a CachedPlanSource, it is saved in
+ * a separate global list.
+ */
+ dlist_node node; /* list link, if is_standalone */
} CachedPlan;
/*
@@ -224,6 +233,10 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
QueryEnvironment *queryEnv);
+extern CachedPlan *GetSingleCachedPlan(CachedPlanSource *plansource,
+ int query_index,
+ QueryEnvironment *queryEnv);
+
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
@@ -245,4 +258,17 @@ CachedPlanRequiresLocking(CachedPlan *cplan)
return cplan->is_generic;
}
+/*
+ * CachedPlanValid
+ * Returns whether a cached generic plan is still valid.
+ *
+ * Invoked by the executor to check if the plan has not been invalidated after
+ * taking locks during the initialization of the plan.
+ */
+static inline bool
+CachedPlanValid(CachedPlan *cplan)
+{
+ return cplan->is_valid;
+}
+
#endif /* PLANCACHE_H */
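The many ExecInit*() hunks earlier in the patch all follow one pattern: initialize a child (or open a relation), then return the partially built state right away if the cached plan has been invalidated in the meantime. A minimal sketch of that control flow, under stated assumptions, is below. It is not executor code: ToyPlan, ToyEState, ToyNode, toy_plan_still_valid() and toy_init_node() are hypothetical stand-ins for CachedPlan, EState, PlanState, ExecPlanStillValid() and ExecInitNode(), and the invalidates_plan flag merely simulates an invalidation message arriving while a node takes its lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CachedPlan: only the validity flag matters. */
typedef struct ToyPlan
{
    bool        is_valid;
} ToyPlan;

/* Hypothetical stand-in for EState; cachedplan is NULL for non-cached plans. */
typedef struct ToyEState
{
    ToyPlan    *cachedplan;
} ToyEState;

typedef struct ToyNode
{
    struct ToyNode *child;
    bool        invalidates_plan;   /* simulate invalidation during locking */
    bool        fully_initialized;
} ToyNode;

/* Same shape as ExecPlanStillValid(): no cached plan means always valid. */
static bool
toy_plan_still_valid(const ToyEState *estate)
{
    assert(estate != NULL);
    return estate->cachedplan == NULL ? true : estate->cachedplan->is_valid;
}

/*
 * Initialize a node and its child, bailing out as soon as the plan is found
 * to be invalid.  The caller gets back a partially initialized node and is
 * expected to check validity itself before using it.
 */
static ToyNode *
toy_init_node(ToyNode *node, ToyEState *estate)
{
    if (node == NULL)
        return NULL;
    node->fully_initialized = false;

    /* "Open the relation": this is where an invalidation may be noticed. */
    if (node->invalidates_plan && estate->cachedplan != NULL)
        estate->cachedplan->is_valid = false;
    if (!toy_plan_still_valid(estate))
        return node;

    /* Initialize the child, then re-check before doing any more work. */
    toy_init_node(node->child, estate);
    if (!toy_plan_still_valid(estate))
        return node;

    node->fully_initialized = true;
    return node;
}
```

In this toy model the early return propagates up the tree without any explicit error code: every level re-checks the same flag, which is the property the repeated two-line hunks rely on.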
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 29f49829f2..58c3828d2c 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ CachedPlanSource *plansource; /* CachedPlanSource, for cplan */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -241,7 +242,8 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan);
+ CachedPlan *cplan,
+ CachedPlanSource *plansource);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
diff --git a/src/test/modules/delay_execution/Makefile b/src/test/modules/delay_execution/Makefile
index 70f24e846d..3eeb097fde 100644
--- a/src/test/modules/delay_execution/Makefile
+++ b/src/test/modules/delay_execution/Makefile
@@ -8,7 +8,8 @@ OBJS = \
delay_execution.o
ISOLATION = partition-addition \
- partition-removal-1
+ partition-removal-1 \
+ cached-plan-inval
ifdef USE_PGXS
PG_CONFIG = pg_config
diff --git a/src/test/modules/delay_execution/delay_execution.c b/src/test/modules/delay_execution/delay_execution.c
index 155c8a8d55..304ca77f7b 100644
--- a/src/test/modules/delay_execution/delay_execution.c
+++ b/src/test/modules/delay_execution/delay_execution.c
@@ -1,14 +1,18 @@
/*-------------------------------------------------------------------------
*
* delay_execution.c
- * Test module to allow delay between parsing and execution of a query.
+ * Test module to introduce delay at various points during execution of a
+ * query to test that execution proceeds safely in light of concurrent
+ * changes.
*
* The delay is implemented by taking and immediately releasing a specified
* advisory lock. If another process has previously taken that lock, the
* current process will be blocked until the lock is released; otherwise,
* there's no effect. This allows an isolationtester script to reliably
- * test behaviors where some specified action happens in another backend
- * between parsing and execution of any desired query.
+ * test behaviors where some specified action happens in another backend in
+ * a couple of cases: 1) between parsing and execution of any desired query
+ * when using the planner_hook, 2) between RevalidateCachedQuery() and
+ * ExecutorStart() when using the ExecutorStart_hook.
*
* Copyright (c) 2020-2024, PostgreSQL Global Development Group
*
@@ -22,6 +26,7 @@
#include <limits.h>
+#include "executor/executor.h"
#include "optimizer/planner.h"
#include "utils/builtins.h"
#include "utils/guc.h"
@@ -32,9 +37,11 @@ PG_MODULE_MAGIC;
/* GUC: advisory lock ID to use. Zero disables the feature. */
static int post_planning_lock_id = 0;
+static int executor_start_lock_id = 0;
-/* Save previous planner hook user to be a good citizen */
+/* Save previous hook users to be a good citizen */
static planner_hook_type prev_planner_hook = NULL;
+static ExecutorStart_hook_type prev_ExecutorStart_hook = NULL;
/* planner_hook function to provide the desired delay */
@@ -70,11 +77,41 @@ delay_execution_planner(Query *parse, const char *query_string,
return result;
}
+/* ExecutorStart_hook function to provide the desired delay */
+static void
+delay_execution_ExecutorStart(QueryDesc *queryDesc, int eflags)
+{
+ /* If enabled, delay by taking and releasing the specified lock */
+ if (executor_start_lock_id != 0)
+ {
+ DirectFunctionCall1(pg_advisory_lock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+ DirectFunctionCall1(pg_advisory_unlock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+
+ /*
+ * Ensure that we notice any pending invalidations, since the advisory
+ * lock functions don't do this.
+ */
+ AcceptInvalidationMessages();
+ }
+
+ /* Now start the executor, possibly via a previous hook user */
+ if (prev_ExecutorStart_hook)
+ prev_ExecutorStart_hook(queryDesc, eflags);
+ else
+ standard_ExecutorStart(queryDesc, eflags);
+
+ if (executor_start_lock_id != 0)
+ elog(NOTICE, "Finished ExecutorStart(): CachedPlan is %s",
+ CachedPlanValid(queryDesc->cplan) ? "valid" : "not valid");
+}
+
/* Module load function */
void
_PG_init(void)
{
- /* Set up the GUC to control which lock is used */
+ /* Set up GUCs to control which lock is used */
DefineCustomIntVariable("delay_execution.post_planning_lock_id",
"Sets the advisory lock ID to be locked/unlocked after planning.",
"Zero disables the delay.",
@@ -86,10 +123,22 @@ _PG_init(void)
NULL,
NULL,
NULL);
-
+ DefineCustomIntVariable("delay_execution.executor_start_lock_id",
+ "Sets the advisory lock ID to be locked/unlocked before starting execution.",
+ "Zero disables the delay.",
+ &executor_start_lock_id,
+ 0,
+ 0, INT_MAX,
+ PGC_USERSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
MarkGUCPrefixReserved("delay_execution");
- /* Install our hook */
+ /* Install our hooks. */
prev_planner_hook = planner_hook;
planner_hook = delay_execution_planner;
+ prev_ExecutorStart_hook = ExecutorStart_hook;
+ ExecutorStart_hook = delay_execution_ExecutorStart;
}
diff --git a/src/test/modules/delay_execution/expected/cached-plan-inval.out b/src/test/modules/delay_execution/expected/cached-plan-inval.out
new file mode 100644
index 0000000000..e8efb6d9d9
--- /dev/null
+++ b/src/test/modules/delay_execution/expected/cached-plan-inval.out
@@ -0,0 +1,175 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1prep s2lock s1exec s2dropi s2unlock
+step s1prep: SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1);
+QUERY PLAN
+------------------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = $1)
+(7 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+-------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Filter: (a = $1)
+(5 rows)
+
+
+starting permutation: s1prep2 s2lock s1exec2 s2dropi s2unlock
+step s1prep2: SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(11 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec2: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec2: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(6 rows)
+
+
+starting permutation: s1prep3 s2lock s1exec3 s2dropi s2unlock
+step s1prep3: SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ Update on foo3 foo
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(27 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec3: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec3: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ Update on foo3 foo
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(17 rows)
+
diff --git a/src/test/modules/delay_execution/meson.build b/src/test/modules/delay_execution/meson.build
index 41f3ac0b89..5a70b183d0 100644
--- a/src/test/modules/delay_execution/meson.build
+++ b/src/test/modules/delay_execution/meson.build
@@ -24,6 +24,7 @@ tests += {
'specs': [
'partition-addition',
'partition-removal-1',
+ 'cached-plan-inval',
],
},
}
diff --git a/src/test/modules/delay_execution/specs/cached-plan-inval.spec b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
new file mode 100644
index 0000000000..5b1f72b4a8
--- /dev/null
+++ b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
@@ -0,0 +1,65 @@
+# Test to check that invalidation of cached generic plans during ExecutorStart
+# correctly triggers replanning and re-execution.
+
+setup
+{
+ CREATE TABLE foo (a int, b text) PARTITION BY LIST(a);
+ CREATE TABLE foo12 PARTITION OF foo FOR VALUES IN (1, 2) PARTITION BY LIST (a);
+ CREATE TABLE foo12_1 PARTITION OF foo12 FOR VALUES IN (1);
+ CREATE TABLE foo12_2 PARTITION OF foo12 FOR VALUES IN (2);
+ CREATE INDEX foo12_1_a ON foo12_1 (a);
+ CREATE TABLE foo3 PARTITION OF foo FOR VALUES IN (3);
+ CREATE VIEW foov AS SELECT * FROM foo;
+ CREATE FUNCTION one () RETURNS int AS $$ BEGIN RETURN 1; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE FUNCTION two () RETURNS int AS $$ BEGIN RETURN 2; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE RULE update_foo AS ON UPDATE TO foo DO ALSO SELECT 1;
+}
+
+teardown
+{
+ DROP VIEW foov;
+ DROP RULE update_foo ON foo;
+ DROP TABLE foo;
+ DROP FUNCTION one(), two();
+}
+
+session "s1"
+# Append with run-time pruning
+step "s1prep" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+
+# Another case with Append with run-time pruning
+step "s1prep2" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+
+# Case with a rule adding another query
+step "s1prep3" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+# Executes a generic plan
+step "s1exec" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+step "s1exec2" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+step "s1exec3" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+session "s2"
+step "s2lock" { SELECT pg_advisory_lock(12345); }
+step "s2unlock" { SELECT pg_advisory_unlock(12345); }
+step "s2dropi" { DROP INDEX foo12_1_a; }
+
+# While "s1exec", etc. wait to acquire the advisory lock, "s2dropi" is able to
+# drop the index being used in the cached plan. When "s1exec" is then
+# unblocked and initializes the cached plan for execution, it detects the
+# concurrent index drop and causes the cached plan to be discarded and
+# recreated without the index.
+permutation "s1prep" "s2lock" "s1exec" "s2dropi" "s2unlock"
+permutation "s1prep2" "s2lock" "s1exec2" "s2dropi" "s2unlock"
+permutation "s1prep3" "s2lock" "s1exec3" "s2dropi" "s2unlock"
--
2.43.0
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-20 14:53 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-21 12:45 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-21 13:10 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-23 12:48 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-29 13:34 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
@ 2024-09-17 12:57 ` Amit Langote <[email protected]>
2024-09-19 08:39 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
1 sibling, 1 reply; 29+ messages in thread
From: Amit Langote @ 2024-09-17 12:57 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Thu, Aug 29, 2024 at 10:34 PM Amit Langote <[email protected]> wrote:
> One idea that I think might be worth trying to reduce the footprint of
> 0003 is to try to lock the prunable relations in a step of InitPlan()
> separate from ExecInitNode(), which can be implemented by doing the
> initial runtime pruning in that separate step. That way, we'll have
> all the necessary locks before calling ExecInitNode() and so we don't
> need to sprinkle the CachedPlanStillValid() checks all over the place
> and worry about missed checks and dealing with partially initialized
> PlanState trees.
I've worked on this and found that it results in a much simpler design.
Attached are 0001 and 0002, which contain patches to refactor the
runtime pruning code. These changes move initial pruning outside of
ExecInitNode() and use the results during ExecInitNode() to determine
the set of child subnodes to initialize.
With that in place, the patches (0003, 0004) that move the locking of
prunable relations from plancache.c into the executor become simpler.
They no longer need to modify any code called by ExecInitNode(). Since
no new locks are taken during ExecInitNode(), I didn't have to worry
about changing all the code involved in PlanState tree initialization
to add checks for CachedPlan validity. The check is only needed after
performing initial pruning, and if the CachedPlan is invalid,
ExecInitNode() won’t be called in the first place.
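To make the ordering concrete, here is a rough standalone sketch of that
control flow. It is only an illustration under simplifying assumptions:
every name in it (SketchPlan, sketch_executor_start(), and so on) is made
up for this example and is not the executor's actual API, and the plan is
modelled as becoming invalid only when a particular partition is locked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not PostgreSQL's real types. */
typedef struct SketchPlan
{
    bool valid;          /* cleared when taking a lock reveals an invalidation */
    int  nparts;         /* number of prunable partitions in the plan */
    int  inval_on_lock;  /* partition whose locking invalidates the plan, or -1 */
} SketchPlan;

typedef enum SketchStatus
{
    SKETCH_INIT_OK,
    SKETCH_MUST_REPLAN
} SketchStatus;

/* In this model, locking is the only point where an invalidation shows up. */
static void
sketch_lock_partition(SketchPlan *plan, int part)
{
    if (part == plan->inval_on_lock)
        plan->valid = false;
}

/*
 * survives[i] is the (assumed, precomputed) result of initial pruning for
 * partition i.  *ninit receives the number of child nodes initialized.
 */
SketchStatus
sketch_executor_start(SketchPlan *plan, const bool *survives, int *ninit)
{
    int i;

    assert(plan != NULL && survives != NULL && ninit != NULL);
    *ninit = 0;

    /* Step 1: lock only the partitions that survived initial pruning. */
    for (i = 0; i < plan->nparts; i++)
        if (survives[i])
            sketch_lock_partition(plan, i);

    /* Step 2: a single validity check, before any node is initialized. */
    if (!plan->valid)
        return SKETCH_MUST_REPLAN;

    /* Step 3: node initialization takes no new locks, so no more checks. */
    for (i = 0; i < plan->nparts; i++)
        if (survives[i])
            (*ninit)++;

    return SKETCH_INIT_OK;
}
```

The point of the sketch is that the validity check appears exactly once,
between locking and node initialization, so a caller that gets
SKETCH_MUST_REPLAN never has a partially initialized tree to clean up.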
--
Thanks, Amit Langote
Attachments:
[application/octet-stream] v53-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch (19.9K, 2-v53-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch)
From fe2eaf3a8047ce63318818f157e7d85754e38cc9 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Fri, 6 Sep 2024 13:11:05 +0900
Subject: [PATCH v53 1/4] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
This change moves PartitionPruneInfo from individual plan nodes to
PlannedStmt, allowing runtime initial pruning to be performed across
the entire plan tree without traversing the tree to find nodes
containing PartitionPruneInfos.
The PartitionPruneInfo pointer fields in Append and MergeAppend nodes
have been replaced with an integer index that points to
PartitionPruneInfos in a list within PlannedStmt, which holds the
PartitionPruneInfos for all subqueries.
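
As a minimal illustration of the index-based lookup, the standalone sketch
below uses invented types (SketchStmt, SketchAppend, SketchPruneInfo) and a
plain unsigned bitmask in place of a Bitmapset; none of these names exist
in the tree. It shows a node carrying an index (or -1), and a lookup that
cross-checks the relids before trusting the entry. The sketch returns NULL
on a mismatch, whereas the patch raises an error.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_INFOS 8

/* Hypothetical stand-ins; a bitmask plays the role of a Bitmapset. */
typedef struct SketchPruneInfo
{
    unsigned parent_relids;
} SketchPruneInfo;

typedef struct SketchStmt
{
    SketchPruneInfo infos[SKETCH_MAX_INFOS];
    int             ninfos;
} SketchStmt;

typedef struct SketchAppend
{
    unsigned apprelids;
    int      part_prune_index;  /* -1 if no run-time pruning */
} SketchAppend;

/* Append an entry and return its 0-based index, or -1 if the list is full. */
int
sketch_add_pruneinfo(SketchStmt *stmt, unsigned parent_relids)
{
    if (stmt->ninfos >= SKETCH_MAX_INFOS)
        return -1;
    stmt->infos[stmt->ninfos].parent_relids = parent_relids;
    return stmt->ninfos++;
}

/* Fetch the node's entry; NULL if it has none or the relids do not match. */
const SketchPruneInfo *
sketch_lookup_pruneinfo(const SketchStmt *stmt, const SketchAppend *node)
{
    const SketchPruneInfo *info;

    if (node->part_prune_index < 0)
        return NULL;
    assert(node->part_prune_index < stmt->ninfos);

    info = &stmt->infos[node->part_prune_index];
    if (info->parent_relids != node->apprelids)
        return NULL;    /* the patch reports an internal error here */
    return info;
}
```

Because all entries live in one flat list, code that wants to visit every
PartitionPruneInfo can simply iterate over the list instead of walking the
plan tree.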
Reviewed-by: Alvaro Herrera
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 19 +++++-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/nodeAppend.c | 5 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/optimizer/plan/createplan.c | 24 +++----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 86 ++++++++++++++++---------
src/backend/partitioning/partprune.c | 19 ++++--
src/include/executor/execPartition.h | 4 +-
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 6 ++
src/include/nodes/plannodes.h | 14 ++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 133 insertions(+), 62 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 29e186fa73..8837d77c3e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -848,6 +848,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index bfb3419efb..b01a2fdfdd 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -181,6 +181,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->permInfos = estate->es_rteperminfos;
pstmt->resultRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7651886229..ec730674f2 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1786,6 +1786,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* Initialize data structure needed for run-time partition pruning and
* do initial pruning if needed
*
+ * 'root_parent_relids' identifies the relation to which both the parent plan
+ * and the PartitionPruneInfo given by 'part_prune_index' belong.
+ *
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
* Initial pruning is performed here if needed and in that case only the
@@ -1798,11 +1801,25 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo;
+
+ /* Obtain the pruneinfo we need, and make sure it's the right one */
+ pruneinfo = list_nth_node(PartitionPruneInfo, estate->es_part_prune_infos,
+ part_prune_index);
+ if (!bms_equal(root_parent_relids, pruneinfo->root_parent_relids))
+ ereport(ERROR,
+ errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("mismatching PartitionPruneInfo found at part_prune_index %d",
+ part_prune_index),
+ errdetail_internal("plan node relids %s, pruneinfo relids %s",
+ bmsToString(root_parent_relids),
+ bmsToString(pruneinfo->root_parent_relids)));
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 5737f9f4eb..67734979b0 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -118,6 +118,7 @@ CreateExecutorState(void)
estate->es_rowmarks = NULL;
estate->es_rteperminfos = NIL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index ca0f54d676..de7ebab5c2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index e1b9b984a7..3ed91808dd 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index bb45ef318f..6642d09a39 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1225,7 +1225,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1376,6 +1375,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1399,16 +1401,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1447,7 +1447,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1540,6 +1539,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1555,13 +1557,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
Assert(best_path->path.param_info == NULL);
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index df35d1ff9c..1b9071c774 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -547,6 +547,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 91c7c4fe2f..e2ea406c4e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1732,6 +1732,48 @@ set_customscan_references(PlannerInfo *root,
cscan->custom_relids = offset_relid_set(cscan->custom_relids, rtoffset);
}
+/*
+ * register_partpruneinfo
+ * Subroutine for set_append_references and set_mergeappend_references
+ *
+ * Add the PartitionPruneInfo from root->partPruneInfos at the given index
+ * into PlannerGlobal->partPruneInfos and return its index there.
+ *
+ * Also update the RT indexes present in PartitionedRelPruneInfos to add the
+ * offset.
+ */
+static int
+register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
+{
+ PlannerGlobal *glob = root->glob;
+ PartitionPruneInfo *pinfo;
+ ListCell *l;
+
+ Assert(part_prune_index >= 0 &&
+ part_prune_index < list_length(root->partPruneInfos));
+ pinfo = list_nth_node(PartitionPruneInfo, root->partPruneInfos,
+ part_prune_index);
+
+ pinfo->root_parent_relids = offset_relid_set(pinfo->root_parent_relids,
+ rtoffset);
+ foreach(l, pinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+
+ prelinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pinfo);
+
+ return list_length(glob->partPruneInfos) - 1;
+}
+
/*
* set_append_references
* Do set_plan_references processing on an Append
@@ -1784,21 +1826,13 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index =
+ register_partpruneinfo(root, aplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1860,21 +1894,13 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index =
+ register_partpruneinfo(root, mplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9a1a7faac7..60fabb1734 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -207,16 +207,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -330,10 +334,11 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
+ pruneinfo->root_parent_relids = parentrel->relids;
pruneinfo->prune_infos = prunerelinfos;
/*
@@ -356,7 +361,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index c09bc83b2a..12aacc84ff 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,9 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 516b948743..49f1d56a5d 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -635,6 +635,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 07e2415398..8d30b6e896 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -128,6 +128,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -559,6 +562,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 62cd6a6666..39d0281c23 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in the
+ * plan */
+
List *rtable; /* list of RangeTblEntry nodes */
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
@@ -276,8 +279,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -311,8 +314,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
@@ -1414,6 +1417,8 @@ typedef struct PlanRowMark
* Then, since an Append-type node could have multiple partitioning
* hierarchies among its children, we have an unordered List of those Lists.
*
+ * root_parent_relids RelOptInfo.relids of the relation to which the parent
+ * plan node and this PartitionPruneInfo node belong
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
@@ -1426,6 +1431,7 @@ typedef struct PartitionPruneInfo
pg_node_attr(no_equal, no_query_jumble)
NodeTag type;
+ Bitmapset *root_parent_relids;
List *prune_infos;
Bitmapset *other_subplans;
} PartitionPruneInfo;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index bd490d154f..c536a1fe19 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.43.0
[application/octet-stream] v53-0002-Perform-runtime-initial-pruning-outside-ExecInit.patch (14.2K, 3-v53-0002-Perform-runtime-initial-pruning-outside-ExecInit.patch)
From 6e63bee3e1f306e7c618d3256c1f94780d325ce6 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 12 Sep 2024 15:44:43 +0900
Subject: [PATCH v53 2/4] Perform runtime initial pruning outside
ExecInitNode()
This commit follows up on the previous change that moved
PartitionPruneInfos out of individual plan nodes into a list in
PlannedStmt. It moves the initialization of PartitionPruneStates
and runtime initial pruning out of ExecInitNode() and into a new
routine, ExecDoInitialPruning(), which is called by InitPlan()
before ExecInitNode() is invoked on the main plan tree and subplans.
ExecDoInitialPruning() stores the PartitionPruneStates in a list
matching the length of es_part_prune_infos (which holds the
PartitionPruneInfos from PlannedStmt), allowing both lists to share
the same index. It also saves the initial pruning result -- a
bitmapset of indexes for surviving child subnodes -- in a similarly
indexed list.
While the initial pruning is done earlier, the execution pruning
context information (needed for runtime pruning) is initialized
later during ExecInitNode() for the parent plan node, as it requires
access to the parent node's PlanState struct.
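
The parallel-list arrangement can be sketched in standalone C as follows.
This is an illustration only: SketchInfo, SketchEState and both functions
are invented names, fixed-size arrays stand in for Lists, an unsigned
bitmask stands in for a Bitmapset, and the outcome of the initial pruning
steps is supplied as a precomputed field rather than evaluated.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX 4

/* Hypothetical stand-ins for illustration only. */
typedef struct SketchInfo
{
    bool     has_initial_steps; /* does this entry need initial pruning? */
    unsigned nsubplans;         /* fewer than 32 in this sketch */
    unsigned survivors;         /* assumed outcome of the initial steps */
} SketchInfo;

typedef struct SketchEState
{
    SketchInfo infos[SKETCH_MAX];       /* like es_part_prune_infos */
    bool       state_built[SKETCH_MAX]; /* like es_part_prune_states */
    unsigned   results[SKETCH_MAX];     /* like es_part_prune_results */
    int        n;
} SketchEState;

/* Runs once, before any plan node is initialized. */
void
sketch_do_initial_pruning(SketchEState *es)
{
    int i;

    for (i = 0; i < es->n; i++)
    {
        /* A state is built for every entry, pruned or not. */
        es->state_built[i] = true;

        /* A slot is always filled so that all three arrays share indexes. */
        es->results[i] = es->infos[i].has_initial_steps
            ? es->infos[i].survivors
            : 0;
    }
}

/* Called later, during initialization of the parent node. */
unsigned
sketch_valid_subplans(const SketchEState *es, int index)
{
    assert(index >= 0 && index < es->n);
    assert(es->state_built[index]);

    if (es->infos[index].has_initial_steps)
        return es->results[index];

    /* No initial pruning was done: every subplan is valid. */
    assert(es->infos[index].nsubplans < 32);
    return (1u << es->infos[index].nsubplans) - 1u;
}
```

Filling a slot even when no pruning happens is what lets a node use its
one index against all three arrays without any translation.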
---
src/backend/executor/execMain.c | 55 ++++++++++
src/backend/executor/execPartition.c | 146 +++++++++++++++++++++------
src/include/executor/execPartition.h | 3 +
src/include/nodes/execnodes.h | 2 +
4 files changed, 176 insertions(+), 30 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 8837d77c3e..dceef322af 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -46,6 +46,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/executor.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "mb/pg_wchar.h"
@@ -816,6 +817,54 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
PreventCommandIfParallelMode(CreateCommandName((Node *) plannedstmt));
}
+/*
+ * ExecDoInitialPruning
+ * Perform runtime "initial" pruning, if necessary, to determine the set
+ * of child subnodes that need to be initialized during ExecInitNode()
+ * for plan nodes that support partition pruning.
+ *
+ * For each PartitionPruneInfo in estate->es_part_prune_infos, this function
+ * creates a PartitionPruneState (even if no initial pruning is done) and adds
+ * it to es_part_prune_states. For PartitionPruneInfo entries that include
+ * initial pruning steps, the result of those steps is saved as a bitmapset
+ * of indexes representing child subnodes that are "valid" and should be
+ * initialized for execution.
+ */
+static void
+ExecDoInitialPruning(EState *estate)
+{
+ ListCell *lc;
+
+ foreach(lc, estate->es_part_prune_infos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+ PartitionPruneState *prunestate;
+ Bitmapset *validsubplans = NULL;
+
+ /*
+ * Create the working data structure for pruning, and save it for use
+ * later in ExecInitPartitionPruning(), which will be called by the
+ * parent plan node's ExecInit* function.
+ */
+ prunestate = CreatePartitionPruneState(estate, pruneinfo);
+ estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+ prunestate);
+
+ /*
+ * Perform an initial partition pruning pass, if necessary, and save
+ * the bitmapset of valid subplans for use in
+ * ExecInitPartitionPruning(). If no initial pruning is performed, we
+ * still store a NULL to ensure that es_part_prune_results is the same
+ * length as es_part_prune_infos. This ensures that
+ * ExecInitPartitionPruning() can use the same index to locate the
+ * result.
+ */
+ if (prunestate->do_initial_prune)
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ estate->es_part_prune_results = lappend(estate->es_part_prune_results,
+ validsubplans);
+ }
+}
/* ----------------------------------------------------------------
* InitPlan
@@ -848,7 +897,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+
+ /*
+ * Perform runtime "initial" pruning to determine the plan nodes that will
+ * not be executed.
+ */
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ ExecDoInitialPruning(estate);
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index ec730674f2..08b1f3d030 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -181,8 +181,6 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
-static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -192,6 +190,9 @@ static void InitPartitionPruneContext(PartitionPruneContext *context,
static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
Bitmapset *initially_valid_subplans,
int n_total_subplans);
+static void PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
+ PartitionPruneState *prunestate,
+ PlanState *planstate);
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
@@ -1821,17 +1822,21 @@ ExecInitPartitionPruning(PlanState *planstate,
bmsToString(root_parent_relids),
bmsToString(pruneinfo->root_parent_relids)));
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
-
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
-
/*
- * Perform an initial partition prune pass, if required.
+ * ExecDoInitialPruning() must have initialized the PartitionPruneState to
+ * perform the initial pruning. Now we simply need to initialize the
+ * context information for exec pruning.
*/
+ prunestate = list_nth(estate->es_part_prune_states, part_prune_index);
+ Assert(prunestate != NULL);
+ if (prunestate->do_exec_prune)
+ PartitionPruneInitExecPruning(pruneinfo, prunestate, planstate);
+
+ /* Use the result of initial pruning done by ExecDoInitialPruning(). */
if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ *initially_valid_subplans = list_nth_node(Bitmapset,
+ estate->es_part_prune_results,
+ part_prune_index);
else
{
/* No pruning, so we'll need to initialize all subplans */
@@ -1878,15 +1883,15 @@ ExecInitPartitionPruning(PlanState *planstate,
* re-evaluate which partitions match the pruning steps provided in each
* PartitionedRelPruneInfo.
*/
-static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+PartitionPruneState *
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
{
- EState *estate = planstate->state;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
+ /* We may need an expression context to evaluate partition exprs */
+ ExprContext *econtext = CreateExprContext(estate);
/* For data reading, executor always includes detached partitions */
if (estate->es_partition_directory == NULL)
@@ -1974,6 +1979,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* set to -1, as if they were pruned. By construction, both
* arrays are in partition bounds order.
*/
+ pprune->partrel = partrel;
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
@@ -2073,29 +2079,31 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
- partdesc, partkey, planstate,
+ partdesc, partkey, NULL,
econtext);
/* Record whether initial pruning is needed at any level */
prunestate->do_initial_prune = true;
}
- pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps &&
- !(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
- {
- InitPartitionPruneContext(&pprune->exec_context,
- pinfo->exec_pruning_steps,
- partdesc, partkey, planstate,
- econtext);
- /* Record whether exec pruning is needed at any level */
- prunestate->do_exec_prune = true;
- }
/*
- * Accumulate the IDs of all PARAM_EXEC Params affecting the
- * partitioning decisions at this plan node.
+ * The exec pruning context will be initialized in
+ * ExecInitPartitionPruning() when called during the initialization
+ * of the parent plan node.
+ *
+ * pprune->exec_pruning_steps is set to NIL to prevent
+ * ExecFindMatchingSubPlans() from accessing an uninitialized
+ * pprune->exec_context during the initial pruning by
+ * ExecDoInitialPruning().
+ *
+ * prunestate->do_exec_prune is set to indicate whether
+ * PartitionPruneInitExecPruning() needs to be called by
+ * ExecInitPartitionPruning(). This optimization avoids
+ * unnecessary cycles when only initial pruning is required.
*/
- prunestate->execparamids = bms_add_members(prunestate->execparamids,
- pinfo->execparamids);
+ pprune->exec_pruning_steps = NIL;
+ if (pinfo->exec_pruning_steps &&
+ !(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
+ prunestate->do_exec_prune = true;
j++;
}
@@ -2305,6 +2313,84 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
pfree(new_subplan_indexes);
}
+/*
+ * PartitionPruneInitExecPruning
+ * Initialize PartitionPruneState for exec pruning.
+ */
+static void
+PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
+ PartitionPruneState *prunestate,
+ PlanState *planstate)
+{
+ EState *estate = planstate->state;
+ int i;
+ ExprContext *econtext;
+
+ /* CreatePartitionPruneState() must have initialized this. */
+ Assert(estate->es_partition_directory != NULL);
+
+ /* CreatePartitionPruneState() must have set this. */
+ Assert(prunestate->do_exec_prune);
+
+ /*
+ * Create ExprContext if not already done for the planstate. We may need
+ * an expression context to evaluate partition exprs.
+ */
+ ExecAssignExprContext(estate, planstate);
+ econtext = planstate->ps_ExprContext;
+ for (i = 0; i < prunestate->num_partprunedata; i++)
+ {
+ List *partrel_pruneinfos =
+ list_nth_node(List, pruneinfo->prune_infos, i);
+ PartitionPruningData *prunedata = prunestate->partprunedata[i];
+ int j;
+
+ for (j = 0; j < prunedata->num_partrelprunedata; j++)
+ {
+ PartitionedRelPruneInfo *pinfo =
+ list_nth_node(PartitionedRelPruneInfo, partrel_pruneinfos, j);
+ PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+ Relation partrel = pprune->partrel;
+ PartitionDesc partdesc;
+ PartitionKey partkey;
+
+ /*
+ * Nothing to do if there are no exec pruning steps, but do set
+ * pprune->exec_pruning_steps, because
+ * find_matching_subplans_recurse() looks at it.
+ *
+ * Also skip if doing EXPLAIN (GENERIC_PLAN), since parameter
+ * values may be missing.
+ */
+ pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
+ if (pprune->exec_pruning_steps == NIL ||
+ (econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
+ continue;
+
+ /*
+ * We can rely on the copies of the partitioned table's partition
+ * key and partition descriptor appearing in its relcache entry,
+ * because that entry will be held open and locked for the
+ * duration of this executor run.
+ */
+ partkey = RelationGetPartitionKey(partrel);
+ partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
+ partrel);
+ InitPartitionPruneContext(&pprune->exec_context,
+ pprune->exec_pruning_steps,
+ partdesc, partkey, planstate,
+ econtext);
+
+ /*
+ * Accumulate the IDs of all PARAM_EXEC Params affecting the
+ * partitioning decisions at this plan node.
+ */
+ prunestate->execparamids = bms_add_members(prunestate->execparamids,
+ pinfo->execparamids);
+ }
+ }
+}
+
/*
* ExecFindMatchingSubPlans
* Determine which subplans match the pruning steps detailed in
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 12aacc84ff..dc73de8738 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -58,6 +58,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
*/
typedef struct PartitionedRelPruningData
{
+ Relation partrel;
int nparts;
int *subplan_map;
int *subpart_map;
@@ -128,4 +129,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
+extern PartitionPruneState *CreatePartitionPruneState(EState *estate,
+ PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 49f1d56a5d..daf04dcf5c 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -636,6 +636,8 @@ typedef struct EState
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_states; /* List of PartitionPruneState */
+ List *es_part_prune_results; /* List of Bitmapset */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
--
2.43.0
[application/octet-stream] v53-0004-Handle-CachedPlan-invalidation-in-the-executor.patch (55.2K, 4-v53-0004-Handle-CachedPlan-invalidation-in-the-executor.patch)
From 10d54077987f8532c1d2a04d6004c1ec03450845 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 22 Aug 2024 19:38:13 +0900
Subject: [PATCH v53 4/4] Handle CachedPlan invalidation in the executor
This commit makes changes to handle cases where a cached plan
becomes invalid before deferred locks on prunable relations are taken.
* Add checks at various points in ExecutorStart() and its called
functions to determine if the plan becomes invalid. If detected,
the function and its callers return immediately. A previous commit
ensures any partially initialized PlanState tree objects are cleaned
up appropriately.
* Introduce ExecutorStartExt(), a wrapper over ExecutorStart(), to
handle cases where plan initialization is aborted due to invalidation.
ExecutorStartExt() creates a new transient CachedPlan if needed and
retries execution. This new entry point is only required for sites
using plancache.c. It requires passing the QueryDesc, eflags,
CachedPlanSource, and query_index (index in CachedPlanSource.query_list).
* Add GetSingleCachedPlan() in plancache.c to create a transient
CachedPlan for a specified query in the given CachedPlanSource.
Such CachedPlans are tracked in a separate global list for the
plancache invalidation callbacks to check.
This also adds isolation tests using the delay_execution test module
to verify scenarios where a CachedPlan becomes invalid before the
deferred locks are taken.
All ExecutorStart_hook implementations must now add the following
block after the ExecutorStart() call to ensure that they do not
proceed with an invalid plan:
/* The plan may have become invalid during ExecutorStart() */
if (!ExecPlanStillValid(queryDesc->estate))
return;
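
To make the interaction between the hook check and the retry loop concrete, here is a rough standalone sketch. It is a toy model under stated assumptions, not the patch's code: ToyPlanSource, ToyPlan, ToyQueryDesc and the toy_* functions are all hypothetical, and a counter of pending invalidations stands in for concurrent DDL arriving while deferred locks are taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for CachedPlanSource. */
typedef struct ToyPlanSource
{
    int         pending_invalidations;  /* concurrent changes yet to arrive */
    int         plans_built;            /* how many plans were created */
} ToyPlanSource;

/* Hypothetical stand-in for CachedPlan. */
typedef struct ToyPlan
{
    bool        is_valid;
} ToyPlan;

/* Hypothetical stand-in for QueryDesc. */
typedef struct ToyQueryDesc
{
    ToyPlanSource *source;
    ToyPlan     plan;
    bool        initialized;            /* did initialization complete? */
    int         hook_work_done;         /* times the hook ran its own logic */
} ToyQueryDesc;

/*
 * Models standard_ExecutorStart(): taking a deferred lock may reveal that
 * the plan is stale, in which case initialization stops early.
 */
void
toy_standard_start(ToyQueryDesc *qd)
{
    if (qd->source->pending_invalidations > 0)
    {
        qd->source->pending_invalidations--;
        qd->plan.is_valid = false;
        return;
    }
    qd->initialized = true;
}

/* Models an ExecutorStart_hook that carries the new validity check. */
void
toy_start_hook(ToyQueryDesc *qd)
{
    toy_standard_start(qd);

    /* The plan may have become invalid during toy_standard_start() */
    if (!qd->plan.is_valid)
        return;

    qd->hook_work_done++;       /* only runs against a valid plan */
}

/* Models the retry loop: clean up, replan, and try again. */
void
toy_start_ext(ToyQueryDesc *qd)
{
    assert(qd != NULL && qd->source != NULL);

    for (;;)
    {
        toy_start_hook(qd);
        if (qd->plan.is_valid)
            break;              /* start succeeded */

        /* Undo the partial initialization and build a fresh plan. */
        qd->initialized = false;
        qd->plan.is_valid = true;
        qd->source->plans_built++;
    }
}
```

In this model the hook's own logic runs exactly once, on the attempt that succeeds, however many times the plan had to be rebuilt. Without the early return, it would also run against each discarded plan.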
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
contrib/auto_explain/auto_explain.c | 4 +
.../pg_stat_statements/pg_stat_statements.c | 4 +
src/backend/commands/explain.c | 8 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 10 +-
src/backend/commands/trigger.c | 14 ++
src/backend/executor/README | 35 ++-
src/backend/executor/execMain.c | 84 ++++++-
src/backend/executor/execUtils.c | 3 +-
src/backend/executor/spi.c | 19 +-
src/backend/tcop/postgres.c | 4 +-
src/backend/tcop/pquery.c | 31 ++-
src/backend/utils/cache/plancache.c | 206 ++++++++++++++++++
src/backend/utils/mmgr/portalmem.c | 4 +-
src/include/commands/explain.h | 1 +
src/include/commands/trigger.h | 1 +
src/include/executor/execdesc.h | 1 +
src/include/executor/executor.h | 17 ++
src/include/nodes/execnodes.h | 1 +
src/include/utils/plancache.h | 26 +++
src/include/utils/portal.h | 4 +-
src/test/modules/delay_execution/Makefile | 3 +-
.../modules/delay_execution/delay_execution.c | 63 +++++-
.../expected/cached-plan-inval.out | 175 +++++++++++++++
src/test/modules/delay_execution/meson.build | 1 +
.../specs/cached-plan-inval.spec | 65 ++++++
26 files changed, 749 insertions(+), 36 deletions(-)
create mode 100644 src/test/modules/delay_execution/expected/cached-plan-inval.out
create mode 100644 src/test/modules/delay_execution/specs/cached-plan-inval.spec
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 677c135f59..9eb5e9a619 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -300,6 +300,10 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
if (auto_explain_enabled())
{
/*
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 362d222f63..026a3f1362 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -992,6 +992,10 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
/*
* If query has queryId zero, don't track it. This prevents double
* counting of optimizable statements that are directly contained in
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 13f5683cf6..ecae32c32b 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -507,7 +507,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, NULL, -1, into, es, queryString, params,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -616,6 +617,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
*/
void
ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int query_index,
IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
@@ -686,8 +688,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
if (into)
eflags |= GetIntoRelEFlags(into);
- /* call ExecutorStart to prepare the plan for execution */
- ExecutorStart(queryDesc, eflags);
+ /* Call ExecutorStartExt to prepare the plan for execution. */
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
/* Execute the plan for statistics if asked for */
if (es->analyze)
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 4f6acf6719..4b1503c05e 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ NULL,
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 311b9ebd5b..4cd79a6e3a 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -202,7 +202,8 @@ ExecuteQuery(ParseState *pstate,
query_string,
entry->plansource->commandTag,
plan_list,
- cplan);
+ cplan,
+ entry->plansource);
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
@@ -583,6 +584,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
MemoryContextCounters mem_counters;
MemoryContext planner_ctx = NULL;
MemoryContext saved_ctx = NULL;
+ int i = 0;
if (es->memory)
{
@@ -655,8 +657,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
- queryEnv,
+ ExplainOnePlan(pstmt, cplan, entry->plansource, i,
+ into, es, query_string, paramLI, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
@@ -668,6 +670,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Separate plans with an appropriate separator */
if (lnext(plan_list, p) != NULL)
ExplainSeparatePlans(es);
+
+ i++;
}
if (estate)
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 170360edda..91e4b821a0 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -5119,6 +5119,20 @@ AfterTriggerEndQuery(EState *estate)
afterTriggers.query_depth--;
}
+/* ----------
+ * AfterTriggerAbortQuery()
+ *
+ * Called by ExecutorEnd() if the query execution was aborted due to the
+ * plan becoming invalid during initialization.
+ * ----------
+ */
+void
+AfterTriggerAbortQuery(void)
+{
+ /* Revert the actions of AfterTriggerBeginQuery(). */
+ afterTriggers.query_depth--;
+}
+
/*
* AfterTriggerFreeQuery
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be61..c76a00b394 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -280,6 +280,28 @@ are typically reset to empty once per tuple. Per-tuple contexts are usually
associated with ExprContexts, and commonly each PlanState node has its own
ExprContext to evaluate its qual and targetlist expressions in.
+Relation Locking
+----------------
+
+Typically, when the executor initializes a plan tree for execution, it doesn't
+lock non-index relations if the plan tree is freshly generated and not derived
+from a CachedPlan. This is because such locks have already been established
+during the query's parsing, rewriting, and planning phases. However, with a
+cached plan tree, some relations may remain unlocked. The function
+AcquireExecutorLocks() only locks unprunable relations in the plan, deferring
+the locking of prunable ones to executor initialization. This avoids
+unnecessary locking of relations that will be pruned during "initial" runtime
+pruning in ExecDoInitialPruning().
+
+This approach creates a window where a cached plan tree with child tables
+could become outdated if another backend modifies these tables before
+ExecDoInitialPruning() locks them. As a result, the executor has the added duty
+to verify the plan tree's validity whenever it locks a child table after
+doing initial pruning. This validation is done by checking the CachedPlan.is_valid
+attribute. If the plan tree is outdated (is_valid=false), the executor halts
+further initialization, cleans up anything in EState that would have been
+allocated up to that point, and retries execution after recreating the
+invalid plan in the CachedPlan.
Query Processing Control Flow
-----------------------------
@@ -288,11 +310,13 @@ This is a sketch of control flow for full query processing:
CreateQueryDesc
- ExecutorStart
+ ExecutorStart or ExecutorStartExt
CreateExecutorState
creates per-query context
- switch to per-query context to run ExecInitNode
+ switch to per-query context to run ExecDoInitialPruning and ExecInitNode
AfterTriggerBeginQuery
+ ExecDoInitialPruning
+ does initial pruning and locks surviving partitions if needed
ExecInitNode --- recursively scans plan tree
ExecInitNode
recurse into subsidiary nodes
@@ -316,7 +340,12 @@ This is a sketch of control flow for full query processing:
FreeQueryDesc
-Per above comments, it's not really critical for ExecEndNode to free any
+As mentioned in the "Relation Locking" section, if the plan tree is found to
+be stale after locking partitions in ExecDoInitialPruning(), the control is
+immediately returned to ExecutorStartExt(), which will create a new plan tree
+and perform the steps starting from CreateExecutorState() again.
+
+Per above comments, it's not really critical for ExecEndPlan to free any
memory; it'll all go away in FreeExecutorState anyway. However, we do need to
be careful to close relations, drop buffer pins, etc, so we do need to scan
the plan state tree to find these sorts of resources.
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index cb5ed921d0..741801adb9 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -59,6 +59,7 @@
#include "utils/backend_status.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
+#include "utils/plancache.h"
#include "utils/rls.h"
#include "utils/snapmgr.h"
@@ -135,6 +136,60 @@ ExecutorStart(QueryDesc *queryDesc, int eflags)
standard_ExecutorStart(queryDesc, eflags);
}
+/*
+ * A variant of ExecutorStart() that handles cleanup and replanning if the
+ * input CachedPlan becomes invalid due to locks being taken during
+ * ExecutorStart(). If that happens, a new CachedPlan is created
+ * only for the query at the index 'query_index' in plansource->query_list, which
+ * is released separately from the original CachedPlan.
+ */
+void
+ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource,
+ int query_index)
+{
+ if (queryDesc->cplan == NULL)
+ {
+ ExecutorStart(queryDesc, eflags);
+ return;
+ }
+
+ while (1)
+ {
+ ExecutorStart(queryDesc, eflags);
+ if (!CachedPlanValid(queryDesc->cplan))
+ {
+ CachedPlan *cplan;
+
+ /*
+ * The plan got invalidated, so try with a new updated plan.
+ *
+ * But first undo what ExecutorStart() would've done. Mark
+ * execution as aborted to ensure that AFTER trigger state is
+ * properly reset.
+ */
+ queryDesc->estate->es_aborted = true;
+ ExecutorEnd(queryDesc);
+
+ cplan = GetSingleCachedPlan(plansource, query_index,
+ queryDesc->queryEnv);
+
+ /*
+ * Install the new transient cplan into the QueryDesc replacing
+ * the old one so that executor initialization code can see it.
+ * Mark it as in use by us and ask FreeQueryDesc() to release it.
+ */
+ cplan->refcount = 1;
+ queryDesc->cplan = cplan;
+ queryDesc->cplan_release = true;
+ queryDesc->plannedstmt = linitial_node(PlannedStmt,
+ queryDesc->cplan->stmt_list);
+ }
+ else
+ break; /* ExecutorStart() succeeded! */
+ }
+}
+
void
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
@@ -318,6 +373,7 @@ standard_ExecutorRun(QueryDesc *queryDesc,
estate = queryDesc->estate;
Assert(estate != NULL);
+ Assert(!estate->es_aborted);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/* caller must ensure the query's snapshot is active */
@@ -424,8 +480,11 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
Assert(estate != NULL);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
- /* This should be run once and only once per Executor instance */
- Assert(!estate->es_finished);
+ /*
+ * This should be run once and only once per Executor instance and never
+ * if the execution was aborted.
+ */
+ Assert(!estate->es_finished && !estate->es_aborted);
/* Switch into per-query memory context */
oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -484,11 +543,10 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
Assert(estate != NULL);
/*
- * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This
- * Assert is needed because ExecutorFinish is new as of 9.1, and callers
- * might forget to call it.
+ * Check that ExecutorFinish was called, unless in EXPLAIN-only mode or if
+ * execution was aborted.
*/
- Assert(estate->es_finished ||
+ Assert(estate->es_finished || estate->es_aborted ||
(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/*
@@ -502,6 +560,14 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
UnregisterSnapshot(estate->es_snapshot);
UnregisterSnapshot(estate->es_crosscheck_snapshot);
+ /*
+ * Reset AFTER trigger module if the query execution was aborted.
+ */
+ if (estate->es_aborted &&
+ !(estate->es_top_eflags &
+ (EXEC_FLAG_SKIP_TRIGGERS | EXEC_FLAG_EXPLAIN_ONLY)))
+ AfterTriggerAbortQuery();
+
/*
* Must switch out of context before destroying it
*/
@@ -948,6 +1014,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
ExecDoInitialPruning(estate);
+ if (!ExecPlanStillValid(estate))
+ return;
+
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
*/
@@ -2929,6 +2998,9 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
* the snapshot, rangetable, and external Param info. They need their own
* copies of local state, including a tuple table, es_param_exec_vals,
* result-rel info, etc.
+ *
+ * es_cachedplan is not copied because EPQ plan execution does not acquire
+ * any new locks that could invalidate the CachedPlan.
*/
rcestate->es_direction = ForwardScanDirection;
rcestate->es_snapshot = parentestate->es_snapshot;
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 67734979b0..435ae0df7a 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -147,6 +147,7 @@ CreateExecutorState(void)
estate->es_top_eflags = 0;
estate->es_instrument = 0;
estate->es_finished = false;
+ estate->es_aborted = false;
estate->es_exprcontexts = NIL;
@@ -757,7 +758,7 @@ ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos)
* ExecGetRangeTableRelation
* Open the Relation for a range table entry, if not already done
*
- * The Relations will be closed again in ExecEndPlan().
+ * The Relations will be closed in ExecEndPlan().
*/
Relation
ExecGetRangeTableRelation(EState *estate, Index rti)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 659bd6dcd9..f84f376c9c 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -70,7 +70,8 @@ static int _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
-static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount);
+static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index);
static void _SPI_error_callback(void *arg);
@@ -1682,7 +1683,8 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
- cplan);
+ cplan,
+ plansource);
/*
* Set up options for portal. Default SCROLL type is chosen the same way
@@ -2494,6 +2496,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
ListCell *lc2;
+ int i = 0;
spicallbackarg.query = plansource->query_string;
@@ -2691,8 +2694,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
options->params,
_SPI_current->queryEnv,
0);
- res = _SPI_pquery(qdesc, fire_triggers,
- canSetTag ? options->tcount : 0);
+
+ res = _SPI_pquery(qdesc, fire_triggers, canSetTag ? options->tcount : 0,
+ plansource, i);
FreeQueryDesc(qdesc);
}
else
@@ -2789,6 +2793,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
my_res = res;
goto fail;
}
+
+ i++;
}
/* Done with this plan, so release refcount */
@@ -2866,7 +2872,8 @@ _SPI_convert_params(int nargs, Oid *argtypes,
}
static int
-_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
+_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index)
{
int operation = queryDesc->operation;
int eflags;
@@ -2922,7 +2929,7 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
else
eflags = EXEC_FLAG_SKIP_TRIGGERS;
- ExecutorStart(queryDesc, eflags);
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
ExecutorRun(queryDesc, ForwardScanDirection, tcount, true);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8bc6bea113..ccbc27b575 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1237,6 +1237,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ NULL,
NULL);
/*
@@ -2027,7 +2028,8 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
- cplan);
+ cplan,
+ psrc);
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 6e8f6b1b8f..dbb0ffb771 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -19,6 +19,7 @@
#include "access/xact.h"
#include "commands/prepare.h"
+#include "executor/execdesc.h"
#include "executor/tstoreReceiver.h"
#include "miscadmin.h"
#include "pg_trace.h"
@@ -37,6 +38,8 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -80,6 +83,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
+ qd->cplan_release = false;
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -114,6 +118,13 @@ FreeQueryDesc(QueryDesc *qdesc)
UnregisterSnapshot(qdesc->snapshot);
UnregisterSnapshot(qdesc->crosscheck_snapshot);
+ /*
+ * Release CachedPlan if requested. The CachedPlan is not associated with
+ * a ResourceOwner when cplan_release is true; see ExecutorStartExt().
+ */
+ if (qdesc->cplan_release)
+ ReleaseCachedPlan(qdesc->cplan, NULL);
+
/* Only the QueryDesc itself need be freed */
pfree(qdesc);
}
@@ -126,6 +137,8 @@ FreeQueryDesc(QueryDesc *qdesc)
*
* plan: the plan tree for the query
* cplan: CachedPlan supplying the plan
+ * plansource: CachedPlanSource supplying the cplan
+ * query_index: index of the query in plansource->query_list
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -139,6 +152,8 @@ FreeQueryDesc(QueryDesc *qdesc)
static void
ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -157,7 +172,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Call ExecutorStart to prepare the plan for execution
*/
- ExecutorStart(queryDesc, 0);
+ ExecutorStartExt(queryDesc, 0, plansource, query_index);
/*
* Run the plan to completion.
@@ -518,9 +533,12 @@ PortalStart(Portal portal, ParamListInfo params,
myeflags = eflags;
/*
- * Call ExecutorStart to prepare the plan for execution
+ * Call ExecutorStartExt() to prepare the plan for execution.
+ * If the portal is using a cached plan, it may get invalidated
+ * during plan initialization, in which case a new one is
+ * created and saved in the QueryDesc.
*/
- ExecutorStart(queryDesc, myeflags);
+ ExecutorStartExt(queryDesc, myeflags, portal->plansource, 0);
/*
* This tells PortalCleanup to shut down the executor
@@ -1201,6 +1219,7 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
+ int i = 0;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1283,6 +1302,8 @@ PortalRunMulti(Portal portal,
/* statement can set tag string */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1293,6 +1314,8 @@ PortalRunMulti(Portal portal,
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1357,6 +1380,8 @@ PortalRunMulti(Portal portal,
*/
if (lnext(portal->stmts, stmtlist_item) != NULL)
CommandCounterIncrement();
+
+ i++;
}
/* Pop the snapshot if we pushed one. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5b75dadf13..d33f871ea2 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -94,6 +94,14 @@
*/
static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list);
+/*
+ * Head of the backend's list of "standalone" CachedPlans that are not
+ * associated with a CachedPlanSource, created by GetSingleCachedPlan() for
+ * transient use by the executor in certain scenarios where they're needed
+ * only for one execution of the plan.
+ */
+static dlist_head standalone_plan_list = DLIST_STATIC_INIT(standalone_plan_list);
+
/*
* This is the head of the backend's list of CachedExpressions.
*/
@@ -905,6 +913,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * Note: When changing this, you should also look at GetSingleCachedPlan().
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
@@ -1034,6 +1044,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
+ plan->is_standalone = false;
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1282,6 +1293,121 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
return plan;
}
+/*
+ * Create a fresh CachedPlan for the query_index'th query in the provided
+ * CachedPlanSource.
+ *
+ * The created CachedPlan is standalone, meaning it is not tracked in the
+ * CachedPlanSource. The CachedPlan and its plan trees are allocated in a
+ * child context of the caller's memory context. The caller must ensure they
+ * remain valid until execution is complete, after which the plan should be
+ * released by calling ReleaseCachedPlan().
+ *
+ * This function primarily supports ExecutorStartExt(), which handles cases
+ * where the original generic CachedPlan becomes invalid after prunable
+ * relations are locked.
+ */
+CachedPlan *
+GetSingleCachedPlan(CachedPlanSource *plansource, int query_index,
+ QueryEnvironment *queryEnv)
+{
+ List *query_list = plansource->query_list,
+ *plan_list;
+ CachedPlan *plan = plansource->gplan;
+ MemoryContext oldcxt = CurrentMemoryContext,
+ plan_context;
+ PlannedStmt *plannedstmt;
+
+ Assert(ActiveSnapshotSet());
+
+ /* Sanity checks */
+ if (plan == NULL)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context: plansource->gplan is NULL");
+ else if (plan->is_valid)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context: plansource->gplan->is_valid");
+
+ /*
+ * The plansource might have become invalid since GetCachedPlan(). See the
+ * comment in BuildCachedPlan() for details on why this might happen.
+ *
+ * The risk is greater here because this function is called from the
+ * executor, meaning much more processing may have occurred compared to
+ * when BuildCachedPlan() is called from GetCachedPlan().
+ */
+ if (!plansource->is_valid)
+ query_list = RevalidateCachedQuery(plansource, queryEnv);
+ Assert(query_list != NIL);
+
+ /*
+ * Build a new generic plan for the query_index'th query, but make a copy
+ * to be scribbled on by the planner
+ */
+ query_list = list_make1(copyObject(list_nth_node(Query, query_list,
+ query_index)));
+ plan_list = pg_plan_queries(query_list, plansource->query_string,
+ plansource->cursor_options, NULL);
+
+ list_free_deep(query_list);
+
+ /*
+ * Make a dedicated memory context for the CachedPlan and its subsidiary
+ * data so that we can release it in ReleaseCachedPlan() that will be
+ * called in FreeQueryDesc().
+ */
+ plan_context = AllocSetContextCreate(CurrentMemoryContext,
+ "Standalone CachedPlan",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextCopyAndSetIdentifier(plan_context, plansource->query_string);
+
+ /*
+ * Copy plan into the new context.
+ */
+ MemoryContextSwitchTo(plan_context);
+ plan_list = copyObject(plan_list);
+
+ /*
+ * Create and fill the CachedPlan struct within the new context.
+ */
+ plan = (CachedPlan *) palloc(sizeof(CachedPlan));
+ plan->magic = CACHEDPLAN_MAGIC;
+ plan->stmt_list = plan_list;
+
+ plan->planRoleId = GetUserId();
+ Assert(list_length(plan_list) == 1);
+ plannedstmt = linitial_node(PlannedStmt, plan_list);
+
+ /*
+ * CachedPlan is dependent on role either if RLS affected the rewrite
+ * phase or if a role dependency was injected during planning. And it's
+ * transient if any plan is marked so.
+ */
+ plan->dependsOnRole = plansource->dependsOnRLS || plannedstmt->dependsOnRole;
+ if (plannedstmt->transientPlan)
+ {
+ Assert(TransactionIdIsNormal(TransactionXmin));
+ plan->saved_xmin = TransactionXmin;
+ }
+ else
+ plan->saved_xmin = InvalidTransactionId;
+ plan->refcount = 0;
+ plan->context = plan_context;
+ plan->is_oneshot = false;
+ plan->is_generic = true;
+ plan->is_saved = false;
+ plan->is_valid = true;
+ plan->is_standalone = true;
+ plan->generation = 1;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * Add the entry to the global list of "standalone" cached plans. It is
+ * removed from the list by ReleaseCachedPlan().
+ */
+ dlist_push_tail(&standalone_plan_list, &plan->node);
+
+ return plan;
+}
+
/*
* ReleaseCachedPlan: release active use of a cached plan.
*
@@ -1309,6 +1435,10 @@ ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner)
/* Mark it no longer valid */
plan->magic = 0;
+ /* Remove from the global list if we are a standalone plan. */
+ if (plan->is_standalone)
+ dlist_delete(&plan->node);
+
/* One-shot plans do not own their context, so we can't free them */
if (!plan->is_oneshot)
MemoryContextDelete(plan->context);
@@ -2066,6 +2196,33 @@ PlanCacheRelCallback(Datum arg, Oid relid)
cexpr->is_valid = false;
}
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ if (cplan->is_valid)
+ {
+ ListCell *lc;
+
+ foreach(lc, cplan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue; /* Ignore utility statements */
+ if ((relid == InvalidOid) ? plannedstmt->relationOids != NIL :
+ list_member_oid(plannedstmt->relationOids, relid))
+ cplan->is_valid = false;
+ if (!cplan->is_valid)
+ break; /* out of stmt_list scan */
+ }
+ }
+ }
}
/*
@@ -2176,6 +2333,44 @@ PlanCacheObjectCallback(Datum arg, int cacheid, uint32 hashvalue)
}
}
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ if (cplan->is_valid)
+ {
+ ListCell *lc;
+
+ foreach(lc, cplan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+ ListCell *lc3;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue; /* Ignore utility statements */
+ foreach(lc3, plannedstmt->invalItems)
+ {
+ PlanInvalItem *item = (PlanInvalItem *) lfirst(lc3);
+
+ if (item->cacheId != cacheid)
+ continue;
+ if (hashvalue == 0 ||
+ item->hashValue == hashvalue)
+ {
+ cplan->is_valid = false;
+ break; /* out of invalItems scan */
+ }
+ }
+ if (!cplan->is_valid)
+ break; /* out of stmt_list scan */
+ }
+ }
+ }
}
/*
@@ -2235,6 +2430,17 @@ ResetPlanCache(void)
cexpr->is_valid = false;
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ cplan->is_valid = false;
+ }
}
/*
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 4a24613537..bf70fd4ce7 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,7 +284,8 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan)
+ CachedPlan *cplan,
+ CachedPlanSource *plansource)
{
Assert(PortalIsValid(portal));
Assert(portal->status == PORTAL_NEW);
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->commandTag = commandTag;
portal->stmts = stmts;
portal->cplan = cplan;
+ portal->plansource = plansource;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 21c71e0d53..a39989a950 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -104,6 +104,7 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ParamListInfo params, QueryEnvironment *queryEnv);
extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int plan_index,
IntoClause *into, ExplainState *es,
const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index 8a5a9fe642..db21561c8c 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -258,6 +258,7 @@ extern void ExecASTruncateTriggers(EState *estate,
extern void AfterTriggerBeginXact(void);
extern void AfterTriggerBeginQuery(void);
extern void AfterTriggerEndQuery(EState *estate);
+extern void AfterTriggerAbortQuery(void);
extern void AfterTriggerFireDeferred(void);
extern void AfterTriggerEndXact(bool isCommit);
extern void AfterTriggerBeginSubXact(void);
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0e7245435d..f6cb6479c0 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -36,6 +36,7 @@ typedef struct QueryDesc
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
+ bool cplan_release; /* Should FreeQueryDesc() release cplan? */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 69c3ebff00..5bc0edb5a0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -19,6 +19,7 @@
#include "nodes/lockoptions.h"
#include "nodes/parsenodes.h"
#include "utils/memutils.h"
+#include "utils/plancache.h"
/*
@@ -198,6 +199,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
* prototypes from functions in execMain.c
*/
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern void ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource, int query_index);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count, bool execute_once);
@@ -261,6 +264,19 @@ extern void ExecEndNode(PlanState *node);
extern void ExecShutdownNode(PlanState *node);
extern void ExecSetTupleBound(int64 tuples_needed, PlanState *child_node);
+/*
+ * Is the CachedPlan in es_cachedplan still valid?
+ *
+ * Called from InitPlan() because invalidation messages that affect the plan
+ * might be received after locks have been taken on runtime-prunable relations.
+ * The caller should take appropriate action if the plan has become invalid.
+ */
+static inline bool
+ExecPlanStillValid(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? true :
+ CachedPlanValid(estate->es_cachedplan);
+}
/* ----------------------------------------------------------------
* ExecProcNode
@@ -589,6 +605,7 @@ extern void ExecCreateScanSlotFromOuterPlan(EState *estate,
extern bool ExecRelationIsTargetRelation(EState *estate, Index scanrelid);
extern Relation ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags);
+extern Relation ExecOpenScanIndexRelation(EState *estate, Oid indexid, int lockmode);
extern void ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos);
extern void ExecCloseRangeTableRelations(EState *estate);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 181cf5ad09..aa984eee0f 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -685,6 +685,7 @@ typedef struct EState
int es_top_eflags; /* eflags passed to ExecutorStart */
int es_instrument; /* OR of InstrumentOption flags */
bool es_finished; /* true when ExecutorFinish is done */
+ bool es_aborted; /* true when execution was aborted */
List *es_exprcontexts; /* List of ExprContexts within EState */
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0b5ee007ca..154f68f671 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -18,6 +18,7 @@
#include "access/tupdesc.h"
#include "lib/ilist.h"
#include "nodes/params.h"
+#include "nodes/parsenodes.h"
#include "tcop/cmdtag.h"
#include "utils/queryenvironment.h"
#include "utils/resowner.h"
@@ -152,6 +153,8 @@ typedef struct CachedPlan
bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
+ bool is_standalone; /* is it not associated with a
+ * CachedPlanSource? */
Oid planRoleId; /* Role ID the plan was created for */
bool dependsOnRole; /* is plan specific to that role? */
TransactionId saved_xmin; /* if valid, replan when TransactionXmin
@@ -159,6 +162,12 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+
+ /*
+ * If the plan is not associated with a CachedPlanSource, it is saved in
+ * a separate global list.
+ */
+ dlist_node node; /* list link, if is_standalone */
} CachedPlan;
/*
@@ -224,6 +233,10 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
QueryEnvironment *queryEnv);
+extern CachedPlan *GetSingleCachedPlan(CachedPlanSource *plansource,
+ int query_index,
+ QueryEnvironment *queryEnv);
+
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
@@ -245,4 +258,17 @@ CachedPlanRequiresLocking(CachedPlan *cplan)
return cplan->is_generic;
}
+/*
+ * CachedPlanValid
+ * Returns whether a cached generic plan is still valid.
+ *
+ * Invoked by the executor to check if the plan has not been invalidated after
+ * taking locks during the initialization of the plan.
+ */
+static inline bool
+CachedPlanValid(CachedPlan *cplan)
+{
+ return cplan->is_valid;
+}
+
#endif /* PLANCACHE_H */
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 29f49829f2..58c3828d2c 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ CachedPlanSource *plansource; /* CachedPlanSource, for cplan */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -241,7 +242,8 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan);
+ CachedPlan *cplan,
+ CachedPlanSource *plansource);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
diff --git a/src/test/modules/delay_execution/Makefile b/src/test/modules/delay_execution/Makefile
index 70f24e846d..3eeb097fde 100644
--- a/src/test/modules/delay_execution/Makefile
+++ b/src/test/modules/delay_execution/Makefile
@@ -8,7 +8,8 @@ OBJS = \
delay_execution.o
ISOLATION = partition-addition \
- partition-removal-1
+ partition-removal-1 \
+ cached-plan-inval
ifdef USE_PGXS
PG_CONFIG = pg_config
diff --git a/src/test/modules/delay_execution/delay_execution.c b/src/test/modules/delay_execution/delay_execution.c
index 155c8a8d55..304ca77f7b 100644
--- a/src/test/modules/delay_execution/delay_execution.c
+++ b/src/test/modules/delay_execution/delay_execution.c
@@ -1,14 +1,18 @@
/*-------------------------------------------------------------------------
*
* delay_execution.c
- * Test module to allow delay between parsing and execution of a query.
+ * Test module to introduce delay at various points during execution of a
+ * query to test that execution proceeds safely in light of concurrent
+ * changes.
*
* The delay is implemented by taking and immediately releasing a specified
* advisory lock. If another process has previously taken that lock, the
* current process will be blocked until the lock is released; otherwise,
* there's no effect. This allows an isolationtester script to reliably
- * test behaviors where some specified action happens in another backend
- * between parsing and execution of any desired query.
+ * test behaviors where some specified action happens in another backend in
+ * a couple of cases: 1) between parsing and execution of any desired query
+ * when using the planner_hook, 2) between RevalidateCachedQuery() and
+ * ExecutorStart() when using the ExecutorStart_hook.
*
* Copyright (c) 2020-2024, PostgreSQL Global Development Group
*
@@ -22,6 +26,7 @@
#include <limits.h>
+#include "executor/executor.h"
#include "optimizer/planner.h"
#include "utils/builtins.h"
#include "utils/guc.h"
@@ -32,9 +37,11 @@ PG_MODULE_MAGIC;
/* GUC: advisory lock ID to use. Zero disables the feature. */
static int post_planning_lock_id = 0;
+static int executor_start_lock_id = 0;
-/* Save previous planner hook user to be a good citizen */
+/* Save previous hook users to be a good citizen */
static planner_hook_type prev_planner_hook = NULL;
+static ExecutorStart_hook_type prev_ExecutorStart_hook = NULL;
/* planner_hook function to provide the desired delay */
@@ -70,11 +77,41 @@ delay_execution_planner(Query *parse, const char *query_string,
return result;
}
+/* ExecutorStart_hook function to provide the desired delay */
+static void
+delay_execution_ExecutorStart(QueryDesc *queryDesc, int eflags)
+{
+ /* If enabled, delay by taking and releasing the specified lock */
+ if (executor_start_lock_id != 0)
+ {
+ DirectFunctionCall1(pg_advisory_lock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+ DirectFunctionCall1(pg_advisory_unlock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+
+ /*
+ * Ensure that we notice any pending invalidations, since the advisory
+ * lock functions don't do this.
+ */
+ AcceptInvalidationMessages();
+ }
+
+ /* Now start the executor, possibly via a previous hook user */
+ if (prev_ExecutorStart_hook)
+ prev_ExecutorStart_hook(queryDesc, eflags);
+ else
+ standard_ExecutorStart(queryDesc, eflags);
+
+ if (executor_start_lock_id != 0)
+ elog(NOTICE, "Finished ExecutorStart(): CachedPlan is %s",
+ CachedPlanValid(queryDesc->cplan) ? "valid" : "not valid");
+}
+
/* Module load function */
void
_PG_init(void)
{
- /* Set up the GUC to control which lock is used */
+ /* Set up GUCs to control which lock is used */
DefineCustomIntVariable("delay_execution.post_planning_lock_id",
"Sets the advisory lock ID to be locked/unlocked after planning.",
"Zero disables the delay.",
@@ -86,10 +123,22 @@ _PG_init(void)
NULL,
NULL,
NULL);
-
+ DefineCustomIntVariable("delay_execution.executor_start_lock_id",
+ "Sets the advisory lock ID to be locked/unlocked before starting execution.",
+ "Zero disables the delay.",
+ &executor_start_lock_id,
+ 0,
+ 0, INT_MAX,
+ PGC_USERSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
MarkGUCPrefixReserved("delay_execution");
- /* Install our hook */
+ /* Install our hooks. */
prev_planner_hook = planner_hook;
planner_hook = delay_execution_planner;
+ prev_ExecutorStart_hook = ExecutorStart_hook;
+ ExecutorStart_hook = delay_execution_ExecutorStart;
}
diff --git a/src/test/modules/delay_execution/expected/cached-plan-inval.out b/src/test/modules/delay_execution/expected/cached-plan-inval.out
new file mode 100644
index 0000000000..e8efb6d9d9
--- /dev/null
+++ b/src/test/modules/delay_execution/expected/cached-plan-inval.out
@@ -0,0 +1,175 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1prep s2lock s1exec s2dropi s2unlock
+step s1prep: SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1);
+QUERY PLAN
+------------------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = $1)
+(7 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+-------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Filter: (a = $1)
+(5 rows)
+
+
+starting permutation: s1prep2 s2lock s1exec2 s2dropi s2unlock
+step s1prep2: SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(11 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec2: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec2: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(6 rows)
+
+
+starting permutation: s1prep3 s2lock s1exec3 s2dropi s2unlock
+step s1prep3: SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ Update on foo3 foo
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(27 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec3: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec3: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ Update on foo3 foo
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(17 rows)
+
diff --git a/src/test/modules/delay_execution/meson.build b/src/test/modules/delay_execution/meson.build
index 41f3ac0b89..5a70b183d0 100644
--- a/src/test/modules/delay_execution/meson.build
+++ b/src/test/modules/delay_execution/meson.build
@@ -24,6 +24,7 @@ tests += {
'specs': [
'partition-addition',
'partition-removal-1',
+ 'cached-plan-inval',
],
},
}
diff --git a/src/test/modules/delay_execution/specs/cached-plan-inval.spec b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
new file mode 100644
index 0000000000..5b1f72b4a8
--- /dev/null
+++ b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
@@ -0,0 +1,65 @@
+# Test to check that invalidation of cached generic plans during ExecutorStart
+# correctly triggers replanning and re-execution.
+
+setup
+{
+ CREATE TABLE foo (a int, b text) PARTITION BY LIST(a);
+ CREATE TABLE foo12 PARTITION OF foo FOR VALUES IN (1, 2) PARTITION BY LIST (a);
+ CREATE TABLE foo12_1 PARTITION OF foo12 FOR VALUES IN (1);
+ CREATE TABLE foo12_2 PARTITION OF foo12 FOR VALUES IN (2);
+ CREATE INDEX foo12_1_a ON foo12_1 (a);
+ CREATE TABLE foo3 PARTITION OF foo FOR VALUES IN (3);
+ CREATE VIEW foov AS SELECT * FROM foo;
+ CREATE FUNCTION one () RETURNS int AS $$ BEGIN RETURN 1; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE FUNCTION two () RETURNS int AS $$ BEGIN RETURN 2; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE RULE update_foo AS ON UPDATE TO foo DO ALSO SELECT 1;
+}
+
+teardown
+{
+ DROP VIEW foov;
+ DROP RULE update_foo ON foo;
+ DROP TABLE foo;
+ DROP FUNCTION one(), two();
+}
+
+session "s1"
+# Append with run-time pruning
+step "s1prep" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+
+# Another case with Append with run-time pruning
+step "s1prep2" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+
+# Case with a rule adding another query
+step "s1prep3" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+# Executes a generic plan
+step "s1exec" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+step "s1exec2" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+step "s1exec3" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+session "s2"
+step "s2lock" { SELECT pg_advisory_lock(12345); }
+step "s2unlock" { SELECT pg_advisory_unlock(12345); }
+step "s2dropi" { DROP INDEX foo12_1_a; }
+
+# While "s1exec", etc. wait to acquire the advisory lock, "s2dropi" is able to
+# drop the index being used in the cached plan. When "s1exec" is then
+# unblocked and initializes the cached plan for execution, it detects the
+# concurrent index drop and causes the cached plan to be discarded and
+# recreated without the index.
+permutation "s1prep" "s2lock" "s1exec" "s2dropi" "s2unlock"
+permutation "s1prep2" "s2lock" "s1exec2" "s2dropi" "s2unlock"
+permutation "s1prep3" "s2lock" "s1exec3" "s2dropi" "s2unlock"
--
2.43.0
[application/octet-stream] v53-0003-Defer-locking-of-runtime-prunable-relations-to-e.patch (38.3K, 5-v53-0003-Defer-locking-of-runtime-prunable-relations-to-e.patch)
From 7a3e86a079be3fb6e6c8fa4a2ed8f0101fe0ee5b Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 7 Aug 2024 18:25:51 +0900
Subject: [PATCH v53 3/4] Defer locking of runtime-prunable relations to
executor
When preparing a cached plan for execution, plancache.c locks the
relations contained in the plan's range table to ensure it is safe for
execution. However, this simplistic approach, implemented in
AcquireExecutorLocks(), results in unnecessarily locking relations that
might be pruned during "initial" runtime pruning.

To optimize this, the locking is now deferred for relations that are
subject to "initial" runtime pruning. The planner now provides a set
of "unprunable" relations, available through the new
PlannedStmt.unprunableRelids field. AcquireExecutorLocks() will now
only lock those relations.

PlannedStmt.unprunableRelids is populated by subtracting the set of
initially prunable relids from the set of all RT indexes. The prunable
relids set is constructed by examining all PartitionPruneInfos during
set_plan_refs() and storing the RT indexes of partitions subject to
"initial" pruning steps. While at it, some duplicated code in
set_append_references() and set_mergeappend_references() that
constructs the prunable relids set has been refactored into a common
function.

To enable the executor to determine whether the plan tree it's
executing is a cached one, the CachedPlan is now made available via
the QueryDesc. The executor can call CachedPlanRequiresLocking(),
which returns true if the CachedPlan is a reusable generic plan that
might contain relations needing to be locked. If so, the executor
will lock any relation that is not in PlannedStmt.unprunableRelids.

Finally, an Assert has been added in ExecCheckPermissions() to ensure
that all relations whose permissions are checked have been properly
locked. This helps catch any accidental omission of relations from the
unprunableRelids set that should have their permissions checked.

This deferment introduces a window in which prunable relations may be
altered by concurrent DDL, potentially causing the plan to become
invalid. As a result, the executor might attempt to run an invalid plan,
leading to errors such as being unable to locate a partition-only index
during ExecInitIndexScan(). Future commits will introduce changes to
ready the executor to check plan validity during ExecutorStart() and
retry with a newly created plan if the original one becomes invalid
after taking deferred locks.
---
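Reader's note, not part of the patch: the split between plancache-time and
executor-time locking described in the commit message can be sketched in
isolation. The sketch below is only an illustration under stated assumptions.
RelidSet, relid_range(), unprunable_relids() and executor_locks() are
hypothetical names, and a 64-bit mask stands in for PostgreSQL's Bitmapset.
It mirrors the idea of "all RT indexes minus the initially prunable ones",
not the actual code in planner.c or execMain.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

/* Build the set {lo, ..., hi} of 1-based RT indexes (at most 63 in this sketch). */
static RelidSet
relid_range(int lo, int hi)
{
	RelidSet	s = 0;

	assert(lo >= 1 && hi <= 63);
	for (int i = lo; i <= hi; i++)
		s |= (RelidSet) 1 << i;
	return s;
}

/* Planner side: every RT index except those subject to "initial" pruning. */
static RelidSet
unprunable_relids(int rtable_len, RelidSet prunable)
{
	return relid_range(1, rtable_len) & ~prunable;
}

/*
 * Executor side: lock only the prunable relations that survived initial
 * pruning, and only when the plan is a reused generic plan (assumed to be
 * what requires_locking conveys here).
 */
static RelidSet
executor_locks(RelidSet prunable, RelidSet survivors, bool requires_locking)
{
	return requires_locking ? (prunable & survivors) : 0;
}
```

With five range-table entries and RT indexes 3 to 5 prunable, the plancache
step in this sketch would lock indexes 1 and 2 up front, and the executor
step would lock only index 4 if that is the sole survivor of initial pruning.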
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 ++--
src/backend/commands/extension.c | 1 +
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 3 +-
src/backend/executor/execMain.c | 45 +++++++++++++++++++++++++-
src/backend/executor/execParallel.c | 9 +++++-
src/backend/executor/execPartition.c | 37 ++++++++++++++++++---
src/backend/executor/functions.c | 1 +
src/backend/executor/nodeAppend.c | 8 ++---
src/backend/executor/nodeMergeAppend.c | 2 +-
src/backend/executor/spi.c | 1 +
src/backend/optimizer/plan/planner.c | 2 ++
src/backend/optimizer/plan/setrefs.c | 11 +++++++
src/backend/partitioning/partprune.c | 20 +++++++++++-
src/backend/tcop/pquery.c | 10 +++++-
src/backend/utils/cache/plancache.c | 40 ++++++++++++++---------
src/include/commands/explain.h | 5 +--
src/include/executor/execPartition.h | 9 +++++-
src/include/executor/execdesc.h | 2 ++
src/include/nodes/execnodes.h | 2 ++
src/include/nodes/pathnodes.h | 6 ++++
src/include/nodes/plannodes.h | 11 +++++++
src/include/utils/plancache.h | 10 ++++++
25 files changed, 209 insertions(+), 39 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 91de442f43..db976f928a 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -552,7 +552,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 0b629b1f79..57a3375cad 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -324,7 +324,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 2819e479f8..13f5683cf6 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -507,7 +507,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -615,7 +615,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage,
@@ -671,7 +672,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, cplan, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index fab59ad5f6..bd169edeff 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -742,6 +742,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 010097873d..69be74b4bd 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -438,7 +438,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 07257d4db9..311b9ebd5b 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -655,7 +655,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
+ ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index dceef322af..cb5ed921d0 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -53,6 +53,7 @@
#include "miscadmin.h"
#include "parser/parse_relation.h"
#include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
#include "tcop/utility.h"
#include "utils/acl.h"
#include "utils/backend_status.h"
@@ -90,6 +91,7 @@ static bool ExecCheckPermissionsModified(Oid relOid, Oid userid,
AclMode requiredPerms);
static void ExecCheckXactReadOnly(PlannedStmt *plannedstmt);
static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static inline bool ExecShouldLockRelations(EState *estate);
/* end of local decls */
@@ -598,6 +600,21 @@ ExecCheckPermissions(List *rangeTable, List *rteperminfos,
(rte->rtekind == RTE_SUBQUERY &&
rte->relkind == RELKIND_VIEW));
+ /*
+ * Ensure that we have at least an AccessShareLock on relations
+ * whose permissions need to be checked.
+ *
+ * Skip this check in a parallel worker because locks won't be
+ * taken until ExecInitNode() performs plan initialization.
+ *
+ * XXX: ExecCheckPermissions() in a parallel worker may be
+ * redundant with the checks done in the leader process, so this
+ * should be reviewed to ensure it's necessary.
+ */
+ Assert(IsParallelWorker() ||
+ CheckRelationOidLockedByMe(rte->relid, AccessShareLock,
+ true));
+
(void) getRTEPermissionInfo(rteperminfos, rte);
/* Many-to-one mapping not allowed */
Assert(!bms_is_member(rte->perminfoindex, indexset));
@@ -860,11 +877,35 @@ ExecDoInitialPruning(EState *estate)
* result.
*/
if (prunestate->do_initial_prune)
- validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ {
+ List *leaf_partition_oids = NIL;
+
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+ &leaf_partition_oids);
+ if (ExecShouldLockRelations(estate))
+ {
+ ListCell *lc1;
+
+ foreach(lc1, leaf_partition_oids)
+ {
+ LockRelationOid(lfirst_oid(lc1), prunestate->lockmode);
+ }
+ }
+ }
estate->es_part_prune_results = lappend(estate->es_part_prune_results,
validsubplans);
}
}
+/*
+ * Locks might be needed only if running a cached plan that might contain
+ * unlocked relations, such as reused generic plans.
+ */
+static inline bool
+ExecShouldLockRelations(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? false :
+ CachedPlanRequiresLocking(estate->es_cachedplan);
+}
/* ----------------------------------------------------------------
* InitPlan
@@ -878,6 +919,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ CachedPlan *cachedplan = queryDesc->cplan;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -897,6 +939,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_cachedplan = cachedplan;
/*
* Perform runtime "initial" pruning to determine the plan nodes that will
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index b01a2fdfdd..7519c9a860 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1257,8 +1257,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
- /* Create a QueryDesc for the query. */
+ /*
+ * Create a QueryDesc for the query. We pass NULL for cachedplan, because
+ * we don't have a pointer to the CachedPlan in the leader's process. It's
+ * fine because the only reason the executor needs to see it is to decide
+ * if it should take locks on certain relations, but parallel workers
+ * always take locks anyway.
+ */
return CreateQueryDesc(pstmt,
+ NULL,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 08b1f3d030..861e64856d 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -26,6 +26,7 @@
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
+#include "storage/lmgr.h"
#include "utils/acl.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
@@ -196,7 +197,8 @@ static void PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ List **leaf_part_oids);
/*
@@ -1927,6 +1929,7 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
ALLOCSET_DEFAULT_SIZES);
i = 0;
+ prunestate->lockmode = NoLock;
foreach(lc, pruneinfo->prune_infos)
{
List *partrelpruneinfos = lfirst_node(List, lc);
@@ -1950,6 +1953,15 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
PartitionDesc partdesc;
PartitionKey partkey;
+ /*
+ * Assign the lock mode of the first (root) partitioned table's RTE
+ * as the lock mode to lock leaf partitions after initial pruning,
+ * if needed.
+ */
+ if (prunestate->lockmode == NoLock)
+ prunestate->lockmode = exec_rt_fetch(pinfo->rtindex, estate)->rellockmode;
+ Assert(prunestate->lockmode != NoLock);
+
/*
* We can rely on the copies of the partitioned table's partition
* key and partition descriptor appearing in its relcache entry,
@@ -1982,6 +1994,9 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
pprune->partrel = partrel;
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->relid_map = palloc(sizeof(Oid) * partdesc->nparts);
+ memcpy(pprune->relid_map, partdesc->oids,
+ sizeof(Oid) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts &&
memcmp(partdesc->oids, pinfo->relid_map,
@@ -2399,10 +2414,13 @@ PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * leaf_part_oids must be non-NULL if initial_prune is true.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ List **leaf_part_oids)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2437,7 +2455,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, leaf_part_oids);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2451,6 +2469,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (leaf_part_oids)
+ *leaf_part_oids = list_copy(*leaf_part_oids);
MemoryContextReset(prunestate->prune_context);
@@ -2467,7 +2487,8 @@ static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ List **leaf_part_oids)
{
Bitmapset *partset;
int i;
@@ -2494,8 +2515,13 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ if (leaf_part_oids)
+ *leaf_part_oids = lappend_oid(*leaf_part_oids,
+ pprune->relid_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2503,7 +2529,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ leaf_part_oids);
else
{
/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 692854e2b3..6f6f45e0ad 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -840,6 +840,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
dest = None_Receiver;
es->qd = CreateQueryDesc(es->stmt,
+ NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index de7ebab5c2..006bdafaea 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -581,7 +581,7 @@ choose_next_subplan_locally(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
}
@@ -648,7 +648,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
/*
@@ -724,7 +724,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
mark_invalid_subplans_as_finished(node);
@@ -877,7 +877,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
classify_matching_subplans(node);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 3ed91808dd..f7821aa178 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -219,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 90d9834576..659bd6dcd9 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2684,6 +2684,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
snap = InvalidSnapshot;
qdesc = CreateQueryDesc(stmt,
+ cplan,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 1b9071c774..9e47a7fd50 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -549,6 +549,8 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
+ result->unprunableRelids = bms_difference(bms_add_range(NULL, 1, list_length(result->rtable)),
+ glob->prunableRelids);
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index e2ea406c4e..8ce6d1149d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1764,8 +1764,19 @@ register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+ Bitmapset *present_leafpart_rtis = prelinfo->present_leafpart_rtis;
prelinfo->rtindex += rtoffset;
+ present_leafpart_rtis = offset_relid_set(present_leafpart_rtis,
+ rtoffset);
+ if (prelinfo->initial_pruning_steps != NIL)
+ glob->prunableRelids = bms_add_members(glob->prunableRelids,
+ present_leafpart_rtis);
+ /*
+ * Don't need this anymore, so set to NULL to save space in the
+ * final plan tree.
+ */
+ prelinfo->present_leafpart_rtis = NULL;
}
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 60fabb1734..c022c5ee0b 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -641,6 +641,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
PartitionedRelPruneInfo *pinfo = lfirst(lc);
RelOptInfo *subpart = find_base_rel(root, pinfo->rtindex);
Bitmapset *present_parts;
+ Bitmapset *present_leafpart_rtis;
int nparts = subpart->nparts;
int *subplan_map;
int *subpart_map;
@@ -657,7 +658,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
- present_parts = NULL;
+ present_parts = present_leafpart_rtis = NULL;
i = -1;
while ((i = bms_next_member(subpart->live_parts, i)) >= 0)
@@ -671,9 +672,25 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+
+ /*
+ * Track the RT indexes of partitions to ensure they are included
+ * in the prunableRelids set of relations that are locked during
+ * execution. This ensures that if the plan is cached, these
+ * partitions are locked when the plan is reused.
+ *
+ * Partitions without a subplan and sub-partitioned partitions
+ * where none of the sub-partitions have a subplan due to
+ * constraint exclusion are not included in this set. Instead,
+ * they are added to the unprunableRelids set, and the relations
+ * in this set are locked by AcquireExecutorLocks() before
+ * executing a cached plan.
+ */
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
+ present_leafpart_rtis = bms_add_member(present_leafpart_rtis,
+ partrel->relid);
/* Record finding this subplan */
subplansfound = bms_add_member(subplansfound, subplanidx);
@@ -691,6 +708,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Record the maps and other information. */
pinfo->present_parts = present_parts;
+ pinfo->present_leafpart_rtis = present_leafpart_rtis;
pinfo->nparts = nparts;
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a1f8d03db1..6e8f6b1b8f 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -36,6 +36,7 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +66,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +79,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * cplan: CachedPlan supplying the plan
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, cplan, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -493,6 +498,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->cplan,
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1276,6 +1282,7 @@ PortalRunMulti(Portal portal,
{
/* statement can set tag string */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1285,6 +1292,7 @@ PortalRunMulti(Portal portal,
{
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5af1a168ec..5b75dadf13 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -104,7 +104,8 @@ static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
static bool CheckCachedPlan(CachedPlanSource *plansource);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
@@ -815,8 +816,11 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
* Caller must have already called RevalidateCachedQuery to verify that the
* querytree is up to date.
*
- * On a "true" return, we have acquired the locks needed to run the plan.
- * (We must do this for the "true" result to be race-condition-free.)
+ * On a "true" return, we have acquired locks on the "unprunableRelids" set
+ * for all plans in plansource->stmt_list. The plans are not completely
+ * race-condition-free until the executor takes locks on the set of prunable
+ * relations that survive initial runtime pruning during executor
+ * initialization.
*/
static bool
CheckCachedPlan(CachedPlanSource *plansource)
@@ -893,10 +897,10 @@ CheckCachedPlan(CachedPlanSource *plansource)
* or it can be set to NIL if we need to re-copy the plansource's query_list.
*
* To build a generic, parameter-value-independent plan, pass NULL for
- * boundParams. To build a custom plan, pass the actual parameter values via
- * boundParams. For best effect, the PARAM_FLAG_CONST flag should be set on
- * each parameter value; otherwise the planner will treat the value as a
- * hint rather than a hard constant.
+ * boundParams, and true for generic. To build a custom plan, pass the actual
+ * parameter values via boundParams, and false for generic. For best effect,
+ * the PARAM_FLAG_CONST flag should be set on each parameter value; otherwise
+ * the planner will treat the value as a hint rather than a hard constant.
*
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
@@ -904,7 +908,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic)
{
CachedPlan *plan;
List *plist;
@@ -1026,6 +1031,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->refcount = 0;
plan->context = plan_context;
plan->is_oneshot = plansource->is_oneshot;
+ plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
@@ -1196,7 +1202,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv, true);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1241,7 +1247,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, false);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1387,8 +1393,8 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
}
/*
- * Reject if AcquireExecutorLocks would have anything to do. This is
- * probably unnecessary given the previous check, but let's be safe.
+ * Reject if there are any lockable relations. This is probably
+ * unnecessary given the previous check, but let's be safe.
*/
foreach(lc, plan->stmt_list)
{
@@ -1776,7 +1782,7 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ int rtindex;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1794,9 +1800,13 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ rtindex = -1;
+ while ((rtindex = bms_next_member(plannedstmt->unprunableRelids,
+ rtindex)) >= 0)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ RangeTblEntry *rte = list_nth_node(RangeTblEntry,
+ plannedstmt->rtable,
+ rtindex - 1);
if (!(rte->rtekind == RTE_RELATION ||
(rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid))))
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 3ab0aae78f..21c71e0d53 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -103,8 +103,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
- ExplainState *es, const char *queryString,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
+ const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
const BufferUsage *bufusage,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index dc73de8738..ca39fa1feb 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -45,6 +45,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * relid_map Partition OID by partition index.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -62,6 +63,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Oid *relid_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -91,6 +93,9 @@ typedef struct PartitionPruningData
* the clauses being unable to match to any tuple that the subplan could
* possibly produce.
*
+ * lockmode Lock mode to lock the leaf partitions with, if needed;
+ * this is the same as the lock mode that the root partitioned
+ * table would be locked with.
* execparamids Contains paramids of PARAM_EXEC Params found within
* any of the partprunedata structs. Pruning must be
* done again each time the value of one of these
@@ -113,6 +118,7 @@ typedef struct PartitionPruningData
*/
typedef struct PartitionPruneState
{
+ int lockmode;
Bitmapset *execparamids;
Bitmapset *other_subplans;
MemoryContext prune_context;
@@ -128,7 +134,8 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ List **leaf_part_oids);
extern PartitionPruneState *CreatePartitionPruneState(EState *estate,
PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0a7274e26c..0e7245435d 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index daf04dcf5c..181cf5ad09 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -42,6 +42,7 @@
#include "storage/condition_variable.h"
#include "utils/hsearch.h"
#include "utils/queryenvironment.h"
+#include "utils/plancache.h"
#include "utils/reltrigger.h"
#include "utils/sharedtuplestore.h"
#include "utils/snapshot.h"
@@ -635,6 +636,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ CachedPlan *es_cachedplan;
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
List *es_part_prune_states; /* List of PartitionPruneState */
List *es_part_prune_results; /* List of Bitmapset */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 8d30b6e896..cc2190ea63 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -116,6 +116,12 @@ typedef struct PlannerGlobal
/* "flat" rangetable for executor */
List *finalrtable;
+ /*
+ * RT indexes of relations subject to removal from the plan due to runtime
+ * pruning at plan initialization time
+ */
+ Bitmapset *prunableRelids;
+
/* "flat" list of RTEPermissionInfos */
List *finalrteperminfos;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 39d0281c23..4f552550c8 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -74,6 +74,10 @@ typedef struct PlannedStmt
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *unprunableRelids; /* RT indexes of relations that are not
+ * subject to runtime pruning; for
+ * AcquireExecutorLocks() */
+
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
@@ -1465,6 +1469,13 @@ typedef struct PartitionedRelPruneInfo
/* Indexes of all partitions which subplans or subparts are present for */
Bitmapset *present_parts;
+ /*
+ * RT indexes of all leaf partitions for which subplans are present; only used
+ * during planning to help in the construction of PlannerGlobal.prunableRelids
+ * and set to NULL afterwards to save space in the final plan tree.
+ */
+ Bitmapset *present_leafpart_rtis;
+
/* Length of the following arrays: */
int nparts;
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a90dfdf906..0b5ee007ca 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -149,6 +149,7 @@ typedef struct CachedPlan
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
bool is_oneshot; /* is it a "oneshot" plan? */
+ bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
Oid planRoleId; /* Role ID the plan was created for */
@@ -235,4 +236,13 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
extern CachedExpression *GetCachedExpression(Node *expr);
extern void FreeCachedExpression(CachedExpression *cexpr);
+/*
+ * CachedPlanRequiresLocking: should the executor acquire locks?
+ */
+static inline bool
+CachedPlanRequiresLocking(CachedPlan *cplan)
+{
+ return cplan->is_generic;
+}
+
#endif /* PLANCACHE_H */
--
2.43.0
^ permalink raw reply [nested|flat] 29+ messages in thread
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-20 14:53 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-21 12:45 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-21 13:10 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-23 12:48 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-29 13:34 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-09-17 12:57 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
@ 2024-09-19 08:39 ` Amit Langote <[email protected]>
2024-09-19 12:10 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Amit Langote @ 2024-09-19 08:39 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Tue, Sep 17, 2024 at 9:57 PM Amit Langote <[email protected]> wrote:
> On Thu, Aug 29, 2024 at 10:34 PM Amit Langote <[email protected]> wrote:
> > One idea that I think might be worth trying to reduce the footprint of
> > 0003 is to try to lock the prunable relations in a step of InitPlan()
> > separate from ExecInitNode(), which can be implemented by doing the
> > initial runtime pruning in that separate step. That way, we'll have
> > all the necessary locks before calling ExecInitNode() and so we don't
> > need to sprinkle the CachedPlanStillValid() checks all over the place
> > and worry about missed checks and dealing with partially initialized
> > PlanState trees.
>
> I've worked on this and found that it results in a much simpler design.
>
> Attached are 0001 and 0002, which contain patches to refactor the
> runtime pruning code. These changes move initial pruning outside of
> ExecInitNode() and use the results during ExecInitNode() to determine
> the set of child subnodes to initialize.
>
> With that in place, the patches (0003, 0004) that move the locking of
> prunable relations from plancache.c into the executor become simpler.
> They no longer need to modify any code called by ExecInitNode(). Since
> no new locks are taken during ExecInitNode(), I didn't have to worry
> about changing all the code involved in PlanState tree initialization
> to add checks for CachedPlan validity. The check is only needed after
> performing initial pruning, and if the CachedPlan is invalid,
> ExecInitNode() won’t be called in the first place.
Sorry, I had missed merging some hunks into 0002 that fixed obsolete
comments. Fixed in the attached v54.
Regarding 0002, I was a bit bothered by the need to add a new function
just to iterate over the PartitionPruningDatas and the
PartitionedRelPruningData they contain, solely to initialize the
PartitionPruneContext needed for exec pruning. To address this, I
propose 0003, which moves the initialization of those contexts to be
done "lazily" in find_matching_subplans_recurse(), where they are
actually used. To make this work, I added an is_valid flag to
PartitionPruneContext, which is checked as follows in the code block
where it's initialized:
+ if (unlikely(!pprune->exec_context.is_valid))
I didn't notice any overhead from adding this check to
find_matching_subplans_recurse(), which is called for every instance
of exec pruning, so I think 0003 is worth considering.
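To illustrate the lazy-initialization idea, here is a minimal standalone
sketch, not code from the patch. Only the is_valid flag and the
unlikely() check come from the description above; DemoPruneContext,
demo_init_context() and demo_prune() are hypothetical names, and
unlikely() is defined locally as a stand-in for the PostgreSQL macro.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-in for PostgreSQL's unlikely() branch hint (assumption). */
#if defined(__GNUC__)
#define unlikely(x) __builtin_expect((x) != 0, 0)
#else
#define unlikely(x) (x)
#endif

/* Hypothetical stand-in for PartitionPruneContext. */
typedef struct DemoPruneContext
{
	bool		is_valid;	/* false until the context is first used */
	int			nparts;		/* example payload filled in at init time */
} DemoPruneContext;

/* Counts how often the (expensive) initialization actually runs. */
static int	demo_init_calls = 0;

static void
demo_init_context(DemoPruneContext *ctx, int nparts)
{
	ctx->nparts = nparts;
	ctx->is_valid = true;
	demo_init_calls++;
}

/*
 * Called once per pruning pass.  The context is set up on first use
 * only; later calls pay for a single predictable branch.
 */
static int
demo_prune(DemoPruneContext *ctx, int nparts)
{
	if (unlikely(!ctx->is_valid))
		demo_init_context(ctx, nparts);

	return ctx->nparts;
}
```

A zero-initialized context starts out invalid, so no separate pass over
all contexts is needed before the first pruning call.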
I realized that I had missed considering, in the
delay-locking-to-executor patch (now 0004), that there may be plan
objects belonging to pruned partitions, such as RowMarks and
ResultRelInfos, which should not be initialized.
ExecGetRangeTableRelation() invoked with the RT indexes in these
objects would cause crashes in Assert builds since the pruned
partitions would not have been locked. I've updated the patch to
ignore RowMarks and result relations (in ModifyTable.resultRelations)
for pruned child relations, which required adding more accounting info
to EState to store the bitmapset of unpruned RT indexes. For
ResultRelInfos, I took the approach of memsetting them to 0 for pruned
result relations and adding checks at multiple sites to ensure the
ResultRelInfo being handled is valid. I recall previously proposing
lazy initialization for these objects when first needed [1], which
would make the added code unnecessary, but I might save that for
another time.
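The skip-if-pruned check can be sketched roughly as follows. This is a
standalone illustration under stated assumptions, not the patch's code:
DemoRelids is a hypothetical 64-bit mask standing in for the Bitmapset
of unpruned RT indexes kept in EState, and the demo_* functions are
invented for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of RT indexes (valid for 0..63). */
typedef uint64_t DemoRelids;

static DemoRelids
demo_add_member(DemoRelids set, int rti)
{
	return set | (UINT64_C(1) << rti);
}

static bool
demo_is_member(DemoRelids set, int rti)
{
	return ((set >> rti) & 1) != 0;
}

/*
 * Walk a list of result-relation RT indexes and count the ones that
 * would be initialized.  Relations missing from the unpruned set were
 * removed by initial pruning and were never locked, so they are skipped
 * instead of being opened.
 */
static int
demo_count_initialized(const int *result_rtis, int n, DemoRelids unpruned)
{
	int			count = 0;

	for (int i = 0; i < n; i++)
	{
		if (!demo_is_member(unpruned, result_rtis[i]))
			continue;			/* pruned: not locked, do not touch */
		count++;
	}

	return count;
}
```

The same membership test would apply to RowMarks; lazy initialization of
these objects would make such checks unnecessary.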
--
Thanks, Amit Langote
[1] https://postgr.es/m/[email protected]
Attachments:
[application/x-patch] v54-0003-Defer-locking-of-runtime-prunable-relations-to-e.patch (38.3K, 2-v54-0003-Defer-locking-of-runtime-prunable-relations-to-e.patch)
From 9ee8384daaff650ebea44e590ced0885fd2be8e3 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 7 Aug 2024 18:25:51 +0900
Subject: [PATCH v54 3/4] Defer locking of runtime-prunable relations to
executor
When preparing a cached plan for execution, plancache.c locks the
relations contained in the plan's range table to ensure it is safe for
execution. However, this simplistic approach, implemented in
AcquireExecutorLocks(), results in unnecessarily locking relations that
might be pruned during "initial" runtime pruning.
To optimize this, the locking is now deferred for relations that are
subject to "initial" runtime pruning. The planner now provides a set
of "unprunable" relations, available through the new
PlannedStmt.unprunableRelids field. AcquireExecutorLocks() will now
only lock those relations.
PlannedStmt.unprunableRelids is populated by subtracting the set of
initially prunable relids from the set of all RT indexes. The prunable
relids set is constructed by examining all PartitionPruneInfos during
set_plan_refs() and storing the RT indexes of partitions subject to
"initial" pruning steps. While at it, some duplicated code in
set_append_references() and set_mergeappend_references() that
constructs the prunable relids set has been refactored into a common
function.
To enable the executor to determine whether the plan tree it's
executing is a cached one, the CachedPlan is now made available via
the QueryDesc. The executor can call CachedPlanRequiresLocking(),
which returns true if the CachedPlan is a reusable generic plan that
might contain relations needing to be locked. If so, the executor
will lock any relation that is not in PlannedStmt.unprunableRelids.
Finally, an Assert has been added in ExecCheckPermissions() to ensure
that all relations whose permissions are checked have been properly
locked. This helps catch any accidental omission of relations from the
unprunableRelids set that should have their permissions checked.
This deferment introduces a window in which prunable relations may be
altered by concurrent DDL, potentially causing the plan to become
invalid. As a result, the executor might attempt to run an invalid plan,
leading to errors such as being unable to locate a partition-only index
during ExecInitIndexScan(). Future commits will introduce changes to
ready the executor to check plan validity during ExecutorStart() and
retry with a newly created plan if the original one becomes invalid
after taking deferred locks.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 ++--
src/backend/commands/extension.c | 1 +
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 3 +-
src/backend/executor/execMain.c | 45 +++++++++++++++++++++++++-
src/backend/executor/execParallel.c | 9 +++++-
src/backend/executor/execPartition.c | 37 ++++++++++++++++++---
src/backend/executor/functions.c | 1 +
src/backend/executor/nodeAppend.c | 8 ++---
src/backend/executor/nodeMergeAppend.c | 2 +-
src/backend/executor/spi.c | 1 +
src/backend/optimizer/plan/planner.c | 2 ++
src/backend/optimizer/plan/setrefs.c | 11 +++++++
src/backend/partitioning/partprune.c | 20 +++++++++++-
src/backend/tcop/pquery.c | 10 +++++-
src/backend/utils/cache/plancache.c | 40 ++++++++++++++---------
src/include/commands/explain.h | 5 +--
src/include/executor/execPartition.h | 9 +++++-
src/include/executor/execdesc.h | 2 ++
src/include/nodes/execnodes.h | 2 ++
src/include/nodes/pathnodes.h | 6 ++++
src/include/nodes/plannodes.h | 11 +++++++
src/include/utils/plancache.h | 10 ++++++
25 files changed, 209 insertions(+), 39 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 91de442f43..db976f928a 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -552,7 +552,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 0b629b1f79..57a3375cad 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -324,7 +324,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index aaec439892..49f7370734 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -509,7 +509,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -617,7 +617,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage,
@@ -673,7 +674,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, cplan, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index fab59ad5f6..bd169edeff 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -742,6 +742,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 010097873d..69be74b4bd 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -438,7 +438,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 07257d4db9..311b9ebd5b 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -655,7 +655,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
+ ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 8fab8dbccd..cb7a2bc456 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -53,6 +53,7 @@
#include "miscadmin.h"
#include "parser/parse_relation.h"
#include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
#include "tcop/utility.h"
#include "utils/acl.h"
#include "utils/backend_status.h"
@@ -90,6 +91,7 @@ static bool ExecCheckPermissionsModified(Oid relOid, Oid userid,
AclMode requiredPerms);
static void ExecCheckXactReadOnly(PlannedStmt *plannedstmt);
static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static inline bool ExecShouldLockRelations(EState *estate);
/* end of local decls */
@@ -600,6 +602,21 @@ ExecCheckPermissions(List *rangeTable, List *rteperminfos,
(rte->rtekind == RTE_SUBQUERY &&
rte->relkind == RELKIND_VIEW));
+ /*
+ * Ensure that we have at least an AccessShareLock on relations
+ * whose permissions need to be checked.
+ *
+ * Skip this check in a parallel worker because locks won't be
+ * taken until ExecInitNode() performs plan initialization.
+ *
+ * XXX: ExecCheckPermissions() in a parallel worker may be
+ * redundant with the checks done in the leader process, so this
+ * should be reviewed to ensure it’s necessary.
+ */
+ Assert(IsParallelWorker() ||
+ CheckRelationOidLockedByMe(rte->relid, AccessShareLock,
+ true));
+
(void) getRTEPermissionInfo(rteperminfos, rte);
/* Many-to-one mapping not allowed */
Assert(!bms_is_member(rte->perminfoindex, indexset));
@@ -862,11 +879,35 @@ ExecDoInitialPruning(EState *estate)
* result.
*/
if (prunestate->do_initial_prune)
- validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ {
+ List *leaf_partition_oids = NIL;
+
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+ &leaf_partition_oids);
+ if (ExecShouldLockRelations(estate))
+ {
+ ListCell *lc1;
+
+ foreach(lc1, leaf_partition_oids)
+ {
+ LockRelationOid(lfirst_oid(lc1), prunestate->lockmode);
+ }
+ }
+ }
estate->es_part_prune_results = lappend(estate->es_part_prune_results,
validsubplans);
}
}
+/*
+ * Locks might be needed only if running a cached plan that might contain
+ * unlocked relations, such as reused generic plans.
+ */
+static inline bool
+ExecShouldLockRelations(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? false :
+ CachedPlanRequiresLocking(estate->es_cachedplan);
+}
/* ----------------------------------------------------------------
* InitPlan
@@ -880,6 +921,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ CachedPlan *cachedplan = queryDesc->cplan;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -899,6 +941,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_cachedplan = cachedplan;
/*
* Perform runtime "initial" pruning to determine the plan nodes that will
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index b01a2fdfdd..7519c9a860 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1257,8 +1257,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
- /* Create a QueryDesc for the query. */
+ /*
+ * Create a QueryDesc for the query. We pass NULL for cachedplan, because
+ * we don't have a pointer to the CachedPlan in the leader's process. It's
+ * fine because the only reason the executor needs to see it is to decide
+ * if it should take locks on certain relations, but parallel workers
+ * always take locks anyway.
+ */
return CreateQueryDesc(pstmt,
+ NULL,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d205e64e84..f958973378 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -26,6 +26,7 @@
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
+#include "storage/lmgr.h"
#include "utils/acl.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
@@ -196,7 +197,8 @@ static void PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ List **leaf_part_oids);
/*
@@ -1940,6 +1942,7 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
ALLOCSET_DEFAULT_SIZES);
i = 0;
+ prunestate->lockmode = NoLock;
foreach(lc, pruneinfo->prune_infos)
{
List *partrelpruneinfos = lfirst_node(List, lc);
@@ -1963,6 +1966,15 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
PartitionDesc partdesc;
PartitionKey partkey;
+ /*
+ * Assign the lock mode of the first (root) partitioned table's RTE
+ * as the lock mode to lock leaf partitions after initial pruning,
+ * if needed.
+ */
+ if (prunestate->lockmode == NoLock)
+ prunestate->lockmode = exec_rt_fetch(pinfo->rtindex, estate)->rellockmode;
+ Assert(prunestate->lockmode != NoLock);
+
/*
* We can rely on the copies of the partitioned table's partition
* key and partition descriptor appearing in its relcache entry,
@@ -1995,6 +2007,9 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
pprune->partrel = partrel;
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->relid_map = palloc(sizeof(Oid) * partdesc->nparts);
+ memcpy(pprune->relid_map, partdesc->oids,
+ sizeof(Oid) * partdesc->nparts);
if (partdesc->nparts == pinfo->nparts &&
memcmp(partdesc->oids, pinfo->relid_map,
@@ -2412,10 +2427,13 @@ PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * leaf_part_oids must be non-NULL if initial_prune is true.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ List **leaf_part_oids)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2450,7 +2468,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
*/
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunedata, pprune, initial_prune,
- &result);
+ &result, leaf_part_oids);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_pruning_steps)
@@ -2464,6 +2482,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (leaf_part_oids)
+ *leaf_part_oids = list_copy(*leaf_part_oids);
MemoryContextReset(prunestate->prune_context);
@@ -2480,7 +2500,8 @@ static void
find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ List **leaf_part_oids)
{
Bitmapset *partset;
int i;
@@ -2507,8 +2528,13 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ if (leaf_part_oids)
+ *leaf_part_oids = lappend_oid(*leaf_part_oids,
+ pprune->relid_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2516,7 +2542,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
if (partidx >= 0)
find_matching_subplans_recurse(prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ leaf_part_oids);
else
{
/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 692854e2b3..6f6f45e0ad 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -840,6 +840,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
dest = None_Receiver;
es->qd = CreateQueryDesc(es->stmt,
+ NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index de7ebab5c2..006bdafaea 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -581,7 +581,7 @@ choose_next_subplan_locally(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
}
@@ -648,7 +648,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
/*
@@ -724,7 +724,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
mark_invalid_subplans_as_finished(node);
@@ -877,7 +877,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
classify_matching_subplans(node);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 3ed91808dd..f7821aa178 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -219,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 90d9834576..659bd6dcd9 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2684,6 +2684,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
snap = InvalidSnapshot;
qdesc = CreateQueryDesc(stmt,
+ cplan,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 1b9071c774..9e47a7fd50 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -549,6 +549,8 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
+ result->unprunableRelids = bms_difference(bms_add_range(NULL, 1, list_length(result->rtable)),
+ glob->prunableRelids);
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index e2ea406c4e..8ce6d1149d 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1764,8 +1764,19 @@ register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+ Bitmapset *present_leafpart_rtis = prelinfo->present_leafpart_rtis;
prelinfo->rtindex += rtoffset;
+ present_leafpart_rtis = offset_relid_set(present_leafpart_rtis,
+ rtoffset);
+ if (prelinfo->initial_pruning_steps != NIL)
+ glob->prunableRelids = bms_add_members(glob->prunableRelids,
+ present_leafpart_rtis);
+ /*
+ * Don't need this anymore, so set to NULL to save space in the
+ * final plan tree.
+ */
+ prelinfo->present_leafpart_rtis = NULL;
}
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 60fabb1734..c022c5ee0b 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -641,6 +641,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
PartitionedRelPruneInfo *pinfo = lfirst(lc);
RelOptInfo *subpart = find_base_rel(root, pinfo->rtindex);
Bitmapset *present_parts;
+ Bitmapset *present_leafpart_rtis;
int nparts = subpart->nparts;
int *subplan_map;
int *subpart_map;
@@ -657,7 +658,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
- present_parts = NULL;
+ present_parts = present_leafpart_rtis = NULL;
i = -1;
while ((i = bms_next_member(subpart->live_parts, i)) >= 0)
@@ -671,9 +672,25 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+
+ /*
+ * Track the RT indexes of partitions to ensure they are included
+ * in the prunableRelids set of relations that are locked during
+ * execution. This ensures that if the plan is cached, these
+ * partitions are locked when the plan is reused.
+ *
+ * Partitions without a subplan and sub-partitioned partitions
+ * where none of the sub-partitions have a subplan due to
+ * constraint exclusion are not included in this set. Instead,
+ * they are added to the unprunableRelids set, and the relations
+ * in this set are locked by AcquireExecutorLocks() before
+ * executing a cached plan.
+ */
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
+ present_leafpart_rtis = bms_add_member(present_leafpart_rtis,
+ partrel->relid);
/* Record finding this subplan */
subplansfound = bms_add_member(subplansfound, subplanidx);
@@ -691,6 +708,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
/* Record the maps and other information. */
pinfo->present_parts = present_parts;
+ pinfo->present_leafpart_rtis = present_leafpart_rtis;
pinfo->nparts = nparts;
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a1f8d03db1..6e8f6b1b8f 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -36,6 +36,7 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +66,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +79,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * cplan: CachedPlan supplying the plan
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, cplan, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -493,6 +498,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->cplan,
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1276,6 +1282,7 @@ PortalRunMulti(Portal portal,
{
/* statement can set tag string */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1285,6 +1292,7 @@ PortalRunMulti(Portal portal,
{
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5af1a168ec..5b75dadf13 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -104,7 +104,8 @@ static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
static bool CheckCachedPlan(CachedPlanSource *plansource);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
@@ -815,8 +816,11 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
* Caller must have already called RevalidateCachedQuery to verify that the
* querytree is up to date.
*
- * On a "true" return, we have acquired the locks needed to run the plan.
- * (We must do this for the "true" result to be race-condition-free.)
+ * On a "true" return, we have acquired locks on the "unprunableRelids" set
+ * for all plans in plansource->stmt_list. The plans are not completely
+ * race-condition-free until the executor takes locks on the set of prunable
+ * relations that survive initial runtime pruning during executor
+ * initialization.
*/
static bool
CheckCachedPlan(CachedPlanSource *plansource)
@@ -893,10 +897,10 @@ CheckCachedPlan(CachedPlanSource *plansource)
* or it can be set to NIL if we need to re-copy the plansource's query_list.
*
* To build a generic, parameter-value-independent plan, pass NULL for
- * boundParams. To build a custom plan, pass the actual parameter values via
- * boundParams. For best effect, the PARAM_FLAG_CONST flag should be set on
- * each parameter value; otherwise the planner will treat the value as a
- * hint rather than a hard constant.
+ * boundParams, and true for generic. To build a custom plan, pass the actual
+ * parameter values via boundParams, and false for generic. For best effect,
+ * the PARAM_FLAG_CONST flag should be set on each parameter value; otherwise
+ * the planner will treat the value as a hint rather than a hard constant.
*
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
@@ -904,7 +908,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic)
{
CachedPlan *plan;
List *plist;
@@ -1026,6 +1031,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->refcount = 0;
plan->context = plan_context;
plan->is_oneshot = plansource->is_oneshot;
+ plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
@@ -1196,7 +1202,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv, true);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1241,7 +1247,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, false);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1387,8 +1393,8 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
}
/*
- * Reject if AcquireExecutorLocks would have anything to do. This is
- * probably unnecessary given the previous check, but let's be safe.
+ * Reject if there are any lockable relations. This is probably
+ * unnecessary given the previous check, but let's be safe.
*/
foreach(lc, plan->stmt_list)
{
@@ -1776,7 +1782,7 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ int rtindex;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1794,9 +1800,13 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ rtindex = -1;
+ while ((rtindex = bms_next_member(plannedstmt->unprunableRelids,
+ rtindex)) >= 0)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ RangeTblEntry *rte = list_nth_node(RangeTblEntry,
+ plannedstmt->rtable,
+ rtindex - 1);
if (!(rte->rtekind == RTE_RELATION ||
(rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid))))
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 3ab0aae78f..21c71e0d53 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -103,8 +103,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
- ExplainState *es, const char *queryString,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
+ const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
const BufferUsage *bufusage,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index c0ba23097f..496ecef4c4 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -48,6 +48,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * relid_map Partition OID by partition index.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -65,6 +66,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ Oid *relid_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -94,6 +96,9 @@ typedef struct PartitionPruningData
* the clauses being unable to match to any tuple that the subplan could
* possibly produce.
*
+ * lockmode Lock mode to lock the leaf partitions with, if needed;
+ * this is the same as the lock mode that the root partitioned
+ * table would be locked with.
* execparamids Contains paramids of PARAM_EXEC Params found within
* any of the partprunedata structs. Pruning must be
* done again each time the value of one of these
@@ -116,6 +121,7 @@ typedef struct PartitionPruningData
*/
typedef struct PartitionPruneState
{
+ int lockmode;
Bitmapset *execparamids;
Bitmapset *other_subplans;
MemoryContext prune_context;
@@ -131,7 +137,8 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ List **leaf_part_oids);
extern PartitionPruneState *CreatePartitionPruneState(EState *estate,
PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0a7274e26c..0e7245435d 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 518a9fcd15..1ed925b99b 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -42,6 +42,7 @@
#include "storage/condition_variable.h"
#include "utils/hsearch.h"
#include "utils/queryenvironment.h"
+#include "utils/plancache.h"
#include "utils/reltrigger.h"
#include "utils/sharedtuplestore.h"
#include "utils/snapshot.h"
@@ -636,6 +637,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ CachedPlan *es_cachedplan; /* CachedPlan supplying es_plannedstmt */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
List *es_part_prune_states; /* List of PartitionPruneState */
List *es_part_prune_results; /* List of Bitmapset */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 8d30b6e896..cc2190ea63 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -116,6 +116,12 @@ typedef struct PlannerGlobal
/* "flat" rangetable for executor */
List *finalrtable;
+ /*
+ * RT indexes of relations subject to removal from the plan due to runtime
+ * pruning at plan initialization time
+ */
+ Bitmapset *prunableRelids;
+
/* "flat" list of RTEPermissionInfos */
List *finalrteperminfos;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 39d0281c23..4f552550c8 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -74,6 +74,10 @@ typedef struct PlannedStmt
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *unprunableRelids; /* RT indexes of relations that are not
+ * subject to runtime pruning; for
+ * AcquireExecutorLocks() */
+
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
@@ -1465,6 +1469,13 @@ typedef struct PartitionedRelPruneInfo
/* Indexes of all partitions which subplans or subparts are present for */
Bitmapset *present_parts;
+ /*
+ * RT indexes of all leaf partitions for which subplans are present; only
+ * used during planning to help construct PlannerGlobal.prunableRelids, and
+ * set to NULL afterwards to save space in the final plan tree.
+ */
+ Bitmapset *present_leafpart_rtis;
+
/* Length of the following arrays: */
int nparts;
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a90dfdf906..0b5ee007ca 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -149,6 +149,7 @@ typedef struct CachedPlan
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
bool is_oneshot; /* is it a "oneshot" plan? */
+ bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
Oid planRoleId; /* Role ID the plan was created for */
@@ -235,4 +236,13 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
extern CachedExpression *GetCachedExpression(Node *expr);
extern void FreeCachedExpression(CachedExpression *cexpr);
+/*
+ * CachedPlanRequiresLocking: should the executor acquire locks?
+ */
+static inline bool
+CachedPlanRequiresLocking(CachedPlan *cplan)
+{
+ return cplan->is_generic;
+}
+
#endif /* PLANCACHE_H */
--
2.43.0
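A note for readers of the AcquireExecutorLocks() hunk in the patch above: the loop now walks a set of RT indexes (unprunableRelids) rather than every entry in the range table, and converts each 1-based RT index into a 0-based list position. The following is only a rough standalone sketch of that iteration pattern. It assumes a 64-bit mask in place of a real Bitmapset, and TinySet, TinyRTE, tiny_next_member() and collect_unprunable() are hypothetical names made up for illustration; none of them are PostgreSQL APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset, limited to members 0..63. */
typedef uint64_t TinySet;

/* Hypothetical range-table entry: only the relation OID is modeled. */
typedef struct TinyRTE
{
	unsigned	relid;
} TinyRTE;

/*
 * Return the smallest member greater than prev, or -1 if there is none.
 * This mirrors the calling convention of bms_next_member(): start with
 * prev = -1 and feed each result back in.
 */
static int
tiny_next_member(TinySet set, int prev)
{
	for (int i = prev + 1; i < 64; i++)
	{
		if (set & (UINT64_C(1) << i))
			return i;
	}
	return -1;
}

/*
 * Copy the relids of the entries named in 'unprunable' into out[] and
 * return how many were copied.  RT indexes are 1-based while the array
 * is 0-based, hence the "rtindex - 1".
 */
static size_t
collect_unprunable(const TinyRTE *rtable, TinySet unprunable, unsigned *out)
{
	size_t		n = 0;
	int			rtindex = -1;

	while ((rtindex = tiny_next_member(unprunable, rtindex)) >= 0)
	{
		assert(rtindex >= 1);	/* assumption: 0 is never a valid RT index */
		out[n++] = rtable[rtindex - 1].relid;
	}
	return n;
}
```

In this sketch, entries whose RT index is absent from the set are never visited, which is the point of the change: relations that may be pruned at executor startup are left for the executor to lock later.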
[application/x-patch] v54-0002-Perform-runtime-initial-pruning-outside-ExecInit.patch (17.2K, 3-v54-0002-Perform-runtime-initial-pruning-outside-ExecInit.patch)
From 1deb363d5d0b7573d116198798a3e550be9a320f Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 12 Sep 2024 15:44:43 +0900
Subject: [PATCH v54 2/4] Perform runtime initial pruning outside
ExecInitNode()
This commit follows up on the previous change that moved
PartitionPruneInfos out of individual plan nodes into a list in
PlannedStmt. It moves the initialization of PartitionPruneStates
and runtime initial pruning out of ExecInitNode() and into a new
routine, ExecDoInitialPruning(), which is called by InitPlan()
before ExecInitNode() is invoked on the main plan tree and subplans.
ExecDoInitialPruning() stores the PartitionPruneStates in a list
matching the length of es_part_prune_infos (which holds the
PartitionPruneInfos from PlannedStmt), allowing both lists to share
the same index. It also saves the initial pruning result -- a
bitmapset of indexes for surviving child subnodes -- in a similarly
indexed list.
While the initial pruning is done earlier, the execution pruning
context information (needed for runtime pruning) is initialized
later during ExecInitNode() for the parent plan node, as it requires
access to the parent node's PlanState struct.
---
src/backend/executor/execMain.c | 55 ++++++++
src/backend/executor/execPartition.c | 179 +++++++++++++++++++++------
src/include/executor/execPartition.h | 6 +
src/include/nodes/execnodes.h | 2 +
4 files changed, 202 insertions(+), 40 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index e6197c165e..8fab8dbccd 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -46,6 +46,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/executor.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "mb/pg_wchar.h"
@@ -818,6 +819,54 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
PreventCommandIfParallelMode(CreateCommandName((Node *) plannedstmt));
}
+/*
+ * ExecDoInitialPruning
+ * Perform runtime "initial" pruning, if necessary, to determine the set
+ * of child subnodes that need to be initialized during ExecInitNode()
+ * for plan nodes that support partition pruning.
+ *
+ * For each PartitionPruneInfo in estate->es_part_prune_infos, this function
+ * creates a PartitionPruneState (even if no initial pruning is done) and adds
+ * it to es_part_prune_states. For PartitionPruneInfo entries that include
+ * initial pruning steps, the result of those steps is saved as a bitmapset
+ * of indexes representing child subnodes that are "valid" and should be
+ * initialized for execution.
+ */
+static void
+ExecDoInitialPruning(EState *estate)
+{
+ ListCell *lc;
+
+ foreach(lc, estate->es_part_prune_infos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+ PartitionPruneState *prunestate;
+ Bitmapset *validsubplans = NULL;
+
+ /*
+ * Create the working data structure for pruning, and save it for use
+ * later in ExecInitPartitionPruning(), which will be called by the
+ * parent plan node's ExecInit* function.
+ */
+ prunestate = CreatePartitionPruneState(estate, pruneinfo);
+ estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+ prunestate);
+
+ /*
+ * Perform an initial partition pruning pass, if necessary, and save
+ * the bitmapset of valid subplans for use in
+ * ExecInitPartitionPruning(). If no initial pruning is performed, we
+ * still store a NULL to ensure that es_part_prune_results is the same
+ * length as es_part_prune_infos. This ensures that
+ * ExecInitPartitionPruning() can use the same index to locate the
+ * result.
+ */
+ if (prunestate->do_initial_prune)
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ estate->es_part_prune_results = lappend(estate->es_part_prune_results,
+ validsubplans);
+ }
+}
/* ----------------------------------------------------------------
* InitPlan
@@ -850,7 +899,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+
+ /*
+ * Perform runtime "initial" pruning to determine the plan nodes that will
+ * not be executed.
+ */
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ ExecDoInitialPruning(estate);
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index ec730674f2..d205e64e84 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -181,8 +181,6 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
-static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -192,6 +190,9 @@ static void InitPartitionPruneContext(PartitionPruneContext *context,
static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
Bitmapset *initially_valid_subplans,
int n_total_subplans);
+static void PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
+ PartitionPruneState *prunestate,
+ PlanState *planstate);
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
@@ -1783,20 +1784,26 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
/*
* ExecInitPartitionPruning
- * Initialize data structure needed for run-time partition pruning and
- * do initial pruning if needed
+ * Initialize the data structures needed for runtime "exec" partition
+ * pruning and return the result of initial pruning, if available.
*
* 'root_parent_relids' identifies the relation to which both the parent plan
- * and the PartitionPruneInfo given by 'part_prune_index' belong.
+ * and the PartitionPruneInfo associated with 'part_prune_index' belong.
*
- * On return, *initially_valid_subplans is assigned the set of indexes of
- * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * The PartitionPruneState would have been created by ExecDoInitialPruning()
+ * and stored as the part_prune_index'th element of EState.es_part_prune_states.
+ * Here, we initialize only the PartitionPruneContext necessary for execution
+ * pruning.
*
- * If subplans are indeed pruned, subplan_map arrays contained in the returned
- * PartitionPruneState are re-sequenced to not count those, though only if the
- * maps will be needed for subsequent execution pruning passes.
+ * On return, *initially_valid_subplans is assigned the set of indexes of child
+ * subplans that must be initialized alongside the parent plan node. Initial
+ * pruning would have been performed by ExecDoInitialPruning() if necessary,
+ * and the bitmapset of surviving subplans' indexes would have been stored as
+ * the part_prune_index'th element of EState.es_part_prune_results.
+ *
+ * If subplans are pruned, the subplan_map arrays in the returned
+ * PartitionPruneState are re-sequenced to exclude those subplans, but only if
+ * the maps will be needed for subsequent execution pruning passes.
*/
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
@@ -1821,17 +1828,21 @@ ExecInitPartitionPruning(PlanState *planstate,
bmsToString(root_parent_relids),
bmsToString(pruneinfo->root_parent_relids)));
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
-
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
-
/*
- * Perform an initial partition prune pass, if required.
+ * ExecDoInitialPruning() must have initialized the PartitionPruneState to
+ * perform the initial pruning. Now we simply need to initialize the
+ * context information for exec pruning.
*/
+ prunestate = list_nth(estate->es_part_prune_states, part_prune_index);
+ Assert(prunestate != NULL);
+ if (prunestate->do_exec_prune)
+ PartitionPruneInitExecPruning(pruneinfo, prunestate, planstate);
+
+ /* Use the result of initial pruning done by ExecDoInitialPruning(). */
if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ *initially_valid_subplans = list_nth_node(Bitmapset,
+ estate->es_part_prune_results,
+ part_prune_index);
else
{
/* No pruning, so we'll need to initialize all subplans */
@@ -1877,16 +1888,23 @@ ExecInitPartitionPruning(PlanState *planstate,
* stored in each PartitionedRelPruningData can be re-used each time we
* re-evaluate which partitions match the pruning steps provided in each
* PartitionedRelPruneInfo.
+ *
+ * Note that we only initialize the PartitionPruneContext (which is placed into
+ * each PartitionedRelPruningData) for initial pruning here. Execution pruning
+ * requires access to the parent plan node's PlanState, which is not available
+ * when this function is called from ExecDoInitialPruning(), so it is
+ * initialized later during ExecInitPartitionPruning() by calling
+ * PartitionPruneInitExecPruning().
*/
-static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+PartitionPruneState *
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
{
- EState *estate = planstate->state;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
+ /* We may need an expression context to evaluate partition exprs */
+ ExprContext *econtext = CreateExprContext(estate);
/* For data reading, executor always includes detached partitions */
if (estate->es_partition_directory == NULL)
@@ -1974,6 +1992,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* set to -1, as if they were pruned. By construction, both
* arrays are in partition bounds order.
*/
+ pprune->partrel = partrel;
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
@@ -2073,29 +2092,31 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
- partdesc, partkey, planstate,
+ partdesc, partkey, NULL,
econtext);
/* Record whether initial pruning is needed at any level */
prunestate->do_initial_prune = true;
}
- pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps &&
- !(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
- {
- InitPartitionPruneContext(&pprune->exec_context,
- pinfo->exec_pruning_steps,
- partdesc, partkey, planstate,
- econtext);
- /* Record whether exec pruning is needed at any level */
- prunestate->do_exec_prune = true;
- }
/*
- * Accumulate the IDs of all PARAM_EXEC Params affecting the
- * partitioning decisions at this plan node.
+ * The exec pruning context will be initialized in
+ * ExecInitPartitionPruning() when called during the initialization
+ * of the parent plan node.
+ *
+ * pprune->exec_pruning_steps is set to NIL to prevent
+ * ExecFindMatchingSubPlans() from accessing an uninitialized
+ * pprune->exec_context during the initial pruning by
+ * ExecDoInitialPruning().
+ *
+ * prunestate->do_exec_prune is set to indicate whether
+ * PartitionPruneInitExecPruning() needs to be called by
+ * ExecInitPartitionPruning(). This optimization avoids
+ * unnecessary cycles when only initial pruning is required.
*/
- prunestate->execparamids = bms_add_members(prunestate->execparamids,
- pinfo->execparamids);
+ pprune->exec_pruning_steps = NIL;
+ if (pinfo->exec_pruning_steps &&
+ !(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
+ prunestate->do_exec_prune = true;
j++;
}
@@ -2305,6 +2326,84 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
pfree(new_subplan_indexes);
}
+/*
+ * PartitionPruneInitExecPruning
+ * Initialize PartitionPruneState for exec pruning.
+ */
+static void
+PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
+ PartitionPruneState *prunestate,
+ PlanState *planstate)
+{
+ EState *estate = planstate->state;
+ int i;
+ ExprContext *econtext;
+
+ /* CreatePartitionPruneState() must have set up the partition directory. */
+ Assert(estate->es_partition_directory != NULL);
+
+ /* CreatePartitionPruneState() must have set this. */
+ Assert(prunestate->do_exec_prune);
+
+ /*
+ * Create ExprContext if not already done for the planstate. We may need
+ * an expression context to evaluate partition exprs.
+ */
+ ExecAssignExprContext(estate, planstate);
+ econtext = planstate->ps_ExprContext;
+ for (i = 0; i < prunestate->num_partprunedata; i++)
+ {
+ List *partrel_pruneinfos =
+ list_nth_node(List, pruneinfo->prune_infos, i);
+ PartitionPruningData *prunedata = prunestate->partprunedata[i];
+ int j;
+
+ for (j = 0; j < prunedata->num_partrelprunedata; j++)
+ {
+ PartitionedRelPruneInfo *pinfo =
+ list_nth_node(PartitionedRelPruneInfo, partrel_pruneinfos, j);
+ PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+ Relation partrel = pprune->partrel;
+ PartitionDesc partdesc;
+ PartitionKey partkey;
+
+ /*
+ * Nothing to do if there are no exec pruning steps, but do set
+ * pprune->exec_pruning_steps, because
+ * find_matching_subplans_recurse() looks at it.
+ *
+ * Also skip if doing EXPLAIN (GENERIC_PLAN), since parameter
+ * values may be missing.
+ */
+ pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
+ if (pprune->exec_pruning_steps == NIL ||
+ (econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
+ continue;
+
+ /*
+ * We can rely on the copies of the partitioned table's partition
+ * key and partition descriptor appearing in its relcache entry,
+ * because that entry will be held open and locked for the
+ * duration of this executor run.
+ */
+ partkey = RelationGetPartitionKey(partrel);
+ partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
+ partrel);
+ InitPartitionPruneContext(&pprune->exec_context,
+ pprune->exec_pruning_steps,
+ partdesc, partkey, planstate,
+ econtext);
+
+ /*
+ * Accumulate the IDs of all PARAM_EXEC Params affecting the
+ * partitioning decisions at this plan node.
+ */
+ prunestate->execparamids = bms_add_members(prunestate->execparamids,
+ pinfo->execparamids);
+ }
+ }
+}
+
/*
* ExecFindMatchingSubPlans
* Determine which subplans match the pruning steps detailed in
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 12aacc84ff..c0ba23097f 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -42,6 +42,9 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* PartitionedRelPruneInfo (see plannodes.h); though note that here,
* subpart_map contains indexes into PartitionPruningData.partrelprunedata[].
*
+ * partrel Partitioned table; points to
+ * EState.es_relations[rti-1], where rti is the
+ * table's RT index
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
@@ -58,6 +61,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
*/
typedef struct PartitionedRelPruningData
{
+ Relation partrel;
int nparts;
int *subplan_map;
int *subpart_map;
@@ -128,4 +132,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
+extern PartitionPruneState *CreatePartitionPruneState(EState *estate,
+ PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 22b928e085..518a9fcd15 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -637,6 +637,8 @@ typedef struct EState
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_states; /* List of PartitionPruneState */
+ List *es_part_prune_results; /* List of Bitmapset */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
--
2.43.0
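A note on the indexing scheme in the patch above: ExecDoInitialPruning() appends one entry to es_part_prune_states and one to es_part_prune_results for every PartitionPruneInfo, storing NULL as the result when no initial pruning was done, so that a single part_prune_index addresses all three lists. Below is a rough standalone sketch of that idea only. It assumes fixed-size arrays in place of List and a plain unsigned mask in place of a Bitmapset; SketchInfo, SketchState, SketchEState, sketch_do_initial_pruning() and sketch_initial_result() are hypothetical names for illustration, not the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_MAX_INFOS 8

/* Hypothetical planner output: whether initial steps exist, and the
 * precomputed set of surviving subplans if they do. */
typedef struct SketchInfo
{
	bool		has_initial_steps;
	unsigned	survivors;
} SketchInfo;

/* Hypothetical executor-side pruning state. */
typedef struct SketchState
{
	bool		do_initial_prune;
} SketchState;

/* Three parallel sequences that share one index. */
typedef struct SketchEState
{
	const SketchInfo *infos;
	size_t		ninfos;
	SketchState states[SKETCH_MAX_INFOS];
	const unsigned *results[SKETCH_MAX_INFOS];	/* NULL: no initial pruning */
	size_t		nstates;
	size_t		nresults;
} SketchEState;

/*
 * For every info, append a state and a result.  A NULL result is appended
 * even when no pruning happens, so both sequences stay the same length as
 * infos and remain addressable by the same index.
 */
static void
sketch_do_initial_pruning(SketchEState *es)
{
	assert(es->ninfos <= SKETCH_MAX_INFOS);
	for (size_t i = 0; i < es->ninfos; i++)
	{
		const SketchInfo *info = &es->infos[i];

		es->states[es->nstates++].do_initial_prune = info->has_initial_steps;
		es->results[es->nresults++] =
			info->has_initial_steps ? &info->survivors : NULL;
	}
}

/* Look up the saved result later, using the same index as the info. */
static const unsigned *
sketch_initial_result(const SketchEState *es, size_t part_prune_index)
{
	assert(part_prune_index < es->nresults);
	if (!es->states[part_prune_index].do_initial_prune)
		return NULL;			/* caller initializes all subplans */
	return es->results[part_prune_index];
}
```

The placeholder entries cost a pointer each, but they let a later lookup by index work without any search or separate mapping.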
[application/x-patch] v54-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch (19.9K, 4-v54-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch)
From cf75d48323a3c28d272e34c942f123a2e04044fd Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Fri, 6 Sep 2024 13:11:05 +0900
Subject: [PATCH v54 1/4] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
This change moves PartitionPruneInfo from individual plan nodes to
PlannedStmt, allowing runtime initial pruning to be performed across
the entire plan tree without traversing the tree to find nodes
containing PartitionPruneInfos.
The PartitionPruneInfo pointer fields in Append and MergeAppend nodes
have been replaced with an integer index that points to
PartitionPruneInfos in a list within PlannedStmt, which holds the
PartitionPruneInfos for all subqueries.
Reviewed-by: Alvaro Herrera
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 19 +++++-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/nodeAppend.c | 5 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/optimizer/plan/createplan.c | 24 +++----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 86 ++++++++++++++++---------
src/backend/partitioning/partprune.c | 19 ++++--
src/include/executor/execPartition.h | 4 +-
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 6 ++
src/include/nodes/plannodes.h | 14 ++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 133 insertions(+), 62 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 7042ca6c60..e6197c165e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -850,6 +850,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index bfb3419efb..b01a2fdfdd 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -181,6 +181,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->permInfos = estate->es_rteperminfos;
pstmt->resultRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7651886229..ec730674f2 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1786,6 +1786,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* Initialize data structure needed for run-time partition pruning and
* do initial pruning if needed
*
+ * 'root_parent_relids' identifies the relation to which both the parent plan
+ * and the PartitionPruneInfo given by 'part_prune_index' belong.
+ *
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
* Initial pruning is performed here if needed and in that case only the
@@ -1798,11 +1801,25 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo;
+
+ /* Obtain the pruneinfo we need, and make sure it's the right one */
+ pruneinfo = list_nth_node(PartitionPruneInfo, estate->es_part_prune_infos,
+ part_prune_index);
+ if (!bms_equal(root_parent_relids, pruneinfo->root_parent_relids))
+ ereport(ERROR,
+ errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("mismatching PartitionPruneInfo found at part_prune_index %d",
+ part_prune_index),
+ errdetail_internal("plan node relids %s, pruneinfo relids %s",
+ bmsToString(root_parent_relids),
+ bmsToString(pruneinfo->root_parent_relids)));
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 5737f9f4eb..67734979b0 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -118,6 +118,7 @@ CreateExecutorState(void)
estate->es_rowmarks = NULL;
estate->es_rteperminfos = NIL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index ca0f54d676..de7ebab5c2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index e1b9b984a7..3ed91808dd 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index bb45ef318f..6642d09a39 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1225,7 +1225,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1376,6 +1375,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1399,16 +1401,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1447,7 +1447,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1540,6 +1539,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1555,13 +1557,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
Assert(best_path->path.param_info == NULL);
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index df35d1ff9c..1b9071c774 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -547,6 +547,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 91c7c4fe2f..e2ea406c4e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1732,6 +1732,48 @@ set_customscan_references(PlannerInfo *root,
cscan->custom_relids = offset_relid_set(cscan->custom_relids, rtoffset);
}
+/*
+ * register_partpruneinfo
+ * Subroutine for set_append_references and set_mergeappend_references
+ *
+ * Add the PartitionPruneInfo from root->partPruneInfos at the given index
+ * into PlannerGlobal->partPruneInfos and return its index there.
+ *
+ * Also update the RT indexes present in PartitionedRelPruneInfos to add the
+ * offset.
+ */
+static int
+register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
+{
+ PlannerGlobal *glob = root->glob;
+ PartitionPruneInfo *pinfo;
+ ListCell *l;
+
+ Assert(part_prune_index >= 0 &&
+ part_prune_index < list_length(root->partPruneInfos));
+ pinfo = list_nth_node(PartitionPruneInfo, root->partPruneInfos,
+ part_prune_index);
+
+ pinfo->root_parent_relids = offset_relid_set(pinfo->root_parent_relids,
+ rtoffset);
+ foreach(l, pinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+
+ prelinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pinfo);
+
+ return list_length(glob->partPruneInfos) - 1;
+}
+
/*
* set_append_references
* Do set_plan_references processing on an Append
@@ -1784,21 +1826,13 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index =
+ register_partpruneinfo(root, aplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1860,21 +1894,13 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index =
+ register_partpruneinfo(root, mplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9a1a7faac7..60fabb1734 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -207,16 +207,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -330,10 +334,11 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
+ pruneinfo->root_parent_relids = parentrel->relids;
pruneinfo->prune_infos = prunerelinfos;
/*
@@ -356,7 +361,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index c09bc83b2a..12aacc84ff 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,9 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 88467977f8..22b928e085 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -636,6 +636,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 07e2415398..8d30b6e896 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -128,6 +128,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -559,6 +562,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 62cd6a6666..39d0281c23 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in the
+ * plan */
+
List *rtable; /* list of RangeTblEntry nodes */
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
@@ -276,8 +279,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -311,8 +314,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
@@ -1414,6 +1417,8 @@ typedef struct PlanRowMark
* Then, since an Append-type node could have multiple partitioning
* hierarchies among its children, we have an unordered List of those Lists.
*
+ * root_parent_relids RelOptInfo.relids of the relation to which the parent
+ * plan node and this PartitionPruneInfo node belong
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
@@ -1426,6 +1431,7 @@ typedef struct PartitionPruneInfo
pg_node_attr(no_equal, no_query_jumble)
NodeTag type;
+ Bitmapset *root_parent_relids;
List *prune_infos;
Bitmapset *other_subplans;
} PartitionPruneInfo;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index bd490d154f..c536a1fe19 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.43.0
[application/x-patch] v54-0004-Handle-CachedPlan-invalidation-in-the-executor.patch (55.2K, 5-v54-0004-Handle-CachedPlan-invalidation-in-the-executor.patch)
From 3916c8617ba777317d01aa11c89b3276b46fe7a0 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 22 Aug 2024 19:38:13 +0900
Subject: [PATCH v54 4/4] Handle CachedPlan invalidation in the executor
This commit makes changes to handle cases where a cached plan
becomes invalid before deferred locks on prunable relations are taken.
* Add checks at various points in ExecutorStart() and its called
functions to determine if the plan becomes invalid. If detected,
the function and its callers return immediately. A previous commit
ensures any partially initialized PlanState tree objects are cleaned
up appropriately.
* Introduce ExecutorStartExt(), a wrapper over ExecutorStart(), to
handle cases where plan initialization is aborted due to invalidation.
ExecutorStartExt() creates a new transient CachedPlan if needed and
retries execution. This new entry point is only required for sites
using plancache.c. It requires passing the QueryDesc, eflags,
CachedPlanSource, and query_index (index in CachedPlanSource.query_list).
* Add GetSingleCachedPlan() in plancache.c to create a transient
CachedPlan for a specified query in the given CachedPlanSource.
Such CachedPlans are tracked in a separate global list for the
plancache invalidation callbacks to check.
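The retry behaviour listed above can be sketched in isolation. The following is a simplified model under stated assumptions, not the patch's implementation: DemoPlan, DemoQuery and all demo_* functions are hypothetical, and an integer counter simulates concurrent invalidations arriving while locks are taken during startup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a CachedPlan. */
typedef struct DemoPlan
{
    bool        is_valid;
    int         generation;     /* bumped each time the plan is rebuilt */
} DemoPlan;

/* Hypothetical stand-in for a QueryDesc plus bookkeeping for the demo. */
typedef struct DemoQuery
{
    DemoPlan    plan;
    int         pending_invalidations;  /* simulated concurrent DDL events */
    int         starts;
    int         aborted_ends;
} DemoQuery;

/* Models startup: taking deferred locks may reveal that the plan is stale. */
static void
demo_start(DemoQuery *q)
{
    q->starts++;
    if (q->pending_invalidations > 0)
    {
        q->pending_invalidations--;
        q->plan.is_valid = false;
    }
}

/* Models tearing down a partially initialized execution. */
static void
demo_end_aborted(DemoQuery *q)
{
    q->aborted_ends++;
}

/* Models obtaining a fresh transient plan for the same query. */
static void
demo_replan(DemoQuery *q)
{
    q->plan.generation++;
    q->plan.is_valid = true;
}

/* Start, and on invalidation clean up, replan, and try again. */
static int
demo_start_with_retry(int pending_invalidations)
{
    DemoQuery   q = {{true, 0}, pending_invalidations, 0, 0};

    for (;;)
    {
        demo_start(&q);
        if (q.plan.is_valid)
            break;              /* startup succeeded */
        demo_end_aborted(&q);
        demo_replan(&q);
    }

    /* Every failed start must have been cleaned up exactly once. */
    assert(q.aborted_ends == q.starts - 1);
    return q.starts;
}
```

In this model, demo_start_with_retry(0) starts once, and each simulated invalidation adds one aborted attempt followed by a restart.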
This also adds isolation tests using the delay_execution test module
to verify scenarios where a CachedPlan becomes invalid before the
deferred locks are taken.
All ExecutorStart_hook implementations must now add the following
block after calling ExecutorStart() (or standard_ExecutorStart()), so
that they do not continue working with an invalid plan:
/* The plan may have become invalid during ExecutorStart() */
if (!ExecPlanStillValid(queryDesc->estate))
return;
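To show why the early return matters, here is a small stand-alone model of a hook. It is only a sketch under assumptions: DemoQueryDesc and the demo_* functions are hypothetical and do not use PostgreSQL's real QueryDesc, EState, or hook machinery. The hook sets up its own per-query state only when the plan survived startup.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a QueryDesc as seen by an extension's hook. */
typedef struct DemoQueryDesc
{
    bool        invalidate_during_start;    /* simulates a concurrent change */
    bool        plan_still_valid;
    bool        hook_state_ready;           /* the extension's own setup */
} DemoQueryDesc;

/* Models the standard startup routine, which may find the plan stale. */
static void
demo_standard_start(DemoQueryDesc *qd)
{
    qd->plan_still_valid = !qd->invalidate_during_start;
}

/* Models an extension hook that follows the rule from the commit message. */
static void
demo_hook_start(DemoQueryDesc *qd)
{
    demo_standard_start(qd);

    /* The plan may have become invalid during startup; do nothing more. */
    if (!qd->plan_still_valid)
        return;

    /* Safe to build extension state on top of a valid, initialized plan. */
    qd->hook_state_ready = true;
}

/* Runs the hook once and reports whether the extension set up its state. */
static bool
demo_run_hook(bool invalidate)
{
    DemoQueryDesc qd = {invalidate, true, false};

    demo_hook_start(&qd);
    return qd.hook_state_ready;
}
```

Without the check, the hook in this model would mark its state ready even though the caller is about to discard the execution and start over with a new plan.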
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
contrib/auto_explain/auto_explain.c | 4 +
.../pg_stat_statements/pg_stat_statements.c | 4 +
src/backend/commands/explain.c | 8 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 10 +-
src/backend/commands/trigger.c | 14 ++
src/backend/executor/README | 35 ++-
src/backend/executor/execMain.c | 84 ++++++-
src/backend/executor/execUtils.c | 3 +-
src/backend/executor/spi.c | 19 +-
src/backend/tcop/postgres.c | 4 +-
src/backend/tcop/pquery.c | 31 ++-
src/backend/utils/cache/plancache.c | 206 ++++++++++++++++++
src/backend/utils/mmgr/portalmem.c | 4 +-
src/include/commands/explain.h | 1 +
src/include/commands/trigger.h | 1 +
src/include/executor/execdesc.h | 1 +
src/include/executor/executor.h | 17 ++
src/include/nodes/execnodes.h | 1 +
src/include/utils/plancache.h | 26 +++
src/include/utils/portal.h | 4 +-
src/test/modules/delay_execution/Makefile | 3 +-
.../modules/delay_execution/delay_execution.c | 63 +++++-
.../expected/cached-plan-inval.out | 175 +++++++++++++++
src/test/modules/delay_execution/meson.build | 1 +
.../specs/cached-plan-inval.spec | 65 ++++++
26 files changed, 749 insertions(+), 36 deletions(-)
create mode 100644 src/test/modules/delay_execution/expected/cached-plan-inval.out
create mode 100644 src/test/modules/delay_execution/specs/cached-plan-inval.spec
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 677c135f59..9eb5e9a619 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -300,6 +300,10 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
if (auto_explain_enabled())
{
/*
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 3c72e437f7..76642b557a 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -985,6 +985,10 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
/*
* If query has queryId zero, don't track it. This prevents double
* counting of optimizable statements that are directly contained in
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 49f7370734..b7a0b8c05b 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -509,7 +509,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, NULL, -1, into, es, queryString, params,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -618,6 +619,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
*/
void
ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int query_index,
IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
@@ -688,8 +690,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
if (into)
eflags |= GetIntoRelEFlags(into);
- /* call ExecutorStart to prepare the plan for execution */
- ExecutorStart(queryDesc, eflags);
+ /* Call ExecutorStartExt to prepare the plan for execution. */
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
/* Execute the plan for statistics if asked for */
if (es->analyze)
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 4f6acf6719..4b1503c05e 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ NULL,
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 311b9ebd5b..4cd79a6e3a 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -202,7 +202,8 @@ ExecuteQuery(ParseState *pstate,
query_string,
entry->plansource->commandTag,
plan_list,
- cplan);
+ cplan,
+ entry->plansource);
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
@@ -583,6 +584,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
MemoryContextCounters mem_counters;
MemoryContext planner_ctx = NULL;
MemoryContext saved_ctx = NULL;
+ int i = 0;
if (es->memory)
{
@@ -655,8 +657,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
- queryEnv,
+ ExplainOnePlan(pstmt, cplan, entry->plansource, i,
+ into, es, query_string, paramLI, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
@@ -668,6 +670,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Separate plans with an appropriate separator */
if (lnext(plan_list, p) != NULL)
ExplainSeparatePlans(es);
+
+ i++;
}
if (estate)
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 29d30bfb6f..e33b8f573b 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -5120,6 +5120,20 @@ AfterTriggerEndQuery(EState *estate)
afterTriggers.query_depth--;
}
+/* ----------
+ * AfterTriggerAbortQuery()
+ *
+ * Called by ExecutorEnd() if the query execution was aborted due to the
+ * plan becoming invalid during initialization.
+ * ----------
+ */
+void
+AfterTriggerAbortQuery(void)
+{
+ /* Revert the actions of AfterTriggerBeginQuery(). */
+ afterTriggers.query_depth--;
+}
+
/*
* AfterTriggerFreeQuery
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be61..c76a00b394 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -280,6 +280,28 @@ are typically reset to empty once per tuple. Per-tuple contexts are usually
associated with ExprContexts, and commonly each PlanState node has its own
ExprContext to evaluate its qual and targetlist expressions in.
+Relation Locking
+----------------
+
+Typically, when the executor initializes a plan tree for execution, it doesn't
+lock non-index relations if the plan tree is freshly generated and not derived
+from a CachedPlan. This is because such locks have already been established
+during the query's parsing, rewriting, and planning phases. However, with a
+cached plan tree, some relations may remain unlocked. The function
+AcquireExecutorLocks() only locks unprunable relations in the plan, deferring
+the locking of prunable ones to executor initialization. This avoids
+unnecessary locking of relations that will be pruned during "initial" runtime
+pruning in ExecDoInitialPruning().
+
+This approach creates a window where a cached plan tree with child tables
+could become outdated if another backend modifies these tables before
+ExecDoInitialPruning() locks them. As a result, the executor has the added duty
+to verify the plan tree's validity whenever it locks a child table after
+doing initial pruning. This validation is done by checking the CachedPlan.is_valid
+attribute. If the plan tree is outdated (is_valid=false), the executor halts
+further initialization, cleans up anything in EState that would have been
+allocated up to that point, and retries execution after creating a new plan
+to replace the invalid one.
Query Processing Control Flow
-----------------------------
@@ -288,11 +310,13 @@ This is a sketch of control flow for full query processing:
CreateQueryDesc
- ExecutorStart
+ ExecutorStart or ExecutorStartExt
CreateExecutorState
creates per-query context
- switch to per-query context to run ExecInitNode
+ switch to per-query context to run ExecDoInitialPruning and ExecInitNode
AfterTriggerBeginQuery
+ ExecDoInitialPruning
+ does initial pruning and locks surviving partitions if needed
ExecInitNode --- recursively scans plan tree
ExecInitNode
recurse into subsidiary nodes
@@ -316,7 +340,12 @@ This is a sketch of control flow for full query processing:
FreeQueryDesc
-Per above comments, it's not really critical for ExecEndNode to free any
+As mentioned in the "Relation Locking" section, if the plan tree is found to
+be stale after locking partitions in ExecDoInitialPruning(), control returns
+immediately to ExecutorStartExt(), which creates a new plan tree and repeats
+the steps starting from CreateExecutorState().
+
+Per above comments, it's not really critical for ExecEndPlan to free any
memory; it'll all go away in FreeExecutorState anyway. However, we do need to
be careful to close relations, drop buffer pins, etc, so we do need to scan
the plan state tree to find these sorts of resources.
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index cb7a2bc456..4065e01f10 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -59,6 +59,7 @@
#include "utils/backend_status.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
+#include "utils/plancache.h"
#include "utils/rls.h"
#include "utils/snapmgr.h"
@@ -137,6 +138,60 @@ ExecutorStart(QueryDesc *queryDesc, int eflags)
standard_ExecutorStart(queryDesc, eflags);
}
+/*
+ * A variant of ExecutorStart() that handles cleanup and replanning if the
+ * input CachedPlan becomes invalid due to locks being taken during
+ * ExecutorStart(). If that happens, a new CachedPlan is created
+ * only for the query at index 'query_index' in plansource->query_list, which
+ * is released separately from the original CachedPlan.
+ */
+void
+ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource,
+ int query_index)
+{
+ if (queryDesc->cplan == NULL)
+ {
+ ExecutorStart(queryDesc, eflags);
+ return;
+ }
+
+ while (1)
+ {
+ ExecutorStart(queryDesc, eflags);
+ if (!CachedPlanValid(queryDesc->cplan))
+ {
+ CachedPlan *cplan;
+
+ /*
+ * The plan got invalidated, so try with a new updated plan.
+ *
+ * But first undo what ExecutorStart() would've done. Mark
+ * execution as aborted to ensure that AFTER trigger state is
+ * properly reset.
+ */
+ queryDesc->estate->es_aborted = true;
+ ExecutorEnd(queryDesc);
+
+ cplan = GetSingleCachedPlan(plansource, query_index,
+ queryDesc->queryEnv);
+
+ /*
+ * Install the new transient cplan into the QueryDesc replacing
+ * the old one so that executor initialization code can see it.
+ * Mark it as in use by us and ask FreeQueryDesc() to release it.
+ */
+ cplan->refcount = 1;
+ queryDesc->cplan = cplan;
+ queryDesc->cplan_release = true;
+ queryDesc->plannedstmt = linitial_node(PlannedStmt,
+ queryDesc->cplan->stmt_list);
+ }
+ else
+ break; /* ExecutorStart() succeeded! */
+ }
+}
+
void
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
@@ -320,6 +375,7 @@ standard_ExecutorRun(QueryDesc *queryDesc,
estate = queryDesc->estate;
Assert(estate != NULL);
+ Assert(!estate->es_aborted);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/* caller must ensure the query's snapshot is active */
@@ -426,8 +482,11 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
Assert(estate != NULL);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
- /* This should be run once and only once per Executor instance */
- Assert(!estate->es_finished);
+ /*
+ * This should be run once and only once per Executor instance and never
+ * if the execution was aborted.
+ */
+ Assert(!estate->es_finished && !estate->es_aborted);
/* Switch into per-query memory context */
oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -486,11 +545,10 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
Assert(estate != NULL);
/*
- * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This
- * Assert is needed because ExecutorFinish is new as of 9.1, and callers
- * might forget to call it.
+ * Check that ExecutorFinish was called, unless in EXPLAIN-only mode or if
+ * execution was aborted.
*/
- Assert(estate->es_finished ||
+ Assert(estate->es_finished || estate->es_aborted ||
(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/*
@@ -504,6 +562,14 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
UnregisterSnapshot(estate->es_snapshot);
UnregisterSnapshot(estate->es_crosscheck_snapshot);
+ /*
+ * Reset AFTER trigger module if the query execution was aborted.
+ */
+ if (estate->es_aborted &&
+ !(estate->es_top_eflags &
+ (EXEC_FLAG_SKIP_TRIGGERS | EXEC_FLAG_EXPLAIN_ONLY)))
+ AfterTriggerAbortQuery();
+
/*
* Must switch out of context before destroying it
*/
@@ -950,6 +1016,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
ExecDoInitialPruning(estate);
+ if (!ExecPlanStillValid(estate))
+ return;
+
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
*/
@@ -2931,6 +3000,9 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
* the snapshot, rangetable, and external Param info. They need their own
* copies of local state, including a tuple table, es_param_exec_vals,
* result-rel info, etc.
+ *
+ * es_cachedplan is not copied because EPQ plan execution does not acquire
+ * any new locks that could invalidate the CachedPlan.
*/
rcestate->es_direction = ForwardScanDirection;
rcestate->es_snapshot = parentestate->es_snapshot;
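For readers following the control flow of ExecutorStartExt() above, here is a minimal standalone sketch of the same start / check validity / tear down / replan loop. It is only an illustration under stated assumptions: MockPlan, MockQueryDesc, mock_build_plan(), mock_executor_start(), mock_executor_end(), start_with_retry() and demo_retry() are all hypothetical names that do not exist in PostgreSQL, and a counter stands in for invalidation messages arriving while locks are taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for CachedPlan and QueryDesc; not PostgreSQL types. */
typedef struct MockPlan
{
    bool        is_valid;
} MockPlan;

typedef struct MockQueryDesc
{
    MockPlan   *plan;
    bool        started;
} MockQueryDesc;

#define MAX_PLANS 8
static MockPlan plan_pool[MAX_PLANS];
static int  plans_built = 0;

/* Assumption: how many upcoming starts will see a concurrent invalidation. */
static int  pending_invalidations = 0;

/* Stand-in for asking the plan cache for a fresh plan. */
static MockPlan *
mock_build_plan(void)
{
    MockPlan   *plan;

    assert(plans_built < MAX_PLANS);
    plan = &plan_pool[plans_built++];
    plan->is_valid = true;
    return plan;
}

/* Stand-in for executor startup: taking locks may reveal an invalidation. */
static void
mock_executor_start(MockQueryDesc *qd)
{
    if (pending_invalidations > 0)
    {
        pending_invalidations--;
        qd->plan->is_valid = false;
    }
    qd->started = true;
}

/* Stand-in for undoing whatever startup did. */
static void
mock_executor_end(MockQueryDesc *qd)
{
    qd->started = false;
}

/* Retry startup until the plan survives it; returns the number of attempts. */
static int
start_with_retry(MockQueryDesc *qd)
{
    int         attempts = 0;

    for (;;)
    {
        attempts++;
        mock_executor_start(qd);
        if (qd->plan->is_valid)
            break;              /* startup succeeded with a valid plan */

        /* Plan went stale during startup: undo, replan, and try again. */
        mock_executor_end(qd);
        qd->plan = mock_build_plan();
    }
    return attempts;
}

/* Run one simulated execution with the given number of invalidations. */
int
demo_retry(int invalidations)
{
    MockQueryDesc qd;

    plans_built = 0;
    pending_invalidations = invalidations;
    qd.plan = mock_build_plan();
    qd.started = false;
    return start_with_retry(&qd);
}
```

In this sketch, demo_retry(1) takes two attempts, which corresponds to the pair of "not valid" / "valid" notices in the expected isolation test output for a single concurrent DROP INDEX. The sketch deliberately leaves out refcounting, the es_aborted flag and AFTER trigger cleanup.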
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 67734979b0..435ae0df7a 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -147,6 +147,7 @@ CreateExecutorState(void)
estate->es_top_eflags = 0;
estate->es_instrument = 0;
estate->es_finished = false;
+ estate->es_aborted = false;
estate->es_exprcontexts = NIL;
@@ -757,7 +758,7 @@ ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos)
* ExecGetRangeTableRelation
* Open the Relation for a range table entry, if not already done
*
- * The Relations will be closed again in ExecEndPlan().
+ * The Relations will be closed in ExecEndPlan().
*/
Relation
ExecGetRangeTableRelation(EState *estate, Index rti)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 659bd6dcd9..f84f376c9c 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -70,7 +70,8 @@ static int _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
-static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount);
+static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index);
static void _SPI_error_callback(void *arg);
@@ -1682,7 +1683,8 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
- cplan);
+ cplan,
+ plansource);
/*
* Set up options for portal. Default SCROLL type is chosen the same way
@@ -2494,6 +2496,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
ListCell *lc2;
+ int i = 0;
spicallbackarg.query = plansource->query_string;
@@ -2691,8 +2694,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
options->params,
_SPI_current->queryEnv,
0);
- res = _SPI_pquery(qdesc, fire_triggers,
- canSetTag ? options->tcount : 0);
+
+ res = _SPI_pquery(qdesc, fire_triggers, canSetTag ? options->tcount : 0,
+ plansource, i);
FreeQueryDesc(qdesc);
}
else
@@ -2789,6 +2793,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
my_res = res;
goto fail;
}
+
+ i++;
}
/* Done with this plan, so release refcount */
@@ -2866,7 +2872,8 @@ _SPI_convert_params(int nargs, Oid *argtypes,
}
static int
-_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
+_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index)
{
int operation = queryDesc->operation;
int eflags;
@@ -2922,7 +2929,7 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
else
eflags = EXEC_FLAG_SKIP_TRIGGERS;
- ExecutorStart(queryDesc, eflags);
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
ExecutorRun(queryDesc, ForwardScanDirection, tcount, true);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index e394f1419a..b95c859655 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1237,6 +1237,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ NULL,
NULL);
/*
@@ -2039,7 +2040,8 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
- cplan);
+ cplan,
+ psrc);
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 6e8f6b1b8f..dbb0ffb771 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -19,6 +19,7 @@
#include "access/xact.h"
#include "commands/prepare.h"
+#include "executor/execdesc.h"
#include "executor/tstoreReceiver.h"
#include "miscadmin.h"
#include "pg_trace.h"
@@ -37,6 +38,8 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -80,6 +83,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
+ qd->cplan_release = false;
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -114,6 +118,13 @@ FreeQueryDesc(QueryDesc *qdesc)
UnregisterSnapshot(qdesc->snapshot);
UnregisterSnapshot(qdesc->crosscheck_snapshot);
+ /*
+ * Release CachedPlan if requested. The CachedPlan is not associated with
+ * a ResourceOwner when cplan_release is true; see ExecutorStartExt().
+ */
+ if (qdesc->cplan_release)
+ ReleaseCachedPlan(qdesc->cplan, NULL);
+
/* Only the QueryDesc itself need be freed */
pfree(qdesc);
}
@@ -126,6 +137,8 @@ FreeQueryDesc(QueryDesc *qdesc)
*
* plan: the plan tree for the query
* cplan: CachedPlan supplying the plan
+ * plansource: CachedPlanSource supplying the cplan
+ * query_index: index of the query in plansource->query_list
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -139,6 +152,8 @@ FreeQueryDesc(QueryDesc *qdesc)
static void
ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -157,7 +172,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Call ExecutorStart to prepare the plan for execution
*/
- ExecutorStart(queryDesc, 0);
+ ExecutorStartExt(queryDesc, 0, plansource, query_index);
/*
* Run the plan to completion.
@@ -518,9 +533,12 @@ PortalStart(Portal portal, ParamListInfo params,
myeflags = eflags;
/*
- * Call ExecutorStart to prepare the plan for execution
+ * Call ExecutorStartExt() to prepare the plan for execution. If
+ * the portal is using a cached plan, it may get invalidated
+ * during plan initialization, in which case a new one is
+ * created and saved in the QueryDesc.
*/
- ExecutorStart(queryDesc, myeflags);
+ ExecutorStartExt(queryDesc, myeflags, portal->plansource, 0);
/*
* This tells PortalCleanup to shut down the executor
@@ -1201,6 +1219,7 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
+ int i = 0;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1283,6 +1302,8 @@ PortalRunMulti(Portal portal,
/* statement can set tag string */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1293,6 +1314,8 @@ PortalRunMulti(Portal portal,
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1357,6 +1380,8 @@ PortalRunMulti(Portal portal,
*/
if (lnext(portal->stmts, stmtlist_item) != NULL)
CommandCounterIncrement();
+
+ i++;
}
/* Pop the snapshot if we pushed one. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5b75dadf13..d33f871ea2 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -94,6 +94,14 @@
*/
static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list);
+/*
+ * Head of the backend's list of "standalone" CachedPlans that are not
+ * associated with a CachedPlanSource, created by GetSingleCachedPlan() for
+ * transient use by the executor in certain scenarios where they're needed
+ * only for one execution of the plan.
+ */
+static dlist_head standalone_plan_list = DLIST_STATIC_INIT(standalone_plan_list);
+
/*
* This is the head of the backend's list of CachedExpressions.
*/
@@ -905,6 +913,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * Note: When changing this, you should also look at GetSingleCachedPlan().
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
@@ -1034,6 +1044,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
+ plan->is_standalone = false;
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1282,6 +1293,121 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
return plan;
}
+/*
+ * Create a fresh CachedPlan for the query_index'th query in the provided
+ * CachedPlanSource.
+ *
+ * The created CachedPlan is standalone, meaning it is not tracked in the
+ * CachedPlanSource. The CachedPlan and its plan trees are allocated in a
+ * child context of the caller's memory context. The caller must ensure they
+ * remain valid until execution is complete, after which the plan should be
+ * released by calling ReleaseCachedPlan().
+ *
+ * This function primarily supports ExecutorStartExt(), which handles cases
+ * where the original generic CachedPlan becomes invalid after prunable
+ * relations are locked.
+ */
+CachedPlan *
+GetSingleCachedPlan(CachedPlanSource *plansource, int query_index,
+ QueryEnvironment *queryEnv)
+{
+ List *query_list = plansource->query_list,
+ *plan_list;
+ CachedPlan *plan = plansource->gplan;
+ MemoryContext oldcxt = CurrentMemoryContext,
+ plan_context;
+ PlannedStmt *plannedstmt;
+
+ Assert(ActiveSnapshotSet());
+
+ /* Sanity checks */
+ if (plan == NULL)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context: plansource->gplan is NULL");
+ else if (plan->is_valid)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context: plansource->gplan->is_valid");
+
+ /*
+ * The plansource might have become invalid since GetCachedPlan(). See the
+ * comment in BuildCachedPlan() for details on why this might happen.
+ *
+ * The risk is greater here because this function is called from the
+ * executor, meaning much more processing may have occurred compared to
+ * when BuildCachedPlan() is called from GetCachedPlan().
+ */
+ if (!plansource->is_valid)
+ query_list = RevalidateCachedQuery(plansource, queryEnv);
+ Assert(query_list != NIL);
+
+ /*
+ * Build a new generic plan for the query_index'th query, but make a copy
+ * to be scribbled on by the planner
+ */
+ query_list = list_make1(copyObject(list_nth_node(Query, query_list,
+ query_index)));
+ plan_list = pg_plan_queries(query_list, plansource->query_string,
+ plansource->cursor_options, NULL);
+
+ list_free_deep(query_list);
+
+ /*
+ * Make a dedicated memory context for the CachedPlan and its subsidiary
+ * data so that ReleaseCachedPlan(), which FreeQueryDesc() calls, can free
+ * it all at once.
+ */
+ plan_context = AllocSetContextCreate(CurrentMemoryContext,
+ "Standalone CachedPlan",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextCopyAndSetIdentifier(plan_context, plansource->query_string);
+
+ /*
+ * Copy plan into the new context.
+ */
+ MemoryContextSwitchTo(plan_context);
+ plan_list = copyObject(plan_list);
+
+ /*
+ * Create and fill the CachedPlan struct within the new context.
+ */
+ plan = (CachedPlan *) palloc(sizeof(CachedPlan));
+ plan->magic = CACHEDPLAN_MAGIC;
+ plan->stmt_list = plan_list;
+
+ plan->planRoleId = GetUserId();
+ Assert(list_length(plan_list) == 1);
+ plannedstmt = linitial_node(PlannedStmt, plan_list);
+
+ /*
+ * CachedPlan is dependent on role either if RLS affected the rewrite
+ * phase or if a role dependency was injected during planning. And it's
+ * transient if any plan is marked so.
+ */
+ plan->dependsOnRole = plansource->dependsOnRLS || plannedstmt->dependsOnRole;
+ if (plannedstmt->transientPlan)
+ {
+ Assert(TransactionIdIsNormal(TransactionXmin));
+ plan->saved_xmin = TransactionXmin;
+ }
+ else
+ plan->saved_xmin = InvalidTransactionId;
+ plan->refcount = 0;
+ plan->context = plan_context;
+ plan->is_oneshot = false;
+ plan->is_generic = true;
+ plan->is_saved = false;
+ plan->is_valid = true;
+ plan->is_standalone = true;
+ plan->generation = 1;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * Add the entry to the global list of "standalone" cached plans. It is
+ * removed from the list by ReleaseCachedPlan().
+ */
+ dlist_push_tail(&standalone_plan_list, &plan->node);
+
+ return plan;
+}
+
/*
* ReleaseCachedPlan: release active use of a cached plan.
*
@@ -1309,6 +1435,10 @@ ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner)
/* Mark it no longer valid */
plan->magic = 0;
+ /* Remove from the global list if we are a standalone plan. */
+ if (plan->is_standalone)
+ dlist_delete(&plan->node);
+
/* One-shot plans do not own their context, so we can't free them */
if (!plan->is_oneshot)
MemoryContextDelete(plan->context);
@@ -2066,6 +2196,33 @@ PlanCacheRelCallback(Datum arg, Oid relid)
cexpr->is_valid = false;
}
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ if (cplan->is_valid)
+ {
+ ListCell *lc;
+
+ foreach(lc, cplan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue; /* Ignore utility statements */
+ if ((relid == InvalidOid) ? plannedstmt->relationOids != NIL :
+ list_member_oid(plannedstmt->relationOids, relid))
+ cplan->is_valid = false;
+ if (!cplan->is_valid)
+ break; /* out of stmt_list scan */
+ }
+ }
+ }
}
/*
@@ -2176,6 +2333,44 @@ PlanCacheObjectCallback(Datum arg, int cacheid, uint32 hashvalue)
}
}
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ if (cplan->is_valid)
+ {
+ ListCell *lc;
+
+ foreach(lc, cplan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+ ListCell *lc3;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue; /* Ignore utility statements */
+ foreach(lc3, plannedstmt->invalItems)
+ {
+ PlanInvalItem *item = (PlanInvalItem *) lfirst(lc3);
+
+ if (item->cacheId != cacheid)
+ continue;
+ if (hashvalue == 0 ||
+ item->hashValue == hashvalue)
+ {
+ cplan->is_valid = false;
+ break; /* out of invalItems scan */
+ }
+ }
+ if (!cplan->is_valid)
+ break; /* out of stmt_list scan */
+ }
+ }
+ }
}
/*
@@ -2235,6 +2430,17 @@ ResetPlanCache(void)
cexpr->is_valid = false;
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ cplan->is_valid = false;
+ }
}
/*
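The standalone-plan handling added to PlanCacheRelCallback() above boils down to walking a list and clearing is_valid on every plan that depends on the invalidated relation, with InvalidOid meaning "any relation". A minimal sketch of that rule follows. It makes several simplifying assumptions: MockPlan, depends_on(), invalidate_plans() and demo_invalidate() are hypothetical names, a singly linked list replaces the dlist, each plan carries a single flat OID array instead of a stmt_list, and the utility-statement skip is omitted.

```c
#include <stdbool.h>
#include <stddef.h>

typedef unsigned int Oid;
#define INVALID_OID ((Oid) 0)

/* Hypothetical stand-in for a standalone CachedPlan on a global list. */
typedef struct MockPlan
{
    struct MockPlan *next;          /* list link */
    const Oid  *relation_oids;      /* relations the plan depends on */
    size_t      nrels;
    bool        is_valid;
} MockPlan;

/* Does the plan depend on relid?  INVALID_OID matches any dependency. */
static bool
depends_on(const MockPlan *plan, Oid relid)
{
    size_t      i;

    if (relid == INVALID_OID)
        return plan->nrels > 0;
    for (i = 0; i < plan->nrels; i++)
    {
        if (plan->relation_oids[i] == relid)
            return true;
    }
    return false;
}

/* Mark matching plans invalid; returns how many were newly invalidated. */
static int
invalidate_plans(MockPlan *head, Oid relid)
{
    MockPlan   *plan;
    int         ninvalidated = 0;

    for (plan = head; plan != NULL; plan = plan->next)
    {
        if (plan->is_valid && depends_on(plan, relid))
        {
            plan->is_valid = false;
            ninvalidated++;
        }
    }
    return ninvalidated;
}

/* Build a two-plan list and report how many plans relid invalidates. */
int
demo_invalidate(Oid relid)
{
    static const Oid rels_a[] = {16384, 16385};
    static const Oid rels_b[] = {16390};
    MockPlan    b = {NULL, rels_b, 1, true};
    MockPlan    a = {&b, rels_a, 2, true};

    return invalidate_plans(&a, relid);
}
```

Here demo_invalidate(16385) touches only the first plan, while demo_invalidate(INVALID_OID) invalidates both. The OIDs are made up for illustration.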
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 4a24613537..bf70fd4ce7 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,7 +284,8 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan)
+ CachedPlan *cplan,
+ CachedPlanSource *plansource)
{
Assert(PortalIsValid(portal));
Assert(portal->status == PORTAL_NEW);
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->commandTag = commandTag;
portal->stmts = stmts;
portal->cplan = cplan;
+ portal->plansource = plansource;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 21c71e0d53..a39989a950 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -104,6 +104,7 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ParamListInfo params, QueryEnvironment *queryEnv);
extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int plan_index,
IntoClause *into, ExplainState *es,
const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index 8a5a9fe642..db21561c8c 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -258,6 +258,7 @@ extern void ExecASTruncateTriggers(EState *estate,
extern void AfterTriggerBeginXact(void);
extern void AfterTriggerBeginQuery(void);
extern void AfterTriggerEndQuery(EState *estate);
+extern void AfterTriggerAbortQuery(void);
extern void AfterTriggerFireDeferred(void);
extern void AfterTriggerEndXact(bool isCommit);
extern void AfterTriggerBeginSubXact(void);
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0e7245435d..f6cb6479c0 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -36,6 +36,7 @@ typedef struct QueryDesc
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
+ bool cplan_release; /* Should FreeQueryDesc() release cplan? */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 69c3ebff00..5bc0edb5a0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -19,6 +19,7 @@
#include "nodes/lockoptions.h"
#include "nodes/parsenodes.h"
#include "utils/memutils.h"
+#include "utils/plancache.h"
/*
@@ -198,6 +199,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
* prototypes from functions in execMain.c
*/
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern void ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource, int query_index);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count, bool execute_once);
@@ -261,6 +264,19 @@ extern void ExecEndNode(PlanState *node);
extern void ExecShutdownNode(PlanState *node);
extern void ExecSetTupleBound(int64 tuples_needed, PlanState *child_node);
+/*
+ * Is the CachedPlan in es_cachedplan still valid?
+ *
+ * Called from InitPlan() because invalidation messages that affect the plan
+ * might be received after locks have been taken on runtime-prunable relations.
+ * The caller should take appropriate action if the plan has become invalid.
+ */
+static inline bool
+ExecPlanStillValid(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? true :
+ CachedPlanValid(estate->es_cachedplan);
+}
/* ----------------------------------------------------------------
* ExecProcNode
@@ -589,6 +605,7 @@ extern void ExecCreateScanSlotFromOuterPlan(EState *estate,
extern bool ExecRelationIsTargetRelation(EState *estate, Index scanrelid);
extern Relation ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags);
+extern Relation ExecOpenScanIndexRelation(EState *estate, Oid indexid, int lockmode);
extern void ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos);
extern void ExecCloseRangeTableRelations(EState *estate);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 1ed925b99b..3ca96a85b6 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -686,6 +686,7 @@ typedef struct EState
int es_top_eflags; /* eflags passed to ExecutorStart */
int es_instrument; /* OR of InstrumentOption flags */
bool es_finished; /* true when ExecutorFinish is done */
+ bool es_aborted; /* true when execution was aborted */
List *es_exprcontexts; /* List of ExprContexts within EState */
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0b5ee007ca..154f68f671 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -18,6 +18,7 @@
#include "access/tupdesc.h"
#include "lib/ilist.h"
#include "nodes/params.h"
+#include "nodes/parsenodes.h"
#include "tcop/cmdtag.h"
#include "utils/queryenvironment.h"
#include "utils/resowner.h"
@@ -152,6 +153,8 @@ typedef struct CachedPlan
bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
+ bool is_standalone; /* is it not associated with a
+ * CachedPlanSource? */
Oid planRoleId; /* Role ID the plan was created for */
bool dependsOnRole; /* is plan specific to that role? */
TransactionId saved_xmin; /* if valid, replan when TransactionXmin
@@ -159,6 +162,12 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+
+ /*
+ * If the plan is not associated with a CachedPlanSource, it is saved in
+ * a separate global list.
+ */
+ dlist_node node; /* list link, if is_standalone */
} CachedPlan;
/*
@@ -224,6 +233,10 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
QueryEnvironment *queryEnv);
+extern CachedPlan *GetSingleCachedPlan(CachedPlanSource *plansource,
+ int query_index,
+ QueryEnvironment *queryEnv);
+
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
@@ -245,4 +258,17 @@ CachedPlanRequiresLocking(CachedPlan *cplan)
return cplan->is_generic;
}
+/*
+ * CachedPlanValid
+ * Returns whether a cached generic plan is still valid.
+ *
+ * Invoked by the executor to check whether the plan is still valid after
+ * locks have been taken during plan initialization.
+ */
+static inline bool
+CachedPlanValid(CachedPlan *cplan)
+{
+ return cplan->is_valid;
+}
+
#endif /* PLANCACHE_H */
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 29f49829f2..58c3828d2c 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ CachedPlanSource *plansource; /* CachedPlanSource, for cplan */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -241,7 +242,8 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan);
+ CachedPlan *cplan,
+ CachedPlanSource *plansource);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
diff --git a/src/test/modules/delay_execution/Makefile b/src/test/modules/delay_execution/Makefile
index 70f24e846d..3eeb097fde 100644
--- a/src/test/modules/delay_execution/Makefile
+++ b/src/test/modules/delay_execution/Makefile
@@ -8,7 +8,8 @@ OBJS = \
delay_execution.o
ISOLATION = partition-addition \
- partition-removal-1
+ partition-removal-1 \
+ cached-plan-inval
ifdef USE_PGXS
PG_CONFIG = pg_config
diff --git a/src/test/modules/delay_execution/delay_execution.c b/src/test/modules/delay_execution/delay_execution.c
index 155c8a8d55..304ca77f7b 100644
--- a/src/test/modules/delay_execution/delay_execution.c
+++ b/src/test/modules/delay_execution/delay_execution.c
@@ -1,14 +1,18 @@
/*-------------------------------------------------------------------------
*
* delay_execution.c
- * Test module to allow delay between parsing and execution of a query.
+ * Test module to introduce delay at various points during execution of a
+ * query to test that execution proceeds safely in light of concurrent
+ * changes.
*
* The delay is implemented by taking and immediately releasing a specified
* advisory lock. If another process has previously taken that lock, the
* current process will be blocked until the lock is released; otherwise,
* there's no effect. This allows an isolationtester script to reliably
- * test behaviors where some specified action happens in another backend
- * between parsing and execution of any desired query.
+ * test behaviors where some specified action happens in another backend in
+ * a couple of cases: 1) between parsing and execution of any desired query
+ * when using the planner_hook, 2) between RevalidateCachedQuery() and
+ * ExecutorStart() when using the ExecutorStart_hook.
*
* Copyright (c) 2020-2024, PostgreSQL Global Development Group
*
@@ -22,6 +26,7 @@
#include <limits.h>
+#include "executor/executor.h"
#include "optimizer/planner.h"
#include "utils/builtins.h"
#include "utils/guc.h"
@@ -32,9 +37,11 @@ PG_MODULE_MAGIC;
/* GUC: advisory lock ID to use. Zero disables the feature. */
static int post_planning_lock_id = 0;
+static int executor_start_lock_id = 0;
-/* Save previous planner hook user to be a good citizen */
+/* Save previous hook users to be a good citizen */
static planner_hook_type prev_planner_hook = NULL;
+static ExecutorStart_hook_type prev_ExecutorStart_hook = NULL;
/* planner_hook function to provide the desired delay */
@@ -70,11 +77,41 @@ delay_execution_planner(Query *parse, const char *query_string,
return result;
}
+/* ExecutorStart_hook function to provide the desired delay */
+static void
+delay_execution_ExecutorStart(QueryDesc *queryDesc, int eflags)
+{
+ /* If enabled, delay by taking and releasing the specified lock */
+ if (executor_start_lock_id != 0)
+ {
+ DirectFunctionCall1(pg_advisory_lock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+ DirectFunctionCall1(pg_advisory_unlock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+
+ /*
+ * Ensure that we notice any pending invalidations, since the advisory
+ * lock functions don't do this.
+ */
+ AcceptInvalidationMessages();
+ }
+
+ /* Now start the executor, possibly via a previous hook user */
+ if (prev_ExecutorStart_hook)
+ prev_ExecutorStart_hook(queryDesc, eflags);
+ else
+ standard_ExecutorStart(queryDesc, eflags);
+
+ if (executor_start_lock_id != 0)
+ elog(NOTICE, "Finished ExecutorStart(): CachedPlan is %s",
+ CachedPlanValid(queryDesc->cplan) ? "valid" : "not valid");
+}
+
/* Module load function */
void
_PG_init(void)
{
- /* Set up the GUC to control which lock is used */
+ /* Set up GUCs to control which lock is used */
DefineCustomIntVariable("delay_execution.post_planning_lock_id",
"Sets the advisory lock ID to be locked/unlocked after planning.",
"Zero disables the delay.",
@@ -86,10 +123,22 @@ _PG_init(void)
NULL,
NULL,
NULL);
-
+ DefineCustomIntVariable("delay_execution.executor_start_lock_id",
+ "Sets the advisory lock ID to be locked/unlocked before starting execution.",
+ "Zero disables the delay.",
+ &executor_start_lock_id,
+ 0,
+ 0, INT_MAX,
+ PGC_USERSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
MarkGUCPrefixReserved("delay_execution");
- /* Install our hook */
+ /* Install our hooks. */
prev_planner_hook = planner_hook;
planner_hook = delay_execution_planner;
+ prev_ExecutorStart_hook = ExecutorStart_hook;
+ ExecutorStart_hook = delay_execution_ExecutorStart;
}
diff --git a/src/test/modules/delay_execution/expected/cached-plan-inval.out b/src/test/modules/delay_execution/expected/cached-plan-inval.out
new file mode 100644
index 0000000000..e8efb6d9d9
--- /dev/null
+++ b/src/test/modules/delay_execution/expected/cached-plan-inval.out
@@ -0,0 +1,175 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1prep s2lock s1exec s2dropi s2unlock
+step s1prep: SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1);
+QUERY PLAN
+------------------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = $1)
+(7 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+-------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Filter: (a = $1)
+(5 rows)
+
+
+starting permutation: s1prep2 s2lock s1exec2 s2dropi s2unlock
+step s1prep2: SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(11 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec2: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec2: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(6 rows)
+
+
+starting permutation: s1prep3 s2lock s1exec3 s2dropi s2unlock
+step s1prep3: SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ Update on foo3 foo
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(27 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec3: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec3: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ Update on foo3 foo
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(17 rows)
+
diff --git a/src/test/modules/delay_execution/meson.build b/src/test/modules/delay_execution/meson.build
index 41f3ac0b89..5a70b183d0 100644
--- a/src/test/modules/delay_execution/meson.build
+++ b/src/test/modules/delay_execution/meson.build
@@ -24,6 +24,7 @@ tests += {
'specs': [
'partition-addition',
'partition-removal-1',
+ 'cached-plan-inval',
],
},
}
diff --git a/src/test/modules/delay_execution/specs/cached-plan-inval.spec b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
new file mode 100644
index 0000000000..5b1f72b4a8
--- /dev/null
+++ b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
@@ -0,0 +1,65 @@
+# Test to check that invalidation of cached generic plans during ExecutorStart
+# correctly triggers replanning and re-execution.
+
+setup
+{
+ CREATE TABLE foo (a int, b text) PARTITION BY LIST(a);
+ CREATE TABLE foo12 PARTITION OF foo FOR VALUES IN (1, 2) PARTITION BY LIST (a);
+ CREATE TABLE foo12_1 PARTITION OF foo12 FOR VALUES IN (1);
+ CREATE TABLE foo12_2 PARTITION OF foo12 FOR VALUES IN (2);
+ CREATE INDEX foo12_1_a ON foo12_1 (a);
+ CREATE TABLE foo3 PARTITION OF foo FOR VALUES IN (3);
+ CREATE VIEW foov AS SELECT * FROM foo;
+ CREATE FUNCTION one () RETURNS int AS $$ BEGIN RETURN 1; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE FUNCTION two () RETURNS int AS $$ BEGIN RETURN 2; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE RULE update_foo AS ON UPDATE TO foo DO ALSO SELECT 1;
+}
+
+teardown
+{
+ DROP VIEW foov;
+ DROP RULE update_foo ON foo;
+ DROP TABLE foo;
+ DROP FUNCTION one(), two();
+}
+
+session "s1"
+# Append with run-time pruning
+step "s1prep" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+
+# Another case with Append with run-time pruning
+step "s1prep2" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+
+# Case with a rule adding another query
+step "s1prep3" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+# Executes a generic plan
+step "s1exec" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+step "s1exec2" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+step "s1exec3" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+session "s2"
+step "s2lock" { SELECT pg_advisory_lock(12345); }
+step "s2unlock" { SELECT pg_advisory_unlock(12345); }
+step "s2dropi" { DROP INDEX foo12_1_a; }
+
+# While "s1exec", etc. wait to acquire the advisory lock, "s2dropi" is able to
+# drop the index being used in the cached plan. When "s1exec" is then
+# unblocked and initializes the cached plan for execution, it detects the
+# concurrent index drop and causes the cached plan to be discarded and
+# recreated without the index.
+permutation "s1prep" "s2lock" "s1exec" "s2dropi" "s2unlock"
+permutation "s1prep2" "s2lock" "s1exec2" "s2dropi" "s2unlock"
+permutation "s1prep3" "s2lock" "s1exec3" "s2dropi" "s2unlock"
--
2.43.0
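The behavior that the three cached-plan-inval permutations above exercise can be sketched as a small retry loop. This is only a toy model, not the patch's control flow: the struct and the sketch_* functions are hypothetical names invented for illustration, and the "plan" is reduced to a flag saying whether it uses the index that s2 drops.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a cached generic plan; not PostgreSQL's CachedPlan. */
typedef struct SketchCachedPlan
{
    bool is_valid;      /* cleared when a dependency is found to be gone */
    bool uses_index;    /* does the plan scan the index that may be dropped? */
    int  generation;    /* bumped each time the plan is rebuilt */
} SketchCachedPlan;

/* Simulated replanning: the new plan uses the index only if it still exists. */
void
sketch_replan(SketchCachedPlan *plan, bool index_exists)
{
    plan->uses_index = index_exists;
    plan->is_valid = true;
    plan->generation++;
}

/*
 * Simulated executor startup: taking locks reveals a concurrent index drop,
 * which invalidates a plan that depends on the index.  Returns true if the
 * plan is still valid after startup.
 */
bool
sketch_executor_start(SketchCachedPlan *plan, bool index_exists)
{
    if (plan->uses_index && !index_exists)
        plan->is_valid = false;
    return plan->is_valid;
}

/* Start execution, replanning and retrying until startup leaves the plan valid. */
int
sketch_start_with_retry(SketchCachedPlan *plan, bool index_exists)
{
    int attempts = 0;

    for (;;)
    {
        attempts++;
        if (sketch_executor_start(plan, index_exists))
            return attempts;
        /* Discard the stale plan and build a fresh one before retrying. */
        sketch_replan(plan, index_exists);
    }
}
```

With a plan that uses the index and an index that has been dropped, the first startup attempt fails and the second succeeds, which mirrors the "CachedPlan is not valid" notice followed by "CachedPlan is valid" in the expected output.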
From: Amit Langote @ 2024-09-19 12:10 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Thu, Sep 19, 2024 at 5:39 PM Amit Langote <[email protected]> wrote:
> For
> ResultRelInfos, I took the approach of memsetting them to 0 for pruned
> result relations and adding checks at multiple sites to ensure the
> ResultRelInfo being handled is valid.
After some reflection, I realized that nobody would consider that
approach very robust. In the attached patches, I've modified
ExecInitModifyTable() to allocate ResultRelInfos only for unpruned
relations, instead of allocating one for every relation in
ModifyTable.resultRelations and zeroing the pruned ones. This
approach feels more robust.
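To illustrate the difference, here is a minimal standalone sketch of the "allocate only for unpruned relations" idea. It is not code from the patch: SketchResultRel and sketch_init_result_rels() are hypothetical names, and a plain bool array stands in for whatever the executor uses to record which result relations survived initial pruning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for ResultRelInfo; not the PostgreSQL struct. */
typedef struct SketchResultRel
{
    int rti;    /* range table index of the result relation */
} SketchResultRel;

/*
 * Build a dense array with one entry per unpruned result relation.
 *
 * 'all_rtis' lists every result relation in plan order and 'unpruned[i]'
 * says whether relation i survived initial pruning.  Returns a malloc'd
 * array (NULL if nothing survived) and stores its length in *nkept.
 */
SketchResultRel *
sketch_init_result_rels(const int *all_rtis, const bool *unpruned,
                        int ntotal, int *nkept)
{
    SketchResultRel *rels;
    int n = 0;
    int j = 0;

    /* First pass: count the survivors so we allocate exactly that many. */
    for (int i = 0; i < ntotal; i++)
        if (unpruned[i])
            n++;

    *nkept = n;
    if (n == 0)
        return NULL;

    rels = malloc(n * sizeof(SketchResultRel));
    if (rels == NULL)
        abort();

    /* Second pass: fill the dense array, skipping pruned relations. */
    for (int i = 0; i < ntotal; i++)
        if (unpruned[i])
            rels[j++].rti = all_rtis[i];

    assert(j == n);
    return rels;
}
```

Because every element of the returned array describes a live relation, callers need no per-site validity check of the kind the zeroed-entry approach required.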
--
Thanks, Amit Langote
Attachments:
[application/octet-stream] v55-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch (19.9K, 2-v55-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch)
From cf75d48323a3c28d272e34c942f123a2e04044fd Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Fri, 6 Sep 2024 13:11:05 +0900
Subject: [PATCH v55 1/5] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
This change moves PartitionPruneInfo from individual plan nodes to
PlannedStmt, allowing runtime initial pruning to be performed across
the entire plan tree without traversing the tree to find nodes
containing PartitionPruneInfos.
The PartitionPruneInfo pointer fields in Append and MergeAppend nodes
have been replaced with an integer index that points to
PartitionPruneInfos in a list within PlannedStmt, which holds the
PartitionPruneInfos for all subqueries.
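As a rough standalone illustration of the index-based lookup, consider the sketch below. It is not the patch's code: the Sketch* types and sketch_lookup_pruneinfo() are hypothetical, an unsigned bitmask stands in for a Bitmapset of relids, and a NULL return stands in for the error the executor raises on a mismatch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PartitionPruneInfo. */
typedef struct SketchPruneInfo
{
    unsigned root_parent_relids;    /* bitmask stand-in for a Bitmapset */
} SketchPruneInfo;

/* Hypothetical stand-in for the statement-level list of pruning infos. */
typedef struct SketchStmt
{
    const SketchPruneInfo *infos;
    int                    ninfos;
} SketchStmt;

/*
 * Fetch the pruning info that a plan node refers to by index.
 *
 * Returns NULL when the node does no run-time pruning (index < 0), when
 * the index is out of range, or when the entry belongs to a different
 * parent relation than the plan node expects.
 */
const SketchPruneInfo *
sketch_lookup_pruneinfo(const SketchStmt *stmt, int index,
                        unsigned expected_relids)
{
    const SketchPruneInfo *info;

    if (index < 0 || index >= stmt->ninfos)
        return NULL;

    info = &stmt->infos[index];

    /* Cross-check that the entry really belongs to this plan node. */
    if (info->root_parent_relids != expected_relids)
        return NULL;

    return info;
}
```

Storing an index rather than a pointer means the plan node alone no longer identifies its pruning info, which is why a cross-check against the parent's relids is useful.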
Reviewed-by: Alvaro Herrera
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 19 +++++-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/nodeAppend.c | 5 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/optimizer/plan/createplan.c | 24 +++----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 86 ++++++++++++++++---------
src/backend/partitioning/partprune.c | 19 ++++--
src/include/executor/execPartition.h | 4 +-
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 6 ++
src/include/nodes/plannodes.h | 14 ++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 133 insertions(+), 62 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 7042ca6c60..e6197c165e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -850,6 +850,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index bfb3419efb..b01a2fdfdd 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -181,6 +181,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->permInfos = estate->es_rteperminfos;
pstmt->resultRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7651886229..ec730674f2 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1786,6 +1786,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* Initialize data structure needed for run-time partition pruning and
* do initial pruning if needed
*
+ * 'root_parent_relids' identifies the relation to which both the parent plan
+ * and the PartitionPruneInfo given by 'part_prune_index' belong.
+ *
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
* Initial pruning is performed here if needed and in that case only the
@@ -1798,11 +1801,25 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo;
+
+ /* Obtain the pruneinfo we need, and make sure it's the right one */
+ pruneinfo = list_nth_node(PartitionPruneInfo, estate->es_part_prune_infos,
+ part_prune_index);
+ if (!bms_equal(root_parent_relids, pruneinfo->root_parent_relids))
+ ereport(ERROR,
+ errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("mismatching PartitionPruneInfo found at part_prune_index %d",
+ part_prune_index),
+ errdetail_internal("plan node relids %s, pruneinfo relids %s",
+ bmsToString(root_parent_relids),
+ bmsToString(pruneinfo->root_parent_relids)));
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 5737f9f4eb..67734979b0 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -118,6 +118,7 @@ CreateExecutorState(void)
estate->es_rowmarks = NULL;
estate->es_rteperminfos = NIL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index ca0f54d676..de7ebab5c2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index e1b9b984a7..3ed91808dd 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index bb45ef318f..6642d09a39 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1225,7 +1225,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1376,6 +1375,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1399,16 +1401,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1447,7 +1447,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1540,6 +1539,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1555,13 +1557,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
Assert(best_path->path.param_info == NULL);
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index df35d1ff9c..1b9071c774 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -547,6 +547,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 91c7c4fe2f..e2ea406c4e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1732,6 +1732,48 @@ set_customscan_references(PlannerInfo *root,
cscan->custom_relids = offset_relid_set(cscan->custom_relids, rtoffset);
}
+/*
+ * register_partpruneinfo
+ * Subroutine for set_append_references and set_mergeappend_references
+ *
+ * Add the PartitionPruneInfo from root->partPruneInfos at the given index
+ * into PlannerGlobal->partPruneInfos and return its index there.
+ *
+ * Also update the RT indexes present in PartitionedRelPruneInfos to add the
+ * offset.
+ */
+static int
+register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
+{
+ PlannerGlobal *glob = root->glob;
+ PartitionPruneInfo *pinfo;
+ ListCell *l;
+
+ Assert(part_prune_index >= 0 &&
+ part_prune_index < list_length(root->partPruneInfos));
+ pinfo = list_nth_node(PartitionPruneInfo, root->partPruneInfos,
+ part_prune_index);
+
+ pinfo->root_parent_relids = offset_relid_set(pinfo->root_parent_relids,
+ rtoffset);
+ foreach(l, pinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+
+ prelinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pinfo);
+
+ return list_length(glob->partPruneInfos) - 1;
+}
+
/*
* set_append_references
* Do set_plan_references processing on an Append
@@ -1784,21 +1826,13 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index =
+ register_partpruneinfo(root, aplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1860,21 +1894,13 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index =
+ register_partpruneinfo(root, mplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9a1a7faac7..60fabb1734 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -207,16 +207,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -330,10 +334,11 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
+ pruneinfo->root_parent_relids = parentrel->relids;
pruneinfo->prune_infos = prunerelinfos;
/*
@@ -356,7 +361,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index c09bc83b2a..12aacc84ff 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,9 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 88467977f8..22b928e085 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -636,6 +636,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 07e2415398..8d30b6e896 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -128,6 +128,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -559,6 +562,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 62cd6a6666..39d0281c23 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in the
+ * plan */
+
List *rtable; /* list of RangeTblEntry nodes */
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
@@ -276,8 +279,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -311,8 +314,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
@@ -1414,6 +1417,8 @@ typedef struct PlanRowMark
* Then, since an Append-type node could have multiple partitioning
* hierarchies among its children, we have an unordered List of those Lists.
*
+ * root_parent_relids RelOptInfo.relids of the relation to which the parent
+ * plan node and this PartitionPruneInfo node belong
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
@@ -1426,6 +1431,7 @@ typedef struct PartitionPruneInfo
pg_node_attr(no_equal, no_query_jumble)
NodeTag type;
+ Bitmapset *root_parent_relids;
List *prune_infos;
Bitmapset *other_subplans;
} PartitionPruneInfo;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index bd490d154f..c536a1fe19 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.43.0
[application/octet-stream] v55-0003-Initialize-PartitionPruneContext-for-exec-prunin.patch (11.8K, 3-v55-0003-Initialize-PartitionPruneContext-for-exec-prunin.patch)
From 92d87cdbb3ad675ac6ffa2767f1d7d5876bd5369 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 18 Sep 2024 11:16:48 +0900
Subject: [PATCH v55 3/5] Initialize PartitionPruneContext for exec pruning
lazily
Currently, ExecInitPartitionPruning() iterates over PartitionPruningDatas
and nested PartitionedRelPruningDatas in a PartitionPruneState solely
to initialize the exec_context of the PartitionedRelPruningData.
This commit moves the initialization to find_matching_subplans_recurse(),
where the exec_context is actually needed, eliminating the need for
the above iteration. To track whether the context has been initialized
and is ready for use, a boolean field is_valid is added to
PartitionPruneContext.
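The lazy-initialization pattern described here can be sketched in isolation as follows. This is a simplified model under stated assumptions, not the patch's code: SketchPruneContext and the sketch_* functions are hypothetical, and a call counter replaces the real work of building the context.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PartitionPruneContext. */
typedef struct SketchPruneContext
{
    bool is_valid;  /* has the context been initialized yet? */
    int  nsteps;    /* placeholder for state built during initialization */
} SketchPruneContext;

/* Counts initializations so the "only once" property can be observed. */
int sketch_init_calls = 0;

/* Build the context and mark it ready for use. */
void
sketch_init_context(SketchPruneContext *ctx, int nsteps)
{
    ctx->nsteps = nsteps;
    ctx->is_valid = true;
    sketch_init_calls++;
}

/*
 * Stand-in for the pruning routine: initialize the context on first use
 * instead of eagerly at node-initialization time, then use it.
 */
int
sketch_find_matching(SketchPruneContext *ctx, int nsteps)
{
    if (!ctx->is_valid)
        sketch_init_context(ctx, nsteps);

    return ctx->nsteps;
}
```

A context that is never consulted is never built, and one that is consulted repeatedly is built exactly once.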
---
src/backend/executor/execPartition.c | 166 ++++++++++-----------------
src/include/executor/execPartition.h | 1 +
src/include/partitioning/partprune.h | 2 +
3 files changed, 65 insertions(+), 104 deletions(-)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 3c7c631867..d9fa593785 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -190,10 +190,8 @@ static void InitPartitionPruneContext(PartitionPruneContext *context,
static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
Bitmapset *initially_valid_subplans,
int n_total_subplans);
-static void PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
- PartitionPruneState *prunestate,
- PlanState *planstate);
-static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
+static void find_matching_subplans_recurse(PlanState *parent_plan,
+ PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
Bitmapset **validsubplans);
@@ -1830,13 +1828,14 @@ ExecInitPartitionPruning(PlanState *planstate,
/*
* ExecDoInitialPruning() must have initialized the PartitionPruneState to
- * perform the initial pruning. Now we simply need to initialize the
- * context information for exec pruning.
+ * perform the initial pruning. Store PlanState so that the exec_context
+ * can be initialized using it later when find_matching_subplans_recurse()
+ * needs it.
*/
prunestate = list_nth(estate->es_part_prune_states, part_prune_index);
Assert(prunestate != NULL);
if (prunestate->do_exec_prune)
- PartitionPruneInitExecPruning(pruneinfo, prunestate, planstate);
+ prunestate->parent_plan = planstate;
/* Use the result of initial pruning done by ExecDoInitialPruning(). */
if (prunestate->do_initial_prune)
@@ -1893,8 +1892,7 @@ ExecInitPartitionPruning(PlanState *planstate,
* each PartitionedRelPruningData) for initial pruning here. Execution pruning
* requires access to the parent plan node's PlanState, which is not available
* when this function is called from ExecDoInitialPruning(), so it is
- * initialized later during ExecInitPartitionPruning() by calling
- * PartitionPruneInitExecPruning().
+ * initialized lazily during find_matching_subplans_recurse().
*/
PartitionPruneState *
ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
@@ -2099,25 +2097,30 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
}
/*
- * The exec pruning context will be initialized in
- * ExecInitPartitionPruning() when called during the initialization
- * of the parent plan node.
+ * The exec pruning context will be initialized lazily when it
+ * will be used for the first time in
+ * find_matching_subplans_recurse().
*
- * pprune->exec_pruning_steps is set to NIL to prevent
- * ExecFindMatchingSubPlans() from accessing an uninitialized
- * pprune->exec_context during the initial pruning by
- * ExecDoInitialPruning().
- *
- * prunestate->do_exec_prune is set to indicate whether
- * PartitionPruneInitExecPruning() needs to be called by
- * ExecInitPartitionPruning(). This optimization avoids
- * unnecessary cycles when only initial pruning is required.
+ * prunestate->do_exec_prune is set to indicate whether exec
+ * pruning will actually be performed. This tells
+ * ExecInitPartitionPruning() whether it should fix the
+ * subplan_map array based on the result of initial pruning,
+ * and lets the parent node's code set up its data structures
+ * accordingly.
*/
- pprune->exec_pruning_steps = NIL;
+ pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
+ pprune->exec_context.is_valid = false;
if (pinfo->exec_pruning_steps &&
!(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
prunestate->do_exec_prune = true;
+ /*
+ * Accumulate the IDs of all PARAM_EXEC Params affecting the
+ * partitioning decisions at this plan node.
+ */
+ prunestate->execparamids = bms_add_members(prunestate->execparamids,
+ pinfo->execparamids);
+
j++;
}
i++;
@@ -2208,6 +2211,8 @@ InitPartitionPruneContext(PartitionPruneContext *context,
}
}
}
+
+ context->is_valid = true;
}
/*
@@ -2326,84 +2331,6 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
pfree(new_subplan_indexes);
}
-/*
- * PartitionPruneInitExecPruning
- * Initialize PartitionPruneState for exec pruning.
- */
-static void
-PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
- PartitionPruneState *prunestate,
- PlanState *planstate)
-{
- EState *estate = planstate->state;
- int i;
- ExprContext *econtext;
-
- /* CreatePartitionPruneState() must have initialized. */
- Assert(estate->es_partition_directory != NULL);
-
- /* CreatePartitionPruneState() must have set this. */
- Assert(prunestate->do_exec_prune);
-
- /*
- * Create ExprContext if not already done for the planstate. We may need
- * an expression context to evaluate partition exprs.
- */
- ExecAssignExprContext(estate, planstate);
- econtext = planstate->ps_ExprContext;
- for (i = 0; i < prunestate->num_partprunedata; i++)
- {
- List *partrel_pruneinfos =
- list_nth_node(List, pruneinfo->prune_infos, i);
- PartitionPruningData *prunedata = prunestate->partprunedata[i];
- int j;
-
- for (j = 0; j < prunedata->num_partrelprunedata; j++)
- {
- PartitionedRelPruneInfo *pinfo =
- list_nth_node(PartitionedRelPruneInfo, partrel_pruneinfos, j);
- PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
- Relation partrel = pprune->partrel;
- PartitionDesc partdesc;
- PartitionKey partkey;
-
- /*
- * Nothing to do if there are no exec pruning steps, but do set
- * pprune->exec_pruning_steps, because
- * find_matching_subplans_recurse() looks at it.
- *
- * Also skip if doing EXPLAIN (GENERIC_PLAN), since parameter
- * values may be missing.
- */
- pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pprune->exec_pruning_steps == NIL ||
- (econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
- continue;
-
- /*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
- */
- partkey = RelationGetPartitionKey(partrel);
- partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
- partrel);
- InitPartitionPruneContext(&pprune->exec_context,
- pprune->exec_pruning_steps,
- partdesc, partkey, planstate,
- econtext);
-
- /*
- * Accumulate the IDs of all PARAM_EXEC Params affecting the
- * partitioning decisions at this plan node.
- */
- prunestate->execparamids = bms_add_members(prunestate->execparamids,
- pinfo->execparamids);
- }
- }
-}
-
/*
* ExecFindMatchingSubPlans
* Determine which subplans match the pruning steps detailed in
@@ -2449,12 +2376,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* recursing to other (lower-level) parents as needed.
*/
pprune = &prunedata->partrelprunedata[0];
- find_matching_subplans_recurse(prunedata, pprune, initial_prune,
+ find_matching_subplans_recurse(prunestate->parent_plan,
+ prunedata, pprune, initial_prune,
&result);
/* Expression eval may have used space in ExprContext too */
- if (pprune->exec_pruning_steps)
+ if (pprune->exec_context.is_valid)
+ {
+ Assert(pprune->exec_pruning_steps != NIL);
ResetExprContext(pprune->exec_context.exprcontext);
+ }
}
/* Add in any subplans that partition pruning didn't account for */
@@ -2477,7 +2408,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* Adds valid (non-prunable) subplan IDs to *validsubplans
*/
static void
-find_matching_subplans_recurse(PartitionPruningData *prunedata,
+find_matching_subplans_recurse(PlanState *parent_plan,
+ PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
Bitmapset **validsubplans)
@@ -2497,8 +2429,33 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
partset = get_matching_partitions(&pprune->initial_context,
pprune->initial_pruning_steps);
else if (!initial_prune && pprune->exec_pruning_steps)
+ {
+ /* Initialize exec_context if not already done. */
+ if (unlikely(!pprune->exec_context.is_valid))
+ {
+ ExprContext *econtext;
+ EState *estate = parent_plan->state;
+ /* Must allocate the needed stuff in the query lifetime context. */
+ MemoryContext oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+ Relation partrel = pprune->partrel;
+ PartitionKey partkey = RelationGetPartitionKey(partrel);
+ PartitionDesc partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
+ partrel);
+
+ if (parent_plan->ps_ExprContext == NULL)
+ ExecAssignExprContext(estate, parent_plan);
+ econtext = parent_plan->ps_ExprContext;
+
+ InitPartitionPruneContext(&pprune->exec_context,
+ pprune->exec_pruning_steps,
+ partdesc, partkey, parent_plan,
+ econtext);
+
+ MemoryContextSwitchTo(oldcxt);
+ }
partset = get_matching_partitions(&pprune->exec_context,
pprune->exec_pruning_steps);
+ }
else
partset = pprune->present_parts;
@@ -2514,7 +2471,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
int partidx = pprune->subpart_map[i];
if (partidx >= 0)
- find_matching_subplans_recurse(prunedata,
+ find_matching_subplans_recurse(parent_plan,
+ prunedata,
&prunedata->partrelprunedata[partidx],
initial_prune, validsubplans);
else
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 2f45ac1cc8..ef6d8b2d48 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -122,6 +122,7 @@ typedef struct PartitionPruneState
bool do_initial_prune;
bool do_exec_prune;
int num_partprunedata;
+ PlanState *parent_plan;
PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
} PartitionPruneState;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index c536a1fe19..b7f48eefcc 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -26,6 +26,7 @@ struct RelOptInfo;
* Stores information needed at runtime for pruning computations
* related to a single partitioned table.
*
+ * is_valid Has the information in this struct been initialized?
* strategy Partition strategy, e.g. LIST, RANGE, HASH.
* partnatts Number of columns in the partition key.
* nparts Number of partitions in this partitioned table.
@@ -48,6 +49,7 @@ struct RelOptInfo;
*/
typedef struct PartitionPruneContext
{
+ bool is_valid;
char strategy;
int partnatts;
int nparts;
--
2.43.0
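The lazy initialization in the patch above (an is_valid flag checked in find_matching_subplans_recurse() and again before ResetExprContext()) can be sketched in isolation as follows. This is only a toy model, not PostgreSQL code: ToyPruneContext, toy_init_context(), toy_exec_prune() and toy_reset_if_valid() are hypothetical names invented for illustration, and init_count exists only to make the "initialized once" property observable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for PartitionPruneContext; not the real struct. */
typedef struct ToyPruneContext
{
    bool is_valid;      /* has toy_init_context() run yet? */
    int  nparts;        /* filled in at initialization time */
    int  init_count;    /* number of initializations, for demonstration */
} ToyPruneContext;

/* Potentially expensive setup; marks the context valid as its last step. */
static void
toy_init_context(ToyPruneContext *ctx, int nparts)
{
    ctx->nparts = nparts;
    ctx->init_count++;
    ctx->is_valid = true;
}

/* "Exec pruning" entry point: initializes the context on first use only. */
static int
toy_exec_prune(ToyPruneContext *ctx, int nparts)
{
    if (!ctx->is_valid)
        toy_init_context(ctx, nparts);
    return ctx->nparts;
}

/* Cleanup must touch the context only if it was actually initialized. */
static bool
toy_reset_if_valid(const ToyPruneContext *ctx)
{
    return ctx->is_valid;
}
```

Starting from a zeroed ToyPruneContext, calling toy_exec_prune() twice should run the setup once, and toy_reset_if_valid() should report false until the first call has happened. That is the property the patch relies on when it replaces the exec_pruning_steps test with an is_valid test.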
[application/octet-stream] v55-0002-Perform-runtime-initial-pruning-outside-ExecInit.patch (17.3K, 4-v55-0002-Perform-runtime-initial-pruning-outside-ExecInit.patch)
From 808126517d4b0018ee96de1ba28ea664566fd1aa Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 12 Sep 2024 15:44:43 +0900
Subject: [PATCH v55 2/5] Perform runtime initial pruning outside
ExecInitNode()
This commit follows up on the previous change that moved
PartitionPruneInfos out of individual plan nodes into a list in
PlannedStmt. It moves the initialization of PartitionPruneStates
and runtime initial pruning out of ExecInitNode() and into a new
routine, ExecDoInitialPruning(), which is called by InitPlan()
before ExecInitNode() is invoked on the main plan tree and subplans.
ExecDoInitialPruning() stores the PartitionPruneStates that it creates
for initial pruning, which are reused later for exec pruning, in a
list matching the length of es_part_prune_infos (which holds the
PartitionPruneInfos from PlannedStmt), allowing both lists to share
the same index. It also saves the initial pruning result -- a
bitmapset of indexes for surviving child subnodes -- in a similarly
indexed list.
While the initial pruning is done earlier, the execution pruning
context information (needed for runtime pruning) is initialized
later during ExecInitNode() for the parent plan node, as it requires
access to the parent node's PlanState struct.
---
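To illustrate the index-aligned lists described in the commit message, here is a minimal standalone sketch. It assumes a toy model in which a bitmask stands in for a Bitmapset and fixed-size arrays stand in for the EState lists; ToyPruneInfo, ToyEState, toy_do_initial_pruning() and toy_init_pruning() are hypothetical names, not functions from the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_MAX_INFOS 8

/* Hypothetical stand-in for PartitionPruneInfo. */
typedef struct ToyPruneInfo
{
    bool     has_initial_steps;   /* are there initial pruning steps? */
    unsigned surviving_mask;      /* pretend outcome of running them */
} ToyPruneInfo;

/* Hypothetical stand-in for the relevant EState fields. */
typedef struct ToyEState
{
    size_t              ninfos;
    const ToyPruneInfo *infos;                        /* like es_part_prune_infos */
    bool                state_created[TOY_MAX_INFOS]; /* like es_part_prune_states */
    bool                has_result[TOY_MAX_INFOS];    /* false plays the role of NULL */
    unsigned            results[TOY_MAX_INFOS];       /* like es_part_prune_results */
} ToyEState;

/* One state and one result slot per info, so all three share an index. */
static void
toy_do_initial_pruning(ToyEState *estate)
{
    for (size_t i = 0; i < estate->ninfos && i < TOY_MAX_INFOS; i++)
    {
        const ToyPruneInfo *info = &estate->infos[i];

        estate->state_created[i] = true;
        /* Store a placeholder even when no initial pruning is done. */
        estate->has_result[i] = info->has_initial_steps;
        estate->results[i] = info->has_initial_steps ? info->surviving_mask : 0;
    }
}

/* Node initialization looks up its result using the same index. */
static unsigned
toy_init_pruning(const ToyEState *estate, size_t part_prune_index,
                 unsigned all_subplans_mask)
{
    assert(part_prune_index < estate->ninfos);
    assert(estate->state_created[part_prune_index]);

    if (estate->has_result[part_prune_index])
        return estate->results[part_prune_index];
    return all_subplans_mask;   /* no pruning: initialize every subplan */
}
```

With two infos where only the first has initial steps, toy_init_pruning() would return the stored mask for index 0 and the full subplan mask for index 1. The placeholder slot is what keeps the lookup by part_prune_index valid in both cases.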
src/backend/executor/execMain.c | 55 ++++++++
src/backend/executor/execPartition.c | 179 +++++++++++++++++++++------
src/include/executor/execPartition.h | 6 +
src/include/nodes/execnodes.h | 2 +
4 files changed, 202 insertions(+), 40 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index e6197c165e..1994112b2e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -46,6 +46,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/executor.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "mb/pg_wchar.h"
@@ -818,6 +819,54 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
PreventCommandIfParallelMode(CreateCommandName((Node *) plannedstmt));
}
+/*
+ * ExecDoInitialPruning
+ * Perform runtime "initial" pruning, if necessary, to determine the set
+ * of child subnodes that need to be initialized during ExecInitNode()
+ * for plan nodes that support partition pruning.
+ *
+ * For each PartitionPruneInfo in estate->es_part_prune_infos, this function
+ * creates a PartitionPruneState (even if no initial pruning is done) and adds
+ * it to es_part_prune_states. For PartitionPruneInfo entries that include
+ * initial pruning steps, the result of those steps is saved as a bitmapset
+ * of indexes representing child subnodes that are "valid" and should be
+ * initialized for execution.
+ */
+static void
+ExecDoInitialPruning(EState *estate)
+{
+ ListCell *lc;
+
+ foreach(lc, estate->es_part_prune_infos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+ PartitionPruneState *prunestate;
+ Bitmapset *validsubplans = NULL;
+
+ /*
+ * Create the working data structure for pruning, and save it for use
+ * later in ExecInitPartitionPruning(), which will be called by the
+ * parent plan node's ExecInit* function.
+ */
+ prunestate = ExecCreatePartitionPruneState(estate, pruneinfo);
+ estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+ prunestate);
+
+ /*
+ * Perform an initial partition pruning pass, if necessary, and save
+ * the bitmapset of valid subplans for use in
+ * ExecInitPartitionPruning(). If no initial pruning is performed, we
+ * still store a NULL to ensure that es_part_prune_results is the same
+ * length as es_part_prune_infos. This ensures that
+ * ExecInitPartitionPruning() can use the same index to locate the
+ * result.
+ */
+ if (prunestate->do_initial_prune)
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ estate->es_part_prune_results = lappend(estate->es_part_prune_results,
+ validsubplans);
+ }
+}
/* ----------------------------------------------------------------
* InitPlan
@@ -850,7 +899,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+
+ /*
+ * Perform runtime "initial" pruning to determine the plan nodes that will
+ * not be executed.
+ */
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ ExecDoInitialPruning(estate);
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index ec730674f2..3c7c631867 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -181,8 +181,6 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
-static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
- PartitionPruneInfo *pruneinfo);
static void InitPartitionPruneContext(PartitionPruneContext *context,
List *pruning_steps,
PartitionDesc partdesc,
@@ -192,6 +190,9 @@ static void InitPartitionPruneContext(PartitionPruneContext *context,
static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
Bitmapset *initially_valid_subplans,
int n_total_subplans);
+static void PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
+ PartitionPruneState *prunestate,
+ PlanState *planstate);
static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
@@ -1783,20 +1784,26 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
/*
* ExecInitPartitionPruning
- * Initialize data structure needed for run-time partition pruning and
- * do initial pruning if needed
+ * Initialize the data structures needed for runtime "exec" partition
+ * pruning and return the result of initial pruning, if available.
*
* 'root_parent_relids' identifies the relation to which both the parent plan
- * and the PartitionPruneInfo given by 'part_prune_index' belong.
+ * and the PartitionPruneInfo associated with 'part_prune_index' belong.
*
- * On return, *initially_valid_subplans is assigned the set of indexes of
- * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * The PartitionPruneState would have been created by ExecDoInitialPruning()
+ * and stored as the part_prune_index'th element of EState.es_part_prune_states.
+ * Here, we initialize only the PartitionPruneContext necessary for execution
+ * pruning.
*
- * If subplans are indeed pruned, subplan_map arrays contained in the returned
- * PartitionPruneState are re-sequenced to not count those, though only if the
- * maps will be needed for subsequent execution pruning passes.
+ * On return, *initially_valid_subplans is assigned the set of indexes of child
+ * subplans that must be initialized alongside the parent plan node. Initial
+ * pruning would have been performed by ExecDoInitialPruning() if necessary,
+ * and the bitmapset of surviving subplans' indexes would have been stored as
+ * the part_prune_index'th element of EState.es_part_prune_results.
+ *
+ * If subplans are pruned, the subplan_map arrays in the returned
+ * PartitionPruneState are re-sequenced to exclude those subplans, but only if
+ * the maps will be needed for subsequent execution pruning passes.
*/
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
@@ -1821,17 +1828,21 @@ ExecInitPartitionPruning(PlanState *planstate,
bmsToString(root_parent_relids),
bmsToString(pruneinfo->root_parent_relids)));
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
-
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
-
/*
- * Perform an initial partition prune pass, if required.
+ * ExecDoInitialPruning() must have initialized the PartitionPruneState to
+ * perform the initial pruning. Now we simply need to initialize the
+ * context information for exec pruning.
*/
+ prunestate = list_nth(estate->es_part_prune_states, part_prune_index);
+ Assert(prunestate != NULL);
+ if (prunestate->do_exec_prune)
+ PartitionPruneInitExecPruning(pruneinfo, prunestate, planstate);
+
+ /* Use the result of initial pruning done by ExecDoInitialPruning(). */
if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ *initially_valid_subplans = list_nth_node(Bitmapset,
+ estate->es_part_prune_results,
+ part_prune_index);
else
{
/* No pruning, so we'll need to initialize all subplans */
@@ -1877,16 +1888,23 @@ ExecInitPartitionPruning(PlanState *planstate,
* stored in each PartitionedRelPruningData can be re-used each time we
* re-evaluate which partitions match the pruning steps provided in each
* PartitionedRelPruneInfo.
+ *
+ * Note that we only initialize the PartitionPruneContext (which is placed into
+ * each PartitionedRelPruningData) for initial pruning here. Execution pruning
+ * requires access to the parent plan node's PlanState, which is not available
+ * when this function is called from ExecDoInitialPruning(), so it is
+ * initialized later during ExecInitPartitionPruning() by calling
+ * PartitionPruneInitExecPruning().
*/
-static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+PartitionPruneState *
+ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
{
- EState *estate = planstate->state;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
+ /* We may need an expression context to evaluate partition exprs */
+ ExprContext *econtext = CreateExprContext(estate);
/* For data reading, executor always includes detached partitions */
if (estate->es_partition_directory == NULL)
@@ -1974,6 +1992,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* set to -1, as if they were pruned. By construction, both
* arrays are in partition bounds order.
*/
+ pprune->partrel = partrel;
pprune->nparts = partdesc->nparts;
pprune->subplan_map = palloc(sizeof(int) * partdesc->nparts);
@@ -2073,29 +2092,31 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
{
InitPartitionPruneContext(&pprune->initial_context,
pinfo->initial_pruning_steps,
- partdesc, partkey, planstate,
+ partdesc, partkey, NULL,
econtext);
/* Record whether initial pruning is needed at any level */
prunestate->do_initial_prune = true;
}
- pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
- if (pinfo->exec_pruning_steps &&
- !(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
- {
- InitPartitionPruneContext(&pprune->exec_context,
- pinfo->exec_pruning_steps,
- partdesc, partkey, planstate,
- econtext);
- /* Record whether exec pruning is needed at any level */
- prunestate->do_exec_prune = true;
- }
/*
- * Accumulate the IDs of all PARAM_EXEC Params affecting the
- * partitioning decisions at this plan node.
+ * The exec pruning context will be initialized in
+ * ExecInitPartitionPruning() when called during the initialization
+ * of the parent plan node.
+ *
+ * pprune->exec_pruning_steps is set to NIL to prevent
+ * ExecFindMatchingSubPlans() from accessing an uninitialized
+ * pprune->exec_context during the initial pruning by
+ * ExecDoInitialPruning().
+ *
+ * prunestate->do_exec_prune is set to indicate whether
+ * PartitionPruneInitExecPruning() needs to be called by
+ * ExecInitPartitionPruning(). This optimization avoids
+ * unnecessary cycles when only initial pruning is required.
*/
- prunestate->execparamids = bms_add_members(prunestate->execparamids,
- pinfo->execparamids);
+ pprune->exec_pruning_steps = NIL;
+ if (pinfo->exec_pruning_steps &&
+ !(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
+ prunestate->do_exec_prune = true;
j++;
}
@@ -2305,6 +2326,84 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
pfree(new_subplan_indexes);
}
+/*
+ * PartitionPruneInitExecPruning
+ * Initialize PartitionPruneState for exec pruning.
+ */
+static void
+PartitionPruneInitExecPruning(PartitionPruneInfo *pruneinfo,
+ PartitionPruneState *prunestate,
+ PlanState *planstate)
+{
+ EState *estate = planstate->state;
+ int i;
+ ExprContext *econtext;
+
+ /* CreatePartitionPruneState() must have initialized. */
+ Assert(estate->es_partition_directory != NULL);
+
+ /* CreatePartitionPruneState() must have set this. */
+ Assert(prunestate->do_exec_prune);
+
+ /*
+ * Create ExprContext if not already done for the planstate. We may need
+ * an expression context to evaluate partition exprs.
+ */
+ ExecAssignExprContext(estate, planstate);
+ econtext = planstate->ps_ExprContext;
+ for (i = 0; i < prunestate->num_partprunedata; i++)
+ {
+ List *partrel_pruneinfos =
+ list_nth_node(List, pruneinfo->prune_infos, i);
+ PartitionPruningData *prunedata = prunestate->partprunedata[i];
+ int j;
+
+ for (j = 0; j < prunedata->num_partrelprunedata; j++)
+ {
+ PartitionedRelPruneInfo *pinfo =
+ list_nth_node(PartitionedRelPruneInfo, partrel_pruneinfos, j);
+ PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
+ Relation partrel = pprune->partrel;
+ PartitionDesc partdesc;
+ PartitionKey partkey;
+
+ /*
+ * Nothing to do if there are no exec pruning steps, but do set
+ * pprune->exec_pruning_steps, because
+ * find_matching_subplans_recurse() looks at it.
+ *
+ * Also skip if doing EXPLAIN (GENERIC_PLAN), since parameter
+ * values may be missing.
+ */
+ pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
+ if (pprune->exec_pruning_steps == NIL ||
+ (econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
+ continue;
+
+ /*
+ * We can rely on the copies of the partitioned table's partition
+ * key and partition descriptor appearing in its relcache entry,
+ * because that entry will be held open and locked for the
+ * duration of this executor run.
+ */
+ partkey = RelationGetPartitionKey(partrel);
+ partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
+ partrel);
+ InitPartitionPruneContext(&pprune->exec_context,
+ pprune->exec_pruning_steps,
+ partdesc, partkey, planstate,
+ econtext);
+
+ /*
+ * Accumulate the IDs of all PARAM_EXEC Params affecting the
+ * partitioning decisions at this plan node.
+ */
+ prunestate->execparamids = bms_add_members(prunestate->execparamids,
+ pinfo->execparamids);
+ }
+ }
+}
+
/*
* ExecFindMatchingSubPlans
* Determine which subplans match the pruning steps detailed in
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 12aacc84ff..2f45ac1cc8 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -42,6 +42,9 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* PartitionedRelPruneInfo (see plannodes.h); though note that here,
* subpart_map contains indexes into PartitionPruningData.partrelprunedata[].
*
+ * partrel Partitioned table; points to
+ * EState.es_relations[rti-1], where rti is the
+ * table's RT index
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
@@ -58,6 +61,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
*/
typedef struct PartitionedRelPruningData
{
+ Relation partrel;
int nparts;
int *subplan_map;
int *subpart_map;
@@ -128,4 +132,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
+extern PartitionPruneState *ExecCreatePartitionPruneState(EState *estate,
+ PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 22b928e085..518a9fcd15 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -637,6 +637,8 @@ typedef struct EState
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_states; /* List of PartitionPruneState */
+ List *es_part_prune_results; /* List of Bitmapset */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
--
2.43.0
[application/octet-stream] v55-0004-Defer-locking-of-runtime-prunable-relations-to-e.patch (45.1K, 5-v55-0004-Defer-locking-of-runtime-prunable-relations-to-e.patch)
From ad047f0bb7b703c0d2079464622588138e64b117 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 18 Sep 2024 12:00:41 +0900
Subject: [PATCH v55 4/5] Defer locking of runtime-prunable relations to
executor
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When preparing a cached plan for execution, plancache.c locks the
relations in the plan's range table to ensure they are safe for
execution. However, this approach, implemented in
AcquireExecutorLocks(), results in unnecessarily locking relations
that might be pruned during "initial" runtime pruning.
To optimize this, locking is now deferred for relations subject to
"initial" runtime pruning. The planner now provides a set of
"unprunable" relations through the new PlannedStmt.unprunableRelids
field. AcquireExecutorLocks() will only lock these unprunable
relations. PlannedStmt.unprunableRelids is populated by subtracting
the set of initially prunable relids from all RT indexes. The prunable
relids are identified by examining all PartitionPruneInfos during
set_plan_refs() and storing the RT indexes of partitions subject to
"initial" pruning steps. While at it, some duplicated code in
set_append_references() and set_mergeappend_references() that
constructs the prunable relids set has been refactored into a common
function.
Deferred locks are taken, if necessary, after ExecDoInitialPruning()
determines the set of unpruned partitions. To allow the executor to
determine whether the plan tree it’s executing is cached and may
contain unlocked relations, the CachedPlan is now made available via
the QueryDesc. The executor can call CachedPlanRequiresLocking(),
which returns true if the CachedPlan is a reusable generic plan that
might contain unlocked relations.
Plan nodes like Append have already been updated to consider only the
set of unpruned relations. However, there are cases, such as child
RowMarks and child result relations, where the code manipulating them
does not directly receive information about unpruned partitions.
Therefore, code handling child RowMarks and result relations has been
modified to skip those that belong to pruned partitions. For this,
the RT indexes of unpruned partitions are added in
ExecDoInitialPruning() to es_unprunable_relids, which initially
contains PlannedStmt.unprunableRelids. The corresponding code now
processes only those child RowMarks and result relations whose owning
relations are in this set. For result relations managed by a
ModifyTable node, its resultRelations list is truncated in
ExecInitModifyTable to only consider unpruned relations and the
ResultRelInfo structs are created only for those.
Finally, an Assert has also been added in ExecCheckPermissions() to
ensure that all relations whose permissions are checked have been
properly locked, helping to catch any accidental omission of relations
from the unprunableRelids set that should have their permissions
checked.
This deferment introduces a window where prunable relations may be
altered by concurrent DDL, potentially causing the plan to become
invalid. Consequently, the executor might attempt to execute an
invalid plan, leading to errors such as ExecInitIndexScan() failing to
locate an unpruned partition's index that was dropped concurrently
(for example, if the index is partition-local rather than inherited).
Future commits will introduce changes to enable the
executor to check plan validity during ExecutorStart() and retry with
a newly created plan if the original becomes invalid after taking
deferred locks.
---
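The set arithmetic behind the deferred locking described above can be sketched as follows. This is a rough illustration only, assuming a 32-bit mask in place of a Bitmapset of RT indexes; ToyRelids, toy_unprunable_relids() and toy_locks_after_initial_pruning() are hypothetical names and do not appear in the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset of RT indexes: bit i = RT index i. */
typedef uint32_t ToyRelids;

/*
 * Planner side: everything not subject to initial pruning is "unprunable"
 * and would be locked up front when the cached plan is validated.
 */
static ToyRelids
toy_unprunable_relids(ToyRelids all_relids, ToyRelids prunable_relids)
{
    return all_relids & ~prunable_relids;
}

/*
 * Executor side: after initial pruning, the partitions that survived are
 * locked too, so the final lock set is the union of the two.
 */
static ToyRelids
toy_locks_after_initial_pruning(ToyRelids unprunable, ToyRelids surviving)
{
    return unprunable | surviving;
}
```

For example, with RT indexes 1 to 4 in the range table, 2 to 4 prunable, and only 3 surviving initial pruning, the final lock set would contain 1 and 3. Relations 2 and 4 are never locked, which is the saving the commit message describes.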
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +--
src/backend/commands/extension.c | 1 +
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 3 +-
src/backend/executor/execMain.c | 75 ++++++++++++++++++++++++--
src/backend/executor/execParallel.c | 9 +++-
src/backend/executor/execPartition.c | 36 ++++++++++---
src/backend/executor/functions.c | 1 +
src/backend/executor/nodeAppend.c | 8 +--
src/backend/executor/nodeLockRows.c | 10 +++-
src/backend/executor/nodeMergeAppend.c | 2 +-
src/backend/executor/nodeModifyTable.c | 38 ++++++++++---
src/backend/executor/spi.c | 1 +
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 7 +++
src/backend/partitioning/partprune.c | 18 +++++++
src/backend/tcop/pquery.c | 10 +++-
src/backend/utils/cache/plancache.c | 40 ++++++++------
src/include/commands/explain.h | 5 +-
src/include/executor/execPartition.h | 5 +-
src/include/executor/execdesc.h | 2 +
src/include/nodes/execnodes.h | 6 +++
src/include/nodes/pathnodes.h | 6 +++
src/include/nodes/plannodes.h | 7 +++
src/include/utils/plancache.h | 10 ++++
27 files changed, 263 insertions(+), 52 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 91de442f43..db976f928a 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -552,7 +552,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 0b629b1f79..57a3375cad 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -324,7 +324,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index aaec439892..49f7370734 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -509,7 +509,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -617,7 +617,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage,
@@ -673,7 +674,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, cplan, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index fab59ad5f6..bd169edeff 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -742,6 +742,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 010097873d..69be74b4bd 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -438,7 +438,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 07257d4db9..311b9ebd5b 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -655,7 +655,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
+ ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 1994112b2e..df1b5b2dc3 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -53,6 +53,7 @@
#include "miscadmin.h"
#include "parser/parse_relation.h"
#include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
#include "tcop/utility.h"
#include "utils/acl.h"
#include "utils/backend_status.h"
@@ -90,6 +91,7 @@ static bool ExecCheckPermissionsModified(Oid relOid, Oid userid,
AclMode requiredPerms);
static void ExecCheckXactReadOnly(PlannedStmt *plannedstmt);
static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static inline bool ExecShouldLockRelations(EState *estate);
/* end of local decls */
@@ -600,6 +602,21 @@ ExecCheckPermissions(List *rangeTable, List *rteperminfos,
(rte->rtekind == RTE_SUBQUERY &&
rte->relkind == RELKIND_VIEW));
+ /*
+ * Ensure that we have at least an AccessShareLock on relations
+ * whose permissions need to be checked.
+ *
+ * Skip this check in a parallel worker because locks won't be
+ * taken until ExecInitNode() performs plan initialization.
+ *
+ * XXX: ExecCheckPermissions() in a parallel worker may be
+ * redundant with the checks done in the leader process, so this
+ * should be reviewed to ensure it's necessary.
+ */
+ Assert(IsParallelWorker() ||
+ CheckRelationOidLockedByMe(rte->relid, AccessShareLock,
+ true));
+
(void) getRTEPermissionInfo(rteperminfos, rte);
/* Many-to-one mapping not allowed */
Assert(!bms_is_member(rte->perminfoindex, indexset));
@@ -862,12 +879,46 @@ ExecDoInitialPruning(EState *estate)
* result.
*/
if (prunestate->do_initial_prune)
- validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ {
+ Bitmapset *validsubplan_rtis = NULL;
+
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+ &validsubplan_rtis);
+ if (ExecShouldLockRelations(estate))
+ {
+ int rtindex = -1;
+
+ rtindex = -1;
+ while ((rtindex = bms_next_member(validsubplan_rtis,
+ rtindex)) >= 0)
+ {
+ RangeTblEntry *rte = exec_rt_fetch(rtindex, estate);
+
+ Assert(rte->rtekind == RTE_RELATION &&
+ rte->rellockmode != NoLock);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+ estate->es_unprunable_relids = bms_add_members(estate->es_unprunable_relids,
+ validsubplan_rtis);
+ }
+
estate->es_part_prune_results = lappend(estate->es_part_prune_results,
validsubplans);
}
}
+/*
+ * Locks might be needed only if running a cached plan that might contain
+ * unlocked relations, such as reused generic plans.
+ */
+static inline bool
+ExecShouldLockRelations(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? false :
+ CachedPlanRequiresLocking(estate->es_cachedplan);
+}
+
/* ----------------------------------------------------------------
* InitPlan
*
@@ -880,6 +931,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ CachedPlan *cachedplan = queryDesc->cplan;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -899,10 +951,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_cachedplan = cachedplan;
+ estate->es_unprunable_relids = bms_copy(plannedstmt->unprunableRelids);
/*
* Perform runtime "initial" pruning to determine the plan nodes that will
- * not be executed.
+ * not be executed. This will also add the RT indexes of surviving leaf
+ * partitions to es_unprunable_relids.
*/
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
ExecDoInitialPruning(estate);
@@ -921,8 +976,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
Relation relation;
ExecRowMark *erm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unprunable_relids))
continue;
/* get relation's OID (will produce InvalidOid if subquery) */
@@ -2959,6 +3019,13 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
}
}
+ /*
+ * Copy es_unprunable_relids so that RowMarks of pruned relations are
+ * ignored in ExecInitLockRows() and ExecInitModifyTable() when
+ * initializing the plan trees below.
+ */
+ rcestate->es_unprunable_relids = parentestate->es_unprunable_relids;
+
/*
* Initialize private state information for each SubPlan. We must do this
* before running ExecInitNode on the main query tree, since
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index b01a2fdfdd..7519c9a860 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1257,8 +1257,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
- /* Create a QueryDesc for the query. */
+ /*
+ * Create a QueryDesc for the query. We pass NULL for cachedplan, because
+ * we don't have a pointer to the CachedPlan in the leader's process. It's
+ * fine because the only reason the executor needs to see it is to decide
+ * if it should take locks on certain relations, but parallel workers
+ * always take locks anyway.
+ */
return CreateQueryDesc(pstmt,
+ NULL,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index d9fa593785..551e0ce9b2 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -26,6 +26,7 @@
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
+#include "storage/lmgr.h"
#include "utils/acl.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
@@ -194,7 +195,8 @@ static void find_matching_subplans_recurse(PlanState *parent_plan,
PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **validsubplan_rtis);
/*
@@ -1978,8 +1980,8 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
* The set of partitions that exist now might not be the same that
* existed when the plan was made. The normal case is that it is;
* optimize for that case with a quick comparison, and just copy
- * the subplan_map and make subpart_map point to the one in
- * PruneInfo.
+ * the subplan_map and make subpart_map and rti_map point to the
+ * ones in PruneInfo.
*
* For the case where they aren't identical, we could have more
* partitions on either side; or even exactly the same number of
@@ -1999,6 +2001,7 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
sizeof(int) * partdesc->nparts) == 0)
{
pprune->subpart_map = pinfo->subpart_map;
+ pprune->rti_map = pinfo->rti_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
}
@@ -2019,6 +2022,7 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
* mismatches.
*/
pprune->subpart_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(int) * partdesc->nparts);
for (pp_idx = 0; pp_idx < partdesc->nparts; pp_idx++)
{
@@ -2036,6 +2040,8 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
continue;
}
@@ -2073,6 +2079,7 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map[pp_idx] = -1;
pprune->subplan_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -2339,10 +2346,13 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * validsubplan_rtis must be non-NULL if initial_prune is true.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **validsubplan_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2378,7 +2388,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunestate->parent_plan,
prunedata, pprune, initial_prune,
- &result);
+ &result, validsubplan_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_context.is_valid)
@@ -2395,6 +2405,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (validsubplan_rtis)
+ *validsubplan_rtis = bms_copy(*validsubplan_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2405,14 +2417,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and the RT indexes
+ * of their owning leaf partitions to *validsubplan_rtis if it's non-NULL.
*/
static void
find_matching_subplans_recurse(PlanState *parent_plan,
PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **validsubplan_rtis)
{
Bitmapset *partset;
int i;
@@ -2464,8 +2478,13 @@ find_matching_subplans_recurse(PlanState *parent_plan,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ if (validsubplan_rtis)
+ *validsubplan_rtis = bms_add_member(*validsubplan_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2474,7 +2493,8 @@ find_matching_subplans_recurse(PlanState *parent_plan,
find_matching_subplans_recurse(parent_plan,
prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ validsubplan_rtis);
else
{
/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 692854e2b3..6f6f45e0ad 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -840,6 +840,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
dest = None_Receiver;
es->qd = CreateQueryDesc(es->stmt,
+ NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index de7ebab5c2..006bdafaea 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -581,7 +581,7 @@ choose_next_subplan_locally(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
}
@@ -648,7 +648,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
/*
@@ -724,7 +724,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
mark_invalid_subplans_as_finished(node);
@@ -877,7 +877,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
classify_matching_subplans(node);
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 41754ddfea..b5b2cd53c5 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -28,6 +28,7 @@
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "utils/rel.h"
+#include "utils/lsyscache.h"
/* ----------------------------------------------------------------
@@ -347,8 +348,13 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
ExecRowMark *erm;
ExecAuxRowMark *aerm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unprunable_relids))
continue;
/* find ExecRowMark and build ExecAuxRowMark */
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 3ed91808dd..f7821aa178 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -219,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 8bf4c80d4a..3c02782445 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -4176,12 +4176,17 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
hash_search(node->mt_resultOidHash, &resultoid, HASH_FIND, NULL);
if (mtlookup)
{
+ ResultRelInfo *resultRelInfo;
+
if (update_cache)
{
node->mt_lastResultOid = resultoid;
node->mt_lastResultIndex = mtlookup->relationIndex;
}
- return node->resultRelInfo + mtlookup->relationIndex;
+
+ resultRelInfo = node->resultRelInfo + mtlookup->relationIndex;
+
+ return resultRelInfo;
}
}
else
@@ -4218,7 +4223,8 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
ModifyTableState *mtstate;
Plan *subplan = outerPlan(node);
CmdType operation = node->operation;
- int nrels = list_length(node->resultRelations);
+ int nrels;
+ List *resultRelations = NIL;
ResultRelInfo *resultRelInfo;
List *arowmarks;
ListCell *l;
@@ -4228,6 +4234,20 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/* check for unsupported flags */
Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
+ /*
+ * Only consider unpruned relations. In the future, it might be more
+ * efficient to store resultRelations as a bitmapset, which would make
+ * this operation cheaper.
+ */
+ foreach(l, node->resultRelations)
+ {
+ Index rti = lfirst_int(l);
+
+ if (bms_is_member(rti, estate->es_unprunable_relids))
+ resultRelations = lappend_int(resultRelations, rti);
+ }
+ nrels = list_length(resultRelations);
+
/*
* create state structure
*/
@@ -4265,6 +4285,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
*/
if (node->rootRelation > 0)
{
+ Assert(bms_is_member(node->rootRelation, estate->es_unprunable_relids));
mtstate->rootResultRelInfo = makeNode(ResultRelInfo);
ExecInitResultRelation(estate, mtstate->rootResultRelInfo,
node->rootRelation);
@@ -4279,7 +4300,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/* set up epqstate with dummy subplan data for the moment */
EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL,
- node->epqParam, node->resultRelations);
+ node->epqParam, resultRelations);
mtstate->fireBSTriggers = true;
/*
@@ -4297,7 +4318,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
*/
resultRelInfo = mtstate->resultRelInfo;
i = 0;
- foreach(l, node->resultRelations)
+ foreach(l, resultRelations)
{
Index resultRelation = lfirst_int(l);
List *mergeActions = NIL;
@@ -4589,8 +4610,13 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
ExecRowMark *erm;
ExecAuxRowMark *aerm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unprunable_relids))
continue;
/* Find ExecRowMark and build ExecAuxRowMark */
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 90d9834576..659bd6dcd9 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2684,6 +2684,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
snap = InvalidSnapshot;
qdesc = CreateQueryDesc(stmt,
+ cplan,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 1b9071c774..9e47a7fd50 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -549,6 +549,8 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
+ result->unprunableRelids = bms_difference(bms_add_range(NULL, 1, list_length(result->rtable)),
+ glob->prunableRelids);
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index e2ea406c4e..283a61a972 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1764,8 +1764,15 @@ register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+ int i;
prelinfo->rtindex += rtoffset;
+ for (i = 0; i < prelinfo->nparts; i++)
+ {
+ prelinfo->rti_map[i] += rtoffset;
+ glob->prunableRelids = bms_add_member(glob->prunableRelids,
+ prelinfo->rti_map[i]);
+ }
}
}
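For readers following the planner.c and setrefs.c hunks above: the set arithmetic is "every RT index from 1 to list_length(rtable), minus the RT indexes collected in glob->prunableRelids". The fragment below is only a toy sketch of that computation, not code from the patch. It assumes a 64-bit mask as a stand-in for Bitmapset, and the names RelidSet, relids_add_range and compute_unprunable are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Toy stand-in for Bitmapset (an assumption for illustration only):
 * bit i set means RT index i is a member.  Limited to indexes 0..63.
 */
typedef uint64_t RelidSet;

/* Add all members in [lower, upper], mirroring what bms_add_range() does. */
static RelidSet
relids_add_range(RelidSet set, int lower, int upper)
{
	assert(lower >= 0 && upper < 64);
	for (int i = lower; i <= upper; i++)
		set |= (RelidSet) 1 << i;
	return set;
}

/*
 * unprunable = {1 .. rtable_length} minus prunable.
 * RT indexes are 1-based, so bit 0 is never set.
 */
static RelidSet
compute_unprunable(int rtable_length, RelidSet prunable)
{
	RelidSet	all = relids_add_range(0, 1, rtable_length);

	return all & ~prunable;
}
```

With a five-entry range table in which RT indexes 3 and 4 are leaf partitions subject to initial pruning, the result contains RT indexes 1, 2 and 5.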
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 60fabb1734..85894c87af 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -645,6 +645,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ int *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -657,6 +658,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (int *) palloc0(nparts * sizeof(int));
present_parts = NULL;
i = -1;
@@ -671,9 +673,24 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+
+ /*
+ * Track the RT indexes of partitions to ensure they are included
+ * in the prunableRelids set of relations that are locked during
+ * execution. This ensures that if the plan is cached, these
+ * partitions are locked when the plan is reused.
+ *
+ * Partitions without a subplan and sub-partitioned partitions
+ * where none of the sub-partitions have a subplan due to
+ * constraint exclusion are not included in this set. Instead,
+ * they are added to the unprunableRelids set, and the relations
+ * in this set are locked by AcquireExecutorLocks() before
+ * executing a cached plan.
+ */
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
+ rti_map[i] = (int) partrel->relid;
/* Record finding this subplan */
subplansfound = bms_add_member(subplansfound, subplanidx);
@@ -695,6 +712,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a1f8d03db1..6e8f6b1b8f 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -36,6 +36,7 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +66,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +79,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * cplan: CachedPlan supplying the plan
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, cplan, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -493,6 +498,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->cplan,
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1276,6 +1282,7 @@ PortalRunMulti(Portal portal,
{
/* statement can set tag string */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1285,6 +1292,7 @@ PortalRunMulti(Portal portal,
{
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5af1a168ec..5b75dadf13 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -104,7 +104,8 @@ static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
static bool CheckCachedPlan(CachedPlanSource *plansource);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
@@ -815,8 +816,11 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
* Caller must have already called RevalidateCachedQuery to verify that the
* querytree is up to date.
*
- * On a "true" return, we have acquired the locks needed to run the plan.
- * (We must do this for the "true" result to be race-condition-free.)
+ * On a "true" return, we have acquired locks on the "unprunableRelids" set
+ * for all plans in plansource->stmt_list. The plans are not completely
+ * race-condition-free until the executor takes locks on the set of prunable
+ * relations that survive initial runtime pruning during executor
+ * initialization.
*/
static bool
CheckCachedPlan(CachedPlanSource *plansource)
@@ -893,10 +897,10 @@ CheckCachedPlan(CachedPlanSource *plansource)
* or it can be set to NIL if we need to re-copy the plansource's query_list.
*
* To build a generic, parameter-value-independent plan, pass NULL for
- * boundParams. To build a custom plan, pass the actual parameter values via
- * boundParams. For best effect, the PARAM_FLAG_CONST flag should be set on
- * each parameter value; otherwise the planner will treat the value as a
- * hint rather than a hard constant.
+ * boundParams, and true for generic. To build a custom plan, pass the actual
+ * parameter values via boundParams, and false for generic. For best effect,
+ * the PARAM_FLAG_CONST flag should be set on each parameter value; otherwise
+ * the planner will treat the value as a hint rather than a hard constant.
*
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
@@ -904,7 +908,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic)
{
CachedPlan *plan;
List *plist;
@@ -1026,6 +1031,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->refcount = 0;
plan->context = plan_context;
plan->is_oneshot = plansource->is_oneshot;
+ plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
@@ -1196,7 +1202,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv, true);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1241,7 +1247,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, false);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1387,8 +1393,8 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
}
/*
- * Reject if AcquireExecutorLocks would have anything to do. This is
- * probably unnecessary given the previous check, but let's be safe.
+ * Reject if there are any lockable relations. This is probably
+ * unnecessary given the previous check, but let's be safe.
*/
foreach(lc, plan->stmt_list)
{
@@ -1776,7 +1782,7 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ int rtindex;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1794,9 +1800,13 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ rtindex = -1;
+ while ((rtindex = bms_next_member(plannedstmt->unprunableRelids,
+ rtindex)) >= 0)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ RangeTblEntry *rte = list_nth_node(RangeTblEntry,
+ plannedstmt->rtable,
+ rtindex - 1);
if (!(rte->rtekind == RTE_RELATION ||
(rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid))))
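To summarize how the plancache.c and execMain.c changes fit together, as I read them: AcquireExecutorLocks() now covers only unprunableRelids, and ExecDoInitialPruning() locks the leaf partitions that survive initial pruning, but only when ExecShouldLockRelations() says the cached plan needs it. The fragment below is a simplified model of that second step under the same toy 64-bit-mask assumption; InitPruneOutcome and initial_prune_outcome are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for Bitmapset (assumption): bit i set means RT index i. */
typedef uint64_t RelidSet;

typedef struct InitPruneOutcome
{
	RelidSet	executor_locks; /* relations locked during executor startup */
	RelidSet	unprunable;		/* models the final es_unprunable_relids */
} InitPruneOutcome;

/*
 * planner_unprunable models PlannedStmt.unprunableRelids, already locked
 * before execution when the plan comes from the plan cache.
 * survivors models the RT indexes of leaf partitions left after initial
 * pruning; they come from the prunable set, so the two sets are disjoint.
 */
static InitPruneOutcome
initial_prune_outcome(RelidSet planner_unprunable, RelidSet survivors,
					  bool plan_requires_locking)
{
	InitPruneOutcome out;

	assert((planner_unprunable & survivors) == 0);

	/* Survivors are locked here only if the cached plan requires it. */
	out.executor_locks = plan_requires_locking ? survivors : 0;

	/* Survivors are added to the executor's set either way. */
	out.unprunable = planner_unprunable | survivors;
	return out;
}
```

Pruned partitions appear in neither field, which is why the rowmark and result-relation loops in the patch can skip them with a bms_is_member() test against es_unprunable_relids.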
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 3ab0aae78f..21c71e0d53 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -103,8 +103,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
- ExplainState *es, const char *queryString,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
+ const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
const BufferUsage *bufusage,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index ef6d8b2d48..7f2592e3b0 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -48,6 +48,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map RT index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -65,6 +66,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ int *rti_map pg_node_attr(array_size(nparts));
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -132,7 +134,8 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **validsubplan_rtis);
extern PartitionPruneState *ExecCreatePartitionPruneState(EState *estate,
PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0a7274e26c..0e7245435d 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 518a9fcd15..57170818c0 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -42,6 +42,7 @@
#include "storage/condition_variable.h"
#include "utils/hsearch.h"
#include "utils/queryenvironment.h"
+#include "utils/plancache.h"
#include "utils/reltrigger.h"
#include "utils/sharedtuplestore.h"
#include "utils/snapshot.h"
@@ -636,9 +637,14 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ CachedPlan *es_cachedplan;
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
List *es_part_prune_states; /* List of PartitionPruneState */
List *es_part_prune_results; /* List of Bitmapset */
+ Bitmapset *es_unprunable_relids; /* PlannedStmt.unprunableRelids + RT
+ * indexes of leaf partitions that
+ * survive initial pruning; see
+ * ExecDoInitialPruning() */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 8d30b6e896..cc2190ea63 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -116,6 +116,12 @@ typedef struct PlannerGlobal
/* "flat" rangetable for executor */
List *finalrtable;
+ /*
+ * RT indexes of relations subject to removal from the plan due to runtime
+ * pruning at plan initialization time
+ */
+ Bitmapset *prunableRelids;
+
/* "flat" list of RTEPermissionInfos */
List *finalrteperminfos;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 39d0281c23..318e30fe2f 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -74,6 +74,10 @@ typedef struct PlannedStmt
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *unprunableRelids; /* RT indexes of relations that are not
+ * subject to runtime pruning; for
+ * AcquireExecutorLocks() */
+
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
@@ -1474,6 +1478,9 @@ typedef struct PartitionedRelPruneInfo
/* subpart index by partition index, or -1 */
int *subpart_map pg_node_attr(array_size(nparts));
+ /* RT index by partition index, or 0 */
+ int *rti_map pg_node_attr(array_size(nparts));
+
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a90dfdf906..0b5ee007ca 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -149,6 +149,7 @@ typedef struct CachedPlan
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
bool is_oneshot; /* is it a "oneshot" plan? */
+ bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
Oid planRoleId; /* Role ID the plan was created for */
@@ -235,4 +236,13 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
extern CachedExpression *GetCachedExpression(Node *expr);
extern void FreeCachedExpression(CachedExpression *cexpr);
+/*
+ * CachedPlanRequiresLocking: should the executor acquire locks?
+ */
+static inline bool
+CachedPlanRequiresLocking(CachedPlan *cplan)
+{
+ return cplan->is_generic;
+}
+
#endif /* PLANCACHE_H */
--
2.43.0
[application/octet-stream] v55-0005-Handle-CachedPlan-invalidation-in-the-executor.patch (58.0K, 6-v55-0005-Handle-CachedPlan-invalidation-in-the-executor.patch)
From 24eea4f10fa7129bc6284a7317d413bed2b177b5 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 22 Aug 2024 19:38:13 +0900
Subject: [PATCH v55 5/5] Handle CachedPlan invalidation in the executor
This commit makes changes to handle cases where a cached plan
becomes invalid before deferred locks on prunable relations are taken.
* Add checks at various points in ExecutorStart() and its called
functions to determine if the plan becomes invalid. If detected,
the function and its callers return immediately. A previous commit
ensures any partially initialized PlanState tree objects are cleaned
up appropriately.
* Introduce ExecutorStartExt(), a wrapper over ExecutorStart(), to
handle cases where plan initialization is aborted due to invalidation.
ExecutorStartExt() creates a new transient CachedPlan if needed and
retries execution. This new entry point is only required for sites
using plancache.c. It requires passing the QueryDesc, eflags,
CachedPlanSource, and query_index (index in CachedPlanSource.query_list).
* Add GetSingleCachedPlan() in plancache.c to create a transient
CachedPlan for a specified query in the given CachedPlanSource.
Such CachedPlans are tracked in a separate global list for the
plancache invalidation callbacks to check.
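As a rough illustration of the retry behaviour described in the points above, here is a self-contained toy model in C. It is only a sketch: every name in it (ToyPlan, ToyQueryDesc, toy_executor_start_ext() and so on) is hypothetical and stands in for the real structures. It does not use or reproduce the PostgreSQL API. It shows only the control flow: start, check validity, clean up the aborted attempt, swap in a fresh single-query plan, and try again.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are NOT the PostgreSQL structs. */
typedef struct ToyPlan
{
    bool        is_valid;       /* cleared when an "invalidation" arrives */
    int         generation;     /* which rebuild this plan came from */
} ToyPlan;

typedef struct ToyQueryDesc
{
    ToyPlan    *cplan;          /* NULL means "not from the plan cache" */
    int         starts;         /* number of start attempts */
    int         aborted_ends;   /* number of aborted attempts cleaned up */
} ToyQueryDesc;

/* How many upcoming start attempts will observe an invalidation. */
static int  toy_pending_invalidations = 0;

/* Fixed storage for replacement plans in this toy model. */
static ToyPlan toy_plan_pool[8];
static int  toy_plan_pool_used = 0;

/* Models plan initialization: taking deferred locks may invalidate the plan. */
static void
toy_executor_start(ToyQueryDesc *qd)
{
    qd->starts++;
    if (qd->cplan != NULL && toy_pending_invalidations > 0)
    {
        toy_pending_invalidations--;
        qd->cplan->is_valid = false;
    }
}

/* Models cleaning up a partially initialized executor state. */
static void
toy_executor_end_aborted(ToyQueryDesc *qd)
{
    qd->aborted_ends++;
}

/* Models building a fresh plan for a single query. */
static ToyPlan *
toy_get_single_plan(int prev_generation)
{
    ToyPlan    *plan;

    assert(toy_plan_pool_used < 8);
    plan = &toy_plan_pool[toy_plan_pool_used++];
    plan->is_valid = true;
    plan->generation = prev_generation + 1;
    return plan;
}

/* Models the wrapper: retry until a start attempt ends with a valid plan. */
static void
toy_executor_start_ext(ToyQueryDesc *qd)
{
    if (qd->cplan == NULL)
    {
        /* Not a cached plan, so nothing can be invalidated under us. */
        toy_executor_start(qd);
        return;
    }

    for (;;)
    {
        toy_executor_start(qd);
        if (qd->cplan->is_valid)
            break;              /* start succeeded */

        /* Undo the aborted attempt, then install a replacement plan. */
        toy_executor_end_aborted(qd);
        qd->cplan = toy_get_single_plan(qd->cplan->generation);
    }
}

/* Convenience driver: returns the number of start attempts needed. */
static int
toy_run(int invalidations)
{
    ToyPlan     original = {true, 1};
    ToyQueryDesc qd = {&original, 0, 0};

    toy_pending_invalidations = invalidations;
    toy_plan_pool_used = 0;
    toy_executor_start_ext(&qd);

    /* Every attempt except the last one must have been cleaned up. */
    assert(qd.aborted_ends == qd.starts - 1);
    return qd.starts;
}
```

In this model, toy_run(0) needs a single start attempt, and each simulated invalidation adds one aborted attempt plus one rebuilt plan before the loop exits.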
This also adds isolation tests using the delay_execution test module
to verify scenarios where a CachedPlan becomes invalid before the
deferred locks are taken.
All ExecutorStart_hook implementations must now add the following
block after calling the previous hook or standard_ExecutorStart(), so
that they do not continue working with an invalid plan:
/* The plan may have become invalid during ExecutorStart() */
if (!ExecPlanStillValid(queryDesc->estate))
return;
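To show where that check sits relative to the rest of a hook, here is a minimal self-contained mock in C. All types and functions in it (MockQueryDesc, mock_standard_start(), mock_plan_still_valid(), my_start_hook()) are hypothetical stand-ins invented for this sketch, not the real executor API. The point is the ordering only: chain to the previous hook or the standard routine, test validity, and return early before doing any hook-specific work.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the executor structures. */
typedef struct MockEState
{
    bool        plan_valid;
} MockEState;

typedef struct MockQueryDesc
{
    MockEState *estate;
} MockQueryDesc;

typedef void (*mock_start_hook_type) (MockQueryDesc *qd, int eflags);

static mock_start_hook_type mock_prev_hook = NULL;
static bool mock_invalidate_during_start = false;
static int  mock_tracked_queries = 0;
static MockEState mock_estate;

/* Plays the role of the validity test used in the block above. */
static bool
mock_plan_still_valid(MockEState *estate)
{
    return estate->plan_valid;
}

/* Plays the role of the standard start routine. */
static void
mock_standard_start(MockQueryDesc *qd, int eflags)
{
    (void) eflags;
    mock_estate.plan_valid = !mock_invalidate_during_start;
    qd->estate = &mock_estate;
}

/* A hook written in the style the commit message asks for. */
static void
my_start_hook(MockQueryDesc *qd, int eflags)
{
    if (mock_prev_hook)
        mock_prev_hook(qd, eflags);
    else
        mock_standard_start(qd, eflags);

    /* The plan may have become invalid during the start routine. */
    if (!mock_plan_still_valid(qd->estate))
        return;

    /* Hook-specific work happens only for a valid plan. */
    mock_tracked_queries++;
}

/* Convenience driver: returns how many queries the hook tracked. */
static int
mock_run_hook(bool invalidate)
{
    MockQueryDesc qd = {NULL};

    mock_invalidate_during_start = invalidate;
    mock_tracked_queries = 0;
    my_start_hook(&qd, 0);
    return mock_tracked_queries;
}
```

With the early return in place, the mock hook does its own bookkeeping only when the plan survived initialization. The caller is then free to discard the aborted state and retry.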
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
contrib/auto_explain/auto_explain.c | 4 +
.../pg_stat_statements/pg_stat_statements.c | 4 +
src/backend/commands/explain.c | 8 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 10 +-
src/backend/commands/trigger.c | 14 ++
src/backend/executor/README | 35 ++-
src/backend/executor/execMain.c | 84 ++++++-
src/backend/executor/execUtils.c | 3 +-
src/backend/executor/spi.c | 19 +-
src/backend/tcop/postgres.c | 4 +-
src/backend/tcop/pquery.c | 31 ++-
src/backend/utils/cache/plancache.c | 206 ++++++++++++++++
src/backend/utils/mmgr/portalmem.c | 4 +-
src/include/commands/explain.h | 1 +
src/include/commands/trigger.h | 1 +
src/include/executor/execdesc.h | 1 +
src/include/executor/executor.h | 17 ++
src/include/nodes/execnodes.h | 1 +
src/include/utils/plancache.h | 26 ++
src/include/utils/portal.h | 4 +-
src/test/modules/delay_execution/Makefile | 3 +-
.../modules/delay_execution/delay_execution.c | 63 ++++-
.../expected/cached-plan-inval.out | 230 ++++++++++++++++++
src/test/modules/delay_execution/meson.build | 1 +
.../specs/cached-plan-inval.spec | 75 ++++++
26 files changed, 814 insertions(+), 36 deletions(-)
create mode 100644 src/test/modules/delay_execution/expected/cached-plan-inval.out
create mode 100644 src/test/modules/delay_execution/specs/cached-plan-inval.spec
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 677c135f59..9eb5e9a619 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -300,6 +300,10 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
if (auto_explain_enabled())
{
/*
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 3c72e437f7..76642b557a 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -985,6 +985,10 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
/*
* If query has queryId zero, don't track it. This prevents double
* counting of optimizable statements that are directly contained in
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 49f7370734..b7a0b8c05b 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -509,7 +509,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, NULL, -1, into, es, queryString, params,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -618,6 +619,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
*/
void
ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int query_index,
IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
@@ -688,8 +690,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
if (into)
eflags |= GetIntoRelEFlags(into);
- /* call ExecutorStart to prepare the plan for execution */
- ExecutorStart(queryDesc, eflags);
+ /* Call ExecutorStartExt to prepare the plan for execution. */
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
/* Execute the plan for statistics if asked for */
if (es->analyze)
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 4f6acf6719..4b1503c05e 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ NULL,
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 311b9ebd5b..4cd79a6e3a 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -202,7 +202,8 @@ ExecuteQuery(ParseState *pstate,
query_string,
entry->plansource->commandTag,
plan_list,
- cplan);
+ cplan,
+ entry->plansource);
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
@@ -583,6 +584,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
MemoryContextCounters mem_counters;
MemoryContext planner_ctx = NULL;
MemoryContext saved_ctx = NULL;
+ int i = 0;
if (es->memory)
{
@@ -655,8 +657,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
- queryEnv,
+ ExplainOnePlan(pstmt, cplan, entry->plansource, i,
+ into, es, query_string, paramLI, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
@@ -668,6 +670,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Separate plans with an appropriate separator */
if (lnext(plan_list, p) != NULL)
ExplainSeparatePlans(es);
+
+ i++;
}
if (estate)
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 29d30bfb6f..e33b8f573b 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -5120,6 +5120,20 @@ AfterTriggerEndQuery(EState *estate)
afterTriggers.query_depth--;
}
+/* ----------
+ * AfterTriggerAbortQuery()
+ *
+ * Called by ExecutorEnd() if the query execution was aborted due to the
+ * plan becoming invalid during initialization.
+ * ----------
+ */
+void
+AfterTriggerAbortQuery(void)
+{
+ /* Revert the actions of AfterTriggerBeginQuery(). */
+ afterTriggers.query_depth--;
+}
+
/*
* AfterTriggerFreeQuery
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be61..c76a00b394 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -280,6 +280,28 @@ are typically reset to empty once per tuple. Per-tuple contexts are usually
associated with ExprContexts, and commonly each PlanState node has its own
ExprContext to evaluate its qual and targetlist expressions in.
+Relation Locking
+----------------
+
+Typically, when the executor initializes a plan tree for execution, it doesn't
+lock non-index relations if the plan tree is freshly generated and not derived
+from a CachedPlan. This is because such locks have already been established
+during the query's parsing, rewriting, and planning phases. However, with a
+cached plan tree, some relations may remain unlocked. The function
+AcquireExecutorLocks() only locks unprunable relations in the plan, deferring
+the locking of prunable ones to executor initialization. This avoids
+unnecessary locking of relations that will be pruned during "initial" runtime
+pruning in ExecDoInitialPruning().
+
+This approach creates a window where a cached plan tree with child tables
+could become outdated if another backend modifies these tables before
+ExecDoInitialPruning() locks them. As a result, the executor has the added duty
+to verify the plan tree's validity whenever it locks a child table after
+doing initial pruning. This validation is done by checking the CachedPlan.is_valid
+attribute. If the plan tree is outdated (is_valid=false), the executor halts
+further initialization, cleans up anything in EState that would have been
+allocated up to that point, and retries execution with a freshly created plan
+that replaces the invalid one.
Query Processing Control Flow
-----------------------------
@@ -288,11 +310,13 @@ This is a sketch of control flow for full query processing:
CreateQueryDesc
- ExecutorStart
+ ExecutorStart or ExecutorStartExt
CreateExecutorState
creates per-query context
- switch to per-query context to run ExecInitNode
+ switch to per-query context to run ExecDoInitialPruning and ExecInitNode
AfterTriggerBeginQuery
+ ExecDoInitialPruning
+ does initial pruning and locks surviving partitions if needed
ExecInitNode --- recursively scans plan tree
ExecInitNode
recurse into subsidiary nodes
@@ -316,7 +340,12 @@ This is a sketch of control flow for full query processing:
FreeQueryDesc
-Per above comments, it's not really critical for ExecEndNode to free any
+As mentioned in the "Relation Locking" section, if the plan tree is found to
+be stale after locking partitions in ExecDoInitialPruning(), control is
+immediately returned to ExecutorStartExt(), which will create a new plan tree
+and perform the steps starting from CreateExecutorState() again.
+
+Per above comments, it's not really critical for ExecEndPlan to free any
memory; it'll all go away in FreeExecutorState anyway. However, we do need to
be careful to close relations, drop buffer pins, etc, so we do need to scan
the plan state tree to find these sorts of resources.
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index df1b5b2dc3..df117e9477 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -59,6 +59,7 @@
#include "utils/backend_status.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
+#include "utils/plancache.h"
#include "utils/rls.h"
#include "utils/snapmgr.h"
@@ -137,6 +138,60 @@ ExecutorStart(QueryDesc *queryDesc, int eflags)
standard_ExecutorStart(queryDesc, eflags);
}
+/*
+ * A variant of ExecutorStart() that handles cleanup and replanning if the
+ * input CachedPlan becomes invalid due to locks being taken during
+ * ExecutorStartInternal(). If that happens, a new CachedPlan is created
+ * only for the query at index 'query_index' in plansource->query_list, which
+ * is released separately from the original CachedPlan.
+ */
+void
+ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource,
+ int query_index)
+{
+ if (queryDesc->cplan == NULL)
+ {
+ ExecutorStart(queryDesc, eflags);
+ return;
+ }
+
+ while (1)
+ {
+ ExecutorStart(queryDesc, eflags);
+ if (!CachedPlanValid(queryDesc->cplan))
+ {
+ CachedPlan *cplan;
+
+ /*
+ * The plan got invalidated, so try with a new updated plan.
+ *
+ * But first undo what ExecutorStart() would've done. Mark
+ * execution as aborted to ensure that AFTER trigger state is
+ * properly reset.
+ */
+ queryDesc->estate->es_aborted = true;
+ ExecutorEnd(queryDesc);
+
+ cplan = GetSingleCachedPlan(plansource, query_index,
+ queryDesc->queryEnv);
+
+ /*
+ * Install the new transient cplan into the QueryDesc replacing
+ * the old one so that executor initialization code can see it.
+ * Mark it as in use by us and ask FreeQueryDesc() to release it.
+ */
+ cplan->refcount = 1;
+ queryDesc->cplan = cplan;
+ queryDesc->cplan_release = true;
+ queryDesc->plannedstmt = linitial_node(PlannedStmt,
+ queryDesc->cplan->stmt_list);
+ }
+ else
+ break; /* ExecutorStart() succeeded! */
+ }
+}
+
void
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
@@ -320,6 +375,7 @@ standard_ExecutorRun(QueryDesc *queryDesc,
estate = queryDesc->estate;
Assert(estate != NULL);
+ Assert(!estate->es_aborted);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/* caller must ensure the query's snapshot is active */
@@ -426,8 +482,11 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
Assert(estate != NULL);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
- /* This should be run once and only once per Executor instance */
- Assert(!estate->es_finished);
+ /*
+ * This should be run once and only once per Executor instance and never
+ * if the execution was aborted.
+ */
+ Assert(!estate->es_finished && !estate->es_aborted);
/* Switch into per-query memory context */
oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -486,11 +545,10 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
Assert(estate != NULL);
/*
- * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This
- * Assert is needed because ExecutorFinish is new as of 9.1, and callers
- * might forget to call it.
+ * Check that ExecutorFinish was called, unless in EXPLAIN-only mode or if
+ * execution was aborted.
*/
- Assert(estate->es_finished ||
+ Assert(estate->es_finished || estate->es_aborted ||
(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/*
@@ -504,6 +562,14 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
UnregisterSnapshot(estate->es_snapshot);
UnregisterSnapshot(estate->es_crosscheck_snapshot);
+ /*
+ * Reset AFTER trigger module if the query execution was aborted.
+ */
+ if (estate->es_aborted &&
+ !(estate->es_top_eflags &
+ (EXEC_FLAG_SKIP_TRIGGERS | EXEC_FLAG_EXPLAIN_ONLY)))
+ AfterTriggerAbortQuery();
+
/*
* Must switch out of context before destroying it
*/
@@ -962,6 +1028,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
ExecDoInitialPruning(estate);
+ if (!ExecPlanStillValid(estate))
+ return;
+
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
*/
@@ -2948,6 +3017,9 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
* the snapshot, rangetable, and external Param info. They need their own
* copies of local state, including a tuple table, es_param_exec_vals,
* result-rel info, etc.
+ *
+ * es_cachedplan is not copied because EPQ plan execution does not acquire
+ * any new locks that could invalidate the CachedPlan.
*/
rcestate->es_direction = ForwardScanDirection;
rcestate->es_snapshot = parentestate->es_snapshot;
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 67734979b0..435ae0df7a 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -147,6 +147,7 @@ CreateExecutorState(void)
estate->es_top_eflags = 0;
estate->es_instrument = 0;
estate->es_finished = false;
+ estate->es_aborted = false;
estate->es_exprcontexts = NIL;
@@ -757,7 +758,7 @@ ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos)
* ExecGetRangeTableRelation
* Open the Relation for a range table entry, if not already done
*
- * The Relations will be closed again in ExecEndPlan().
+ * The Relations will be closed in ExecEndPlan().
*/
Relation
ExecGetRangeTableRelation(EState *estate, Index rti)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 659bd6dcd9..f84f376c9c 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -70,7 +70,8 @@ static int _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
-static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount);
+static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index);
static void _SPI_error_callback(void *arg);
@@ -1682,7 +1683,8 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
- cplan);
+ cplan,
+ plansource);
/*
* Set up options for portal. Default SCROLL type is chosen the same way
@@ -2494,6 +2496,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
ListCell *lc2;
+ int i = 0;
spicallbackarg.query = plansource->query_string;
@@ -2691,8 +2694,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
options->params,
_SPI_current->queryEnv,
0);
- res = _SPI_pquery(qdesc, fire_triggers,
- canSetTag ? options->tcount : 0);
+
+ res = _SPI_pquery(qdesc, fire_triggers, canSetTag ? options->tcount : 0,
+ plansource, i);
FreeQueryDesc(qdesc);
}
else
@@ -2789,6 +2793,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
my_res = res;
goto fail;
}
+
+ i++;
}
/* Done with this plan, so release refcount */
@@ -2866,7 +2872,8 @@ _SPI_convert_params(int nargs, Oid *argtypes,
}
static int
-_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
+_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index)
{
int operation = queryDesc->operation;
int eflags;
@@ -2922,7 +2929,7 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
else
eflags = EXEC_FLAG_SKIP_TRIGGERS;
- ExecutorStart(queryDesc, eflags);
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
ExecutorRun(queryDesc, ForwardScanDirection, tcount, true);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index e394f1419a..b95c859655 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1237,6 +1237,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ NULL,
NULL);
/*
@@ -2039,7 +2040,8 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
- cplan);
+ cplan,
+ psrc);
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 6e8f6b1b8f..dbb0ffb771 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -19,6 +19,7 @@
#include "access/xact.h"
#include "commands/prepare.h"
+#include "executor/execdesc.h"
#include "executor/tstoreReceiver.h"
#include "miscadmin.h"
#include "pg_trace.h"
@@ -37,6 +38,8 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -80,6 +83,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
+ qd->cplan_release = false;
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -114,6 +118,13 @@ FreeQueryDesc(QueryDesc *qdesc)
UnregisterSnapshot(qdesc->snapshot);
UnregisterSnapshot(qdesc->crosscheck_snapshot);
+ /*
+ * Release CachedPlan if requested. The CachedPlan is not associated with
+ * a ResourceOwner when cplan_release is true; see ExecutorStartExt().
+ */
+ if (qdesc->cplan_release)
+ ReleaseCachedPlan(qdesc->cplan, NULL);
+
/* Only the QueryDesc itself need be freed */
pfree(qdesc);
}
@@ -126,6 +137,8 @@ FreeQueryDesc(QueryDesc *qdesc)
*
* plan: the plan tree for the query
* cplan: CachedPlan supplying the plan
+ * plansource: CachedPlanSource supplying the cplan
+ * query_index: index of the query in plansource->query_list
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -139,6 +152,8 @@ FreeQueryDesc(QueryDesc *qdesc)
static void
ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -157,7 +172,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Call ExecutorStart to prepare the plan for execution
*/
- ExecutorStart(queryDesc, 0);
+ ExecutorStartExt(queryDesc, 0, plansource, query_index);
/*
* Run the plan to completion.
@@ -518,9 +533,12 @@ PortalStart(Portal portal, ParamListInfo params,
myeflags = eflags;
/*
- * Call ExecutorStart to prepare the plan for execution
+ * Call ExecutorStartExt() to prepare the plan for execution. If
+ * the portal is using a cached plan, it may get invalidated
+ * during plan initialization, in which case a new one is
+ * created and saved in the QueryDesc.
*/
- ExecutorStart(queryDesc, myeflags);
+ ExecutorStartExt(queryDesc, myeflags, portal->plansource, 0);
/*
* This tells PortalCleanup to shut down the executor
@@ -1201,6 +1219,7 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
+ int i = 0;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1283,6 +1302,8 @@ PortalRunMulti(Portal portal,
/* statement can set tag string */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1293,6 +1314,8 @@ PortalRunMulti(Portal portal,
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1357,6 +1380,8 @@ PortalRunMulti(Portal portal,
*/
if (lnext(portal->stmts, stmtlist_item) != NULL)
CommandCounterIncrement();
+
+ i++;
}
/* Pop the snapshot if we pushed one. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5b75dadf13..d33f871ea2 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -94,6 +94,14 @@
*/
static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list);
+/*
+ * Head of the backend's list of "standalone" CachedPlans that are not
+ * associated with a CachedPlanSource, created by GetSingleCachedPlan() for
+ * transient use by the executor in certain scenarios where they're needed
+ * only for one execution of the plan.
+ */
+static dlist_head standalone_plan_list = DLIST_STATIC_INIT(standalone_plan_list);
+
/*
* This is the head of the backend's list of CachedExpressions.
*/
@@ -905,6 +913,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * Note: When changing this, you should also look at GetSingleCachedPlan().
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
@@ -1034,6 +1044,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
+ plan->is_standalone = false;
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1282,6 +1293,121 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
return plan;
}
+/*
+ * Create a fresh CachedPlan for the query_index'th query in the provided
+ * CachedPlanSource.
+ *
+ * The created CachedPlan is standalone, meaning it is not tracked in the
+ * CachedPlanSource. The CachedPlan and its plan trees are allocated in a
+ * child context of the caller's memory context. The caller must ensure they
+ * remain valid until execution is complete, after which the plan should be
+ * released by calling ReleaseCachedPlan().
+ *
+ * This function primarily supports ExecutorStartExt(), which handles cases
+ * where the original generic CachedPlan becomes invalid after prunable
+ * relations are locked.
+ */
+CachedPlan *
+GetSingleCachedPlan(CachedPlanSource *plansource, int query_index,
+ QueryEnvironment *queryEnv)
+{
+ List *query_list = plansource->query_list,
+ *plan_list;
+ CachedPlan *plan = plansource->gplan;
+ MemoryContext oldcxt = CurrentMemoryContext,
+ plan_context;
+ PlannedStmt *plannedstmt;
+
+ Assert(ActiveSnapshotSet());
+
+ /* Sanity checks */
+ if (plan == NULL)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context: plansource->gplan is NULL");
+ else if (plan->is_valid)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context: plansource->gplan->is_valid");
+
+ /*
+ * The plansource might have become invalid since GetCachedPlan(). See the
+ * comment in BuildCachedPlan() for details on why this might happen.
+ *
+ * The risk is greater here because this function is called from the
+ * executor, meaning much more processing may have occurred compared to
+ * when BuildCachedPlan() is called from GetCachedPlan().
+ */
+ if (!plansource->is_valid)
+ query_list = RevalidateCachedQuery(plansource, queryEnv);
+ Assert(query_list != NIL);
+
+ /*
+ * Build a new generic plan for the query_index'th query, but make a copy
+ * to be scribbled on by the planner
+ */
+ query_list = list_make1(copyObject(list_nth_node(Query, query_list,
+ query_index)));
+ plan_list = pg_plan_queries(query_list, plansource->query_string,
+ plansource->cursor_options, NULL);
+
+ list_free_deep(query_list);
+
+ /*
+ * Make a dedicated memory context for the CachedPlan and its subsidiary
+ * data so that ReleaseCachedPlan(), which is called from FreeQueryDesc(),
+ * can free it.
+ */
+ plan_context = AllocSetContextCreate(CurrentMemoryContext,
+ "Standalone CachedPlan",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextCopyAndSetIdentifier(plan_context, plansource->query_string);
+
+ /*
+ * Copy plan into the new context.
+ */
+ MemoryContextSwitchTo(plan_context);
+ plan_list = copyObject(plan_list);
+
+ /*
+ * Create and fill the CachedPlan struct within the new context.
+ */
+ plan = (CachedPlan *) palloc(sizeof(CachedPlan));
+ plan->magic = CACHEDPLAN_MAGIC;
+ plan->stmt_list = plan_list;
+
+ plan->planRoleId = GetUserId();
+ Assert(list_length(plan_list) == 1);
+ plannedstmt = linitial_node(PlannedStmt, plan_list);
+
+ /*
+ * CachedPlan is dependent on role either if RLS affected the rewrite
+ * phase or if a role dependency was injected during planning. And it's
+ * transient if any plan is marked so.
+ */
+ plan->dependsOnRole = plansource->dependsOnRLS || plannedstmt->dependsOnRole;
+ if (plannedstmt->transientPlan)
+ {
+ Assert(TransactionIdIsNormal(TransactionXmin));
+ plan->saved_xmin = TransactionXmin;
+ }
+ else
+ plan->saved_xmin = InvalidTransactionId;
+ plan->refcount = 0;
+ plan->context = plan_context;
+ plan->is_oneshot = false;
+ plan->is_generic = true;
+ plan->is_saved = false;
+ plan->is_valid = true;
+ plan->is_standalone = true;
+ plan->generation = 1;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * Add the entry to the global list of "standalone" cached plans. It is
+ * removed from the list by ReleaseCachedPlan().
+ */
+ dlist_push_tail(&standalone_plan_list, &plan->node);
+
+ return plan;
+}
+
/*
* ReleaseCachedPlan: release active use of a cached plan.
*
@@ -1309,6 +1435,10 @@ ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner)
/* Mark it no longer valid */
plan->magic = 0;
+ /* Remove from the global list if we are a standalone plan. */
+ if (plan->is_standalone)
+ dlist_delete(&plan->node);
+
/* One-shot plans do not own their context, so we can't free them */
if (!plan->is_oneshot)
MemoryContextDelete(plan->context);
@@ -2066,6 +2196,33 @@ PlanCacheRelCallback(Datum arg, Oid relid)
cexpr->is_valid = false;
}
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ if (cplan->is_valid)
+ {
+ ListCell *lc;
+
+ foreach(lc, cplan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue; /* Ignore utility statements */
+ if ((relid == InvalidOid) ? plannedstmt->relationOids != NIL :
+ list_member_oid(plannedstmt->relationOids, relid))
+ cplan->is_valid = false;
+ if (!cplan->is_valid)
+ break; /* out of stmt_list scan */
+ }
+ }
+ }
}
/*
@@ -2176,6 +2333,44 @@ PlanCacheObjectCallback(Datum arg, int cacheid, uint32 hashvalue)
}
}
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ if (cplan->is_valid)
+ {
+ ListCell *lc;
+
+ foreach(lc, cplan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+ ListCell *lc3;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue; /* Ignore utility statements */
+ foreach(lc3, plannedstmt->invalItems)
+ {
+ PlanInvalItem *item = (PlanInvalItem *) lfirst(lc3);
+
+ if (item->cacheId != cacheid)
+ continue;
+ if (hashvalue == 0 ||
+ item->hashValue == hashvalue)
+ {
+ cplan->is_valid = false;
+ break; /* out of invalItems scan */
+ }
+ }
+ if (!cplan->is_valid)
+ break; /* out of stmt_list scan */
+ }
+ }
+ }
}
/*
@@ -2235,6 +2430,17 @@ ResetPlanCache(void)
cexpr->is_valid = false;
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ cplan->is_valid = false;
+ }
}
/*
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 4a24613537..bf70fd4ce7 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,7 +284,8 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan)
+ CachedPlan *cplan,
+ CachedPlanSource *plansource)
{
Assert(PortalIsValid(portal));
Assert(portal->status == PORTAL_NEW);
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->commandTag = commandTag;
portal->stmts = stmts;
portal->cplan = cplan;
+ portal->plansource = plansource;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 21c71e0d53..a39989a950 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -104,6 +104,7 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ParamListInfo params, QueryEnvironment *queryEnv);
extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int query_index,
IntoClause *into, ExplainState *es,
const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index 8a5a9fe642..db21561c8c 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -258,6 +258,7 @@ extern void ExecASTruncateTriggers(EState *estate,
extern void AfterTriggerBeginXact(void);
extern void AfterTriggerBeginQuery(void);
extern void AfterTriggerEndQuery(EState *estate);
+extern void AfterTriggerAbortQuery(void);
extern void AfterTriggerFireDeferred(void);
extern void AfterTriggerEndXact(bool isCommit);
extern void AfterTriggerBeginSubXact(void);
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0e7245435d..f6cb6479c0 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -36,6 +36,7 @@ typedef struct QueryDesc
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
+ bool cplan_release; /* Should FreeQueryDesc() release cplan? */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 69c3ebff00..5bc0edb5a0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -19,6 +19,7 @@
#include "nodes/lockoptions.h"
#include "nodes/parsenodes.h"
#include "utils/memutils.h"
+#include "utils/plancache.h"
/*
@@ -198,6 +199,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
* prototypes from functions in execMain.c
*/
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern void ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource, int query_index);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count, bool execute_once);
@@ -261,6 +264,19 @@ extern void ExecEndNode(PlanState *node);
extern void ExecShutdownNode(PlanState *node);
extern void ExecSetTupleBound(int64 tuples_needed, PlanState *child_node);
+/*
+ * Is the CachedPlan in es_cachedplan still valid?
+ *
+ * Called from InitPlan() because invalidation messages that affect the plan
+ * might be received after locks have been taken on runtime-prunable relations.
+ * The caller should take appropriate action if the plan has become invalid.
+ */
+static inline bool
+ExecPlanStillValid(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? true :
+ CachedPlanValid(estate->es_cachedplan);
+}
/* ----------------------------------------------------------------
* ExecProcNode
@@ -589,6 +605,7 @@ extern void ExecCreateScanSlotFromOuterPlan(EState *estate,
extern bool ExecRelationIsTargetRelation(EState *estate, Index scanrelid);
extern Relation ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags);
+extern Relation ExecOpenScanIndexRelation(EState *estate, Oid indexid, int lockmode);
extern void ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos);
extern void ExecCloseRangeTableRelations(EState *estate);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 57170818c0..f50b6b50a8 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -690,6 +690,7 @@ typedef struct EState
int es_top_eflags; /* eflags passed to ExecutorStart */
int es_instrument; /* OR of InstrumentOption flags */
bool es_finished; /* true when ExecutorFinish is done */
+ bool es_aborted; /* true when execution was aborted */
List *es_exprcontexts; /* List of ExprContexts within EState */
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0b5ee007ca..154f68f671 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -18,6 +18,7 @@
#include "access/tupdesc.h"
#include "lib/ilist.h"
#include "nodes/params.h"
+#include "nodes/parsenodes.h"
#include "tcop/cmdtag.h"
#include "utils/queryenvironment.h"
#include "utils/resowner.h"
@@ -152,6 +153,8 @@ typedef struct CachedPlan
bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
+ bool is_standalone; /* is it not associated with a
+ * CachedPlanSource? */
Oid planRoleId; /* Role ID the plan was created for */
bool dependsOnRole; /* is plan specific to that role? */
TransactionId saved_xmin; /* if valid, replan when TransactionXmin
@@ -159,6 +162,12 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+
+ /*
+ * If the plan is not associated with a CachedPlanSource, it is saved in
+ * a separate global list.
+ */
+ dlist_node node; /* list link, if is_standalone */
} CachedPlan;
/*
@@ -224,6 +233,10 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
QueryEnvironment *queryEnv);
+extern CachedPlan *GetSingleCachedPlan(CachedPlanSource *plansource,
+ int query_index,
+ QueryEnvironment *queryEnv);
+
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
@@ -245,4 +258,17 @@ CachedPlanRequiresLocking(CachedPlan *cplan)
return cplan->is_generic;
}
+/*
+ * CachedPlanValid
+ * Returns whether a cached generic plan is still valid.
+ *
+ * Invoked by the executor to check if the plan has not been invalidated after
+ * taking locks during the initialization of the plan.
+ */
+static inline bool
+CachedPlanValid(CachedPlan *cplan)
+{
+ return cplan->is_valid;
+}
+
#endif /* PLANCACHE_H */
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 29f49829f2..58c3828d2c 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ CachedPlanSource *plansource; /* CachedPlanSource, for cplan */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -241,7 +242,8 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan);
+ CachedPlan *cplan,
+ CachedPlanSource *plansource);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
diff --git a/src/test/modules/delay_execution/Makefile b/src/test/modules/delay_execution/Makefile
index 70f24e846d..3eeb097fde 100644
--- a/src/test/modules/delay_execution/Makefile
+++ b/src/test/modules/delay_execution/Makefile
@@ -8,7 +8,8 @@ OBJS = \
delay_execution.o
ISOLATION = partition-addition \
- partition-removal-1
+ partition-removal-1 \
+ cached-plan-inval
ifdef USE_PGXS
PG_CONFIG = pg_config
diff --git a/src/test/modules/delay_execution/delay_execution.c b/src/test/modules/delay_execution/delay_execution.c
index 155c8a8d55..304ca77f7b 100644
--- a/src/test/modules/delay_execution/delay_execution.c
+++ b/src/test/modules/delay_execution/delay_execution.c
@@ -1,14 +1,18 @@
/*-------------------------------------------------------------------------
*
* delay_execution.c
- * Test module to allow delay between parsing and execution of a query.
+ * Test module to introduce delay at various points during execution of a
+ * query to test that execution proceeds safely in light of concurrent
+ * changes.
*
* The delay is implemented by taking and immediately releasing a specified
* advisory lock. If another process has previously taken that lock, the
* current process will be blocked until the lock is released; otherwise,
* there's no effect. This allows an isolationtester script to reliably
- * test behaviors where some specified action happens in another backend
- * between parsing and execution of any desired query.
+ * test behaviors where some specified action happens in another backend in
+ * a couple of cases: 1) between parsing and execution of any desired query
+ * when using the planner_hook, 2) between RevalidateCachedQuery() and
+ * ExecutorStart() when using the ExecutorStart_hook.
*
* Copyright (c) 2020-2024, PostgreSQL Global Development Group
*
@@ -22,6 +26,7 @@
#include <limits.h>
+#include "executor/executor.h"
#include "optimizer/planner.h"
#include "utils/builtins.h"
#include "utils/guc.h"
@@ -32,9 +37,11 @@ PG_MODULE_MAGIC;
/* GUC: advisory lock ID to use. Zero disables the feature. */
static int post_planning_lock_id = 0;
+static int executor_start_lock_id = 0;
-/* Save previous planner hook user to be a good citizen */
+/* Save previous hook users to be a good citizen */
static planner_hook_type prev_planner_hook = NULL;
+static ExecutorStart_hook_type prev_ExecutorStart_hook = NULL;
/* planner_hook function to provide the desired delay */
@@ -70,11 +77,41 @@ delay_execution_planner(Query *parse, const char *query_string,
return result;
}
+/* ExecutorStart_hook function to provide the desired delay */
+static void
+delay_execution_ExecutorStart(QueryDesc *queryDesc, int eflags)
+{
+ /* If enabled, delay by taking and releasing the specified lock */
+ if (executor_start_lock_id != 0)
+ {
+ DirectFunctionCall1(pg_advisory_lock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+ DirectFunctionCall1(pg_advisory_unlock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+
+ /*
+ * Ensure that we notice any pending invalidations, since the advisory
+ * lock functions don't do this.
+ */
+ AcceptInvalidationMessages();
+ }
+
+ /* Now start the executor, possibly via a previous hook user */
+ if (prev_ExecutorStart_hook)
+ prev_ExecutorStart_hook(queryDesc, eflags);
+ else
+ standard_ExecutorStart(queryDesc, eflags);
+
+ if (executor_start_lock_id != 0)
+ elog(NOTICE, "Finished ExecutorStart(): CachedPlan is %s",
+ CachedPlanValid(queryDesc->cplan) ? "valid" : "not valid");
+}
+
/* Module load function */
void
_PG_init(void)
{
- /* Set up the GUC to control which lock is used */
+ /* Set up GUCs to control which lock is used */
DefineCustomIntVariable("delay_execution.post_planning_lock_id",
"Sets the advisory lock ID to be locked/unlocked after planning.",
"Zero disables the delay.",
@@ -86,10 +123,22 @@ _PG_init(void)
NULL,
NULL,
NULL);
-
+ DefineCustomIntVariable("delay_execution.executor_start_lock_id",
+ "Sets the advisory lock ID to be locked/unlocked before starting execution.",
+ "Zero disables the delay.",
+ &executor_start_lock_id,
+ 0,
+ 0, INT_MAX,
+ PGC_USERSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
MarkGUCPrefixReserved("delay_execution");
- /* Install our hook */
+ /* Install our hooks. */
prev_planner_hook = planner_hook;
planner_hook = delay_execution_planner;
+ prev_ExecutorStart_hook = ExecutorStart_hook;
+ ExecutorStart_hook = delay_execution_ExecutorStart;
}
diff --git a/src/test/modules/delay_execution/expected/cached-plan-inval.out b/src/test/modules/delay_execution/expected/cached-plan-inval.out
new file mode 100644
index 0000000000..e002cfbc9c
--- /dev/null
+++ b/src/test/modules/delay_execution/expected/cached-plan-inval.out
@@ -0,0 +1,230 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1prep s2lock s1exec s2dropi s2unlock
+step s1prep: SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1);
+QUERY PLAN
+------------------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = $1)
+(7 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+-------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Filter: (a = $1)
+(5 rows)
+
+
+starting permutation: s1prep2 s2lock s1exec2 s2dropi s2unlock
+step s1prep2: SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(11 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec2: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec2: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(6 rows)
+
+
+starting permutation: s1prep3 s2lock s1exec3 s2dropi s2unlock
+step s1prep3: SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(26 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec3: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec3: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(16 rows)
+
+
+starting permutation: s1prep4 s2lock s1exec4 s2dropi s2unlock
+step s1prep4: SET plan_cache_mode = force_generic_plan;
+ SET enable_seqscan TO off;
+ PREPARE q4 AS SELECT * FROM generate_series(1, 1) WHERE EXISTS (SELECT * FROM foov WHERE a = $1 FOR UPDATE);
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1);
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+---------------------------------------------------------------
+Result
+ One-Time Filter: (InitPlan 1).col1
+ InitPlan 1
+ -> LockRows
+ Disabled Nodes: 2
+ -> Append
+ Disabled Nodes: 2
+ Subplans Removed: 2
+ -> Index Scan using foo12_1_a on foo12_1 foo_1
+ Index Cond: (a = $1)
+ -> Function Scan on generate_series
+(11 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec4: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec4: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+---------------------------------------------
+Result
+ One-Time Filter: (InitPlan 1).col1
+ InitPlan 1
+ -> LockRows
+ Disabled Nodes: 3
+ -> Append
+ Disabled Nodes: 3
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Disabled Nodes: 1
+ Filter: (a = $1)
+ -> Function Scan on generate_series
+(12 rows)
+
diff --git a/src/test/modules/delay_execution/meson.build b/src/test/modules/delay_execution/meson.build
index 41f3ac0b89..5a70b183d0 100644
--- a/src/test/modules/delay_execution/meson.build
+++ b/src/test/modules/delay_execution/meson.build
@@ -24,6 +24,7 @@ tests += {
'specs': [
'partition-addition',
'partition-removal-1',
+ 'cached-plan-inval',
],
},
}
diff --git a/src/test/modules/delay_execution/specs/cached-plan-inval.spec b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
new file mode 100644
index 0000000000..820a843051
--- /dev/null
+++ b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
@@ -0,0 +1,75 @@
+# Test to check that invalidation of cached generic plans during ExecutorStart
+# correctly triggers replanning and re-execution.
+
+setup
+{
+ CREATE TABLE foo (a int, b text) PARTITION BY LIST(a);
+ CREATE TABLE foo12 PARTITION OF foo FOR VALUES IN (1, 2) PARTITION BY LIST (a);
+ CREATE TABLE foo12_1 PARTITION OF foo12 FOR VALUES IN (1);
+ CREATE TABLE foo12_2 PARTITION OF foo12 FOR VALUES IN (2);
+ CREATE INDEX foo12_1_a ON foo12_1 (a);
+ CREATE TABLE foo3 PARTITION OF foo FOR VALUES IN (3);
+ CREATE VIEW foov AS SELECT * FROM foo;
+ CREATE FUNCTION one () RETURNS int AS $$ BEGIN RETURN 1; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE FUNCTION two () RETURNS int AS $$ BEGIN RETURN 2; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE RULE update_foo AS ON UPDATE TO foo DO ALSO SELECT 1;
+}
+
+teardown
+{
+ DROP VIEW foov;
+ DROP RULE update_foo ON foo;
+ DROP TABLE foo;
+ DROP FUNCTION one(), two();
+}
+
+session "s1"
+# Append with run-time pruning
+step "s1prep" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+
+# Another case with Append with run-time pruning
+step "s1prep2" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+
+# Case with a rule adding another query
+step "s1prep3" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+# Another case with Append with run-time pruning in a subquery
+step "s1prep4" { SET plan_cache_mode = force_generic_plan;
+ SET enable_seqscan TO off;
+ PREPARE q4 AS SELECT * FROM generate_series(1, 1) WHERE EXISTS (SELECT * FROM foov WHERE a = $1 FOR UPDATE);
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); }
+
+# Executes a generic plan
+step "s1exec" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+step "s1exec2" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+step "s1exec3" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+step "s1exec4" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); }
+
+session "s2"
+step "s2lock" { SELECT pg_advisory_lock(12345); }
+step "s2unlock" { SELECT pg_advisory_unlock(12345); }
+step "s2dropi" { DROP INDEX foo12_1_a; }
+
+# While "s1exec", etc. wait to acquire the advisory lock, "s2dropi" is able to
+# drop the index being used in the cached plan. When "s1exec" is then
+# unblocked and initializes the cached plan for execution, it detects the
+# concurrent index drop and causes the cached plan to be discarded and
+# recreated without the index.
+permutation "s1prep" "s2lock" "s1exec" "s2dropi" "s2unlock"
+permutation "s1prep2" "s2lock" "s1exec2" "s2dropi" "s2unlock"
+permutation "s1prep3" "s2lock" "s1exec3" "s2dropi" "s2unlock"
+permutation "s1prep4" "s2lock" "s1exec4" "s2dropi" "s2unlock"
--
2.43.0
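
The relation-based invalidation scan that the patch above adds to PlanCacheRelCallback() can be modeled in isolation. The following is only a minimal sketch under stated assumptions: DemoStandalonePlan, DemoOid, DEMO_INVALID_OID, and demo_invalidate_for_rel() are hypothetical stand-ins, not PostgreSQL APIs, and a plain singly linked list replaces the dlist used in the patch. It shows the rule from the patch: a valid plan is marked invalid when it references the given relation, or, when the relation OID is invalid, when it references any relation at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; none of these names exist in PostgreSQL. */
typedef unsigned int DemoOid;
#define DEMO_INVALID_OID 0u

typedef struct DemoStandalonePlan
{
    bool        is_valid;       /* cleared when the plan is invalidated */
    size_t      nrelids;        /* number of relations the plan references */
    DemoOid     relids[4];      /* OIDs of the referenced relations */
    struct DemoStandalonePlan *next;    /* next plan in the global list */
} DemoStandalonePlan;

/*
 * Walk the list and invalidate every still-valid plan that depends on relid.
 * DEMO_INVALID_OID means "any relation", so it invalidates every plan that
 * references at least one relation.  Returns the number of plans that were
 * newly invalidated by this call.
 */
size_t
demo_invalidate_for_rel(DemoStandalonePlan *head, DemoOid relid)
{
    size_t      invalidated = 0;

    for (DemoStandalonePlan *p = head; p != NULL; p = p->next)
    {
        bool        hit = false;

        if (!p->is_valid)
            continue;           /* already invalid, nothing to do */

        if (relid == DEMO_INVALID_OID)
            hit = (p->nrelids > 0);
        else
        {
            for (size_t i = 0; i < p->nrelids; i++)
            {
                if (p->relids[i] == relid)
                {
                    hit = true;
                    break;
                }
            }
        }

        if (hit)
        {
            p->is_valid = false;
            invalidated++;
        }
    }
    return invalidated;
}
```

Keeping the validity flag on the plan itself means the executor only has to test one boolean after taking locks, which is what CachedPlanValid() does in the patch.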
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-20 14:53 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-21 12:45 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-21 13:10 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-23 12:48 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-29 13:34 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-09-17 12:57 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-09-19 08:39 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-09-19 12:10 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
@ 2024-09-20 08:10 ` Amit Langote <[email protected]>
2024-10-10 20:15 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Amit Langote @ 2024-09-20 08:10 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Thu, Sep 19, 2024 at 9:10 PM Amit Langote <[email protected]> wrote:
> On Thu, Sep 19, 2024 at 5:39 PM Amit Langote <[email protected]> wrote:
> > For
> > ResultRelInfos, I took the approach of memsetting them to 0 for pruned
> > result relations and adding checks at multiple sites to ensure the
> > ResultRelInfo being handled is valid.
>
> After some reflection,
Not enough reflection, evidently...
> I realized that nobody would think that that
> approach is very robust. In the attached, I’ve modified
> ExecInitModifyTable() to allocate ResultRelInfos only for unpruned
> relations, instead of allocating for all in
> ModifyTable.resultRelations and setting pruned ones to 0. This
> approach feels more robust.
Except, I forgot that ModifyTable has other lists that parallel
resultRelations (and have the same length), viz. withCheckOptionLists,
returningLists, and updateColnosLists, which need to be similarly
truncated to only consider unpruned relations. I've updated 0004 to
do so. This was broken even in the other design, where locking is
delayed all the way until ExecInitAppend() does initial pruning,
because ResultRelInfos are created before initializing the plan
subtree containing the Append node, which would try to lock and open
*all* partitions.
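
The need to truncate every parallel list together can be illustrated with a small standalone model. This is only a sketch under stated assumptions: DemoModifyTable, DEMO_MAX_RELS, and demo_keep_unpruned() are hypothetical names, fixed-size int arrays stand in for the List fields of ModifyTable, and updateColnosLists is left out for brevity. It is not the code in 0004.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_RELS 8

/*
 * Hypothetical stand-in for a ModifyTable-like node: three arrays that run
 * parallel to each other, with one entry per result relation.
 */
typedef struct DemoModifyTable
{
    size_t      nrels;
    int         resultRelations[DEMO_MAX_RELS];
    int         withCheckOptionLists[DEMO_MAX_RELS];
    int         returningLists[DEMO_MAX_RELS];
} DemoModifyTable;

/*
 * Compact all parallel arrays in place, keeping entry i only when
 * unpruned[i] is true.  Every array is moved in the same step, so index k
 * keeps describing the same relation in all of them.  Returns the number of
 * entries kept.
 */
size_t
demo_keep_unpruned(DemoModifyTable *mt, const bool *unpruned)
{
    size_t      kept = 0;

    assert(mt->nrels <= DEMO_MAX_RELS);

    for (size_t i = 0; i < mt->nrels; i++)
    {
        if (!unpruned[i])
            continue;           /* pruned: drop it from every array */

        mt->resultRelations[kept] = mt->resultRelations[i];
        mt->withCheckOptionLists[kept] = mt->withCheckOptionLists[i];
        mt->returningLists[kept] = mt->returningLists[i];
        kept++;
    }
    mt->nrels = kept;
    return kept;
}
```

If only resultRelations were compacted, entry k of the other arrays would describe a different relation than entry k of resultRelations, which is the kind of mismatch described above.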
Also, I've switched the order of 0002 and 0003 to avoid a situation
where I add a function in 0002 only to remove it in 0003. By doing
the refactoring to initialize PartitionPruneContexts lazily first, the
patch to move the initial pruning to occur before ExecInitNode()
became much simpler as it doesn't need to touch the code related to
exec pruning.
--
Thanks, Amit Langote
Attachments:
[application/octet-stream] v56-0005-Handle-CachedPlan-invalidation-in-the-executor.patch (58.0K, 2-v56-0005-Handle-CachedPlan-invalidation-in-the-executor.patch)
From 74830439945fb9d7b593bbea8b19a213aa4eb47c Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 22 Aug 2024 19:38:13 +0900
Subject: [PATCH v56 5/5] Handle CachedPlan invalidation in the executor
This commit makes changes to handle cases where a cached plan
becomes invalid before deferred locks on prunable relations are taken.
* Add checks at various points in ExecutorStart() and its called
functions to determine if the plan becomes invalid. If detected,
the function and its callers return immediately. A previous commit
ensures any partially initialized PlanState tree objects are cleaned
up appropriately.
* Introduce ExecutorStartExt(), a wrapper over ExecutorStart(), to
handle cases where plan initialization is aborted due to invalidation.
ExecutorStartExt() creates a new transient CachedPlan if needed and
retries execution. This new entry point is only required for sites
using plancache.c. It requires passing the QueryDesc, eflags,
CachedPlanSource, and query_index (index in CachedPlanSource.query_list).
* Add GetSingleCachedPlan() in plancache.c to create a transient
CachedPlan for a specified query in the given CachedPlanSource.
Such CachedPlans are tracked in a separate global list for the
plancache invalidation callbacks to check.
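
The retry behavior described in the points above can be sketched as a standalone model. This is a sketch under stated assumptions, not the patch's implementation: DemoPlan, DemoSource, demo_executor_start(), demo_replan(), and demo_executor_start_ext() are hypothetical names, and a counter of pending invalidations stands in for invalidation messages processed while deferred locks are taken.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these names exist in PostgreSQL. */
typedef struct DemoPlan
{
    bool        is_valid;       /* cleared when the plan is invalidated */
    int         generation;     /* which planning attempt produced it */
} DemoPlan;

typedef struct DemoSource
{
    int         invalidations_pending;  /* invalidations not yet processed */
    int         next_generation;        /* generation of the latest plan */
} DemoSource;

/*
 * Model of plan initialization: taking deferred locks may process a pending
 * invalidation, which marks the plan invalid.  Returns true if the plan is
 * still valid once initialization is done.
 */
bool
demo_executor_start(DemoSource *src, DemoPlan *plan)
{
    if (src->invalidations_pending > 0)
    {
        src->invalidations_pending--;
        plan->is_valid = false;
    }
    return plan->is_valid;
}

/* Model of building a fresh plan for the same query. */
DemoPlan
demo_replan(DemoSource *src)
{
    DemoPlan    plan = {true, ++src->next_generation};

    return plan;
}

/*
 * Model of the wrapper: keep replanning and retrying until initialization
 * finishes with a valid plan.  Returns the number of attempts made.
 */
int
demo_executor_start_ext(DemoSource *src, DemoPlan *plan)
{
    int         attempts = 1;

    while (!demo_executor_start(src, plan))
    {
        *plan = demo_replan(src);
        attempts++;
    }
    assert(plan->is_valid);
    return attempts;
}
```

With one pending invalidation the model makes two attempts, which mirrors the "not valid" followed by "valid" NOTICE pairs in the isolation test output.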
This also adds isolation tests using the delay_execution test module
to verify scenarios where a CachedPlan becomes invalid before the
deferred locks are taken.
All ExecutorStart_hook implementations must now add the following
block after the ExecutorStart() call to ensure that they don't proceed
with an invalid plan:
    /* The plan may have become invalid during ExecutorStart() */
    if (!ExecPlanStillValid(queryDesc->estate))
        return;
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
contrib/auto_explain/auto_explain.c | 4 +
.../pg_stat_statements/pg_stat_statements.c | 4 +
src/backend/commands/explain.c | 8 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 10 +-
src/backend/commands/trigger.c | 14 ++
src/backend/executor/README | 35 ++-
src/backend/executor/execMain.c | 84 ++++++-
src/backend/executor/execUtils.c | 3 +-
src/backend/executor/spi.c | 19 +-
src/backend/tcop/postgres.c | 4 +-
src/backend/tcop/pquery.c | 31 ++-
src/backend/utils/cache/plancache.c | 206 ++++++++++++++++
src/backend/utils/mmgr/portalmem.c | 4 +-
src/include/commands/explain.h | 1 +
src/include/commands/trigger.h | 1 +
src/include/executor/execdesc.h | 1 +
src/include/executor/executor.h | 17 ++
src/include/nodes/execnodes.h | 1 +
src/include/utils/plancache.h | 26 ++
src/include/utils/portal.h | 4 +-
src/test/modules/delay_execution/Makefile | 3 +-
.../modules/delay_execution/delay_execution.c | 63 ++++-
.../expected/cached-plan-inval.out | 230 ++++++++++++++++++
src/test/modules/delay_execution/meson.build | 1 +
.../specs/cached-plan-inval.spec | 75 ++++++
26 files changed, 814 insertions(+), 36 deletions(-)
create mode 100644 src/test/modules/delay_execution/expected/cached-plan-inval.out
create mode 100644 src/test/modules/delay_execution/specs/cached-plan-inval.spec
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 677c135f59..9eb5e9a619 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -300,6 +300,10 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
if (auto_explain_enabled())
{
/*
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 3c72e437f7..76642b557a 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -985,6 +985,10 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
/*
* If query has queryId zero, don't track it. This prevents double
* counting of optimizable statements that are directly contained in
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 49f7370734..b7a0b8c05b 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -509,7 +509,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, NULL, -1, into, es, queryString, params,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -618,6 +619,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
*/
void
ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int query_index,
IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
@@ -688,8 +690,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
if (into)
eflags |= GetIntoRelEFlags(into);
- /* call ExecutorStart to prepare the plan for execution */
- ExecutorStart(queryDesc, eflags);
+ /* Call ExecutorStartExt to prepare the plan for execution. */
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
/* Execute the plan for statistics if asked for */
if (es->analyze)
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 4f6acf6719..4b1503c05e 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ NULL,
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 311b9ebd5b..4cd79a6e3a 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -202,7 +202,8 @@ ExecuteQuery(ParseState *pstate,
query_string,
entry->plansource->commandTag,
plan_list,
- cplan);
+ cplan,
+ entry->plansource);
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
@@ -583,6 +584,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
MemoryContextCounters mem_counters;
MemoryContext planner_ctx = NULL;
MemoryContext saved_ctx = NULL;
+ int i = 0;
if (es->memory)
{
@@ -655,8 +657,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
- queryEnv,
+ ExplainOnePlan(pstmt, cplan, entry->plansource, i,
+ into, es, query_string, paramLI, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
@@ -668,6 +670,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Separate plans with an appropriate separator */
if (lnext(plan_list, p) != NULL)
ExplainSeparatePlans(es);
+
+ i++;
}
if (estate)
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 29d30bfb6f..e33b8f573b 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -5120,6 +5120,20 @@ AfterTriggerEndQuery(EState *estate)
afterTriggers.query_depth--;
}
+/* ----------
+ * AfterTriggerAbortQuery()
+ *
+ * Called by ExecutorEnd() if the query execution was aborted due to the
+ * plan becoming invalid during initialization.
+ * ----------
+ */
+void
+AfterTriggerAbortQuery(void)
+{
+ /* Revert the actions of AfterTriggerBeginQuery(). */
+ afterTriggers.query_depth--;
+}
+
/*
* AfterTriggerFreeQuery
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be61..c76a00b394 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -280,6 +280,28 @@ are typically reset to empty once per tuple. Per-tuple contexts are usually
associated with ExprContexts, and commonly each PlanState node has its own
ExprContext to evaluate its qual and targetlist expressions in.
+Relation Locking
+----------------
+
+Typically, when the executor initializes a plan tree for execution, it doesn't
+lock non-index relations if the plan tree is freshly generated and not derived
+from a CachedPlan. This is because such locks have already been established
+during the query's parsing, rewriting, and planning phases. However, with a
+cached plan tree, some relations may remain unlocked. The function
+AcquireExecutorLocks() only locks unprunable relations in the plan, deferring
+the locking of prunable ones to executor initialization. This avoids
+unnecessary locking of relations that will be pruned during "initial" runtime
+pruning in ExecDoInitialPruning().
+
+This approach creates a window where a cached plan tree with child tables
+could become outdated if another backend modifies these tables before
+ExecDoInitialPruning() locks them. As a result, the executor has the added duty
+to verify the plan tree's validity whenever it locks a child table after
+doing initial pruning. This validation is done by checking the CachedPlan.is_valid
+attribute. If the plan tree is outdated (is_valid=false), the executor halts
+further initialization, cleans up anything in EState that would have been
+allocated up to that point, and retries execution after recreating the
+invalid plan in the CachedPlan.
Query Processing Control Flow
-----------------------------
@@ -288,11 +310,13 @@ This is a sketch of control flow for full query processing:
CreateQueryDesc
- ExecutorStart
+ ExecutorStart or ExecutorStartExt
CreateExecutorState
creates per-query context
- switch to per-query context to run ExecInitNode
+ switch to per-query context to run ExecDoInitialPruning and ExecInitNode
AfterTriggerBeginQuery
+ ExecDoInitialPruning
+ does initial pruning and locks surviving partitions if needed
ExecInitNode --- recursively scans plan tree
ExecInitNode
recurse into subsidiary nodes
@@ -316,7 +340,12 @@ This is a sketch of control flow for full query processing:
FreeQueryDesc
-Per above comments, it's not really critical for ExecEndNode to free any
+As mentioned in the "Relation Locking" section, if the plan tree is found to
+be stale after locking partitions in ExecDoInitialPruning(), the control is
+immediately returned to ExecutorStartExt(), which will create a new plan tree
+and perform the steps starting from CreateExecutorState() again.
+
+Per above comments, it's not really critical for ExecEndPlan to free any
memory; it'll all go away in FreeExecutorState anyway. However, we do need to
be careful to close relations, drop buffer pins, etc, so we do need to scan
the plan state tree to find these sorts of resources.
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 2c14ee2b6b..7a6954204e 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -60,6 +60,7 @@
#include "utils/backend_status.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
+#include "utils/plancache.h"
#include "utils/rls.h"
#include "utils/snapmgr.h"
@@ -138,6 +139,60 @@ ExecutorStart(QueryDesc *queryDesc, int eflags)
standard_ExecutorStart(queryDesc, eflags);
}
+/*
+ * A variant of ExecutorStart() that handles cleanup and replanning if the
+ * input CachedPlan becomes invalid due to locks being taken during
+ * ExecutorStart(). If that happens, a new CachedPlan is created
+ * only for the query at index 'query_index' in plansource->query_list, which
+ * is released separately from the original CachedPlan.
+ */
+void
+ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource,
+ int query_index)
+{
+ if (queryDesc->cplan == NULL)
+ {
+ ExecutorStart(queryDesc, eflags);
+ return;
+ }
+
+ while (1)
+ {
+ ExecutorStart(queryDesc, eflags);
+ if (!CachedPlanValid(queryDesc->cplan))
+ {
+ CachedPlan *cplan;
+
+ /*
+ * The plan got invalidated, so try with a new updated plan.
+ *
+ * But first undo what ExecutorStart() would've done. Mark
+ * execution as aborted to ensure that AFTER trigger state is
+ * properly reset.
+ */
+ queryDesc->estate->es_aborted = true;
+ ExecutorEnd(queryDesc);
+
+ cplan = GetSingleCachedPlan(plansource, query_index,
+ queryDesc->queryEnv);
+
+ /*
+ * Install the new transient cplan into the QueryDesc replacing
+ * the old one so that executor initialization code can see it.
+ * Mark it as in use by us and ask FreeQueryDesc() to release it.
+ */
+ cplan->refcount = 1;
+ queryDesc->cplan = cplan;
+ queryDesc->cplan_release = true;
+ queryDesc->plannedstmt = linitial_node(PlannedStmt,
+ queryDesc->cplan->stmt_list);
+ }
+ else
+ break; /* ExecutorStart() succeeded! */
+ }
+}
+
void
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
@@ -324,6 +379,7 @@ standard_ExecutorRun(QueryDesc *queryDesc,
estate = queryDesc->estate;
Assert(estate != NULL);
+ Assert(!estate->es_aborted);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/* caller must ensure the query's snapshot is active */
@@ -433,8 +489,11 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
Assert(estate != NULL);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
- /* This should be run once and only once per Executor instance */
- Assert(!estate->es_finished);
+ /*
+ * This should be run once and only once per Executor instance and never
+ * if the execution was aborted.
+ */
+ Assert(!estate->es_finished && !estate->es_aborted);
/* Switch into per-query memory context */
oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -496,11 +555,10 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
Assert(estate != NULL);
/*
- * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This
- * Assert is needed because ExecutorFinish is new as of 9.1, and callers
- * might forget to call it.
+ * Check that ExecutorFinish was called, unless in EXPLAIN-only mode or if
+ * execution was aborted.
*/
- Assert(estate->es_finished ||
+ Assert(estate->es_finished || estate->es_aborted ||
(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/*
@@ -514,6 +572,14 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
UnregisterSnapshot(estate->es_snapshot);
UnregisterSnapshot(estate->es_crosscheck_snapshot);
+ /*
+ * Reset AFTER trigger module if the query execution was aborted.
+ */
+ if (estate->es_aborted &&
+ !(estate->es_top_eflags &
+ (EXEC_FLAG_SKIP_TRIGGERS | EXEC_FLAG_EXPLAIN_ONLY)))
+ AfterTriggerAbortQuery();
+
/*
* Must switch out of context before destroying it
*/
@@ -972,6 +1038,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
ExecDoInitialPruning(estate);
+ if (!ExecPlanStillValid(estate))
+ return;
+
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
*/
@@ -2958,6 +3027,9 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
* the snapshot, rangetable, and external Param info. They need their own
* copies of local state, including a tuple table, es_param_exec_vals,
* result-rel info, etc.
+ *
+ * es_cachedplan is not copied because EPQ plan execution does not acquire
+ * any new locks that could invalidate the CachedPlan.
*/
rcestate->es_direction = ForwardScanDirection;
rcestate->es_snapshot = parentestate->es_snapshot;
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 67734979b0..435ae0df7a 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -147,6 +147,7 @@ CreateExecutorState(void)
estate->es_top_eflags = 0;
estate->es_instrument = 0;
estate->es_finished = false;
+ estate->es_aborted = false;
estate->es_exprcontexts = NIL;
@@ -757,7 +758,7 @@ ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos)
* ExecGetRangeTableRelation
* Open the Relation for a range table entry, if not already done
*
- * The Relations will be closed again in ExecEndPlan().
+ * The Relations will be closed in ExecEndPlan().
*/
Relation
ExecGetRangeTableRelation(EState *estate, Index rti)
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 659bd6dcd9..f84f376c9c 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -70,7 +70,8 @@ static int _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
-static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount);
+static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index);
static void _SPI_error_callback(void *arg);
@@ -1682,7 +1683,8 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
- cplan);
+ cplan,
+ plansource);
/*
* Set up options for portal. Default SCROLL type is chosen the same way
@@ -2494,6 +2496,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
ListCell *lc2;
+ int i = 0;
spicallbackarg.query = plansource->query_string;
@@ -2691,8 +2694,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
options->params,
_SPI_current->queryEnv,
0);
- res = _SPI_pquery(qdesc, fire_triggers,
- canSetTag ? options->tcount : 0);
+
+ res = _SPI_pquery(qdesc, fire_triggers, canSetTag ? options->tcount : 0,
+ plansource, i);
FreeQueryDesc(qdesc);
}
else
@@ -2789,6 +2793,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
my_res = res;
goto fail;
}
+
+ i++;
}
/* Done with this plan, so release refcount */
@@ -2866,7 +2872,8 @@ _SPI_convert_params(int nargs, Oid *argtypes,
}
static int
-_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
+_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index)
{
int operation = queryDesc->operation;
int eflags;
@@ -2922,7 +2929,7 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
else
eflags = EXEC_FLAG_SKIP_TRIGGERS;
- ExecutorStart(queryDesc, eflags);
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
ExecutorRun(queryDesc, ForwardScanDirection, tcount, true);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index e394f1419a..b95c859655 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1237,6 +1237,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ NULL,
NULL);
/*
@@ -2039,7 +2040,8 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
- cplan);
+ cplan,
+ psrc);
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 6e8f6b1b8f..dbb0ffb771 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -19,6 +19,7 @@
#include "access/xact.h"
#include "commands/prepare.h"
+#include "executor/execdesc.h"
#include "executor/tstoreReceiver.h"
#include "miscadmin.h"
#include "pg_trace.h"
@@ -37,6 +38,8 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -80,6 +83,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
+ qd->cplan_release = false;
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -114,6 +118,13 @@ FreeQueryDesc(QueryDesc *qdesc)
UnregisterSnapshot(qdesc->snapshot);
UnregisterSnapshot(qdesc->crosscheck_snapshot);
+ /*
+ * Release CachedPlan if requested. The CachedPlan is not associated with
+ * a ResourceOwner when cplan_release is true; see ExecutorStartExt().
+ */
+ if (qdesc->cplan_release)
+ ReleaseCachedPlan(qdesc->cplan, NULL);
+
/* Only the QueryDesc itself need be freed */
pfree(qdesc);
}
@@ -126,6 +137,8 @@ FreeQueryDesc(QueryDesc *qdesc)
*
* plan: the plan tree for the query
* cplan: CachedPlan supplying the plan
+ * plansource: CachedPlanSource supplying the cplan
+ * query_index: index of the query in plansource->query_list
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -139,6 +152,8 @@ FreeQueryDesc(QueryDesc *qdesc)
static void
ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -157,7 +172,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Call ExecutorStart to prepare the plan for execution
*/
- ExecutorStart(queryDesc, 0);
+ ExecutorStartExt(queryDesc, 0, plansource, query_index);
/*
* Run the plan to completion.
@@ -518,9 +533,12 @@ PortalStart(Portal portal, ParamListInfo params,
myeflags = eflags;
/*
- * Call ExecutorStart to prepare the plan for execution
+ * Call ExecutorStartExt() to prepare the plan for execution. If
+ * the portal is using a cached plan, it may get invalidated
+ * during plan initialization, in which case a new one is
+ * created and saved in the QueryDesc.
*/
- ExecutorStart(queryDesc, myeflags);
+ ExecutorStartExt(queryDesc, myeflags, portal->plansource, 0);
/*
* This tells PortalCleanup to shut down the executor
@@ -1201,6 +1219,7 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
+ int i = 0;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1283,6 +1302,8 @@ PortalRunMulti(Portal portal,
/* statement can set tag string */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1293,6 +1314,8 @@ PortalRunMulti(Portal portal,
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1357,6 +1380,8 @@ PortalRunMulti(Portal portal,
*/
if (lnext(portal->stmts, stmtlist_item) != NULL)
CommandCounterIncrement();
+
+ i++;
}
/* Pop the snapshot if we pushed one. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5b75dadf13..d33f871ea2 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -94,6 +94,14 @@
*/
static dlist_head saved_plan_list = DLIST_STATIC_INIT(saved_plan_list);
+/*
+ * Head of the backend's list of "standalone" CachedPlans that are not
+ * associated with a CachedPlanSource, created by GetSingleCachedPlan() for
+ * transient use by the executor in certain scenarios where they're needed
+ * only for one execution of the plan.
+ */
+static dlist_head standalone_plan_list = DLIST_STATIC_INIT(standalone_plan_list);
+
/*
* This is the head of the backend's list of CachedExpressions.
*/
@@ -905,6 +913,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * Note: When changing this, you should also look at GetSingleCachedPlan().
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
@@ -1034,6 +1044,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
+ plan->is_standalone = false;
/* assign generation number to new plan */
plan->generation = ++(plansource->generation);
@@ -1282,6 +1293,121 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
return plan;
}
+/*
+ * Create a fresh CachedPlan for the query_index'th query in the provided
+ * CachedPlanSource.
+ *
+ * The created CachedPlan is standalone, meaning it is not tracked in the
+ * CachedPlanSource. The CachedPlan and its plan trees are allocated in a
+ * child context of the caller's memory context. The caller must ensure they
+ * remain valid until execution is complete, after which the plan should be
+ * released by calling ReleaseCachedPlan().
+ *
+ * This function primarily supports ExecutorStartExt(), which handles cases
+ * where the original generic CachedPlan becomes invalid after prunable
+ * relations are locked.
+ */
+CachedPlan *
+GetSingleCachedPlan(CachedPlanSource *plansource, int query_index,
+ QueryEnvironment *queryEnv)
+{
+ List *query_list = plansource->query_list,
+ *plan_list;
+ CachedPlan *plan = plansource->gplan;
+ MemoryContext oldcxt = CurrentMemoryContext,
+ plan_context;
+ PlannedStmt *plannedstmt;
+
+ Assert(ActiveSnapshotSet());
+
+ /* Sanity checks */
+ if (plan == NULL)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context: plansource->gplan is NULL");
+ else if (plan->is_valid)
+ elog(ERROR, "GetSingleCachedPlan() called in the wrong context: plansource->gplan->is_valid");
+
+ /*
+ * The plansource might have become invalid since GetCachedPlan(). See the
+ * comment in BuildCachedPlan() for details on why this might happen.
+ *
+ * The risk is greater here because this function is called from the
+ * executor, meaning much more processing may have occurred compared to
+ * when BuildCachedPlan() is called from GetCachedPlan().
+ */
+ if (!plansource->is_valid)
+ query_list = RevalidateCachedQuery(plansource, queryEnv);
+ Assert(query_list != NIL);
+
+ /*
+ * Build a new generic plan for the query_index'th query, but make a copy
+ * to be scribbled on by the planner
+ */
+ query_list = list_make1(copyObject(list_nth_node(Query, query_list,
+ query_index)));
+ plan_list = pg_plan_queries(query_list, plansource->query_string,
+ plansource->cursor_options, NULL);
+
+ list_free_deep(query_list);
+
+ /*
+ * Make a dedicated memory context for the CachedPlan and its subsidiary
+ * data so that we can release it in ReleaseCachedPlan() that will be
+ * called in FreeQueryDesc().
+ */
+ plan_context = AllocSetContextCreate(CurrentMemoryContext,
+ "Standalone CachedPlan",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextCopyAndSetIdentifier(plan_context, plansource->query_string);
+
+ /*
+ * Copy plan into the new context.
+ */
+ MemoryContextSwitchTo(plan_context);
+ plan_list = copyObject(plan_list);
+
+ /*
+ * Create and fill the CachedPlan struct within the new context.
+ */
+ plan = (CachedPlan *) palloc(sizeof(CachedPlan));
+ plan->magic = CACHEDPLAN_MAGIC;
+ plan->stmt_list = plan_list;
+
+ plan->planRoleId = GetUserId();
+ Assert(list_length(plan_list) == 1);
+ plannedstmt = linitial_node(PlannedStmt, plan_list);
+
+ /*
+ * CachedPlan is dependent on role either if RLS affected the rewrite
+ * phase or if a role dependency was injected during planning. And it's
+ * transient if any plan is marked so.
+ */
+ plan->dependsOnRole = plansource->dependsOnRLS || plannedstmt->dependsOnRole;
+ if (plannedstmt->transientPlan)
+ {
+ Assert(TransactionIdIsNormal(TransactionXmin));
+ plan->saved_xmin = TransactionXmin;
+ }
+ else
+ plan->saved_xmin = InvalidTransactionId;
+ plan->refcount = 0;
+ plan->context = plan_context;
+ plan->is_oneshot = false;
+ plan->is_generic = true;
+ plan->is_saved = false;
+ plan->is_valid = true;
+ plan->is_standalone = true;
+ plan->generation = 1;
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * Add the entry to the global list of "standalone" cached plans. It is
+ * removed from the list by ReleaseCachedPlan().
+ */
+ dlist_push_tail(&standalone_plan_list, &plan->node);
+
+ return plan;
+}
+
/*
* ReleaseCachedPlan: release active use of a cached plan.
*
@@ -1309,6 +1435,10 @@ ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner)
/* Mark it no longer valid */
plan->magic = 0;
+ /* Remove from the global list if we are a standalone plan. */
+ if (plan->is_standalone)
+ dlist_delete(&plan->node);
+
/* One-shot plans do not own their context, so we can't free them */
if (!plan->is_oneshot)
MemoryContextDelete(plan->context);
@@ -2066,6 +2196,33 @@ PlanCacheRelCallback(Datum arg, Oid relid)
cexpr->is_valid = false;
}
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ if (cplan->is_valid)
+ {
+ ListCell *lc;
+
+ foreach(lc, cplan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue; /* Ignore utility statements */
+ if ((relid == InvalidOid) ? plannedstmt->relationOids != NIL :
+ list_member_oid(plannedstmt->relationOids, relid))
+ cplan->is_valid = false;
+ if (!cplan->is_valid)
+ break; /* out of stmt_list scan */
+ }
+ }
+ }
}
/*
@@ -2176,6 +2333,44 @@ PlanCacheObjectCallback(Datum arg, int cacheid, uint32 hashvalue)
}
}
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ if (cplan->is_valid)
+ {
+ ListCell *lc;
+
+ foreach(lc, cplan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc);
+ ListCell *lc3;
+
+ if (plannedstmt->commandType == CMD_UTILITY)
+ continue; /* Ignore utility statements */
+ foreach(lc3, plannedstmt->invalItems)
+ {
+ PlanInvalItem *item = (PlanInvalItem *) lfirst(lc3);
+
+ if (item->cacheId != cacheid)
+ continue;
+ if (hashvalue == 0 ||
+ item->hashValue == hashvalue)
+ {
+ cplan->is_valid = false;
+ break; /* out of invalItems scan */
+ }
+ }
+ if (!cplan->is_valid)
+ break; /* out of stmt_list scan */
+ }
+ }
+ }
}
/*
@@ -2235,6 +2430,17 @@ ResetPlanCache(void)
cexpr->is_valid = false;
}
+
+ /* Finally, invalidate any standalone cached plans */
+ dlist_foreach(iter, &standalone_plan_list)
+ {
+ CachedPlan *cplan = dlist_container(CachedPlan,
+ node, iter.cur);
+
+ Assert(cplan->magic == CACHEDPLAN_MAGIC);
+
+ cplan->is_valid = false;
+ }
}
/*
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 4a24613537..bf70fd4ce7 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,7 +284,8 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan)
+ CachedPlan *cplan,
+ CachedPlanSource *plansource)
{
Assert(PortalIsValid(portal));
Assert(portal->status == PORTAL_NEW);
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->commandTag = commandTag;
portal->stmts = stmts;
portal->cplan = cplan;
+ portal->plansource = plansource;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 21c71e0d53..a39989a950 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -104,6 +104,7 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ParamListInfo params, QueryEnvironment *queryEnv);
extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int query_index,
IntoClause *into, ExplainState *es,
const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index 8a5a9fe642..db21561c8c 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -258,6 +258,7 @@ extern void ExecASTruncateTriggers(EState *estate,
extern void AfterTriggerBeginXact(void);
extern void AfterTriggerBeginQuery(void);
extern void AfterTriggerEndQuery(EState *estate);
+extern void AfterTriggerAbortQuery(void);
extern void AfterTriggerFireDeferred(void);
extern void AfterTriggerEndXact(bool isCommit);
extern void AfterTriggerBeginSubXact(void);
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0e7245435d..f6cb6479c0 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -36,6 +36,7 @@ typedef struct QueryDesc
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
+ bool cplan_release; /* Should FreeQueryDesc() release cplan? */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 69c3ebff00..5bc0edb5a0 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -19,6 +19,7 @@
#include "nodes/lockoptions.h"
#include "nodes/parsenodes.h"
#include "utils/memutils.h"
+#include "utils/plancache.h"
/*
@@ -198,6 +199,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
* prototypes from functions in execMain.c
*/
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern void ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource, int query_index);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count, bool execute_once);
@@ -261,6 +264,19 @@ extern void ExecEndNode(PlanState *node);
extern void ExecShutdownNode(PlanState *node);
extern void ExecSetTupleBound(int64 tuples_needed, PlanState *child_node);
+/*
+ * Is the CachedPlan in es_cachedplan still valid?
+ *
+ * Called from InitPlan() because invalidation messages that affect the plan
+ * might be received after locks have been taken on runtime-prunable relations.
+ * The caller should take appropriate action if the plan has become invalid.
+ */
+static inline bool
+ExecPlanStillValid(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? true :
+ CachedPlanValid(estate->es_cachedplan);
+}
/* ----------------------------------------------------------------
* ExecProcNode
@@ -589,6 +605,7 @@ extern void ExecCreateScanSlotFromOuterPlan(EState *estate,
extern bool ExecRelationIsTargetRelation(EState *estate, Index scanrelid);
extern Relation ExecOpenScanRelation(EState *estate, Index scanrelid, int eflags);
+extern Relation ExecOpenScanIndexRelation(EState *estate, Oid indexid, int lockmode);
extern void ExecInitRangeTable(EState *estate, List *rangeTable, List *permInfos);
extern void ExecCloseRangeTableRelations(EState *estate);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index bd68c60a0b..c80ccf0349 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -690,6 +690,7 @@ typedef struct EState
int es_top_eflags; /* eflags passed to ExecutorStart */
int es_instrument; /* OR of InstrumentOption flags */
bool es_finished; /* true when ExecutorFinish is done */
+ bool es_aborted; /* true when execution was aborted */
List *es_exprcontexts; /* List of ExprContexts within EState */
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index 0b5ee007ca..154f68f671 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -18,6 +18,7 @@
#include "access/tupdesc.h"
#include "lib/ilist.h"
#include "nodes/params.h"
+#include "nodes/parsenodes.h"
#include "tcop/cmdtag.h"
#include "utils/queryenvironment.h"
#include "utils/resowner.h"
@@ -152,6 +153,8 @@ typedef struct CachedPlan
bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
+ bool is_standalone; /* is it not associated with a
+ * CachedPlanSource? */
Oid planRoleId; /* Role ID the plan was created for */
bool dependsOnRole; /* is plan specific to that role? */
TransactionId saved_xmin; /* if valid, replan when TransactionXmin
@@ -159,6 +162,12 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+
+ /*
+ * If the plan is not associated with a CachedPlanSource, it is saved in
+ * a separate global list.
+ */
+ dlist_node node; /* list link, if is_standalone */
} CachedPlan;
/*
@@ -224,6 +233,10 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
QueryEnvironment *queryEnv);
+extern CachedPlan *GetSingleCachedPlan(CachedPlanSource *plansource,
+ int query_index,
+ QueryEnvironment *queryEnv);
+
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
@@ -245,4 +258,17 @@ CachedPlanRequiresLocking(CachedPlan *cplan)
return cplan->is_generic;
}
+/*
+ * CachedPlanValid
+ * Returns whether a cached generic plan is still valid.
+ *
+ * Invoked by the executor to check if the plan has not been invalidated after
+ * taking locks during the initialization of the plan.
+ */
+static inline bool
+CachedPlanValid(CachedPlan *cplan)
+{
+ return cplan->is_valid;
+}
+
#endif /* PLANCACHE_H */
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 29f49829f2..58c3828d2c 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ CachedPlanSource *plansource; /* CachedPlanSource, for cplan */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -241,7 +242,8 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan);
+ CachedPlan *cplan,
+ CachedPlanSource *plansource);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
diff --git a/src/test/modules/delay_execution/Makefile b/src/test/modules/delay_execution/Makefile
index 70f24e846d..3eeb097fde 100644
--- a/src/test/modules/delay_execution/Makefile
+++ b/src/test/modules/delay_execution/Makefile
@@ -8,7 +8,8 @@ OBJS = \
delay_execution.o
ISOLATION = partition-addition \
- partition-removal-1
+ partition-removal-1 \
+ cached-plan-inval
ifdef USE_PGXS
PG_CONFIG = pg_config
diff --git a/src/test/modules/delay_execution/delay_execution.c b/src/test/modules/delay_execution/delay_execution.c
index 155c8a8d55..304ca77f7b 100644
--- a/src/test/modules/delay_execution/delay_execution.c
+++ b/src/test/modules/delay_execution/delay_execution.c
@@ -1,14 +1,18 @@
/*-------------------------------------------------------------------------
*
* delay_execution.c
- * Test module to allow delay between parsing and execution of a query.
+ * Test module to introduce delay at various points during execution of a
+ * query to test that execution proceeds safely in light of concurrent
+ * changes.
*
* The delay is implemented by taking and immediately releasing a specified
* advisory lock. If another process has previously taken that lock, the
* current process will be blocked until the lock is released; otherwise,
* there's no effect. This allows an isolationtester script to reliably
- * test behaviors where some specified action happens in another backend
- * between parsing and execution of any desired query.
+ * test behaviors where some specified action happens in another backend in
+ * a couple of cases: 1) between parsing and execution of any desired query
+ * when using the planner_hook, 2) between RevalidateCachedQuery() and
+ * ExecutorStart() when using the ExecutorStart_hook.
*
* Copyright (c) 2020-2024, PostgreSQL Global Development Group
*
@@ -22,6 +26,7 @@
#include <limits.h>
+#include "executor/executor.h"
#include "optimizer/planner.h"
#include "utils/builtins.h"
#include "utils/guc.h"
@@ -32,9 +37,11 @@ PG_MODULE_MAGIC;
/* GUC: advisory lock ID to use. Zero disables the feature. */
static int post_planning_lock_id = 0;
+static int executor_start_lock_id = 0;
-/* Save previous planner hook user to be a good citizen */
+/* Save previous hook users to be a good citizen */
static planner_hook_type prev_planner_hook = NULL;
+static ExecutorStart_hook_type prev_ExecutorStart_hook = NULL;
/* planner_hook function to provide the desired delay */
@@ -70,11 +77,41 @@ delay_execution_planner(Query *parse, const char *query_string,
return result;
}
+/* ExecutorStart_hook function to provide the desired delay */
+static void
+delay_execution_ExecutorStart(QueryDesc *queryDesc, int eflags)
+{
+ /* If enabled, delay by taking and releasing the specified lock */
+ if (executor_start_lock_id != 0)
+ {
+ DirectFunctionCall1(pg_advisory_lock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+ DirectFunctionCall1(pg_advisory_unlock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+
+ /*
+ * Ensure that we notice any pending invalidations, since the advisory
+ * lock functions don't do this.
+ */
+ AcceptInvalidationMessages();
+ }
+
+ /* Now start the executor, possibly via a previous hook user */
+ if (prev_ExecutorStart_hook)
+ prev_ExecutorStart_hook(queryDesc, eflags);
+ else
+ standard_ExecutorStart(queryDesc, eflags);
+
+ if (executor_start_lock_id != 0)
+ elog(NOTICE, "Finished ExecutorStart(): CachedPlan is %s",
+ CachedPlanValid(queryDesc->cplan) ? "valid" : "not valid");
+}
+
/* Module load function */
void
_PG_init(void)
{
- /* Set up the GUC to control which lock is used */
+ /* Set up GUCs to control which lock is used */
DefineCustomIntVariable("delay_execution.post_planning_lock_id",
"Sets the advisory lock ID to be locked/unlocked after planning.",
"Zero disables the delay.",
@@ -86,10 +123,22 @@ _PG_init(void)
NULL,
NULL,
NULL);
-
+ DefineCustomIntVariable("delay_execution.executor_start_lock_id",
+ "Sets the advisory lock ID to be locked/unlocked before starting execution.",
+ "Zero disables the delay.",
+ &executor_start_lock_id,
+ 0,
+ 0, INT_MAX,
+ PGC_USERSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
MarkGUCPrefixReserved("delay_execution");
- /* Install our hook */
+ /* Install our hooks. */
prev_planner_hook = planner_hook;
planner_hook = delay_execution_planner;
+ prev_ExecutorStart_hook = ExecutorStart_hook;
+ ExecutorStart_hook = delay_execution_ExecutorStart;
}
diff --git a/src/test/modules/delay_execution/expected/cached-plan-inval.out b/src/test/modules/delay_execution/expected/cached-plan-inval.out
new file mode 100644
index 0000000000..e002cfbc9c
--- /dev/null
+++ b/src/test/modules/delay_execution/expected/cached-plan-inval.out
@@ -0,0 +1,230 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1prep s2lock s1exec s2dropi s2unlock
+step s1prep: SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1);
+QUERY PLAN
+------------------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = $1)
+(7 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+-------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Filter: (a = $1)
+(5 rows)
+
+
+starting permutation: s1prep2 s2lock s1exec2 s2dropi s2unlock
+step s1prep2: SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(11 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec2: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec2: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(6 rows)
+
+
+starting permutation: s1prep3 s2lock s1exec3 s2dropi s2unlock
+step s1prep3: SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(26 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec3: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec3: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(16 rows)
+
+
+starting permutation: s1prep4 s2lock s1exec4 s2dropi s2unlock
+step s1prep4: SET plan_cache_mode = force_generic_plan;
+ SET enable_seqscan TO off;
+ PREPARE q4 AS SELECT * FROM generate_series(1, 1) WHERE EXISTS (SELECT * FROM foov WHERE a = $1 FOR UPDATE);
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1);
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+---------------------------------------------------------------
+Result
+ One-Time Filter: (InitPlan 1).col1
+ InitPlan 1
+ -> LockRows
+ Disabled Nodes: 2
+ -> Append
+ Disabled Nodes: 2
+ Subplans Removed: 2
+ -> Index Scan using foo12_1_a on foo12_1 foo_1
+ Index Cond: (a = $1)
+ -> Function Scan on generate_series
+(11 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec4: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec4: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+---------------------------------------------
+Result
+ One-Time Filter: (InitPlan 1).col1
+ InitPlan 1
+ -> LockRows
+ Disabled Nodes: 3
+ -> Append
+ Disabled Nodes: 3
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Disabled Nodes: 1
+ Filter: (a = $1)
+ -> Function Scan on generate_series
+(12 rows)
+
diff --git a/src/test/modules/delay_execution/meson.build b/src/test/modules/delay_execution/meson.build
index 41f3ac0b89..5a70b183d0 100644
--- a/src/test/modules/delay_execution/meson.build
+++ b/src/test/modules/delay_execution/meson.build
@@ -24,6 +24,7 @@ tests += {
'specs': [
'partition-addition',
'partition-removal-1',
+ 'cached-plan-inval',
],
},
}
diff --git a/src/test/modules/delay_execution/specs/cached-plan-inval.spec b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
new file mode 100644
index 0000000000..820a843051
--- /dev/null
+++ b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
@@ -0,0 +1,75 @@
+# Test to check that invalidation of cached generic plans during ExecutorStart
+# correctly triggers replanning and re-execution.
+
+setup
+{
+ CREATE TABLE foo (a int, b text) PARTITION BY LIST(a);
+ CREATE TABLE foo12 PARTITION OF foo FOR VALUES IN (1, 2) PARTITION BY LIST (a);
+ CREATE TABLE foo12_1 PARTITION OF foo12 FOR VALUES IN (1);
+ CREATE TABLE foo12_2 PARTITION OF foo12 FOR VALUES IN (2);
+ CREATE INDEX foo12_1_a ON foo12_1 (a);
+ CREATE TABLE foo3 PARTITION OF foo FOR VALUES IN (3);
+ CREATE VIEW foov AS SELECT * FROM foo;
+ CREATE FUNCTION one () RETURNS int AS $$ BEGIN RETURN 1; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE FUNCTION two () RETURNS int AS $$ BEGIN RETURN 2; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE RULE update_foo AS ON UPDATE TO foo DO ALSO SELECT 1;
+}
+
+teardown
+{
+ DROP VIEW foov;
+ DROP RULE update_foo ON foo;
+ DROP TABLE foo;
+ DROP FUNCTION one(), two();
+}
+
+session "s1"
+# Append with run-time pruning
+step "s1prep" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+
+# Another case with Append with run-time pruning
+step "s1prep2" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+
+# Case with a rule adding another query
+step "s1prep3" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+# Another case with Append with run-time pruning in a subquery
+step "s1prep4" { SET plan_cache_mode = force_generic_plan;
+ SET enable_seqscan TO off;
+ PREPARE q4 AS SELECT * FROM generate_series(1, 1) WHERE EXISTS (SELECT * FROM foov WHERE a = $1 FOR UPDATE);
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); }
+
+# Executes a generic plan
+step "s1exec" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+step "s1exec2" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+step "s1exec3" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+step "s1exec4" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); }
+
+session "s2"
+step "s2lock" { SELECT pg_advisory_lock(12345); }
+step "s2unlock" { SELECT pg_advisory_unlock(12345); }
+step "s2dropi" { DROP INDEX foo12_1_a; }
+
+# While "s1exec", etc. wait to acquire the advisory lock, "s2dropi" is able to
+# drop the index being used in the cached plan. When "s1exec" is then
+# unblocked and initializes the cached plan for execution, it detects the
+# concurrent index drop and causes the cached plan to be discarded and
+# recreated without the index.
+permutation "s1prep" "s2lock" "s1exec" "s2dropi" "s2unlock"
+permutation "s1prep2" "s2lock" "s1exec2" "s2dropi" "s2unlock"
+permutation "s1prep3" "s2lock" "s1exec3" "s2dropi" "s2unlock"
+permutation "s1prep4" "s2lock" "s1exec4" "s2dropi" "s2unlock"
--
2.43.0
[application/octet-stream] v56-0002-Initialize-PartitionPruneContexts-lazily.patch (15.6K, 3-v56-0002-Initialize-PartitionPruneContexts-lazily.patch)
From 7ba748a1055880ee20f908a2cf2757f2ad82e9ef Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 18 Sep 2024 11:16:48 +0900
Subject: [PATCH v56 2/5] Initialize PartitionPruneContexts lazily
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This commit moves the initialization of PartitionPruneContexts for
both initial and exec pruning steps from CreatePartitionPruneState()
to find_matching_subplans_recurse(), where they are actually needed.
To track whether the context has been initialized and is ready for
use, a boolean field is_valid has been added to PartitionPruneContext.
The primary motivation is to eliminate the need to perform
CreatePartitionPruneState() during ExecInitNode(), as creating the
exec pruning context requires access to the parent plan node’s
PlanState. By deferring context creation to where it’s needed, this
change enables calling CreatePartitionPruneState() before ExecInitNode().
This will be useful in a future commit, which will move initial
pruning to occur before ExecInitNode().
---
src/backend/executor/execPartition.c | 150 +++++++++++++++++++--------
src/include/executor/execPartition.h | 11 ++
src/include/partitioning/partprune.h | 2 +
3 files changed, 120 insertions(+), 43 deletions(-)
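The lazy-initialization pattern described in the commit message can be sketched in isolation as below. This is only an illustration under stated assumptions: DemoPruneContext, demo_init_context() and demo_find_matching() are hypothetical stand-ins invented for this note, not the PostgreSQL structs or functions touched by the patch. The only idea borrowed from the patch is the is_valid flag that gates one-time setup at the point of first use.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for a pruning context; not the PostgreSQL struct. */
typedef struct DemoPruneContext
{
    bool is_valid;   /* have the fields below been initialized? */
    int  nparts;     /* example of state that is expensive to set up */
} DemoPruneContext;

/* Counts how many times the (hypothetical) setup routine actually ran. */
static int demo_init_calls = 0;

/* One-time setup; marks the context as ready once it has finished. */
static void
demo_init_context(DemoPruneContext *context, int nparts)
{
    context->nparts = nparts;
    context->is_valid = true;
    demo_init_calls++;
}

/*
 * Consumer of the context.  Setup is deferred to the first call rather
 * than being done when the enclosing state is created.
 */
static int
demo_find_matching(DemoPruneContext *context, int nparts)
{
    if (!context->is_valid)
        demo_init_context(context, nparts);

    assert(context->is_valid);
    return context->nparts;
}
```

A caller would zero-initialize the context (so is_valid starts out false) and simply call demo_find_matching() as often as needed; the setup cost is paid once, on the first call, and never if the context is not used at all.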
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index ec730674f2..63c3429fe7 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -181,18 +181,17 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
-static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
+static PartitionPruneState *CreatePartitionPruneState(EState *estate,
PartitionPruneInfo *pruneinfo);
-static void InitPartitionPruneContext(PartitionPruneContext *context,
+static void InitPartitionPruneContext(PartitionedRelPruningData *pprune,
+ PartitionPruneContext *context,
List *pruning_steps,
- PartitionDesc partdesc,
- PartitionKey partkey,
- PlanState *planstate,
- ExprContext *econtext);
+ PlanState *planstate);
static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
Bitmapset *initially_valid_subplans,
int n_total_subplans);
-static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
+static void find_matching_subplans_recurse(PlanState *parent_plan,
+ PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
Bitmapset **validsubplans);
@@ -1825,7 +1824,14 @@ ExecInitPartitionPruning(PlanState *planstate,
ExecAssignExprContext(estate, planstate);
/* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ prunestate = CreatePartitionPruneState(estate, pruneinfo);
+
+ /*
+ * Remember the PlanState so that exec pruning contexts can be initialized
+ * later in find_matching_subplans_recurse(), where they are needed.
+ */
+ if (prunestate->do_exec_prune)
+ prunestate->parent_plan = planstate;
/*
* Perform an initial partition prune pass, if required.
@@ -1865,8 +1871,6 @@ ExecInitPartitionPruning(PlanState *planstate,
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
- *
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
* PartitionPruningData for each partitioning hierarchy (i.e., each sublist of
@@ -1877,16 +1881,20 @@ ExecInitPartitionPruning(PlanState *planstate,
* stored in each PartitionedRelPruningData can be re-used each time we
* re-evaluate which partitions match the pruning steps provided in each
* PartitionedRelPruneInfo.
+ *
+ * Note that the PartitionPruneContexts for both initial and exec pruning
+ * (which are stored in each PartitionedRelPruningData) are initialized lazily
+ * in find_matching_subplans_recurse() when used for the first time.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
{
- EState *estate = planstate->state;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
+ /* We may need an expression context to evaluate partition exprs */
+ ExprContext *econtext = CreateExprContext(estate);
/* For data reading, executor always includes detached partitions */
if (estate->es_partition_directory == NULL)
@@ -1908,6 +1916,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->other_subplans = bms_copy(pruneinfo->other_subplans);
prunestate->do_initial_prune = false; /* may be set below */
prunestate->do_exec_prune = false; /* may be set below */
+ prunestate->parent_plan = NULL;
prunestate->num_partprunedata = n_part_hierarchies;
/*
@@ -1943,16 +1952,25 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
PartitionDesc partdesc;
- PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Used for initializing the expressions in initial pruning steps.
+ * For exec pruning steps, the parent plan node's PlanState's
+ * ps_ExprContext will be used.
*/
+ pprune->estate = estate;
+ pprune->econtext = econtext;
+
+ /* Remember Relation for use in InitPartitionPruneContext. */
partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
- partkey = RelationGetPartitionKey(partrel);
+ pprune->partrel = partrel;
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * descriptor appearing in its relcache entry, because that entry
+ * will be held open and locked for the duration of this executor
+ * run.
+ */
partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
partrel);
@@ -2063,32 +2081,26 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->present_parts = bms_copy(pinfo->present_parts);
/*
- * Initialize pruning contexts as needed. Note that we must skip
- * execution-time partition pruning in EXPLAIN (GENERIC_PLAN),
- * since parameter values may be missing.
+ * Pruning contexts (initial_context and exec_context) are
+ * initialized lazily in find_matching_subplans_recurse() when used
+ * for the first time.
+ *
+ * Note that we must skip execution-time partition pruning in
+ * EXPLAIN (GENERIC_PLAN), since parameter values may be missing.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
+ pprune->initial_context.is_valid = false;
if (pinfo->initial_pruning_steps &&
!(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
- {
- InitPartitionPruneContext(&pprune->initial_context,
- pinfo->initial_pruning_steps,
- partdesc, partkey, planstate,
- econtext);
/* Record whether initial pruning is needed at any level */
prunestate->do_initial_prune = true;
- }
+
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
+ pprune->exec_context.is_valid = false;
if (pinfo->exec_pruning_steps &&
!(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
- {
- InitPartitionPruneContext(&pprune->exec_context,
- pinfo->exec_pruning_steps,
- partdesc, partkey, planstate,
- econtext);
/* Record whether exec pruning is needed at any level */
prunestate->do_exec_prune = true;
- }
/*
* Accumulate the IDs of all PARAM_EXEC Params affecting the
@@ -2109,16 +2121,43 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize a PartitionPruneContext for the given list of pruning steps.
*/
static void
-InitPartitionPruneContext(PartitionPruneContext *context,
+InitPartitionPruneContext(PartitionedRelPruningData *pprune,
+ PartitionPruneContext *context,
List *pruning_steps,
- PartitionDesc partdesc,
- PartitionKey partkey,
- PlanState *planstate,
- ExprContext *econtext)
+ PlanState *planstate)
{
int n_steps;
int partnatts;
ListCell *lc;
+ ExprContext *econtext;
+ EState *estate = pprune->estate;
+ MemoryContext oldcxt;
+ Relation partrel = pprune->partrel;
+ PartitionKey partkey;
+ PartitionDesc partdesc;
+
+ /* Must allocate the needed stuff in the query lifetime context. */
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+
+ /* Use parent_plan's ExprContext when available. */
+ if (planstate)
+ {
+ if (planstate->ps_ExprContext == NULL)
+ ExecAssignExprContext(estate, planstate);
+ econtext = planstate->ps_ExprContext;
+ }
+ else
+ econtext = pprune->econtext;
+
+ /*
+ * We can rely on the copies of the partitioned table's partition
+ * key and partition descriptor appearing in its relcache entry,
+ * because that entry will be held open and locked for the
+ * duration of this executor run.
+ */
+ partkey = RelationGetPartitionKey(partrel);
+ partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
+ partrel);
n_steps = list_length(pruning_steps);
@@ -2187,6 +2226,9 @@ InitPartitionPruneContext(PartitionPruneContext *context,
}
}
}
+
+ MemoryContextSwitchTo(oldcxt);
+ context->is_valid = true;
}
/*
@@ -2350,12 +2392,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* recursing to other (lower-level) parents as needed.
*/
pprune = &prunedata->partrelprunedata[0];
- find_matching_subplans_recurse(prunedata, pprune, initial_prune,
+ find_matching_subplans_recurse(prunestate->parent_plan,
+ prunedata, pprune, initial_prune,
&result);
/* Expression eval may have used space in ExprContext too */
- if (pprune->exec_pruning_steps)
+ if (pprune->exec_context.is_valid)
+ {
+ Assert(pprune->exec_pruning_steps != NIL);
ResetExprContext(pprune->exec_context.exprcontext);
+ }
}
/* Add in any subplans that partition pruning didn't account for */
@@ -2378,7 +2424,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* Adds valid (non-prunable) subplan IDs to *validsubplans
*/
static void
-find_matching_subplans_recurse(PartitionPruningData *prunedata,
+find_matching_subplans_recurse(PlanState *parent_plan,
+ PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
Bitmapset **validsubplans)
@@ -2395,11 +2442,27 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
* level.
*/
if (initial_prune && pprune->initial_pruning_steps)
+ {
+ /* Initialize initial_context if not already done. */
+ if (unlikely(!pprune->initial_context.is_valid))
+ InitPartitionPruneContext(pprune,
+ &pprune->initial_context,
+ pprune->initial_pruning_steps,
+ parent_plan);
partset = get_matching_partitions(&pprune->initial_context,
pprune->initial_pruning_steps);
+ }
else if (!initial_prune && pprune->exec_pruning_steps)
+ {
+ /* Initialize exec_context if not already done. */
+ if (unlikely(!pprune->exec_context.is_valid))
+ InitPartitionPruneContext(pprune,
+ &pprune->exec_context,
+ pprune->exec_pruning_steps,
+ parent_plan);
partset = get_matching_partitions(&pprune->exec_context,
pprune->exec_pruning_steps);
+ }
else
partset = pprune->present_parts;
@@ -2415,7 +2478,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
int partidx = pprune->subpart_map[i];
if (partidx >= 0)
- find_matching_subplans_recurse(prunedata,
+ find_matching_subplans_recurse(parent_plan,
+ prunedata,
&prunedata->partrelprunedata[partidx],
initial_prune, validsubplans);
else
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 12aacc84ff..41afb522f3 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -42,6 +42,10 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* PartitionedRelPruneInfo (see plannodes.h); though note that here,
* subpart_map contains indexes into PartitionPruningData.partrelprunedata[].
*
+ * estate The EState for the query doing runtime pruning
+ * partrel Partitioned table Relation; points to
+ * estate->es_relations[rti-1] where rti is
+ * the table's RT index.
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
@@ -51,6 +55,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* perform executor startup pruning.
* exec_pruning_steps List of PartitionPruneSteps used to
* perform per-scan pruning.
+ * econtext ExprContext to use for initial pruning steps
* initial_context If initial_pruning_steps isn't NIL, contains
* the details needed to execute those steps.
* exec_context If exec_pruning_steps isn't NIL, contains
@@ -58,12 +63,15 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
*/
typedef struct PartitionedRelPruningData
{
+ EState *estate;
+ Relation partrel;
int nparts;
int *subplan_map;
int *subpart_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
+ ExprContext *econtext;
PartitionPruneContext initial_context;
PartitionPruneContext exec_context;
} PartitionedRelPruningData;
@@ -105,6 +113,8 @@ typedef struct PartitionPruningData
* startup (at any hierarchy level).
* do_exec_prune true if pruning should be performed during
* executor run (at any hierarchy level).
+ * parent_plan Parent plan node's PlanState used to initialize exec
+ * pruning contexts
* num_partprunedata Number of items in "partprunedata" array.
* partprunedata Array of PartitionPruningData pointers for the plan's
* partitioned relation(s), one for each partitioning
@@ -117,6 +127,7 @@ typedef struct PartitionPruneState
MemoryContext prune_context;
bool do_initial_prune;
bool do_exec_prune;
+ PlanState *parent_plan;
int num_partprunedata;
PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
} PartitionPruneState;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index c536a1fe19..b7f48eefcc 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -26,6 +26,7 @@ struct RelOptInfo;
* Stores information needed at runtime for pruning computations
* related to a single partitioned table.
*
+ * is_valid Has the information in this struct been initialized?
* strategy Partition strategy, e.g. LIST, RANGE, HASH.
* partnatts Number of columns in the partition key.
* nparts Number of partitions in this partitioned table.
@@ -48,6 +49,7 @@ struct RelOptInfo;
*/
typedef struct PartitionPruneContext
{
+ bool is_valid;
char strategy;
int partnatts;
int nparts;
--
2.43.0
[application/octet-stream] v56-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch (19.9K, 4-v56-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch)
From bfc250e76a13546e71e0ea5d95675065075aee42 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Fri, 6 Sep 2024 13:11:05 +0900
Subject: [PATCH v56 1/5] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
This change moves PartitionPruneInfo from individual plan nodes to
PlannedStmt, allowing runtime initial pruning to be performed across
the entire plan tree without traversing the tree to find nodes
containing PartitionPruneInfos.
The PartitionPruneInfo pointer fields in Append and MergeAppend nodes
have been replaced with an integer index that points to
PartitionPruneInfos in a list within PlannedStmt, which holds the
PartitionPruneInfos for all subqueries.
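As a rough illustration of the index-based lookup described above, here is a
minimal standalone sketch. It is not the patch's code: RelidSet, PruneInfo,
PruneInfoList and lookup_prune_info() are hypothetical stand-ins for
Bitmapset, PartitionPruneInfo, the PlannedStmt.partPruneInfos list and the
lookup done in ExecInitPartitionPruning(). The patch raises an ERROR on a
relids mismatch; the sketch returns NULL instead so that it stays
self-contained.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

/* Hypothetical, trimmed-down PartitionPruneInfo. */
typedef struct PruneInfo
{
    RelidSet    root_parent_relids; /* relids of the owning appendrel */
} PruneInfo;

/* Hypothetical array-backed stand-in for PlannedStmt.partPruneInfos. */
typedef struct PruneInfoList
{
    const PruneInfo *items;
    int         length;
} PruneInfoList;

/*
 * Fetch the prune info a plan node refers to by index.  Returns NULL when
 * the node does no run-time pruning (index -1), when the index is out of
 * range, or when the entry belongs to a different parent relation.
 */
static const PruneInfo *
lookup_prune_info(const PruneInfoList *list, int part_prune_index,
                  RelidSet node_relids)
{
    const PruneInfo *info;

    assert(list != NULL);

    /* -1 means "no run-time pruning"; also reject out-of-range indexes. */
    if (part_prune_index < 0 || part_prune_index >= list->length)
        return NULL;

    info = &list->items[part_prune_index];

    /* Cross-check that the entry really belongs to this plan node. */
    if (info->root_parent_relids != node_relids)
        return NULL;

    return info;
}
```

Storing an index rather than a pointer is what lets all PartitionPruneInfos
be reached from one list without walking the plan tree; the relids
comparison guards against an index that ended up pointing at the wrong
entry.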
Reviewed-by: Alvaro Herrera
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 19 +++++-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/nodeAppend.c | 5 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/optimizer/plan/createplan.c | 24 +++----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 86 ++++++++++++++++---------
src/backend/partitioning/partprune.c | 19 ++++--
src/include/executor/execPartition.h | 4 +-
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 6 ++
src/include/nodes/plannodes.h | 14 ++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 133 insertions(+), 62 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 713cf3e802..f263232c67 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -860,6 +860,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index bfb3419efb..b01a2fdfdd 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -181,6 +181,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->permInfos = estate->es_rteperminfos;
pstmt->resultRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7651886229..ec730674f2 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1786,6 +1786,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* Initialize data structure needed for run-time partition pruning and
* do initial pruning if needed
*
+ * 'root_parent_relids' identifies the relation to which both the parent plan
+ * and the PartitionPruneInfo given by 'part_prune_index' belong.
+ *
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
* Initial pruning is performed here if needed and in that case only the
@@ -1798,11 +1801,25 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo;
+
+ /* Obtain the pruneinfo we need, and make sure it's the right one */
+ pruneinfo = list_nth_node(PartitionPruneInfo, estate->es_part_prune_infos,
+ part_prune_index);
+ if (!bms_equal(root_parent_relids, pruneinfo->root_parent_relids))
+ ereport(ERROR,
+ errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("mismatching PartitionPruneInfo found at part_prune_index %d",
+ part_prune_index),
+ errdetail_internal("plan node relids %s, pruneinfo relids %s",
+ bmsToString(root_parent_relids),
+ bmsToString(pruneinfo->root_parent_relids)));
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 5737f9f4eb..67734979b0 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -118,6 +118,7 @@ CreateExecutorState(void)
estate->es_rowmarks = NULL;
estate->es_rteperminfos = NIL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index ca0f54d676..de7ebab5c2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index e1b9b984a7..3ed91808dd 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index bb45ef318f..6642d09a39 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1225,7 +1225,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1376,6 +1375,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1399,16 +1401,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1447,7 +1447,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1540,6 +1539,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1555,13 +1557,13 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
Assert(best_path->path.param_info == NULL);
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
+
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d92d43a17e..8cffa447fd 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -553,6 +553,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 91c7c4fe2f..e2ea406c4e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1732,6 +1732,48 @@ set_customscan_references(PlannerInfo *root,
cscan->custom_relids = offset_relid_set(cscan->custom_relids, rtoffset);
}
+/*
+ * register_partpruneinfo
+ * Subroutine for set_append_references and set_mergeappend_references
+ *
+ * Add the PartitionPruneInfo from root->partPruneInfos at the given index
+ * into PlannerGlobal->partPruneInfos and return its index there.
+ *
+ * Also update the RT indexes present in PartitionedRelPruneInfos to add the
+ * offset.
+ */
+static int
+register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
+{
+ PlannerGlobal *glob = root->glob;
+ PartitionPruneInfo *pinfo;
+ ListCell *l;
+
+ Assert(part_prune_index >= 0 &&
+ part_prune_index < list_length(root->partPruneInfos));
+ pinfo = list_nth_node(PartitionPruneInfo, root->partPruneInfos,
+ part_prune_index);
+
+ pinfo->root_parent_relids = offset_relid_set(pinfo->root_parent_relids,
+ rtoffset);
+ foreach(l, pinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+
+ prelinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pinfo);
+
+ return list_length(glob->partPruneInfos) - 1;
+}
+
/*
* set_append_references
* Do set_plan_references processing on an Append
@@ -1784,21 +1826,13 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index =
+ register_partpruneinfo(root, aplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1860,21 +1894,13 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index =
+ register_partpruneinfo(root, mplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9a1a7faac7..60fabb1734 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -207,16 +207,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -330,10 +334,11 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
+ pruneinfo->root_parent_relids = parentrel->relids;
pruneinfo->prune_infos = prunerelinfos;
/*
@@ -356,7 +361,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index c09bc83b2a..12aacc84ff 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,9 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 88467977f8..22b928e085 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -636,6 +636,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 07e2415398..8d30b6e896 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -128,6 +128,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -559,6 +562,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 62cd6a6666..39d0281c23 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in the
+ * plan */
+
List *rtable; /* list of RangeTblEntry nodes */
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
@@ -276,8 +279,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -311,8 +314,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
@@ -1414,6 +1417,8 @@ typedef struct PlanRowMark
* Then, since an Append-type node could have multiple partitioning
* hierarchies among its children, we have an unordered List of those Lists.
*
+ * root_parent_relids RelOptInfo.relids of the relation to which the parent
+ * plan node and this PartitionPruneInfo node belong
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
@@ -1426,6 +1431,7 @@ typedef struct PartitionPruneInfo
pg_node_attr(no_equal, no_query_jumble)
NodeTag type;
+ Bitmapset *root_parent_relids;
List *prune_infos;
Bitmapset *other_subplans;
} PartitionPruneInfo;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index bd490d154f..c536a1fe19 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.43.0
[application/octet-stream] v56-0004-Defer-locking-of-runtime-prunable-relations-to-e.patch (52.0K, 5-v56-0004-Defer-locking-of-runtime-prunable-relations-to-e.patch)
From c6bc55ad693ea5a52936e4a0e1e105036efc8d02 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 18 Sep 2024 12:00:41 +0900
Subject: [PATCH v56 4/5] Defer locking of runtime-prunable relations to
executor
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When preparing a cached plan for execution, plancache.c locks the
relations in the plan's range table to ensure they are safe for
execution. However, this approach, implemented in
AcquireExecutorLocks(), results in unnecessarily locking relations
that might be pruned during "initial" runtime pruning.
To optimize this, locking is now deferred for relations subject to
"initial" runtime pruning. The planner now provides a set of
"unprunable" relations through the new PlannedStmt.unprunableRelids
field. AcquireExecutorLocks() will only lock these unprunable
relations. PlannedStmt.unprunableRelids is populated by subtracting
the set of initially prunable relids from all RT indexes. The prunable
relids are identified by examining all PartitionPruneInfos during
set_plan_refs() and storing the RT indexes of partitions subject to
"initial" pruning steps. While at it, some duplicated code in
set_append_references() and set_mergeappend_references() that
constructs the prunable relids set has been refactored into a common
function.
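The set arithmetic in the paragraph above can be sketched as follows. This is
an illustrative model only: RelidSet is a hypothetical 64-bit stand-in for
Bitmapset, and compute_unprunable() and upfront_lock_targets() are made-up
helper names, not functions from the patch. The real AcquireExecutorLocks()
also has to look at each range table entry's kind and lock mode, which is
ignored here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

/*
 * unprunable = {1 .. nrels} minus prunable.  RT indexes are 1-based, so
 * bit 0 is never set.
 */
static RelidSet
compute_unprunable(int nrels, RelidSet prunable)
{
    RelidSet    all = 0;

    assert(nrels >= 0 && nrels <= 63);
    for (int rti = 1; rti <= nrels; rti++)
        all |= UINT64_C(1) << rti;

    return all & ~prunable;
}

/*
 * Collect, in ascending order, the RT indexes that would be locked before
 * execution starts.  Writes at most max_out entries and returns how many
 * were written.
 */
static int
upfront_lock_targets(RelidSet unprunable, int *out, int max_out)
{
    int         n = 0;

    for (int rti = 1; rti <= 63 && n < max_out; rti++)
    {
        if (unprunable & (UINT64_C(1) << rti))
            out[n++] = rti;
    }
    return n;
}
```

With five range table entries of which 3, 4 and 5 are initially prunable
partitions, only entries 1 and 2 would be locked up front under this model.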
Deferred locks are taken, if necessary, after ExecDoInitialPruning()
determines the set of unpruned partitions. To allow the executor to
determine whether the plan tree it’s executing is cached and may
contain unlocked relations, the CachedPlan is now made available via
the QueryDesc. The executor can call CachedPlanRequiresLocking(),
which returns true if the CachedPlan is a reusable generic plan that
might contain unlocked relations.
Plan nodes like Append have already been updated to consider only the
set of unpruned relations. However, in some cases, such as child
RowMarks and child result relations, the code that manipulates them
does not directly receive information about unpruned partitions.
Therefore, the code handling child RowMarks and result relations has
been modified to skip those that belong to pruned partitions. For this,
the RT indexes of unpruned partitions are added in
ExecDoInitialPruning() to es_unprunable_relids, which initially
contains PlannedStmt.unprunableRelids. The corresponding code now
processes only those child RowMarks and result relations whose owning
relations are in this set. For result relations managed by a
ModifyTable node, its resultRelations list and other lists that
parallel it (withCheckOptionLists, returningLists, and
updateColnosLists) are truncated in ExecInitModifyTable to only
consider unpruned relations and the ResultRelInfo structs are created
only for those.
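The filtering of a result relation list together with a list that parallels
it can be sketched as below. This is an assumption-laden illustration, not
the ExecInitModifyTable() code: plain int arrays stand in for the List
structures, returning_ids is a placeholder for per-relation payload such as
an entry of returningLists, and keep_unpruned_result_rels() is a made-up
name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

/* Membership test, playing the role bms_is_member() plays in the patch. */
static bool
relid_is_member(RelidSet set, int rti)
{
    return rti >= 0 && rti <= 63 && (set & (UINT64_C(1) << rti)) != 0;
}

/*
 * Copy only the entries whose RT index is in 'unprunable', keeping the
 * two arrays aligned so that position i in one still corresponds to
 * position i in the other.  Returns the number of entries kept.
 */
static int
keep_unpruned_result_rels(const int *result_rtis, const int *returning_ids,
                          int n, RelidSet unprunable,
                          int *out_rtis, int *out_returning_ids)
{
    int         kept = 0;

    assert(n >= 0);
    for (int i = 0; i < n; i++)
    {
        if (!relid_is_member(unprunable, result_rtis[i]))
            continue;           /* partition was pruned; drop its entries */

        out_rtis[kept] = result_rtis[i];
        out_returning_ids[kept] = returning_ids[i];
        kept++;
    }
    return kept;
}
```

The point of filtering in lockstep is that downstream code indexes the
parallel lists by position, so dropping an entry from one list without
dropping its counterpart would misalign them.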
Finally, an Assert has also been added in ExecCheckPermissions() to
ensure that all relations whose permissions are checked have been
properly locked, helping to catch any accidental omission of relations
from the unprunableRelids set that should have their permissions
checked.
This deferment introduces a window where prunable relations may be
altered by concurrent DDL, potentially causing the plan to become
invalid. Consequently, the executor might attempt to execute an
invalid plan, leading to errors such as ExecInitIndexScan() failing
to locate an index of an unpruned partition because the index was
dropped concurrently (possible when, for example, the index is
partition-local rather than inherited). Future commits will enable
the executor to check plan validity during ExecutorStart() and retry with
a newly created plan if the original becomes invalid after taking
deferred locks.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 1 +
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 3 +-
src/backend/executor/execMain.c | 75 +++++++++++++++++-
src/backend/executor/execParallel.c | 9 ++-
src/backend/executor/execPartition.c | 36 +++++++--
src/backend/executor/functions.c | 1 +
src/backend/executor/nodeAppend.c | 8 +-
src/backend/executor/nodeLockRows.c | 10 ++-
src/backend/executor/nodeMergeAppend.c | 2 +-
src/backend/executor/nodeModifyTable.c | 78 ++++++++++++++++---
src/backend/executor/spi.c | 1 +
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 7 ++
src/backend/partitioning/partprune.c | 18 +++++
src/backend/tcop/pquery.c | 10 ++-
src/backend/utils/cache/plancache.c | 40 ++++++----
src/include/commands/explain.h | 5 +-
src/include/executor/execPartition.h | 5 +-
src/include/executor/execdesc.h | 2 +
src/include/nodes/execnodes.h | 12 +++
src/include/nodes/pathnodes.h | 6 ++
src/include/nodes/plannodes.h | 7 ++
src/include/utils/plancache.h | 10 +++
src/test/regress/expected/partition_prune.out | 44 +++++++++++
src/test/regress/sql/partition_prune.sql | 18 +++++
29 files changed, 366 insertions(+), 57 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index 91de442f43..db976f928a 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -552,7 +552,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 0b629b1f79..57a3375cad 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -324,7 +324,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index aaec439892..49f7370734 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -509,7 +509,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -617,7 +617,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage,
@@ -673,7 +674,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, cplan, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index fab59ad5f6..bd169edeff 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -742,6 +742,7 @@ execute_sql_string(const char *sql)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 010097873d..69be74b4bd 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -438,7 +438,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 07257d4db9..311b9ebd5b 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -655,7 +655,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
+ ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 5222aa9ab3..2c14ee2b6b 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -54,6 +54,7 @@
#include "nodes/queryjumble.h"
#include "parser/parse_relation.h"
#include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
#include "tcop/utility.h"
#include "utils/acl.h"
#include "utils/backend_status.h"
@@ -91,6 +92,7 @@ static bool ExecCheckPermissionsModified(Oid relOid, Oid userid,
AclMode requiredPerms);
static void ExecCheckXactReadOnly(PlannedStmt *plannedstmt);
static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static inline bool ExecShouldLockRelations(EState *estate);
/* end of local decls */
@@ -610,6 +612,21 @@ ExecCheckPermissions(List *rangeTable, List *rteperminfos,
(rte->rtekind == RTE_SUBQUERY &&
rte->relkind == RELKIND_VIEW));
+ /*
+ * Ensure that we have at least an AccessShareLock on relations
+ * whose permissions need to be checked.
+ *
+ * Skip this check in a parallel worker because locks won't be
+ * taken until ExecInitNode() performs plan initialization.
+ *
+ * XXX: ExecCheckPermissions() in a parallel worker may be
+ * redundant with the checks done in the leader process, so this
+ * should be reviewed to ensure it’s necessary.
+ */
+ Assert(IsParallelWorker() ||
+ CheckRelationOidLockedByMe(rte->relid, AccessShareLock,
+ true));
+
(void) getRTEPermissionInfo(rteperminfos, rte);
/* Many-to-one mapping not allowed */
Assert(!bms_is_member(rte->perminfoindex, indexset));
@@ -872,12 +889,46 @@ ExecDoInitialPruning(EState *estate)
* result.
*/
if (prunestate->do_initial_prune)
- validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ {
+ Bitmapset *validsubplan_rtis = NULL;
+
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+ &validsubplan_rtis);
+ if (ExecShouldLockRelations(estate))
+ {
+ int rtindex = -1;
+
+ rtindex = -1;
+ while ((rtindex = bms_next_member(validsubplan_rtis,
+ rtindex)) >= 0)
+ {
+ RangeTblEntry *rte = exec_rt_fetch(rtindex, estate);
+
+ Assert(rte->rtekind == RTE_RELATION &&
+ rte->rellockmode != NoLock);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+ estate->es_unprunable_relids = bms_add_members(estate->es_unprunable_relids,
+ validsubplan_rtis);
+ }
+
estate->es_part_prune_results = lappend(estate->es_part_prune_results,
validsubplans);
}
}
+/*
+ * Locks might be needed only if running a cached plan that might contain
+ * unlocked relations, such as reused generic plans.
+ */
+static inline bool
+ExecShouldLockRelations(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? false :
+ CachedPlanRequiresLocking(estate->es_cachedplan);
+}
+
/* ----------------------------------------------------------------
* InitPlan
*
@@ -890,6 +941,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ CachedPlan *cachedplan = queryDesc->cplan;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -909,10 +961,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_cachedplan = cachedplan;
+ estate->es_unprunable_relids = bms_copy(plannedstmt->unprunableRelids);
/*
* Perform runtime "initial" pruning to determine the plan nodes that will
- * not be executed.
+ * not be executed. This will also add the RT indexes of surviving leaf
+ * partitions to es_unprunable_relids.
*/
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
ExecDoInitialPruning(estate);
@@ -931,8 +986,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
Relation relation;
ExecRowMark *erm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unprunable_relids))
continue;
/* get relation's OID (will produce InvalidOid if subquery) */
@@ -2969,6 +3029,13 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
}
}
+ /*
+ * Copy es_unprunable_relids so that RowMarks of pruned relations are
+ * ignored in ExecInitLockRows() and ExecInitModifyTable() when
+ * initializing the plan trees below.
+ */
+ rcestate->es_unprunable_relids = parentestate->es_unprunable_relids;
+
/*
* Initialize private state information for each SubPlan. We must do this
* before running ExecInitNode on the main query tree, since
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index b01a2fdfdd..7519c9a860 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1257,8 +1257,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
- /* Create a QueryDesc for the query. */
+ /*
+ * Create a QueryDesc for the query. We pass NULL for cachedplan, because
+ * we don't have a pointer to the CachedPlan in the leader's process. It's
+ * fine because the only reason the executor needs to see it is to decide
+ * if it should take locks on certain relations, but parallel workers
+ * always take locks anyway.
+ */
return CreateQueryDesc(pstmt,
+ NULL,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 40eb74d187..13d2542c48 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -26,6 +26,7 @@
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
+#include "storage/lmgr.h"
#include "utils/acl.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
@@ -192,7 +193,8 @@ static void find_matching_subplans_recurse(PlanState *parent_plan,
PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **validsubplan_rtis);
/*
@@ -1985,8 +1987,8 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
* The set of partitions that exist now might not be the same that
* existed when the plan was made. The normal case is that it is;
* optimize for that case with a quick comparison, and just copy
- * the subplan_map and make subpart_map point to the one in
- * PruneInfo.
+ * the subplan_map and make subpart_map and rti_map point to the
+ * ones in PruneInfo.
*
* For the case where they aren't identical, we could have more
* partitions on either side; or even exactly the same number of
@@ -2005,6 +2007,7 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
sizeof(int) * partdesc->nparts) == 0)
{
pprune->subpart_map = pinfo->subpart_map;
+ pprune->rti_map = pinfo->rti_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
}
@@ -2025,6 +2028,7 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
* mismatches.
*/
pprune->subpart_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->rti_map = palloc(sizeof(int) * partdesc->nparts);
for (pp_idx = 0; pp_idx < partdesc->nparts; pp_idx++)
{
@@ -2042,6 +2046,8 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->rti_map[pp_idx] =
+ pinfo->rti_map[pd_idx];
pd_idx++;
continue;
}
@@ -2079,6 +2085,7 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map[pp_idx] = -1;
pprune->subplan_map[pp_idx] = -1;
+ pprune->rti_map[pp_idx] = 0;
}
}
@@ -2360,10 +2367,13 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * validsubplan_rtis must be non-NULL if initial_prune is true.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **validsubplan_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2399,7 +2409,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunestate->parent_plan,
prunedata, pprune, initial_prune,
- &result);
+ &result, validsubplan_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_context.is_valid)
@@ -2416,6 +2426,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (validsubplan_rtis)
+ *validsubplan_rtis = bms_copy(*validsubplan_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2426,14 +2438,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and the RT indexes
+ * of their owning leaf partitions to *validsubplan_rtis if it's non-NULL.
*/
static void
find_matching_subplans_recurse(PlanState *parent_plan,
PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **validsubplan_rtis)
{
Bitmapset *partset;
int i;
@@ -2476,8 +2490,13 @@ find_matching_subplans_recurse(PlanState *parent_plan,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ if (validsubplan_rtis)
+ *validsubplan_rtis = bms_add_member(*validsubplan_rtis,
+ pprune->rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2486,7 +2505,8 @@ find_matching_subplans_recurse(PlanState *parent_plan,
find_matching_subplans_recurse(parent_plan,
prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ validsubplan_rtis);
else
{
/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 692854e2b3..6f6f45e0ad 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -840,6 +840,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
dest = None_Receiver;
es->qd = CreateQueryDesc(es->stmt,
+ NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index de7ebab5c2..006bdafaea 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -581,7 +581,7 @@ choose_next_subplan_locally(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
}
@@ -648,7 +648,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
/*
@@ -724,7 +724,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
mark_invalid_subplans_as_finished(node);
@@ -877,7 +877,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
classify_matching_subplans(node);
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 41754ddfea..b5b2cd53c5 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -28,6 +28,7 @@
#include "foreign/fdwapi.h"
#include "miscadmin.h"
#include "utils/rel.h"
+#include "utils/lsyscache.h"
/* ----------------------------------------------------------------
@@ -347,8 +348,13 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
ExecRowMark *erm;
ExecAuxRowMark *aerm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unprunable_relids))
continue;
/* find ExecRowMark and build ExecAuxRowMark */
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 3ed91808dd..f7821aa178 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -219,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 8bf4c80d4a..652d70223c 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -636,7 +636,7 @@ ExecInitUpdateProjection(ModifyTableState *mtstate,
Assert(whichrel >= 0 && whichrel < mtstate->mt_nrels);
}
- updateColnos = (List *) list_nth(node->updateColnosLists, whichrel);
+ updateColnos = (List *) list_nth(mtstate->mt_updateColnosLists, whichrel);
/*
* For UPDATE, we use the old tuple to fill up missing values in the tuple
@@ -4176,12 +4176,17 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
hash_search(node->mt_resultOidHash, &resultoid, HASH_FIND, NULL);
if (mtlookup)
{
+ ResultRelInfo *resultRelInfo;
+
if (update_cache)
{
node->mt_lastResultOid = resultoid;
node->mt_lastResultIndex = mtlookup->relationIndex;
}
- return node->resultRelInfo + mtlookup->relationIndex;
+
+ resultRelInfo = node->resultRelInfo + mtlookup->relationIndex;
+
+ return resultRelInfo;
}
}
else
@@ -4218,7 +4223,11 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
ModifyTableState *mtstate;
Plan *subplan = outerPlan(node);
CmdType operation = node->operation;
- int nrels = list_length(node->resultRelations);
+ int nrels;
+ List *resultRelations = NIL;
+ List *withCheckOptionLists = NIL;
+ List *returningLists = NIL;
+ List *updateColnosLists = NIL;
ResultRelInfo *resultRelInfo;
List *arowmarks;
ListCell *l;
@@ -4228,6 +4237,46 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/* check for unsupported flags */
Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
+ /*
+ * Only consider unpruned relations. In the future, it might be more
+ * efficient to store resultRelations as a bitmapset, which would make
+ * this operation cheaper.
+ */
+ i = 0;
+ foreach(l, node->resultRelations)
+ {
+ Index rti = lfirst_int(l);
+
+ if (bms_is_member(rti, estate->es_unprunable_relids))
+ {
+ resultRelations = lappend_int(resultRelations, rti);
+ if (node->withCheckOptionLists)
+ {
+ List *withCheckOptions = list_nth_node(List,
+ node->withCheckOptionLists,
+ i);
+
+ withCheckOptionLists = lappend(withCheckOptionLists, withCheckOptions);
+ }
+ if (node->returningLists)
+ {
+ List *returningList = list_nth_node(List,
+ node->returningLists,
+ i);
+
+ returningLists = lappend(returningLists, returningList);
+ }
+ if (node->updateColnosLists)
+ {
+ List *updateColnosList = list_nth(node->updateColnosLists, i);
+
+ updateColnosLists = lappend(updateColnosLists, updateColnosList);
+ }
+ }
+ i++;
+ }
+ nrels = list_length(resultRelations);
+
/*
* create state structure
*/
@@ -4248,6 +4297,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
mtstate->mt_merge_inserted = 0;
mtstate->mt_merge_updated = 0;
mtstate->mt_merge_deleted = 0;
+ mtstate->mt_updateColnosLists = updateColnosLists;
/*----------
* Resolve the target relation. This is the same as:
@@ -4265,6 +4315,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
*/
if (node->rootRelation > 0)
{
+ Assert(bms_is_member(node->rootRelation, estate->es_unprunable_relids));
mtstate->rootResultRelInfo = makeNode(ResultRelInfo);
ExecInitResultRelation(estate, mtstate->rootResultRelInfo,
node->rootRelation);
@@ -4279,7 +4330,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/* set up epqstate with dummy subplan data for the moment */
EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL,
- node->epqParam, node->resultRelations);
+ node->epqParam, resultRelations);
mtstate->fireBSTriggers = true;
/*
@@ -4297,7 +4348,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
*/
resultRelInfo = mtstate->resultRelInfo;
i = 0;
- foreach(l, node->resultRelations)
+ foreach(l, resultRelations)
{
Index resultRelation = lfirst_int(l);
List *mergeActions = NIL;
@@ -4441,7 +4492,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* Initialize any WITH CHECK OPTION constraints if needed.
*/
resultRelInfo = mtstate->resultRelInfo;
- foreach(l, node->withCheckOptionLists)
+ foreach(l, withCheckOptionLists)
{
List *wcoList = (List *) lfirst(l);
List *wcoExprs = NIL;
@@ -4464,7 +4515,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/*
* Initialize RETURNING projections if needed.
*/
- if (node->returningLists)
+ if (returningLists)
{
TupleTableSlot *slot;
ExprContext *econtext;
@@ -4473,7 +4524,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* Initialize result tuple slot and assign its rowtype using the first
* RETURNING list. We assume the rest will look the same.
*/
- mtstate->ps.plan->targetlist = (List *) linitial(node->returningLists);
+ mtstate->ps.plan->targetlist = (List *) linitial(returningLists);
/* Set up a slot for the output of the RETURNING projection(s) */
ExecInitResultTupleSlotTL(&mtstate->ps, &TTSOpsVirtual);
@@ -4488,7 +4539,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* Build a projection for each result rel.
*/
resultRelInfo = mtstate->resultRelInfo;
- foreach(l, node->returningLists)
+ foreach(l, returningLists)
{
List *rlist = (List *) lfirst(l);
@@ -4589,8 +4640,13 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
ExecRowMark *erm;
ExecAuxRowMark *aerm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unprunable_relids))
continue;
/* Find ExecRowMark and build ExecAuxRowMark */
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 90d9834576..659bd6dcd9 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2684,6 +2684,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
snap = InvalidSnapshot;
qdesc = CreateQueryDesc(stmt,
+ cplan,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 8cffa447fd..3b50b767df 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -555,6 +555,8 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
+ result->unprunableRelids = bms_difference(bms_add_range(NULL, 1, list_length(result->rtable)),
+ glob->prunableRelids);
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index e2ea406c4e..283a61a972 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1764,8 +1764,15 @@ register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+ int i;
prelinfo->rtindex += rtoffset;
+ for (i = 0; i < prelinfo->nparts; i++)
+ {
+ prelinfo->rti_map[i] += rtoffset;
+ glob->prunableRelids = bms_add_member(glob->prunableRelids,
+ prelinfo->rti_map[i]);
+ }
}
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 60fabb1734..85894c87af 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -645,6 +645,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ int *rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -657,6 +658,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ rti_map = (int *) palloc0(nparts * sizeof(int));
present_parts = NULL;
i = -1;
@@ -671,9 +673,24 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+
+ /*
+ * Track the RT indexes of partitions to ensure they are included
+ * in the prunableRelids set of relations that are locked during
+ * execution. This ensures that if the plan is cached, these
+ * partitions are locked when the plan is reused.
+ *
+ * Partitions without a subplan and sub-partitioned partitions
+ * where none of the sub-partitions have a subplan due to
+ * constraint exclusion are not included in this set. Instead,
+ * they are added to the unprunableRelids set, and the relations
+ * in this set are locked by AcquireExecutorLocks() before
+ * executing a cached plan.
+ */
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
+ rti_map[i] = (int) partrel->relid;
/* Record finding this subplan */
subplansfound = bms_add_member(subplansfound, subplanidx);
@@ -695,6 +712,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->rti_map = rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a1f8d03db1..6e8f6b1b8f 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -36,6 +36,7 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +66,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +79,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * cplan: CachedPlan supplying the plan
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, cplan, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -493,6 +498,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->cplan,
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1276,6 +1282,7 @@ PortalRunMulti(Portal portal,
{
/* statement can set tag string */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1285,6 +1292,7 @@ PortalRunMulti(Portal portal,
{
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5af1a168ec..5b75dadf13 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -104,7 +104,8 @@ static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
static bool CheckCachedPlan(CachedPlanSource *plansource);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
@@ -815,8 +816,11 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
* Caller must have already called RevalidateCachedQuery to verify that the
* querytree is up to date.
*
- * On a "true" return, we have acquired the locks needed to run the plan.
- * (We must do this for the "true" result to be race-condition-free.)
+ * On a "true" return, we have acquired locks on the "unprunableRelids" set
+ * for all plans in plansource->stmt_list. The plans are not completely
+ * race-condition-free until the executor takes locks on the set of prunable
+ * relations that survive initial runtime pruning during executor
+ * initialization.
*/
static bool
CheckCachedPlan(CachedPlanSource *plansource)
@@ -893,10 +897,10 @@ CheckCachedPlan(CachedPlanSource *plansource)
* or it can be set to NIL if we need to re-copy the plansource's query_list.
*
* To build a generic, parameter-value-independent plan, pass NULL for
- * boundParams. To build a custom plan, pass the actual parameter values via
- * boundParams. For best effect, the PARAM_FLAG_CONST flag should be set on
- * each parameter value; otherwise the planner will treat the value as a
- * hint rather than a hard constant.
+ * boundParams, and true for generic. To build a custom plan, pass the actual
+ * parameter values via boundParams, and false for generic. For best effect,
+ * the PARAM_FLAG_CONST flag should be set on each parameter value; otherwise
+ * the planner will treat the value as a hint rather than a hard constant.
*
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
@@ -904,7 +908,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic)
{
CachedPlan *plan;
List *plist;
@@ -1026,6 +1031,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->refcount = 0;
plan->context = plan_context;
plan->is_oneshot = plansource->is_oneshot;
+ plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
@@ -1196,7 +1202,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv, true);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1241,7 +1247,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, false);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1387,8 +1393,8 @@ CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
}
/*
- * Reject if AcquireExecutorLocks would have anything to do. This is
- * probably unnecessary given the previous check, but let's be safe.
+ * Reject if there are any lockable relations. This is probably
+ * unnecessary given the previous check, but let's be safe.
*/
foreach(lc, plan->stmt_list)
{
@@ -1776,7 +1782,7 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ int rtindex;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1794,9 +1800,13 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ rtindex = -1;
+ while ((rtindex = bms_next_member(plannedstmt->unprunableRelids,
+ rtindex)) >= 0)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ RangeTblEntry *rte = list_nth_node(RangeTblEntry,
+ plannedstmt->rtable,
+ rtindex - 1);
if (!(rte->rtekind == RTE_RELATION ||
(rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid))))
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 3ab0aae78f..21c71e0d53 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -103,8 +103,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
- ExplainState *es, const char *queryString,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
+ const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
const BufferUsage *bufusage,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 8d85fa990e..599ac0318d 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -49,6 +49,7 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * rti_map RT index by partition index, or 0.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -68,6 +69,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ int *rti_map pg_node_attr(array_size(nparts));
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -138,7 +140,8 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *root_parent_relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **validsubplan_rtis);
extern PartitionPruneState *ExecCreatePartitionPruneState(EState *estate,
PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0a7274e26c..0e7245435d 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 518a9fcd15..bd68c60a0b 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -42,6 +42,7 @@
#include "storage/condition_variable.h"
#include "utils/hsearch.h"
#include "utils/queryenvironment.h"
+#include "utils/plancache.h"
#include "utils/reltrigger.h"
#include "utils/sharedtuplestore.h"
#include "utils/snapshot.h"
@@ -636,9 +637,14 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ CachedPlan *es_cachedplan;
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
List *es_part_prune_states; /* List of PartitionPruneState */
List *es_part_prune_results; /* List of Bitmapset */
+ Bitmapset *es_unprunable_relids; /* PlannedStmt.unprunableRelids + RT
+ * indexes of leaf partitions that
+ * survive initial pruning; see
+ * ExecDoInitialPruning() */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -1419,6 +1425,12 @@ typedef struct ModifyTableState
double mt_merge_inserted;
double mt_merge_updated;
double mt_merge_deleted;
+
+ /*
+ * List of valid updateColnosLists. Contains only those belonging to
+ * unpruned relations from ModifyTable.updateColnosLists.
+ */
+ List *mt_updateColnosLists;
} ModifyTableState;
/* ----------------
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 8d30b6e896..cc2190ea63 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -116,6 +116,12 @@ typedef struct PlannerGlobal
/* "flat" rangetable for executor */
List *finalrtable;
+ /*
+ * RT indexes of relations subject to removal from the plan due to runtime
+ * pruning at plan initialization time
+ */
+ Bitmapset *prunableRelids;
+
/* "flat" list of RTEPermissionInfos */
List *finalrteperminfos;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 39d0281c23..318e30fe2f 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -74,6 +74,10 @@ typedef struct PlannedStmt
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *unprunableRelids; /* RT indexes of relations that are not
+ * subject to runtime pruning; for
+ * AcquireExecutorLocks() */
+
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
@@ -1474,6 +1478,9 @@ typedef struct PartitionedRelPruneInfo
/* subpart index by partition index, or -1 */
int *subpart_map pg_node_attr(array_size(nparts));
+ /* RT index by partition index, or 0 */
+ int *rti_map pg_node_attr(array_size(nparts));
+
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a90dfdf906..0b5ee007ca 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -149,6 +149,7 @@ typedef struct CachedPlan
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
bool is_oneshot; /* is it a "oneshot" plan? */
+ bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
Oid planRoleId; /* Role ID the plan was created for */
@@ -235,4 +236,13 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
extern CachedExpression *GetCachedExpression(Node *expr);
extern void FreeCachedExpression(CachedExpression *cexpr);
+/*
+ * CachedPlanRequiresLocking: should the executor acquire locks?
+ */
+static inline bool
+CachedPlanRequiresLocking(CachedPlan *cplan)
+{
+ return cplan->is_generic;
+}
+
#endif /* PLANCACHE_H */
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 7a03b4e360..705cd922fc 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4440,3 +4440,47 @@ drop table hp_contradict_test;
drop operator class part_test_int4_ops2 using hash;
drop operator ===(int4, int4);
drop function explain_analyze(text);
+-- Runtime pruning on UPDATE using WITH CHECK OPTIONS and RETURNING
+create table part_abc (a int, b text, c bool) partition by list (a);
+create table part_abc_1 (b text, a int, c bool);
+create table part_abc_2 (a int, c bool, b text);
+alter table part_abc attach partition part_abc_1 for values in (1);
+alter table part_abc attach partition part_abc_2 for values in (2);
+insert into part_abc values (1, 'b', true);
+insert into part_abc values (2, 'c', true);
+create view part_abc_view as select * from part_abc where b <> 'a' with check option;
+prepare update_part_abc_view as update part_abc_view set b = $2 where a = $1 returning *;
+explain (costs off) execute update_part_abc_view (1, 'd');
+ QUERY PLAN
+-------------------------------------------------------
+ Update on part_abc
+ Update on part_abc_1
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on part_abc_1
+ Filter: ((b <> 'a'::text) AND (a = $1))
+(6 rows)
+
+execute update_part_abc_view (1, 'd');
+ a | b | c
+---+---+---
+ 1 | d | t
+(1 row)
+
+explain (costs off) execute update_part_abc_view (2, 'a');
+ QUERY PLAN
+-------------------------------------------------------
+ Update on part_abc
+ Update on part_abc_2 part_abc_1
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on part_abc_2 part_abc_1
+ Filter: ((b <> 'a'::text) AND (a = $1))
+(6 rows)
+
+execute update_part_abc_view (2, 'a');
+ERROR: new row violates check option for view "part_abc_view"
+DETAIL: Failing row contains (2, a, t).
+deallocate update_part_abc_view;
+drop view part_abc_view;
+drop table part_abc;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 442428d937..af26ad2fb2 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1339,3 +1339,21 @@ drop operator class part_test_int4_ops2 using hash;
drop operator ===(int4, int4);
drop function explain_analyze(text);
+
+-- Runtime pruning on UPDATE using WITH CHECK OPTIONS and RETURNING
+create table part_abc (a int, b text, c bool) partition by list (a);
+create table part_abc_1 (b text, a int, c bool);
+create table part_abc_2 (a int, c bool, b text);
+alter table part_abc attach partition part_abc_1 for values in (1);
+alter table part_abc attach partition part_abc_2 for values in (2);
+insert into part_abc values (1, 'b', true);
+insert into part_abc values (2, 'c', true);
+create view part_abc_view as select * from part_abc where b <> 'a' with check option;
+prepare update_part_abc_view as update part_abc_view set b = $2 where a = $1 returning *;
+explain (costs off) execute update_part_abc_view (1, 'd');
+execute update_part_abc_view (1, 'd');
+explain (costs off) execute update_part_abc_view (2, 'a');
+execute update_part_abc_view (2, 'a');
+deallocate update_part_abc_view;
+drop view part_abc_view;
+drop table part_abc;
--
2.43.0
[application/octet-stream] v56-0003-Perform-runtime-initial-pruning-outside-ExecInit.patch (10.0K, 6-v56-0003-Perform-runtime-initial-pruning-outside-ExecInit.patch)
From 84b875b1ca3af89a9242cdaf9bea052223f9530e Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Thu, 12 Sep 2024 15:44:43 +0900
Subject: [PATCH v56 3/5] Perform runtime initial pruning outside
ExecInitNode()
This commit follows up on the previous change that moved
PartitionPruneInfos out of individual plan nodes into a list in
PlannedStmt. It moves the initialization of PartitionPruneStates
and runtime initial pruning out of ExecInitNode() and into a new
routine, ExecDoInitialPruning(), which is called by InitPlan()
before ExecInitNode() is invoked on the main plan tree and subplans.
ExecDoInitialPruning() stores the PartitionPruneStates that it
creates to do the initial pruning to use during exec pruning in a
list matching the length of es_part_prune_infos (which holds the
PartitionPruneInfos from PlannedStmt), allowing both lists to share
the same index. It also saves the initial pruning result -- a
bitmapset of indexes for surviving child subnodes -- in a similarly
indexed list.
---
src/backend/executor/execMain.c | 55 ++++++++++++++++++++++++++++
src/backend/executor/execPartition.c | 51 ++++++++++++++------------
src/include/executor/execPartition.h | 2 +
src/include/nodes/execnodes.h | 2 +
4 files changed, 87 insertions(+), 23 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index f263232c67..5222aa9ab3 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -46,6 +46,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/executor.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "mb/pg_wchar.h"
@@ -828,6 +829,54 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
PreventCommandIfParallelMode(CreateCommandName((Node *) plannedstmt));
}
+/*
+ * ExecDoInitialPruning
+ * Perform runtime "initial" pruning, if necessary, to determine the set
+ * of child subnodes that need to be initialized during ExecInitNode()
+ * for plan nodes that support partition pruning.
+ *
+ * For each PartitionPruneInfo in estate->es_part_prune_infos, this function
+ * creates a PartitionPruneState (even if no initial pruning is done) and adds
+ * it to es_part_prune_states. For PartitionPruneInfo entries that include
+ * initial pruning steps, the result of those steps is saved as a bitmapset
+ * of indexes representing child subnodes that are "valid" and should be
+ * initialized for execution.
+ */
+static void
+ExecDoInitialPruning(EState *estate)
+{
+ ListCell *lc;
+
+ foreach(lc, estate->es_part_prune_infos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+ PartitionPruneState *prunestate;
+ Bitmapset *validsubplans = NULL;
+
+ /*
+ * Create the working data structure for pruning, and save it for use
+ * later in ExecInitPartitionPruning(), which will be called by the
+ * parent plan node's ExecInit* function.
+ */
+ prunestate = ExecCreatePartitionPruneState(estate, pruneinfo);
+ estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+ prunestate);
+
+ /*
+ * Perform an initial partition pruning pass, if necessary, and save
+ * the bitmapset of valid subplans for use in
+ * ExecInitPartitionPruning(). If no initial pruning is performed, we
+ * still store a NULL to ensure that es_part_prune_results is the same
+ * length as es_part_prune_infos. This ensures that
+ * ExecInitPartitionPruning() can use the same index to locate the
+ * result.
+ */
+ if (prunestate->do_initial_prune)
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ estate->es_part_prune_results = lappend(estate->es_part_prune_results,
+ validsubplans);
+ }
+}
/* ----------------------------------------------------------------
* InitPlan
@@ -860,7 +909,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+
+ /*
+ * Perform runtime "initial" pruning to determine the plan nodes that will
+ * not be executed.
+ */
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ ExecDoInitialPruning(estate);
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 63c3429fe7..40eb74d187 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -181,8 +181,6 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
-static PartitionPruneState *CreatePartitionPruneState(EState *estate,
- PartitionPruneInfo *pruneinfo);
static void InitPartitionPruneContext(PartitionedRelPruningData *pprune,
PartitionPruneContext *context,
List *pruning_steps,
@@ -1782,20 +1780,26 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
/*
* ExecInitPartitionPruning
- * Initialize data structure needed for run-time partition pruning and
- * do initial pruning if needed
+ * Initialize the data structures needed for runtime "exec" partition
+ * pruning and return the result of initial pruning, if available.
*
* 'root_parent_relids' identifies the relation to which both the parent plan
- * and the PartitionPruneInfo given by 'part_prune_index' belong.
+ * and the PartitionPruneInfo associated with 'part_prune_index' belong.
*
- * On return, *initially_valid_subplans is assigned the set of indexes of
- * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * The PartitionPruneState would have been created by ExecDoInitialPruning()
+ * and stored as the part_prune_index'th element of EState.es_part_prune_states.
+ * Here, we initialize only the PartitionPruneContext necessary for execution
+ * pruning.
*
- * If subplans are indeed pruned, subplan_map arrays contained in the returned
- * PartitionPruneState are re-sequenced to not count those, though only if the
- * maps will be needed for subsequent execution pruning passes.
+ * On return, *initially_valid_subplans is assigned the set of indexes of child
+ * subplans that must be initialized alongside the parent plan node. Initial
+ * pruning would have been performed by ExecDoInitialPruning() if necessary,
+ * and the bitmapset of surviving subplans' indexes would have been stored as
+ * the part_prune_index'th element of EState.es_part_prune_results.
+ *
+ * If subplans are pruned, the subplan_map arrays in the returned
+ * PartitionPruneState are re-sequenced to exclude those subplans, but only if
+ * the maps will be needed for subsequent execution pruning passes.
*/
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
@@ -1820,11 +1824,12 @@ ExecInitPartitionPruning(PlanState *planstate,
bmsToString(root_parent_relids),
bmsToString(pruneinfo->root_parent_relids)));
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
-
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(estate, pruneinfo);
+ /*
+ * ExecDoInitialPruning() must have initialized the PartitionPruneState to
+ * perform the initial pruning.
+ */
+ prunestate = list_nth(estate->es_part_prune_states, part_prune_index);
+ Assert(prunestate != NULL);
/*
* Store PlanState for using it to initialize exec pruning contexts later
@@ -1833,11 +1838,11 @@ ExecInitPartitionPruning(PlanState *planstate,
if (prunestate->do_exec_prune)
prunestate->parent_plan = planstate;
- /*
- * Perform an initial partition prune pass, if required.
- */
+ /* Use the result of initial pruning done by ExecDoInitialPruning(). */
if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ *initially_valid_subplans = list_nth_node(Bitmapset,
+ estate->es_part_prune_results,
+ part_prune_index);
else
{
/* No pruning, so we'll need to initialize all subplans */
@@ -1886,8 +1891,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* (which are stored in each PartitionedRelPruningData) are initialized lazily
* in find_matching_subplans_recurse() when used for the first time.
*/
-static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
+PartitionPruneState *
+ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
{
PartitionPruneState *prunestate;
int n_part_hierarchies;
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 41afb522f3..8d85fa990e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -139,4 +139,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
+extern PartitionPruneState *ExecCreatePartitionPruneState(EState *estate,
+ PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 22b928e085..518a9fcd15 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -637,6 +637,8 @@ typedef struct EState
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_states; /* List of PartitionPruneState */
+ List *es_part_prune_results; /* List of Bitmapset */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
--
2.43.0
* Re: generic plans and "initial" pruning
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-19 16:39 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-20 13:00 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-20 14:53 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-21 12:45 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-21 13:10 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-23 12:48 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-08-29 13:34 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-09-17 12:57 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-09-19 08:39 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-09-19 12:10 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-09-20 08:10 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
@ 2024-10-10 20:15 ` Robert Haas <[email protected]>
2024-10-11 07:30 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
0 siblings, 1 reply; 29+ messages in thread
From: Robert Haas @ 2024-10-10 20:15 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
Hi Amit,
This is not a full review (sorry!) but here are a few comments.
In general, I don't have a problem with this direction. I thought
Tom's previous proposal of abandoning ExecInitNode() in medias res if
we discover that we need to replan was doable and I still think that,
but ISTM that this approach needs to touch less code, because
abandoning ExecInitNode() partly through means we could have leftover
state to clean up in any node in the PlanState tree, and as we've
discussed, ExecEndNode() isn't necessarily prepared to clean up a
PlanState tree that was only partially processed by ExecInitNode(). As
far as I can see in the time I've spent looking at this today, 0001
looks pretty unobjectionable (with some exceptions that I've noted
below). I also think 0003 looks pretty safe. It seems like partition
pruning moves backward across a pretty modest amount of code that does
pretty well-defined things. Basically, initialization-time pruning now
happens before other types of node initialization, and before setting
up row marks. I do however find the changes in 0002 to be less
obviously correct and less obviously safe; see below for some notes
about that.
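To make the 0003 ordering concrete, here is a minimal sketch of the parallel, identically indexed lists that the commit message describes. It is not the patch's code: ToyPruneInfo, ToyEState, toy_do_initial_pruning() and toy_valid_subplans() are hypothetical names, fixed-size arrays stand in for the EState lists, and a 32-bit mask stands in for a Bitmapset.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_INFOS 8

/* Hypothetical stand-in for PartitionPruneInfo; not the PostgreSQL struct. */
typedef struct ToyPruneInfo
{
    bool     has_initial_steps;  /* are there initial pruning steps to run? */
    uint32_t survivors;          /* bit i set => child subplan i survives */
} ToyPruneInfo;

/* Hypothetical stand-in for the three EState lists, as fixed-size arrays. */
typedef struct ToyEState
{
    int                 ninfos;
    const ToyPruneInfo *infos[TOY_MAX_INFOS];    /* like es_part_prune_infos */
    const ToyPruneInfo *states[TOY_MAX_INFOS];   /* like es_part_prune_states */
    const uint32_t     *results[TOY_MAX_INFOS];  /* like es_part_prune_results */
} ToyEState;

/* One pass before any node initialization: fill states[] and results[]. */
static void
toy_do_initial_pruning(ToyEState *estate)
{
    assert(estate->ninfos <= TOY_MAX_INFOS);

    for (int i = 0; i < estate->ninfos; i++)
    {
        const ToyPruneInfo *info = estate->infos[i];

        /* A state is recorded for every entry, pruned or not. */
        estate->states[i] = info;

        /* Store NULL when nothing was pruned so the indexes stay aligned. */
        estate->results[i] = info->has_initial_steps ? &info->survivors : NULL;
    }
}

/* Later, a node looks up its result by the index assigned at plan time. */
static uint32_t
toy_valid_subplans(const ToyEState *estate, int part_prune_index,
                   uint32_t all_subplans)
{
    const uint32_t *result;

    assert(part_prune_index >= 0 && part_prune_index < estate->ninfos);
    result = estate->results[part_prune_index];

    /* No initial pruning for this entry: every subplan must be initialized. */
    return result != NULL ? *result : all_subplans;
}
```

The point of the NULL placeholder is that a node can always index all three arrays with the same part_prune_index, whether or not its entry had initial pruning steps.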
In 0001, the name root_parent_relids doesn't seem very clear to me,
and neither does the explanation of what it does. You say
"'root_parent_relids' identifies the relation to which both the parent
plan and the PartitionPruneInfo given by 'part_prune_index' belong."
But it's a set, so what does it mean to identify "the" relation? It's
a set of relations, not just one. And why does the name include the
word "root"? It's neither the PlannerGlobal object, which we often
call root, nor is it the root of the partitioning hierarchy. To me, it
looks like it's just the set of relids that we can potentially prune.
I don't see why this isn't just called "relids", like the field from
which it's copied:
+ pruneinfo->root_parent_relids = parentrel->relids;
It just doesn't seem very root-y or very parent-y.
- node->part_prune_info = partpruneinfo;
+
Extra blank line.
In 0002, the handling of ExprContexts seems a little bit hard to
understand. Sometimes we're using the PlanState's ExprContext, and
sometimes we're using a separate context owned by the
PartitionedRelPruningData's context, and it's not exactly clear why
that is or what the consequences are. Likewise I wouldn't mind some
more comments or explanation in the commit message of the changes in
this patch related to EState objects. I can't help wondering if the
changes here could have either semantic implications (like expression
evaluation can produce different results than before) or performance
implications (because we create objects that we didn't previously
create). As noted above, this is really my only design-level concern
about 0001-0003.
Typo: partrtitioned
Regrettably, I have not looked seriously at 0004 and 0005, so I can't
comment on those.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
2024-10-10 20:15 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
@ 2024-10-11 07:30 ` Amit Langote <[email protected]>
2024-10-15 14:38 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-10-25 12:30 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
0 siblings, 2 replies; 29+ messages in thread
From: Amit Langote @ 2024-10-11 07:30 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
Robert,
On Fri, Oct 11, 2024 at 5:15 AM Robert Haas <[email protected]> wrote:
>
> Hi Amit,
>
> This is not a full review (sorry!) but here are a few comments.
Thank you for taking a look.
> In general, I don't have a problem with this direction. I thought
> Tom's previous proposal of abandoning ExecInitNode() in medias res if
> we discover that we need to replan was doable and I still think that,
> but ISTM that this approach needs to touch less code, because
> abandoning ExecInitNode() partly through means we could have leftover
> state to clean up in any node in the PlanState tree, and as we've
> discussed, ExecEndNode() isn't necessarily prepared to clean up a
> PlanState tree that was only partially processed by ExecInitNode().
I will say that I feel more comfortable committing and being responsible
for the refactoring I'm proposing in 0001-0003 than for the changes
required to take locks during ExecInitNode(), as seen in the patches
up to version v52.
> As
> far as I can see in the time I've spent looking at this today, 0001
> looks pretty unobjectionable (with some exceptions that I've noted
> below). I also think 0003 looks pretty safe. It seems like partition
> pruning moves backward across a pretty modest amount of code that does
> pretty well-defined things. Basically, initialization-time pruning now
> happens before other types of node initialization, and before setting
> up row marks. I do however find the changes in 0002 to be less
> obviously correct and less obviously safe; see below for some notes
> about that.
>
> In 0001, the name root_parent_relids doesn't seem very clear to me,
> and neither does the explanation of what it does. You say
> "'root_parent_relids' identifies the relation to which both the parent
> plan and the PartitionPruneInfo given by 'part_prune_index' belong."
> But it's a set, so what does it mean to identify "the" relation? It's
> a set of relations, not just one.
The intention is to ensure that the bitmapset in PartitionPruneInfo
corresponds to the apprelids bitmapset in the Append or MergeAppend
node that owns the PartitionPruneInfo. Essentially, root_parent_relids
is used to cross-check that both sets align, ensuring that the pruning
logic applies to the same relations as the parent plan.
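A minimal sketch of that cross-check, under loud assumptions: ToyRelids, toy_add_member() and toy_relids_match() are hypothetical names, and a 64-bit mask stands in for a Bitmapset, so this is not the PostgreSQL API. In the patch itself the two sets are real Bitmapsets, and the mismatch path reports both of them via bmsToString().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy relid set: bit i set => RT index i is a member (0 <= i < 64). */
typedef uint64_t ToyRelids;

/* Return 'set' with RT index 'rti' added. */
static ToyRelids
toy_add_member(ToyRelids set, int rti)
{
    assert(rti >= 0 && rti < 64);
    return set | ((uint64_t) 1 << rti);
}

/*
 * The cross-check: the set stored in the prune info must be exactly the
 * set the parent Append/MergeAppend was built for.  A subset or superset
 * does not count as a match.
 */
static bool
toy_relids_match(ToyRelids pruneinfo_relids, ToyRelids apprelids)
{
    return pruneinfo_relids == apprelids;
}
```

Membership order does not matter here, only the final sets, which is why comparing the two sets for equality is enough to tell that a prune info was paired with the wrong parent node.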
> And why does the name include the
> word "root"? It's neither the PlannerGlobal object, which we often
> call root, nor is it the root of the partitioning hierarchy. To me, it
> looks like it's just the set of relids that we can potentially prune.
> I don't see why this isn't just called "relids", like the field from
> which it's copied:
>
> + pruneinfo->root_parent_relids = parentrel->relids;
>
> It just doesn't seem very root-y or very parent-y.
Maybe just "relids" suffices with a comment updated like this:
* relids RelOptInfo.relids of the parent plan node (e.g. Append
* or MergeAppend) to which his PartitionPruneInfo node
* belongs. Used to ensure that the pruning logic matches
* the parent plan's apprelids.
> - node->part_prune_info = partpruneinfo;
> +
>
> Extra blank line.
Fixed.
> In 0002, the handling of ExprContexts seems a little bit hard to
> understand. Sometimes we're using the PlanState's ExprContext, and
> sometimes we're using a separate context owned by the
> PartitionedRelPruningData's context, and it's not exactly clear why
> that is or what the consequences are. Likewise I wouldn't mind some
> more comments or explanation in the commit message of the changes in
> this patch related to EState objects. I can't help wondering if the
> changes here could have either semantic implications (like expression
> evaluation can produce different results than before) or performance
> implications (because we create objects that we didn't previously
> create).
I have taken another look at whether there's any real need to use
separate ExprContexts for initial and runtime pruning and ISTM there
isn't, so we can make "exec" pruning use the same ExprContext as what
"init" would have used. There *is* a difference, however, in how we
initialize the partition key expressions for initial and runtime
pruning, but it's not problematic to use the same ExprContext.
I'll update the commentary a bit more.
> Typo: partrtitioned
Fixed.
> Regrettably, I have not looked seriously at 0004 and 0005, so I can't
> comment on those.
Ok, I'm updating 0005 to change how the CachedPlan is handled when it
becomes invalid during InitPlan(). Currently (v56), a separate
transient CachedPlan is created for the query being initialized when
invalidation occurs. However, it seems better to update the original
CachedPlan in place to avoid extra bookkeeping for transient plans, an
approach Robert suggested in an off-list discussion.
Will post a new version next week.
--
Thanks, Amit Langote
* Re: generic plans and "initial" pruning
2024-10-10 20:15 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-10-11 07:30 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
@ 2024-10-15 14:38 ` Robert Haas <[email protected]>
2024-10-15 15:22 ` Re: generic plans and "initial" pruning Tom Lane <[email protected]>
1 sibling, 1 reply; 29+ messages in thread
From: Robert Haas @ 2024-10-15 14:38 UTC (permalink / raw)
To: Amit Langote <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Fri, Oct 11, 2024 at 3:30 AM Amit Langote <[email protected]> wrote:
> Maybe just "relids" suffices with a comment updated like this:
>
> * relids RelOptInfo.relids of the parent plan node (e.g. Append
> * or MergeAppend) to which his PartitionPruneInfo node
> * belongs. Used to ensure that the pruning logic matches
> * the parent plan's apprelids.
LGTM.
--
Robert Haas
EDB: http://www.enterprisedb.com
* Re: generic plans and "initial" pruning
2024-10-10 20:15 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-10-11 07:30 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
2024-10-15 14:38 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
@ 2024-10-15 15:22 ` Tom Lane <[email protected]>
0 siblings, 0 replies; 29+ messages in thread
From: Tom Lane @ 2024-10-15 15:22 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Amit Langote <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>
Robert Haas <[email protected]> writes:
> On Fri, Oct 11, 2024 at 3:30 AM Amit Langote <[email protected]> wrote:
>> Maybe just "relids" suffices with a comment updated like this:
>>
>> * relids RelOptInfo.relids of the parent plan node (e.g. Append
>> * or MergeAppend) to which his PartitionPruneInfo node
>> * belongs. Used to ensure that the pruning logic matches
>> * the parent plan's apprelids.
> LGTM.
"his" -> "this", surely?
regards, tom lane
* Re: generic plans and "initial" pruning
2024-10-10 20:15 ` Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-10-11 07:30 ` Re: generic plans and "initial" pruning Amit Langote <[email protected]>
@ 2024-10-25 12:30 ` Amit Langote <[email protected]>
2024-12-01 18:36 ` Re: generic plans and "initial" pruning Tomas Vondra <[email protected]>
1 sibling, 1 reply; 29+ messages in thread
From: Amit Langote @ 2024-10-25 12:30 UTC (permalink / raw)
To: Robert Haas <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Fri, Oct 11, 2024 at 4:30 PM Amit Langote <[email protected]> wrote:
> On Fri, Oct 11, 2024 at 5:15 AM Robert Haas <[email protected]> wrote:
> Ok, I'm updating 0005 to change how the CachedPlan is handled when it
> becomes invalid during InitPlan(). Currently (v56), a separate
> transient CachedPlan is created for the query being initialized when
> invalidation occurs. However, it seems better to update the original
> CachedPlan in place to avoid extra bookkeeping for transient plans—an
> approach Robert suggested in an off-list discussion.
>
> Will post a new version next week.
Sorry for the delay.
I've finished implementing the approach of updating the existing
CachedPlan in place when it is invalidated while the plans in its
stmt_list are being initialized. Previously, we created a transient
(living only for that execution) CachedPlan for each query/plan,
tracked separately from the original CachedPlan, so that invalidation
callbacks could reference them. This meant that the original CachedPlan
would continue to hold invalid plans until the next GetCachedPlan() call.
With the new approach, the original CachedPlan is updated directly:
new PlannedStmts are installed into the existing stmt_list, allowing
any callers iterating over that list to continue unaffected. The new
UpdateCachedPlan() function creates new plans for all queries in
the CachedPlan’s owning CachedPlanSource, replaces the previous
plans, and marks the CachedPlan valid. The CachedPlan therefore becomes
valid immediately rather than at the next GetCachedPlan() call.
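To illustrate why installing new plans into the existing list keeps in-progress iteration safe, here is a minimal standalone sketch. It is not the patch's code: ToyStmt, ToyCachedPlan, toy_update_cached_plan() and toy_demo() are hypothetical names, and a fixed array of slots stands in for stmt_list.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT the PostgreSQL structs. */
typedef struct ToyStmt
{
    int         generation;     /* which planning round produced this stmt */
} ToyStmt;

typedef struct ToyCachedPlan
{
    ToyStmt   **stmts;          /* stands in for stmt_list; never replaced */
    size_t      nstmts;
    int         is_valid;
} ToyCachedPlan;

/* Install freshly planned statements into the existing slots, mark valid. */
void
toy_update_cached_plan(ToyCachedPlan *plan, ToyStmt **fresh)
{
    assert(plan != NULL && fresh != NULL);
    for (size_t i = 0; i < plan->nstmts; i++)
        plan->stmts[i] = fresh[i];  /* the old statement is simply dropped */
    plan->is_valid = 1;
}

/* A caller iterating over the slots when invalidation hits at index 1. */
int
toy_demo(void)
{
    ToyStmt     old0 = {1}, old1 = {1};
    ToyStmt     new0 = {2}, new1 = {2};
    ToyStmt    *slots[2] = {&old0, &old1};
    ToyStmt    *fresh[2] = {&new0, &new1};
    ToyCachedPlan plan = {slots, 2, 1};
    int         seen = 0;

    for (size_t i = 0; i < plan.nstmts; i++)
    {
        if (i == 1)
        {
            plan.is_valid = 0;  /* simulated invalidation */
            toy_update_cached_plan(&plan, fresh);
        }
        seen += plan.stmts[i]->generation;
    }
    return plan.is_valid ? seen : -1;
}
```

Because only the slot contents change, the loop index stays meaningful across the update: the caller reads the old statement at index 0 and the replanned one at index 1.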
One caveat is that, without a dedicated memory context for the
PlannedStmts in stmt_list, the old ones leak into CacheMemoryContext.
However, since UpdateCachedPlan() is rarely invoked, I haven’t focused
on addressing this leak. If needed, we could introduce an additional
memory context next to CachedPlan.context, which would allow freeing
the PlannedStmts without affecting the stmt_list. For now, I’ve
ensured that stmt_list itself is not overwritten in
UpdateCachedPlan().
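The possible fix mentioned above, a separate context for the PlannedStmts, can be sketched with a toy allocator. This is only an assumption-laden illustration: ToyContext, toy_alloc(), toy_reset() and toy_replan_demo() are hypothetical and merely mimic the idea of resetting one memory context while another survives; they are not PostgreSQL's MemoryContext API.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy "memory context": remembers its chunks so they can be freed together. */
typedef struct ToyContext
{
    void       *chunks[8];
    int         nchunks;
} ToyContext;

void *
toy_alloc(ToyContext *cxt, size_t size)
{
    void       *p;

    assert(cxt->nchunks < 8);
    p = malloc(size);
    assert(p != NULL);
    cxt->chunks[cxt->nchunks++] = p;
    return p;
}

void
toy_reset(ToyContext *cxt)
{
    for (int i = 0; i < cxt->nchunks; i++)
        free(cxt->chunks[i]);
    cxt->nchunks = 0;
}

/*
 * The list array lives in plan_cxt; the statements live in stmt_cxt.
 * Replanning resets only stmt_cxt, so the list itself is untouched and
 * the old statements do not accumulate.
 */
int
toy_replan_demo(void)
{
    ToyContext  plan_cxt = {{0}, 0};
    ToyContext  stmt_cxt = {{0}, 0};
    int       **stmt_list = toy_alloc(&plan_cxt, 2 * sizeof(int *));
    int         live;
    int         sum;

    for (int i = 0; i < 2; i++)
    {
        stmt_list[i] = toy_alloc(&stmt_cxt, sizeof(int));
        *stmt_list[i] = 1;
    }

    /* Replan: free the old statements, then refill the same list slots. */
    toy_reset(&stmt_cxt);
    for (int i = 0; i < 2; i++)
    {
        stmt_list[i] = toy_alloc(&stmt_cxt, sizeof(int));
        *stmt_list[i] = 2;
    }

    live = stmt_cxt.nchunks;    /* only the new statements remain */
    sum = *stmt_list[0] + *stmt_list[1];

    toy_reset(&stmt_cxt);
    toy_reset(&plan_cxt);
    return live * 10 + sum;
}
```

Without the second context, the old statements would stay allocated for as long as the enclosing context lives, which is the leak described above.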
UpdateCachedPlan() is added in 0005.
For now I've kept 0005 separate; it is the patch that retries execution
with an updated plan if the plan becomes invalid after locks are taken
on prunable relations (locking that is deferred until initial pruning).
I plan to eventually merge it into 0004, the patch implementing
deferred locking.
I've also fixed the comment in 0003 about PartitionPruneInfo.relids as
Tom pointed out, which now reads:
* relids RelOptInfo.relids of the parent plan node (e.g. Append
* or MergeAppend) to which this PartitionPruneInfo node
* belongs. The pruning logic ensures that this matches
* the parent plan node's apprelids.
I've stared at the refactoring patches 0001-0003 long enough at this
point to think that they are good for committing.
--
Thanks, Amit Langote
Attachments:
[application/octet-stream] v57-0002-Initialize-PartitionPruneContexts-lazily.patch (16.7K, 2-v57-0002-Initialize-PartitionPruneContexts-lazily.patch)
From 98efea44aaa0780d3be013c2ef4acdff5ff39d7b Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 23 Oct 2024 16:55:42 +0900
Subject: [PATCH v57 2/5] Initialize PartitionPruneContexts lazily
This commit moves the initialization of PartitionPruneContexts for
both initial and exec pruning steps from CreatePartitionPruneState()
to find_matching_subplans_recurse(), where they are actually needed.
To track whether the context has been initialized and is ready for
use, a boolean field is_valid has been added to PartitionPruneContext.
The primary motivation is to allow CreatePartitionPruneState() to be
called before ExecInitNode(). Right now, it's coupled with
ExecInitNode() because setting up the exec pruning context requires
access to the parent plan node's PlanState. By deferring context
creation to where it's actually needed, we break this dependency.
The ExprContext used for both pruning phases is now a standalone
context, independent of the parent PlanState.
This change will be useful in a future commit, which will move initial
pruning to occur outside ExecInitNode(), specifically before it is
called by InitPlan().
Reviewed-by: Robert Haas
Reviewed-by: Tom Lane
---
src/backend/executor/execPartition.c | 151 +++++++++++++++++++--------
src/backend/partitioning/partprune.c | 7 +-
src/include/executor/execPartition.h | 12 +++
src/include/partitioning/partprune.h | 2 +
4 files changed, 123 insertions(+), 49 deletions(-)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 323d5330ff..38311d2991 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -181,18 +181,17 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
-static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
+static PartitionPruneState *CreatePartitionPruneState(EState *estate,
PartitionPruneInfo *pruneinfo);
-static void InitPartitionPruneContext(PartitionPruneContext *context,
+static void InitPartitionPruneContext(PartitionedRelPruningData *pprune,
+ PartitionPruneContext *context,
List *pruning_steps,
- PartitionDesc partdesc,
- PartitionKey partkey,
- PlanState *planstate,
- ExprContext *econtext);
+ PlanState *planstate);
static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
Bitmapset *initially_valid_subplans,
int n_total_subplans);
-static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
+static void find_matching_subplans_recurse(PlanState *parent_plan,
+ PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
Bitmapset **validsubplans);
@@ -1825,7 +1824,14 @@ ExecInitPartitionPruning(PlanState *planstate,
ExecAssignExprContext(estate, planstate);
/* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ prunestate = CreatePartitionPruneState(estate, pruneinfo);
+
+ /*
+ * Store PlanState for using it to initialize exec pruning contexts later
+ * in find_matching_subplans_recurse() where they are needed.
+ */
+ if (prunestate->do_exec_prune)
+ prunestate->parent_plan = planstate;
/*
* Perform an initial partition prune pass, if required.
@@ -1865,8 +1871,6 @@ ExecInitPartitionPruning(PlanState *planstate,
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
- *
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
* PartitionPruningData for each partitioning hierarchy (i.e., each sublist of
@@ -1877,16 +1881,24 @@ ExecInitPartitionPruning(PlanState *planstate,
* stored in each PartitionedRelPruningData can be re-used each time we
* re-evaluate which partitions match the pruning steps provided in each
* PartitionedRelPruneInfo.
+ *
+ * Note that the PartitionPruneContexts for both initial and exec pruning
+ * (which are stored in each PartitionedRelPruningData) are initialized lazily
+ * in find_matching_subplans_recurse() when used for the first time.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
{
- EState *estate = planstate->state;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
+
+ /*
+ * Expression context that will be used by partkey_datum_from_expr() to
+ * evaluate expressions for comparison against partition bounds.
+ */
+ ExprContext *econtext = CreateExprContext(estate);
/* For data reading, executor always includes detached partitions */
if (estate->es_partition_directory == NULL)
@@ -1908,6 +1920,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->other_subplans = bms_copy(pruneinfo->other_subplans);
prunestate->do_initial_prune = false; /* may be set below */
prunestate->do_exec_prune = false; /* may be set below */
+ prunestate->parent_plan = NULL;
prunestate->num_partprunedata = n_part_hierarchies;
/*
@@ -1943,16 +1956,25 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
PartitionDesc partdesc;
- PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Used for initializing the expressions in initial pruning steps.
+ * For exec pruning steps, the parent plan node's PlanState's
+ * ps_ExprContext will be used.
*/
+ pprune->estate = estate;
+ pprune->econtext = econtext;
+
+ /* Remember Relation for use in InitPartitionPruneContext. */
partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
- partkey = RelationGetPartitionKey(partrel);
+ pprune->partrel = partrel;
+
+ /*
+ * We can rely on the copy of partitioned table's partition
+ * descriptor appearing in its relcache entry, because that entry
+ * will be held open and locked for the duration of this executor
+ * run.
+ */
partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
partrel);
@@ -2063,32 +2085,26 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->present_parts = bms_copy(pinfo->present_parts);
/*
- * Initialize pruning contexts as needed. Note that we must skip
- * execution-time partition pruning in EXPLAIN (GENERIC_PLAN),
- * since parameter values may be missing.
+ * Pruning contexts (initial_context and exec_context) are
+ * initialized lazily in find_matching_subplans_recurse() when
+ * used for the first time.
+ *
+ * Note that we must skip execution-time partition pruning in
+ * EXPLAIN (GENERIC_PLAN), since parameter values may be missing.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
+ pprune->initial_context.is_valid = false;
if (pinfo->initial_pruning_steps &&
!(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
- {
- InitPartitionPruneContext(&pprune->initial_context,
- pinfo->initial_pruning_steps,
- partdesc, partkey, planstate,
- econtext);
/* Record whether initial pruning is needed at any level */
prunestate->do_initial_prune = true;
- }
+
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
+ pprune->exec_context.is_valid = false;
if (pinfo->exec_pruning_steps &&
!(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
- {
- InitPartitionPruneContext(&pprune->exec_context,
- pinfo->exec_pruning_steps,
- partdesc, partkey, planstate,
- econtext);
/* Record whether exec pruning is needed at any level */
prunestate->do_exec_prune = true;
- }
/*
* Accumulate the IDs of all PARAM_EXEC Params affecting the
@@ -2109,17 +2125,41 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize a PartitionPruneContext for the given list of pruning steps.
*/
static void
-InitPartitionPruneContext(PartitionPruneContext *context,
+InitPartitionPruneContext(PartitionedRelPruningData *pprune,
+ PartitionPruneContext *context,
List *pruning_steps,
- PartitionDesc partdesc,
- PartitionKey partkey,
- PlanState *planstate,
- ExprContext *econtext)
+ PlanState *planstate)
{
int n_steps;
int partnatts;
ListCell *lc;
+ /*
+ * Use the ExprContext that CreatePartitionPruneState() should have
+ * created.
+ */
+ ExprContext *econtext = pprune->econtext;
+ EState *estate = pprune->estate;
+ MemoryContext oldcxt;
+ Relation partrel = pprune->partrel;
+ PartitionKey partkey;
+ PartitionDesc partdesc;
+
+ Assert(econtext != NULL);
+
+ /* Must allocate the needed stuff in the query lifetime context. */
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+
+ /*
+ * We can rely on the copies of the partitioned table's partition key and
+ * partition descriptor appearing in its relcache entry, because that
+ * entry will be held open and locked for the duration of this executor
+ * run.
+ */
+ partkey = RelationGetPartitionKey(partrel);
+ partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
+ partrel);
+
n_steps = list_length(pruning_steps);
context->strategy = partkey->strategy;
@@ -2187,6 +2227,9 @@ InitPartitionPruneContext(PartitionPruneContext *context,
}
}
}
+
+ MemoryContextSwitchTo(oldcxt);
+ context->is_valid = true;
}
/*
@@ -2350,12 +2393,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* recursing to other (lower-level) parents as needed.
*/
pprune = &prunedata->partrelprunedata[0];
- find_matching_subplans_recurse(prunedata, pprune, initial_prune,
+ find_matching_subplans_recurse(prunestate->parent_plan,
+ prunedata, pprune, initial_prune,
&result);
/* Expression eval may have used space in ExprContext too */
- if (pprune->exec_pruning_steps)
+ if (pprune->exec_context.is_valid)
+ {
+ Assert(pprune->exec_pruning_steps != NIL);
ResetExprContext(pprune->exec_context.exprcontext);
+ }
}
/* Add in any subplans that partition pruning didn't account for */
@@ -2378,7 +2425,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* Adds valid (non-prunable) subplan IDs to *validsubplans
*/
static void
-find_matching_subplans_recurse(PartitionPruningData *prunedata,
+find_matching_subplans_recurse(PlanState *parent_plan,
+ PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
Bitmapset **validsubplans)
@@ -2395,11 +2443,27 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
* level.
*/
if (initial_prune && pprune->initial_pruning_steps)
+ {
+ /* Initialize initial_context if not already done. */
+ if (unlikely(!pprune->initial_context.is_valid))
+ InitPartitionPruneContext(pprune,
+ &pprune->initial_context,
+ pprune->initial_pruning_steps,
+ parent_plan);
partset = get_matching_partitions(&pprune->initial_context,
pprune->initial_pruning_steps);
+ }
else if (!initial_prune && pprune->exec_pruning_steps)
+ {
+ /* Initialize exec_context if not already done. */
+ if (unlikely(!pprune->exec_context.is_valid))
+ InitPartitionPruneContext(pprune,
+ &pprune->exec_context,
+ pprune->exec_pruning_steps,
+ parent_plan);
partset = get_matching_partitions(&pprune->exec_context,
pprune->exec_pruning_steps);
+ }
else
partset = pprune->present_parts;
@@ -2415,7 +2479,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
int partidx = pprune->subpart_map[i];
if (partidx >= 0)
- find_matching_subplans_recurse(prunedata,
+ find_matching_subplans_recurse(parent_plan,
+ prunedata,
&prunedata->partrelprunedata[partidx],
initial_prune, validsubplans);
else
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 6f0ead1fa8..df767f9e5b 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -3784,13 +3784,8 @@ partkey_datum_from_expr(PartitionPruneContext *context,
/*
* We should never see a non-Const in a step unless the caller has
* passed a valid ExprContext.
- *
- * When context->planstate is valid, context->exprcontext is same as
- * context->planstate->ps_ExprContext.
*/
- Assert(context->planstate != NULL || context->exprcontext != NULL);
- Assert(context->planstate == NULL ||
- (context->exprcontext == context->planstate->ps_ExprContext));
+ Assert(context->exprcontext != NULL);
exprstate = context->exprstates[stateidx];
ectx = context->exprcontext;
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index ed2b019c09..5178c27743 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -42,6 +42,10 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* PartitionedRelPruneInfo (see plannodes.h); though note that here,
* subpart_map contains indexes into PartitionPruningData.partrelprunedata[].
*
+ * estate The EState for the query doing runtime pruning
+ * partrel Partitioned table Relation; obtained by
+ * ExecGetRangeTableRelation(estate, rti), where
+ * rti is PartitionedRelPruneInfo.rtindex.
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
@@ -51,6 +55,8 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* perform executor startup pruning.
* exec_pruning_steps List of PartitionPruneSteps used to
* perform per-scan pruning.
+ * econtext ExprContext to use for evaluating partition
+ * key
* initial_context If initial_pruning_steps isn't NIL, contains
* the details needed to execute those steps.
* exec_context If exec_pruning_steps isn't NIL, contains
@@ -58,12 +64,15 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
*/
typedef struct PartitionedRelPruningData
{
+ EState *estate;
+ Relation partrel;
int nparts;
int *subplan_map;
int *subpart_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
+ ExprContext *econtext;
PartitionPruneContext initial_context;
PartitionPruneContext exec_context;
} PartitionedRelPruningData;
@@ -105,6 +114,8 @@ typedef struct PartitionPruningData
* startup (at any hierarchy level).
* do_exec_prune true if pruning should be performed during
* executor run (at any hierarchy level).
+ * parent_plan Parent plan node's PlanState used to initialize exec
+ * pruning contexts
* num_partprunedata Number of items in "partprunedata" array.
* partprunedata Array of PartitionPruningData pointers for the plan's
* partitioned relation(s), one for each partitioning
@@ -117,6 +128,7 @@ typedef struct PartitionPruneState
MemoryContext prune_context;
bool do_initial_prune;
bool do_exec_prune;
+ PlanState *parent_plan;
int num_partprunedata;
PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
} PartitionPruneState;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 6922e04430..b90c2e57a2 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -26,6 +26,7 @@ struct RelOptInfo;
* Stores information needed at runtime for pruning computations
* related to a single partitioned table.
*
+ * is_valid Has the information in this struct been initialized?
* strategy Partition strategy, e.g. LIST, RANGE, HASH.
* partnatts Number of columns in the partition key.
* nparts Number of partitions in this partitioned table.
@@ -48,6 +49,7 @@ struct RelOptInfo;
*/
typedef struct PartitionPruneContext
{
+ bool is_valid;
char strategy;
int partnatts;
int nparts;
--
2.43.0
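The lazy-initialization pattern that the 0002 patch applies to PartitionPruneContext (an is_valid flag checked at first use) can be sketched in isolation as follows. This is a simplified illustration, not the patch's code: ToyPruneContext, toy_init_context(), toy_find_matching() and toy_lazy_demo() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy context: is_valid says whether the other fields have been set up. */
typedef struct ToyPruneContext
{
    bool        is_valid;
    int         nparts;
} ToyPruneContext;

int         toy_init_calls = 0; /* counts how often setup actually runs */

void
toy_init_context(ToyPruneContext *context, int nparts)
{
    context->nparts = nparts;
    context->is_valid = true;
    toy_init_calls++;
}

/* Set up the context on first use only, then use it. */
int
toy_find_matching(ToyPruneContext *context, int nparts)
{
    if (!context->is_valid)
        toy_init_context(context, nparts);
    assert(context->is_valid);
    return context->nparts;
}

/* Two lookups against one context should trigger exactly one setup. */
int
toy_lazy_demo(void)
{
    ToyPruneContext context = {false, 0};

    toy_init_calls = 0;
    (void) toy_find_matching(&context, 4);
    (void) toy_find_matching(&context, 4);
    return toy_init_calls;
}
```

Deferring the setup this way means the structure holding the context can be created before everything the setup depends on is available, which is the decoupling the commit message describes.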
[application/octet-stream] v57-0003-Perform-runtime-initial-pruning-outside-ExecInit.patch (10.9K, 3-v57-0003-Perform-runtime-initial-pruning-outside-ExecInit.patch)
From 0e18a7dd1e5b1e9aaea442b140e9591ccb25644a Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 23 Oct 2024 17:02:33 +0900
Subject: [PATCH v57 3/5] Perform runtime initial pruning outside
ExecInitNode()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This commit follows up on the previous change that moved
PartitionPruneInfos out of individual plan nodes into a list in
PlannedStmt. It moves the initialization of PartitionPruneStates
and runtime initial pruning out of ExecInitNode() and into a new
routine, ExecDoInitialPruning(), which is called by InitPlan()
before ExecInitNode() is invoked on the main plan tree and subplans.
ExecDoInitialPruning() performs the initial pruning and saves the
result—a bitmapset of indexes for the surviving child subnodes—in
es_part_prune_results, a list in EState. The PartitionPruneStates
created for initial pruning are also saved in es_part_prune_states,
another list in EState, for later use during exec pruning. Both lists
are parallel to es_part_prune_infos (which holds the
PartitionPruneInfos from PlannedStmt), allowing them to share the
same index.
Reviewed-by: Robert Haas
---
src/backend/executor/execMain.c | 59 ++++++++++++++++++++++++++++
src/backend/executor/execPartition.c | 51 +++++++++++++-----------
src/include/executor/execPartition.h | 2 +
src/include/nodes/execnodes.h | 2 +
4 files changed, 90 insertions(+), 24 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index c460c6aa32..2fcec32dcb 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -46,6 +46,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/executor.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "mb/pg_wchar.h"
@@ -819,6 +820,53 @@ ExecCheckXactReadOnly(PlannedStmt *plannedstmt)
PreventCommandIfParallelMode(CreateCommandName((Node *) plannedstmt));
}
+/*
+ * ExecDoInitialPruning
+ * Perform runtime "initial" pruning, if necessary, to determine the set
+ * of child subnodes that need to be initialized during ExecInitNode() for
+ * plan nodes that support partition pruning.
+ *
+ * This function iterates over each PartitionPruneInfo entry in
+ * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
+ * and adds it to es_part_prune_states, where ExecInitPartitionPruning() can
+ * access it for use during "exec" pruning.
+ *
+ * If initial pruning steps exist for a PartitionPruneInfo entry, this function
+ * executes those pruning steps and stores the result as a bitmapset of valid
+ * child subplans, identifying which subplans should be initialized for
+ * execution. The results are saved in estate->es_part_prune_results.
+ *
+ * If no initial pruning is performed for a given PartitionPruneInfo, a NULL
+ * entry is still added to es_part_prune_results to maintain alignment with
+ * es_part_prune_infos. This ensures that ExecInitPartitionPruning() can use
+ * the same index to retrieve the pruning results.
+ */
+static void
+ExecDoInitialPruning(EState *estate)
+{
+ ListCell *lc;
+
+ foreach(lc, estate->es_part_prune_infos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+ PartitionPruneState *prunestate;
+ Bitmapset *validsubplans = NULL;
+
+ /* Create and save the PartitionPruneState. */
+ prunestate = ExecCreatePartitionPruneState(estate, pruneinfo);
+ estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+ prunestate);
+
+ /*
+ * Perform initial pruning steps, if any, and save the result
+ * bitmapset or NULL as described in the header comment.
+ */
+ if (prunestate->do_initial_prune)
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ estate->es_part_prune_results = lappend(estate->es_part_prune_results,
+ validsubplans);
+ }
+}
/* ----------------------------------------------------------------
* InitPlan
@@ -851,7 +899,18 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+
+ /*
+ * Perform runtime "initial" pruning to identify which child subplans,
+ * corresponding to the children of plan nodes that contain
+ * PartitionPruneInfo such as Append, will not be executed. The results,
+ * which are bitmapsets of indexes of the child subplans that will be
+ * executed, are saved in es_part_prune_results. These results correspond
+ * to each PartitionPruneInfo entry, and the es_part_prune_results list is
+ * parallel to es_part_prune_infos.
+ */
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ ExecDoInitialPruning(estate);
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 38311d2991..83d1b61101 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -181,8 +181,6 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
-static PartitionPruneState *CreatePartitionPruneState(EState *estate,
- PartitionPruneInfo *pruneinfo);
static void InitPartitionPruneContext(PartitionedRelPruningData *pprune,
PartitionPruneContext *context,
List *pruning_steps,
@@ -1782,20 +1780,24 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
/*
* ExecInitPartitionPruning
- * Initialize data structure needed for run-time partition pruning and
- * do initial pruning if needed
+ * Initialize the data structures needed for runtime "exec" partition
+ * pruning and return the result of initial pruning, if available.
*
* 'relids' identifies the relation to which both the parent plan and the
* PartitionPruneInfo given by 'part_prune_index' belong.
*
- * On return, *initially_valid_subplans is assigned the set of indexes of
- * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * The PartitionPruneState would have been created by ExecDoInitialPruning()
+ * and stored as the part_prune_index'th element of EState.es_part_prune_states.
*
- * If subplans are indeed pruned, subplan_map arrays contained in the returned
- * PartitionPruneState are re-sequenced to not count those, though only if the
- * maps will be needed for subsequent execution pruning passes.
+ * On return, *initially_valid_subplans is assigned the set of indexes of child
+ * subplans that must be initialized alongside the parent plan node. Initial
+ * pruning would have been performed by ExecDoInitialPruning() if necessary,
+ * and the bitmapset of surviving subplans' indexes would have been stored as
+ * the part_prune_index'th element of EState.es_part_prune_results.
+ *
+ * If subplans were pruned during initial pruning, the subplan_map arrays in
+ * the returned PartitionPruneState are re-sequenced to exclude those subplans,
+ * but only if the maps will be needed for subsequent execution pruning passes.
*/
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
@@ -1820,11 +1822,12 @@ ExecInitPartitionPruning(PlanState *planstate,
bmsToString(relids),
bmsToString(pruneinfo->relids)));
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
-
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(estate, pruneinfo);
+ /*
+ * ExecDoInitialPruning() must have initialized the PartitionPruneState to
+ * perform the initial pruning.
+ */
+ prunestate = list_nth(estate->es_part_prune_states, part_prune_index);
+ Assert(prunestate != NULL);
/*
* Store PlanState for using it to initialize exec pruning contexts later
@@ -1833,11 +1836,11 @@ ExecInitPartitionPruning(PlanState *planstate,
if (prunestate->do_exec_prune)
prunestate->parent_plan = planstate;
- /*
- * Perform an initial partition prune pass, if required.
- */
+ /* Use the result of initial pruning done by ExecDoInitialPruning(). */
if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ *initially_valid_subplans = list_nth_node(Bitmapset,
+ estate->es_part_prune_results,
+ part_prune_index);
else
{
/* No pruning, so we'll need to initialize all subplans */
@@ -1848,8 +1851,8 @@ ExecInitPartitionPruning(PlanState *planstate,
/*
* Re-sequence subplan indexes contained in prunestate to account for any
- * that were removed above due to initial pruning. No need to do this if
- * no steps were removed.
+ * that were removed due to initial pruning. No need to do this if no
+ * partitions were removed.
*/
if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
@@ -1886,8 +1889,8 @@ ExecInitPartitionPruning(PlanState *planstate,
* (which are stored in each PartitionedRelPruningData) are initialized lazily
* in find_matching_subplans_recurse() when used for the first time.
*/
-static PartitionPruneState *
-CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
+PartitionPruneState *
+ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
{
PartitionPruneState *prunestate;
int n_part_hierarchies;
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 5178c27743..1497aed533 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -140,4 +140,6 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
+extern PartitionPruneState *ExecCreatePartitionPruneState(EState *estate,
+ PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 5deed9232a..b0ceb1ab05 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -640,6 +640,8 @@ typedef struct EState
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
+ List *es_part_prune_states; /* List of PartitionPruneState */
+ List *es_part_prune_results; /* List of Bitmapset */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
--
2.43.0
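The 0003 patch keeps its result list parallel to es_part_prune_infos, with a NULL entry where no initial pruning was done, so that one index serves both lists. A minimal sketch of that idea, using plain arrays and unsigned bitmasks in place of List and Bitmapset, might look like this. ToyPruneInfo, toy_do_initial_pruning(), toy_valid_subplans() and toy_parallel_demo() are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_NINFOS 3

/* Toy prune info: a flag plus the bitmask that initial pruning would yield. */
typedef struct ToyPruneInfo
{
    int         has_initial_steps;
    unsigned    surviving;      /* bit i set = subplan i survives */
} ToyPruneInfo;

/* Fill results[] parallel to infos[]; NULL where no pruning was performed. */
void
toy_do_initial_pruning(const ToyPruneInfo *infos,
                       const unsigned **results, size_t n)
{
    for (size_t i = 0; i < n; i++)
        results[i] = infos[i].has_initial_steps ? &infos[i].surviving : NULL;
}

/* Look up by the shared index; NULL means all subplans must be initialized. */
unsigned
toy_valid_subplans(const unsigned **results, size_t idx, unsigned all)
{
    assert(idx < TOY_NINFOS);
    return results[idx] != NULL ? *results[idx] : all;
}

int
toy_parallel_demo(void)
{
    ToyPruneInfo infos[TOY_NINFOS] = {{1, 0x5u}, {0, 0u}, {1, 0x2u}};
    const unsigned *results[TOY_NINFOS];

    toy_do_initial_pruning(infos, results, TOY_NINFOS);
    return toy_valid_subplans(results, 0, 0x7u) == 0x5u
        && toy_valid_subplans(results, 1, 0x7u) == 0x7u
        && toy_valid_subplans(results, 2, 0x7u) == 0x2u;
}
```

Storing a placeholder even when nothing was pruned is what keeps the indexes aligned; skipping the entry would shift every later lookup by one.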
[application/octet-stream] v57-0004-Defer-locking-of-runtime-prunable-relations-to-e.patch (54.5K, 4-v57-0004-Defer-locking-of-runtime-prunable-relations-to-e.patch)
From beff511dd6d4b87b763a3f70de26988a37c82d31 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Fri, 25 Oct 2024 15:45:38 +0900
Subject: [PATCH v57 4/5] Defer locking of runtime-prunable relations to
executor
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
When preparing a cached plan for execution, plancache.c locks the
relations in the plan's range table to ensure they are safe for
execution. However, this approach, implemented in
AcquireExecutorLocks(), results in unnecessarily locking relations
that might be pruned during "initial" runtime pruning.
To optimize this, locking is now deferred for relations subject to
"initial" runtime pruning. The planner now provides a set of
"unprunable" relations through the new PlannedStmt.unprunableRelids
field. AcquireExecutorLocks() will only lock these unprunable
relations. PlannedStmt.unprunableRelids is populated by subtracting
the set of initially prunable relids from all RT indexes. The
prunable relids are identified by examining all PartitionPruneInfos
during set_plan_refs() and storing the RT indexes of partitions
subject to "initial" pruning steps in PlannerGlobal.prunableRelids.
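
As a rough illustration of the subtraction described above, here is a minimal sketch that uses a 64-bit mask as a stand-in for a Bitmapset. The names ToyRelids, toy_add_member(), toy_is_member(), toy_unprunable_relids() and toy_demo_unprunable() are hypothetical and are not part of the patch; the patch itself uses bms_difference() on PlannerGlobal.allRelids and PlannerGlobal.prunableRelids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t ToyRelids;

static ToyRelids
toy_add_member(ToyRelids set, int rti)
{
    assert(rti > 0 && rti < 64);    /* RT indexes are 1-based */
    return set | (UINT64_C(1) << rti);
}

static bool
toy_is_member(ToyRelids set, int rti)
{
    return rti > 0 && rti < 64 && (set & (UINT64_C(1) << rti)) != 0;
}

/* unprunable = every relation RT index minus the initially prunable ones */
static ToyRelids
toy_unprunable_relids(ToyRelids all_relids, ToyRelids prunable_relids)
{
    return all_relids & ~prunable_relids;
}

/*
 * Assumed layout for illustration only: RT index 1 is the partitioned
 * parent, 2..4 are leaf partitions subject to initial pruning, and 5 is
 * an unrelated plain table.
 */
static ToyRelids
toy_demo_unprunable(void)
{
    ToyRelids all = 0;
    ToyRelids prunable = 0;

    for (int rti = 1; rti <= 5; rti++)
        all = toy_add_member(all, rti);
    for (int rti = 2; rti <= 4; rti++)
        prunable = toy_add_member(prunable, rti);
    return toy_unprunable_relids(all, prunable);
}
```

In this toy layout only RT indexes 1 and 5 remain in the unprunable set, so only those two would be locked up front; the three leaf partitions are left for the executor to lock if they survive pruning.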
Deferred locks are taken, if necessary, after ExecDoInitialPruning()
determines the set of unpruned partitions. To enable this, the
CachedPlan is now available via QueryDesc, allowing the executor to
determine if the plan tree it’s executing is cached and may contain
unlocked relations. The executor calls CachedPlanRequiresLocking()
to check whether a cached plan might contain such unlocked relations,
ensuring that appropriate locks are acquired before execution.
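
The control flow of the deferred locking step might be pictured as follows. This is a simplified sketch, not the patch's code: ToyExecState and the toy_* functions are hypothetical, the two booleans stand in for "es_cachedplan is set" and the result of CachedPlanRequiresLocking(), and a counter replaces the real LockRelationOid() call.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct ToyExecState
{
    bool     plan_is_cached;       /* stands in for es_cachedplan != NULL */
    bool     plan_requires_locks;  /* stands in for CachedPlanRequiresLocking() */
    uint64_t unpruned_relids;      /* toy version of es_unpruned_relids */
    int      locks_taken;          /* counts where LockRelationOid() would run */
} ToyExecState;

static bool
toy_should_lock_relations(const ToyExecState *es)
{
    return es->plan_is_cached && es->plan_requires_locks;
}

/* Called once initial pruning has produced the surviving leaf RT indexes. */
static void
toy_after_initial_pruning(ToyExecState *es, uint64_t surviving_rtis)
{
    if (toy_should_lock_relations(es))
    {
        for (int rti = 1; rti < 64; rti++)
        {
            if (surviving_rtis & (UINT64_C(1) << rti))
                es->locks_taken++;  /* the real code locks the relation here */
        }
    }

    /* Survivors join the unpruned set whether or not locks were needed. */
    es->unpruned_relids |= surviving_rtis;
}

/* Two partitions (RT indexes 2 and 4) survive pruning in this example. */
static int
toy_demo_deferred_locks(bool cached)
{
    ToyExecState es = {cached, cached, UINT64_C(1) << 1, 0};

    toy_after_initial_pruning(&es, (UINT64_C(1) << 2) | (UINT64_C(1) << 4));
    return es.locks_taken;
}
```

With a cached plan the demo takes two locks; with a freshly built plan it takes none, because the relations are assumed to be locked already.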
Plan nodes like Append are already updated to consider only unpruned
relations. However, child RowMarks and child result relations are not
directly informed about unpruned partitions. Code handling child
RowMarks and result relations has therefore been modified to ensure
they don’t belong to pruned partitions. ExecDoInitialPruning() now
adds RT indexes of unpruned partitions to es_unpruned_relids,
initially populated with PlannedStmt.unprunableRelids. This ensures
only those child RowMarks and result relations whose owning relations
are in this set are processed.
For ModifyTable nodes, ExecInitModifyTable truncates the
resultRelations list (and parallel lists like withCheckOptionLists,
returningLists, and updateColnosLists) to consider only unpruned
relations, and creates ResultRelInfo structs only for those.
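
The truncation of parallel lists can be sketched with plain arrays. The function names and the integer "returning ids" below are hypothetical placeholders for the per-relation lists; the point is only that every parallel list must be filtered with the same membership test so that positions stay aligned.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Keep only entries whose RT index is in the unpruned set, copying the
 * matching entry of a parallel array in lockstep.  Returns the number of
 * entries kept.
 */
static size_t
toy_filter_result_rels(const int *result_rtis, const int *returning_ids,
                       size_t n, uint64_t unpruned,
                       int *out_rtis, int *out_returning)
{
    size_t kept = 0;

    for (size_t i = 0; i < n; i++)
    {
        int rti = result_rtis[i];

        if (rti > 0 && rti < 64 && (unpruned & (UINT64_C(1) << rti)))
        {
            out_rtis[kept] = rti;
            out_returning[kept] = returning_ids[i];
            kept++;
        }
    }
    return kept;
}

/*
 * Example: result relations 2, 3, 4 with partition 3 pruned.  Returns the
 * second surviving "returning id", or -1 if the survivor count is unexpected.
 */
static int
toy_demo_filter(void)
{
    const int result_rtis[] = {2, 3, 4};
    const int returning_ids[] = {20, 30, 40};
    uint64_t  unpruned = (UINT64_C(1) << 1) | (UINT64_C(1) << 2) |
                         (UINT64_C(1) << 4);
    int       out_rtis[3];
    int       out_returning[3];
    size_t    kept;

    kept = toy_filter_result_rels(result_rtis, returning_ids, 3, unpruned,
                                  out_rtis, out_returning);
    if (kept != 2 || out_rtis[0] != 2 || out_rtis[1] != 4)
        return -1;
    return out_returning[1];
}
```

Because the surviving entries are renumbered from zero, any later lookup by position has to use the filtered lists rather than the plan node's originals.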
To obtain RT indexes of unpruned leaf partitions for
es_unpruned_relids, each PartitionedRelPruneInfo and the corresponding
PartitionedRelPruningData now includes a mapping from partition
indexes (from get_matching_partitions()) to their RT indexes in a
leafpart_rti_map[] array.
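
A small sketch of how such a map might be consumed, assuming (as the patch's comments indicate) that an entry of 0 marks a partition with no leaf subplan at that level. The function names are hypothetical and a 64-bit mask again stands in for a Bitmapset.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Translate partition indexes that survived pruning into RT indexes using
 * a leafpart_rti_map-style array.  Entries of 0 are skipped.
 */
static uint64_t
toy_collect_unpruned_rtis(const int *leafpart_rti_map,
                          const int *matching_parts, size_t nmatching)
{
    uint64_t rtis = 0;

    for (size_t i = 0; i < nmatching; i++)
    {
        int rti = leafpart_rti_map[matching_parts[i]];

        if (rti > 0 && rti < 64)
            rtis |= UINT64_C(1) << rti;
    }
    return rtis;
}

/* Partitions 0 and 2 are leaves at RT indexes 5 and 7; partition 1 is not. */
static uint64_t
toy_demo_leafpart_map(void)
{
    const int leafpart_rti_map[] = {5, 0, 7};
    const int matching_parts[] = {0, 1, 2};

    return toy_collect_unpruned_rtis(leafpart_rti_map, matching_parts, 3);
}
```

The demo yields a set containing RT indexes 5 and 7 only.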
An Assert in ExecCheckPermissions() ensures that all relations
undergoing permission checks are properly locked, helping to catch
any missed additions to the unprunableRelids set.
Deferring locking introduces a window where prunable relations may be
altered by concurrent DDL, which can invalidate the plan. This might
cause errors if the executor attempts to use an invalid plan, such as
failing to locate a dropped partition index during
ExecInitIndexScan(). Future commits will add support for the executor
to validate the plan during ExecutorStart() and retry with a new plan
if the original becomes invalid after the deferred locks are taken.
---
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 7 +-
src/backend/commands/extension.c | 1 +
src/backend/commands/matview.c | 2 +-
src/backend/commands/prepare.c | 3 +-
src/backend/executor/execMain.c | 75 ++++++++++++++++++-
src/backend/executor/execParallel.c | 9 ++-
src/backend/executor/execPartition.c | 36 +++++++--
src/backend/executor/functions.c | 1 +
src/backend/executor/nodeAppend.c | 8 +-
src/backend/executor/nodeLockRows.c | 9 ++-
src/backend/executor/nodeMergeAppend.c | 2 +-
src/backend/executor/nodeModifyTable.c | 71 +++++++++++++++---
src/backend/executor/spi.c | 1 +
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 29 ++++++-
src/backend/partitioning/partprune.c | 22 ++++++
src/backend/tcop/pquery.c | 10 ++-
src/backend/utils/cache/plancache.c | 47 +++++++-----
src/include/commands/explain.h | 5 +-
src/include/executor/execPartition.h | 6 +-
src/include/executor/execdesc.h | 2 +
src/include/nodes/execnodes.h | 12 +++
src/include/nodes/pathnodes.h | 8 ++
src/include/nodes/plannodes.h | 7 ++
src/include/utils/plancache.h | 18 +++++
src/test/regress/expected/partition_prune.out | 44 +++++++++++
src/test/regress/sql/partition_prune.sql | 18 +++++
29 files changed, 400 insertions(+), 59 deletions(-)
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f55e6d9675..27b6f6f069 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -556,7 +556,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 68ec122dbf..290c8bd240 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -324,7 +324,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 18a5af6b91..b699089bd8 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -515,7 +515,7 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -623,7 +623,8 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage,
@@ -679,7 +680,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, cplan, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index 86ea9cd9da..cb168ab6dd 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -903,6 +903,7 @@ execute_sql_string(const char *sql, const char *filename)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 010097873d..69be74b4bd 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -438,7 +438,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 07257d4db9..311b9ebd5b 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -655,7 +655,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, queryEnv,
+ ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 2fcec32dcb..ed783236eb 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -54,6 +54,7 @@
#include "nodes/queryjumble.h"
#include "parser/parse_relation.h"
#include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
#include "tcop/utility.h"
#include "utils/acl.h"
#include "utils/backend_status.h"
@@ -91,6 +92,7 @@ static bool ExecCheckPermissionsModified(Oid relOid, Oid userid,
AclMode requiredPerms);
static void ExecCheckXactReadOnly(PlannedStmt *plannedstmt);
static void EvalPlanQualStart(EPQState *epqstate, Plan *planTree);
+static inline bool ExecShouldLockRelations(EState *estate);
/* end of local decls */
@@ -601,6 +603,21 @@ ExecCheckPermissions(List *rangeTable, List *rteperminfos,
(rte->rtekind == RTE_SUBQUERY &&
rte->relkind == RELKIND_VIEW));
+ /*
+ * Ensure that we have at least an AccessShareLock on relations
+ * whose permissions need to be checked.
+ *
+ * Skip this check in a parallel worker because locks won't be
+ * taken until ExecInitNode() performs plan initialization.
+ *
+ * XXX: ExecCheckPermissions() in a parallel worker may be
+ * redundant with the checks done in the leader process, so this
+ * should be reviewed to ensure it’s necessary.
+ */
+ Assert(IsParallelWorker() ||
+ CheckRelationOidLockedByMe(rte->relid, AccessShareLock,
+ true));
+
(void) getRTEPermissionInfo(rteperminfos, rte);
/* Many-to-one mapping not allowed */
Assert(!bms_is_member(rte->perminfoindex, indexset));
@@ -862,12 +879,46 @@ ExecDoInitialPruning(EState *estate)
* bitmapset or NULL as described in the header comment.
*/
if (prunestate->do_initial_prune)
- validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ {
+ Bitmapset *validsubplan_rtis = NULL;
+
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+ &validsubplan_rtis);
+ if (ExecShouldLockRelations(estate))
+ {
+ int rtindex = -1;
+
+ rtindex = -1;
+ while ((rtindex = bms_next_member(validsubplan_rtis,
+ rtindex)) >= 0)
+ {
+ RangeTblEntry *rte = exec_rt_fetch(rtindex, estate);
+
+ Assert(rte->rtekind == RTE_RELATION &&
+ rte->rellockmode != NoLock);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
+ estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
+ validsubplan_rtis);
+ }
+
estate->es_part_prune_results = lappend(estate->es_part_prune_results,
validsubplans);
}
}
+/*
+ * Locks might be needed only if running a cached plan that might contain
+ * unlocked relations, such as reused generic plans.
+ */
+static inline bool
+ExecShouldLockRelations(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? false :
+ CachedPlanRequiresLocking(estate->es_cachedplan);
+}
+
/* ----------------------------------------------------------------
* InitPlan
*
@@ -880,6 +931,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ CachedPlan *cachedplan = queryDesc->cplan;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -899,6 +951,8 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_cachedplan = cachedplan;
+ estate->es_unpruned_relids = bms_copy(plannedstmt->unprunableRelids);
/*
* Perform runtime "initial" pruning to identify which child subplans,
@@ -908,6 +962,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
* executed, are saved in es_part_prune_results. These results correspond
* to each PartitionPruneInfo entry, and the es_part_prune_results list is
* parallel to es_part_prune_infos.
+ *
+ * This will also add the RT indexes of surviving leaf partitions to
+ * es_unpruned_relids.
*/
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
ExecDoInitialPruning(estate);
@@ -926,8 +983,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
Relation relation;
ExecRowMark *erm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unpruned_relids))
continue;
/* get relation's OID (will produce InvalidOid if subquery) */
@@ -2970,6 +3032,13 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
}
}
+ /*
+ * Copy es_unpruned_relids so that RowMarks of pruned relations are
+ * ignored in ExecInitLockRows() and ExecInitModifyTable() when
+ * initializing the plan trees below.
+ */
+ rcestate->es_unpruned_relids = parentestate->es_unpruned_relids;
+
/*
* Initialize private state information for each SubPlan. We must do this
* before running ExecInitNode on the main query tree, since
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index b01a2fdfdd..7519c9a860 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1257,8 +1257,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
- /* Create a QueryDesc for the query. */
+ /*
+ * Create a QueryDesc for the query. We pass NULL for cachedplan, because
+ * we don't have a pointer to the CachedPlan in the leader's process. It's
+ * fine because the only reason the executor needs to see it is to decide
+ * if it should take locks on certain relations, but parallel workers
+ * always take locks anyway.
+ */
return CreateQueryDesc(pstmt,
+ NULL,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 83d1b61101..802a16b6fa 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -192,7 +192,8 @@ static void find_matching_subplans_recurse(PlanState *parent_plan,
PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **validsubplan_rtis);
/*
@@ -1987,8 +1988,8 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
* The set of partitions that exist now might not be the same that
* existed when the plan was made. The normal case is that it is;
* optimize for that case with a quick comparison, and just copy
- * the subplan_map and make subpart_map point to the one in
- * PruneInfo.
+ * the subplan_map and make subpart_map and leafpart_rti_map point
+ * to the ones in PruneInfo.
*
* For the case where they aren't identical, we could have more
* partitions on either side; or even exactly the same number of
@@ -2007,6 +2008,7 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
sizeof(int) * partdesc->nparts) == 0)
{
pprune->subpart_map = pinfo->subpart_map;
+ pprune->leafpart_rti_map = pinfo->leafpart_rti_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
}
@@ -2027,6 +2029,7 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
* mismatches.
*/
pprune->subpart_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->leafpart_rti_map = palloc(sizeof(int) * partdesc->nparts);
for (pp_idx = 0; pp_idx < partdesc->nparts; pp_idx++)
{
@@ -2044,6 +2047,8 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->leafpart_rti_map[pp_idx] =
+ pinfo->leafpart_rti_map[pd_idx];
pd_idx++;
continue;
}
@@ -2081,6 +2086,7 @@ ExecCreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map[pp_idx] = -1;
pprune->subplan_map[pp_idx] = -1;
+ pprune->leafpart_rti_map[pp_idx] = 0;
}
}
@@ -2359,10 +2365,13 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * validsubplan_rtis must be non-NULL if initial_prune is true.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **validsubplan_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2398,7 +2407,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunestate->parent_plan,
prunedata, pprune, initial_prune,
- &result);
+ &result, validsubplan_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_context.is_valid)
@@ -2415,6 +2424,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (validsubplan_rtis)
+ *validsubplan_rtis = bms_copy(*validsubplan_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2425,14 +2436,17 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and the RT indexes
+ * of their corresponding leaf partitions to *validsubplan_rtis if
+ * it's non-NULL.
*/
static void
find_matching_subplans_recurse(PlanState *parent_plan,
PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **validsubplan_rtis)
{
Bitmapset *partset;
int i;
@@ -2475,8 +2489,13 @@ find_matching_subplans_recurse(PlanState *parent_plan,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ if (validsubplan_rtis)
+ *validsubplan_rtis = bms_add_member(*validsubplan_rtis,
+ pprune->leafpart_rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2485,7 +2504,8 @@ find_matching_subplans_recurse(PlanState *parent_plan,
find_matching_subplans_recurse(parent_plan,
prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ validsubplan_rtis);
else
{
/*
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 692854e2b3..6f6f45e0ad 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -840,6 +840,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
dest = None_Receiver;
es->qd = CreateQueryDesc(es->stmt,
+ NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index de7ebab5c2..006bdafaea 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -581,7 +581,7 @@ choose_next_subplan_locally(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
}
@@ -648,7 +648,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
/*
@@ -724,7 +724,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
mark_invalid_subplans_as_finished(node);
@@ -877,7 +877,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
classify_matching_subplans(node);
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 41754ddfea..cfead7ded2 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -347,8 +347,13 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
ExecRowMark *erm;
ExecAuxRowMark *aerm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unpruned_relids))
continue;
/* find ExecRowMark and build ExecAuxRowMark */
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 3ed91808dd..f7821aa178 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -219,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 1161520f76..004273f868 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -636,7 +636,7 @@ ExecInitUpdateProjection(ModifyTableState *mtstate,
Assert(whichrel >= 0 && whichrel < mtstate->mt_nrels);
}
- updateColnos = (List *) list_nth(node->updateColnosLists, whichrel);
+ updateColnos = (List *) list_nth(mtstate->mt_updateColnosLists, whichrel);
/*
* For UPDATE, we use the old tuple to fill up missing values in the tuple
@@ -4245,6 +4245,7 @@ ExecLookupResultRelByOid(ModifyTableState *node, Oid resultoid,
node->mt_lastResultOid = resultoid;
node->mt_lastResultIndex = mtlookup->relationIndex;
}
+
return node->resultRelInfo + mtlookup->relationIndex;
}
}
@@ -4282,7 +4283,11 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
ModifyTableState *mtstate;
Plan *subplan = outerPlan(node);
CmdType operation = node->operation;
- int nrels = list_length(node->resultRelations);
+ int nrels;
+ List *resultRelations = NIL;
+ List *withCheckOptionLists = NIL;
+ List *returningLists = NIL;
+ List *updateColnosLists = NIL;
ResultRelInfo *resultRelInfo;
List *arowmarks;
ListCell *l;
@@ -4292,6 +4297,45 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/* check for unsupported flags */
Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
+ /*
+ * Only consider unpruned relations for initializing their ResultRelInfo
+ * struct and other fields such as withCheckOptions, etc.
+ */
+ i = 0;
+ foreach(l, node->resultRelations)
+ {
+ Index rti = lfirst_int(l);
+
+ if (bms_is_member(rti, estate->es_unpruned_relids))
+ {
+ resultRelations = lappend_int(resultRelations, rti);
+ if (node->withCheckOptionLists)
+ {
+ List *withCheckOptions = list_nth_node(List,
+ node->withCheckOptionLists,
+ i);
+
+ withCheckOptionLists = lappend(withCheckOptionLists, withCheckOptions);
+ }
+ if (node->returningLists)
+ {
+ List *returningList = list_nth_node(List,
+ node->returningLists,
+ i);
+
+ returningLists = lappend(returningLists, returningList);
+ }
+ if (node->updateColnosLists)
+ {
+ List *updateColnosList = list_nth(node->updateColnosLists, i);
+
+ updateColnosLists = lappend(updateColnosLists, updateColnosList);
+ }
+ }
+ i++;
+ }
+ nrels = list_length(resultRelations);
+
/*
* create state structure
*/
@@ -4312,6 +4356,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
mtstate->mt_merge_inserted = 0;
mtstate->mt_merge_updated = 0;
mtstate->mt_merge_deleted = 0;
+ mtstate->mt_updateColnosLists = updateColnosLists;
/*----------
* Resolve the target relation. This is the same as:
@@ -4329,6 +4374,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
*/
if (node->rootRelation > 0)
{
+ Assert(bms_is_member(node->rootRelation, estate->es_unpruned_relids));
mtstate->rootResultRelInfo = makeNode(ResultRelInfo);
ExecInitResultRelation(estate, mtstate->rootResultRelInfo,
node->rootRelation);
@@ -4343,7 +4389,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/* set up epqstate with dummy subplan data for the moment */
EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL,
- node->epqParam, node->resultRelations);
+ node->epqParam, resultRelations);
mtstate->fireBSTriggers = true;
/*
@@ -4361,7 +4407,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
*/
resultRelInfo = mtstate->resultRelInfo;
i = 0;
- foreach(l, node->resultRelations)
+ foreach(l, resultRelations)
{
Index resultRelation = lfirst_int(l);
List *mergeActions = NIL;
@@ -4505,7 +4551,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* Initialize any WITH CHECK OPTION constraints if needed.
*/
resultRelInfo = mtstate->resultRelInfo;
- foreach(l, node->withCheckOptionLists)
+ foreach(l, withCheckOptionLists)
{
List *wcoList = (List *) lfirst(l);
List *wcoExprs = NIL;
@@ -4528,7 +4574,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/*
* Initialize RETURNING projections if needed.
*/
- if (node->returningLists)
+ if (returningLists)
{
TupleTableSlot *slot;
ExprContext *econtext;
@@ -4537,7 +4583,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* Initialize result tuple slot and assign its rowtype using the first
* RETURNING list. We assume the rest will look the same.
*/
- mtstate->ps.plan->targetlist = (List *) linitial(node->returningLists);
+ mtstate->ps.plan->targetlist = (List *) linitial(returningLists);
/* Set up a slot for the output of the RETURNING projection(s) */
ExecInitResultTupleSlotTL(&mtstate->ps, &TTSOpsVirtual);
@@ -4552,7 +4598,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* Build a projection for each result rel.
*/
resultRelInfo = mtstate->resultRelInfo;
- foreach(l, node->returningLists)
+ foreach(l, returningLists)
{
List *rlist = (List *) lfirst(l);
@@ -4653,8 +4699,13 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
ExecRowMark *erm;
ExecAuxRowMark *aerm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unpruned_relids))
continue;
/* Find ExecRowMark and build ExecAuxRowMark */
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2fb2e73604..e2b781e939 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -2690,6 +2690,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
snap = InvalidSnapshot;
qdesc = CreateQueryDesc(stmt,
+ cplan,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index cce226fff1..c98895976e 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -555,6 +555,8 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
+ result->unprunableRelids = bms_difference(glob->allRelids,
+ glob->prunableRelids);
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 8deb012d8e..a6899e100f 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -565,7 +565,8 @@ add_rte_to_flat_rtable(PlannerGlobal *glob, List *rteperminfos,
/*
* If it's a plain relation RTE (or a subquery that was once a view
- * reference), add the relation OID to relationOids.
+ * reference), add the relation OID to relationOids. Also add its new RT
+ * index to the set of relations that need to be locked for execution.
*
* We do this even though the RTE might be unreferenced in the plan tree;
* this would correspond to cases such as views that were expanded, child
@@ -577,7 +578,11 @@ add_rte_to_flat_rtable(PlannerGlobal *glob, List *rteperminfos,
*/
if (newrte->rtekind == RTE_RELATION ||
(newrte->rtekind == RTE_SUBQUERY && OidIsValid(newrte->relid)))
+ {
glob->relationOids = lappend_oid(glob->relationOids, newrte->relid);
+ glob->allRelids = bms_add_member(glob->allRelids,
+ list_length(glob->finalrtable));
+ }
/*
* Add a copy of the RTEPermissionInfo, if any, corresponding to this RTE
@@ -1741,6 +1746,11 @@ set_customscan_references(PlannerInfo *root,
*
* Also update the RT indexes present in PartitionedRelPruneInfos to add the
* offset.
+ *
+ * Finally, if there are initial pruning steps, add the RT indexes of the
+ * leaf partitions to the set of relations prunable at execution startup time.
+ * This set indicates which relations should not be locked before executor
+ * startup, as they may be pruned during initial pruning.
*/
static int
register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
@@ -1763,8 +1773,25 @@ register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+ int i;
prelinfo->rtindex += rtoffset;
+
+ for (i = 0; i < prelinfo->nparts; i++)
+ {
+ /*
+ * Non-leaf partitions and partitions that do not have a
+ * subplan are not included in this map as mentioned in
+ * make_partitionedrel_pruneinfo().
+ */
+ if (prelinfo->leafpart_rti_map[i])
+ {
+ prelinfo->leafpart_rti_map[i] += rtoffset;
+ if (prelinfo->initial_pruning_steps)
+ glob->prunableRelids = bms_add_member(glob->prunableRelids,
+ prelinfo->leafpart_rti_map[i]);
+ }
+ }
}
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index df767f9e5b..5a518e99bc 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -645,6 +645,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ int *leafpart_rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -657,6 +658,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ leafpart_rti_map = (int *) palloc0(nparts * sizeof(int));
present_parts = NULL;
i = -1;
@@ -671,9 +673,28 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+
+ /*
+ * Track the RT indexes of "leaf" partitions so they can be
+ * included in the PlannerGlobal.prunableRelids set, indicating
+ * relations whose locking is deferred until executor startup.
+ *
+ * We don't defer locking of sub-partitioned partitions because
+ * setting up PartitionedRelPruningData currently occurs before
+ * initial pruning, so the relation must be locked at that stage,
+ * even if it may be pruned.
+ *
+ * Only leaf partitions with a valid subplan that are prunable
+ * using initial pruning are added to prunableRelids. So
+ * partitions without a subplan due to constraint exclusion will
+ * remain in PlannedStmt.unprunableRelids and thus their locking
+ * will not be deferred even if they may ultimately be pruned due
+ * to initial pruning.
+ */
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
+ leafpart_rti_map[i] = (int) partrel->relid;
/* Record finding this subplan */
subplansfound = bms_add_member(subplansfound, subplanidx);
@@ -695,6 +716,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->leafpart_rti_map = leafpart_rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index a1f8d03db1..6e8f6b1b8f 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -36,6 +36,7 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +66,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +79,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +125,7 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * cplan: CachedPlan supplying the plan
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +138,7 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,7 +150,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, cplan, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
@@ -493,6 +498,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->cplan,
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -1276,6 +1282,7 @@ PortalRunMulti(Portal portal,
{
/* statement can set tag string */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1285,6 +1292,7 @@ PortalRunMulti(Portal portal,
{
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
+ portal->cplan,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 5af1a168ec..449fb8f4e2 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -104,7 +104,8 @@ static List *RevalidateCachedQuery(CachedPlanSource *plansource,
QueryEnvironment *queryEnv);
static bool CheckCachedPlan(CachedPlanSource *plansource);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv);
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic);
static bool choose_custom_plan(CachedPlanSource *plansource,
ParamListInfo boundParams);
static double cached_plan_cost(CachedPlan *plan, bool include_planner);
@@ -815,8 +816,11 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
* Caller must have already called RevalidateCachedQuery to verify that the
* querytree is up to date.
*
- * On a "true" return, we have acquired the locks needed to run the plan.
- * (We must do this for the "true" result to be race-condition-free.)
+ * On a "true" return, we have acquired locks on the "unprunableRelids" set
+ * for all plans in plansource->stmt_list. However, the plans are not fully
+ * race-condition-free until the executor acquires locks on the prunable
+ * relations that survive initial runtime pruning during executor
+ * initialization.
*/
static bool
CheckCachedPlan(CachedPlanSource *plansource)
@@ -893,10 +897,10 @@ CheckCachedPlan(CachedPlanSource *plansource)
* or it can be set to NIL if we need to re-copy the plansource's query_list.
*
* To build a generic, parameter-value-independent plan, pass NULL for
- * boundParams. To build a custom plan, pass the actual parameter values via
- * boundParams. For best effect, the PARAM_FLAG_CONST flag should be set on
- * each parameter value; otherwise the planner will treat the value as a
- * hint rather than a hard constant.
+ * boundParams, and true for generic. To build a custom plan, pass the actual
+ * parameter values via boundParams, and false for generic. For best effect,
+ * the PARAM_FLAG_CONST flag should be set on each parameter value; otherwise
+ * the planner will treat the value as a hint rather than a hard constant.
*
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
@@ -904,7 +908,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
- ParamListInfo boundParams, QueryEnvironment *queryEnv)
+ ParamListInfo boundParams, QueryEnvironment *queryEnv,
+ bool generic)
{
CachedPlan *plan;
List *plist;
@@ -1026,6 +1031,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->refcount = 0;
plan->context = plan_context;
plan->is_oneshot = plansource->is_oneshot;
+ plan->is_generic = generic;
plan->is_saved = false;
plan->is_valid = true;
@@ -1153,8 +1159,10 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
- * On return, the plan is valid and we have sufficient locks to begin
- * execution.
+ * On return, the plan is valid, but not all locks are acquired if a cached
+ * generic plan is being reused. In such cases, locks on relations subject
+ * to initial runtime pruning are deferred until the execution startup phase,
+ * specifically when ExecDoInitialPruning() performs initial pruning.
*
* On return, the refcount of the plan has been incremented; a later
* ReleaseCachedPlan() call is expected. If "owner" is not NULL then
@@ -1196,7 +1204,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
else
{
/* Build a new generic plan */
- plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, NULL, queryEnv, true);
/* Just make real sure plansource->gplan is clear */
ReleaseGenericPlan(plansource);
/* Link the new generic plan into the plansource */
@@ -1241,7 +1249,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
if (customplan)
{
/* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, false);
/* Accumulate total costs of custom plans */
plansource->total_custom_cost += cached_plan_cost(plan, true);
@@ -1776,7 +1784,7 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ int rtindex;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1794,13 +1802,16 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ rtindex = -1;
+ while ((rtindex = bms_next_member(plannedstmt->unprunableRelids,
+ rtindex)) >= 0)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ RangeTblEntry *rte = list_nth_node(RangeTblEntry,
+ plannedstmt->rtable,
+ rtindex - 1);
- if (!(rte->rtekind == RTE_RELATION ||
- (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid))))
- continue;
+ Assert(rte->rtekind == RTE_RELATION ||
+ (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
/*
* Acquire the appropriate type of lock on each relation OID. Note
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 3ab0aae78f..21c71e0d53 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -103,8 +103,9 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
- ExplainState *es, const char *queryString,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ IntoClause *into, ExplainState *es,
+ const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
const BufferUsage *bufusage,
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 1497aed533..ec5cf4233e 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -49,6 +49,8 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * leafpart_rti_map RT index by partition index, or 0 if not a leaf
+ * partition.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -69,6 +71,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ int *leafpart_rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -139,7 +142,8 @@ extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
Bitmapset *relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **validsubplan_rtis);
extern PartitionPruneState *ExecCreatePartitionPruneState(EState *estate,
PartitionPruneInfo *pruneinfo);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0a7274e26c..0e7245435d 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index b0ceb1ab05..ac9be82e19 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -42,6 +42,7 @@
#include "storage/condition_variable.h"
#include "utils/hsearch.h"
#include "utils/queryenvironment.h"
+#include "utils/plancache.h"
#include "utils/reltrigger.h"
#include "utils/sharedtuplestore.h"
#include "utils/snapshot.h"
@@ -639,9 +640,14 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ CachedPlan *es_cachedplan;
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
List *es_part_prune_states; /* List of PartitionPruneState */
List *es_part_prune_results; /* List of Bitmapset */
+ Bitmapset *es_unpruned_relids; /* PlannedStmt.unprunableRelids + RT
+ * indexes of leaf partitions that
+ * survive initial pruning; see
+ * ExecDoInitialPruning() */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -1427,6 +1433,12 @@ typedef struct ModifyTableState
double mt_merge_inserted;
double mt_merge_updated;
double mt_merge_deleted;
+
+ /*
+ * List of valid updateColnosLists. Contains only those belonging to
+ * unpruned relations from ModifyTable.updateColnosLists.
+ */
+ List *mt_updateColnosLists;
} ModifyTableState;
/* ----------------
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c603a9bb1c..ab33b8faf9 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -116,6 +116,14 @@ typedef struct PlannerGlobal
/* "flat" rangetable for executor */
List *finalrtable;
+ /*
+ * allRelids: RT indexes of all relation RTEs in finalrtable (RTE_RELATION
+ * RTEs and RTE_SUBQUERY RTEs of views). prunableRelids: the subset that is
+ * subject to runtime pruning at plan initialization ("initial" pruning).
+ */
+ Bitmapset *allRelids;
+ Bitmapset *prunableRelids;
+
/* "flat" list of RTEPermissionInfos */
List *finalrteperminfos;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index ef89927471..59699a1f86 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -74,6 +74,10 @@ typedef struct PlannedStmt
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *unprunableRelids; /* RT indexes of relations that are not
+ * subject to runtime pruning; set for
+ * AcquireExecutorLocks(). */
+
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
@@ -1476,6 +1480,9 @@ typedef struct PartitionedRelPruneInfo
/* subpart index by partition index, or -1 */
int *subpart_map pg_node_attr(array_size(nparts));
+ /* RT index by partition index, or 0 if not a leaf partition */
+ int *leafpart_rti_map pg_node_attr(array_size(nparts));
+
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a90dfdf906..e227c4f11b 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -149,6 +149,7 @@ typedef struct CachedPlan
int magic; /* should equal CACHEDPLAN_MAGIC */
List *stmt_list; /* list of PlannedStmts */
bool is_oneshot; /* is it a "oneshot" plan? */
+ bool is_generic; /* is it a reusable generic plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
bool is_valid; /* is the stmt_list currently valid? */
Oid planRoleId; /* Role ID the plan was created for */
@@ -235,4 +236,21 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
extern CachedExpression *GetCachedExpression(Node *expr);
extern void FreeCachedExpression(CachedExpression *cexpr);
+/*
+ * CachedPlanRequiresLocking: should the executor acquire additional locks?
+ *
+ * If the plan is a saved generic plan, the executor must acquire locks for
+ * relations that are not covered by AcquireExecutorLocks(), such as partitions
+ * that are subject to initial runtime pruning.
+ *
+ * Note: These locks are unnecessary if the plan is executed immediately after
+ * its creation, since the planner would have already acquired them. However,
+ * we do not optimize for that case.
+ */
+static inline bool
+CachedPlanRequiresLocking(CachedPlan *cplan)
+{
+ return !cplan->is_oneshot && cplan->is_generic;
+}
+
#endif /* PLANCACHE_H */
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 7a03b4e360..705cd922fc 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4440,3 +4440,47 @@ drop table hp_contradict_test;
drop operator class part_test_int4_ops2 using hash;
drop operator ===(int4, int4);
drop function explain_analyze(text);
+-- Runtime pruning on UPDATE using WITH CHECK OPTIONS and RETURNING
+create table part_abc (a int, b text, c bool) partition by list (a);
+create table part_abc_1 (b text, a int, c bool);
+create table part_abc_2 (a int, c bool, b text);
+alter table part_abc attach partition part_abc_1 for values in (1);
+alter table part_abc attach partition part_abc_2 for values in (2);
+insert into part_abc values (1, 'b', true);
+insert into part_abc values (2, 'c', true);
+create view part_abc_view as select * from part_abc where b <> 'a' with check option;
+prepare update_part_abc_view as update part_abc_view set b = $2 where a = $1 returning *;
+explain (costs off) execute update_part_abc_view (1, 'd');
+ QUERY PLAN
+-------------------------------------------------------
+ Update on part_abc
+ Update on part_abc_1
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on part_abc_1
+ Filter: ((b <> 'a'::text) AND (a = $1))
+(6 rows)
+
+execute update_part_abc_view (1, 'd');
+ a | b | c
+---+---+---
+ 1 | d | t
+(1 row)
+
+explain (costs off) execute update_part_abc_view (2, 'a');
+ QUERY PLAN
+-------------------------------------------------------
+ Update on part_abc
+ Update on part_abc_2 part_abc_1
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on part_abc_2 part_abc_1
+ Filter: ((b <> 'a'::text) AND (a = $1))
+(6 rows)
+
+execute update_part_abc_view (2, 'a');
+ERROR: new row violates check option for view "part_abc_view"
+DETAIL: Failing row contains (2, a, t).
+deallocate update_part_abc_view;
+drop view part_abc_view;
+drop table part_abc;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 442428d937..af26ad2fb2 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1339,3 +1339,21 @@ drop operator class part_test_int4_ops2 using hash;
drop operator ===(int4, int4);
drop function explain_analyze(text);
+
+-- Runtime pruning on UPDATE using WITH CHECK OPTIONS and RETURNING
+create table part_abc (a int, b text, c bool) partition by list (a);
+create table part_abc_1 (b text, a int, c bool);
+create table part_abc_2 (a int, c bool, b text);
+alter table part_abc attach partition part_abc_1 for values in (1);
+alter table part_abc attach partition part_abc_2 for values in (2);
+insert into part_abc values (1, 'b', true);
+insert into part_abc values (2, 'c', true);
+create view part_abc_view as select * from part_abc where b <> 'a' with check option;
+prepare update_part_abc_view as update part_abc_view set b = $2 where a = $1 returning *;
+explain (costs off) execute update_part_abc_view (1, 'd');
+execute update_part_abc_view (1, 'd');
+explain (costs off) execute update_part_abc_view (2, 'a');
+execute update_part_abc_view (2, 'a');
+deallocate update_part_abc_view;
+drop view part_abc_view;
+drop table part_abc;
--
2.43.0
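The split between up-front and deferred locking described in the comments of the patch above can be sketched in standalone C. This is only an illustration, not PostgreSQL code: a 64-bit mask stands in for a Bitmapset, every sketch_* name and the SketchCachedPlan struct are hypothetical, and the rule "unprunable = all minus prunable" is an assumption inferred from the field names rather than something shown in this hunk.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit N set means RT index N. */
typedef uint64_t SketchRelids;

/* Hypothetical stand-in carrying only the two flags the check reads. */
typedef struct SketchCachedPlan
{
    bool is_oneshot;
    bool is_generic;
} SketchCachedPlan;

/* Same condition as CachedPlanRequiresLocking() in the patch. */
bool
sketch_requires_locking(const SketchCachedPlan *cplan)
{
    return !cplan->is_oneshot && cplan->is_generic;
}

/* Relations locked before executor startup (assumed: all minus prunable). */
SketchRelids
sketch_unprunable(SketchRelids all, SketchRelids prunable)
{
    return all & ~prunable;
}

/*
 * Relations the executor ends up using: the unprunable ones plus the
 * prunable leaf partitions that survived initial pruning.
 */
SketchRelids
sketch_unpruned(SketchRelids unprunable, SketchRelids prunable,
                SketchRelids survivors)
{
    /* The two input sets are disjoint by construction. */
    assert((unprunable & prunable) == 0);
    return unprunable | (prunable & survivors);
}
```

With RT indexes 1..4 in the range table and 3 and 4 prunable, only 1 and 2 are locked up front; if initial pruning keeps partition 3, the executor locks it as well and never touches 4.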
[application/octet-stream] v57-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch (20.4K, 5-v57-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch)
From 0e7344da196e8f20ebe46c5b8104720e1e3725fa Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 23 Oct 2024 15:37:32 +0900
Subject: [PATCH v57 1/5] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
This change moves PartitionPruneInfo from individual plan nodes to
PlannedStmt, enabling runtime initial pruning to be performed across
the entire plan tree without traversing it to find nodes containing
PartitionPruneInfos.
The PartitionPruneInfo pointer fields in Append and MergeAppend nodes
have been replaced with an integer index that points to a list of
PartitionPruneInfos within PlannedStmt, which now holds the
PartitionPruneInfos for all subqueries.
A bitmapset field has been added to PartitionPruneInfo to store the RT
indexes that correspond to the apprelids field in Append or
MergeAppend. This lets the executor cross-check, when it looks up a
PartitionPruneInfo by index, that the entry belongs to the plan node
being initialized.
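As a rough standalone sketch of that lookup and cross-check (not the patch's code: a 64-bit mask stands in for a Bitmapset, and SketchPruneInfo and sketch_get_pruneinfo() are hypothetical names), the idea is roughly:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: a 64-bit mask plays the role of a Bitmapset. */
typedef struct SketchPruneInfo
{
    uint64_t relids;    /* RT indexes of the owning Append/MergeAppend */
} SketchPruneInfo;

/*
 * Return the entry at part_prune_index, or NULL if the index is out of
 * range or the entry's relids differ from the plan node's apprelids.
 * The patch raises an ERROR on a mismatch instead of returning NULL.
 */
const SketchPruneInfo *
sketch_get_pruneinfo(const SketchPruneInfo *infos, int ninfos,
                     int part_prune_index, uint64_t apprelids)
{
    assert(infos != NULL);

    if (part_prune_index < 0 || part_prune_index >= ninfos)
        return NULL;
    if (infos[part_prune_index].relids != apprelids)
        return NULL;
    return &infos[part_prune_index];
}
```

Returning NULL keeps the sketch testable; in the executor a mismatch indicates an internal inconsistency, which is why the patch reports it with ereport(ERROR, ...).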
Duplicated code in set_append_references() and
set_mergeappend_references() has been moved to a new function,
register_partpruneinfo(), which both updates the RT indexes by adding
rtoffset and adds the PartitionPruneInfo to the global list in
PlannerGlobal.
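The register step can be pictured with the following simplified, standalone sketch. It assumes a fixed-size array in place of a List and a single rtindex per entry; SketchRelPruneInfo, SketchGlobal and sketch_register_pruneinfo() are hypothetical names, not part of the patch.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_INFOS 16

/* Hypothetical stand-in for one pruning entry with a single RT index. */
typedef struct SketchRelPruneInfo
{
    int rtindex;        /* RT index of the partitioned table */
} SketchRelPruneInfo;

/* Hypothetical stand-in for the global list kept in PlannerGlobal. */
typedef struct SketchGlobal
{
    SketchRelPruneInfo *infos[SKETCH_MAX_INFOS];
    int         ninfos;
} SketchGlobal;

/*
 * Shift the entry's RT index by rtoffset, append the entry to the global
 * list, and return its 0-based position there (-1 if the list is full).
 */
int
sketch_register_pruneinfo(SketchGlobal *glob, SketchRelPruneInfo *pinfo,
                          int rtoffset)
{
    assert(glob != NULL && pinfo != NULL);

    if (glob->ninfos >= SKETCH_MAX_INFOS)
        return -1;
    pinfo->rtindex += rtoffset;
    glob->infos[glob->ninfos] = pinfo;
    return glob->ninfos++;
}
```

The returned position is what the plan node would store as its part_prune_index, so an index local to one subquery becomes an index into the flattened, statement-wide list.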
Reviewed-by: Alvaro Herrera
Reviewed-by: Robert Haas
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 19 +++++-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/nodeAppend.c | 5 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/optimizer/plan/createplan.c | 23 +++----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 85 ++++++++++++++++---------
src/backend/partitioning/partprune.c | 19 ++++--
src/include/executor/execPartition.h | 4 +-
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 6 ++
src/include/nodes/plannodes.h | 16 +++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 133 insertions(+), 62 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index cc9a594cba..c460c6aa32 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -851,6 +851,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index bfb3419efb..b01a2fdfdd 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -181,6 +181,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->permInfos = estate->es_rteperminfos;
pstmt->resultRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7651886229..323d5330ff 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1786,6 +1786,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* Initialize data structure needed for run-time partition pruning and
* do initial pruning if needed
*
+ * 'relids' identifies the relation to which both the parent plan and the
+ * PartitionPruneInfo given by 'part_prune_index' belong.
+ *
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
* Initial pruning is performed here if needed and in that case only the
@@ -1798,11 +1801,25 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *relids,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo;
+
+ /* Obtain the pruneinfo we need, and make sure it's the right one */
+ pruneinfo = list_nth_node(PartitionPruneInfo, estate->es_part_prune_infos,
+ part_prune_index);
+ if (!bms_equal(relids, pruneinfo->relids))
+ ereport(ERROR,
+ errcode(ERRCODE_INTERNAL_ERROR),
+ errmsg_internal("mismatching PartitionPruneInfo found at part_prune_index %d",
+ part_prune_index),
+ errdetail_internal("plan node relids %s, pruneinfo relids %s",
+ bmsToString(relids),
+ bmsToString(pruneinfo->relids)));
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 740e8fb148..bc905a0cdc 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -118,6 +118,7 @@ CreateExecutorState(void)
estate->es_rowmarks = NULL;
estate->es_rteperminfos = NIL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index ca0f54d676..de7ebab5c2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index e1b9b984a7..3ed91808dd 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index f2ed0d81f6..fafcb8f1ad 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1227,7 +1227,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1378,6 +1377,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1401,16 +1403,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1449,7 +1449,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1542,6 +1541,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1557,13 +1559,12 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
Assert(best_path->path.param_info == NULL);
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 0f423e9684..cce226fff1 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -553,6 +553,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 91c7c4fe2f..8deb012d8e 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1732,6 +1732,47 @@ set_customscan_references(PlannerInfo *root,
cscan->custom_relids = offset_relid_set(cscan->custom_relids, rtoffset);
}
+/*
+ * register_partpruneinfo
+ * Subroutine for set_append_references and set_mergeappend_references
+ *
+ * Add the PartitionPruneInfo from root->partPruneInfos at the given index
+ * into PlannerGlobal->partPruneInfos and return its index there.
+ *
+ * Also update the RT indexes present in PartitionedRelPruneInfos to add the
+ * offset.
+ */
+static int
+register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
+{
+ PlannerGlobal *glob = root->glob;
+ PartitionPruneInfo *pinfo;
+ ListCell *l;
+
+ Assert(part_prune_index >= 0 &&
+ part_prune_index < list_length(root->partPruneInfos));
+ pinfo = list_nth_node(PartitionPruneInfo, root->partPruneInfos,
+ part_prune_index);
+
+ pinfo->relids = offset_relid_set(pinfo->relids, rtoffset);
+ foreach(l, pinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+
+ prelinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pinfo);
+
+ return list_length(glob->partPruneInfos) - 1;
+}
+
/*
* set_append_references
* Do set_plan_references processing on an Append
@@ -1784,21 +1825,13 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index =
+ register_partpruneinfo(root, aplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1860,21 +1893,13 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index =
+ register_partpruneinfo(root, mplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 9a1a7faac7..6f0ead1fa8 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -207,16 +207,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -330,10 +334,11 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
+ pruneinfo->relids = bms_copy(parentrel->relids);
pruneinfo->prune_infos = prunerelinfos;
/*
@@ -356,7 +361,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index c09bc83b2a..ed2b019c09 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,9 +123,9 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
-
#endif /* EXECPARTITION_H */
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index e4698a28c4..5deed9232a 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -639,6 +639,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index add0f9e45f..c603a9bb1c 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -128,6 +128,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* List of PartitionPruneInfo contained in the plan */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -559,6 +562,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 52f29bcdb6..ef89927471 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in the
+ * plan */
+
List *rtable; /* list of RangeTblEntry nodes */
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
@@ -276,8 +279,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -311,8 +314,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
@@ -1414,6 +1417,10 @@ typedef struct PlanRowMark
* Then, since an Append-type node could have multiple partitioning
* hierarchies among its children, we have an unordered List of those Lists.
*
+ * relids RelOptInfo.relids of the parent plan node (e.g. Append
+ * or MergeAppend) to which this PartitionPruneInfo node
+ * belongs. The pruning logic ensures that this matches
+ * the parent plan node's apprelids.
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
@@ -1426,6 +1433,7 @@ typedef struct PartitionPruneInfo
pg_node_attr(no_equal, no_query_jumble)
NodeTag type;
+ Bitmapset *relids;
List *prune_infos;
Bitmapset *other_subplans;
} PartitionPruneInfo;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index bd490d154f..6922e04430 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.43.0
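Reader's note on the patch above: the core idea is that Append and
MergeAppend no longer hold a pointer to their PartitionPruneInfo. They hold
an integer index into a list, and register_partpruneinfo() re-registers the
entry in the plan-wide list while shifting its RT indexes. The following is
only a toy sketch of that pattern, not PostgreSQL code; ToyPruneInfo,
ToyList, toy_append and toy_register are hypothetical names invented for
illustration, and the fixed-size array merely stands in for a List.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INFOS 16

/* Hypothetical stand-in for PartitionPruneInfo; only the RT index matters. */
typedef struct ToyPruneInfo
{
    int rtindex;
} ToyPruneInfo;

/* Hypothetical stand-in for a List of pruning infos. */
typedef struct ToyList
{
    ToyPruneInfo *items[MAX_INFOS];
    int           length;
} ToyList;

/* Append an item; return its 0-based position, or -1 if the list is full. */
static int
toy_append(ToyList *list, ToyPruneInfo *item)
{
    if (list->length >= MAX_INFOS)
        return -1;
    list->items[list->length] = item;
    return list->length++;
}

/*
 * Look up the entry at local_index in the per-query list, shift its RT
 * index by rtoffset, append it to the global list, and return its index
 * there. The caller is expected to overwrite its stored index with the
 * returned value.
 */
static int
toy_register(ToyList *local, ToyList *global, int local_index, int rtoffset)
{
    ToyPruneInfo *info;

    assert(local_index >= 0 && local_index < local->length);
    info = local->items[local_index];
    info->rtindex += rtoffset;
    return toy_append(global, info);
}
```

The point of the indirection, as far as the patch shows, is that the
executor can reach every pruning info through one flat list
(es_part_prune_infos) without walking the plan tree first.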
[application/octet-stream] v57-0005-Handle-CachedPlan-invalidation-in-the-executor.patch (59.8K, 6-v57-0005-Handle-CachedPlan-invalidation-in-the-executor.patch)
From 7423b46ffe28f0def985f49932704504c71ca8e1 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Fri, 18 Oct 2024 22:06:02 +0900
Subject: [PATCH v57 5/5] Handle CachedPlan invalidation in the executor
This commit makes changes to handle cases where a cached plan
becomes invalid after deferred locks on prunable relations are taken.
InitPlan() now returns immediately without doing anything after
finding that the locks taken by ExecDoInitialPruning() have
invalidated the CachedPlan.
ExecutorStartExt(), a wrapper over ExecutorStart(), is added to
handle cases where InitPlan() returns early due to plan invalidation.
ExecutorStartExt() updates the CachedPlan to create fresh plans for
all queries contained in its owning CachedPlanSource and
retries execution with the new plan for the query. This new function
is only called by sites that use plancache.c for getting a plan.
To update an invalid CachedPlan, ExecutorStartExt() calls the new
plancache.c function UpdateCachedPlan(), which creates fresh plans
for each query in the CachedPlanSource and replaces the old stale
ones in CachedPlan.stmt_list in place. This leads to the old ones
leaking into CachedPlan.plan_context, but UpdateCachedPlan() should
be called rarely enough that the leak does not amount to much
memory.
This also adds isolation tests using the delay_execution test module
to verify scenarios where a CachedPlan becomes invalid before the
deferred locks are taken.
All ExecutorStart_hook implementations must now add the following
block after the ExecutorStart() call to ensure they don't work with an
invalid plan:
/* The plan may have become invalid during ExecutorStart() */
if (!ExecPlanStillValid(queryDesc->estate))
return;
Reviewed-by: Robert Haas
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
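Reader's note on the commit message above: the two behaviours it describes
(a start hook that returns early on an invalid plan, and a wrapper that
replans and retries until initialization succeeds) can be modelled in a few
lines. This is a toy sketch under stated assumptions, not the patch's code:
every Toy* type and toy_* function is hypothetical, and the
pending_invalidations counter merely simulates concurrent changes that
invalidate the plan while deferred locks are being taken.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CachedPlan. */
typedef struct ToyCachedPlan
{
    bool is_valid;
    int  generation;            /* bumped each time the plan is rebuilt */
    int  pending_invalidations; /* simulated invalidation events left */
} ToyCachedPlan;

/* Hypothetical stand-in for a QueryDesc. */
typedef struct ToyQueryDesc
{
    ToyCachedPlan *cplan;       /* NULL if the plan is not from a cache */
    bool           initialized;
    int            hook_work_done;
} ToyQueryDesc;

/* Models standard start: taking deferred locks may invalidate the plan. */
static void
toy_standard_start(ToyQueryDesc *qd)
{
    if (qd->cplan != NULL && qd->cplan->pending_invalidations > 0)
    {
        qd->cplan->pending_invalidations--;
        qd->cplan->is_valid = false;
        return;                 /* bail out, leaving qd uninitialized */
    }
    qd->initialized = true;
}

/* Models a start hook: do no extra work on top of an invalid plan. */
static void
toy_start_hook(ToyQueryDesc *qd)
{
    toy_standard_start(qd);
    if (qd->cplan != NULL && !qd->cplan->is_valid)
        return;
    qd->hook_work_done++;
}

/* Models replanning in place: same object, fresh contents. */
static void
toy_update_plan(ToyCachedPlan *cplan)
{
    cplan->generation++;
    cplan->is_valid = true;
}

/* Models the retry wrapper: loop until start succeeds with a valid plan. */
static void
toy_start_ext(ToyQueryDesc *qd)
{
    if (qd->cplan == NULL)
    {
        toy_start_hook(qd);
        return;
    }
    for (;;)
    {
        toy_start_hook(qd);
        if (qd->cplan->is_valid)
            break;
        qd->initialized = false;    /* discard partial state */
        toy_update_plan(qd->cplan);
    }
}
```

In the toy model the hook's own work runs exactly once however many retries
occur, which seems to be the property the required early-return block is
meant to preserve.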
contrib/auto_explain/auto_explain.c | 4 +
.../pg_stat_statements/pg_stat_statements.c | 4 +
src/backend/commands/explain.c | 8 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 10 +-
src/backend/commands/trigger.c | 14 +
src/backend/executor/README | 35 ++-
src/backend/executor/execMain.c | 103 ++++++-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/spi.c | 19 +-
src/backend/tcop/postgres.c | 4 +-
src/backend/tcop/pquery.c | 20 +-
src/backend/utils/cache/plancache.c | 125 +++++++-
src/backend/utils/mmgr/portalmem.c | 4 +-
src/include/commands/explain.h | 1 +
src/include/commands/trigger.h | 1 +
src/include/executor/executor.h | 16 +
src/include/nodes/execnodes.h | 1 +
src/include/utils/plancache.h | 25 ++
src/include/utils/portal.h | 4 +-
src/test/modules/delay_execution/Makefile | 3 +-
.../modules/delay_execution/delay_execution.c | 63 +++-
.../expected/cached-plan-inval.out | 282 ++++++++++++++++++
src/test/modules/delay_execution/meson.build | 1 +
.../specs/cached-plan-inval.spec | 80 +++++
25 files changed, 787 insertions(+), 42 deletions(-)
create mode 100644 src/test/modules/delay_execution/expected/cached-plan-inval.out
create mode 100644 src/test/modules/delay_execution/specs/cached-plan-inval.spec
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 677c135f59..9eb5e9a619 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -300,6 +300,10 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
if (auto_explain_enabled())
{
/*
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 21b26b7b6e..0bddcf8a48 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -997,6 +997,10 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
/*
* If query has queryId zero, don't track it. This prevents double
* counting of optimizable statements that are directly contained in
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index b699089bd8..07781ce915 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -515,7 +515,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, NULL, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, NULL, -1, into, es, queryString, params,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -624,6 +625,7 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
*/
void
ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int query_index,
IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
@@ -694,8 +696,8 @@ ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
if (into)
eflags |= GetIntoRelEFlags(into);
- /* call ExecutorStart to prepare the plan for execution */
- ExecutorStart(queryDesc, eflags);
+ /* Call ExecutorStartExt to prepare the plan for execution. */
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
/* Execute the plan for statistics if asked for */
if (es->analyze)
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index 4f6acf6719..4b1503c05e 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -107,6 +107,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ NULL,
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index 311b9ebd5b..4cd79a6e3a 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -202,7 +202,8 @@ ExecuteQuery(ParseState *pstate,
query_string,
entry->plansource->commandTag,
plan_list,
- cplan);
+ cplan,
+ entry->plansource);
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
@@ -583,6 +584,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
MemoryContextCounters mem_counters;
MemoryContext planner_ctx = NULL;
MemoryContext saved_ctx = NULL;
+ int i = 0;
if (es->memory)
{
@@ -655,8 +657,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, cplan, into, es, query_string, paramLI,
- queryEnv,
+ ExplainOnePlan(pstmt, cplan, entry->plansource, i,
+ into, es, query_string, paramLI, queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
@@ -668,6 +670,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Separate plans with an appropriate separator */
if (lnext(plan_list, p) != NULL)
ExplainSeparatePlans(es);
+
+ i++;
}
if (estate)
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 09356e46d1..79572ec8f1 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -5123,6 +5123,20 @@ AfterTriggerEndQuery(EState *estate)
afterTriggers.query_depth--;
}
+/* ----------
+ * AfterTriggerAbortQuery()
+ *
+ * Called by ExecutorEnd() if the query execution was aborted due to the
+ * plan becoming invalid during initialization.
+ * ----------
+ */
+void
+AfterTriggerAbortQuery(void)
+{
+ /* Revert the actions of AfterTriggerBeginQuery(). */
+ afterTriggers.query_depth--;
+}
+
/*
* AfterTriggerFreeQuery
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be61..c76a00b394 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -280,6 +280,28 @@ are typically reset to empty once per tuple. Per-tuple contexts are usually
associated with ExprContexts, and commonly each PlanState node has its own
ExprContext to evaluate its qual and targetlist expressions in.
+Relation Locking
+----------------
+
+Typically, when the executor initializes a plan tree for execution, it doesn't
+lock non-index relations if the plan tree is freshly generated and not derived
+from a CachedPlan. This is because such locks have already been established
+during the query's parsing, rewriting, and planning phases. However, with a
+cached plan tree, some relations may remain unlocked. The function
+AcquireExecutorLocks() only locks unprunable relations in the plan, deferring
+the locking of prunable ones to executor initialization. This avoids
+unnecessary locking of relations that will be pruned during "initial" runtime
+pruning in ExecDoInitialPruning().
+
+This approach creates a window where a cached plan tree with child tables
+could become outdated if another backend modifies these tables before
+ExecDoInitialPruning() locks them. As a result, the executor has the added duty
+to verify the plan tree's validity whenever it locks a child table after
+doing initial pruning. This validation is done by checking the CachedPlan.is_valid
+attribute. If the plan tree is outdated (is_valid=false), the executor halts
+further initialization, cleans up anything in EState that would have been
+allocated up to that point, and retries execution after recreating the
+invalid plan in the CachedPlan.
Query Processing Control Flow
-----------------------------
@@ -288,11 +310,13 @@ This is a sketch of control flow for full query processing:
CreateQueryDesc
- ExecutorStart
+ ExecutorStart or ExecutorStartExt
CreateExecutorState
creates per-query context
- switch to per-query context to run ExecInitNode
+ switch to per-query context to run ExecDoInitialPruning and ExecInitNode
AfterTriggerBeginQuery
+ ExecDoInitialPruning
+ does initial pruning and locks surviving partitions if needed
ExecInitNode --- recursively scans plan tree
ExecInitNode
recurse into subsidiary nodes
@@ -316,7 +340,12 @@ This is a sketch of control flow for full query processing:
FreeQueryDesc
-Per above comments, it's not really critical for ExecEndNode to free any
+As mentioned in the "Relation Locking" section, if the plan tree is found to
+be stale after locking partitions in ExecDoInitialPruning(), control is
+immediately returned to ExecutorStartExt(), which will create a new plan tree
+and perform the steps starting from CreateExecutorState() again.
+
+Per above comments, it's not really critical for ExecEndPlan to free any
memory; it'll all go away in FreeExecutorState anyway. However, we do need to
be careful to close relations, drop buffer pins, etc, so we do need to scan
the plan state tree to find these sorts of resources.
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index ed783236eb..5427bdfd4c 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -60,6 +60,7 @@
#include "utils/backend_status.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
+#include "utils/plancache.h"
#include "utils/rls.h"
#include "utils/snapmgr.h"
@@ -138,6 +139,63 @@ ExecutorStart(QueryDesc *queryDesc, int eflags)
standard_ExecutorStart(queryDesc, eflags);
}
+/*
+ * ExecutorStartExt
+ * Start query execution, replanning if the plan is invalidated due to
+ * locks taken during initialization, which can occur when the plan is
+ * from a CachedPlan.
+ *
+ * This function is a variant of ExecutorStart() that handles cases where
+ * the CachedPlan might become invalid during initialization, particularly
+ * when prunable relations are locked. If locks taken during ExecutorStart()
+ * invalidate the plan, the function calls UpdateCachedPlan() to replan all
+ * queries in the CachedPlan, including the query at query_index, and then
+ * retries initialization.
+ *
+ * The function repeats the process until ExecutorStart() successfully
+ * initializes the query at query_index with a valid plan. If invalidation
+ * occurs, the current execution state is cleaned up by calling ExecutorEnd(),
+ * and the plan is updated by UpdateCachedPlan(). The loop exits once the
+ * query is successfully initialized with a valid CachedPlan.
+ */
+void
+ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource,
+ int query_index)
+{
+ if (queryDesc->cplan == NULL)
+ {
+ ExecutorStart(queryDesc, eflags);
+ return;
+ }
+
+ /*
+ * For a CachedPlan, locks acquired during ExecutorStart() may invalidate it.
+ * Therefore, we must loop and retry with an updated plan until no further
+ * invalidation occurs.
+ */
+ while (1)
+ {
+ ExecutorStart(queryDesc, eflags);
+ if (!CachedPlanValid(queryDesc->cplan))
+ {
+ /*
+ * Clean up the current execution state before creating the new
+ * plan to retry ExecutorStart(). Mark execution as aborted to
+ * ensure that AFTER trigger state is properly reset.
+ */
+ queryDesc->estate->es_aborted = true;
+ ExecutorEnd(queryDesc);
+
+ queryDesc->plannedstmt = UpdateCachedPlan(plansource, query_index,
+ queryDesc->queryEnv);
+ }
+ else
+ /* Exit the loop if the plan is initialized successfully. */
+ break;
+ }
+}
+
void
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
@@ -321,6 +379,7 @@ standard_ExecutorRun(QueryDesc *queryDesc,
estate = queryDesc->estate;
Assert(estate != NULL);
+ Assert(!estate->es_aborted);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/* caller must ensure the query's snapshot is active */
@@ -427,8 +486,11 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
Assert(estate != NULL);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
- /* This should be run once and only once per Executor instance */
- Assert(!estate->es_finished);
+ /*
+ * This should be run once and only once per Executor instance and never
+ * if the execution was aborted.
+ */
+ Assert(!estate->es_finished && !estate->es_aborted);
/* Switch into per-query memory context */
oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -487,11 +549,10 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
Assert(estate != NULL);
/*
- * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This
- * Assert is needed because ExecutorFinish is new as of 9.1, and callers
- * might forget to call it.
+ * Check that ExecutorFinish was called, unless in EXPLAIN-only mode or if
+ * execution was aborted.
*/
- Assert(estate->es_finished ||
+ Assert(estate->es_finished || estate->es_aborted ||
(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/*
@@ -505,6 +566,14 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
UnregisterSnapshot(estate->es_snapshot);
UnregisterSnapshot(estate->es_crosscheck_snapshot);
+ /*
+ * Reset AFTER trigger module if the query execution was aborted.
+ */
+ if (estate->es_aborted &&
+ !(estate->es_top_eflags &
+ (EXEC_FLAG_SKIP_TRIGGERS | EXEC_FLAG_EXPLAIN_ONLY)))
+ AfterTriggerAbortQuery();
+
/*
* Must switch out of context before destroying it
*/
@@ -862,6 +931,7 @@ static void
ExecDoInitialPruning(EState *estate)
{
ListCell *lc;
+ List *locked_relids = NIL;
foreach(lc, estate->es_part_prune_infos)
{
@@ -897,6 +967,7 @@ ExecDoInitialPruning(EState *estate)
Assert(rte->rtekind == RTE_RELATION &&
rte->rellockmode != NoLock);
LockRelationOid(rte->relid, rte->rellockmode);
+ locked_relids = lappend_int(locked_relids, rtindex);
}
}
estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
@@ -906,6 +977,20 @@ ExecDoInitialPruning(EState *estate)
estate->es_part_prune_results = lappend(estate->es_part_prune_results,
validsubplans);
}
+
+ /*
+ * Release the useless locks if the plan won't be executed. This is the
+ * same as what CheckCachedPlan() in plancache.c does.
+ */
+ if (!ExecPlanStillValid(estate))
+ {
+ foreach(lc, locked_relids)
+ {
+ RangeTblEntry *rte = exec_rt_fetch(lfirst_int(lc), estate);
+
+ UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
}
/*
@@ -969,6 +1054,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
ExecDoInitialPruning(estate);
+ if (!ExecPlanStillValid(estate))
+ return;
+
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
*/
@@ -2961,6 +3049,9 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
* the snapshot, rangetable, and external Param info. They need their own
* copies of local state, including a tuple table, es_param_exec_vals,
* result-rel info, etc.
+ *
+ * es_cachedplan is not copied because EPQ plan execution does not acquire
+ * any new locks that could invalidate the CachedPlan.
*/
rcestate->es_direction = ForwardScanDirection;
rcestate->es_snapshot = parentestate->es_snapshot;
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index bc905a0cdc..b7c914d66c 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -147,6 +147,7 @@ CreateExecutorState(void)
estate->es_top_eflags = 0;
estate->es_instrument = 0;
estate->es_finished = false;
+ estate->es_aborted = false;
estate->es_exprcontexts = NIL;
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index e2b781e939..70ab0ece1d 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -70,7 +70,8 @@ static int _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
-static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount);
+static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index);
static void _SPI_error_callback(void *arg);
@@ -1685,7 +1686,8 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
- cplan);
+ cplan,
+ plansource);
/*
* Set up options for portal. Default SCROLL type is chosen the same way
@@ -2500,6 +2502,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
ListCell *lc2;
+ int i = 0;
spicallbackarg.query = plansource->query_string;
@@ -2697,8 +2700,9 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
options->params,
_SPI_current->queryEnv,
0);
- res = _SPI_pquery(qdesc, fire_triggers,
- canSetTag ? options->tcount : 0);
+
+ res = _SPI_pquery(qdesc, fire_triggers, canSetTag ? options->tcount : 0,
+ plansource, i);
FreeQueryDesc(qdesc);
}
else
@@ -2795,6 +2799,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
my_res = res;
goto fail;
}
+
+ i++;
}
/* Done with this plan, so release refcount */
@@ -2872,7 +2878,8 @@ _SPI_convert_params(int nargs, Oid *argtypes,
}
static int
-_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
+_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index)
{
int operation = queryDesc->operation;
int eflags;
@@ -2928,7 +2935,7 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
else
eflags = EXEC_FLAG_SKIP_TRIGGERS;
- ExecutorStart(queryDesc, eflags);
+ ExecutorStartExt(queryDesc, eflags, plansource, query_index);
ExecutorRun(queryDesc, ForwardScanDirection, tcount, true);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 7f5eada9d4..3b98248ad4 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1237,6 +1237,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ NULL,
NULL);
/*
@@ -2039,7 +2040,8 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
- cplan);
+ cplan,
+ psrc);
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 6e8f6b1b8f..ee5eea4ce1 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -19,6 +19,7 @@
#include "access/xact.h"
#include "commands/prepare.h"
+#include "executor/execdesc.h"
#include "executor/tstoreReceiver.h"
#include "miscadmin.h"
#include "pg_trace.h"
@@ -37,6 +38,8 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -126,6 +129,8 @@ FreeQueryDesc(QueryDesc *qdesc)
*
* plan: the plan tree for the query
* cplan: CachedPlan supplying the plan
+ * plansource: CachedPlanSource supplying the cplan
+ * query_index: index of the query in plansource->query_list
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -139,6 +144,8 @@ FreeQueryDesc(QueryDesc *qdesc)
static void
ProcessQuery(PlannedStmt *plan,
CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -157,7 +164,7 @@ ProcessQuery(PlannedStmt *plan,
/*
* Call ExecutorStart to prepare the plan for execution
*/
- ExecutorStart(queryDesc, 0);
+ ExecutorStartExt(queryDesc, 0, plansource, query_index);
/*
* Run the plan to completion.
@@ -518,9 +525,9 @@ PortalStart(Portal portal, ParamListInfo params,
myeflags = eflags;
/*
- * Call ExecutorStart to prepare the plan for execution
+ * Call ExecutorStartExt() to prepare the plan for execution.
*/
- ExecutorStart(queryDesc, myeflags);
+ ExecutorStartExt(queryDesc, myeflags, portal->plansource, 0);
/*
* This tells PortalCleanup to shut down the executor
@@ -1201,6 +1208,7 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
+ int i = 0;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1283,6 +1291,8 @@ PortalRunMulti(Portal portal,
/* statement can set tag string */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1293,6 +1303,8 @@ PortalRunMulti(Portal portal,
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
portal->cplan,
+ portal->plansource,
+ i,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1357,6 +1369,8 @@ PortalRunMulti(Portal portal,
*/
if (lnext(portal->stmts, stmtlist_item) != NULL)
CommandCounterIncrement();
+
+ i++;
}
/* Pop the snapshot if we pushed one. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index 449fb8f4e2..d3e78afd97 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -101,7 +101,8 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ bool release_generic);
static bool CheckCachedPlan(CachedPlanSource *plansource);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv,
@@ -579,10 +580,17 @@ ReleaseGenericPlan(CachedPlanSource *plansource)
* The result value is the transient analyzed-and-rewritten query tree if we
* had to do re-analysis, and NIL otherwise. (This is returned just to save
* a tree copying step in a subsequent BuildCachedPlan call.)
+ *
+ * This also releases and drops the generic plan (plansource->gplan), if any,
+ * as most callers will typically build a new CachedPlan for the plansource
+ * right after this. However, when called from UpdateCachedPlan(), the
+ * function does not release the generic plan, as UpdateCachedPlan() updates
+ * an existing CachedPlan in place.
*/
static List *
RevalidateCachedQuery(CachedPlanSource *plansource,
- QueryEnvironment *queryEnv)
+ QueryEnvironment *queryEnv,
+ bool release_generic)
{
bool snapshot_set;
RawStmt *rawtree;
@@ -679,8 +687,9 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
MemoryContextDelete(qcxt);
}
- /* Drop the generic plan reference if any */
- ReleaseGenericPlan(plansource);
+ /* Drop the generic plan reference, if any, and if requested */
+ if (release_generic)
+ ReleaseGenericPlan(plansource);
/*
* Now re-do parse analysis and rewrite. This not incidentally acquires
@@ -905,6 +914,8 @@ CheckCachedPlan(CachedPlanSource *plansource)
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * Note: When changing this, you should also look at UpdateCachedPlan().
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
@@ -933,7 +944,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
* let's treat it as real and redo the RevalidateCachedQuery call.
*/
if (!plansource->is_valid)
- qlist = RevalidateCachedQuery(plansource, queryEnv);
+ qlist = RevalidateCachedQuery(plansource, queryEnv, true);
/*
* If we don't already have a copy of the querytree list that can be
@@ -1188,7 +1199,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
elog(ERROR, "cannot apply ResourceOwner to non-saved cached plan");
/* Make sure the querytree list is valid and we have parse-time locks */
- qlist = RevalidateCachedQuery(plansource, queryEnv);
+ qlist = RevalidateCachedQuery(plansource, queryEnv, true);
/* Decide whether to use a custom plan */
customplan = choose_custom_plan(plansource, boundParams);
@@ -1284,6 +1295,106 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
return plan;
}
+/*
+ * UpdateCachedPlan
+ * Create fresh plans for all the queries in the plansource, replacing
+ * those in the generic plan's stmt_list, and return the plan for the
+ * query_index'th query.
+ *
+ * This function is primarily intended for ExecutorStartExt(), which handles
+ * cases where the original generic CachedPlan becomes invalid when prunable
+ * relations in the old plan for the query_index'th query are locked for
+ * execution.
+ *
+ * Note that even though this function is called due to invalidations received
+ * during the execution of the query_index'th query, they might affect both
+ * queries that have already finished execution (e.g., due to concurrent
+ * modifications on prunable relations that were not locked during their
+ * execution) and those that have not yet executed. Therefore, we must update
+ * all plans to safely set CachedPlan.is_valid to true.
+ */
+
+PlannedStmt *
+UpdateCachedPlan(CachedPlanSource *plansource, int query_index,
+ QueryEnvironment *queryEnv)
+{
+ List *query_list = plansource->query_list,
+ *plan_list;
+ ListCell *l1,
+ *l2;
+ CachedPlan *plan = plansource->gplan;
+ MemoryContext oldcxt;
+
+ Assert(ActiveSnapshotSet());
+
+ /* Sanity checks */
+ if (plan == NULL)
+ elog(ERROR, "UpdateCachedPlan() called in the wrong context: plansource->gplan is NULL");
+ else if (plan->is_valid)
+ elog(ERROR, "UpdateCachedPlan() called in the wrong context: plansource->gplan->is_valid is true");
+
+ /*
+ * The plansource might have become invalid since GetCachedPlan(). See the
+ * comment in BuildCachedPlan() for details on why this might happen.
+ *
+ * The risk of invalidation is higher here than when BuildCachedPlan()
+ * is called from GetCachedPlan(), because this function is called
+ * within the executor, where much more processing could have occurred
+ * since GetCachedPlan() initially returned the CachedPlan.
+ *
+ * Although invalidation is likely a false positive, we make the
+ * plan valid to ensure the query list used for planning is up to date.
+ *
+ * However, plansource->gplan must not be released, as the upstream
+ * callers (such as the callers of ExecutorStartExt()) still reference it.
+ * The freshly created plans will replace any potentially invalid ones in
+ * plansource->gplan->stmt_list.
+ */
+ if (!plansource->is_valid)
+ query_list = RevalidateCachedQuery(plansource, queryEnv, false);
+ Assert(query_list != NIL);
+
+ /*
+ * Build a new generic plan for all the queries after making a copy
+ * to be scribbled on by the planner.
+ */
+ query_list = copyObject(query_list);
+
+ /*
+ * Planning work is done in the caller's memory context. The resulting
+ * PlannedStmt is then copied into plan->context.
+ */
+ plan_list = pg_plan_queries(query_list, plansource->query_string,
+ plansource->cursor_options, NULL);
+ Assert(list_length(plan_list) == list_length(plan->stmt_list));
+
+ oldcxt = MemoryContextSwitchTo(plan->context);
+ forboth (l1, plan_list, l2, plan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst(l1);
+
+ lfirst(l2) = copyObject(plannedstmt);
+ }
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * XXX Should this also (re)set the properties of the CachedPlan that are
+ * set in BuildCachedPlan() after creating the fresh plans such as
+ * planRoleId, dependsOnRole, and save_xmin?
+ */
+
+ /*
+ * We've updated all the plans that might have been invalidated, so mark
+ * the CachedPlan as valid.
+ */
+ plan->is_valid = true;
+
+ /* Also update generic_cost because we just created a new generic plan. */
+ plansource->generic_cost = cached_plan_cost(plan, false);
+
+ return list_nth_node(PlannedStmt, plan->stmt_list, query_index);
+}
+
/*
* ReleaseCachedPlan: release active use of a cached plan.
*
@@ -1662,7 +1773,7 @@ CachedPlanGetTargetList(CachedPlanSource *plansource,
return NIL;
/* Make sure the querytree list is valid and we have parse-time locks */
- RevalidateCachedQuery(plansource, queryEnv);
+ RevalidateCachedQuery(plansource, queryEnv, true);
/* Get the primary statement and find out what it returns */
pstmt = QueryListGetPrimaryStmt(plansource->query_list);
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 93137820ac..ef4791bf65 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,7 +284,8 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan)
+ CachedPlan *cplan,
+ CachedPlanSource *plansource)
{
Assert(PortalIsValid(portal));
Assert(portal->status == PORTAL_NEW);
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->commandTag = commandTag;
portal->stmts = stmts;
portal->cplan = cplan;
+ portal->plansource = plansource;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index 21c71e0d53..a39989a950 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -104,6 +104,7 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ParamListInfo params, QueryEnvironment *queryEnv);
extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int plan_index,
IntoClause *into, ExplainState *es,
const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index 8a5a9fe642..db21561c8c 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -258,6 +258,7 @@ extern void ExecASTruncateTriggers(EState *estate,
extern void AfterTriggerBeginXact(void);
extern void AfterTriggerBeginQuery(void);
extern void AfterTriggerEndQuery(EState *estate);
+extern void AfterTriggerAbortQuery(void);
extern void AfterTriggerFireDeferred(void);
extern void AfterTriggerEndXact(bool isCommit);
extern void AfterTriggerBeginSubXact(void);
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 69c3ebff00..1270af3be5 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -19,6 +19,7 @@
#include "nodes/lockoptions.h"
#include "nodes/parsenodes.h"
#include "utils/memutils.h"
+#include "utils/plancache.h"
/*
@@ -198,6 +199,8 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
* prototypes from functions in execMain.c
*/
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern void ExecutorStartExt(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource, int query_index);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count, bool execute_once);
@@ -261,6 +264,19 @@ extern void ExecEndNode(PlanState *node);
extern void ExecShutdownNode(PlanState *node);
extern void ExecSetTupleBound(int64 tuples_needed, PlanState *child_node);
+/*
+ * Is the CachedPlan in es_cachedplan still valid?
+ *
+ * Called from InitPlan() because invalidation messages that affect the plan
+ * might be received after locks have been taken on runtime-prunable relations.
+ * The caller should take appropriate action if the plan has become invalid.
+ */
+static inline bool
+ExecPlanStillValid(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? true :
+ CachedPlanValid(estate->es_cachedplan);
+}
/* ----------------------------------------------------------------
* ExecProcNode
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index ac9be82e19..1ec1021808 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -693,6 +693,7 @@ typedef struct EState
int es_top_eflags; /* eflags passed to ExecutorStart */
int es_instrument; /* OR of InstrumentOption flags */
bool es_finished; /* true when ExecutorFinish is done */
+ bool es_aborted; /* true when execution was aborted */
List *es_exprcontexts; /* List of ExprContexts within EState */
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index e227c4f11b..7b2f3ced26 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -18,6 +18,8 @@
#include "access/tupdesc.h"
#include "lib/ilist.h"
#include "nodes/params.h"
+#include "nodes/parsenodes.h"
+#include "nodes/plannodes.h"
#include "tcop/cmdtag.h"
#include "utils/queryenvironment.h"
#include "utils/resowner.h"
@@ -159,6 +161,12 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+
+ /*
+ * If the plan is not associated with a CachedPlanSource, it is saved in
+ * a separate global list.
+ */
+ dlist_node node; /* list link, if is_standalone */
} CachedPlan;
/*
@@ -224,6 +232,10 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
QueryEnvironment *queryEnv);
+extern PlannedStmt *UpdateCachedPlan(CachedPlanSource *plansource,
+ int query_index,
+ QueryEnvironment *queryEnv);
+
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
@@ -253,4 +265,17 @@ CachedPlanRequiresLocking(CachedPlan *cplan)
return !cplan->is_oneshot && cplan->is_generic;
}
+/*
+ * CachedPlanValid
+ * Returns whether a cached generic plan is still valid.
+ *
+ * Invoked by the executor to check if the plan has not been invalidated after
+ * taking locks during the initialization of the plan.
+ */
+static inline bool
+CachedPlanValid(CachedPlan *cplan)
+{
+ return cplan->is_valid;
+}
+
#endif /* PLANCACHE_H */
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 29f49829f2..58c3828d2c 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ CachedPlanSource *plansource; /* CachedPlanSource, for cplan */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -241,7 +242,8 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan);
+ CachedPlan *cplan,
+ CachedPlanSource *plansource);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
diff --git a/src/test/modules/delay_execution/Makefile b/src/test/modules/delay_execution/Makefile
index 70f24e846d..3eeb097fde 100644
--- a/src/test/modules/delay_execution/Makefile
+++ b/src/test/modules/delay_execution/Makefile
@@ -8,7 +8,8 @@ OBJS = \
delay_execution.o
ISOLATION = partition-addition \
- partition-removal-1
+ partition-removal-1 \
+ cached-plan-inval
ifdef USE_PGXS
PG_CONFIG = pg_config
diff --git a/src/test/modules/delay_execution/delay_execution.c b/src/test/modules/delay_execution/delay_execution.c
index 155c8a8d55..304ca77f7b 100644
--- a/src/test/modules/delay_execution/delay_execution.c
+++ b/src/test/modules/delay_execution/delay_execution.c
@@ -1,14 +1,18 @@
/*-------------------------------------------------------------------------
*
* delay_execution.c
- * Test module to allow delay between parsing and execution of a query.
+ * Test module to introduce delay at various points during execution of a
+ * query to test that execution proceeds safely in light of concurrent
+ * changes.
*
* The delay is implemented by taking and immediately releasing a specified
* advisory lock. If another process has previously taken that lock, the
* current process will be blocked until the lock is released; otherwise,
* there's no effect. This allows an isolationtester script to reliably
- * test behaviors where some specified action happens in another backend
- * between parsing and execution of any desired query.
+ * test behaviors where some specified action happens in another backend in
+ * a couple of cases: 1) between parsing and execution of any desired query
+ * when using the planner_hook, 2) between RevalidateCachedQuery() and
+ * ExecutorStart() when using the ExecutorStart_hook.
*
* Copyright (c) 2020-2024, PostgreSQL Global Development Group
*
@@ -22,6 +26,7 @@
#include <limits.h>
+#include "executor/executor.h"
#include "optimizer/planner.h"
#include "utils/builtins.h"
#include "utils/guc.h"
@@ -32,9 +37,11 @@ PG_MODULE_MAGIC;
/* GUC: advisory lock ID to use. Zero disables the feature. */
static int post_planning_lock_id = 0;
+static int executor_start_lock_id = 0;
-/* Save previous planner hook user to be a good citizen */
+/* Save previous hook users to be a good citizen */
static planner_hook_type prev_planner_hook = NULL;
+static ExecutorStart_hook_type prev_ExecutorStart_hook = NULL;
/* planner_hook function to provide the desired delay */
@@ -70,11 +77,41 @@ delay_execution_planner(Query *parse, const char *query_string,
return result;
}
+/* ExecutorStart_hook function to provide the desired delay */
+static void
+delay_execution_ExecutorStart(QueryDesc *queryDesc, int eflags)
+{
+ /* If enabled, delay by taking and releasing the specified lock */
+ if (executor_start_lock_id != 0)
+ {
+ DirectFunctionCall1(pg_advisory_lock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+ DirectFunctionCall1(pg_advisory_unlock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+
+ /*
+ * Ensure that we notice any pending invalidations, since the advisory
+ * lock functions don't do this.
+ */
+ AcceptInvalidationMessages();
+ }
+
+ /* Now start the executor, possibly via a previous hook user */
+ if (prev_ExecutorStart_hook)
+ prev_ExecutorStart_hook(queryDesc, eflags);
+ else
+ standard_ExecutorStart(queryDesc, eflags);
+
+ if (executor_start_lock_id != 0)
+ elog(NOTICE, "Finished ExecutorStart(): CachedPlan is %s",
+ CachedPlanValid(queryDesc->cplan) ? "valid" : "not valid");
+}
+
/* Module load function */
void
_PG_init(void)
{
- /* Set up the GUC to control which lock is used */
+ /* Set up GUCs to control which lock is used */
DefineCustomIntVariable("delay_execution.post_planning_lock_id",
"Sets the advisory lock ID to be locked/unlocked after planning.",
"Zero disables the delay.",
@@ -86,10 +123,22 @@ _PG_init(void)
NULL,
NULL,
NULL);
-
+ DefineCustomIntVariable("delay_execution.executor_start_lock_id",
+ "Sets the advisory lock ID to be locked/unlocked before starting execution.",
+ "Zero disables the delay.",
+ &executor_start_lock_id,
+ 0,
+ 0, INT_MAX,
+ PGC_USERSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
MarkGUCPrefixReserved("delay_execution");
- /* Install our hook */
+ /* Install our hooks. */
prev_planner_hook = planner_hook;
planner_hook = delay_execution_planner;
+ prev_ExecutorStart_hook = ExecutorStart_hook;
+ ExecutorStart_hook = delay_execution_ExecutorStart;
}
diff --git a/src/test/modules/delay_execution/expected/cached-plan-inval.out b/src/test/modules/delay_execution/expected/cached-plan-inval.out
new file mode 100644
index 0000000000..5bfb2b33b3
--- /dev/null
+++ b/src/test/modules/delay_execution/expected/cached-plan-inval.out
@@ -0,0 +1,282 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1prep s2lock s1exec s2dropi s2unlock
+step s1prep: SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1);
+QUERY PLAN
+------------------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = $1)
+(7 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+-------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Filter: (a = $1)
+(5 rows)
+
+
+starting permutation: s1prep2 s2lock s1exec2 s2dropi s2unlock
+step s1prep2: SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(11 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec2: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec2: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(6 rows)
+
+
+starting permutation: s1prep3 s2lock s1exec3 s2dropi s2unlock
+step s1prep3: SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------------------
+Nested Loop
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+ -> Materialize
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on bar1 bar_1
+ Recheck Cond: (a = one())
+ -> Bitmap Index Scan on bar1_a_idx
+ Index Cond: (a = one())
+
+Update on bar
+ Update on bar1 bar_1
+ -> Nested Loop
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+ -> Materialize
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on bar1 bar_1
+ Recheck Cond: (a = one())
+ -> Bitmap Index Scan on bar1_a_idx
+ Index Cond: (a = one())
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(56 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec3: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec3: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+-------------------------------------------------------------
+Nested Loop
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+ -> Materialize
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on bar1 bar_1
+ Recheck Cond: (a = one())
+ -> Bitmap Index Scan on bar1_a_idx
+ Index Cond: (a = one())
+
+Update on bar
+ Update on bar1 bar_1
+ -> Nested Loop
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+ -> Materialize
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on bar1 bar_1
+ Recheck Cond: (a = one())
+ -> Bitmap Index Scan on bar1_a_idx
+ Index Cond: (a = one())
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(41 rows)
+
+
+starting permutation: s1prep4 s2lock s1exec4 s2dropi s2unlock
+step s1prep4: SET plan_cache_mode = force_generic_plan;
+ SET enable_seqscan TO off;
+ PREPARE q4 AS SELECT * FROM generate_series(1, 1) WHERE EXISTS (SELECT * FROM foov WHERE a = $1 FOR UPDATE);
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1);
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+---------------------------------------------------------------
+Result
+ One-Time Filter: (InitPlan 1).col1
+ InitPlan 1
+ -> LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Index Scan using foo12_1_a on foo12_1 foo_1
+ Index Cond: (a = $1)
+ -> Function Scan on generate_series
+(9 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec4: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec4: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+---------------------------------------------
+Result
+ One-Time Filter: (InitPlan 1).col1
+ InitPlan 1
+ -> LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Disabled: true
+ Filter: (a = $1)
+ -> Function Scan on generate_series
+(10 rows)
+
diff --git a/src/test/modules/delay_execution/meson.build b/src/test/modules/delay_execution/meson.build
index 41f3ac0b89..5a70b183d0 100644
--- a/src/test/modules/delay_execution/meson.build
+++ b/src/test/modules/delay_execution/meson.build
@@ -24,6 +24,7 @@ tests += {
'specs': [
'partition-addition',
'partition-removal-1',
+ 'cached-plan-inval',
],
},
}
diff --git a/src/test/modules/delay_execution/specs/cached-plan-inval.spec b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
new file mode 100644
index 0000000000..f27e8fb521
--- /dev/null
+++ b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
@@ -0,0 +1,80 @@
+# Test to check that invalidation of cached generic plans during ExecutorStart
+# correctly triggers replanning and re-execution.
+
+setup
+{
+ CREATE TABLE foo (a int, b text) PARTITION BY LIST(a);
+ CREATE TABLE foo12 PARTITION OF foo FOR VALUES IN (1, 2) PARTITION BY LIST (a);
+ CREATE TABLE foo12_1 PARTITION OF foo12 FOR VALUES IN (1);
+ CREATE TABLE foo12_2 PARTITION OF foo12 FOR VALUES IN (2);
+ CREATE INDEX foo12_1_a ON foo12_1 (a);
+ CREATE TABLE foo3 PARTITION OF foo FOR VALUES IN (3);
+ CREATE VIEW foov AS SELECT * FROM foo;
+ CREATE FUNCTION one () RETURNS int AS $$ BEGIN RETURN 1; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE FUNCTION two () RETURNS int AS $$ BEGIN RETURN 2; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE TABLE bar (a int, b text) PARTITION BY LIST(a);
+ CREATE TABLE bar1 PARTITION OF bar FOR VALUES IN (1);
+ CREATE INDEX ON bar1(a);
+ CREATE TABLE bar2 PARTITION OF bar FOR VALUES IN (2);
+ CREATE RULE update_foo AS ON UPDATE TO foo DO ALSO UPDATE bar SET a = a WHERE a = one();
+ CREATE RULE update_bar AS ON UPDATE TO bar DO ALSO SELECT 1;
+}
+
+teardown
+{
+ DROP VIEW foov;
+ DROP RULE update_foo ON foo;
+ DROP TABLE foo, bar;
+ DROP FUNCTION one(), two();
+}
+
+session "s1"
+# Append with run-time pruning
+step "s1prep" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+
+# Another case with Append with run-time pruning
+step "s1prep2" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+
+# Case with a rule adding another query
+step "s1prep3" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+# Another case with Append with run-time pruning in a subquery
+step "s1prep4" { SET plan_cache_mode = force_generic_plan;
+ SET enable_seqscan TO off;
+ PREPARE q4 AS SELECT * FROM generate_series(1, 1) WHERE EXISTS (SELECT * FROM foov WHERE a = $1 FOR UPDATE);
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); }
+
+# Executes a generic plan
+step "s1exec" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+step "s1exec2" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+step "s1exec3" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+step "s1exec4" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); }
+
+session "s2"
+step "s2lock" { SELECT pg_advisory_lock(12345); }
+step "s2unlock" { SELECT pg_advisory_unlock(12345); }
+step "s2dropi" { DROP INDEX foo12_1_a; }
+
+# While "s1exec", etc. wait to acquire the advisory lock, "s2dropi" is able to
+# drop the index being used in the cached plan. When "s1exec" is then
+# unblocked and initializes the cached plan for execution, it detects the
+# concurrent index drop and causes the cached plan to be discarded and
+# recreated without the index.
+permutation "s1prep" "s2lock" "s1exec" "s2dropi" "s2unlock"
+permutation "s1prep2" "s2lock" "s1exec2" "s2dropi" "s2unlock"
+permutation "s1prep3" "s2lock" "s1exec3" "s2dropi" "s2unlock"
+permutation "s1prep4" "s2lock" "s1exec4" "s2dropi" "s2unlock"
--
2.43.0
* Re: generic plans and "initial" pruning
From: Tomas Vondra @ 2024-12-01 18:36 UTC (permalink / raw)
To: Amit Langote <[email protected]>; Robert Haas <[email protected]>; +Cc: Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
Hi,
I took a look at this patch, mostly to familiarize myself with the
pruning etc. I have a bunch of comments, but all of them are minor,
perhaps even nitpicking. After the prior feedback from David, Tom and
Robert, I can't really compete with that.
FWIW the patch needs a rebase; there's some minor bitrot, but it was
simple enough to fix for review / testing.
0001
----
1) But if we don't expect this error to actually happen, do we really
need to make it ereport()? Maybe it should be plain elog(). I mean, it's
"can't happen" and thus doesn't need translations etc.
    if (!bms_equal(relids, pruneinfo->relids))
        ereport(ERROR,
                errcode(ERRCODE_INTERNAL_ERROR),
                errmsg_internal("mismatching PartitionPruneInfo found at part_prune_index %d",
                                part_prune_index),
                errdetail_internal("plan node relids %s, pruneinfo relids %s",
                                   bmsToString(relids),
                                   bmsToString(pruneinfo->relids)));
Perhaps it should even be an assert?
2) unnecessary newline added to execPartition.h
3) this comment in EState doesn't seem very helpful
List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
5) PlannerGlobal
/* List of PartitionPruneInfo contained in the plan */
List *partPruneInfos;
Why does this say "contained in the plan" unlike the other fields? Is
there some sort of difference? I'm not saying it's wrong.
0002
----
1) Isn't it weird/undesirable partkey_datum_from_expr() loses some of
the asserts? Would the assert be incorrect in the new implementation, or
are we removing it simply because we happen to not have one of the fields?
2) inconsistent spelling: run-time vs. runtime
3) PartitionPruneContext.is_valid - I think I'd rename the flag to
"initialized" or something like that. The "is_valid" is a bit confusing,
because it might seem the context can get invalidated later, but AFAICS
that's not the case - we just initialize it lazily.
0003
----
1) In InitPlan I'd move
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
before the comment, which is more about ExecDoInitialPruning.
2) I'm not quite sure what "exec" partition pruning is?
/*
* ExecInitPartitionPruning
* Initialize the data structures needed for runtime "exec" partition
* pruning and return the result of initial pruning, if available.
Is that the same thing as "runtime pruning"?
0004
----
1) typo: paraller/parallel
2) What about adding an assert to ExecFindMatchingSubPlans, to check
validsubplan_rtis is not NULL? It's just mentioned in a comment, but
better to explicitly enforce that?
2) It may not be quite clear why ExecInitUpdateProjection() switches to
mt_updateColnosLists. Should that be explained in a comment, somewhere?
3) unnecessary newline in ExecLookupResultRelByOid
0005
----
1) auto_explain.c - So what happens if the plan gets invalidated? The
hook explain_ExecutorStart returns early, but then what? Does that break
the user session somehow, or what?
2) Isn't it a bit fragile if this requires every extension to update
and add the ExecPlanStillValid() calls to various places? What if an
extension doesn't do that? What weirdness will happen? Maybe it'd be
possible to at least check this in some other executor hook? Or at least
we could ensure the check was done in assert-enabled builds? Or
something to make extension authors aware of this?
Aside from going through the patches, I did a simple benchmark to see
how this works in practice. I did a simple test, with pgbench -S and
variable number of partitions/clients. I also varied the number of locks
per transaction, because I was wondering if it may interact with the
fast-path improvements. See the attached xeon.sh script and CSV with
results from the 44/88-core machine.
There are also two PDFs visualizing the results, to show the impact as a
difference between "master" (no patches) vs. "pruning" build with v57
applied. As usual, "green" is good (faster), red is "bad" (slower).
For most combinations of parameters, there's no impact on throughput.
Anything in 99-101% is just regular noise, possibly even more. I'm
trying to reduce the noise a bit more, but this seems acceptable. I'd
like to discuss three "cases" I see in the results:
1) bad #1
IIRC the patch should not affect results for "force_custom_plan" cache
mode (and "auto", which does mostly the same thing, I think). And for
most runs that's true, with results ~100% of master. But there's a
couple curious exceptions - e.g. results for 0 partitions and 16 locks
show a consistent regression of ~10% (in the "-M prepared" mode).
I'm not terribly worried about this because it only shows for 16 locks,
and the default is 64. If someone reduces this GUC value, they should
expect some impact.
Still, it only shows in the "auto" case. I wonder why is that. Strange.
2) bad #2
There's also a similar regression in the "force_generic_plan" without
partitions (with "-M prepared"). This seems more consistent and affects
all the lock counts.
3) good
There's an area of massive improvements (in the 2-200x range) with 100+
partitions. The fast-path patch helped a bit, but this is much better,
of course.
costing / auto mode
-------------------
Anyway, this leads me to a related question - not quite a "bug" in the
patch, but something to perhaps think about. And that's costing, and
what "auto" should do.
There are two PNG charts, showing throughput for runs with -M prepared
and 1000 partitions. Each chart shows throughput for the three cache
modes, and different client counts. There's a clear distinction between
"master" and "patched" runs - the "generic" plans performed terribly, by
orders of magnitude. With the patches it beats the "custom" plans.
Which is great! But it also means that while "auto" used to do the right
thing, with the patches that's not the case.
AFAIK that's because we don't consider the runtime pruning when costing
the plans, so the cost is calculated as if no pruning happened. And so
it seems way more expensive than it should ... and it loses with the
custom scans. Is that correct, or do I understand this wrong?
Just to be clear, I'm not claiming the patch has to deal with this. I
suppose it can be handled as a future improvement, and I'm not even sure
there's a good way to consider this during costing. For example, can we
estimate how many partitions will be pruned?
regards
--
Tomas Vondra
Attachments:
[application/pdf] xeon-complete.pdf (67.5K, 2-xeon-complete.pdf)
[application/pdf] xeon-prepared.pdf (34.8K, 3-xeon-prepared.pdf)
[application/gzip] xeon.csv.gz (152.7K, 4-xeon.csv.gz)
[application/x-shellscript] xeon.sh (1.7K, 5-xeon.sh)
[image/png] master-prepared-1000-partitions.png (18.5K, 6-master-prepared-1000-partitions.png)
[image/png] patched-prepared-1000-partitions.png (23.5K, 7-patched-prepared-1000-partitions.png)
* Re: generic plans and "initial" pruning
From: Amit Langote @ 2024-12-04 13:34 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
Hi Tomas,
On Mon, Dec 2, 2024 at 3:36 AM Tomas Vondra <[email protected]> wrote:
> Hi,
>
> I took a look at this patch, mostly to familiarize myself with the
> pruning etc. I have a bunch of comments, but all of that is minor,
> perhaps even nitpicking - with prior feedback from David, Tom and
> Robert, I can't really compete with that.
Thanks for looking at this. These are helpful.
> FWIW the patch needs a rebase, there's some minor bitrot - but it was
> simple enough to fix for review / testing.
>
>
> 0001
> ----
>
> 1) But if we don't expect this error to actually happen, do we really
> need to make it ereport()? Maybe it should be plain elog(). I mean, it's
> "can't happen" and thus doesn't need translations etc.
>
>     if (!bms_equal(relids, pruneinfo->relids))
>         ereport(ERROR,
>                 errcode(ERRCODE_INTERNAL_ERROR),
>                 errmsg_internal("mismatching PartitionPruneInfo found at part_prune_index %d",
>                                 part_prune_index),
>                 errdetail_internal("plan node relids %s, pruneinfo relids %s",
>                                    bmsToString(relids),
>                                    bmsToString(pruneinfo->relids)));
I'm fine with elog() here even if it causes the message to be longer:
    elog(ERROR, "mismatching PartitionPruneInfo found at part_prune_index %d (plan node relids %s, pruneinfo relids %s)",
         part_prune_index, bmsToString(relids), bmsToString(pruneinfo->relids));
> Perhaps it should even be an assert?
I am not sure about that. Having a message handy might be good if a
user ends up hitting this case for whatever reason, like trying to run
a corrupted plan.
> 2) unnecessary newline added to execPartition.h
Perhaps you meant "removed". Fixed.
> 3) this comment in EState doesn't seem very helpful
>
> List *es_part_prune_infos; /* PlannedStmt.partPruneInfos */
Agreed, fixed to be like the comment for es_rteperminfos:
List *es_part_prune_infos; /* List of PartitionPruneInfo */
> 5) PlannerGlobal
>
> /* List of PartitionPruneInfo contained in the plan */
> List *partPruneInfos;
>
> Why does this say "contained in the plan" unlike the other fields? Is
> there some sort of difference? I'm not saying it's wrong.
Ok, maybe the following is a bit more helpful and like the comment for
other fields:
/* "flat" list of PartitionPruneInfos */
List *partPruneInfos;
> 0002
> ----
>
> 1) Isn't it weird/undesirable partkey_datum_from_expr() loses some of
> the asserts? Would the assert be incorrect in the new implementation, or
> are we removing it simply because we happen to not have one of the fields?
The former -- the asserts would be incorrect in the new implementation
-- because in the new implementation a standalone ExprContext is used
that is independent of the parent PlanState (when available) for both
types of runtime pruning.
The old asserts, particularly the second one, weren't asserting
something very useful anyway, IMO. What I mean is that whether the
ExprContext provided in the PartitionPruneContext is the same as the
parent PlanState's ps_ExprContext isn't critical to the code that
follows. Nor is whether the PlanState is available or not.
> 2) inconsistent spelling: run-time vs. runtime
I assume you meant in this comment:
* estate The EState for the query doing runtime pruning
Fixed by using run-time, which is a more commonly used term in the
source code than runtime.
> 3) PartitionPruneContext.is_valid - I think I'd rename the flag to
> "initialized" or something like that. The "is_valid" is a bit confusing,
> because it might seem the context can get invalidated later, but AFAICS
> that's not the case - we just initialize it lazily.
Agree that "initialized" is better, so renamed.
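To illustrate why "initialized" reads more naturally than "is_valid" for a lazily set-up context, here is a minimal sketch. ToyPruneContext and both helpers are hypothetical stand-ins invented for this example, not the patch's PartitionPruneContext code; the only point is that the flag is set once on first use and never cleared.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PartitionPruneContext; not the real struct. */
typedef struct ToyPruneContext
{
	bool		initialized;	/* set once, never cleared afterwards */
	int			nparts;			/* example of state filled in lazily */
	int			init_calls;		/* counts how often setup actually ran */
} ToyPruneContext;

/* Fill in the context on first use; later calls are no-ops. */
static void
toy_context_ensure_initialized(ToyPruneContext *ctx, int nparts)
{
	if (ctx->initialized)
		return;
	ctx->nparts = nparts;
	ctx->init_calls++;
	ctx->initialized = true;
}

/* Any consumer goes through the ensure step before reading the state. */
static int
toy_context_nparts(ToyPruneContext *ctx, int nparts_if_uninitialized)
{
	toy_context_ensure_initialized(ctx, nparts_if_uninitialized);
	assert(ctx->initialized);
	return ctx->nparts;
}
```

Calling toy_context_nparts() twice on the same context runs the setup only once, which is the behavior the new name is meant to convey.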
> 0003
> ----
>
> 1) In InitPlan I'd move
>
> estate->es_part_prune_infos = plannedstmt->partPruneInfos;
>
> before the comment, which is more about ExecDoInitialPruning.
Makes sense, done.
> 2) I'm not quite sure what "exec" partition pruning is?
>
> /*
> * ExecInitPartitionPruning
> * Initialize the data structures needed for runtime "exec" partition
> * pruning and return the result of initial pruning, if available.
>
> Is that the same thing as "runtime pruning"?
"Exec" pruning refers to pruning performed during execution, using
PARAM_EXEC parameters. In contrast, "init" pruning occurs during plan
initialization, using parameters whose values remain constant during
execution, such as PARAM_EXTERN parameters and stable functions.
Before this patch, the ExecInitPartitionPruning function, called
during ExecutorStart(), performed "init" pruning and set up state in
the PartitionPruneState for subsequent "exec" pruning during
ExecutorRun(). With this patch, "init" pruning is performed well
before this function is called, leaving it solely responsible for
setting up the state for "exec" pruning. It may be worth renaming the
function to better reflect this new role, rather than updating only
the comment.
Actually, that is what I decided to do in the attached, along with
some other adjustments like moving ExecDoInitialPruning() to
execPartition.c from execMain.c, fixing up some obsolete comments,
etc.
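The "init" vs. "exec" distinction described above can be sketched with a toy model. Everything here is an assumption made for illustration: the partition layout, the bitmask representation, and the toy_* function names are invented and are not the executor's data structures. The sketch only shows that init pruning runs once with a value that is fixed for the whole execution, while exec pruning is repeated for each new value and can only narrow the initially valid set.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_NPARTS 4

/* Partition i holds keys in [i * 10, i * 10 + 10); purely illustrative. */
static int
toy_partition_for_key(int key)
{
	assert(key >= 0 && key < TOY_NPARTS * 10);
	return key / 10;
}

/*
 * "Init" pruning: runs once at plan initialization, using a value that stays
 * constant for the whole execution (think of a PARAM_EXTERN parameter).
 * The toy qual is "key >= extern_param".
 */
static uint32_t
toy_init_prune(int extern_param)
{
	uint32_t	valid = 0;

	for (int i = toy_partition_for_key(extern_param); i < TOY_NPARTS; i++)
		valid |= (UINT32_C(1) << i);
	return valid;
}

/*
 * "Exec" pruning: repeated during execution whenever a changing value
 * (think of a PARAM_EXEC parameter) gets a new value.  The toy qual is
 * "key = exec_param"; only partitions that survived init pruning count.
 */
static uint32_t
toy_exec_prune(uint32_t initially_valid, int exec_param)
{
	uint32_t	match = UINT32_C(1) << toy_partition_for_key(exec_param);

	return initially_valid & match;
}
```

With extern_param = 15, partitions 1 to 3 survive init pruning; a later exec_param of 25 narrows that to partition 2, and an exec_param of 5 leaves nothing to scan.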
> 0004
> ----
>
> 1) typo: paraller/parallel
Oops, fixed.
> 2) What about adding an assert to ExecFindMatchingSubPlans, to check
> validsubplan_rtis is not NULL? It's just mentioned in a comment, but
> better to explicitly enforce that?
Good idea, done.
>
> 2) It may not be quite clear why ExecInitUpdateProjection() switches to
> mt_updateColnosLists. Should that be explained in a comment, somewhere?
There is a comment in the ModifyTableState struct definition:
/*
* List of valid updateColnosLists. Contains only those belonging to
* unpruned relations from ModifyTable.updateColnosLists.
*/
List *mt_updateColnosLists;
It seems redundant to reiterate this in ExecInitUpdateProjection().
> 3) unnecessary newline in ExecLookupResultRelByOid
Removed.
> 0005
> ----
>
> 1) auto_explain.c - So what happens if the plan gets invalidated? The
> hook explain_ExecutorStart returns early, but then what? Does that break
> the user session somehow, or what?
It will get called again after ExecutorStartExt() loops back to do
ExecutorStart() with a new updated plan tree.
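A rough sketch of that loop-back behavior, under stated assumptions: ToyPlan, ToyStats and the toy_* functions are hypothetical and do not mirror the signatures of ExecutorStartExt() or the plan cache. It only shows that a start hook which returns early on an invalidated plan is simply invoked again once a fresh plan is available.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins; none of these are PostgreSQL's real types. */
typedef struct ToyPlan
{
	int			generation;		/* bumped each time the plan is rebuilt */
	bool		valid;			/* cleared when startup finds it stale */
} ToyPlan;

typedef struct ToyStats
{
	int			hook_calls;		/* how often the start hook ran */
} ToyStats;

/*
 * Plays the role of an ExecutorStart hook: runs "standard" startup, then
 * reports whether the plan is still usable so the caller can return early.
 */
static bool
toy_executor_start(ToyPlan *plan, ToyStats *stats, int stale_generations)
{
	stats->hook_calls++;
	/* Pretend that locking during startup invalidates early generations. */
	if (plan->generation < stale_generations)
		plan->valid = false;
	return plan->valid;
}

/* Plays the role of asking the plan cache for an updated plan. */
static void
toy_replan(ToyPlan *plan)
{
	plan->generation++;
	plan->valid = true;
}

/* Loop until startup succeeds on a plan that is still valid. */
static void
toy_executor_start_with_retry(ToyPlan *plan, ToyStats *stats,
							  int stale_generations)
{
	while (!toy_executor_start(plan, stats, stale_generations))
		toy_replan(plan);
}
```

In this model the user session is not disturbed: the hook just sees one extra call per invalidated plan.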
> 2) Isn't it a bit fragile if this requires every extension to update
> and add the ExecPlanStillValid() calls to various places?
The ExecPlanStillValid() call only needs to be added immediately after
the call to standard_ExecutorStart() in an extension's
ExecutorStart_hook() implementation.
> What if an
> extension doesn't do that? What weirdness will happen?
The QueryDesc.planstate won't contain a PlanState tree for starters
and other state information that InitPlan() populates in EState based
on the PlannedStmt.
> Maybe it'd be
> possible to at least check this in some other executor hook? Or at least
> we could ensure the check was done in assert-enabled builds? Or
> something to make extension authors aware of this?
I've added a note in the commit message, but if that's not enough, one
idea might be to change the return type of ExecutorStart_hook so that
the extensions that implement it are forced to be adjusted. Say, from
void to bool to indicate whether standard_ExecutorStart() succeeded
and thus created a "valid" plan. I had that in the previous versions
of the patch. Thoughts?
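For discussion, the bool-returning variant could look roughly like the sketch below. All names here (ToyQueryDesc, toy_start_hook_type, toy_standard_start, toy_extension_start) are made up for the example and are not the patch's API. The idea is that an extension written against the old void signature would no longer match the hook type, so it would at least draw a compiler diagnostic, and a hook that does propagate the result never touches a planstate that was not built.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for QueryDesc. */
typedef struct ToyQueryDesc
{
	bool		plan_valid;		/* would the plan survive startup? */
	void	   *planstate;		/* NULL unless startup built a tree */
} ToyQueryDesc;

/* Hypothetical hook type returning bool instead of void. */
typedef bool (*toy_start_hook_type) (ToyQueryDesc *queryDesc);

static int	toy_dummy_state;		/* stands in for a PlanState tree */
static int	toy_ext_setup_count;	/* extension-private bookkeeping */

/* Stand-in for standard startup: builds state only for a valid plan. */
static bool
toy_standard_start(ToyQueryDesc *queryDesc)
{
	if (!queryDesc->plan_valid)
	{
		queryDesc->planstate = NULL;
		return false;
	}
	queryDesc->planstate = &toy_dummy_state;
	return true;
}

/* An extension hook that propagates the result and bails out early. */
static bool
toy_extension_start(ToyQueryDesc *queryDesc)
{
	if (!toy_standard_start(queryDesc))
		return false;			/* nothing was built; leave planstate alone */
	toy_ext_setup_count++;		/* safe: planstate exists at this point */
	return true;
}
```

The cost of this approach is that every existing ExecutorStart_hook implementation has to be touched, which is also the point of it.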
> Aside from going through the patches, I did a simple benchmark to see
> how this works in practice. I did a simple test, with pgbench -S and
> variable number of partitions/clients. I also varied the number of locks
> per transaction, because I was wondering if it may interact with the
> fast-path improvements. See the attached xeon.sh script and CSV with
> results from the 44/88-core machine.
>
> There are also two PDFs visualizing the results, to show the impact as a
> difference between "master" (no patches) vs. "pruning" build with v57
> applied. As usual, "green" is good (faster), red is "bad" (slower).
>
> For most combinations of parameters, there's no impact on throughput.
> Anything in 99-101% is just regular noise, possibly even more. I'm
> trying to reduce the noise a bit more, but this seems acceptable. I'd
> like to discuss three "cases" I see in the results:
Thanks for doing these benchmarks. I'll reply separately to discuss
the individual cases.
> costing / auto mode
> -------------------
>
> Anyway, this leads me to a related question - not quite a "bug" in the
> patch, but something to perhaps think about. And that's costing, and
> what "auto" should do.
>
> There are two PNG charts, showing throughput for runs with -M prepared
> and 1000 partitions. Each chart shows throughput for the three cache
> modes, and different client counts. There's a clear distinction between
> "master" and "patched" runs - the "generic" plans performed terribly, by
> orders of magnitude. With the patches it beats the "custom" plans.
>
> Which is great! But it also means that while "auto" used to do the right
> thing, with the patches that's not the case.
>
> AFAIK that's because we don't consider the runtime pruning when costing
> the plans, so the cost is calculated as if no pruning happened. And so
> it seems way more expensive than it should ... and it loses with the
> custom scans. Is that correct, or do I understand this wrong?
That's correct. The planner does not consider runtime pruning when
assigning costs to Append or MergeAppend paths in
create_{merge}append_path().
> Just to be clear, I'm not claiming the patch has to deal with this. I
> suppose it can be handled as a future improvement, and I'm not even sure
> there's a good way to consider this during costing. For example, can we
> estimate how many partitions will be pruned?
There have been discussions about this in the 2017 development thread
of run-time pruning [1] and likely at some later point in other
threads. One simple approach mentioned at [1] is to consider that
only 1 partition will be scanned for queries containing WHERE partkey
= $1, because only 1 partition can contain matching rows with that
condition.
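As a back-of-the-envelope sketch of that heuristic: the helpers below are hypothetical and are not how create_append_path() computes costs. They also assume, purely for simplicity, that every child contributes equally to the summed cost, so the total can be scaled by the fraction of partitions expected to survive run-time pruning.

```c
#include <assert.h>

/*
 * Hypothetical helper: guess how many partitions survive run-time pruning.
 * has_partkey_equality means the qual looks like "partkey = $1".
 */
static int
toy_estimate_surviving_partitions(int nparts, int has_partkey_equality)
{
	assert(nparts >= 0);
	if (nparts == 0)
		return 0;
	return has_partkey_equality ? 1 : nparts;
}

/*
 * Scale the summed child cost of an Append by the surviving fraction.
 * Assumes equal per-child cost, which real partitions need not satisfy.
 */
static double
toy_append_cost(double total_child_cost, int nparts, int has_partkey_equality)
{
	int			survivors;

	if (nparts == 0)
		return 0.0;
	survivors = toy_estimate_surviving_partitions(nparts,
												  has_partkey_equality);
	return total_child_cost * survivors / nparts;
}
```

With 1000 partitions and an equality qual on the partition key, the estimate drops to one thousandth of the unpruned cost, which is roughly the correction "auto" would need in order to stop preferring custom plans in the benchmark above.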
I agree that this should be dealt with sooner rather than later so users
get generic plans even without having to use force_generic_plan.
I'll post the updated patches tomorrow.
--
Thanks, Amit Langote
[1] https://www.postgresql.org/message-id/CA%2BTgmoZv8sd9cKyYtHwmd_13%2BBAjkVKo%3DECe7G98tBK5Ejwatw%40ma...
* Re: generic plans and "initial" pruning
From: Amit Langote @ 2024-12-05 12:03 UTC (permalink / raw)
To: Tomas Vondra <[email protected]>; +Cc: Robert Haas <[email protected]>; Alvaro Herrera <[email protected]>; Andres Freund <[email protected]>; Daniel Gustafsson <[email protected]>; David Rowley <[email protected]>; pgsql-hackers; Thom Brown <[email protected]>; Tom Lane <[email protected]>
On Wed, Dec 4, 2024 at 10:34 PM Amit Langote <[email protected]> wrote:
> I'll post the updated patches tomorrow.
Here is an updated set.
I said that an Assert suffices in ExecInitPartitionPruning(), but on
further thought, and after finding the following in ExecInitExprRec(),
I think that an elog() won't hurt.
/* planner messed up */
elog(ERROR, "Aggref found in non-Agg plan node");
Like this:
+ /* Obtain the pruneinfo we need. */
+ pruneinfo = list_nth_node(PartitionPruneInfo, estate->es_part_prune_infos,
+ part_prune_index);
+
+ /* Its relids better match the plan node's or the planner messed up. */
+ if (!bms_equal(relids, pruneinfo->relids))
+ elog(ERROR, "wrong pruneinfo with relids=%s found at part_prune_index=%d contained in plan node with relids=%s",
+ bmsToString(pruneinfo->relids), part_prune_index,
+ bmsToString(relids));
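The index-plus-cross-check pattern in that hunk can be modeled in isolation. This is a simplified sketch, not the patch code: relids are represented as a 64-bit mask rather than a Bitmapset, ToyPruneInfo and ToyPlannedStmt are invented types, and a NULL return stands in for the place where the real code raises elog(ERROR).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: relids as a 64-bit mask instead of a Bitmapset. */
typedef struct ToyPruneInfo
{
	uint64_t	relids;			/* RT indexes this pruneinfo was built for */
} ToyPruneInfo;

/* Hypothetical stand-in for the flat per-statement list of pruneinfos. */
typedef struct ToyPlannedStmt
{
	const ToyPruneInfo *infos;
	int			ninfos;
} ToyPlannedStmt;

/*
 * Fetch the pruneinfo at part_prune_index and cross-check it against the
 * plan node's relids.  Returns NULL where the real code would error out.
 */
static const ToyPruneInfo *
toy_lookup_pruneinfo(const ToyPlannedStmt *stmt, int part_prune_index,
					 uint64_t node_relids)
{
	const ToyPruneInfo *info;

	assert(stmt != NULL);
	if (part_prune_index < 0 || part_prune_index >= stmt->ninfos)
		return NULL;
	info = &stmt->infos[part_prune_index];

	/* The index alone proves nothing; the relids must agree as well. */
	if (info->relids != node_relids)
		return NULL;
	return info;
}
```

A stale or mis-assigned index then fails loudly at lookup time instead of silently pruning with another node's steps.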
I've merged what were 0004 and 0005 in v57 together because they would
be eventually committed together and I wanted to write a unified
commit message.
One notable change is that I've renamed ExecutorStartExt() to
ExecutorStartCachedPlan() and changed its callers to only call it if a
CachedPlan is available, calling ExecutorStart() otherwise.
I'm still looking at Tomas's perf numbers and haven't confirmed some
of the findings myself.
--
Thanks, Amit Langote
Attachments:
[application/octet-stream] v58-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch (20.4K, 2-v58-0001-Move-PartitionPruneInfo-out-of-plan-nodes-into-P.patch)
download | inline diff:
From 0c8671803853427dc43c1e2f58240a6a925fac4b Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 4 Dec 2024 16:16:29 +0900
Subject: [PATCH v58 1/4] Move PartitionPruneInfo out of plan nodes into
PlannedStmt
This change moves PartitionPruneInfo from individual plan nodes to
PlannedStmt, enabling runtime initial pruning to be performed across
the entire plan tree without traversing it to find nodes containing
PartitionPruneInfos.
The PartitionPruneInfo pointer fields in Append and MergeAppend nodes
have been replaced with an integer index that points to a list of
PartitionPruneInfos within PlannedStmt, which now holds the
PartitionPruneInfos for all subqueries.
A bitmapset field has been added to PartitionPruneInfo to store the RT
indexes that correspond to the apprelids field in Append or
MergeAppend. This ensures that the execution pruning logic
cross-checks that it operates on the correct plan node.
Duplicated code in set_append_references() and
set_mergeappend_references() has been moved to a new function,
register_partpruneinfo(), which both updates the RT indexes by adding
rtoffset and adds the PartitionPruneInfo to the global list in
PlannerGlobal.
Reviewed-by: Alvaro Herrera
Reviewed-by: Robert Haas
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
src/backend/executor/execMain.c | 1 +
src/backend/executor/execParallel.c | 1 +
src/backend/executor/execPartition.c | 17 ++++-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/nodeAppend.c | 5 +-
src/backend/executor/nodeMergeAppend.c | 5 +-
src/backend/optimizer/plan/createplan.c | 23 +++----
src/backend/optimizer/plan/planner.c | 1 +
src/backend/optimizer/plan/setrefs.c | 85 ++++++++++++++++---------
src/backend/partitioning/partprune.c | 19 ++++--
src/include/executor/execPartition.h | 3 +-
src/include/nodes/execnodes.h | 1 +
src/include/nodes/pathnodes.h | 6 ++
src/include/nodes/plannodes.h | 16 +++--
src/include/partitioning/partprune.h | 8 +--
15 files changed, 131 insertions(+), 61 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 5ca856fd27..b40fe38178 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -856,6 +856,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_part_prune_infos = plannedstmt->partPruneInfos;
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index bfb3419efb..b01a2fdfdd 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -181,6 +181,7 @@ ExecSerializePlan(Plan *plan, EState *estate)
pstmt->dependsOnRole = false;
pstmt->parallelModeNeeded = false;
pstmt->planTree = plan;
+ pstmt->partPruneInfos = estate->es_part_prune_infos;
pstmt->rtable = estate->es_range_table;
pstmt->permInfos = estate->es_rteperminfos;
pstmt->resultRelations = NIL;
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 7651886229..950fa3289c 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1786,6 +1786,9 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* Initialize data structure needed for run-time partition pruning and
* do initial pruning if needed
*
+ * 'relids' identifies the relation to which both the parent plan and the
+ * PartitionPruneInfo given by 'part_prune_index' belong.
+ *
* On return, *initially_valid_subplans is assigned the set of indexes of
* child subplans that must be initialized along with the parent plan node.
* Initial pruning is performed here if needed and in that case only the
@@ -1798,11 +1801,23 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
PartitionPruneState *
ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *relids,
Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
+ PartitionPruneInfo *pruneinfo;
+
+ /* Obtain the pruneinfo we need. */
+ pruneinfo = list_nth_node(PartitionPruneInfo, estate->es_part_prune_infos,
+ part_prune_index);
+
+ /* Its relids better match the plan node's or the planner messed up. */
+ if (!bms_equal(relids, pruneinfo->relids))
+ elog(ERROR, "wrong pruneinfo with relids=%s found at part_prune_index=%d contained in plan node with relids=%s",
+ bmsToString(pruneinfo->relids), part_prune_index,
+ bmsToString(relids));
/* We may need an expression context to evaluate partition exprs */
ExecAssignExprContext(estate, planstate);
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index 740e8fb148..bc905a0cdc 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -118,6 +118,7 @@ CreateExecutorState(void)
estate->es_rowmarks = NULL;
estate->es_rteperminfos = NIL;
estate->es_plannedstmt = NULL;
+ estate->es_part_prune_infos = NIL;
estate->es_junkFilter = NULL;
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index ca0f54d676..de7ebab5c2 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -134,7 +134,7 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
appendstate->as_begun = false;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -145,7 +145,8 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&appendstate->ps,
list_length(node->appendplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index e1b9b984a7..3ed91808dd 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -82,7 +82,7 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
mergestate->ps.ExecProcNode = ExecMergeAppend;
/* If run-time partition pruning is enabled, then set that up now */
- if (node->part_prune_info != NULL)
+ if (node->part_prune_index >= 0)
{
PartitionPruneState *prunestate;
@@ -93,7 +93,8 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
*/
prunestate = ExecInitPartitionPruning(&mergestate->ps,
list_length(node->mergeplans),
- node->part_prune_info,
+ node->part_prune_index,
+ node->apprelids,
&validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/optimizer/plan/createplan.c b/src/backend/optimizer/plan/createplan.c
index 178c572b02..8f209c2d2f 100644
--- a/src/backend/optimizer/plan/createplan.c
+++ b/src/backend/optimizer/plan/createplan.c
@@ -1227,7 +1227,6 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
ListCell *subpaths;
int nasyncplans = 0;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
int nodenumsortkeys = 0;
AttrNumber *nodeSortColIdx = NULL;
Oid *nodeSortOperators = NULL;
@@ -1378,6 +1377,9 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ plan->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1401,16 +1403,14 @@ create_append_plan(PlannerInfo *root, AppendPath *best_path, int flags)
}
if (prunequal != NIL)
- partpruneinfo =
- make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ plan->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
plan->appendplans = subplans;
plan->nasyncplans = nasyncplans;
plan->first_partial_plan = best_path->first_partial_path;
- plan->part_prune_info = partpruneinfo;
copy_generic_path_info(&plan->plan, (Path *) best_path);
@@ -1449,7 +1449,6 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
List *subplans = NIL;
ListCell *subpaths;
RelOptInfo *rel = best_path->path.parent;
- PartitionPruneInfo *partpruneinfo = NULL;
/*
* We don't have the actual creation of the MergeAppend node split out
@@ -1542,6 +1541,9 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
subplans = lappend(subplans, subplan);
}
+ /* Set below if we find quals that we can use to run-time prune */
+ node->part_prune_index = -1;
+
/*
* If any quals exist, they may be useful to perform further partition
* pruning during execution. Gather information needed by the executor to
@@ -1557,13 +1559,12 @@ create_merge_append_plan(PlannerInfo *root, MergeAppendPath *best_path,
Assert(best_path->path.param_info == NULL);
if (prunequal != NIL)
- partpruneinfo = make_partition_pruneinfo(root, rel,
- best_path->subpaths,
- prunequal);
+ node->part_prune_index = make_partition_pruneinfo(root, rel,
+ best_path->subpaths,
+ prunequal);
}
node->mergeplans = subplans;
- node->part_prune_info = partpruneinfo;
/*
* If prepare_sort_from_pathkeys added sort columns, but we were told to
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index b665a7762e..9c253e864a 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -557,6 +557,7 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->dependsOnRole = glob->dependsOnRole;
result->parallelModeNeeded = glob->parallelModeNeeded;
result->planTree = top_plan;
+ result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 6d23df108d..9f13243d54 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -1731,6 +1731,47 @@ set_customscan_references(PlannerInfo *root,
cscan->custom_relids = offset_relid_set(cscan->custom_relids, rtoffset);
}
+/*
+ * register_partpruneinfo
+ * Subroutine for set_append_references and set_mergeappend_references
+ *
+ * Add the PartitionPruneInfo from root->partPruneInfos at the given index
+ * into PlannerGlobal->partPruneInfos and return its index there.
+ *
+ * Also update the RT indexes present in PartitionedRelPruneInfos to add the
+ * offset.
+ */
+static int
+register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
+{
+ PlannerGlobal *glob = root->glob;
+ PartitionPruneInfo *pinfo;
+ ListCell *l;
+
+ Assert(part_prune_index >= 0 &&
+ part_prune_index < list_length(root->partPruneInfos));
+ pinfo = list_nth_node(PartitionPruneInfo, root->partPruneInfos,
+ part_prune_index);
+
+ pinfo->relids = offset_relid_set(pinfo->relids, rtoffset);
+ foreach(l, pinfo->prune_infos)
+ {
+ List *prune_infos = lfirst(l);
+ ListCell *l2;
+
+ foreach(l2, prune_infos)
+ {
+ PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+
+ prelinfo->rtindex += rtoffset;
+ }
+ }
+
+ glob->partPruneInfos = lappend(glob->partPruneInfos, pinfo);
+
+ return list_length(glob->partPruneInfos) - 1;
+}
+
/*
* set_append_references
* Do set_plan_references processing on an Append
@@ -1783,21 +1824,13 @@ set_append_references(PlannerInfo *root,
aplan->apprelids = offset_relid_set(aplan->apprelids, rtoffset);
- if (aplan->part_prune_info)
- {
- foreach(l, aplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (aplan->part_prune_index >= 0)
+ aplan->part_prune_index =
+ register_partpruneinfo(root, aplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(aplan->plan.lefttree == NULL);
@@ -1859,21 +1892,13 @@ set_mergeappend_references(PlannerInfo *root,
mplan->apprelids = offset_relid_set(mplan->apprelids, rtoffset);
- if (mplan->part_prune_info)
- {
- foreach(l, mplan->part_prune_info->prune_infos)
- {
- List *prune_infos = lfirst(l);
- ListCell *l2;
-
- foreach(l2, prune_infos)
- {
- PartitionedRelPruneInfo *pinfo = lfirst(l2);
-
- pinfo->rtindex += rtoffset;
- }
- }
- }
+ /*
+ * Add PartitionPruneInfo, if any, to PlannerGlobal and update the index.
+ * Also update the RT indexes present in it to add the offset.
+ */
+ if (mplan->part_prune_index >= 0)
+ mplan->part_prune_index =
+ register_partpruneinfo(root, mplan->part_prune_index, rtoffset);
/* We don't need to recurse to lefttree or righttree ... */
Assert(mplan->plan.lefttree == NULL);
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index 4e12ae5d1e..ca5467104d 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -207,16 +207,20 @@ static void partkey_datum_from_expr(PartitionPruneContext *context,
/*
* make_partition_pruneinfo
- * Builds a PartitionPruneInfo which can be used in the executor to allow
- * additional partition pruning to take place. Returns NULL when
- * partition pruning would be useless.
+ * Checks if the given set of quals can be used to build pruning steps
+ * that the executor can use to prune away unneeded partitions. If
+ * suitable quals are found then a PartitionPruneInfo is built and tagged
+ * onto the PlannerInfo's partPruneInfos list.
+ *
+ * The return value is the 0-based index of the item added to the
+ * partPruneInfos list or -1 if nothing was added.
*
* 'parentrel' is the RelOptInfo for an appendrel, and 'subpaths' is the list
* of scan paths for its child rels.
* 'prunequal' is a list of potential pruning quals (i.e., restriction
* clauses that are applicable to the appendrel).
*/
-PartitionPruneInfo *
+int
make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
List *subpaths,
List *prunequal)
@@ -330,10 +334,11 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
* quals, then we can just not bother with run-time pruning.
*/
if (prunerelinfos == NIL)
- return NULL;
+ return -1;
/* Else build the result data structure */
pruneinfo = makeNode(PartitionPruneInfo);
+ pruneinfo->relids = bms_copy(parentrel->relids);
pruneinfo->prune_infos = prunerelinfos;
/*
@@ -356,7 +361,9 @@ make_partition_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
else
pruneinfo->other_subplans = NULL;
- return pruneinfo;
+ root->partPruneInfos = lappend(root->partPruneInfos, pruneinfo);
+
+ return list_length(root->partPruneInfos) - 1;
}
/*
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index c09bc83b2a..33d922fe8d 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -123,7 +123,8 @@ typedef struct PartitionPruneState
extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
int n_total_subplans,
- PartitionPruneInfo *pruneinfo,
+ int part_prune_index,
+ Bitmapset *relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 182a6956bb..b1471e68fe 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -639,6 +639,7 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ List *es_part_prune_infos; /* List of PartitionPruneInfo */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index add0f9e45f..f8a4cd42c6 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -128,6 +128,9 @@ typedef struct PlannerGlobal
/* "flat" list of AppendRelInfos */
List *appendRelations;
+ /* "flat" list of PartitionPruneInfos */
+ List *partPruneInfos;
+
/* OIDs of relations the plan depends on */
List *relationOids;
@@ -559,6 +562,9 @@ struct PlannerInfo
/* Does this query modify any partition key columns? */
bool partColsUpdated;
+
+ /* PartitionPruneInfos added in this query's plan. */
+ List *partPruneInfos;
};
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index 52f29bcdb6..ef89927471 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -69,6 +69,9 @@ typedef struct PlannedStmt
struct Plan *planTree; /* tree of Plan nodes */
+ List *partPruneInfos; /* List of PartitionPruneInfo contained in the
+ * plan */
+
List *rtable; /* list of RangeTblEntry nodes */
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
@@ -276,8 +279,8 @@ typedef struct Append
*/
int first_partial_plan;
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} Append;
/* ----------------
@@ -311,8 +314,8 @@ typedef struct MergeAppend
/* NULLS FIRST/LAST directions */
bool *nullsFirst pg_node_attr(array_size(numCols));
- /* Info for run-time subplan pruning; NULL if we're not doing that */
- struct PartitionPruneInfo *part_prune_info;
+ /* Index to PlannerInfo.partPruneInfos or -1 if no run-time pruning */
+ int part_prune_index;
} MergeAppend;
/* ----------------
@@ -1414,6 +1417,10 @@ typedef struct PlanRowMark
* Then, since an Append-type node could have multiple partitioning
* hierarchies among its children, we have an unordered List of those Lists.
*
+ * relids RelOptInfo.relids of the parent plan node (e.g. Append
+ * or MergeAppend) to which this PartitionPruneInfo node
+ * belongs. The pruning logic ensures that this matches
+ * the parent plan node's apprelids.
* prune_infos List of Lists containing PartitionedRelPruneInfo nodes,
* one sublist per run-time-prunable partition hierarchy
* appearing in the parent plan node's subplans.
@@ -1426,6 +1433,7 @@ typedef struct PartitionPruneInfo
pg_node_attr(no_equal, no_query_jumble)
NodeTag type;
+ Bitmapset *relids;
List *prune_infos;
Bitmapset *other_subplans;
} PartitionPruneInfo;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index bd490d154f..6922e04430 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -70,10 +70,10 @@ typedef struct PartitionPruneContext
#define PruneCxtStateIdx(partnatts, step_id, keyno) \
((partnatts) * (step_id) + (keyno))
-extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
- struct RelOptInfo *parentrel,
- List *subpaths,
- List *prunequal);
+extern int make_partition_pruneinfo(struct PlannerInfo *root,
+ struct RelOptInfo *parentrel,
+ List *subpaths,
+ List *prunequal);
extern Bitmapset *prune_append_rel_partitions(struct RelOptInfo *rel);
extern Bitmapset *get_matching_partitions(PartitionPruneContext *context,
List *pruning_steps);
--
2.43.0
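The core idea of the patch above (plan nodes carry an integer index into a shared list instead of a pointer, and the index is remapped when the per-query list is merged into the global one) can be sketched in standalone C. This is only an illustration under stated assumptions: PruneInfo, InfoList, list_add() and register_info() are hypothetical stand-ins invented here, not PostgreSQL's List API or the patch's actual code.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_INFOS 16

/* Hypothetical stand-in for PartitionPruneInfo; only an RT index is kept. */
typedef struct PruneInfo
{
	int			rtindex;
} PruneInfo;

/* Hypothetical stand-in for a List of PruneInfo pointers. */
typedef struct InfoList
{
	PruneInfo  *items[MAX_INFOS];
	int			len;
} InfoList;

/*
 * Append an item and return its 0-based index, or -1 if there is nothing
 * to add (mirrors the "index or -1" convention described in the patch).
 */
static int
list_add(InfoList *list, PruneInfo *item)
{
	if (item == NULL)
		return -1;
	assert(list->len < MAX_INFOS);
	list->items[list->len] = item;
	return list->len++;
}

/*
 * Loose analogue of register_partpruneinfo(): look up the entry at
 * local_index in the per-query list, shift its RT index by rtoffset,
 * append it to the global list, and return its index there.
 */
static int
register_info(const InfoList *local, InfoList *global,
			  int local_index, int rtoffset)
{
	PruneInfo  *info;

	assert(local_index >= 0 && local_index < local->len);
	info = local->items[local_index];
	info->rtindex += rtoffset;
	return list_add(global, info);
}
```

A caller would keep the value returned by register_info() in the plan node, replacing the earlier per-query index, much as set_append_references() overwrites part_prune_index in the diff.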
[application/octet-stream] v58-0002-Initialize-PartitionPruneContexts-lazily.patch (16.9K, 3-v58-0002-Initialize-PartitionPruneContexts-lazily.patch)
From 540079086a25bc18538a78c8990464ab4c77659e Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 4 Dec 2024 16:16:41 +0900
Subject: [PATCH v58 2/4] Initialize PartitionPruneContexts lazily
This commit moves the initialization of PartitionPruneContexts for
both initial and exec pruning steps from CreatePartitionPruneState()
to find_matching_subplans_recurse(), where they are actually needed.
To track whether the context has been initialized and is ready for
use, a boolean field initialized has been added to PartitionPruneContext.
The primary motivation is to allow CreatePartitionPruneState() to be
called before ExecInitNode(). Right now, it's coupled with
ExecInitNode() because setting up the exec pruning context requires
access to the parent plan node's PlanState. By deferring context
creation to where it's actually needed, we break this dependency.
The ExprContext used for both pruning phases is now a standalone
context, independent of the parent PlanState.
This change will be useful in a future commit, which will move initial
pruning to occur outside ExecInitNode(), specifically before it is
called by InitPlan().
Reviewed-by: Robert Haas
Reviewed-by: Tom Lane
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
src/backend/executor/execPartition.c | 151 +++++++++++++++++++--------
src/backend/partitioning/partprune.c | 7 +-
src/include/executor/execPartition.h | 12 +++
src/include/partitioning/partprune.h | 2 +
4 files changed, 123 insertions(+), 49 deletions(-)
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 950fa3289c..f4d425cd45 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -181,18 +181,17 @@ static char *ExecBuildSlotPartitionKeyDescription(Relation rel,
int maxfieldlen);
static List *adjust_partition_colnos(List *colnos, ResultRelInfo *leaf_part_rri);
static List *adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap);
-static PartitionPruneState *CreatePartitionPruneState(PlanState *planstate,
+static PartitionPruneState *CreatePartitionPruneState(EState *estate,
PartitionPruneInfo *pruneinfo);
-static void InitPartitionPruneContext(PartitionPruneContext *context,
+static void InitPartitionPruneContext(PartitionedRelPruningData *pprune,
+ PartitionPruneContext *context,
List *pruning_steps,
- PartitionDesc partdesc,
- PartitionKey partkey,
- PlanState *planstate,
- ExprContext *econtext);
+ PlanState *planstate);
static void PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
Bitmapset *initially_valid_subplans,
int n_total_subplans);
-static void find_matching_subplans_recurse(PartitionPruningData *prunedata,
+static void find_matching_subplans_recurse(PlanState *parent_plan,
+ PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
Bitmapset **validsubplans);
@@ -1823,7 +1822,14 @@ ExecInitPartitionPruning(PlanState *planstate,
ExecAssignExprContext(estate, planstate);
/* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(planstate, pruneinfo);
+ prunestate = CreatePartitionPruneState(estate, pruneinfo);
+
+ /*
+ * Store PlanState for using it to initialize exec pruning contexts later
+ * in find_matching_subplans_recurse() where they are needed.
+ */
+ if (prunestate->do_exec_prune)
+ prunestate->parent_plan = planstate;
/*
* Perform an initial partition prune pass, if required.
@@ -1863,8 +1869,6 @@ ExecInitPartitionPruning(PlanState *planstate,
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
*
- * 'planstate' is the parent plan node's execution state.
- *
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
* PartitionPruningData for each partitioning hierarchy (i.e., each sublist of
@@ -1875,16 +1879,24 @@ ExecInitPartitionPruning(PlanState *planstate,
* stored in each PartitionedRelPruningData can be re-used each time we
* re-evaluate which partitions match the pruning steps provided in each
* PartitionedRelPruneInfo.
+ *
+ * Note that the PartitionPruneContexts for both initial and exec pruning
+ * (which are stored in each PartitionedRelPruningData) are initialized lazily
+ * in find_matching_subplans_recurse() when used for the first time.
*/
static PartitionPruneState *
-CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
+CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
{
- EState *estate = planstate->state;
PartitionPruneState *prunestate;
int n_part_hierarchies;
ListCell *lc;
int i;
- ExprContext *econtext = planstate->ps_ExprContext;
+
+ /*
+ * Expression context that will be used by partkey_datum_from_expr() to
+ * evaluate expressions for comparison against partition bounds.
+ */
+ ExprContext *econtext = CreateExprContext(estate);
/* For data reading, executor always includes detached partitions */
if (estate->es_partition_directory == NULL)
@@ -1906,6 +1918,7 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
prunestate->other_subplans = bms_copy(pruneinfo->other_subplans);
prunestate->do_initial_prune = false; /* may be set below */
prunestate->do_exec_prune = false; /* may be set below */
+ prunestate->parent_plan = NULL;
prunestate->num_partprunedata = n_part_hierarchies;
/*
@@ -1941,16 +1954,25 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
PartitionedRelPruningData *pprune = &prunedata->partrelprunedata[j];
Relation partrel;
PartitionDesc partdesc;
- PartitionKey partkey;
/*
- * We can rely on the copies of the partitioned table's partition
- * key and partition descriptor appearing in its relcache entry,
- * because that entry will be held open and locked for the
- * duration of this executor run.
+ * Used for initializing the expressions in initial pruning steps.
+ * For exec pruning steps, the parent plan node's PlanState's
+ * ps_ExprContext will be used.
*/
+ pprune->estate = estate;
+ pprune->econtext = econtext;
+
+ /* Remember Relation for use in InitPartitionPruneContext. */
partrel = ExecGetRangeTableRelation(estate, pinfo->rtindex);
- partkey = RelationGetPartitionKey(partrel);
+ pprune->partrel = partrel;
+
+ /*
+ * We can rely on the copy of the partitioned table's partition
+ * descriptor appearing in its relcache entry, because that entry
+ * will be held open and locked for the duration of this executor
+ * run.
+ */
partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
partrel);
@@ -2061,32 +2083,26 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
pprune->present_parts = bms_copy(pinfo->present_parts);
/*
- * Initialize pruning contexts as needed. Note that we must skip
- * execution-time partition pruning in EXPLAIN (GENERIC_PLAN),
- * since parameter values may be missing.
+ * Pruning contexts (initial_context and exec_context) are
+ * initialized lazily in find_matching_subplans_recurse() when
+ * used for the first time.
+ *
+ * Note that we must skip execution-time partition pruning in
+ * EXPLAIN (GENERIC_PLAN), since parameter values may be missing.
*/
pprune->initial_pruning_steps = pinfo->initial_pruning_steps;
+ pprune->initial_context.initialized = false;
if (pinfo->initial_pruning_steps &&
!(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
- {
- InitPartitionPruneContext(&pprune->initial_context,
- pinfo->initial_pruning_steps,
- partdesc, partkey, planstate,
- econtext);
/* Record whether initial pruning is needed at any level */
prunestate->do_initial_prune = true;
- }
+
pprune->exec_pruning_steps = pinfo->exec_pruning_steps;
+ pprune->exec_context.initialized = false;
if (pinfo->exec_pruning_steps &&
!(econtext->ecxt_estate->es_top_eflags & EXEC_FLAG_EXPLAIN_GENERIC))
- {
- InitPartitionPruneContext(&pprune->exec_context,
- pinfo->exec_pruning_steps,
- partdesc, partkey, planstate,
- econtext);
/* Record whether exec pruning is needed at any level */
prunestate->do_exec_prune = true;
- }
/*
* Accumulate the IDs of all PARAM_EXEC Params affecting the
@@ -2107,17 +2123,41 @@ CreatePartitionPruneState(PlanState *planstate, PartitionPruneInfo *pruneinfo)
* Initialize a PartitionPruneContext for the given list of pruning steps.
*/
static void
-InitPartitionPruneContext(PartitionPruneContext *context,
+InitPartitionPruneContext(PartitionedRelPruningData *pprune,
+ PartitionPruneContext *context,
List *pruning_steps,
- PartitionDesc partdesc,
- PartitionKey partkey,
- PlanState *planstate,
- ExprContext *econtext)
+ PlanState *planstate)
{
int n_steps;
int partnatts;
ListCell *lc;
+ /*
+ * Use the ExprContext that CreatePartitionPruneState() should have
+ * created.
+ */
+ ExprContext *econtext = pprune->econtext;
+ EState *estate = pprune->estate;
+ MemoryContext oldcxt;
+ Relation partrel = pprune->partrel;
+ PartitionKey partkey;
+ PartitionDesc partdesc;
+
+ Assert(econtext != NULL);
+
+ /* Must allocate the needed stuff in the query lifetime context. */
+ oldcxt = MemoryContextSwitchTo(estate->es_query_cxt);
+
+ /*
+ * We can rely on the copies of the partitioned table's partition key and
+ * partition descriptor appearing in its relcache entry, because that
+ * entry will be held open and locked for the duration of this executor
+ * run.
+ */
+ partkey = RelationGetPartitionKey(partrel);
+ partdesc = PartitionDirectoryLookup(estate->es_partition_directory,
+ partrel);
+
n_steps = list_length(pruning_steps);
context->strategy = partkey->strategy;
@@ -2185,6 +2225,9 @@ InitPartitionPruneContext(PartitionPruneContext *context,
}
}
}
+
+ MemoryContextSwitchTo(oldcxt);
+ context->initialized = true;
}
/*
@@ -2348,12 +2391,16 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* recursing to other (lower-level) parents as needed.
*/
pprune = &prunedata->partrelprunedata[0];
- find_matching_subplans_recurse(prunedata, pprune, initial_prune,
+ find_matching_subplans_recurse(prunestate->parent_plan,
+ prunedata, pprune, initial_prune,
&result);
/* Expression eval may have used space in ExprContext too */
- if (pprune->exec_pruning_steps)
+ if (pprune->exec_context.initialized)
+ {
+ Assert(pprune->exec_pruning_steps != NIL);
ResetExprContext(pprune->exec_context.exprcontext);
+ }
}
/* Add in any subplans that partition pruning didn't account for */
@@ -2376,7 +2423,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* Adds valid (non-prunable) subplan IDs to *validsubplans
*/
static void
-find_matching_subplans_recurse(PartitionPruningData *prunedata,
+find_matching_subplans_recurse(PlanState *parent_plan,
+ PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
Bitmapset **validsubplans)
@@ -2393,11 +2441,27 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
* level.
*/
if (initial_prune && pprune->initial_pruning_steps)
+ {
+ /* Initialize initial_context if not already done. */
+ if (unlikely(!pprune->initial_context.initialized))
+ InitPartitionPruneContext(pprune,
+ &pprune->initial_context,
+ pprune->initial_pruning_steps,
+ parent_plan);
partset = get_matching_partitions(&pprune->initial_context,
pprune->initial_pruning_steps);
+ }
else if (!initial_prune && pprune->exec_pruning_steps)
+ {
+ /* Initialize exec_context if not already done. */
+ if (unlikely(!pprune->exec_context.initialized))
+ InitPartitionPruneContext(pprune,
+ &pprune->exec_context,
+ pprune->exec_pruning_steps,
+ parent_plan);
partset = get_matching_partitions(&pprune->exec_context,
pprune->exec_pruning_steps);
+ }
else
partset = pprune->present_parts;
@@ -2413,7 +2477,8 @@ find_matching_subplans_recurse(PartitionPruningData *prunedata,
int partidx = pprune->subpart_map[i];
if (partidx >= 0)
- find_matching_subplans_recurse(prunedata,
+ find_matching_subplans_recurse(parent_plan,
+ prunedata,
&prunedata->partrelprunedata[partidx],
initial_prune, validsubplans);
else
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index ca5467104d..ae1d69f96c 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -3783,13 +3783,8 @@ partkey_datum_from_expr(PartitionPruneContext *context,
/*
* We should never see a non-Const in a step unless the caller has
* passed a valid ExprContext.
- *
- * When context->planstate is valid, context->exprcontext is same as
- * context->planstate->ps_ExprContext.
*/
- Assert(context->planstate != NULL || context->exprcontext != NULL);
- Assert(context->planstate == NULL ||
- (context->exprcontext == context->planstate->ps_ExprContext));
+ Assert(context->exprcontext != NULL);
exprstate = context->exprstates[stateidx];
ectx = context->exprcontext;
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 33d922fe8d..7e470c82f6 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -42,6 +42,10 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* PartitionedRelPruneInfo (see plannodes.h); though note that here,
* subpart_map contains indexes into PartitionPruningData.partrelprunedata[].
*
+ * estate The EState for the query doing run-time pruning
+ * partrel Partitioned table Relation; obtained by
+ * ExecGetRangeTableRelation(estate, rti), where
+ * rti is PartitionedRelPruneInfo.rtindex.
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
@@ -51,6 +55,8 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* perform executor startup pruning.
* exec_pruning_steps List of PartitionPruneSteps used to
* perform per-scan pruning.
+ * econtext ExprContext to use for evaluating partition
+ * key
* initial_context If initial_pruning_steps isn't NIL, contains
* the details needed to execute those steps.
* exec_context If exec_pruning_steps isn't NIL, contains
@@ -58,12 +64,15 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
*/
typedef struct PartitionedRelPruningData
{
+ EState *estate;
+ Relation partrel;
int nparts;
int *subplan_map;
int *subpart_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
+ ExprContext *econtext;
PartitionPruneContext initial_context;
PartitionPruneContext exec_context;
} PartitionedRelPruningData;
@@ -105,6 +114,8 @@ typedef struct PartitionPruningData
* startup (at any hierarchy level).
* do_exec_prune true if pruning should be performed during
* executor run (at any hierarchy level).
+ * parent_plan Parent plan node's PlanState used to initialize
+ * expressions contained in "exec" pruning steps.
* num_partprunedata Number of items in "partprunedata" array.
* partprunedata Array of PartitionPruningData pointers for the plan's
* partitioned relation(s), one for each partitioning
@@ -117,6 +128,7 @@ typedef struct PartitionPruneState
MemoryContext prune_context;
bool do_initial_prune;
bool do_exec_prune;
+ PlanState *parent_plan;
int num_partprunedata;
PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
} PartitionPruneState;
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 6922e04430..0cbcb4fb4e 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -26,6 +26,7 @@ struct RelOptInfo;
* Stores information needed at runtime for pruning computations
* related to a single partitioned table.
*
+ * initialized Has the information in this struct been initialized?
* strategy Partition strategy, e.g. LIST, RANGE, HASH.
* partnatts Number of columns in the partition key.
* nparts Number of partitions in this partitioned table.
@@ -48,6 +49,7 @@ struct RelOptInfo;
*/
typedef struct PartitionPruneContext
{
+ bool initialized;
char strategy;
int partnatts;
int nparts;
--
2.43.0
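The lazy-initialization pattern this patch introduces (a flag in the context, checked at first use, with setup deferred until then) can be shown in a few lines of standalone C. This is a minimal sketch, not the patch's code: PruneContext, init_context(), run_pruning() and the init_calls counter are hypothetical names made up for illustration, and the real InitPartitionPruneContext() does considerably more work.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for PartitionPruneContext. */
typedef struct PruneContext
{
	bool		initialized;	/* has init_context() run yet? */
	int			nsteps;			/* placeholder for the real pruning state */
} PruneContext;

/* Counts how often the setup actually runs; for demonstration only. */
static int	init_calls = 0;

/* Setup that is deferred until the context is first needed. */
static void
init_context(PruneContext *ctx, int nsteps)
{
	ctx->nsteps = nsteps;
	ctx->initialized = true;
	init_calls++;
}

/*
 * Loose analogue of the check added to find_matching_subplans_recurse():
 * initialize the context on first use, then reuse it on later calls.
 */
static int
run_pruning(PruneContext *ctx, int nsteps)
{
	if (!ctx->initialized)
		init_context(ctx, nsteps);
	return ctx->nsteps;
}
```

The point of the design is that whoever creates the context only has to clear the flag; the state that setup depends on (in the patch, the parent PlanState for exec pruning) need not exist until the first pruning call.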
[application/octet-stream] v58-0003-Perform-runtime-initial-pruning-outside-ExecInit.patch (14.6K, 4-v58-0003-Perform-runtime-initial-pruning-outside-ExecInit.patch)
From 99d2fd33249e6846277bd02e6cde8606a9ce117b Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 4 Dec 2024 16:16:49 +0900
Subject: [PATCH v58 3/4] Perform runtime initial pruning outside
ExecInitNode()
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
This commit follows up on the previous change that moved
PartitionPruneInfos out of individual plan nodes into a list in
PlannedStmt. It moves the initialization of PartitionPruneStates
and runtime initial pruning out of ExecInitNode() and into a new
routine, ExecDoInitialPruning(), which is called by InitPlan()
before ExecInitNode() is invoked on the main plan tree and subplans.
ExecDoInitialPruning() performs the initial pruning and saves the
result—a bitmapset of indexes for the surviving child subnodes—in
es_part_prune_results, a list in EState. The PartitionPruneStates
created for initial pruning are also saved in es_part_prune_states,
another list in EState, for later use during exec pruning. Both lists
are parallel to es_part_prune_infos (which holds the
PartitionPruneInfos from PlannedStmt), allowing them to share the
same index.
Reviewed-by: Robert Haas
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
src/backend/executor/execMain.c | 12 +++
src/backend/executor/execPartition.c | 130 ++++++++++++++++++-------
src/backend/executor/nodeAppend.c | 10 +-
src/backend/executor/nodeMergeAppend.c | 10 +-
src/include/executor/execPartition.h | 11 ++-
src/include/nodes/execnodes.h | 2 +
6 files changed, 125 insertions(+), 50 deletions(-)
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index b40fe38178..5dc46f2e95 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -46,6 +46,7 @@
#include "commands/matview.h"
#include "commands/trigger.h"
#include "executor/executor.h"
+#include "executor/execPartition.h"
#include "executor/nodeSubplan.h"
#include "foreign/fdwapi.h"
#include "mb/pg_wchar.h"
@@ -858,6 +859,17 @@ InitPlan(QueryDesc *queryDesc, int eflags)
estate->es_plannedstmt = plannedstmt;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ /*
+ * Perform runtime "initial" pruning to identify which child subplans,
+ * corresponding to the children of plan nodes that contain
+ * PartitionPruneInfo such as Append, will not be executed. The results,
+ * which are bitmapsets of indexes of the child subplans that will be
+ * executed, are saved in es_part_prune_results. These results correspond
+ * to each PartitionPruneInfo entry, and the es_part_prune_results list is
+ * parallel to es_part_prune_infos.
+ */
+ ExecDoInitialPruning(estate);
+
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
*/
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index f4d425cd45..46dd1c77a3 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -1761,48 +1761,105 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*
* Functions:
*
- * ExecInitPartitionPruning:
- * Creates the PartitionPruneState required by ExecFindMatchingSubPlans.
- * Details stored include how to map the partition index returned by the
- * partition pruning code into subplan indexes. Also determines the set
- * of subplans to initialize considering the result of performing initial
- * pruning steps if any. Maps in PartitionPruneState are updated to
+ * ExecDoInitialPruning:
+ * Perform runtime "initial" pruning, if necessary, to determine the set
+ * of child subnodes that need to be initialized during ExecInitNode() for
+ * all plan nodes that contain a PartitionPruneInfo.
+ *
+ * ExecInitPartitionExecPruning:
+ * Updates the PartitionPruneState found at given part_prune_index in
+ * EState.es_part_prune_states for use during "exec" pruning if required.
+ * Also returns the set of subplans to initialize that would be stored at
+ * part_prune_index in EState.es_part_prune_results by
+ * ExecDoInitialPruning(). Maps in PartitionPruneState are updated to
* account for initial pruning possibly having eliminated some of the
* subplans.
*
* ExecFindMatchingSubPlans:
* Returns indexes of matching subplans after evaluating the expressions
* that are safe to evaluate at a given point. This function is first
- * called during ExecInitPartitionPruning() to find the initially
- * matching subplans based on performing the initial pruning steps and
- * then must be called again each time the value of a Param listed in
+ * called during ExecDoInitialPruning() to find the initially matching
+ * subplans based on performing the initial pruning steps and then must be
+ * called again each time the value of a Param listed in
* PartitionPruneState's 'execparamids' changes.
*-------------------------------------------------------------------------
*/
/*
- * ExecInitPartitionPruning
- * Initialize data structure needed for run-time partition pruning and
- * do initial pruning if needed
+ * ExecDoInitialPruning
+ * Perform runtime "initial" pruning, if necessary, to determine the set
+ * of child subnodes that need to be initialized during ExecInitNode() for
+ * plan nodes that support partition pruning.
+ *
+ * This function iterates over each PartitionPruneInfo entry in
+ * estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
+ * and adds it to es_part_prune_states, where ExecInitPartitionExecPruning() can
+ * access it for use during "exec" pruning.
+ *
+ * If initial pruning steps exist for a PartitionPruneInfo entry, this function
+ * executes those pruning steps and stores the result as a bitmapset of valid
+ * child subplans, identifying which subplans should be initialized for
+ * execution. The results are saved in estate->es_part_prune_results.
+ *
+ * If no initial pruning is performed for a given PartitionPruneInfo, a NULL
+ * entry is still added to es_part_prune_results to maintain alignment with
+ * es_part_prune_infos. This ensures that ExecInitPartitionExecPruning() can
+ * use the same index to retrieve the pruning results.
+ */
+void
+ExecDoInitialPruning(EState *estate)
+{
+ ListCell *lc;
+
+ foreach(lc, estate->es_part_prune_infos)
+ {
+ PartitionPruneInfo *pruneinfo = lfirst_node(PartitionPruneInfo, lc);
+ PartitionPruneState *prunestate;
+ Bitmapset *validsubplans = NULL;
+
+ /* Create and save the PartitionPruneState. */
+ prunestate = CreatePartitionPruneState(estate, pruneinfo);
+ estate->es_part_prune_states = lappend(estate->es_part_prune_states,
+ prunestate);
+
+ /*
+ * Perform initial pruning steps, if any, and save the result
+ * bitmapset or NULL as described in the header comment.
+ */
+ if (prunestate->do_initial_prune)
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ estate->es_part_prune_results = lappend(estate->es_part_prune_results,
+ validsubplans);
+ }
+}
+
+/*
+ * ExecInitPartitionExecPruning
+ * Initialize the data structures needed for runtime "exec" partition
+ * pruning and return the result of initial pruning, if available.
*
* 'relids' identifies the relation to which both the parent plan and the
* PartitionPruneInfo given by 'part_prune_index' belong.
*
- * On return, *initially_valid_subplans is assigned the set of indexes of
- * child subplans that must be initialized along with the parent plan node.
- * Initial pruning is performed here if needed and in that case only the
- * surviving subplans' indexes are added.
+ * The PartitionPruneState would have been created by ExecDoInitialPruning()
+ * and stored as the part_prune_index'th element of EState.es_part_prune_states.
*
- * If subplans are indeed pruned, subplan_map arrays contained in the returned
- * PartitionPruneState are re-sequenced to not count those, though only if the
- * maps will be needed for subsequent execution pruning passes.
+ * On return, *initially_valid_subplans is assigned the set of indexes of child
+ * subplans that must be initialized. Initial pruning would have been performed
+ * by ExecDoInitialPruning() if necessary, and the bitmapset of surviving
+ * subplans' indexes would have been stored as the part_prune_index'th element
+ * of EState.es_part_prune_results.
+ *
+ * If subplans were pruned during initial pruning, the subplan_map arrays in
+ * the returned PartitionPruneState are re-sequenced to exclude those subplans,
+ * but only if the maps will be needed for subsequent execution pruning passes.
*/
PartitionPruneState *
-ExecInitPartitionPruning(PlanState *planstate,
- int n_total_subplans,
- int part_prune_index,
- Bitmapset *relids,
- Bitmapset **initially_valid_subplans)
+ExecInitPartitionExecPruning(PlanState *planstate,
+ int n_total_subplans,
+ int part_prune_index,
+ Bitmapset *relids,
+ Bitmapset **initially_valid_subplans)
{
PartitionPruneState *prunestate;
EState *estate = planstate->state;
@@ -1818,11 +1875,12 @@ ExecInitPartitionPruning(PlanState *planstate,
bmsToString(pruneinfo->relids), part_prune_index,
bmsToString(relids));
- /* We may need an expression context to evaluate partition exprs */
- ExecAssignExprContext(estate, planstate);
-
- /* Create the working data structure for pruning */
- prunestate = CreatePartitionPruneState(estate, pruneinfo);
+ /*
+ * ExecDoInitialPruning() must have initialized the PartitionPruneState to
+ * perform the initial pruning.
+ */
+ prunestate = list_nth(estate->es_part_prune_states, part_prune_index);
+ Assert(prunestate != NULL);
/*
* Store PlanState for using it to initialize exec pruning contexts later
@@ -1831,11 +1889,11 @@ ExecInitPartitionPruning(PlanState *planstate,
if (prunestate->do_exec_prune)
prunestate->parent_plan = planstate;
- /*
- * Perform an initial partition prune pass, if required.
- */
+ /* Use the result of initial pruning done by ExecDoInitialPruning(). */
if (prunestate->do_initial_prune)
- *initially_valid_subplans = ExecFindMatchingSubPlans(prunestate, true);
+ *initially_valid_subplans = list_nth_node(Bitmapset,
+ estate->es_part_prune_results,
+ part_prune_index);
else
{
/* No pruning, so we'll need to initialize all subplans */
@@ -1846,8 +1904,8 @@ ExecInitPartitionPruning(PlanState *planstate,
/*
* Re-sequence subplan indexes contained in prunestate to account for any
- * that were removed above due to initial pruning. No need to do this if
- * no steps were removed.
+ * that were removed due to initial pruning. No need to do this if no
+ * partitions were removed.
*/
if (bms_num_members(*initially_valid_subplans) < n_total_subplans)
{
@@ -1868,6 +1926,8 @@ ExecInitPartitionPruning(PlanState *planstate,
/*
* CreatePartitionPruneState
* Build the data structure required for calling ExecFindMatchingSubPlans
+ * Details stored include how to map the partition index returned by the
+ * partition pruning code into subplan indexes.
*
* 'pruneinfo' is a PartitionPruneInfo as generated by
* make_partition_pruneinfo. Here we build a PartitionPruneState containing a
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index de7ebab5c2..b77ff84840 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -143,11 +143,11 @@ ExecInitAppend(Append *node, EState *estate, int eflags)
* subplans to initialize (validsubplans) by taking into account the
* result of performing initial pruning if any.
*/
- prunestate = ExecInitPartitionPruning(&appendstate->ps,
- list_length(node->appendplans),
- node->part_prune_index,
- node->apprelids,
- &validsubplans);
+ prunestate = ExecInitPartitionExecPruning(&appendstate->ps,
+ list_length(node->appendplans),
+ node->part_prune_index,
+ node->apprelids,
+ &validsubplans);
appendstate->as_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index 3ed91808dd..e2032afcb7 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -91,11 +91,11 @@ ExecInitMergeAppend(MergeAppend *node, EState *estate, int eflags)
* subplans to initialize (validsubplans) by taking into account the
* result of performing initial pruning if any.
*/
- prunestate = ExecInitPartitionPruning(&mergestate->ps,
- list_length(node->mergeplans),
- node->part_prune_index,
- node->apprelids,
- &validsubplans);
+ prunestate = ExecInitPartitionExecPruning(&mergestate->ps,
+ list_length(node->mergeplans),
+ node->part_prune_index,
+ node->apprelids,
+ &validsubplans);
mergestate->ms_prune_state = prunestate;
nplans = bms_num_members(validsubplans);
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 7e470c82f6..0b34784922 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -133,11 +133,12 @@ typedef struct PartitionPruneState
PartitionPruningData *partprunedata[FLEXIBLE_ARRAY_MEMBER];
} PartitionPruneState;
-extern PartitionPruneState *ExecInitPartitionPruning(PlanState *planstate,
- int n_total_subplans,
- int part_prune_index,
- Bitmapset *relids,
- Bitmapset **initially_valid_subplans);
+extern void ExecDoInitialPruning(EState *estate);
+extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
+ int n_total_subplans,
+ int part_prune_index,
+ Bitmapset *relids,
+ Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
bool initial_prune);
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index b1471e68fe..f93061c7bf 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -640,6 +640,8 @@ typedef struct EState
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
List *es_part_prune_infos; /* List of PartitionPruneInfo */
+ List *es_part_prune_states; /* List of PartitionPruneState */
+ List *es_part_prune_results; /* List of Bitmapset */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
--
2.43.0
[application/octet-stream] v58-0004-Defer-locking-of-runtime-prunable-relations-in-c.patch (111.3K, 5-v58-0004-Defer-locking-of-runtime-prunable-relations-in-c.patch)
From bf901dbe761f5a3fb2b10c480a66780f59af1913 Mon Sep 17 00:00:00 2001
From: Amit Langote <[email protected]>
Date: Wed, 4 Dec 2024 16:16:56 +0900
Subject: [PATCH v58 4/4] Defer locking of runtime-prunable relations in cached
plans
AcquireExecutorLocks() in plancache.c locks all relations in a
plan's range table to ensure the plan is safe for execution. However,
this approach also locks runtime-prunable relations that will later
be pruned during "initial" runtime pruning, introducing unnecessary
overhead. This commit defers locking for such relations and ensures
that any invalidation caused by this deferral is handled by
replanning when necessary.
* Locking changes:
The planner now tracks "unprunable" relations using the new
PlannedStmt.unprunableRelids field, which is computed during
set_plan_refs() by subtracting runtime-prunable relation RT indexes
(identified from PartitionPruneInfos) from all RT indexes.
AcquireExecutorLocks() locks only these unprunable relations.
During executor startup, ExecDoInitialPruning() identifies unpruned
partitions and acquires locks on them. A new es_unpruned_relids field
is added to EState to ensure that subsequent initialization steps
process only locked relations. It is initially populated with
PlannedStmt.unprunableRelids and updated by ExecDoInitialPruning()
with the RT indexes of the unpruned partitions. To populate
es_unpruned_relids, PartitionedRelPruneInfo and
PartitionedRelPruningData now include a leafpart_rti_map[] to map
partition indexes (as determined by get_matching_partitions()) to
their corresponding RT indexes.
Executor code that works with child result relations and child
RowMarks requires adjustments because pruned relations are no longer
locked; without such adjustments, the executor could attempt to
process result relations or RowMarks for pruned partitions.
Specifically, ExecInitModifyTable() trims result relation-related
lists resultRelations, withCheckOptionLists, returningLists, and
updateColnosLists to include only unpruned partitions by checking
es_unpruned_relids. It also creates ResultRelInfo structs only for
these unpruned partitions. Similarly, child RowMarks whose owning
relations are pruned are now ignored, again by checking
es_unpruned_relids, ensuring only those associated with unpruned
relations are processed.
Finally, ExecCheckPermissions() now includes an Assert to verify that
all relations undergoing permission checks have been properly locked.
This safeguard helps catch any cases where relations that should have
been added to the unprunableRelids set were missed during planning.
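As a rough illustration of the set arithmetic described above, here is a
minimal, self-contained sketch. It is not the patch's code: it models a
Bitmapset as a 64-bit mask (so RT indexes must be below 64), and the names
RelidSet, compute_unprunable() and compute_unpruned() are hypothetical. It
also assumes that a leafpart_rti_map[] entry of 0 means "no leaf partition
at this index".

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t RelidSet;

static RelidSet
relid_add(RelidSet s, int rti)
{
    return s | (UINT64_C(1) << rti);
}

static int
relid_is_member(RelidSet s, int rti)
{
    return (int) ((s >> rti) & 1);
}

/* Planner side: all RT indexes minus the runtime-prunable ones. */
static RelidSet
compute_unprunable(RelidSet all_relids, RelidSet prunable_relids)
{
    return all_relids & ~prunable_relids;
}

/*
 * Executor side: start from the unprunable set and add the RT index of
 * every partition that survived initial pruning.  leafpart_rti_map[i] is
 * the RT index of partition i; 0 is assumed to mean "not a leaf".
 */
static RelidSet
compute_unpruned(RelidSet unprunable,
                 const int *leafpart_rti_map, int nparts,
                 const int *surviving, int nsurviving)
{
    RelidSet    result = unprunable;

    for (int i = 0; i < nsurviving; i++)
    {
        int         partidx = surviving[i];

        assert(partidx >= 0 && partidx < nparts);
        if (leafpart_rti_map[partidx] > 0)
            result = relid_add(result, leafpart_rti_map[partidx]);
    }
    return result;
}
```

In this model only the relations in the final set would be locked, which is
why later initialization steps have to consult it before touching a child
relation.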
* Changes related to handling plan invalidation:
Deferring locks introduces a window where prunable relations may be
altered by concurrent DDL, invalidating the plan. To ensure
correctness, a new ExecutorStartCachedPlan() function that wraps
ExecutorStart() is added to detect and handle invalid plans caused by
deferred locking. When invalidation occurs, ExecutorStartCachedPlan()
updates all plans in the CachedPlan using the new UpdateCachedPlan()
function and retries execution with the refreshed plan.
UpdateCachedPlan() replaces stale plans in CachedPlan.stmt_list. To
enable this, a new CachedPlan.stmt_context is introduced as a child
context of CachedPlan.context. This separates PlannedStmts from the
parent context, allowing UpdateCachedPlan() to free old PlannedStmts
when replacing them with new plans, while preserving the CachedPlan
structure, including the List containing the statements.
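The retry behaviour described above can be pictured with the following toy
model. It is only a sketch: ToyCachedPlan and the toy_* functions are
hypothetical stand-ins for CachedPlan, ExecutorStart(), UpdateCachedPlan()
and ExecutorStartCachedPlan(), and the invalidation is simulated with a
counter rather than real sinval messages.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct ToyCachedPlan
{
    bool        is_valid;               /* cleared when the plan goes stale */
    int         generation;             /* bumped on every replan */
    int         pending_invalidations;  /* simulated concurrent DDL events */
} ToyCachedPlan;

/* Models ExecutorStart(): taking deferred locks may reveal a stale plan. */
static void
toy_executor_start(ToyCachedPlan *cplan)
{
    if (cplan->pending_invalidations > 0)
    {
        cplan->pending_invalidations--;
        cplan->is_valid = false;
    }
}

/* Models UpdateCachedPlan(): replace the stale plan with a fresh one. */
static void
toy_update_cached_plan(ToyCachedPlan *cplan)
{
    cplan->generation++;
    cplan->is_valid = true;
}

/* Models the retry loop; returns the number of start attempts made. */
static int
toy_start_cached_plan(ToyCachedPlan *cplan)
{
    int         attempts = 0;

    for (;;)
    {
        attempts++;
        toy_executor_start(cplan);
        if (cplan->is_valid)
            break;

        /* The real code cleans up the aborted execution state here. */
        toy_update_cached_plan(cplan);
    }
    return attempts;
}
```

With two simulated invalidations, the loop starts the plan three times and
replans twice before succeeding.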
* Testing
Tests using the delay_execution module verify scenarios where a cached
plan becomes invalid due to changes in prunable relations after
deferred locks are taken.
* Note to extension authors:
ExecutorStart_hook implementations should verify plan validity after
calling standard_ExecutorStart() to ensure they are not working with
an invalid plan. The following check can be used:
/* The plan may have become invalid during ExecutorStart() */
if (!ExecPlanStillValid(queryDesc->estate))
return;
Additionally, any RT index inspected by an extension should be
checked against EState.es_unpruned_relids before processing the
relation, particularly if the relation could be a child relation
subject to initial partition pruning. This is necessary because
extensions can no longer assume that all range table relations are
locked; only those in es_unpruned_relids are. For reference, see
how InitPlan() processes entries from PlannedStmt.rowMarks.
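To make the two checks above concrete, here is a small stand-alone model of
what a hook might do after standard_ExecutorStart() returns. It does not
use PostgreSQL headers: ToyEState and toy_hook_after_start() are
hypothetical, the plan_still_valid flag stands in for what
ExecPlanStillValid() would report, and the 64-bit mask stands in for
es_unpruned_relids.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef struct ToyEState
{
    bool        plan_still_valid;   /* stand-in for ExecPlanStillValid() */
    uint64_t    unpruned_relids;    /* bit i set: RT index i is locked */
} ToyEState;

/*
 * Returns how many of the given RT indexes the hook would process.
 * Nothing is processed when the plan became invalid during startup, and
 * relations outside unpruned_relids are skipped because they are unlocked.
 */
static int
toy_hook_after_start(const ToyEState *estate, const int *rtis, int nrtis)
{
    int         processed = 0;

    /* The plan may have become invalid during executor startup. */
    if (!estate->plan_still_valid)
        return 0;

    for (int i = 0; i < nrtis; i++)
    {
        int         rti = rtis[i];

        assert(rti >= 0 && rti < 64);
        if ((estate->unpruned_relids >> rti) & 1)
            processed++;        /* safe: relation is known to be locked */
    }
    return processed;
}
```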
Reviewed-by: Robert Haas
Reviewed-by: Tomas Vondra
Discussion: https://postgr.es/m/CA+HiwqFGkMSge6TgC9KQzde0ohpAycLQuV7ooitEEpbKB0O_mg@mail.gmail.com
---
contrib/auto_explain/auto_explain.c | 4 +
.../pg_stat_statements/pg_stat_statements.c | 4 +
src/backend/commands/copyto.c | 2 +-
src/backend/commands/createas.c | 2 +-
src/backend/commands/explain.c | 16 +-
src/backend/commands/extension.c | 1 +
src/backend/commands/matview.c | 2 +-
src/backend/commands/portalcmds.c | 1 +
src/backend/commands/prepare.c | 9 +-
src/backend/commands/trigger.c | 14 +
src/backend/executor/README | 35 ++-
src/backend/executor/execMain.c | 124 +++++++-
src/backend/executor/execParallel.c | 9 +-
src/backend/executor/execPartition.c | 88 +++++-
src/backend/executor/execUtils.c | 1 +
src/backend/executor/functions.c | 1 +
src/backend/executor/nodeAppend.c | 8 +-
src/backend/executor/nodeLockRows.c | 9 +-
src/backend/executor/nodeMergeAppend.c | 2 +-
src/backend/executor/nodeModifyTable.c | 70 ++++-
src/backend/executor/spi.c | 23 +-
src/backend/optimizer/plan/planner.c | 2 +
src/backend/optimizer/plan/setrefs.c | 29 +-
src/backend/partitioning/partprune.c | 22 ++
src/backend/tcop/postgres.c | 4 +-
src/backend/tcop/pquery.c | 39 ++-
src/backend/utils/cache/plancache.c | 204 +++++++++++--
src/backend/utils/mmgr/portalmem.c | 4 +-
src/include/commands/explain.h | 6 +-
src/include/commands/trigger.h | 1 +
src/include/executor/execPartition.h | 6 +-
src/include/executor/execdesc.h | 2 +
src/include/executor/executor.h | 28 ++
src/include/nodes/execnodes.h | 13 +
src/include/nodes/pathnodes.h | 8 +
src/include/nodes/plannodes.h | 7 +
src/include/utils/plancache.h | 50 +++-
src/include/utils/portal.h | 4 +-
src/test/modules/delay_execution/Makefile | 3 +-
.../modules/delay_execution/delay_execution.c | 63 +++-
.../expected/cached-plan-inval.out | 282 ++++++++++++++++++
src/test/modules/delay_execution/meson.build | 1 +
.../specs/cached-plan-inval.spec | 80 +++++
src/test/regress/expected/partition_prune.out | 44 +++
src/test/regress/sql/partition_prune.sql | 18 ++
45 files changed, 1237 insertions(+), 108 deletions(-)
create mode 100644 src/test/modules/delay_execution/expected/cached-plan-inval.out
create mode 100644 src/test/modules/delay_execution/specs/cached-plan-inval.spec
diff --git a/contrib/auto_explain/auto_explain.c b/contrib/auto_explain/auto_explain.c
index 623a674f99..8b5eaf3ef3 100644
--- a/contrib/auto_explain/auto_explain.c
+++ b/contrib/auto_explain/auto_explain.c
@@ -298,6 +298,10 @@ explain_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
if (auto_explain_enabled())
{
/*
diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 49c657b3e0..b11691ae26 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -994,6 +994,10 @@ pgss_ExecutorStart(QueryDesc *queryDesc, int eflags)
else
standard_ExecutorStart(queryDesc, eflags);
+ /* The plan may have become invalid during standard_ExecutorStart() */
+ if (!ExecPlanStillValid(queryDesc->estate))
+ return;
+
/*
* If query has queryId zero, don't track it. This prevents double
* counting of optimizable statements that are directly contained in
diff --git a/src/backend/commands/copyto.c b/src/backend/commands/copyto.c
index f55e6d9675..27b6f6f069 100644
--- a/src/backend/commands/copyto.c
+++ b/src/backend/commands/copyto.c
@@ -556,7 +556,7 @@ BeginCopyTo(ParseState *pstate,
((DR_copy *) dest)->cstate = cstate;
/* Create a QueryDesc requesting no output */
- cstate->queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ cstate->queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(),
InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/createas.c b/src/backend/commands/createas.c
index 5c92e48a56..0cc74dd45a 100644
--- a/src/backend/commands/createas.c
+++ b/src/backend/commands/createas.c
@@ -332,7 +332,7 @@ ExecCreateTableAs(ParseState *pstate, CreateTableAsStmt *stmt,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, pstate->p_sourcetext,
+ queryDesc = CreateQueryDesc(plan, NULL, pstate->p_sourcetext,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index a3f1d53d7a..b5c734e75c 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -512,7 +512,8 @@ standard_ExplainOneQuery(Query *query, int cursorOptions,
}
/* run it (if needed) and produce output */
- ExplainOnePlan(plan, into, es, queryString, params, queryEnv,
+ ExplainOnePlan(plan, NULL, NULL, -1, into, es, queryString, params,
+ queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
}
@@ -634,7 +635,9 @@ ExplainOneUtility(Node *utilityStmt, IntoClause *into, ExplainState *es,
* to call it.
*/
void
-ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
+ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int query_index,
+ IntoClause *into, ExplainState *es,
const char *queryString, ParamListInfo params,
QueryEnvironment *queryEnv, const instr_time *planduration,
const BufferUsage *bufusage,
@@ -690,7 +693,7 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
dest = None_Receiver;
/* Create a QueryDesc for the query */
- queryDesc = CreateQueryDesc(plannedstmt, queryString,
+ queryDesc = CreateQueryDesc(plannedstmt, cplan, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, instrument_option);
@@ -704,8 +707,11 @@ ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into, ExplainState *es,
if (into)
eflags |= GetIntoRelEFlags(into);
- /* call ExecutorStart to prepare the plan for execution */
- ExecutorStart(queryDesc, eflags);
+ /* Prepare the plan for execution. */
+ if (queryDesc->cplan)
+ ExecutorStartCachedPlan(queryDesc, eflags, plansource, query_index);
+ else
+ ExecutorStart(queryDesc, eflags);
/* Execute the plan for statistics if asked for */
if (es->analyze)
diff --git a/src/backend/commands/extension.c b/src/backend/commands/extension.c
index af6bd8ff42..7d4a3c5b8d 100644
--- a/src/backend/commands/extension.c
+++ b/src/backend/commands/extension.c
@@ -907,6 +907,7 @@ execute_sql_string(const char *sql, const char *filename)
QueryDesc *qdesc;
qdesc = CreateQueryDesc(stmt,
+ NULL,
sql,
GetActiveSnapshot(), NULL,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/matview.c b/src/backend/commands/matview.c
index 010097873d..69be74b4bd 100644
--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -438,7 +438,7 @@ refresh_matview_datafill(DestReceiver *dest, Query *query,
UpdateActiveSnapshotCommandId();
/* Create a QueryDesc, redirecting output to our tuple receiver */
- queryDesc = CreateQueryDesc(plan, queryString,
+ queryDesc = CreateQueryDesc(plan, NULL, queryString,
GetActiveSnapshot(), InvalidSnapshot,
dest, NULL, NULL, 0);
diff --git a/src/backend/commands/portalcmds.c b/src/backend/commands/portalcmds.c
index ac52ca25e9..48cf0b84e5 100644
--- a/src/backend/commands/portalcmds.c
+++ b/src/backend/commands/portalcmds.c
@@ -117,6 +117,7 @@ PerformCursorOpen(ParseState *pstate, DeclareCursorStmt *cstmt, ParamListInfo pa
queryString,
CMDTAG_SELECT, /* cursor's query is always a SELECT */
list_make1(plan),
+ NULL,
NULL);
/*----------
diff --git a/src/backend/commands/prepare.c b/src/backend/commands/prepare.c
index a93f970a29..45fd63d2b1 100644
--- a/src/backend/commands/prepare.c
+++ b/src/backend/commands/prepare.c
@@ -202,7 +202,8 @@ ExecuteQuery(ParseState *pstate,
query_string,
entry->plansource->commandTag,
plan_list,
- cplan);
+ cplan,
+ entry->plansource);
/*
* For CREATE TABLE ... AS EXECUTE, we must verify that the prepared
@@ -582,6 +583,7 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
MemoryContextCounters mem_counters;
MemoryContext planner_ctx = NULL;
MemoryContext saved_ctx = NULL;
+ int query_index = 0;
if (es->memory)
{
@@ -654,7 +656,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
PlannedStmt *pstmt = lfirst_node(PlannedStmt, p);
if (pstmt->commandType != CMD_UTILITY)
- ExplainOnePlan(pstmt, into, es, query_string, paramLI, pstate->p_queryEnv,
+ ExplainOnePlan(pstmt, cplan, entry->plansource, query_index,
+ into, es, query_string, paramLI, pstate->p_queryEnv,
&planduration, (es->buffers ? &bufusage : NULL),
es->memory ? &mem_counters : NULL);
else
@@ -665,6 +668,8 @@ ExplainExecuteQuery(ExecuteStmt *execstmt, IntoClause *into, ExplainState *es,
/* Separate plans with an appropriate separator */
if (lnext(plan_list, p) != NULL)
ExplainSeparatePlans(es);
+
+ query_index++;
}
if (estate)
diff --git a/src/backend/commands/trigger.c b/src/backend/commands/trigger.c
index 09356e46d1..79572ec8f1 100644
--- a/src/backend/commands/trigger.c
+++ b/src/backend/commands/trigger.c
@@ -5123,6 +5123,20 @@ AfterTriggerEndQuery(EState *estate)
afterTriggers.query_depth--;
}
+/* ----------
+ * AfterTriggerAbortQuery()
+ *
+ * Called by ExecutorEnd() if the query execution was aborted due to the
+ * plan becoming invalid during initialization.
+ * ----------
+ */
+void
+AfterTriggerAbortQuery(void)
+{
+ /* Revert the actions of AfterTriggerBeginQuery(). */
+ afterTriggers.query_depth--;
+}
+
/*
* AfterTriggerFreeQuery
diff --git a/src/backend/executor/README b/src/backend/executor/README
index 642d63be61..449c6068ae 100644
--- a/src/backend/executor/README
+++ b/src/backend/executor/README
@@ -280,6 +280,28 @@ are typically reset to empty once per tuple. Per-tuple contexts are usually
associated with ExprContexts, and commonly each PlanState node has its own
ExprContext to evaluate its qual and targetlist expressions in.
+Relation Locking
+----------------
+
+Typically, when the executor initializes a plan tree for execution, it doesn't
+lock non-index relations if the plan tree is freshly generated and not derived
+from a CachedPlan. This is because such locks have already been established
+during the query's parsing, rewriting, and planning phases. However, with a
+cached plan tree, some relations may remain unlocked. The function
+AcquireExecutorLocks() only locks unprunable relations in the plan, deferring
+the locking of prunable ones to executor initialization. This avoids
+unnecessary locking of relations that will be pruned during "initial" runtime
+pruning in ExecDoInitialPruning().
+
+This approach creates a window where a cached plan tree with child tables
+could become outdated if another backend modifies these tables before
+ExecDoInitialPruning() locks them. As a result, the executor has the added duty
+to verify the plan tree's validity whenever it locks a child table after
+doing initial pruning. This validation is done by checking the CachedPlan.is_valid
+flag. If the plan tree is outdated (is_valid = false), the executor stops
+further initialization, cleans up anything in EState that would have been
+allocated up to that point, and retries execution after recreating the
+invalid plan in the CachedPlan.
Query Processing Control Flow
-----------------------------
@@ -288,11 +310,13 @@ This is a sketch of control flow for full query processing:
CreateQueryDesc
- ExecutorStart
+ ExecutorStart or ExecutorStartCachedPlan
CreateExecutorState
creates per-query context
- switch to per-query context to run ExecInitNode
+ switch to per-query context to run ExecDoInitialPruning and ExecInitNode
AfterTriggerBeginQuery
+ ExecDoInitialPruning
+ does initial pruning and locks surviving partitions if needed
ExecInitNode --- recursively scans plan tree
ExecInitNode
recurse into subsidiary nodes
@@ -316,7 +340,12 @@ This is a sketch of control flow for full query processing:
FreeQueryDesc
-Per above comments, it's not really critical for ExecEndNode to free any
+As mentioned in the "Relation Locking" section, if the plan tree is found to
+be stale after locking partitions in ExecDoInitialPruning(), the control is
+immediately returned to ExecutorStartCachedPlan(), which will create a new plan
+tree and perform the steps starting from CreateExecutorState() again.
+
+Per above comments, it's not really critical for ExecEndPlan to free any
memory; it'll all go away in FreeExecutorState anyway. However, we do need to
be careful to close relations, drop buffer pins, etc, so we do need to scan
the plan state tree to find these sorts of resources.
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 5dc46f2e95..9543d9490c 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -55,11 +55,13 @@
#include "parser/parse_relation.h"
#include "pgstat.h"
#include "rewrite/rewriteHandler.h"
+#include "storage/lmgr.h"
#include "tcop/utility.h"
#include "utils/acl.h"
#include "utils/backend_status.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
+#include "utils/plancache.h"
#include "utils/rls.h"
#include "utils/snapmgr.h"
@@ -137,6 +139,62 @@ ExecutorStart(QueryDesc *queryDesc, int eflags)
standard_ExecutorStart(queryDesc, eflags);
}
+/*
+ * ExecutorStartCachedPlan
+ * Start execution for a given query in the CachedPlanSource, replanning
+ * if the plan is invalidated due to deferred locks taken during the
+ * plan's initialization
+ *
+ * This function handles cases where the CachedPlan given in queryDesc->cplan
+ * might become invalid during the initialization of the plan given in
+ * queryDesc->plannedstmt, particularly when prunable relations in it are
+ * locked after performing initial pruning. If the locks invalidate the plan,
+ * the function calls UpdateCachedPlan() to replan all queries in the
+ * CachedPlan, and then retries initialization.
+ *
+ * The function repeats the process until ExecutorStart() successfully
+ * initializes the plan, that is, without the CachedPlan becoming invalid.
+ */
+void
+ExecutorStartCachedPlan(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource,
+ int query_index)
+{
+ if (unlikely(queryDesc->cplan == NULL))
+ elog(ERROR, "ExecutorStartCachedPlan(): missing CachedPlan");
+ if (unlikely(plansource == NULL))
+ elog(ERROR, "ExecutorStartCachedPlan(): missing CachedPlanSource");
+
+ /*
+ * Loop and retry with an updated plan until no further invalidation
+ * occurs.
+ */
+ while (1)
+ {
+ ExecutorStart(queryDesc, eflags);
+ if (!CachedPlanValid(queryDesc->cplan))
+ {
+ /*
+ * Clean up the current execution state before creating the new
+ * plan to retry ExecutorStart(). Mark execution as aborted to
+ * ensure that AFTER trigger state is properly reset.
+ */
+ queryDesc->estate->es_aborted = true;
+ ExecutorEnd(queryDesc);
+
+ /* Retry ExecutorStart() with an updated plan tree. */
+ queryDesc->plannedstmt = UpdateCachedPlan(plansource, query_index,
+ queryDesc->queryEnv);
+ }
+ else
+ /*
+ * Exit the loop if the plan is initialized successfully and no
+ * sinval messages were received that invalidated the CachedPlan.
+ */
+ break;
+ }
+}
+
void
standard_ExecutorStart(QueryDesc *queryDesc, int eflags)
{
@@ -320,6 +378,7 @@ standard_ExecutorRun(QueryDesc *queryDesc,
estate = queryDesc->estate;
Assert(estate != NULL);
+ Assert(!estate->es_aborted);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/* caller must ensure the query's snapshot is active */
@@ -426,8 +485,11 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
Assert(estate != NULL);
Assert(!(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
- /* This should be run once and only once per Executor instance */
- Assert(!estate->es_finished);
+ /*
+ * This should be run once and only once per Executor instance and never
+ * if the execution was aborted.
+ */
+ Assert(!estate->es_finished && !estate->es_aborted);
/* Switch into per-query memory context */
oldcontext = MemoryContextSwitchTo(estate->es_query_cxt);
@@ -490,11 +552,10 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
(PgStat_Counter) estate->es_parallel_workers_launched);
/*
- * Check that ExecutorFinish was called, unless in EXPLAIN-only mode. This
- * Assert is needed because ExecutorFinish is new as of 9.1, and callers
- * might forget to call it.
+ * Check that ExecutorFinish was called, unless in EXPLAIN-only mode or if
+ * execution was aborted.
*/
- Assert(estate->es_finished ||
+ Assert(estate->es_finished || estate->es_aborted ||
(estate->es_top_eflags & EXEC_FLAG_EXPLAIN_ONLY));
/*
@@ -508,6 +569,14 @@ standard_ExecutorEnd(QueryDesc *queryDesc)
UnregisterSnapshot(estate->es_snapshot);
UnregisterSnapshot(estate->es_crosscheck_snapshot);
+ /*
+ * Reset AFTER trigger module if the query execution was aborted.
+ */
+ if (estate->es_aborted &&
+ !(estate->es_top_eflags &
+ (EXEC_FLAG_SKIP_TRIGGERS | EXEC_FLAG_EXPLAIN_ONLY)))
+ AfterTriggerAbortQuery();
+
/*
* Must switch out of context before destroying it
*/
@@ -606,6 +675,21 @@ ExecCheckPermissions(List *rangeTable, List *rteperminfos,
(rte->rtekind == RTE_SUBQUERY &&
rte->relkind == RELKIND_VIEW));
+ /*
+ * Ensure that we have at least an AccessShareLock on relations
+ * whose permissions need to be checked.
+ *
+ * Skip this check in a parallel worker because locks won't be
+ * taken until ExecInitNode() performs plan initialization.
+ *
+ * XXX: ExecCheckPermissions() in a parallel worker may be
+ * redundant with the checks done in the leader process, so this
+ * should be reviewed to ensure it’s necessary.
+ */
+ Assert(IsParallelWorker() ||
+ CheckRelationOidLockedByMe(rte->relid, AccessShareLock,
+ true));
+
(void) getRTEPermissionInfo(rteperminfos, rte);
/* Many-to-one mapping not allowed */
Assert(!bms_is_member(rte->perminfoindex, indexset));
@@ -838,6 +922,7 @@ InitPlan(QueryDesc *queryDesc, int eflags)
{
CmdType operation = queryDesc->operation;
PlannedStmt *plannedstmt = queryDesc->plannedstmt;
+ CachedPlan *cachedplan = queryDesc->cplan;
Plan *plan = plannedstmt->planTree;
List *rangeTable = plannedstmt->rtable;
EState *estate = queryDesc->estate;
@@ -857,7 +942,9 @@ InitPlan(QueryDesc *queryDesc, int eflags)
ExecInitRangeTable(estate, rangeTable, plannedstmt->permInfos);
estate->es_plannedstmt = plannedstmt;
+ estate->es_cachedplan = cachedplan;
estate->es_part_prune_infos = plannedstmt->partPruneInfos;
+ estate->es_unpruned_relids = bms_copy(plannedstmt->unprunableRelids);
/*
* Perform runtime "initial" pruning to identify which child subplans,
@@ -867,9 +954,15 @@ InitPlan(QueryDesc *queryDesc, int eflags)
* executed, are saved in es_part_prune_results. These results correspond
* to each PartitionPruneInfo entry, and the es_part_prune_results list is
* parallel to es_part_prune_infos.
+ *
+ * This will also add the RT indexes of surviving leaf partitions to
+ * es_unpruned_relids.
*/
ExecDoInitialPruning(estate);
+ if (!ExecPlanStillValid(estate))
+ return;
+
/*
* Next, build the ExecRowMark array from the PlanRowMark(s), if any.
*/
@@ -884,8 +977,13 @@ InitPlan(QueryDesc *queryDesc, int eflags)
Relation relation;
ExecRowMark *erm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unpruned_relids))
continue;
/* get relation's OID (will produce InvalidOid if subquery) */
@@ -2857,6 +2955,9 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
* the snapshot, rangetable, and external Param info. They need their own
* copies of local state, including a tuple table, es_param_exec_vals,
* result-rel info, etc.
+ *
+ * es_cachedplan is not copied because EPQ plan execution does not acquire
+ * any new locks that could invalidate the CachedPlan.
*/
rcestate->es_direction = ForwardScanDirection;
rcestate->es_snapshot = parentestate->es_snapshot;
@@ -2928,6 +3029,13 @@ EvalPlanQualStart(EPQState *epqstate, Plan *planTree)
}
}
+ /*
+ * Copy es_unpruned_relids so that RowMarks of pruned relations are
+ * ignored in ExecInitLockRows() and ExecInitModifyTable() when
+ * initializing the plan trees below.
+ */
+ rcestate->es_unpruned_relids = parentestate->es_unpruned_relids;
+
/*
* Initialize private state information for each SubPlan. We must do this
* before running ExecInitNode on the main query tree, since
diff --git a/src/backend/executor/execParallel.c b/src/backend/executor/execParallel.c
index b01a2fdfdd..0c2da25fab 100644
--- a/src/backend/executor/execParallel.c
+++ b/src/backend/executor/execParallel.c
@@ -1257,8 +1257,15 @@ ExecParallelGetQueryDesc(shm_toc *toc, DestReceiver *receiver,
paramspace = shm_toc_lookup(toc, PARALLEL_KEY_PARAMLISTINFO, false);
paramLI = RestoreParamList(&paramspace);
- /* Create a QueryDesc for the query. */
+ /*
+ * Create a QueryDesc for the query. We pass NULL for cachedplan, because
+ * we don't have a pointer to the CachedPlan in the leader's process. It's
+ * fine because the only reason the executor needs to see it is to decide
+ * if it should take locks on certain relations, but parallel workers
+ * always take locks anyway.
+ */
return CreateQueryDesc(pstmt,
+ NULL,
queryString,
GetActiveSnapshot(), InvalidSnapshot,
receiver, paramLI, NULL, instrument_options);
diff --git a/src/backend/executor/execPartition.c b/src/backend/executor/execPartition.c
index 46dd1c77a3..93cdae6f89 100644
--- a/src/backend/executor/execPartition.c
+++ b/src/backend/executor/execPartition.c
@@ -26,6 +26,7 @@
#include "partitioning/partdesc.h"
#include "partitioning/partprune.h"
#include "rewrite/rewriteManip.h"
+#include "storage/lmgr.h"
#include "utils/acl.h"
#include "utils/lsyscache.h"
#include "utils/partcache.h"
@@ -194,7 +195,8 @@ static void find_matching_subplans_recurse(PlanState *parent_plan,
PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans);
+ Bitmapset **validsubplans,
+ Bitmapset **validsubplan_rtis);
/*
@@ -1764,7 +1766,8 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
* ExecDoInitialPruning:
* Perform runtime "initial" pruning, if necessary, to determine the set
* of child subnodes that need to be initialized during ExecInitNode() for
- * all plan nodes that contain a PartitionPruneInfo.
+ * all plan nodes that contain a PartitionPruneInfo. This also locks, if
+ * needed, the leaf partitions whose subnodes will be initialized.
*
* ExecInitPartitionExecPruning:
* Updates the PartitionPruneState found at given part_prune_index in
@@ -1785,11 +1788,13 @@ adjust_partition_colnos_using_map(List *colnos, AttrMap *attrMap)
*-------------------------------------------------------------------------
*/
+
/*
* ExecDoInitialPruning
* Perform runtime "initial" pruning, if necessary, to determine the set
* of child subnodes that need to be initialized during ExecInitNode() for
- * plan nodes that support partition pruning.
+ * plan nodes that support partition pruning. This also locks, if needed,
+ * the leaf partitions whose subnodes will be initialized.
*
* This function iterates over each PartitionPruneInfo entry in
* estate->es_part_prune_infos. For each entry, it creates a PartitionPruneState
@@ -1810,6 +1815,7 @@ void
ExecDoInitialPruning(EState *estate)
{
ListCell *lc;
+ List *locked_relids = NIL;
foreach(lc, estate->es_part_prune_infos)
{
@@ -1827,10 +1833,48 @@ ExecDoInitialPruning(EState *estate)
* bitmapset or NULL as described in the header comment.
*/
if (prunestate->do_initial_prune)
- validsubplans = ExecFindMatchingSubPlans(prunestate, true);
+ {
+ Bitmapset *validsubplan_rtis = NULL;
+
+ validsubplans = ExecFindMatchingSubPlans(prunestate, true,
+ &validsubplan_rtis);
+ if (ExecShouldLockRelations(estate))
+ {
+ int rtindex = -1;
+
+ rtindex = -1;
+ while ((rtindex = bms_next_member(validsubplan_rtis,
+ rtindex)) >= 0)
+ {
+ RangeTblEntry *rte = exec_rt_fetch(rtindex, estate);
+
+ Assert(rte->rtekind == RTE_RELATION &&
+ rte->rellockmode != NoLock);
+ LockRelationOid(rte->relid, rte->rellockmode);
+ locked_relids = lappend_int(locked_relids, rtindex);
+ }
+ }
+ estate->es_unpruned_relids = bms_add_members(estate->es_unpruned_relids,
+ validsubplan_rtis);
+ }
+
estate->es_part_prune_results = lappend(estate->es_part_prune_results,
validsubplans);
}
+
+ /*
+ * Release the useless locks if the plan won't be executed. This is the
+ * same as what CheckCachedPlan() in plancache.c does.
+ */
+ if (!ExecPlanStillValid(estate))
+ {
+ foreach(lc, locked_relids)
+ {
+ RangeTblEntry *rte = exec_rt_fetch(lfirst_int(lc), estate);
+
+ UnlockRelationOid(rte->relid, rte->rellockmode);
+ }
+ }
}
/*
@@ -2042,8 +2086,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
* The set of partitions that exist now might not be the same that
* existed when the plan was made. The normal case is that it is;
* optimize for that case with a quick comparison, and just copy
- * the subplan_map and make subpart_map point to the one in
- * PruneInfo.
+ * the subplan_map and make subpart_map and leafpart_rti_map point
+ * to the ones in PruneInfo.
*
* For the case where they aren't identical, we could have more
* partitions on either side; or even exactly the same number of
@@ -2062,6 +2106,7 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
sizeof(int) * partdesc->nparts) == 0)
{
pprune->subpart_map = pinfo->subpart_map;
+ pprune->leafpart_rti_map = pinfo->leafpart_rti_map;
memcpy(pprune->subplan_map, pinfo->subplan_map,
sizeof(int) * pinfo->nparts);
}
@@ -2082,6 +2127,7 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
* mismatches.
*/
pprune->subpart_map = palloc(sizeof(int) * partdesc->nparts);
+ pprune->leafpart_rti_map = palloc(sizeof(int) * partdesc->nparts);
for (pp_idx = 0; pp_idx < partdesc->nparts; pp_idx++)
{
@@ -2099,6 +2145,8 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
pinfo->subplan_map[pd_idx];
pprune->subpart_map[pp_idx] =
pinfo->subpart_map[pd_idx];
+ pprune->leafpart_rti_map[pp_idx] =
+ pinfo->leafpart_rti_map[pd_idx];
pd_idx++;
continue;
}
@@ -2136,6 +2184,7 @@ CreatePartitionPruneState(EState *estate, PartitionPruneInfo *pruneinfo)
pprune->subpart_map[pp_idx] = -1;
pprune->subplan_map[pp_idx] = -1;
+ pprune->leafpart_rti_map[pp_idx] = 0;
}
}
@@ -2414,10 +2463,15 @@ PartitionPruneFixSubPlanMap(PartitionPruneState *prunestate,
* Pass initial_prune if PARAM_EXEC Params cannot yet be evaluated. This
* differentiates the initial executor-time pruning step from later
* runtime pruning.
+ *
+ * The caller must pass a non-NULL validsubplan_rtis during initial pruning
+ * to collect the RT indexes of leaf partitions whose subnodes will be
+ * executed. These RT indexes are later added to EState.es_unpruned_relids.
*/
Bitmapset *
ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune)
+ bool initial_prune,
+ Bitmapset **validsubplan_rtis)
{
Bitmapset *result = NULL;
MemoryContext oldcontext;
@@ -2429,6 +2483,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* evaluated *and* there are steps in which to do so.
*/
Assert(initial_prune || prunestate->do_exec_prune);
+ Assert(validsubplan_rtis != NULL || !initial_prune);
/*
* Switch to a temp context to avoid leaking memory in the executor's
@@ -2453,7 +2508,7 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
pprune = &prunedata->partrelprunedata[0];
find_matching_subplans_recurse(prunestate->parent_plan,
prunedata, pprune, initial_prune,
- &result);
+ &result, validsubplan_rtis);
/* Expression eval may have used space in ExprContext too */
if (pprune->exec_context.initialized)
@@ -2470,6 +2525,8 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
/* Copy result out of the temp context before we reset it */
result = bms_copy(result);
+ if (validsubplan_rtis)
+ *validsubplan_rtis = bms_copy(*validsubplan_rtis);
MemoryContextReset(prunestate->prune_context);
@@ -2480,14 +2537,17 @@ ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
* find_matching_subplans_recurse
* Recursive worker function for ExecFindMatchingSubPlans
*
- * Adds valid (non-prunable) subplan IDs to *validsubplans
+ * Adds valid (non-prunable) subplan IDs to *validsubplans and the RT indexes
+ * of their corresponding leaf partitions to *validsubplan_rtis if
+ * it's non-NULL.
*/
static void
find_matching_subplans_recurse(PlanState *parent_plan,
PartitionPruningData *prunedata,
PartitionedRelPruningData *pprune,
bool initial_prune,
- Bitmapset **validsubplans)
+ Bitmapset **validsubplans,
+ Bitmapset **validsubplan_rtis)
{
Bitmapset *partset;
int i;
@@ -2530,8 +2590,13 @@ find_matching_subplans_recurse(PlanState *parent_plan,
while ((i = bms_next_member(partset, i)) >= 0)
{
if (pprune->subplan_map[i] >= 0)
+ {
*validsubplans = bms_add_member(*validsubplans,
pprune->subplan_map[i]);
+ if (validsubplan_rtis)
+ *validsubplan_rtis = bms_add_member(*validsubplan_rtis,
+ pprune->leafpart_rti_map[i]);
+ }
else
{
int partidx = pprune->subpart_map[i];
@@ -2540,7 +2605,8 @@ find_matching_subplans_recurse(PlanState *parent_plan,
find_matching_subplans_recurse(parent_plan,
prunedata,
&prunedata->partrelprunedata[partidx],
- initial_prune, validsubplans);
+ initial_prune, validsubplans,
+ validsubplan_rtis);
else
{
/*
diff --git a/src/backend/executor/execUtils.c b/src/backend/executor/execUtils.c
index bc905a0cdc..b7c914d66c 100644
--- a/src/backend/executor/execUtils.c
+++ b/src/backend/executor/execUtils.c
@@ -147,6 +147,7 @@ CreateExecutorState(void)
estate->es_top_eflags = 0;
estate->es_instrument = 0;
estate->es_finished = false;
+ estate->es_aborted = false;
estate->es_exprcontexts = NIL;
diff --git a/src/backend/executor/functions.c b/src/backend/executor/functions.c
index 8d1fda2ddc..058c10b4d4 100644
--- a/src/backend/executor/functions.c
+++ b/src/backend/executor/functions.c
@@ -840,6 +840,7 @@ postquel_start(execution_state *es, SQLFunctionCachePtr fcache)
dest = None_Receiver;
es->qd = CreateQueryDesc(es->stmt,
+ NULL,
fcache->src,
GetActiveSnapshot(),
InvalidSnapshot,
diff --git a/src/backend/executor/nodeAppend.c b/src/backend/executor/nodeAppend.c
index b77ff84840..89e05b19d0 100644
--- a/src/backend/executor/nodeAppend.c
+++ b/src/backend/executor/nodeAppend.c
@@ -581,7 +581,7 @@ choose_next_subplan_locally(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
}
@@ -648,7 +648,7 @@ choose_next_subplan_for_leader(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
/*
@@ -724,7 +724,7 @@ choose_next_subplan_for_worker(AppendState *node)
else if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
mark_invalid_subplans_as_finished(node);
@@ -877,7 +877,7 @@ ExecAppendAsyncBegin(AppendState *node)
if (!node->as_valid_subplans_identified)
{
node->as_valid_subplans =
- ExecFindMatchingSubPlans(node->as_prune_state, false);
+ ExecFindMatchingSubPlans(node->as_prune_state, false, NULL);
node->as_valid_subplans_identified = true;
classify_matching_subplans(node);
diff --git a/src/backend/executor/nodeLockRows.c b/src/backend/executor/nodeLockRows.c
index 41754ddfea..cfead7ded2 100644
--- a/src/backend/executor/nodeLockRows.c
+++ b/src/backend/executor/nodeLockRows.c
@@ -347,8 +347,13 @@ ExecInitLockRows(LockRows *node, EState *estate, int eflags)
ExecRowMark *erm;
ExecAuxRowMark *aerm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unpruned_relids))
continue;
/* find ExecRowMark and build ExecAuxRowMark */
diff --git a/src/backend/executor/nodeMergeAppend.c b/src/backend/executor/nodeMergeAppend.c
index e2032afcb7..0696dfe7eb 100644
--- a/src/backend/executor/nodeMergeAppend.c
+++ b/src/backend/executor/nodeMergeAppend.c
@@ -219,7 +219,7 @@ ExecMergeAppend(PlanState *pstate)
*/
if (node->ms_valid_subplans == NULL)
node->ms_valid_subplans =
- ExecFindMatchingSubPlans(node->ms_prune_state, false);
+ ExecFindMatchingSubPlans(node->ms_prune_state, false, NULL);
/*
* First time through: pull the first tuple from each valid subplan,
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 1161520f76..7413a29eda 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -636,7 +636,7 @@ ExecInitUpdateProjection(ModifyTableState *mtstate,
Assert(whichrel >= 0 && whichrel < mtstate->mt_nrels);
}
- updateColnos = (List *) list_nth(node->updateColnosLists, whichrel);
+ updateColnos = (List *) list_nth(mtstate->mt_updateColnosLists, whichrel);
/*
* For UPDATE, we use the old tuple to fill up missing values in the tuple
@@ -4282,7 +4282,11 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
ModifyTableState *mtstate;
Plan *subplan = outerPlan(node);
CmdType operation = node->operation;
- int nrels = list_length(node->resultRelations);
+ int nrels;
+ List *resultRelations = NIL;
+ List *withCheckOptionLists = NIL;
+ List *returningLists = NIL;
+ List *updateColnosLists = NIL;
ResultRelInfo *resultRelInfo;
List *arowmarks;
ListCell *l;
@@ -4292,6 +4296,45 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/* check for unsupported flags */
Assert(!(eflags & (EXEC_FLAG_BACKWARD | EXEC_FLAG_MARK)));
+ /*
+ * Only consider unpruned relations for initializing their ResultRelInfo
+ * struct and other fields such as withCheckOptions, etc.
+ */
+ i = 0;
+ foreach(l, node->resultRelations)
+ {
+ Index rti = lfirst_int(l);
+
+ if (bms_is_member(rti, estate->es_unpruned_relids))
+ {
+ resultRelations = lappend_int(resultRelations, rti);
+ if (node->withCheckOptionLists)
+ {
+ List *withCheckOptions = list_nth_node(List,
+ node->withCheckOptionLists,
+ i);
+
+ withCheckOptionLists = lappend(withCheckOptionLists, withCheckOptions);
+ }
+ if (node->returningLists)
+ {
+ List *returningList = list_nth_node(List,
+ node->returningLists,
+ i);
+
+ returningLists = lappend(returningLists, returningList);
+ }
+ if (node->updateColnosLists)
+ {
+ List *updateColnosList = list_nth(node->updateColnosLists, i);
+
+ updateColnosLists = lappend(updateColnosLists, updateColnosList);
+ }
+ }
+ i++;
+ }
+ nrels = list_length(resultRelations);
+
/*
* create state structure
*/
@@ -4312,6 +4355,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
mtstate->mt_merge_inserted = 0;
mtstate->mt_merge_updated = 0;
mtstate->mt_merge_deleted = 0;
+ mtstate->mt_updateColnosLists = updateColnosLists;
/*----------
* Resolve the target relation. This is the same as:
@@ -4329,6 +4373,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
*/
if (node->rootRelation > 0)
{
+ Assert(bms_is_member(node->rootRelation, estate->es_unpruned_relids));
mtstate->rootResultRelInfo = makeNode(ResultRelInfo);
ExecInitResultRelation(estate, mtstate->rootResultRelInfo,
node->rootRelation);
@@ -4343,7 +4388,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/* set up epqstate with dummy subplan data for the moment */
EvalPlanQualInit(&mtstate->mt_epqstate, estate, NULL, NIL,
- node->epqParam, node->resultRelations);
+ node->epqParam, resultRelations);
mtstate->fireBSTriggers = true;
/*
@@ -4361,7 +4406,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
*/
resultRelInfo = mtstate->resultRelInfo;
i = 0;
- foreach(l, node->resultRelations)
+ foreach(l, resultRelations)
{
Index resultRelation = lfirst_int(l);
List *mergeActions = NIL;
@@ -4505,7 +4550,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* Initialize any WITH CHECK OPTION constraints if needed.
*/
resultRelInfo = mtstate->resultRelInfo;
- foreach(l, node->withCheckOptionLists)
+ foreach(l, withCheckOptionLists)
{
List *wcoList = (List *) lfirst(l);
List *wcoExprs = NIL;
@@ -4528,7 +4573,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
/*
* Initialize RETURNING projections if needed.
*/
- if (node->returningLists)
+ if (returningLists)
{
TupleTableSlot *slot;
ExprContext *econtext;
@@ -4537,7 +4582,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* Initialize result tuple slot and assign its rowtype using the first
* RETURNING list. We assume the rest will look the same.
*/
- mtstate->ps.plan->targetlist = (List *) linitial(node->returningLists);
+ mtstate->ps.plan->targetlist = (List *) linitial(returningLists);
/* Set up a slot for the output of the RETURNING projection(s) */
ExecInitResultTupleSlotTL(&mtstate->ps, &TTSOpsVirtual);
@@ -4552,7 +4597,7 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
* Build a projection for each result rel.
*/
resultRelInfo = mtstate->resultRelInfo;
- foreach(l, node->returningLists)
+ foreach(l, returningLists)
{
List *rlist = (List *) lfirst(l);
@@ -4653,8 +4698,13 @@ ExecInitModifyTable(ModifyTable *node, EState *estate, int eflags)
ExecRowMark *erm;
ExecAuxRowMark *aerm;
- /* ignore "parent" rowmarks; they are irrelevant at runtime */
- if (rc->isParent)
+ /*
+ * Ignore "parent" rowmarks, because they are irrelevant at
+ * runtime. Also ignore the rowmarks belonging to child tables
+ * that have been pruned in ExecDoInitialPruning().
+ */
+ if (rc->isParent ||
+ !bms_is_member(rc->rti, estate->es_unpruned_relids))
continue;
/* Find ExecRowMark and build ExecAuxRowMark */
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2fb2e73604..a7f9824e4d 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -70,7 +70,8 @@ static int _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
static ParamListInfo _SPI_convert_params(int nargs, Oid *argtypes,
Datum *Values, const char *Nulls);
-static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount);
+static int _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index);
static void _SPI_error_callback(void *arg);
@@ -1685,7 +1686,8 @@ SPI_cursor_open_internal(const char *name, SPIPlanPtr plan,
query_string,
plansource->commandTag,
stmt_list,
- cplan);
+ cplan,
+ plansource);
/*
* Set up options for portal. Default SCROLL type is chosen the same way
@@ -2500,6 +2502,7 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
CachedPlanSource *plansource = (CachedPlanSource *) lfirst(lc1);
List *stmt_list;
ListCell *lc2;
+ int query_index = 0;
spicallbackarg.query = plansource->query_string;
@@ -2690,14 +2693,16 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
snap = InvalidSnapshot;
qdesc = CreateQueryDesc(stmt,
+ cplan,
plansource->query_string,
snap, crosscheck_snapshot,
dest,
options->params,
_SPI_current->queryEnv,
0);
- res = _SPI_pquery(qdesc, fire_triggers,
- canSetTag ? options->tcount : 0);
+
+ res = _SPI_pquery(qdesc, fire_triggers, canSetTag ? options->tcount : 0,
+ plansource, query_index);
FreeQueryDesc(qdesc);
}
else
@@ -2794,6 +2799,8 @@ _SPI_execute_plan(SPIPlanPtr plan, const SPIExecuteOptions *options,
my_res = res;
goto fail;
}
+
+ query_index++;
}
/* Done with this plan, so release refcount */
@@ -2871,7 +2878,8 @@ _SPI_convert_params(int nargs, Oid *argtypes,
}
static int
-_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
+_SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount,
+ CachedPlanSource *plansource, int query_index)
{
int operation = queryDesc->operation;
int eflags;
@@ -2927,7 +2935,10 @@ _SPI_pquery(QueryDesc *queryDesc, bool fire_triggers, uint64 tcount)
else
eflags = EXEC_FLAG_SKIP_TRIGGERS;
- ExecutorStart(queryDesc, eflags);
+ if (queryDesc->cplan)
+ ExecutorStartCachedPlan(queryDesc, eflags, plansource, query_index);
+ else
+ ExecutorStart(queryDesc, eflags);
ExecutorRun(queryDesc, ForwardScanDirection, tcount, true);
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 9c253e864a..5fe2eeb65c 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -559,6 +559,8 @@ standard_planner(Query *parse, const char *query_string, int cursorOptions,
result->planTree = top_plan;
result->partPruneInfos = glob->partPruneInfos;
result->rtable = glob->finalrtable;
+ result->unprunableRelids = bms_difference(glob->allRelids,
+ glob->prunableRelids);
result->permInfos = glob->finalrteperminfos;
result->resultRelations = glob->resultRelations;
result->appendRelations = glob->appendRelations;
diff --git a/src/backend/optimizer/plan/setrefs.c b/src/backend/optimizer/plan/setrefs.c
index 9f13243d54..053d2687f2 100644
--- a/src/backend/optimizer/plan/setrefs.c
+++ b/src/backend/optimizer/plan/setrefs.c
@@ -564,7 +564,8 @@ add_rte_to_flat_rtable(PlannerGlobal *glob, List *rteperminfos,
/*
* If it's a plain relation RTE (or a subquery that was once a view
- * reference), add the relation OID to relationOids.
+ * reference), add the relation OID to relationOids. Also add its new RT
+ * index to the set of relations that need to be locked for execution.
*
* We do this even though the RTE might be unreferenced in the plan tree;
* this would correspond to cases such as views that were expanded, child
@@ -576,7 +577,11 @@ add_rte_to_flat_rtable(PlannerGlobal *glob, List *rteperminfos,
*/
if (newrte->rtekind == RTE_RELATION ||
(newrte->rtekind == RTE_SUBQUERY && OidIsValid(newrte->relid)))
+ {
glob->relationOids = lappend_oid(glob->relationOids, newrte->relid);
+ glob->allRelids = bms_add_member(glob->allRelids,
+ list_length(glob->finalrtable));
+ }
/*
* Add a copy of the RTEPermissionInfo, if any, corresponding to this RTE
@@ -1740,6 +1745,11 @@ set_customscan_references(PlannerInfo *root,
*
* Also update the RT indexes present in PartitionedRelPruneInfos to add the
* offset.
+ *
+ * Finally, if there are initial pruning steps, add the RT indexes of the
+ * leaf partitions to the set of relations that are prunable at execution
+ * startup time. This set indicates which relations should not be locked
+ * before executor startup, as they may be pruned during initial pruning.
*/
static int
register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
@@ -1762,8 +1772,25 @@ register_partpruneinfo(PlannerInfo *root, int part_prune_index, int rtoffset)
foreach(l2, prune_infos)
{
PartitionedRelPruneInfo *prelinfo = lfirst(l2);
+ int i;
prelinfo->rtindex += rtoffset;
+
+ for (i = 0; i < prelinfo->nparts; i++)
+ {
+ /*
+ * Non-leaf partitions and partitions that do not have a
+ * subplan are not included in this map as mentioned in
+ * make_partitionedrel_pruneinfo().
+ */
+ if (prelinfo->leafpart_rti_map[i])
+ {
+ prelinfo->leafpart_rti_map[i] += rtoffset;
+ if (prelinfo->initial_pruning_steps)
+ glob->prunableRelids = bms_add_member(glob->prunableRelids,
+ prelinfo->leafpart_rti_map[i]);
+ }
+ }
}
}
diff --git a/src/backend/partitioning/partprune.c b/src/backend/partitioning/partprune.c
index ae1d69f96c..03e596c405 100644
--- a/src/backend/partitioning/partprune.c
+++ b/src/backend/partitioning/partprune.c
@@ -645,6 +645,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
int *subplan_map;
int *subpart_map;
Oid *relid_map;
+ int *leafpart_rti_map;
/*
* Construct the subplan and subpart maps for this partitioning level.
@@ -657,6 +658,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subpart_map = (int *) palloc(nparts * sizeof(int));
memset(subpart_map, -1, nparts * sizeof(int));
relid_map = (Oid *) palloc0(nparts * sizeof(Oid));
+ leafpart_rti_map = (int *) palloc0(nparts * sizeof(int));
present_parts = NULL;
i = -1;
@@ -671,9 +673,28 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
subplan_map[i] = subplanidx = relid_subplan_map[partrel->relid] - 1;
subpart_map[i] = subpartidx = relid_subpart_map[partrel->relid] - 1;
relid_map[i] = planner_rt_fetch(partrel->relid, root)->relid;
+
+ /*
+ * Track the RT indexes of "leaf" partitions so they can be
+ * included in the PlannerGlobal.prunableRelids set, indicating
+ * relations whose locking is deferred until executor startup.
+ *
+ * We don't defer locking of sub-partitioned partitions because
+ * setting up PartitionedRelPruningData currently occurs before
+ * initial pruning, so the relation must be locked at that stage,
+ * even if it may be pruned.
+ *
+ * Only leaf partitions with a valid subplan that are prunable
+ * using initial pruning are added to prunableRelids. So
+ * partitions without a subplan due to constraint exclusion will
+ * remain in PlannedStmt.unprunableRelids and thus their locking
+ * will not be deferred even if they may ultimately be pruned due
+ * to initial pruning.
+ */
if (subplanidx >= 0)
{
present_parts = bms_add_member(present_parts, i);
+ leafpart_rti_map[i] = (int) partrel->relid;
/* Record finding this subplan */
subplansfound = bms_add_member(subplansfound, subplanidx);
@@ -695,6 +716,7 @@ make_partitionedrel_pruneinfo(PlannerInfo *root, RelOptInfo *parentrel,
pinfo->subplan_map = subplan_map;
pinfo->subpart_map = subpart_map;
pinfo->relid_map = relid_map;
+ pinfo->leafpart_rti_map = leafpart_rti_map;
}
pfree(relid_subpart_map);
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 4b985bd056..48b0675070 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -1236,6 +1236,7 @@ exec_simple_query(const char *query_string)
query_string,
commandTag,
plantree_list,
+ NULL,
NULL);
/*
@@ -2038,7 +2039,8 @@ exec_bind_message(StringInfo input_message)
query_string,
psrc->commandTag,
cplan->stmt_list,
- cplan);
+ cplan,
+ psrc);
/* Done with the snapshot used for parameter I/O and parsing/planning */
if (snapshot_set)
diff --git a/src/backend/tcop/pquery.c b/src/backend/tcop/pquery.c
index 0c45fcf318..fe52db1369 100644
--- a/src/backend/tcop/pquery.c
+++ b/src/backend/tcop/pquery.c
@@ -19,6 +19,7 @@
#include "access/xact.h"
#include "commands/prepare.h"
+#include "executor/execdesc.h"
#include "executor/tstoreReceiver.h"
#include "miscadmin.h"
#include "pg_trace.h"
@@ -36,6 +37,9 @@ Portal ActivePortal = NULL;
static void ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -65,6 +69,7 @@ static void DoPortalRewind(Portal portal);
*/
QueryDesc *
CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
@@ -77,6 +82,7 @@ CreateQueryDesc(PlannedStmt *plannedstmt,
qd->operation = plannedstmt->commandType; /* operation */
qd->plannedstmt = plannedstmt; /* plan */
+ qd->cplan = cplan; /* CachedPlan supplying the plannedstmt */
qd->sourceText = sourceText; /* query text */
qd->snapshot = RegisterSnapshot(snapshot); /* snapshot */
/* RI check snapshot */
@@ -122,6 +128,9 @@ FreeQueryDesc(QueryDesc *qdesc)
* PORTAL_ONE_RETURNING, or PORTAL_ONE_MOD_WITH portal
*
* plan: the plan tree for the query
+ * cplan: CachedPlan supplying the plan
+ * plansource: CachedPlanSource supplying the cplan
+ * query_index: index of the query in plansource->query_list
* sourceText: the source text of the query
* params: any parameters needed
* dest: where to send results
@@ -134,6 +143,9 @@ FreeQueryDesc(QueryDesc *qdesc)
*/
static void
ProcessQuery(PlannedStmt *plan,
+ CachedPlan *cplan,
+ CachedPlanSource *plansource,
+ int query_index,
const char *sourceText,
ParamListInfo params,
QueryEnvironment *queryEnv,
@@ -145,14 +157,17 @@ ProcessQuery(PlannedStmt *plan,
/*
* Create the QueryDesc object
*/
- queryDesc = CreateQueryDesc(plan, sourceText,
+ queryDesc = CreateQueryDesc(plan, cplan, sourceText,
GetActiveSnapshot(), InvalidSnapshot,
dest, params, queryEnv, 0);
/*
- * Call ExecutorStart to prepare the plan for execution
+ * Prepare the plan for execution
*/
- ExecutorStart(queryDesc, 0);
+ if (queryDesc->cplan)
+ ExecutorStartCachedPlan(queryDesc, 0, plansource, query_index);
+ else
+ ExecutorStart(queryDesc, 0);
/*
* Run the plan to completion.
@@ -493,6 +508,7 @@ PortalStart(Portal portal, ParamListInfo params,
* the destination to DestNone.
*/
queryDesc = CreateQueryDesc(linitial_node(PlannedStmt, portal->stmts),
+ portal->cplan,
portal->sourceText,
GetActiveSnapshot(),
InvalidSnapshot,
@@ -512,9 +528,13 @@ PortalStart(Portal portal, ParamListInfo params,
myeflags = eflags;
/*
- * Call ExecutorStart to prepare the plan for execution
+ * Prepare the plan for execution.
*/
- ExecutorStart(queryDesc, myeflags);
+ if (portal->cplan)
+ ExecutorStartCachedPlan(queryDesc, myeflags,
+ portal->plansource, 0);
+ else
+ ExecutorStart(queryDesc, myeflags);
/*
* This tells PortalCleanup to shut down the executor
@@ -1194,6 +1214,7 @@ PortalRunMulti(Portal portal,
{
bool active_snapshot_set = false;
ListCell *stmtlist_item;
+ int query_index = 0;
/*
* If the destination is DestRemoteExecute, change to DestNone. The
@@ -1275,6 +1296,9 @@ PortalRunMulti(Portal portal,
{
/* statement can set tag string */
ProcessQuery(pstmt,
+ portal->cplan,
+ portal->plansource,
+ query_index,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1284,6 +1308,9 @@ PortalRunMulti(Portal portal,
{
/* stmt added by rewrite cannot set tag */
ProcessQuery(pstmt,
+ portal->cplan,
+ portal->plansource,
+ query_index,
portal->sourceText,
portal->portalParams,
portal->queryEnv,
@@ -1348,6 +1375,8 @@ PortalRunMulti(Portal portal,
*/
if (lnext(portal->stmts, stmtlist_item) != NULL)
CommandCounterIncrement();
+
+ query_index++;
}
/* Pop the snapshot if we pushed one. */
diff --git a/src/backend/utils/cache/plancache.c b/src/backend/utils/cache/plancache.c
index c66a088f40..8908a0cdc2 100644
--- a/src/backend/utils/cache/plancache.c
+++ b/src/backend/utils/cache/plancache.c
@@ -101,7 +101,8 @@ static dlist_head cached_expression_list = DLIST_STATIC_INIT(cached_expression_l
static void ReleaseGenericPlan(CachedPlanSource *plansource);
static List *RevalidateCachedQuery(CachedPlanSource *plansource,
- QueryEnvironment *queryEnv);
+ QueryEnvironment *queryEnv,
+ bool release_generic);
static bool CheckCachedPlan(CachedPlanSource *plansource);
static CachedPlan *BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ParamListInfo boundParams, QueryEnvironment *queryEnv);
@@ -578,10 +579,17 @@ ReleaseGenericPlan(CachedPlanSource *plansource)
* The result value is the transient analyzed-and-rewritten query tree if we
* had to do re-analysis, and NIL otherwise. (This is returned just to save
* a tree copying step in a subsequent BuildCachedPlan call.)
+ *
+ * This also releases and drops the generic plan (plansource->gplan), if any,
+ * as most callers will typically build a new CachedPlan for the plansource
+ * right after this. However, when called from UpdateCachedPlan(), the
+ * function does not release the generic plan, as UpdateCachedPlan() updates
+ * an existing CachedPlan in place.
*/
static List *
RevalidateCachedQuery(CachedPlanSource *plansource,
- QueryEnvironment *queryEnv)
+ QueryEnvironment *queryEnv,
+ bool release_generic)
{
bool snapshot_set;
RawStmt *rawtree;
@@ -678,8 +686,9 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
MemoryContextDelete(qcxt);
}
- /* Drop the generic plan reference if any */
- ReleaseGenericPlan(plansource);
+ /* Drop the generic plan reference, if any, and if requested */
+ if (release_generic)
+ ReleaseGenericPlan(plansource);
/*
* Now re-do parse analysis and rewrite. This not incidentally acquires
@@ -815,8 +824,11 @@ RevalidateCachedQuery(CachedPlanSource *plansource,
* Caller must have already called RevalidateCachedQuery to verify that the
* querytree is up to date.
*
- * On a "true" return, we have acquired the locks needed to run the plan.
- * (We must do this for the "true" result to be race-condition-free.)
+ * On a "true" return, we have acquired locks on the "unprunableRelids" set
+ * for all plans in plansource->stmt_list. However, the plans are not fully
+ * race-condition-free until the executor acquires locks on the prunable
+ * relations that survive initial runtime pruning during executor
+ * initialization.
*/
static bool
CheckCachedPlan(CachedPlanSource *plansource)
@@ -870,7 +882,11 @@ CheckCachedPlan(CachedPlanSource *plansource)
*/
if (plan->is_valid)
{
- /* Successfully revalidated and locked the query. */
+ /*
+ * Successfully revalidated and locked the query. Set is_reused
+ * to true so that CachedPlanRequiresLocking() returns true.
+ */
+ plan->is_reused = true;
return true;
}
@@ -895,12 +911,14 @@ CheckCachedPlan(CachedPlanSource *plansource)
* To build a generic, parameter-value-independent plan, pass NULL for
* boundParams. To build a custom plan, pass the actual parameter values via
* boundParams. For best effect, the PARAM_FLAG_CONST flag should be set on
- * each parameter value; otherwise the planner will treat the value as a
- * hint rather than a hard constant.
+ * each parameter value; otherwise the planner will treat the value as a hint
+ * rather than a hard constant.
*
* Planning work is done in the caller's memory context. The finished plan
* is in a child memory context, which typically should get reparented
* (unless this is a one-shot plan, in which case we don't copy the plan).
+ *
+ * Note: When changing this, you should also look at UpdateCachedPlan().
*/
static CachedPlan *
BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
@@ -911,6 +929,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
bool snapshot_set;
bool is_transient;
MemoryContext plan_context;
+ MemoryContext stmt_context = NULL;
MemoryContext oldcxt = CurrentMemoryContext;
ListCell *lc;
@@ -928,7 +947,7 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
* let's treat it as real and redo the RevalidateCachedQuery call.
*/
if (!plansource->is_valid)
- qlist = RevalidateCachedQuery(plansource, queryEnv);
+ qlist = RevalidateCachedQuery(plansource, queryEnv, true);
/*
* If we don't already have a copy of the querytree list that can be
@@ -967,10 +986,19 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
PopActiveSnapshot();
/*
- * Normally we make a dedicated memory context for the CachedPlan and its
- * subsidiary data. (It's probably not going to be large, but just in
- * case, allow it to grow large. It's transient for the moment.) But for
- * a one-shot plan, we just leave it in the caller's memory context.
+ * Normally, we create a dedicated memory context for the CachedPlan and
+ * its subsidiary data. Although it's usually not very large, the context
+ * is designed to allow growth if necessary.
+ *
+ * The PlannedStmts are stored in a separate child context (stmt_context)
+ * of the CachedPlan's memory context. This separation allows
+ * UpdateCachedPlan() to free and replace the PlannedStmts without
+ * affecting the CachedPlan structure or its stmt_list List.
+ *
+ * For one-shot plans, we instead use the caller's memory context, as the
+ * CachedPlan will not persist. stmt_context will be set to NULL in this
+ * case, because UpdateCachedPlan() should never get called on a one-shot
+ * plan.
*/
if (!plansource->is_oneshot)
{
@@ -979,12 +1007,17 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
ALLOCSET_START_SMALL_SIZES);
MemoryContextCopyAndSetIdentifier(plan_context, plansource->query_string);
- /*
- * Copy plan into the new context.
- */
- MemoryContextSwitchTo(plan_context);
+ stmt_context = AllocSetContextCreate(CurrentMemoryContext,
+ "CachedPlan PlannedStmts",
+ ALLOCSET_START_SMALL_SIZES);
+ MemoryContextCopyAndSetIdentifier(stmt_context, plansource->query_string);
+ MemoryContextSetParent(stmt_context, plan_context);
+ MemoryContextSwitchTo(stmt_context);
plist = copyObject(plist);
+
+ MemoryContextSwitchTo(plan_context);
+ plist = list_copy(plist);
}
else
plan_context = CurrentMemoryContext;
@@ -1025,8 +1058,10 @@ BuildCachedPlan(CachedPlanSource *plansource, List *qlist,
plan->saved_xmin = InvalidTransactionId;
plan->refcount = 0;
plan->context = plan_context;
+ plan->stmt_context = stmt_context;
plan->is_oneshot = plansource->is_oneshot;
plan->is_saved = false;
+ plan->is_reused = false;
plan->is_valid = true;
/* assign generation number to new plan */
@@ -1153,8 +1188,11 @@ cached_plan_cost(CachedPlan *plan, bool include_planner)
* plan or a custom plan for the given parameters: the caller does not know
* which it will get.
*
- * On return, the plan is valid and we have sufficient locks to begin
- * execution.
+ * On return, the plan is valid, but not all locks are acquired if the
+ * returned plan is a reused generic plan. In such cases, locks on relations
+ * subject to initial runtime pruning are not taken by CheckCachedPlan() but
+ * deferred until the execution startup phase, specifically when
+ * ExecDoInitialPruning() performs initial pruning.
*
* On return, the refcount of the plan has been incremented; a later
* ReleaseCachedPlan() call is expected. If "owner" is not NULL then
@@ -1180,7 +1218,7 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
elog(ERROR, "cannot apply ResourceOwner to non-saved cached plan");
/* Make sure the querytree list is valid and we have parse-time locks */
- qlist = RevalidateCachedQuery(plansource, queryEnv);
+ qlist = RevalidateCachedQuery(plansource, queryEnv, true);
/* Decide whether to use a custom plan */
customplan = choose_custom_plan(plansource, boundParams);
@@ -1276,6 +1314,113 @@ GetCachedPlan(CachedPlanSource *plansource, ParamListInfo boundParams,
return plan;
}
+/*
+ * UpdateCachedPlan
+ * Create fresh plans for all queries in the CachedPlanSource, replacing
+ * those in the generic plan's stmt_list, and return the plan for the
+ * query_index'th query.
+ *
+ * This function is primarily used by ExecutorStartCachedPlan() to handle
+ * cases where the original generic CachedPlan becomes invalid. Such
+ * invalidation may occur when prunable relations in the old plan for the
+ * query_index'th query are locked in preparation for execution.
+ *
+ * Note that invalidations received during the execution of the query_index'th
+ * query can affect both the queries that have already finished execution
+ * (e.g., due to concurrent modifications on prunable relations that were not
+ * locked during their execution) and also the queries that have not yet been
+ * executed. As a result, this function updates all plans to ensure
+ * CachedPlan.is_valid is safely set to true.
+ *
+ * The old PlannedStmts in plansource->gplan->stmt_list are freed here, so
+ * the caller and any of its callers must not rely on them remaining accessible
+ * after this function is called.
+ */
+PlannedStmt *
+UpdateCachedPlan(CachedPlanSource *plansource, int query_index,
+ QueryEnvironment *queryEnv)
+{
+ List *query_list = plansource->query_list,
+ *plan_list;
+ ListCell *l1,
+ *l2;
+ CachedPlan *plan = plansource->gplan;
+ MemoryContext oldcxt;
+
+ Assert(ActiveSnapshotSet());
+
+ /* Sanity checks */
+ if (plan == NULL)
+ elog(ERROR, "UpdateCachedPlan() called in the wrong context: plansource->gplan is NULL");
+ else if (plan->is_valid)
+ elog(ERROR, "UpdateCachedPlan() called in the wrong context: plansource->gplan->is_valid is true");
+ else if (plan->is_oneshot)
+ elog(ERROR, "UpdateCachedPlan() called in the wrong context: plansource->gplan->is_oneshot is true");
+
+ /*
+ * The plansource might have become invalid since GetCachedPlan() returned
+ * the CachedPlan. See the comment in BuildCachedPlan() for details on why
+ * this might happen. Although invalidation is likely a false positive as
+ * stated there, we revalidate the plansource to ensure the query list used
+ * for planning is up to date.
+ *
+ * The risk of catching an invalidation is higher here than when
+ * BuildCachedPlan() is called from GetCachedPlan(), because this function
+ * is normally called long after GetCachedPlan() returns the CachedPlan, so
+ * much more processing could have occurred including things that mark
+ * the CachedPlanSource invalid.
+ *
+ * Note: Do not release plansource->gplan, because the upstream callers
+ * (such as the callers of ExecutorStartCachedPlan()) would still be
+ * referencing it.
+ */
+ if (!plansource->is_valid)
+ query_list = RevalidateCachedQuery(plansource, queryEnv, false);
+ Assert(query_list != NIL);
+
+ /*
+ * Build a new generic plan for all the queries after making a copy to be
+ * scribbled on by the planner.
+ */
+ query_list = copyObject(query_list);
+
+ /*
+ * Planning work is done in the caller's memory context. The resulting
+ * PlannedStmts are then copied into plan->stmt_context after throwing
+ * away the old ones.
+ */
+ plan_list = pg_plan_queries(query_list, plansource->query_string,
+ plansource->cursor_options, NULL);
+ Assert(list_length(plan_list) == list_length(plan->stmt_list));
+
+ MemoryContextReset(plan->stmt_context);
+ oldcxt = MemoryContextSwitchTo(plan->stmt_context);
+ forboth (l1, plan_list, l2, plan->stmt_list)
+ {
+ PlannedStmt *plannedstmt = lfirst(l1);
+
+ lfirst(l2) = copyObject(plannedstmt);
+ }
+ MemoryContextSwitchTo(oldcxt);
+
+ /*
+ * XXX Should this also (re)set the properties of the CachedPlan that are
+ * set in BuildCachedPlan() after creating the fresh plans such as
+ * planRoleId, dependsOnRole, and saved_xmin?
+ */
+
+ /*
+ * We've updated all the plans that might have been invalidated, so mark
+ * the CachedPlan as valid.
+ */
+ plan->is_valid = true;
+
+ /* Also update generic_cost because we just created a new generic plan. */
+ plansource->generic_cost = cached_plan_cost(plan, false);
+
+ return list_nth_node(PlannedStmt, plan->stmt_list, query_index);
+}
+
/*
* ReleaseCachedPlan: release active use of a cached plan.
*
@@ -1654,7 +1799,7 @@ CachedPlanGetTargetList(CachedPlanSource *plansource,
return NIL;
/* Make sure the querytree list is valid and we have parse-time locks */
- RevalidateCachedQuery(plansource, queryEnv);
+ RevalidateCachedQuery(plansource, queryEnv, true);
/* Get the primary statement and find out what it returns */
pstmt = QueryListGetPrimaryStmt(plansource->query_list);
@@ -1776,7 +1921,7 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
foreach(lc1, stmt_list)
{
PlannedStmt *plannedstmt = lfirst_node(PlannedStmt, lc1);
- ListCell *lc2;
+ int rtindex;
if (plannedstmt->commandType == CMD_UTILITY)
{
@@ -1794,13 +1939,16 @@ AcquireExecutorLocks(List *stmt_list, bool acquire)
continue;
}
- foreach(lc2, plannedstmt->rtable)
+ rtindex = -1;
+ while ((rtindex = bms_next_member(plannedstmt->unprunableRelids,
+ rtindex)) >= 0)
{
- RangeTblEntry *rte = (RangeTblEntry *) lfirst(lc2);
+ RangeTblEntry *rte = list_nth_node(RangeTblEntry,
+ plannedstmt->rtable,
+ rtindex - 1);
- if (!(rte->rtekind == RTE_RELATION ||
- (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid))))
- continue;
+ Assert(rte->rtekind == RTE_RELATION ||
+ (rte->rtekind == RTE_SUBQUERY && OidIsValid(rte->relid)));
/*
* Acquire the appropriate type of lock on each relation OID. Note
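
The AcquireExecutorLocks() hunk above replaces the walk over the whole range table with a walk over the members of plannedstmt->unprunableRelids. As a reading aid only (not part of the patch), here is a minimal standalone sketch of that iteration pattern. It assumes a 64-bit mask as a stand-in for a Bitmapset; toy_relids, toy_next_member() and toy_collect_lock_positions() are hypothetical names, and no PostgreSQL headers are used.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a Bitmapset: bit i set means RT index i is a member. */
typedef uint64_t toy_relids;

/*
 * Return the smallest member greater than prev, or -2 if there is none.
 * This mirrors how the patch's loop uses bms_next_member(): start with
 * prev = -1 and stop on a negative result.
 */
static int
toy_next_member(toy_relids set, int prev)
{
	for (int i = prev + 1; i < 64; i++)
	{
		if (set & (UINT64_C(1) << i))
			return i;
	}
	return -2;
}

/*
 * Collect the 0-based range-table positions that would be locked up front.
 * RT indexes are 1-based, hence the "rtindex - 1", as in the patch.
 * Returns the number of positions stored (at most max).
 */
static int
toy_collect_lock_positions(toy_relids unprunable, int *positions, int max)
{
	int			n = 0;
	int			rtindex = -1;

	while ((rtindex = toy_next_member(unprunable, rtindex)) >= 0)
	{
		/* Toy assumption: bit 0 is never set, since RT indexes start at 1. */
		assert(rtindex >= 1);
		if (n < max)
			positions[n++] = rtindex - 1;
	}
	return n;
}
```

With RT indexes 1, 3 and 4 in the set, the sketch visits list positions 0, 2 and 3. Entries that are subject to initial pruning are simply never visited, which is the point of the change: their locks are left to executor startup.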
diff --git a/src/backend/utils/mmgr/portalmem.c b/src/backend/utils/mmgr/portalmem.c
index 93137820ac..ef4791bf65 100644
--- a/src/backend/utils/mmgr/portalmem.c
+++ b/src/backend/utils/mmgr/portalmem.c
@@ -284,7 +284,8 @@ PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan)
+ CachedPlan *cplan,
+ CachedPlanSource *plansource)
{
Assert(PortalIsValid(portal));
Assert(portal->status == PORTAL_NEW);
@@ -299,6 +300,7 @@ PortalDefineQuery(Portal portal,
portal->commandTag = commandTag;
portal->stmts = stmts;
portal->cplan = cplan;
+ portal->plansource = plansource;
portal->status = PORTAL_DEFINED;
}
diff --git a/src/include/commands/explain.h b/src/include/commands/explain.h
index aa5872bc15..09c1b1367a 100644
--- a/src/include/commands/explain.h
+++ b/src/include/commands/explain.h
@@ -103,8 +103,10 @@ extern void ExplainOneUtility(Node *utilityStmt, IntoClause *into,
ExplainState *es, ParseState *pstate,
ParamListInfo params);
-extern void ExplainOnePlan(PlannedStmt *plannedstmt, IntoClause *into,
- ExplainState *es, const char *queryString,
+extern void ExplainOnePlan(PlannedStmt *plannedstmt, CachedPlan *cplan,
+ CachedPlanSource *plansource, int plan_index,
+ IntoClause *into, ExplainState *es,
+ const char *queryString,
ParamListInfo params, QueryEnvironment *queryEnv,
const instr_time *planduration,
const BufferUsage *bufusage,
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index 8a5a9fe642..db21561c8c 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -258,6 +258,7 @@ extern void ExecASTruncateTriggers(EState *estate,
extern void AfterTriggerBeginXact(void);
extern void AfterTriggerBeginQuery(void);
extern void AfterTriggerEndQuery(EState *estate);
+extern void AfterTriggerAbortQuery(void);
extern void AfterTriggerFireDeferred(void);
extern void AfterTriggerEndXact(bool isCommit);
extern void AfterTriggerBeginSubXact(void);
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index 0b34784922..a0843481f7 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -49,6 +49,8 @@ extern void ExecCleanupTupleRouting(ModifyTableState *mtstate,
* nparts Length of subplan_map[] and subpart_map[].
* subplan_map Subplan index by partition index, or -1.
* subpart_map Subpart index by partition index, or -1.
+ * leafpart_rti_map RT index by partition index, or 0 if not a leaf
+ * partition.
* present_parts A Bitmapset of the partition indexes that we
* have subplans or subparts for.
* initial_pruning_steps List of PartitionPruneSteps used to
@@ -69,6 +71,7 @@ typedef struct PartitionedRelPruningData
int nparts;
int *subplan_map;
int *subpart_map;
+ int *leafpart_rti_map;
Bitmapset *present_parts;
List *initial_pruning_steps;
List *exec_pruning_steps;
@@ -140,6 +143,7 @@ extern PartitionPruneState *ExecInitPartitionExecPruning(PlanState *planstate,
Bitmapset *relids,
Bitmapset **initially_valid_subplans);
extern Bitmapset *ExecFindMatchingSubPlans(PartitionPruneState *prunestate,
- bool initial_prune);
+ bool initial_prune,
+ Bitmapset **validsubplan_rtis);
#endif /* EXECPARTITION_H */
diff --git a/src/include/executor/execdesc.h b/src/include/executor/execdesc.h
index 0a7274e26c..0e7245435d 100644
--- a/src/include/executor/execdesc.h
+++ b/src/include/executor/execdesc.h
@@ -35,6 +35,7 @@ typedef struct QueryDesc
/* These fields are provided by CreateQueryDesc */
CmdType operation; /* CMD_SELECT, CMD_UPDATE, etc. */
PlannedStmt *plannedstmt; /* planner's output (could be utility, too) */
+ CachedPlan *cplan; /* CachedPlan that supplies the plannedstmt */
const char *sourceText; /* source text of the query */
Snapshot snapshot; /* snapshot to use for query */
Snapshot crosscheck_snapshot; /* crosscheck for RI update/delete */
@@ -57,6 +58,7 @@ typedef struct QueryDesc
/* in pquery.c */
extern QueryDesc *CreateQueryDesc(PlannedStmt *plannedstmt,
+ CachedPlan *cplan,
const char *sourceText,
Snapshot snapshot,
Snapshot crosscheck_snapshot,
diff --git a/src/include/executor/executor.h b/src/include/executor/executor.h
index 69c3ebff00..6d72f7d9d6 100644
--- a/src/include/executor/executor.h
+++ b/src/include/executor/executor.h
@@ -19,6 +19,7 @@
#include "nodes/lockoptions.h"
#include "nodes/parsenodes.h"
#include "utils/memutils.h"
+#include "utils/plancache.h"
/*
@@ -198,6 +199,9 @@ ExecGetJunkAttribute(TupleTableSlot *slot, AttrNumber attno, bool *isNull)
* prototypes from functions in execMain.c
*/
extern void ExecutorStart(QueryDesc *queryDesc, int eflags);
+extern void ExecutorStartCachedPlan(QueryDesc *queryDesc, int eflags,
+ CachedPlanSource *plansource,
+ int query_index);
extern void standard_ExecutorStart(QueryDesc *queryDesc, int eflags);
extern void ExecutorRun(QueryDesc *queryDesc,
ScanDirection direction, uint64 count, bool execute_once);
@@ -261,6 +265,30 @@ extern void ExecEndNode(PlanState *node);
extern void ExecShutdownNode(PlanState *node);
extern void ExecSetTupleBound(int64 tuples_needed, PlanState *child_node);
+/*
+ * Is the CachedPlan in es_cachedplan still valid?
+ *
+ * Called from InitPlan() because invalidation messages that affect the plan
+ * might be received after locks have been taken on runtime-prunable relations.
+ * The caller should take appropriate action if the plan has become invalid.
+ */
+static inline bool
+ExecPlanStillValid(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? true :
+ CachedPlanValid(estate->es_cachedplan);
+}
+
+/*
+ * Locks are needed only if running a cached plan that might contain unlocked
+ * relations, such as a reused generic plan.
+ */
+static inline bool
+ExecShouldLockRelations(EState *estate)
+{
+ return estate->es_cachedplan == NULL ? false :
+ CachedPlanRequiresLocking(estate->es_cachedplan);
+}
/* ----------------------------------------------------------------
* ExecProcNode
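
The comment on ExecPlanStillValid() above says the caller should take appropriate action if the plan has become invalid, and UpdateCachedPlan() is described as being used by ExecutorStartCachedPlan() for that purpose. The body of ExecutorStartCachedPlan() is not shown in this part of the patch, so the following is only one plausible shape of such a retry loop, written as a standalone toy. ToyPlan, toy_replan(), toy_invalidate_pending() and toy_start_with_retry() are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a CachedPlan: just a validity flag and a counter. */
typedef struct ToyPlan
{
	bool		is_valid;
	int			generation;		/* bumped each time the plan is rebuilt */
} ToyPlan;

/* Stand-in for plan initialization, which may notice an invalidation. */
typedef void (*toy_init_fn) (ToyPlan *plan, void *arg);

/* Stand-in for replanning in place: rebuild and mark valid again. */
static void
toy_replan(ToyPlan *plan)
{
	plan->generation++;
	plan->is_valid = true;
}

/* Sample init callback: invalidates the plan while *arg (an int) is positive. */
static void
toy_invalidate_pending(ToyPlan *plan, void *arg)
{
	int		   *pending = (int *) arg;

	if (*pending > 0)
	{
		(*pending)--;
		plan->is_valid = false;
	}
}

/*
 * Initialize the plan; if it turns out to be invalid afterwards, replan and
 * try again. Returns the number of initialization attempts.
 */
static int
toy_start_with_retry(ToyPlan *plan, toy_init_fn init, void *arg)
{
	int			attempts = 0;

	assert(plan != NULL);
	for (;;)
	{
		attempts++;
		init(plan, arg);
		if (plan->is_valid)
			break;
		toy_replan(plan);
	}
	return attempts;
}
```

With a single pending invalidation, the toy needs two attempts. That is consistent with the pair of "CachedPlan is not valid" / "CachedPlan is valid" notices in the expected isolation-test output further down, though the real control flow may differ.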
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index f93061c7bf..9643a9d626 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -42,6 +42,7 @@
#include "storage/condition_variable.h"
#include "utils/hsearch.h"
#include "utils/queryenvironment.h"
+#include "utils/plancache.h"
#include "utils/reltrigger.h"
#include "utils/sharedtuplestore.h"
#include "utils/snapshot.h"
@@ -639,9 +640,14 @@ typedef struct EState
* ExecRowMarks, or NULL if none */
List *es_rteperminfos; /* List of RTEPermissionInfo */
PlannedStmt *es_plannedstmt; /* link to top of plan tree */
+ CachedPlan *es_cachedplan; /* CachedPlan providing the plan tree */
List *es_part_prune_infos; /* List of PartitionPruneInfo */
List *es_part_prune_states; /* List of PartitionPruneState */
List *es_part_prune_results; /* List of Bitmapset */
+ Bitmapset *es_unpruned_relids; /* PlannedStmt.unprunableRelids + RT
+ * indexes of leaf partitions that
+ * survive initial pruning; see
+ * ExecDoInitialPruning() */
const char *es_sourceText; /* Source text from QueryDesc */
JunkFilter *es_junkFilter; /* top-level junk filter, if any */
@@ -687,6 +693,7 @@ typedef struct EState
int es_top_eflags; /* eflags passed to ExecutorStart */
int es_instrument; /* OR of InstrumentOption flags */
bool es_finished; /* true when ExecutorFinish is done */
+ bool es_aborted; /* true when execution was aborted */
List *es_exprcontexts; /* List of ExprContexts within EState */
@@ -1426,6 +1433,12 @@ typedef struct ModifyTableState
double mt_merge_inserted;
double mt_merge_updated;
double mt_merge_deleted;
+
+ /*
+ * List of valid updateColnosLists. Contains only those belonging to
+ * unpruned relations from ModifyTable.updateColnosLists.
+ */
+ List *mt_updateColnosLists;
} ModifyTableState;
/* ----------------
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index f8a4cd42c6..ef6156f30b 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -116,6 +116,14 @@ typedef struct PlannerGlobal
/* "flat" rangetable for executor */
List *finalrtable;
+ /*
+ * RT indexes of all relation RTEs in finalrtable (RTE_RELATION and
+ * RTE_SUBQUERY RTEs of views) and of those that are subject to runtime
+ * pruning at plan initialization time ("initial" pruning).
+ */
+ Bitmapset *allRelids;
+ Bitmapset *prunableRelids;
+
/* "flat" list of RTEPermissionInfos */
List *finalrteperminfos;
diff --git a/src/include/nodes/plannodes.h b/src/include/nodes/plannodes.h
index ef89927471..59699a1f86 100644
--- a/src/include/nodes/plannodes.h
+++ b/src/include/nodes/plannodes.h
@@ -74,6 +74,10 @@ typedef struct PlannedStmt
List *rtable; /* list of RangeTblEntry nodes */
+ Bitmapset *unprunableRelids; /* RT indexes of relations that are not
+ * subject to runtime pruning; set for
+ * AcquireExecutorLocks(). */
+
List *permInfos; /* list of RTEPermissionInfo nodes for rtable
* entries needing one */
@@ -1476,6 +1480,9 @@ typedef struct PartitionedRelPruneInfo
/* subpart index by partition index, or -1 */
int *subpart_map pg_node_attr(array_size(nparts));
+ /* RT index by partition index, or 0 if not a leaf partition */
+ int *leafpart_rti_map pg_node_attr(array_size(nparts));
+
/* relation OID by partition index, or 0 */
Oid *relid_map pg_node_attr(array_size(nparts));
diff --git a/src/include/utils/plancache.h b/src/include/utils/plancache.h
index a90dfdf906..72862f5e85 100644
--- a/src/include/utils/plancache.h
+++ b/src/include/utils/plancache.h
@@ -18,6 +18,8 @@
#include "access/tupdesc.h"
#include "lib/ilist.h"
#include "nodes/params.h"
+#include "nodes/parsenodes.h"
+#include "nodes/plannodes.h"
#include "tcop/cmdtag.h"
#include "utils/queryenvironment.h"
#include "utils/resowner.h"
@@ -139,10 +141,11 @@ typedef struct CachedPlanSource
* The reference count includes both the link from the parent CachedPlanSource
* (if any), and any active plan executions, so the plan can be discarded
* exactly when refcount goes to zero. Both the struct itself and the
- * subsidiary data live in the context denoted by the context field.
- * This makes it easy to free a no-longer-needed cached plan. (However,
- * if is_oneshot is true, the context does not belong solely to the CachedPlan
- * so no freeing is possible.)
+ * subsidiary data, except the PlannedStmts in stmt_list, live in the context
+ * denoted by the context field; the PlannedStmts live in the context denoted
+ * by stmt_context. Separate contexts make it easy to free a no-longer-needed
+ * cached plan. (However, if is_oneshot is true, the context does not belong
+ * solely to the CachedPlan so no freeing is possible.)
*/
typedef struct CachedPlan
{
@@ -150,6 +153,7 @@ typedef struct CachedPlan
List *stmt_list; /* list of PlannedStmts */
bool is_oneshot; /* is it a "oneshot" plan? */
bool is_saved; /* is CachedPlan in a long-lived context? */
+ bool is_reused; /* is it a reused generic plan? */
bool is_valid; /* is the stmt_list currently valid? */
Oid planRoleId; /* Role ID the plan was created for */
bool dependsOnRole; /* is plan specific to that role? */
@@ -158,6 +162,10 @@ typedef struct CachedPlan
int generation; /* parent's generation number for this plan */
int refcount; /* count of live references to this struct */
MemoryContext context; /* context containing this CachedPlan */
+ MemoryContext stmt_context; /* context containing the PlannedStmts in
+ * stmt_list, but not the List itself which
+ * is in the above context; NULL if is_oneshot
+ * is true. */
} CachedPlan;
/*
@@ -223,6 +231,10 @@ extern CachedPlan *GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
ResourceOwner owner,
QueryEnvironment *queryEnv);
+extern PlannedStmt *UpdateCachedPlan(CachedPlanSource *plansource,
+ int query_index,
+ QueryEnvironment *queryEnv);
+
extern void ReleaseCachedPlan(CachedPlan *plan, ResourceOwner owner);
extern bool CachedPlanAllowsSimpleValidityCheck(CachedPlanSource *plansource,
@@ -235,4 +247,34 @@ extern bool CachedPlanIsSimplyValid(CachedPlanSource *plansource,
extern CachedExpression *GetCachedExpression(Node *expr);
extern void FreeCachedExpression(CachedExpression *cexpr);
+/*
+ * CachedPlanRequiresLocking: should the executor acquire additional locks?
+ *
+ * If the plan is a reused generic plan, the executor must acquire locks for
+ * relations that are not covered by AcquireExecutorLocks(), such as partitions
+ * that are subject to initial runtime pruning.
+ *
+ * Note: These locks are unnecessary if the plan is executed immediately after
+ * its creation, since the planner would have already acquired them. However,
+ * we do not optimize for that case.
+ */
+static inline bool
+CachedPlanRequiresLocking(CachedPlan *cplan)
+{
+ return !cplan->is_oneshot && cplan->is_reused;
+}
+
+/*
+ * CachedPlanValid
+ * Returns whether a cached generic plan is still valid.
+ *
+ * Invoked by the executor to check if the plan has not been invalidated after
+ * taking locks during the initialization of the plan.
+ */
+static inline bool
+CachedPlanValid(CachedPlan *cplan)
+{
+ return cplan->is_valid;
+}
+
#endif /* PLANCACHE_H */
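
For readers following the flag logic, the two inline helpers added to plancache.h, together with their executor.h wrappers, reduce to a small truth table. The sketch below restates it as a standalone toy. ToyCachedPlan and the toy_* functions are hypothetical stand-ins that carry only the three flags involved; this is not the patch's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in carrying only the flags the helpers look at. */
typedef struct ToyCachedPlan
{
	bool		is_oneshot;
	bool		is_reused;
	bool		is_valid;
} ToyCachedPlan;

/* Mirrors CachedPlanRequiresLocking(): only a reused, non-oneshot plan. */
static bool
toy_requires_locking(const ToyCachedPlan *cplan)
{
	assert(cplan != NULL);
	return !cplan->is_oneshot && cplan->is_reused;
}

/* Mirrors ExecShouldLockRelations(): no cached plan means no extra locks. */
static bool
toy_should_lock(const ToyCachedPlan *cplan)
{
	return cplan == NULL ? false : toy_requires_locking(cplan);
}

/* Mirrors ExecPlanStillValid(): no cached plan is trivially "still valid". */
static bool
toy_still_valid(const ToyCachedPlan *cplan)
{
	return cplan == NULL ? true : cplan->is_valid;
}
```

In this reading, a freshly built plan (is_reused = false, as set in BuildCachedPlan()) does not request executor-side locking; only a plan that passed through CheckCachedPlan() does.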
diff --git a/src/include/utils/portal.h b/src/include/utils/portal.h
index 29f49829f2..58c3828d2c 100644
--- a/src/include/utils/portal.h
+++ b/src/include/utils/portal.h
@@ -138,6 +138,7 @@ typedef struct PortalData
QueryCompletion qc; /* command completion data for executed query */
List *stmts; /* list of PlannedStmts */
CachedPlan *cplan; /* CachedPlan, if stmts are from one */
+ CachedPlanSource *plansource; /* CachedPlanSource, for cplan */
ParamListInfo portalParams; /* params to pass to query */
QueryEnvironment *queryEnv; /* environment for query */
@@ -241,7 +242,8 @@ extern void PortalDefineQuery(Portal portal,
const char *sourceText,
CommandTag commandTag,
List *stmts,
- CachedPlan *cplan);
+ CachedPlan *cplan,
+ CachedPlanSource *plansource);
extern PlannedStmt *PortalGetPrimaryStmt(Portal portal);
extern void PortalCreateHoldStore(Portal portal);
extern void PortalHashTableDeleteAll(void);
diff --git a/src/test/modules/delay_execution/Makefile b/src/test/modules/delay_execution/Makefile
index 70f24e846d..3eeb097fde 100644
--- a/src/test/modules/delay_execution/Makefile
+++ b/src/test/modules/delay_execution/Makefile
@@ -8,7 +8,8 @@ OBJS = \
delay_execution.o
ISOLATION = partition-addition \
- partition-removal-1
+ partition-removal-1 \
+ cached-plan-inval
ifdef USE_PGXS
PG_CONFIG = pg_config
diff --git a/src/test/modules/delay_execution/delay_execution.c b/src/test/modules/delay_execution/delay_execution.c
index fa4693a3f5..44aa828fdf 100644
--- a/src/test/modules/delay_execution/delay_execution.c
+++ b/src/test/modules/delay_execution/delay_execution.c
@@ -1,14 +1,18 @@
/*-------------------------------------------------------------------------
*
* delay_execution.c
- * Test module to allow delay between parsing and execution of a query.
+ * Test module to introduce delay at various points during execution of a
+ * query to test that execution proceeds safely in light of concurrent
+ * changes.
*
* The delay is implemented by taking and immediately releasing a specified
* advisory lock. If another process has previously taken that lock, the
* current process will be blocked until the lock is released; otherwise,
* there's no effect. This allows an isolationtester script to reliably
- * test behaviors where some specified action happens in another backend
- * between parsing and execution of any desired query.
+ * test behaviors where some specified action happens in another backend in
+ * a couple of cases: 1) between parsing and execution of any desired query
+ * when using the planner_hook, 2) between RevalidateCachedQuery() and
+ * ExecutorStart() when using the ExecutorStart_hook.
*
* Copyright (c) 2020-2024, PostgreSQL Global Development Group
*
@@ -22,6 +26,7 @@
#include <limits.h>
+#include "executor/executor.h"
#include "optimizer/planner.h"
#include "utils/fmgrprotos.h"
#include "utils/guc.h"
@@ -32,9 +37,11 @@ PG_MODULE_MAGIC;
/* GUC: advisory lock ID to use. Zero disables the feature. */
static int post_planning_lock_id = 0;
+static int executor_start_lock_id = 0;
-/* Save previous planner hook user to be a good citizen */
+/* Save previous hook users to be a good citizen */
static planner_hook_type prev_planner_hook = NULL;
+static ExecutorStart_hook_type prev_ExecutorStart_hook = NULL;
/* planner_hook function to provide the desired delay */
@@ -70,11 +77,41 @@ delay_execution_planner(Query *parse, const char *query_string,
return result;
}
+/* ExecutorStart_hook function to provide the desired delay */
+static void
+delay_execution_ExecutorStart(QueryDesc *queryDesc, int eflags)
+{
+ /* If enabled, delay by taking and releasing the specified lock */
+ if (executor_start_lock_id != 0)
+ {
+ DirectFunctionCall1(pg_advisory_lock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+ DirectFunctionCall1(pg_advisory_unlock_int8,
+ Int64GetDatum((int64) executor_start_lock_id));
+
+ /*
+ * Ensure that we notice any pending invalidations, since the advisory
+ * lock functions don't do this.
+ */
+ AcceptInvalidationMessages();
+ }
+
+ /* Now start the executor, possibly via a previous hook user */
+ if (prev_ExecutorStart_hook)
+ prev_ExecutorStart_hook(queryDesc, eflags);
+ else
+ standard_ExecutorStart(queryDesc, eflags);
+
+ if (executor_start_lock_id != 0)
+ elog(NOTICE, "Finished ExecutorStart(): CachedPlan is %s",
+ CachedPlanValid(queryDesc->cplan) ? "valid" : "not valid");
+}
+
/* Module load function */
void
_PG_init(void)
{
- /* Set up the GUC to control which lock is used */
+ /* Set up GUCs to control which lock is used */
DefineCustomIntVariable("delay_execution.post_planning_lock_id",
"Sets the advisory lock ID to be locked/unlocked after planning.",
"Zero disables the delay.",
@@ -86,10 +123,22 @@ _PG_init(void)
NULL,
NULL,
NULL);
-
+ DefineCustomIntVariable("delay_execution.executor_start_lock_id",
+ "Sets the advisory lock ID to be locked/unlocked before starting execution.",
+ "Zero disables the delay.",
+ &executor_start_lock_id,
+ 0,
+ 0, INT_MAX,
+ PGC_USERSET,
+ 0,
+ NULL,
+ NULL,
+ NULL);
MarkGUCPrefixReserved("delay_execution");
- /* Install our hook */
+ /* Install our hooks. */
prev_planner_hook = planner_hook;
planner_hook = delay_execution_planner;
+ prev_ExecutorStart_hook = ExecutorStart_hook;
+ ExecutorStart_hook = delay_execution_ExecutorStart;
}
diff --git a/src/test/modules/delay_execution/expected/cached-plan-inval.out b/src/test/modules/delay_execution/expected/cached-plan-inval.out
new file mode 100644
index 0000000000..5bfb2b33b3
--- /dev/null
+++ b/src/test/modules/delay_execution/expected/cached-plan-inval.out
@@ -0,0 +1,282 @@
+Parsed test spec with 2 sessions
+
+starting permutation: s1prep s2lock s1exec s2dropi s2unlock
+step s1prep: SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1);
+QUERY PLAN
+------------------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: (a = $1)
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = $1)
+(7 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+-------------------------------------
+LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Filter: (a = $1)
+(5 rows)
+
+
+starting permutation: s1prep2 s2lock s1exec2 s2dropi s2unlock
+step s1prep2: SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(11 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec2: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec2: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------
+Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(6 rows)
+
+
+starting permutation: s1prep3 s2lock s1exec3 s2dropi s2unlock
+step s1prep3: SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3;
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+--------------------------------------------------------------
+Nested Loop
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+ -> Materialize
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on bar1 bar_1
+ Recheck Cond: (a = one())
+ -> Bitmap Index Scan on bar1_a_idx
+ Index Cond: (a = one())
+
+Update on bar
+ Update on bar1 bar_1
+ -> Nested Loop
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+ -> Materialize
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on bar1 bar_1
+ Recheck Cond: (a = one())
+ -> Bitmap Index Scan on bar1_a_idx
+ Index Cond: (a = one())
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on foo12_1 foo_1
+ Recheck Cond: ((a = one()) OR (a = two()))
+ -> BitmapOr
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = one())
+ -> Bitmap Index Scan on foo12_1_a
+ Index Cond: (a = two())
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(56 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec3: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec3: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+-------------------------------------------------------------
+Nested Loop
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+ -> Materialize
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on bar1 bar_1
+ Recheck Cond: (a = one())
+ -> Bitmap Index Scan on bar1_a_idx
+ Index Cond: (a = one())
+
+Update on bar
+ Update on bar1 bar_1
+ -> Nested Loop
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+ -> Materialize
+ -> Append
+ Subplans Removed: 1
+ -> Bitmap Heap Scan on bar1 bar_1
+ Recheck Cond: (a = one())
+ -> Bitmap Index Scan on bar1_a_idx
+ Index Cond: (a = one())
+
+Update on foo
+ Update on foo12_1 foo_1
+ Update on foo12_2 foo_2
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on foo12_1 foo_1
+ Filter: ((a = one()) OR (a = two()))
+ -> Seq Scan on foo12_2 foo_2
+ Filter: ((a = one()) OR (a = two()))
+(41 rows)
+
+
+starting permutation: s1prep4 s2lock s1exec4 s2dropi s2unlock
+step s1prep4: SET plan_cache_mode = force_generic_plan;
+ SET enable_seqscan TO off;
+ PREPARE q4 AS SELECT * FROM generate_series(1, 1) WHERE EXISTS (SELECT * FROM foov WHERE a = $1 FOR UPDATE);
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1);
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+---------------------------------------------------------------
+Result
+ One-Time Filter: (InitPlan 1).col1
+ InitPlan 1
+ -> LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Index Scan using foo12_1_a on foo12_1 foo_1
+ Index Cond: (a = $1)
+ -> Function Scan on generate_series
+(9 rows)
+
+step s2lock: SELECT pg_advisory_lock(12345);
+pg_advisory_lock
+----------------
+
+(1 row)
+
+step s1exec4: LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); <waiting ...>
+step s2dropi: DROP INDEX foo12_1_a;
+step s2unlock: SELECT pg_advisory_unlock(12345);
+pg_advisory_unlock
+------------------
+t
+(1 row)
+
+step s1exec4: <... completed>
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is not valid
+s1: NOTICE: Finished ExecutorStart(): CachedPlan is valid
+QUERY PLAN
+---------------------------------------------
+Result
+ One-Time Filter: (InitPlan 1).col1
+ InitPlan 1
+ -> LockRows
+ -> Append
+ Subplans Removed: 2
+ -> Seq Scan on foo12_1 foo_1
+ Disabled: true
+ Filter: (a = $1)
+ -> Function Scan on generate_series
+(10 rows)
+
diff --git a/src/test/modules/delay_execution/meson.build b/src/test/modules/delay_execution/meson.build
index 41f3ac0b89..5a70b183d0 100644
--- a/src/test/modules/delay_execution/meson.build
+++ b/src/test/modules/delay_execution/meson.build
@@ -24,6 +24,7 @@ tests += {
'specs': [
'partition-addition',
'partition-removal-1',
+ 'cached-plan-inval',
],
},
}
diff --git a/src/test/modules/delay_execution/specs/cached-plan-inval.spec b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
new file mode 100644
index 0000000000..f27e8fb521
--- /dev/null
+++ b/src/test/modules/delay_execution/specs/cached-plan-inval.spec
@@ -0,0 +1,80 @@
+# Test to check that invalidation of cached generic plans during ExecutorStart
+# correctly triggers replanning and re-execution.
+
+setup
+{
+ CREATE TABLE foo (a int, b text) PARTITION BY LIST(a);
+ CREATE TABLE foo12 PARTITION OF foo FOR VALUES IN (1, 2) PARTITION BY LIST (a);
+ CREATE TABLE foo12_1 PARTITION OF foo12 FOR VALUES IN (1);
+ CREATE TABLE foo12_2 PARTITION OF foo12 FOR VALUES IN (2);
+ CREATE INDEX foo12_1_a ON foo12_1 (a);
+ CREATE TABLE foo3 PARTITION OF foo FOR VALUES IN (3);
+ CREATE VIEW foov AS SELECT * FROM foo;
+ CREATE FUNCTION one () RETURNS int AS $$ BEGIN RETURN 1; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE FUNCTION two () RETURNS int AS $$ BEGIN RETURN 2; END; $$ LANGUAGE PLPGSQL STABLE;
+ CREATE TABLE bar (a int, b text) PARTITION BY LIST(a);
+ CREATE TABLE bar1 PARTITION OF bar FOR VALUES IN (1);
+ CREATE INDEX ON bar1(a);
+ CREATE TABLE bar2 PARTITION OF bar FOR VALUES IN (2);
+ CREATE RULE update_foo AS ON UPDATE TO foo DO ALSO UPDATE bar SET a = a WHERE a = one();
+ CREATE RULE update_bar AS ON UPDATE TO bar DO ALSO SELECT 1;
+}
+
+teardown
+{
+ DROP VIEW foov;
+ DROP RULE update_foo ON foo;
+ DROP TABLE foo, bar;
+ DROP FUNCTION one(), two();
+}
+
+session "s1"
+# Append with run-time pruning
+step "s1prep" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q AS SELECT * FROM foov WHERE a = $1 FOR UPDATE;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+
+# Another case with Append with run-time pruning
+step "s1prep2" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q2 AS SELECT * FROM foov WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+
+# Case with a rule adding another query
+step "s1prep3" { SET plan_cache_mode = force_generic_plan;
+ PREPARE q3 AS UPDATE foov SET a = a WHERE a = one() or a = two();
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+
+# Another case with Append with run-time pruning in a subquery
+step "s1prep4" { SET plan_cache_mode = force_generic_plan;
+ SET enable_seqscan TO off;
+ PREPARE q4 AS SELECT * FROM generate_series(1, 1) WHERE EXISTS (SELECT * FROM foov WHERE a = $1 FOR UPDATE);
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); }
+
+# Executes a generic plan
+step "s1exec" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q (1); }
+step "s1exec2" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q2; }
+step "s1exec3" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q3; }
+step "s1exec4" { LOAD 'delay_execution';
+ SET delay_execution.executor_start_lock_id = 12345;
+ EXPLAIN (COSTS OFF) EXECUTE q4 (1); }
+
+session "s2"
+step "s2lock" { SELECT pg_advisory_lock(12345); }
+step "s2unlock" { SELECT pg_advisory_unlock(12345); }
+step "s2dropi" { DROP INDEX foo12_1_a; }
+
+# While "s1exec", etc. wait to acquire the advisory lock, "s2dropi" is able to
+# drop the index being used in the cached plan. When "s1exec" is then
+# unblocked and initializes the cached plan for execution, it detects the
+# concurrent index drop and causes the cached plan to be discarded and
+# recreated without the index.
+permutation "s1prep" "s2lock" "s1exec" "s2dropi" "s2unlock"
+permutation "s1prep2" "s2lock" "s1exec2" "s2dropi" "s2unlock"
+permutation "s1prep3" "s2lock" "s1exec3" "s2dropi" "s2unlock"
+permutation "s1prep4" "s2lock" "s1exec4" "s2dropi" "s2unlock"
diff --git a/src/test/regress/expected/partition_prune.out b/src/test/regress/expected/partition_prune.out
index 7a03b4e360..705cd922fc 100644
--- a/src/test/regress/expected/partition_prune.out
+++ b/src/test/regress/expected/partition_prune.out
@@ -4440,3 +4440,47 @@ drop table hp_contradict_test;
drop operator class part_test_int4_ops2 using hash;
drop operator ===(int4, int4);
drop function explain_analyze(text);
+-- Runtime pruning on UPDATE using WITH CHECK OPTIONS and RETURNING
+create table part_abc (a int, b text, c bool) partition by list (a);
+create table part_abc_1 (b text, a int, c bool);
+create table part_abc_2 (a int, c bool, b text);
+alter table part_abc attach partition part_abc_1 for values in (1);
+alter table part_abc attach partition part_abc_2 for values in (2);
+insert into part_abc values (1, 'b', true);
+insert into part_abc values (2, 'c', true);
+create view part_abc_view as select * from part_abc where b <> 'a' with check option;
+prepare update_part_abc_view as update part_abc_view set b = $2 where a = $1 returning *;
+explain (costs off) execute update_part_abc_view (1, 'd');
+ QUERY PLAN
+-------------------------------------------------------
+ Update on part_abc
+ Update on part_abc_1
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on part_abc_1
+ Filter: ((b <> 'a'::text) AND (a = $1))
+(6 rows)
+
+execute update_part_abc_view (1, 'd');
+ a | b | c
+---+---+---
+ 1 | d | t
+(1 row)
+
+explain (costs off) execute update_part_abc_view (2, 'a');
+ QUERY PLAN
+-------------------------------------------------------
+ Update on part_abc
+ Update on part_abc_2 part_abc_1
+ -> Append
+ Subplans Removed: 1
+ -> Seq Scan on part_abc_2 part_abc_1
+ Filter: ((b <> 'a'::text) AND (a = $1))
+(6 rows)
+
+execute update_part_abc_view (2, 'a');
+ERROR: new row violates check option for view "part_abc_view"
+DETAIL: Failing row contains (2, a, t).
+deallocate update_part_abc_view;
+drop view part_abc_view;
+drop table part_abc;
diff --git a/src/test/regress/sql/partition_prune.sql b/src/test/regress/sql/partition_prune.sql
index 442428d937..af26ad2fb2 100644
--- a/src/test/regress/sql/partition_prune.sql
+++ b/src/test/regress/sql/partition_prune.sql
@@ -1339,3 +1339,21 @@ drop operator class part_test_int4_ops2 using hash;
drop operator ===(int4, int4);
drop function explain_analyze(text);
+
+-- Runtime pruning on UPDATE using WITH CHECK OPTIONS and RETURNING
+create table part_abc (a int, b text, c bool) partition by list (a);
+create table part_abc_1 (b text, a int, c bool);
+create table part_abc_2 (a int, c bool, b text);
+alter table part_abc attach partition part_abc_1 for values in (1);
+alter table part_abc attach partition part_abc_2 for values in (2);
+insert into part_abc values (1, 'b', true);
+insert into part_abc values (2, 'c', true);
+create view part_abc_view as select * from part_abc where b <> 'a' with check option;
+prepare update_part_abc_view as update part_abc_view set b = $2 where a = $1 returning *;
+explain (costs off) execute update_part_abc_view (1, 'd');
+execute update_part_abc_view (1, 'd');
+explain (costs off) execute update_part_abc_view (2, 'a');
+execute update_part_abc_view (2, 'a');
+deallocate update_part_abc_view;
+drop view part_abc_view;
+drop table part_abc;
--
2.43.0
end of thread, other threads:[~2024-12-05 12:03 UTC | newest]
Thread overview: 29+ messages
2024-08-15 15:34 Re: generic plans and "initial" pruning Robert Haas <[email protected]>
2024-08-16 12:35 ` Amit Langote <[email protected]>
2024-08-19 16:39 ` Robert Haas <[email protected]>
2024-08-19 16:54 ` Tom Lane <[email protected]>
2024-08-19 17:38 ` Robert Haas <[email protected]>
2024-08-19 17:52 ` Tom Lane <[email protected]>
2024-08-19 18:20 ` Robert Haas <[email protected]>
2024-08-20 13:14 ` Amit Langote <[email protected]>
2024-08-20 13:00 ` Amit Langote <[email protected]>
2024-08-20 14:53 ` Robert Haas <[email protected]>
2024-08-21 12:45 ` Amit Langote <[email protected]>
2024-08-21 13:10 ` Robert Haas <[email protected]>
2024-08-23 12:48 ` Amit Langote <[email protected]>
2024-08-29 13:34 ` Amit Langote <[email protected]>
2024-08-31 12:30 ` Junwang Zhao <[email protected]>
2024-09-02 08:19 ` Amit Langote <[email protected]>
2024-09-05 09:55 ` Amit Langote <[email protected]>
2024-09-17 12:57 ` Amit Langote <[email protected]>
2024-09-19 08:39 ` Amit Langote <[email protected]>
2024-09-19 12:10 ` Amit Langote <[email protected]>
2024-09-20 08:10 ` Amit Langote <[email protected]>
2024-10-10 20:15 ` Robert Haas <[email protected]>
2024-10-11 07:30 ` Amit Langote <[email protected]>
2024-10-15 14:38 ` Robert Haas <[email protected]>
2024-10-15 15:22 ` Tom Lane <[email protected]>
2024-10-25 12:30 ` Amit Langote <[email protected]>
2024-12-01 18:36 ` Tomas Vondra <[email protected]>
2024-12-04 13:34 ` Amit Langote <[email protected]>
2024-12-05 12:03 ` Amit Langote <[email protected]>