public inbox for [email protected]  
help / color / mirror / Atom feed
From: Junwang Zhao <[email protected]>
To: Amit Langote <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Alvaro Herrera <[email protected]>
Cc: Andres Freund <[email protected]>
Cc: Daniel Gustafsson <[email protected]>
Cc: David Rowley <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Cc: Thom Brown <[email protected]>
Cc: Tom Lane <[email protected]>
Subject: Re: generic plans and "initial" pruning
Date: Sat, 31 Aug 2024 20:30:34 +0800
Message-ID: <CAEG8a3LTsA4KcbvHYGbFX6w27Rv-7m5RZMHrH+JupGw+cm2Ung@mail.gmail.com> (raw)
In-Reply-To: <CA+HiwqH9u1RWn9OEa=VQQpJagB0hDLCY+=fSyBC4ZkeU6Gg2HA@mail.gmail.com>
References: <CA+HiwqFpZ80UJKr4tZus4Omgg7YESzFXKSwSHRW2Ap2=XSVyUA@mail.gmail.com>
	<[email protected]>
	<CA+HiwqF+3tv=tuB9EVfOj9YcXhSq477X+1RKOpJ5JqCCj3qgww@mail.gmail.com>
	<CA+TgmobHL_vTjOdy6KVMVeW-CQQmXXz_yU6Q9d2YjnVfFxuy6A@mail.gmail.com>
	<CA+HiwqHL=aGU9Y4RYXQ5VCp4L5NVdiaQLLoXN3NCQQQMKo0ByQ@mail.gmail.com>
	<CA+TgmoabYD=pnccFLzbbREFsqkFgE4EZ+FdHoTOhgCqn4jP2Cw@mail.gmail.com>
	<CA+HiwqE_rQ9pZnkXeoHdds2kgAiT7XNNHZW8gTGicBdXv0rwnw@mail.gmail.com>
	<CA+TgmoY2drv9PmrRAC7AR77mkx09sOh-+5qJkHB_hLKeHRNqzQ@mail.gmail.com>
	<CA+HiwqHkjicWzfAjB6_SVsVmKF6omQ4EBHr+GTUgJNN7WiUDag@mail.gmail.com>
	<CA+TgmoaZxb4JTimK8MmbXEeCwtzyfx7uGYjq565s2pY9i1GN+Q@mail.gmail.com>
	<CA+HiwqHzKO9FT-CjFWo6OmkiCSYmbPspKXVex96tOBKf6S_x_w@mail.gmail.com>
	<CA+TgmoZGWyMXutfen-NNv9=QM7eCHn9R1bpLZ9N4sRURMOCK2A@mail.gmail.com>
	<CA+HiwqHNb9jrwOFHfmASfiGc=SnqXs7THwQ_Rta=z+ognYV8qw@mail.gmail.com>
	<CA+HiwqH9u1RWn9OEa=VQQpJagB0hDLCY+=fSyBC4ZkeU6Gg2HA@mail.gmail.com>

Hi,

On Thu, Aug 29, 2024 at 9:34 PM Amit Langote <[email protected]> wrote:
>
> On Fri, Aug 23, 2024 at 9:48 PM Amit Langote <[email protected]> wrote:
> > On Wed, Aug 21, 2024 at 10:10 PM Robert Haas <[email protected]> wrote:
> > > On Wed, Aug 21, 2024 at 8:45 AM Amit Langote <[email protected]> wrote:
> > > > * The replanning aspect of the lock-in-the-executor design would be
> > > > simpler if a CachedPlan contained the plan for a single query rather
> > > > than a list of queries, as previously mentioned. This is particularly
> > > > due to the requirements of the PORTAL_MULTI_QUERY case. However, this
> > > > option might be impractical.
> > >
> > > It might be, but maybe it would be worth a try? I mean,
> > > GetCachedPlan() seems to just call pg_plan_queries() which just loops
> > > over the list of query trees and does the same thing for each one. If
> > > we wanted to replan a single query, why couldn't we do
> > > fake_querytree_list = list_make1(list_nth(querytree_list, n)) and then
> > > call pg_plan_queries(fake_querytree_list)? Or something equivalent to
> > > that. We could have a new GetCachedSinglePlan(cplan, n) to do this.
> >
> > I've been hacking to prototype this, and it's showing promise. It
> > helps make the replan loop at the call sites that start the executor
> > with an invalidatable plan more localized and less prone to
> > action-at-a-distance issues. However, the interface and contract of
> > the new function in my prototype are pretty specialized for the replan
> > loop in this context—meaning it's not as general-purpose as
> > GetCachedPlan(). Essentially, what you get when you call it is a
> > 'throwaway' CachedPlan containing only the plan for the query that
> > failed during ExecutorStart(), not a plan integrated into the original
> > CachedPlanSource's stmt_list. A call site entering the replan loop
> > will retry the execution with that throwaway plan, release it once
> > done, and resume looping over the plans in the original list. The
> > invalid plan that remains in the original list will be discarded and
> > replanned in the next call to GetCachedPlan() using the same
> > CachedPlanSource. While that may sound undesirable, I'm inclined to
> > think it's not something that needs optimization, given that we're
> > expecting this code path to be taken rarely.
> >
> > I'll post a version of a revamped locks-in-the-executor patch set
> > using the above function after debugging some more.
>
> Here it is.
>
> 0001 implements changes to defer the locking of runtime-prunable
> relations to the executor.  The new design introduces a bitmapset
> field in PlannedStmt to distinguish at runtime between relations that
> are prunable whose locking can be deferred until ExecInitNode() and
> those that are not and must be locked in advance.  The set of prunable
> relations can be constructed by looking at all the PartitionPruneInfos
> in the plan and checking which are subject to "initial" pruning steps.
> The set of unprunable relations is obtained by subtracting those from
> the set of all RT indexes.  This design gets rid of one annoying
> aspect of the old design which was the need to add specialized fields
> to store the RT indexes of partitioned relations that are not
> otherwise referenced in the plan tree. That was necessary because in
> the old design, I had removed the function AcquireExecutorLocks()
> altogether to defer the locking of all child relations to execution.
> In the new design such relations are still locked by
> AcquireExecutorLocks().
>
> 0002 is the old patch to make ExecEndNode() robust against partially
> initialized PlanState nodes by adding NULL checks.
>
> 0003 is the patch to add changes to deal with the CachedPlan becoming
> invalid before the deferred locks on prunable relations are taken.
> I've moved the replan loop into a new wrapper-over-ExecutorStart()
> function instead of having the same logic at multiple sites.  The
> replan logic uses the GetSingleCachedPlan() described in the quoted
> text.  The callers of the new ExecutorStart()-wrapper, which I've
> dubbed ExecutorStartExt(), need to pass the CachedPlanSource and a
> query_index, which is the index of the query being executed in the
> list CachedPlanSource.query_list.  They are needed by
> GetSingleCachedPlan().  The changes outside the executor are pretty
> minimal in this design and all the difficulties of having to loop back
> to GetCachedPlan() are now gone.  I like how this turned out.
>
> One idea that I think might be worth trying to reduce the footprint of
> 0003 is to try to lock the prunable relations in a step of InitPlan()
> separate from ExecInitNode(), which can be implemented by doing the
> initial runtime pruning in that separate step.  That way, we'll have
> all the necessary locks before calling ExecInitNode() and so we don't
> need to sprinkle the CachedPlanStillValid() checks all over the place
> and worry about missed checks and dealing with partially initialized
> PlanState trees.
>
> --
> Thanks, Amit Langote

@@ -1241,7 +1244,7 @@ GetCachedPlan(CachedPlanSource *plansource,
ParamListInfo boundParams,
  if (customplan)
  {
  /* Build a custom plan */
- plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv);
+ plan = BuildCachedPlan(plansource, qlist, boundParams, queryEnv, true);

Is the *true* here a typo? Seems it should be *false* for custom plan?

-- 
Regards
Junwang Zhao






view thread (29+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: generic plans and "initial" pruning
  In-Reply-To: <CAEG8a3LTsA4KcbvHYGbFX6w27Rv-7m5RZMHrH+JupGw+cm2Ung@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox