MIME-Version: 1.0
References: <20221221101846.7zsi7lscnm5ediik@alvherre.pgsql>
 <1350682.1671635927@sss.pgh.pa.us> <4191508.1674157166@sss.pgh.pa.us>
 <CA+HiwqFhQ8tLMLQAWYRWiBQUrWSLM8qhVzJ4B5jWr=orUXfSyA@mail.gmail.com>
 <349124.1674185509@sss.pgh.pa.us> <CA+HiwqGFm_5aUqPnt=WCarkJ2ZU6F8kD8pFeGurHP+NZWS8KQw@mail.gmail.com>
 <CA+HiwqEDr9m3NGrmiOgatCnRPwD95=MHgWQdwvnoMyQd3k9-Yw@mail.gmail.com>
 <CA+HiwqGTrQ=ywAmB2zP81jcENZh1vLuyJaC2-xhWvBsnXWgZYQ@mail.gmail.com>
 <CA+HiwqGe0W2C+SrKcxrk-r4JjO0sCfL581p1M2bzr_LSrzGn+g@mail.gmail.com>
 <20230207180855.xy5m4puwh5gzd7xy@awork3.anarazel.de> <CA+HiwqGBumxzWctnUy33dHy2uGCtfmcqKyw4FONJNyitJvujWw@mail.gmail.com>
 <CA+HiwqHmdH2bcUtBGncwB7iJ9N0VTkUo4YPYFNtJL_f3kkau=g@mail.gmail.com>
 <CA+HiwqHKTxfaYc=e4mVOv0iDm3vVK56WOBCddzYdXKaWQqniww@mail.gmail.com>
 <CA+HiwqHQ1PM+HXoEdvutj0huhu2cfmuPa8wtctor0NNADzZVvA@mail.gmail.com>
 <CA+HiwqH=cbBocfSmyjd_N7ZceZ3RtXuQ=rNkAfdn+RwqMGY9fQ@mail.gmail.com>
 <CA+HiwqHoZSM4A0HKoTERmp=_stQjpjmomgg=rCf_4x4qCpxbZA@mail.gmail.com> <1274885.1680558101@sss.pgh.pa.us>
In-Reply-To: <1274885.1680558101@sss.pgh.pa.us>
From: Amit Langote <amitlangote09@gmail.com>
Date: Tue, 4 Apr 2023 22:29:12 +0900
Message-ID: <CA+HiwqFfC7ANtb+HAHYuR4wnwYbQdbK5B0ee0fjtNwTt+TOdwg@mail.gmail.com>
Subject: Re: generic plans and "initial" pruning
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: Andres Freund <andres@anarazel.de>, Alvaro Herrera <alvherre@alvh.no-ip.org>, 
	David Rowley <dgrowleyml@gmail.com>, Jacob Champion <jchampion@timescale.com>, 
	PostgreSQL-development <pgsql-hackers@postgresql.org>, Robert Haas <robertmhaas@gmail.com>
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable
Archived-At: <https://www.postgresql.org/message-id/CA%2BHiwqFfC7ANtb%2BHAHYuR4wnwYbQdbK5B0ee0fjtNwTt%2BTOdwg%40mail.gmail.com>
Precedence: bulk

On Tue, Apr 4, 2023 at 6:41=E2=80=AFAM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Amit Langote <amitlangote09@gmail.com> writes:
> > [ v38 patchset ]
>
> I spent a little bit of time looking through this, and concluded that
> it's not something I will be wanting to push into v16 at this stage.
> The patch doesn't seem very close to being committable on its own
> terms, and even if it was now is not a great time in the dev cycle
> to be making significant executor API changes.  Too much risk of
> having to thrash the API during beta, or even change it some more
> in v17.  I suggest that we push this forward to the next CF with the
> hope of landing it early in v17.

OK, thanks a lot for your feedback.

> A few concrete thoughts:
>
> * I understand that your plan now is to acquire locks on all the
> originally-named tables, then do permissions checks (which will
> involve only those tables), then dynamically lock just inheritance and
> partitioning child tables as we descend the plan tree.

Actually, with the current implementation of the patch, *all* of the
relations mentioned in the plan tree would get locked during the
ExecInitNode() traversal of the plan tree (and of those in
plannedstmt->subplans), not just the inheritance child tables.
Locking of non-child tables done by the executor after this patch is
duplicative with AcquirePlannerLocks(), so that's something to be
improved.

> That seems
> more or less okay to me, but it could be reflected better in the
> structure of the patch perhaps.
>
> * In particular I don't much like the "viewRelations" list, which
> seems like a wart; those ought to be handled more nearly the same way
> as other RTEs.  (One concrete reason why is that this scheme is going
> to result in locking views in a different order than they were locked
> during original parsing, which perhaps could contribute to deadlocks.)
> Maybe we should store an integer list of which RTIs need to be locked
> in the early phase?  Building that in the parser/rewriter would provide
> a solid guide to the original locking order, so we'd be trivially sure
> of duplicating that.  (It might be close enough to follow the RT list
> order, which is basically what AcquireExecutorLocks does today, but
> this'd be more certain to do the right thing.)  I'm less concerned
> about lock order for child tables because those are just going to
> follow the inheritance or partitioning structure.

What you've described here sounds somewhat like what I had implemented
in the patch versions till v31, though it used a bitmapset named
minLockRelids that is initialized by setrefs.c.  Your idea of
initializing a list before planning seems more appealing offhand than
the code I had added in setrefs.c to populate that minLockRelids
bitmapset, which would be bms_add_range(1, list_lenth(finalrtable)),
followed by bms_del_members(set-of-child-rel-rtis).

I'll give your idea a try.

> * I don't understand the need for changes like this:
>
>         /* clean up tuple table */
> -       ExecClearTuple(node->ps.ps_ResultTupleSlot);
> +       if (node->ps.ps_ResultTupleSlot)
> +               ExecClearTuple(node->ps.ps_ResultTupleSlot);
>
> ISTM that the process ought to involve taking a lock (if needed)
> before we have built any execution state for a given plan node,
> and if we find we have to fail, returning NULL instead of a
> partially-valid planstate node.  Otherwise, considerations of how
> to handle partially-valid nodes are going to metastasize into all
> sorts of places, almost certainly including EXPLAIN for instance.
> I think we ought to be able to limit the damage to "parent nodes
> might have NULL child links that you wouldn't have expected".
> That wouldn't faze ExecEndNode at all, nor most other code.

Hmm, yes, taking a lock before allocating any of the stuff to add into
the planstate seems like it's much easier to reason about than the
alternative I've implemented.

> * More attention is needed to comments.  For example, in a couple of
> places in plancache.c you have removed function header comments
> defining API details and not replaced them with any info about the new
> details, despite the fact that those details are more complex than the
> old.

OK, yeah, maybe I've added a bunch of explanations in execMain.c that
should perhaps have been in plancache.c.

> > It seems I hadn't noted in the ExecEndNode()'s comment that all node
> > types' recursive subroutines need to  handle the change made by this
> > patch that the corresponding ExecInitNode() subroutine may now return
> > early without having initialized all state struct fields.
> > Also noted in the documentation for CustomScan and ForeignScan that
> > the Begin*Scan callback may not have been called at all, so the
> > End*Scan should handle that gracefully.
>
> Yeah, I think we need to avoid adding such requirements.  It's the
> sort of thing that would far too easily get past developer testing
> and only fail once in a blue moon in the field.

OK, got it.

--
Thanks, Amit Langote
EDB: http://www.enterprisedb.com