public inbox for [email protected]
help / color / mirror / Atom feedFrom: Andrey Borodin <[email protected]>
To: Tom Lane <[email protected]>
Cc: Michel Pelletier <[email protected]>
Cc: Pavel Stehule <[email protected]>
Cc: [email protected]
Subject: Re: Using Expanded Objects other than Arrays from plpgsql
Date: Sun, 26 Jan 2025 22:04:15 +0500
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com>
<[email protected]>
<CACxu=vLXvpzN4X3k+9jsMt6ujuOvFVUSkA80t_cROSsF4y2jQQ@mail.gmail.com>
<[email protected]>
<CACxu=vKEF8Qa-OaADFxf0uMg-xw6gH_CNCWd2s+xaqh-gY4=xg@mail.gmail.com>
<[email protected]>
<[email protected]>
<CACxu=v++HNmss59yGUDkRny7g=M8tZ2YXF07AUXqKVGqcSfxGQ@mail.gmail.com>
<[email protected]>
<[email protected]>
<CACxu=v+dn37zr8gx5xNP-EZY3OLtGLTHrbx_ZkCQc40HpyMLKA@mail.gmail.com>
<[email protected]>
<CACxu=vL7i_U_iSNpREe8eCAMEyKHuPpk9THRpBhk+ar0U1EdOw@mail.gmail.com>
<CACxu=vKLc6f5N8_DR58LKkE1eohWSxTvThTeGsLm7p7QH1aFBA@mail.gmail.com>
<CACxu=vJf2S=ysun_h=zmYNu6oUM47+egbpX5mMC0X9BJK=EQwQ@mail.gmail.com>
<CACxu=vK+S6BXN8ZYyBvqQBWrcwHXqtue1-ZuKO3+XtHGBYcDUQ@mail.gmail.com>
<CAFj8pRCd6xcH-AYEyHFdGdU89O9JjZ-v-pyQnOwd9zNJkCEdhQ@mail.gmail.com>
<[email protected]>
<647219! ! [email protected]>
<CACxu=vJ1UeXPjPOzfT2E=tFpSCxPxcB7hgfqHYmefJWh3ZXPew@mail.gmail.com>
<[email protected]>
<CACxu=vKModwqSpEjpn-ur1uMAEYkcd-i8CvNsH-z9=dk7F7YQw@mail.gmail.com>
<[email protected]>
<[email protected]>
<[email protected]>
> On 26 Jan 2025, at 20:37, Tom Lane <[email protected]> wrote:
>
>> And the coverage of newly invented mark_stmt() 42.37%. Some of branches are easy noops, but some are not.
>
> Yeah. I'm not too concerned about that because it's pretty much a
> copy-and-paste of the adjacent code. Maybe we should think about
> some way of refactoring pl_funcs.c to reduce duplication, but I
> don't have any great ideas about how.
OK, now I got it. The whole purpose of 2nd patch is to do
if (expr->target_param >= 0) expr->target_is_local = bms_is_member(expr->target_param, local_dnos);
to local variables.
>
>> expr_is_assignment_source() is named like if it should return nool, but it's void.
>
> I've been less than satisfied with that name too. I intended it
> as a statement of fact, "this expression has been found to be
> the source of an assignment". But it does seem confusing.
> Maybe we should recast it as an action. What do you think of
> "mark_expr_as_assignment_source"?
Sounds better to me. I found no examples of similar functions nether in pl_gram.y, nor in gram.y, so IMO mark_expr_as_assignment_source() is the best candidate.
>
>> I could not grasp from reading the code one generic question about new optimization rule. What cost does checking for possible in-place update incurs to code cannot have this optimization? Is it O(numer_of_arguments) of for every assignment execution?
>
> No, the extra effort is incurred at most once per assignment statement
> per session. (Unless the plpgsql function's cache entry gets
> invalidated, in which case we'd rebuild all of the function's data
> structures and have to redo this work too.)
OK, I think execution benefits justify this preparatory costs.
> We set up the evaluation
> function "paramfunc" as plpgsql_param_eval_var_check if we think we
> might be able to apply this optimization, or plpgsql_param_eval_var_ro
> if we don't think so but the variable is of varlena type. At runtime,
> if the variable's current value is not actually expanded, then
> plpgsql_param_eval_var_check falls through doing essentially the same
> work as plpgsql_param_eval_var_ro, so there should be no added cost.
> The first time we observe that the value *is* expanded, we incur the
> cost to detect whether an optimization is really possible, and then
> we change the "paramfunc" pointer to be the appropriate function
> so as to apply the optimization or not without rechecking. So
> generally speaking, if we're considering a variable of a type that
> doesn't support expansion, there should be zero extra per-execution
> cost. There is some extra cost at function compilation time to
> determine which expressions are assignment sources (but we were doing
> that already) and to discover whether those assignments are to
> nonlocal variables (which is new work, but only needs to be done in
> functions with exception blocks).
Got it, many thanks for the explanation.
But I've got some new questions:
I'm lost in internals of ExprEvalStep. But void *paramarg and his friend void *paramarg2 are cryptic. They always have same type and same meaning, but have very generic names.
I wonder if you plan similar optimizations for array_cat(), array_remove() etc?
+ a := a || a; -- not optimizable
Why is it not optimizable? Because there is no support function, because array_cat() has no support function, or something else?
Besides this, the patch looks good to me.
Best regards, Andrey Borodin.
view thread (34+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Using Expanded Objects other than Arrays from plpgsql
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox