MIME-Version: 1.0
References: <CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com>
 <1342498.1729444411@sss.pgh.pa.us> <CACxu=vLXvpzN4X3k+9jsMt6ujuOvFVUSkA80t_cROSsF4y2jQQ@mail.gmail.com>
 <1445998.1729482404@sss.pgh.pa.us> <CACxu=vKEF8Qa-OaADFxf0uMg-xw6gH_CNCWd2s+xaqh-gY4=xg@mail.gmail.com>
 <2062830.1729625620@sss.pgh.pa.us> <2265411.1729699470@sss.pgh.pa.us>
 <CACxu=v++HNmss59yGUDkRny7g=M8tZ2YXF07AUXqKVGqcSfxGQ@mail.gmail.com>
 <2354718.1729737539@sss.pgh.pa.us> <2581216.1729794746@sss.pgh.pa.us>
In-Reply-To: <2581216.1729794746@sss.pgh.pa.us>
From: Michel Pelletier <pelletier.michel@gmail.com>
Date: Thu, 31 Oct 2024 16:51:59 -0700
Message-ID: <CACxu=v+dn37zr8gx5xNP-EZY3OLtGLTHrbx_ZkCQc40HpyMLKA@mail.gmail.com>
Subject: Re: Using Expanded Objects other than Arrays from plpgsql
To: Tom Lane <tgl@sss.pgh.pa.us>
Cc: pgsql-hackers@lists.postgresql.org
Content-Type: multipart/alternative; boundary="0000000000003b23cf0625ce832e"
Archived-At: <https://www.postgresql.org/message-id/CACxu%3Dv%2Bdn37zr8gx5xNP-EZY3OLtGLTHrbx_ZkCQc40HpyMLKA%40mail.gmail.com>
Precedence: bulk

--0000000000003b23cf0625ce832e
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

On Thu, Oct 24, 2024 at 11:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

> I wrote:
> > ... I'm still writing up
> > details, but right now I'm envisioning completely separate sets of
> > rules for the prosupport case versus the no-prosupport case.
>
> So here is the design I've come up with for optimizing R/W expanded
> object updates in plpgsql without any special knowledge from a
> prosupport function.  AFAICS this requires no assumptions at all
> about the behavior of called functions, other than the bare minimum
> "you can't corrupt the object to the point where it wouldn't be
> cleanly free-able".  In particular that means it can work for
> user-written called functions in plpgsql, SQL, or whatever, not
> only for C-coded functions.
>

Great. I checked with the upstream library authors, and they confirmed
that the object can't be corrupted to the point where it can't be freed.
My expanded objects are just a box around a library handle, so I use a
MemoryContext callback to call the library's free function when the
context is cleaned up, and we can't think of a path where that would fail.
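
For concreteness, here is a minimal, standalone sketch of that cleanup
pattern. It is a toy model rather than extension code: ToyContext,
ResetCallback, LibHandle, lib_new, lib_free, BoxedObject and the toy_*
and boxed_* functions are all hypothetical names invented for
illustration. In a real extension the registration step would go through
PostgreSQL's MemoryContextRegisterResetCallback() with a
MemoryContextCallback struct, and the box itself would live in the
expanded object's own context.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the upstream library: an opaque handle
 * plus a free routine. The counter only exists so the sketch can
 * observe how many handles were released. */
typedef struct LibHandle { int *freed_counter; } LibHandle;

static LibHandle *lib_new(int *freed_counter)
{
    LibHandle *h = malloc(sizeof(LibHandle));
    if (h == NULL)
        abort();
    h->freed_counter = freed_counter;
    return h;
}

static void lib_free(LibHandle *h)
{
    (*h->freed_counter)++;
    free(h);
}

/* Toy model of a memory context that runs callbacks when it is reset. */
typedef void (*ResetCallbackFn)(void *arg);

typedef struct ResetCallback
{
    ResetCallbackFn func;
    void *arg;
    struct ResetCallback *next;
} ResetCallback;

typedef struct ToyContext { ResetCallback *callbacks; } ToyContext;

/* Push the callback onto the context's list (newest first). */
static void toy_register_reset_callback(ToyContext *cxt, ResetCallback *cb)
{
    cb->next = cxt->callbacks;
    cxt->callbacks = cb;
}

/* Run and unlink every registered callback, newest first. */
static void toy_context_reset(ToyContext *cxt)
{
    while (cxt->callbacks != NULL)
    {
        ResetCallback *cb = cxt->callbacks;
        cxt->callbacks = cb->next;
        cb->func(cb->arg);
    }
}

/* The "box": a library handle plus the callback that releases it. */
typedef struct BoxedObject
{
    LibHandle *handle;
    ResetCallback cb;
} BoxedObject;

/* Callback body: free the handle exactly once. */
static void boxed_release(void *arg)
{
    BoxedObject *box = (BoxedObject *) arg;
    assert(box != NULL);
    if (box->handle != NULL)
    {
        lib_free(box->handle);
        box->handle = NULL;
    }
}

/* Create the handle and tie its lifetime to the context. */
static void boxed_init(BoxedObject *box, ToyContext *cxt, int *freed_counter)
{
    box->handle = lib_new(freed_counter);
    box->cb.func = boxed_release;
    box->cb.arg = box;
    toy_register_reset_callback(cxt, &box->cb);
}
```

The point of the design is that the only thing the cleanup path needs
from the object is a valid handle pointer, which is why "cleanly
free-able" is a requirement the library can meet regardless of what a
called function did to the object's contents.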


>
> There are two requirements to apply the optimization:
>
> * If the assignment statement is within a BEGIN ... EXCEPTION block,
> its target variable must be declared inside the most-closely-nested
> such block.  This ensures that if an error is thrown from within the
> assignment statement's expression, we do not care about the value
> of the target variable, except to the extent of being able to clean
> it up.
>

My users are writing algebraic expressions to be evaluated in bulk on
GPUs and similar hardware.  I don't think I have to worry too much about
them wrapping things in exception blocks while handling my library objects.

<snip>

> While I've not tried to write any code yet, I think both of these
> conditions should be reasonably easy to verify.
>
> Given that those conditions are met and the current value of the
> assignment target variable is a R/W expanded pointer, we can
> execute the assignment as follows:
>
> <snip>

> So, while this design greatly expands the set of cases we can
> optimize, it does lose some cases that the old approach could
> support.  I envision addressing that by allowing a prosupport
> function attached to the RHS' topmost function to "bless"
> other cases as safe, using reasoning similar to the old rules.
> (Or different rules, even, but it's on the prosupport function
> to be sure it's safe.)  I don't have a detailed design in mind,
> but I'm thinking along the lines of just passing the whole RHS
> expression to the prosupport function and letting it decide
> what's safe.  In any case, we don't need to even call the
> prosupport function unless there's an exception block or
> multiple RHS references to the target variable.
>

That all sounds great.  It sounds like my prosupport function just needs
to return true, or set some kind of flag saying aliasing is OK.  I'd like
to help as much as possible, but some of that reparenting stuff was pretty
deep for me.  Other than being a quick sanity-check case, is there
anything I can do to help?


>
>                         regards, tom lane
>

--0000000000003b23cf0625ce832e--