From: Michel Pelletier
Date: Wed, 23 Oct 2024 19:10:56 -0700
Subject: Re: Using Expanded Objects other than Arrays from plpgsql
To: Tom Lane
Cc: pgsql-hackers@lists.postgresql.org
In-Reply-To: <2265411.1729699470@sss.pgh.pa.us>
References: <1342498.1729444411@sss.pgh.pa.us> <1445998.1729482404@sss.pgh.pa.us> <2062830.1729625620@sss.pgh.pa.us> <2265411.1729699470@sss.pgh.pa.us>

On Wed, Oct 23, 2024 at 9:04 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > One idea I was toying with is that it doesn't matter if f()
> > throws an error so long as the plpgsql function is not executing
> > within an exception block: if the error propagates out of the plpgsql
> > function then we no longer care about the value of the variable.
> > That would very substantially weaken the requirements on how f()
> > is implemented.
>
> The more I think about this idea the better I like it.  We can
> improve on the original concept a bit: the assignment can be
> within an exception block so long as the target variable is too.
> For example, consider
>
>         DECLARE x float8[];
>         BEGIN
>            ...
>            DECLARE y float8[];
>            BEGIN
>               x := array_append(x, 42);
>               y := array_append(y, 42);
>            END;
>         EXCEPTION WHEN ...;
>         END;
>
> Currently, both calls of array_append are subject to R/W optimization,
> so that array_append must provide a strong guarantee that it won't
> throw an error after it's begun to change the R/W object.  If we
> redefine things so that the optimization is applied only to "y",
> then AFAICS we need nothing from array_append.  It only has to be
> sure it doesn't corrupt the object so badly that it can't be freed
> ... but that requirement exists already, for anything dealing with
> expanded objects.  So this would put us in a situation where we
> could apply the optimization by default, which'd be a huge win.

Great!  I can make that same guarantee.

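For illustration only, here is a minimal sketch of the common case the quoted rule is aiming at: a plpgsql function with no EXCEPTION block, where the target variable appears exactly once in the function's arguments. The function name build_series is hypothetical and not from this thread, and whether a given server version actually applies the R/W optimization here depends on how the proposal ends up being implemented.

```sql
-- Hypothetical example; build_series is not part of any extension.
CREATE FUNCTION build_series(n int) RETURNS float8[]
LANGUAGE plpgsql AS $$
DECLARE
    acc float8[] := '{}';
BEGIN
    FOR i IN 1..n LOOP
        -- "acc" is mentioned once in the arguments, and no EXCEPTION
        -- block encloses this assignment, so an error raised by
        -- array_append would simply propagate out of the function.
        acc := array_append(acc, i::float8);
    END LOOP;
    RETURN acc;
END;
$$;

SELECT build_series(3);
```

If array_append fails partway through, nothing can observe "acc" afterwards, which is the reason the function no longer needs to make a strong guarantee in this situation.
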
> There is an exception: if we are considering
>
>               x := array_cat(x, x);
>
> then I don't think we can optimize because of the aliasing problem
> I mentioned before.  So there'd have to be a restriction that the
> target variable is mentioned only once in the function's arguments.
> For stuff like your vxm() function, that'd be annoying.  But functions
> that need that and are willing to deal with the aliasing hazard could
> still provide a prosupport function that promises it's okay.  What
> we'd accomplish is that a large fraction of interesting functions
> could get the benefit without having to create a prosupport function,
> which is a win all around.

I see.  I'm not completely clear on the prosupport code, so let me repeat it back to make sure I'm getting it right: it looks like I'll need to write a C function, specified with CREATE FUNCTION ... SUPPORT, that the planner will call to ask whether it's OK to alias arguments (which is fine for SuiteSparse, so no problem).  You also mentioned, a couple of emails back, a "type support" feature similar to prosupport that would allow me to specify an eager expansion function for my types.

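As a rough sketch of the SQL side only, assuming the usual extension-script conventions: the names vxm_support, vector and matrix are placeholders, the C bodies are not shown (so this will not load without the compiled module), and what the support function would actually be asked for this optimization is not settled in this thread.

```sql
-- Hypothetical names throughout; intended for an extension script,
-- where MODULE_PATHNAME is replaced by the shared library path.

-- A planner support function takes and returns "internal".
CREATE FUNCTION vxm_support(internal) RETURNS internal
    AS 'MODULE_PATHNAME', 'vxm_support'
    LANGUAGE C STRICT;

-- Attach the support function to the function it describes.
CREATE FUNCTION vxm(vector, matrix) RETURNS vector
    AS 'MODULE_PATHNAME', 'vxm'
    LANGUAGE C STRICT
    SUPPORT vxm_support;
```

The SUPPORT clause only registers the C function; the promise that aliasing is safe would have to be expressed in whatever request the support function answers.
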
> Also worth noting: in the above example, we could optimize the
> update on "x" too, if we know that "x" is not referenced in the
> block's EXCEPTION handlers.  I wouldn't bother with this in the
> first version, but it might be worth doing later.
>
> So if we go this way, the universe of functions that can benefit
> from the optimization enlarges considerably, and the risk of bugs
> that break the optimization drops considerably.  The cost is that
> some cases that were optimized before now will not be.  But I
> suspect that plpgsql functions where this optimization is key
> probably don't contain EXCEPTION handlers at all, so that they
> won't notice any change.

Sounds like a good tradeoff to me!  Hopefully, if anyone does have concerns with this approach, they'll see this thread and comment.

Thanks,

-Michel

>
>                         regards, tom lane