From: Michel Pelletier
Date: Thu, 31 Oct 2024 16:51:59 -0700
Subject: Re: Using Expanded Objects other than Arrays from plpgsql
To: Tom Lane
Cc: pgsql-hackers@lists.postgresql.org

On Thu, Oct 24, 2024 at 11:32 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> I wrote:
> > ... I'm still writing up
> > details, but right now I'm envisioning completely separate sets of
> > rules for the prosupport case versus the no-prosupport case.
>
> So here is the design I've come up with for optimizing R/W expanded
> object updates in plpgsql without any special knowledge from a
> prosupport function.  AFAICS this requires no assumptions at all
> about the behavior of called functions, other than the bare minimum
> "you can't corrupt the object to the point where it wouldn't be
> cleanly free-able".  In particular that means it can work for
> user-written called functions in plpgsql, SQL, or whatever, not
> only for C-coded functions.

Great, I checked with the upstream library authors and they verified that the object can't be corrupted to the point where it can't be freed.  Since my expanded objects are just a box around a library handle, I use a MemoryContext callback to call the library free function when the context is cleaned up, and we can't think of a path where that will fail.
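
The cleanup arrangement described above can be modeled in a few lines of standalone C. This is only a sketch: `ToyContext`, `ResetCallback`, `LibHandle`, `Box`, and every function below are hypothetical names invented for illustration, and the toy context merely imitates the shape of PostgreSQL's `MemoryContextRegisterResetCallback()` mechanism rather than using it. It is not the actual extension code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for an opaque handle owned by an external library. */
typedef struct LibHandle { int id; } LibHandle;

static int lib_free_calls = 0;      /* counts calls to the library's free routine */

static LibHandle *lib_new(int id)
{
    LibHandle *h = malloc(sizeof *h);
    if (h != NULL)
        h->id = id;
    return h;
}

static void lib_free(LibHandle *h)
{
    free(h);
    lib_free_calls++;
}

/* Toy model of a memory context that keeps a list of reset callbacks. */
typedef struct ResetCallback {
    void (*func)(void *arg);
    void *arg;
    struct ResetCallback *next;
} ResetCallback;

typedef struct ToyContext { ResetCallback *callbacks; } ToyContext;

static void toy_context_register_callback(ToyContext *cxt, ResetCallback *cb)
{
    assert(cb->func != NULL);
    cb->next = cxt->callbacks;      /* push onto the front of the list */
    cxt->callbacks = cb;
}

static void toy_context_reset(ToyContext *cxt)
{
    /* Unlink each callback before running it, so it runs exactly once. */
    while (cxt->callbacks != NULL) {
        ResetCallback *cb = cxt->callbacks;
        cxt->callbacks = cb->next;
        cb->func(cb->arg);
    }
}

/* The "box": a thin wrapper whose only resource is the library handle. */
typedef struct Box {
    LibHandle *handle;
    ResetCallback cb;               /* embedded so it lives as long as the box */
} Box;

static void box_release(void *arg)
{
    Box *box = arg;
    if (box->handle != NULL) {
        lib_free(box->handle);
        box->handle = NULL;
    }
}

static void box_init(Box *box, ToyContext *cxt, int id)
{
    box->handle = lib_new(id);
    box->cb.func = box_release;
    box->cb.arg = box;
    toy_context_register_callback(cxt, &box->cb);
}
```

In this model, resetting the context runs each registered callback exactly once, so the library handle is released even if the code that created the box never gets a chance to free it explicitly.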

> There are two requirements to apply the optimization:
>
> * If the assignment statement is within a BEGIN ... EXCEPTION block,
> its target variable must be declared inside the most-closely-nested
> such block.  This ensures that if an error is thrown from within the
> assignment statement's expression, we do not care about the value
> of the target variable, except to the extent of being able to clean
> it up.

My users are writing algebraic expressions to be done in bulk on GPUs, etc.  I don't think I have to worry too much about wrapping stuff in exception blocks while handling my library objects.

<snip>
> While I've not tried to write any code yet, I think both of these
> conditions should be reasonably easy to verify.
>
> Given that those conditions are met and the current value of the
> assignment target variable is a R/W expanded pointer, we can
> execute the assignment as follows:

<snip>
> So, while this design greatly expands the set of cases we can
> optimize, it does lose some cases that the old approach could
> support.  I envision addressing that by allowing a prosupport
> function attached to the RHS' topmost function to "bless"
> other cases as safe, using reasoning similar to the old rules.
> (Or different rules, even, but it's on the prosupport function
> to be sure it's safe.)  I don't have a detailed design in mind,
> but I'm thinking along the lines of just passing the whole RHS
> expression to the prosupport function and letting it decide
> what's safe.  In any case, we don't need to even call the
> prosupport function unless there's an exception block or
> multiple RHS references to the target variable.

That all sounds great, and it sounds like my prosupport function just needs to return true, or set some kind of flag saying aliasing is ok.  I'd like to help as much as possible, but some of that reparenting stuff was pretty deep for me.  Other than being a quick sanity-check case, is there anything I can do to help?

>                         regards, tom lane