From: Michel Pelletier
Date: Wed, 23 Oct 2024 18:39:03 -0700
Subject: Re: Using Expanded Objects other than Arrays from plpgsql
To: Tom Lane
Cc: pgsql-hackers@lists.postgresql.org
In-Reply-To: <2259890.1729696885@sss.pgh.pa.us>

On Wed, Oct 23, 2024 at 8:21 AM Tom Lane wrote:

> Michel Pelletier writes:
> > Here's another example:
>
> > CREATE OR REPLACE FUNCTION test2(graph matrix)
> >     RETURNS bigint LANGUAGE plpgsql AS
> >     $$
> >     BEGIN
> >     perform set_element(graph, 1, 1, 1);
> >     RETURN nvals(graph);
> >     end;
> >     $$;
> > CREATE FUNCTION
> > postgres=# select test2(matrix('int32'));
> > DEBUG:  new_matrix
> > DEBUG:  matrix_get_flat_size
> > DEBUG:  flatten_matrix
> > DEBUG:  scalar_int32
> > DEBUG:  new_scalar
> > DEBUG:  matrix_set_element
> > DEBUG:  DatumGetMatrix
> > DEBUG:  expand_matrix
> > DEBUG:  new_matrix
> > DEBUG:  DatumGetScalar
> > DEBUG:  matrix_get_flat_size
> > DEBUG:  matrix_get_flat_size
> > DEBUG:  flatten_matrix
> > DEBUG:  context_callback_matrix_free
> > DEBUG:  context_callback_scalar_free
> > DEBUG:  matrix_nvals
> > DEBUG:  DatumGetMatrix
> > DEBUG:  expand_matrix
> > DEBUG:  new_matrix
> > DEBUG:  context_callback_matrix_free
> > DEBUG:  context_callback_matrix_free
> >  test2
> > -------
> >      0
> > (1 row)
>
> I'm a little confused by your debug output.  What are "scalar_int32"
> and "new_scalar", and what part of the plpgsql function is causing
> them to be invoked?
>

GraphBLAS scalars hold a single element value for the matrix type.
Internally, they are simply 1x1 matrices (much like vectors are 1xn
matrices).  The function signature is:

set_element(a matrix, i bigint, j bigint, s scalar)

There is a "CAST (integer as scalar)" function (scalar_int32) that casts
Postgres integers to GraphBLAS GrB_INT32 scalar elements (which calls
new_scalar because, like vectors and matrices, they are expanded objects
which have a GrB_Scalar "handle").  Scalars are useful for working with
individual values, for example reduce() returns a scalar.  There are way
more efficient ways to push huge C arrays of values into matrices but for
now I'm just working at the element level.

> Another thing that confuses me is why there's a second flatten_matrix
> operation happening here.  Shouldn't set_element return its result
> as a R/W expanded object?
>

That confuses me too, and my default assumption is always that I'm doing
it wrong.
set_element does return a R/W object afaict, here is the return:

https://github.com/OneSparse/OneSparse/blob/main/src/matrix.c#L1726

where:

#define OS_RETURN_MATRIX(_matrix) return EOHPGetRWDatum(&(_matrix)->hdr)

> > I would expect that to return 1.  If I do "graph = set_element(graph, 1, 1,
> > 1)" it works.
>
> I think you have a faulty understanding of PERFORM.  It's defined as
> "evaluate this expression and throw away the result", so it's *not*
> going to change "graph", not even if set_element declares that
> argument as INOUT.

Faulty indeed, I was going from the plpgsql statement documentation:

"Sometimes it is useful to evaluate an expression or SELECT query but
discard the result, for example when calling a function that has
side-effects but no useful result value."

My understanding of "side-effects" was flawed there, but I'm fine with
"x = set_element(x...)" anyway as I was trying to follow the example of
array_append et al.

> (Our interpretation of OUT arguments for functions
> is that they're just an alternate notation for specifying the function
> result.)  If you want to avoid the explicit assignment back to "graph"
> then the thing to do would be to declare set_element as a procedure,
> not a function, with an INOUT argument and then call it with CALL.
>

I'll stick with the assignment.

> That's only cosmetically different though, in that the updated
> "graph" value is still passed back much as if it were a function
> result, and then the CALL infrastructure knows it has to assign that
> back to the argument variable.  And, as I tried to explain earlier,
> that code path currently has no mechanism for avoiding making a copy
> of the graph somewhere along the line: it will pass "graph" to the
> procedure as either a flat Datum or a R/O expanded object, so that
> set_element will be required to copy that before modifying it.
>

Right, I'm still figuring out exactly what that code flow is.
This is my first dive into these corners of the code so thank you for
being patient with me.  I promise to write up some expanded object
documentation if this works!

> We can imagine extending whatever we do for "x := f(x)" cases so that
> it also works during "CALL p(x)".  But I think that's only going to
> yield cosmetic or notational improvements so I don't want to start
> with doing that.  Let's focus first on improving the existing
> infrastructure for the f(x) case.
>

+1

-Michel
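
For reference, the two call styles discussed above could look roughly like this in plpgsql. This is an untested sketch: test2 is the function from the example with the PERFORM replaced by an assignment, while set_element_p and test3 are hypothetical names introduced only for illustration and are not part of OneSparse. It assumes the integer-to-scalar cast is applied to procedure arguments the same way it is in the function call above.

```sql
-- Assignment form: the result of set_element replaces "graph", so the
-- following nvals() call sees the element that was just stored.
CREATE OR REPLACE FUNCTION test2(graph matrix)
    RETURNS bigint LANGUAGE plpgsql AS
$$
BEGIN
    graph := set_element(graph, 1, 1, 1);
    RETURN nvals(graph);
END;
$$;

-- Procedure form (hypothetical wrapper, name is an assumption): the INOUT
-- argument is assigned back to the caller's variable by CALL.
CREATE OR REPLACE PROCEDURE set_element_p(INOUT a matrix, i bigint, j bigint, s scalar)
    LANGUAGE plpgsql AS
$$
BEGIN
    a := set_element(a, i, j, s);
END;
$$;

-- Caller using CALL instead of an explicit assignment (hypothetical name).
CREATE OR REPLACE FUNCTION test3(graph matrix)
    RETURNS bigint LANGUAGE plpgsql AS
$$
BEGIN
    CALL set_element_p(graph, 1, 1, 1);
    RETURN nvals(graph);
END;
$$;
```

As noted in the quoted explanation, the CALL form is only notationally different: the updated value is still passed back much like a function result, so it does not by itself avoid the copy.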