From: Michel Pelletier
Date: Tue, 22 Oct 2024 21:15:30 -0700
Subject: Re: Using Expanded Objects other than Arrays from plpgsql
To: Tom Lane
Cc: pgsql-hackers@lists.postgresql.org
In-Reply-To: <2062830.1729625620@sss.pgh.pa.us>
References: <1342498.1729444411@sss.pgh.pa.us> <1445998.1729482404@sss.pgh.pa.us> <2062830.1729625620@sss.pgh.pa.us>

On Tue, Oct 22, 2024 at 12:33 PM Tom Lane wrote:

> Michel Pelletier writes:
>
> But now we also have
> expanded records, and with your use-case as an example of an extension
> trying to do similar things, I feel like we have enough examples to
> move forward.

Great!

> As far as the hack we were discussing upthread goes, I realized that
> it should test for typlen == -1 not just !typbyval, since the
> VARATT_IS_EXTERNAL_EXPANDED_RW test requires that there be a length
> word.  With that fix and some comment updates, it looks like the
> attached.  I'm inclined to just go ahead and push that.  It won't move
> the needle hugely far, but it will help, and it seems simple and safe.

I made those changes and my code works a bit faster; it looks like it
takes a couple of the top-level expansions out.  I'll have more data in
the morning.

> To make more progress, there are basically two areas that we need
> to look at:
>
> 1. exec_assign_value() has hard-wired knowledge that it's a good idea
> to expand an array that's being assigned to a plpgsql variable, and
> likewise hard-wired knowledge that calling expand_array() is how to
> do that.  The bit in plpgsql_exec_function() that we're patching
> in the attached is the same thing, but for the initial assignment of
> a function input parameter to a plpgsql variable.  At the time this
> was written I was quite unsure that forcible expansion would be a net
> win, but the code is nine years old now and there's been few or no
> performance complaints.  So maybe it's okay to decide that "always
> expand expandable types during assignment" is a suitable policy across
> the board, and we don't need to figure out a smarter rule.  It sounds
> like that'd probably be a win for your application, which gives me
> more faith in the idea than I would've had before.

Definitely a win, as the flattened format of my objects doesn't have
any run-time use, so there is no chance of a net loss for me.  I guess
I'm using this feature differently from arrays, which have a usable
flattened format, so there is a trade-off to weigh between expanding or
not.  In my case, only the expanded version is useful, and serializing
the flat version is expensive.
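
As a side note on the typlen check quoted above, here is a minimal C sketch of why testing only !typbyval is too loose. The TypeStorage struct and both helper names are hypothetical stand-ins for illustration, not PostgreSQL definitions; the typlen/typbyval values mentioned afterwards are the ones pg_type records for those built-in types.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the two pg_type columns the check looks at. */
typedef struct TypeStorage
{
    bool    typbyval;   /* true if values are passed by value */
    int16_t typlen;     /* > 0: fixed size, -1: varlena, -2: C string */
} TypeStorage;

/* Too-loose test: true for every pass-by-reference type. */
bool is_by_reference(TypeStorage t)
{
    return !t.typbyval;
}

/*
 * Tighter test: only varlena types start with a length word, and that
 * header is what a VARATT_IS_EXTERNAL_EXPANDED_RW-style check inspects.
 */
bool is_varlena(TypeStorage t)
{
    return !t.typbyval && t.typlen == -1;
}
```

For example, text (typlen -1) passes both tests, while name (typlen 64) and cstring (typlen -2) are pass-by-reference but have no varlena header, so only the tighter test rejects them.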

Formalizing something like expand_array would work well for me, as my
expand_matrix function has the identical function signature and serves
the exact same purpose.

> So now
> I'm thinking that we should steal ideas from the "prosupport" API
> (see src/include/nodes/supportnodes.h) and invent a concept of a
> "type support" function that can handle an extensible set of
> different requests.  The first one would be to pass back the
> address of an expansion function comparable to expand_array(),
> if the type supports being converted to an expanded object.

I'll look into the support code; I haven't seen that before.

> 2. Most of the performance gold is hidden in deciding when we
> can optimize operations on expanded objects that look like
>
>         plpgsql_var := f(plpgsql_var, other-parameters);
>
> by passing a R/W rather than R/O expanded pointer to f() and letting
> it munge the expanded object in-place.  If we fail to do that then
> f() has to construct a new expanded object for its result.  It's
> probably still better off getting a R/O pointer than a flat object,
> but nonetheless we fail to avoid a lot of data copying.
>
> The difficulty here is that we do not want to change the normal naive
> semantics of such an assignment, in particular "if f() throws an error
> then the value of plpgsql_var has not been modified".  This means that
> we can only optimize when the R/W parameter is to be passed to the
> top-level function of the expression (else, some function called later
> could throw an error and ruin things).  Furthermore, we need a
> guarantee from f() that it will not throw an error after modifying
> the value.
>
> As things stand, plpgsql has hard-wired knowledge that
> array_subscript_handler(), array_append(), and array_prepend()
> are safe in that way, but it knows nothing about anything else.
> So one route to making things better seems fairly clear: invent a new
> prosupport request that asks whether the function is prepared to make
> such a guarantee.
> I wonder though if you can guarantee any such thing
> for your functions, when you are relying on a library that's probably
> not designed with such a restriction in mind.  If this won't work then
> we need a new concept.

This will work for the GraphBLAS API.  The expanded object in my case
is really just a small box struct around a GraphBLAS "handle", which is
an opaque pointer to data that I cannot mutate; only the library can
change the object behind the handle.  The API makes strong guarantees
that it will either do the operation and return a success code or not
do the operation and return an error code.  It's not possible
(normally) to get a corrupt or incomplete object back.

> One idea I was toying with is that it doesn't matter if f()
> throws an error so long as the plpgsql function is not executing
> within an exception block: if the error propagates out of the plpgsql
> function then we no longer care about the value of the variable.
> That would very substantially weaken the requirements on how f()
> is implemented.  I'm not sure that we could make this work across
> multiple levels of plpgsql functions (that is, if the value of the
> variable ultimately resides in some outer function) but right now
> that's not an issue since no plpgsql function as such would ever
> promise to be safe, thus it would never receive a R/W pointer to
> some outer function's variable.

The water here is pretty deep for me, but I'm pretty sure I get what
you're saying.  I'll need to do some more studying of the plpgsql code,
which I've spent the last couple of days familiarizing myself with.

> > same with:
> >         bfs_vector = vxm(bfs_vector, graph, 'any_secondi_int32',
> >                          w=>bfs_vector, accum=>'min_int32');
> > This matmul mutates bfs_vector, I shouldn't need to reassign it back but at
> > the moment it seems necessary otherwise the mutations are lost but this
> > costs a full flatten/expand cycle.
>
> I'm not hugely excited about that.
> We could imagine extending
> this concept to INOUT parameters of procedures, but it doesn't
> seem like that buys anything except a small notational savings.
> Maybe it would work to allow multiple parameters of a procedure
> to be passed as R/W, whereas we're restricted to one for the
> function-notation method.  So possibly there's a gain there but
> I'm not sure how big.
>
> BTW, if I understand your example correctly then bfs_vector is
> being passed to vxm() twice.

Yes, this is not unusual in this form of linear algebra, as multiple
operations often accumulate into the same object to prevent a bunch of
copying during each iteration of a given algorithm.  There is also a
"mask" parameter where another or the same object can be provided to
either mask in or out (complement mask) values during the accumulation
phase.  This is very useful for many algorithms, and a good example is
the Burkhardt method of Triangle Counting
(https://journals.sagepub.com/doi/10.1177/1473871616666393), which in
GraphBLAS boils down to:

    GrB_mxm (C, A, NULL, semiring, A, A, GrB_DESC_S) ;
    GrB_reduce (&ntri, NULL, monoid, C, NULL) ;
    ntri /= 6 ;

In this case A is three of the parameters to mxm: the left operand, the
right operand, and a structural mask.  This can be summed up as "A
squared, masked by A", which when reduced returns the number of
triangles in the graph (times 6).

> This brings up an interesting
> point: if we pass the first instance as R/W, and vxm() manipulates it,
> then the changes would also be visible in its other parameter "w".
> This is certainly not per normal semantics.  A "safe" function would
> have to either not have any possibility that two parameters could
> refer to the same object, or be coded in a way that made it impervious
> to this issue --- in your example, it couldn't look at "w" anymore
> once it'd started modifying the first parameter.  Is that an okay
> requirement, and if not what shall we do about it?
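
To make the aliasing hazard in the question above concrete, here is a toy C sketch. It is only an illustration under simple assumptions: add_reversed and add_reversed_safe are hypothetical functions on plain int arrays, not GraphBLAS, OneSparse, or PostgreSQL code. The first function writes into its first argument while still reading its second, so its result changes when both point at the same storage; the second takes a snapshot before the first write and is therefore impervious to aliasing.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/*
 * Updates dst in place while reading src: dst[i] += src[n - 1 - i].
 * If dst and src are the same array, later iterations read elements
 * that earlier iterations have already overwritten.
 */
void add_reversed(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] += src[n - 1 - i];
}

/*
 * Alias-safe variant: copy src before modifying dst, so the result is
 * the same whether or not the two arguments refer to the same array.
 * Returns 0 on success, -1 if the snapshot cannot be allocated; in the
 * failure case dst has not been touched.
 */
int add_reversed_safe(int *dst, const int *src, size_t n)
{
    if (n == 0)
        return 0;

    int *snapshot = malloc(n * sizeof *snapshot);
    if (snapshot == NULL)
        return -1;

    memcpy(snapshot, src, n * sizeof *snapshot);
    add_reversed(dst, snapshot, n);
    free(snapshot);
    return 0;
}
```

With {1, 2, 3} passed as both arguments, add_reversed produces {4, 4, 7}, whereas ordinary value semantics (and add_reversed_safe) give {4, 4, 4}.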

I *think*, if I understand you correctly, that this isn't an issue for
GraphBLAS.  My expanded objects are just boxes around an opaque handle;
I don't actually mutate anything inside the box, and I can't see past
the opaque pointer.  SuiteSparse may be storing the matrix in one of
many different formats, or on a GPU, or who knows; all I have is a
handle to "A", which I pass to GraphBLAS methods, and that is the only
way I can interact with it.  Here's the definition of that vxm
function:

https://github.com/OneSparse/OneSparse/blob/main/src/matrix.c#L907

It's pretty straightforward: get the arguments and pass them to the
GraphBLAS API.  There is no mutable structure inside the expanded
"box", just the handle.

I'm using the expanded object API to solve my two key problems:
flattening an object for disk storage, and expanding that object
(through the GraphBLAS serialize/deserialize API) into an object
handle.  The handle is secretly just a pointer to the internal details
of the object, of course, but I can't see or change that; only
SuiteSparse can.

(BTW, sorry about the bad parameter names.  "w" is the name from the
API spec, which I've been sticking to.  It is the optional output
object to use; if one is not passed, I create a new one.  This is
similar to the numpy "out" parameter semantics.)

I added some debug instrumentation that might show a bit more of what's
going on for me.  Consider this function:

CREATE OR REPLACE FUNCTION test(graph matrix)
    RETURNS matrix LANGUAGE plpgsql AS
    $$
    DECLARE
      n bigint = nrows(graph);
      m bigint = ncols(graph);
    BEGIN
    RETURN graph;
    end;
    $$;

The graph passes straight through, but first I call two methods to get
the number of rows and columns.
When I run it on a graph:

postgres=# select pg_typeof(test(graph)) from test_graphs ;
DEBUG:  matrix_nvals
DEBUG:  DatumGetMatrix
DEBUG:  expand_matrix
DEBUG:  new_matrix
DEBUG:  context_callback_matrix_free
DEBUG:  matrix_ncols
DEBUG:  DatumGetMatrix
DEBUG:  expand_matrix
DEBUG:  new_matrix
DEBUG:  context_callback_matrix_free
 pg_typeof
-----------
 matrix
(1 row)

The matrix gets expanded twice, presumably because the object comes in
flat and both nrows() and ncols() expand it, which ends up as two
separate handles and thus two separate objects to GraphBLAS.

Here's another example:

CREATE OR REPLACE FUNCTION test2(graph matrix)
    RETURNS bigint LANGUAGE plpgsql AS
    $$
    BEGIN
    perform set_element(graph, 1, 1, 1);
    RETURN nvals(graph);
    end;
    $$;
CREATE FUNCTION
postgres=# select test2(matrix('int32'));
DEBUG:  new_matrix
DEBUG:  matrix_get_flat_size
DEBUG:  flatten_matrix
DEBUG:  scalar_int32
DEBUG:  new_scalar
DEBUG:  matrix_set_element
DEBUG:  DatumGetMatrix
DEBUG:  expand_matrix
DEBUG:  new_matrix
DEBUG:  DatumGetScalar
DEBUG:  matrix_get_flat_size
DEBUG:  matrix_get_flat_size
DEBUG:  flatten_matrix
DEBUG:  context_callback_matrix_free
DEBUG:  context_callback_scalar_free
DEBUG:  matrix_nvals
DEBUG:  DatumGetMatrix
DEBUG:  expand_matrix
DEBUG:  new_matrix
DEBUG:  context_callback_matrix_free
DEBUG:  context_callback_matrix_free
 test2
-------
     0
(1 row)

I would expect that to return 1.  If I do
"graph = set_element(graph, 1, 1, 1)" it works.

I hope that gives a bit more information about my use cases.  In
general I'm very happy with the API; it's very algebraic.  I have a lot
of interesting plans for supporting more operators and subscription
syntax, but this issue is not my top priority to see if we can resolve
it.  I'm sure I missed something in your detailed plan, so I'll be
going over it some more this week.  Please let me know if you have any
other questions about my use case or concerns about my expectations.

Thank you!

-Michel
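
One possible reading of the test2 trace above, sketched as a toy C model: if every call receives the flat value and expands its own private copy, a mutation survives only when the caller assigns the returned value back. All names here (FlatMatrix, ExpandedMatrix, expand, flatten, set_element, nvals, expansions) are hypothetical simplifications, not the OneSparse or PostgreSQL implementations, and the "matrix" is reduced to a stored-element count.

```c
/* Hypothetical flat (storable) form: only an element count in this model. */
typedef struct FlatMatrix { int nvals; } FlatMatrix;

/* Hypothetical expanded (working) form. */
typedef struct ExpandedMatrix { int nvals; } ExpandedMatrix;

/* Counts how many times a flat value was expanded. */
int expansions = 0;

/* Build a fresh expanded copy from the flat value. */
ExpandedMatrix expand(const FlatMatrix *flat)
{
    ExpandedMatrix m = { flat->nvals };
    expansions++;
    return m;
}

/* Serialize the expanded copy back to the flat form. */
FlatMatrix flatten(const ExpandedMatrix *m)
{
    FlatMatrix flat = { m->nvals };
    return flat;
}

/* Expands a private copy, mutates it, and returns the flattened result. */
FlatMatrix set_element(const FlatMatrix *arg)
{
    ExpandedMatrix m = expand(arg);
    m.nvals += 1;               /* the caller's flat value is untouched */
    return flatten(&m);
}

/* Expands a private copy just to read the element count. */
int nvals(const FlatMatrix *arg)
{
    ExpandedMatrix m = expand(arg);
    return m.nvals;
}
```

In this model, calling set_element and discarding the result (as perform does) leaves nvals at 0, while assigning the result back gives 1, and every call pays for its own expansion.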