public inbox for [email protected]
help / color / mirror / Atom feedFrom: Michel Pelletier <[email protected]>
To: Tom Lane <[email protected]>
Cc: [email protected]
Subject: Re: Using Expanded Objects other than Arrays from plpgsql
Date: Wed, 23 Oct 2024 18:39:03 -0700
Message-ID: <CACxu=vJMqLUZR1N-AT4cXw9JHMsN=5-iYvPvKWy-CcKwz_9aBg@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CACxu=vJaKFNsYxooSnW1wEgsAO5u_v1XYBacfVJ14wgJV_PYeg@mail.gmail.com>
<[email protected]>
<CACxu=vLXvpzN4X3k+9jsMt6ujuOvFVUSkA80t_cROSsF4y2jQQ@mail.gmail.com>
<[email protected]>
<CACxu=vKEF8Qa-OaADFxf0uMg-xw6gH_CNCWd2s+xaqh-gY4=xg@mail.gmail.com>
<[email protected]>
<CACxu=v+t5xDh0K=tOVmFrumvAH60izWhB16a9k7f2A8jPxQEaw@mail.gmail.com>
<[email protected]>
On Wed, Oct 23, 2024 at 8:21 AM Tom Lane <[email protected]> wrote:
> Michel Pelletier <[email protected]> writes:
> > Here's another example:
>
> > CREATE OR REPLACE FUNCTION test2(graph matrix)
> > RETURNS bigint LANGUAGE plpgsql AS
> > $$
> > BEGIN
> > perform set_element(graph, 1, 1, 1);
> > RETURN nvals(graph);
> > end;
> > $$;
> > CREATE FUNCTION
> > postgres=# select test2(matrix('int32'));
> > DEBUG: new_matrix
> > DEBUG: matrix_get_flat_size
> > DEBUG: flatten_matrix
> > DEBUG: scalar_int32
> > DEBUG: new_scalar
> > DEBUG: matrix_set_element
> > DEBUG: DatumGetMatrix
> > DEBUG: expand_matrix
> > DEBUG: new_matrix
> > DEBUG: DatumGetScalar
> > DEBUG: matrix_get_flat_size
> > DEBUG: matrix_get_flat_size
> > DEBUG: flatten_matrix
> > DEBUG: context_callback_matrix_free
> > DEBUG: context_callback_scalar_free
> > DEBUG: matrix_nvals
> > DEBUG: DatumGetMatrix
> > DEBUG: expand_matrix
> > DEBUG: new_matrix
> > DEBUG: context_callback_matrix_free
> > DEBUG: context_callback_matrix_free
> > test2
> > -------
> > 0
> > (1 row)
>
> I'm a little confused by your debug output. What are "scalar_int32"
> and "new_scalar", and what part of the plpgsql function is causing
> them to be invoked?
>
GraphBLAS scalars hold a single element value for the matrix type.
Internally, they are simply 1x1 matrices (much like vectors are 1xn
matrices). The function signature is:
set_element(a matrix, i bigint, j bigint, s scalar)
There is a "CAST (integer as scalar)" function (scalar_int32) that casts
Postgres integers to GraphBLAS GrB_INT32 scalar elements (which calls
new_scalar because like vectors and matrices, they are expanded objects
which have a GrB_Scalar "handle"). Scalars are useful for working with
individual values, for example reduce() returns a scalar. There are way
more efficient ways to push huge C arrays of values into matrices but for
now I'm just working at the element level.
Another thing that confuses me is why there's a second flatten_matrix
> operation happening here. Shouldn't set_element return its result
> as a R/W expanded object?
>
That confuses me too, and my default assumption is always that I'm doing it
wrong. set_element does return a R/W object afaict, here is the return:
https://github.com/OneSparse/OneSparse/blob/main/src/matrix.c#L1726
where:
#define OS_RETURN_MATRIX(_matrix) return EOHPGetRWDatum(&(_matrix)->hdr)
> > I would expect that to return 1. If I do "graph = set_element(graph, 1,
> 1,
> > 1)" it works.
>
> I think you have a faulty understanding of PERFORM. It's defined as
> "evaluate this expression and throw away the result", so it's *not*
> going to change "graph", not even if set_element declares that
> argument as INOUT.
Faulty indeed, I was going from the plpgsql statement documentation:
"Sometimes it is useful to evaluate an expression or SELECT query but
discard the result, for example when calling a function that has
side-effects but no useful result value."
My understanding of "side-effects" was flawed there, but I'm fine with "x =
set_element(x...)" anyway as I was trying to follow the example of
array_append et al.
> (Our interpretation of OUT arguments for functions
> is that they're just an alternate notation for specifying the function
> result.) If you want to avoid the explicit assignment back to "graph"
> then the thing to do would be to declare set_element as a procedure,
> not a function, with an INOUT argument and then call it with CALL.
>
I'll stick with the assignment.
That's only cosmetically different though, in that the updated
> "graph" value is still passed back much as if it were a function
> result, and then the CALL infrastructure knows it has to assign that
> back to the argument variable. And, as I tried to explain earlier,
> that code path currently has no mechanism for avoiding making a copy
> of the graph somewhere along the line: it will pass "graph" to the
> procedure as either a flat Datum or a R/O expanded object, so that
> set_element will be required to copy that before modifying it.
>
Right, I'm still figuring out exactly what that code flow is. This is my
first dive into these corners of the code so thank you for being patient
with me. I promise to write up some expanded object documentation if this
works!
> We can imagine extending whatever we do for "x := f(x)" cases so that
> it also works during "CALL p(x)". But I think that's only going to
> yield cosmetic or notational improvements so I don't want to start
> with doing that. Let's focus first on improving the existing
> infrastructure for the f(x) case.
>
+1
-Michel
view thread (6+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: Using Expanded Objects other than Arrays from plpgsql
In-Reply-To: <CACxu=vJMqLUZR1N-AT4cXw9JHMsN=5-iYvPvKWy-CcKwz_9aBg@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox