public inbox for [email protected]
help / color / mirror / Atom feedRe: Using Expanded Objects other than Arrays from plpgsql
6+ messages / 2 participants
[nested] [flat]
* Re: Using Expanded Objects other than Arrays from plpgsql
@ 2024-10-23 15:21 Tom Lane <[email protected]>
2024-10-24 01:39 ` Re: Using Expanded Objects other than Arrays from plpgsql Michel Pelletier <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Tom Lane @ 2024-10-23 15:21 UTC (permalink / raw)
To: Michel Pelletier <[email protected]>; +Cc: [email protected]
Michel Pelletier <[email protected]> writes:
> Here's another example:
> CREATE OR REPLACE FUNCTION test2(graph matrix)
> RETURNS bigint LANGUAGE plpgsql AS
> $$
> BEGIN
> perform set_element(graph, 1, 1, 1);
> RETURN nvals(graph);
> end;
> $$;
> CREATE FUNCTION
> postgres=# select test2(matrix('int32'));
> DEBUG: new_matrix
> DEBUG: matrix_get_flat_size
> DEBUG: flatten_matrix
> DEBUG: scalar_int32
> DEBUG: new_scalar
> DEBUG: matrix_set_element
> DEBUG: DatumGetMatrix
> DEBUG: expand_matrix
> DEBUG: new_matrix
> DEBUG: DatumGetScalar
> DEBUG: matrix_get_flat_size
> DEBUG: matrix_get_flat_size
> DEBUG: flatten_matrix
> DEBUG: context_callback_matrix_free
> DEBUG: context_callback_scalar_free
> DEBUG: matrix_nvals
> DEBUG: DatumGetMatrix
> DEBUG: expand_matrix
> DEBUG: new_matrix
> DEBUG: context_callback_matrix_free
> DEBUG: context_callback_matrix_free
> test2
> -------
> 0
> (1 row)
I'm a little confused by your debug output. What are "scalar_int32"
and "new_scalar", and what part of the plpgsql function is causing
them to be invoked?
Another thing that confuses me is why there's a second flatten_matrix
operation happening here. Shouldn't set_element return its result
as a R/W expanded object?
> I would expect that to return 1. If I do "graph = set_element(graph, 1, 1,
> 1)" it works.
I think you have a faulty understanding of PERFORM. It's defined as
"evaluate this expression and throw away the result", so it's *not*
going to change "graph", not even if set_element declares that
argument as INOUT. (Our interpretation of OUT arguments for functions
is that they're just an alternate notation for specifying the function
result.) If you want to avoid the explicit assignment back to "graph"
then the thing to do would be to declare set_element as a procedure,
not a function, with an INOUT argument and then call it with CALL.
That's only cosmetically different though, in that the updated
"graph" value is still passed back much as if it were a function
result, and then the CALL infrastructure knows it has to assign that
back to the argument variable. And, as I tried to explain earlier,
that code path currently has no mechanism for avoiding making a copy
of the graph somewhere along the line: it will pass "graph" to the
procedure as either a flat Datum or a R/O expanded object, so that
set_element will be required to copy that before modifying it.
We can imagine extending whatever we do for "x := f(x)" cases so that
it also works during "CALL p(x)". But I think that's only going to
yield cosmetic or notational improvements so I don't want to start
with doing that. Let's focus first on improving the existing
infrastructure for the f(x) case.
regards, tom lane
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Using Expanded Objects other than Arrays from plpgsql
2024-10-23 15:21 Re: Using Expanded Objects other than Arrays from plpgsql Tom Lane <[email protected]>
@ 2024-10-24 01:39 ` Michel Pelletier <[email protected]>
2024-10-24 02:10 ` Re: Using Expanded Objects other than Arrays from plpgsql Tom Lane <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Michel Pelletier @ 2024-10-24 01:39 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: [email protected]
On Wed, Oct 23, 2024 at 8:21 AM Tom Lane <[email protected]> wrote:
> Michel Pelletier <[email protected]> writes:
> > Here's another example:
>
> > CREATE OR REPLACE FUNCTION test2(graph matrix)
> > RETURNS bigint LANGUAGE plpgsql AS
> > $$
> > BEGIN
> > perform set_element(graph, 1, 1, 1);
> > RETURN nvals(graph);
> > end;
> > $$;
> > CREATE FUNCTION
> > postgres=# select test2(matrix('int32'));
> > DEBUG: new_matrix
> > DEBUG: matrix_get_flat_size
> > DEBUG: flatten_matrix
> > DEBUG: scalar_int32
> > DEBUG: new_scalar
> > DEBUG: matrix_set_element
> > DEBUG: DatumGetMatrix
> > DEBUG: expand_matrix
> > DEBUG: new_matrix
> > DEBUG: DatumGetScalar
> > DEBUG: matrix_get_flat_size
> > DEBUG: matrix_get_flat_size
> > DEBUG: flatten_matrix
> > DEBUG: context_callback_matrix_free
> > DEBUG: context_callback_scalar_free
> > DEBUG: matrix_nvals
> > DEBUG: DatumGetMatrix
> > DEBUG: expand_matrix
> > DEBUG: new_matrix
> > DEBUG: context_callback_matrix_free
> > DEBUG: context_callback_matrix_free
> > test2
> > -------
> > 0
> > (1 row)
>
> I'm a little confused by your debug output. What are "scalar_int32"
> and "new_scalar", and what part of the plpgsql function is causing
> them to be invoked?
>
GraphBLAS scalars hold a single element value for the matrix type.
Internally, they are simply 1x1 matrices (much like vectors are 1xn
matrices). The function signature is:
set_element(a matrix, i bigint, j bigint, s scalar)
There is a "CAST (integer as scalar)" function (scalar_int32) that casts
Postgres integers to GraphBLAS GrB_INT32 scalar elements (which calls
new_scalar because like vectors and matrices, they are expanded objects
which have a GrB_Scalar "handle"). Scalars are useful for working with
individual values, for example reduce() returns a scalar. There are way
more efficient ways to push huge C arrays of values into matrices but for
now I'm just working at the element level.
Another thing that confuses me is why there's a second flatten_matrix
> operation happening here. Shouldn't set_element return its result
> as a R/W expanded object?
>
That confuses me too, and my default assumption is always that I'm doing it
wrong. set_element does return a R/W object afaict, here is the return:
https://github.com/OneSparse/OneSparse/blob/main/src/matrix.c#L1726
where:
#define OS_RETURN_MATRIX(_matrix) return EOHPGetRWDatum(&(_matrix)->hdr)
> > I would expect that to return 1. If I do "graph = set_element(graph, 1,
> 1,
> > 1)" it works.
>
> I think you have a faulty understanding of PERFORM. It's defined as
> "evaluate this expression and throw away the result", so it's *not*
> going to change "graph", not even if set_element declares that
> argument as INOUT.
Faulty indeed, I was going from the plpgsql statement documentation:
"Sometimes it is useful to evaluate an expression or SELECT query but
discard the result, for example when calling a function that has
side-effects but no useful result value."
My understanding of "side-effects" was flawed there, but I'm fine with "x =
set_element(x...)" anyway as I was trying to follow the example of
array_append et al.
> (Our interpretation of OUT arguments for functions
> is that they're just an alternate notation for specifying the function
> result.) If you want to avoid the explicit assignment back to "graph"
> then the thing to do would be to declare set_element as a procedure,
> not a function, with an INOUT argument and then call it with CALL.
>
I'll stick with the assignment.
That's only cosmetically different though, in that the updated
> "graph" value is still passed back much as if it were a function
> result, and then the CALL infrastructure knows it has to assign that
> back to the argument variable. And, as I tried to explain earlier,
> that code path currently has no mechanism for avoiding making a copy
> of the graph somewhere along the line: it will pass "graph" to the
> procedure as either a flat Datum or a R/O expanded object, so that
> set_element will be required to copy that before modifying it.
>
Right, I'm still figuring out exactly what that code flow is. This is my
first dive into these corners of the code so thank you for being patient
with me. I promise to write up some expanded object documentation if this
works!
> We can imagine extending whatever we do for "x := f(x)" cases so that
> it also works during "CALL p(x)". But I think that's only going to
> yield cosmetic or notational improvements so I don't want to start
> with doing that. Let's focus first on improving the existing
> infrastructure for the f(x) case.
>
+1
-Michel
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Using Expanded Objects other than Arrays from plpgsql
2024-10-23 15:21 Re: Using Expanded Objects other than Arrays from plpgsql Tom Lane <[email protected]>
2024-10-24 01:39 ` Re: Using Expanded Objects other than Arrays from plpgsql Michel Pelletier <[email protected]>
@ 2024-10-24 02:10 ` Tom Lane <[email protected]>
2024-10-31 23:41 ` Re: Using Expanded Objects other than Arrays from plpgsql Michel Pelletier <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Tom Lane @ 2024-10-24 02:10 UTC (permalink / raw)
To: Michel Pelletier <[email protected]>; +Cc: [email protected]
Michel Pelletier <[email protected]> writes:
> On Wed, Oct 23, 2024 at 8:21 AM Tom Lane <[email protected]> wrote:
>> Another thing that confuses me is why there's a second flatten_matrix
>> operation happening here. Shouldn't set_element return its result
>> as a R/W expanded object?
> That confuses me too, and my default assumption is always that I'm doing it
> wrong. set_element does return a R/W object afaict, here is the return:
> https://github.com/OneSparse/OneSparse/blob/main/src/matrix.c#L1726
Hmph. That seems right. Can you add errbacktrace() to your logging
ereports, in hopes of seeing how we're getting to flatten_matrix?
Or break there with gdb for a more complete/reliable stack trace.
regards, tom lane
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Using Expanded Objects other than Arrays from plpgsql
2024-10-23 15:21 Re: Using Expanded Objects other than Arrays from plpgsql Tom Lane <[email protected]>
2024-10-24 01:39 ` Re: Using Expanded Objects other than Arrays from plpgsql Michel Pelletier <[email protected]>
2024-10-24 02:10 ` Re: Using Expanded Objects other than Arrays from plpgsql Tom Lane <[email protected]>
@ 2024-10-31 23:41 ` Michel Pelletier <[email protected]>
2024-11-01 22:20 ` Re: Using Expanded Objects other than Arrays from plpgsql Tom Lane <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Michel Pelletier @ 2024-10-31 23:41 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: [email protected]
Here's two backtraces from gdb from this function:
CREATE OR REPLACE FUNCTION test2(graph matrix)
RETURNS bigint LANGUAGE plpgsql AS
$$
BEGIN
perform set_element(graph, 1, 1, 1);
RETURN nvals(graph);
end;
$$;
https://gist.githubusercontent.com/michelp/d02e3e300710443454357222077f9ded/raw/86d9c2c3de3f9b474006...
Both traces are in that file split on the hyphens line 44. I'm still
puzzling it out, could it have something to do with the volatility of the
functions? I'm not entirely clear on how to interpret function volatility
with expanded objects, nvals is STRICT, but set_element is STABLE. I think
my logic there was because it's a mutation. This is likely another
misunderstanding of mine.
-Michel
On Wed, Oct 23, 2024 at 7:10 PM Tom Lane <[email protected]> wrote:
> Michel Pelletier <[email protected]> writes:
> > On Wed, Oct 23, 2024 at 8:21 AM Tom Lane <[email protected]> wrote:
> >> Another thing that confuses me is why there's a second flatten_matrix
> >> operation happening here. Shouldn't set_element return its result
> >> as a R/W expanded object?
>
> > That confuses me too, and my default assumption is always that I'm doing
> it
> > wrong. set_element does return a R/W object afaict, here is the return:
> > https://github.com/OneSparse/OneSparse/blob/main/src/matrix.c#L1726
>
> Hmph. That seems right. Can you add errbacktrace() to your logging
> ereports, in hopes of seeing how we're getting to flatten_matrix?
> Or break there with gdb for a more complete/reliable stack trace.
>
> regards, tom lane
>
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Using Expanded Objects other than Arrays from plpgsql
2024-10-23 15:21 Re: Using Expanded Objects other than Arrays from plpgsql Tom Lane <[email protected]>
2024-10-24 01:39 ` Re: Using Expanded Objects other than Arrays from plpgsql Michel Pelletier <[email protected]>
2024-10-24 02:10 ` Re: Using Expanded Objects other than Arrays from plpgsql Tom Lane <[email protected]>
2024-10-31 23:41 ` Re: Using Expanded Objects other than Arrays from plpgsql Michel Pelletier <[email protected]>
@ 2024-11-01 22:20 ` Tom Lane <[email protected]>
2024-11-02 00:50 ` Re: Using Expanded Objects other than Arrays from plpgsql Michel Pelletier <[email protected]>
0 siblings, 1 reply; 6+ messages in thread
From: Tom Lane @ 2024-11-01 22:20 UTC (permalink / raw)
To: Michel Pelletier <[email protected]>; +Cc: [email protected]
Michel Pelletier <[email protected]> writes:
> Here's two backtraces from gdb from this function:
> CREATE OR REPLACE FUNCTION test2(graph matrix)
> RETURNS bigint LANGUAGE plpgsql AS
> $$
> BEGIN
> perform set_element(graph, 1, 1, 1);
> RETURN nvals(graph);
> end;
> $$;
Well, you shouldn't be using PERFORM. Not only does it not do the
right thing, but it's not optimized for expanded objects at all:
they'll get flattened both on the way into the statement and on
the way out. Try it with
graph := set_element(graph, 1, 1, 1);
RETURN nvals(graph);
regards, tom lane
^ permalink raw reply [nested|flat] 6+ messages in thread
* Re: Using Expanded Objects other than Arrays from plpgsql
2024-10-23 15:21 Re: Using Expanded Objects other than Arrays from plpgsql Tom Lane <[email protected]>
2024-10-24 01:39 ` Re: Using Expanded Objects other than Arrays from plpgsql Michel Pelletier <[email protected]>
2024-10-24 02:10 ` Re: Using Expanded Objects other than Arrays from plpgsql Tom Lane <[email protected]>
2024-10-31 23:41 ` Re: Using Expanded Objects other than Arrays from plpgsql Michel Pelletier <[email protected]>
2024-11-01 22:20 ` Re: Using Expanded Objects other than Arrays from plpgsql Tom Lane <[email protected]>
@ 2024-11-02 00:50 ` Michel Pelletier <[email protected]>
0 siblings, 0 replies; 6+ messages in thread
From: Michel Pelletier @ 2024-11-02 00:50 UTC (permalink / raw)
To: Tom Lane <[email protected]>; +Cc: [email protected]
On Fri, Nov 1, 2024 at 3:20 PM Tom Lane <[email protected]> wrote:
> Michel Pelletier <[email protected]> writes:
>
> Well, you shouldn't be using PERFORM. Not only does it not do the
> right thing, but it's not optimized for expanded objects at all:
> they'll get flattened both on the way into the statement and on
> the way out. Try it with
>
> graph := set_element(graph, 1, 1, 1);
> RETURN nvals(graph);
>
Ah my bad, you mentioned that already and I missed it, here's the two
backtraces with the assignment:
https://gist.githubusercontent.com/michelp/20b917686149d482be2359569845f232/raw/ca8349ae4b0469674b4b...
>
> regards, tom lane
>
^ permalink raw reply [nested|flat] 6+ messages in thread
end of thread, other threads:[~2024-11-02 00:50 UTC | newest]
Thread overview: 6+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-10-23 15:21 Re: Using Expanded Objects other than Arrays from plpgsql Tom Lane <[email protected]>
2024-10-24 01:39 ` Michel Pelletier <[email protected]>
2024-10-24 02:10 ` Tom Lane <[email protected]>
2024-10-31 23:41 ` Michel Pelletier <[email protected]>
2024-11-01 22:20 ` Tom Lane <[email protected]>
2024-11-02 00:50 ` Michel Pelletier <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox