On Wed, Dec 18, 2024 at 12:22 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michel Pelletier <pelletier.michel@gmail.com> writes:
> My bad, sorry for the long confusing email, I figured out that I was
> calling the wrong macro when getting my matrix datum and inadvertently
> expanding RO pointers as well, I've fixed that issue, and everything is
> working great! No extra expansions and my support functions are working
> well, I need to go through a few more places in the API to add more support
> but otherwise the fixes Tom has put into plpgsql have worked perfectly and
> the library now appears to be behaving optimally! I can get down to doing
> some benchmarks and head-to-head with the C and Python bindings to compare
> against.

So, just to clarify where we're at: you are satisfied that the current
patch-set does what you need?

I have some updates on this thread based on some graph algorithms I've ported from the Python/C graphblas libraries.

All of the plpgsql expanded object optimizations so far are working well, I can minimize object expansion in most cases, there are a couple I haven't been able to work around but I'm still getting excellent benchmarking numbers on some large test graphs:

LiveJournal Orkut
Nodes 3,997,962 3,072,441
Edges 34,681,185 117,185,037
Triangles 177,820,130 627,583,972

Seconds Edges/S Seconds Edges/S
Tri Count LL 2.80s 12,386,138 32.03s 3,658,602
Tri Count LU 1.91s 18,157,688 16.38s 7,156,338
Tri Centrality 1.55s 22,374,958 12.22s 9,589,610
Page Rank 8.10s 4,281,628 23.14s 5,064,176

That's on a 2020 era 4 core economy laptop and is in line with what the C/Python/Julia bindings get on similar hardware.

There are a few cases where I have to force an expansion, I work around this by calling a `wait()` function, which expands the datum, calls GrB_wait() on it (a nop in this case) and returns a r/w pointer. You can see this in the following Triangle Counting function which is a matrix multiplication of a graph to itself, using itself as a mask. This matrix reduces to the triangle count (times six):

create or replace function tcount_b(graph matrix) returns bigint language plpgsql as
$$
begin
graph = wait(graph);
graph = mxm(graph, graph, 'plus_pair_int32', mask=>graph, descr=>'s');
return reduce_scalar(graph) / 6;
end;
$$;

DEBUG: new_matrix
DEBUG: flatten_matrix
DEBUG: matrix_wait
DEBUG: expand_matrix -- expansion happens here in wait()
DEBUG: new_matrix
DEBUG: matrix_mxm -- mxm does not re-expand the object, good!
DEBUG: expand_semiring
DEBUG: new_semiring
DEBUG: new_matrix
DEBUG: expand_descriptor
DEBUG: new_descriptor
DEBUG: matrix_reduce_scalar -- neither does reduce, good!
DEBUG: new_scalar
DEBUG: scalar_div_int32
DEBUG: new_scalar
DEBUG: cast_scalar_int64

If I take out the call to wait(), then mxm calls expand_matrix 3 times as it did before your optimizations.

The other task we'd talked about was generalizing the existing
heuristics in exec_assign_value() and plpgsql_exec_function() that
say that array-type values should be forced into expanded R/W form
when being assigned to an array-type PL/pgSQL variable. The argument
for that is that the PL/pgSQL function might subsequently do a lot of
subscripted accesses to the array (which'd benefit from working with
an expanded array) while never doing another assignment and thus not
having any opportunity to revisit the decision. The counter-argument
is that it might *not* do such accesses, so that the expansion was
just a waste of cycles. So this is squishy enough that I'd prefer to
have some solid use-cases to look at before trying to generalize it.

It's sounding to me like you're going to end up in a place where all
your values are passed around in expanded form already and so you have
little need for that optimization.

If so, I'd prefer not to go any
further than the present patch-set for now. Adding "type support"
hooks as discussed would be a substantial amount of work, so I'd
like to have a more compelling case for it before doing that.

I agree it makes sense to have more use cases before making deeper changes. I only work with expanded forms, but need to call wait() to pre-expand the object to avoid multiple expansions in functions that can take the same object in multiple parameters. This is a pretty common pattern in GraphBLAS (and linear algebra in general) where (many) matrices are commutable to themselves in several ways like multiplication, element-wise operations, and element masking.

I'm not sure if eliminating wait() is a good enough use case, it would definitely be nice to get rid of but I can document it pretty thoroughly and it's relatively easy to catch.

-Michel