I have some updates on this thread based on some graph algorithms I've ported from the Python/C graphblas libraries.
All of the plpgsql expanded object optimizations so far are working well, I can minimize object expansion in most cases, there are a couple I haven't been able to work around but I'm still getting excellent benchmarking numbers on some large test graphs:
LiveJournal Orkut
Nodes 3,997,962 3,072,441
Edges 34,681,185 117,185,037
Triangles 177,820,130 627,583,972
Seconds Edges/S Seconds Edges/S
Tri Count LL 2.80s 12,386,138 32.03s 3,658,602
Tri Count LU 1.91s 18,157,688 16.38s 7,156,338
Tri Centrality 1.55s 22,374,958 12.22s 9,589,610
Page Rank 8.10s 4,281,628 23.14s 5,064,176
That's on a 2020 era 4 core economy laptop and is in line with what the C/Python/Julia bindings get on similar hardware.
There are a few cases where I have to force an expansion, I work around this by calling a `wait()` function, which expands the datum, calls GrB_wait() on it (a nop in this case) and returns a r/w pointer. You can see this in the following Triangle Counting function which is a matrix multiplication of a graph to itself, using itself as a mask. This matrix reduces to the triangle count (times six):
create or replace function tcount_b(graph matrix) returns bigint language plpgsql as
$$
begin
graph = wait(graph);
graph = mxm(graph, graph, 'plus_pair_int32', mask=>graph, descr=>'s');
return reduce_scalar(graph) / 6;
end;
$$;
DEBUG: new_matrix
DEBUG: flatten_matrix
DEBUG: matrix_wait
DEBUG: expand_matrix -- expansion happens here in wait()
DEBUG: new_matrix
DEBUG: matrix_mxm -- mxm does not re-expand the object, good!
DEBUG: expand_semiring
DEBUG: new_semiring
DEBUG: new_matrix
DEBUG: expand_descriptor
DEBUG: new_descriptor
DEBUG: matrix_reduce_scalar -- neither does reduce, good!
DEBUG: new_scalar
DEBUG: scalar_div_int32
DEBUG: new_scalar
DEBUG: cast_scalar_int64
If I take out the call to wait(), then mxm calls expand_matrix 3 times as it did before your optimizations.
I agree it makes sense to have more use cases before making deeper changes. I only work with expanded forms, but need to call wait() to pre-expand the object to avoid multiple expansions in functions that can take the same object in multiple parameters. This is a pretty common pattern in GraphBLAS (and linear algebra in general) where (many) matrices are commutable to themselves in several ways like multiplication, element-wise operations, and element masking.
I'm not sure if eliminating wait() is a good enough use case, it would definitely be nice to get rid of but I can document it pretty thoroughly and it's relatively easy to catch.