Richard:
Thanks for reviving this patch and for all of your work on it! Eager aggregation pushdown will be beneficial for my work and I'm hoping to see it land.
I was playing around with v9 of the patches and was specifically curious about this previous statement...
>This patch also makes eager aggregation work with outer joins. With
>outer join, the aggregate cannot be pushed down if any column referenced
>by grouping expressions or aggregate functions is nullable by an outer
>join above the relation to which we want to apply the partiall
>aggregation. Thanks to Tom's outer-join-aware-Var infrastructure, we
>can easily identify such situations and subsequently refrain from
>pushing down the aggregates.
...and this related comment in eager_aggregate.out:
>-- Ensure aggregation cannot be pushed down to the nullable side
While I'm new to this work and its subtleties, I'm wondering if this is too broad a condition.
I modified the first test query in eager_aggregate.sql to make it a LEFT JOIN and eager aggregation indeed did not happen, which is expected based on the comments upthread.
query:
SET enable_eager_aggregate=ON;
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.a, sum(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
plan:
QUERY PLAN
------------------------------------------------------------
GroupAggregate
Output: t1.a, sum(t2.c)
Group Key: t1.a
-> Sort
Output: t1.a, t2.c
Sort Key: t1.a
-> Hash Right Join
Output: t1.a, t2.c
Hash Cond: (t2.b = t1.b)
-> Seq Scan on public.eager_agg_t2 t2
Output: t2.a, t2.b, t2.c
-> Hash
Output: t1.a, t1.b
-> Seq Scan on public.eager_agg_t1 t1
Output: t1.a, t1.b
(15 rows)
(NOTE: I changed the aggregate from avg(...) to sum(...) for simplicity)
But, it seems that eager aggregation for the query above can be "replicated" as:
query:
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.a, sum(t2.c)
FROM eager_agg_t1 t1
LEFT JOIN (
SELECT b, sum(c) c
FROM eager_agg_t2 t2p
GROUP BY b
) t2 ON t1.b = t2.b
GROUP BY t1.a
ORDER BY t1.a;
The output of both the original query and this one match (and the plans with eager aggregation and the subquery are nearly identical if you restore the LEFT JOIN to a JOIN). I admittedly may be missing a subtlety, but does this mean that there are conditions under which eager aggregation can be pushed down to the nullable side?
-Paul-