public inbox for [email protected]
help / color / mirror / Atom feedRe: Eager aggregation, take 3
2+ messages / 1 participants
[nested] [flat]
* Re: Eager aggregation, take 3
@ 2024-04-30 04:06 Richard Guo <[email protected]>
2024-05-20 08:12 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
0 siblings, 1 reply; 2+ messages in thread
From: Richard Guo @ 2024-04-30 04:06 UTC (permalink / raw)
To: Andy Fan <[email protected]>; +Cc: pgsql-hackers; [email protected]
Here is an update of the patchset with the following changes:
* Fix a 'Aggref found where not expected' error caused by the PVC call
in is_var_in_aggref_only. This would happen if we have Aggrefs
contained in other expressions.
* Use joinrel's relids rather than the union of the relids of its outer
and inner to search for its grouped rel. This is more correct as we
need to include OJs into consideration.
* Remove RelAggInfo.agg_exprs as it is not used anymore.
Thanks
Richard
Attachments:
[application/octet-stream] v6-0001-Introduce-RelInfoList-structure.patch (14.3K, 3-v6-0001-Introduce-RelInfoList-structure.patch)
download | inline diff:
From 9398f129e74c9c7e9dea8b85a2166f5dfa589bc2 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Mon, 19 Feb 2024 15:16:51 +0800
Subject: [PATCH v6 1/9] Introduce RelInfoList structure
This commit introduces the RelInfoList structure, which encapsulates
both a list and a hash table, so that we can leverage the hash table for
faster lookups not only for join relations but also for upper relations.
---
contrib/postgres_fdw/postgres_fdw.c | 3 +-
src/backend/optimizer/geqo/geqo_eval.c | 20 +--
src/backend/optimizer/path/allpaths.c | 7 +-
src/backend/optimizer/plan/planmain.c | 5 +-
src/backend/optimizer/util/relnode.c | 164 ++++++++++++++-----------
src/include/nodes/pathnodes.h | 31 +++--
6 files changed, 133 insertions(+), 97 deletions(-)
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 4053cd641c..bfced61422 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -6069,7 +6069,8 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
*/
Assert(fpinfo->relation_index == 0); /* shouldn't be set yet */
fpinfo->relation_index =
- list_length(root->parse->rtable) + list_length(root->join_rel_list);
+ list_length(root->parse->rtable) +
+ list_length(root->join_rel_list->items);
return true;
}
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index d2f7f4e5f3..1141156899 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -85,18 +85,18 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
* truncating the list to its original length. NOTE this assumes that any
* added entries are appended at the end!
*
- * We also must take care not to mess up the outer join_rel_hash, if there
- * is one. We can do this by just temporarily setting the link to NULL.
- * (If we are dealing with enough join rels, which we very likely are, a
- * new hash table will get built and used locally.)
+ * We also must take care not to mess up the outer join_rel_list->hash, if
+ * there is one. We can do this by just temporarily setting the link to
+ * NULL. (If we are dealing with enough join rels, which we very likely
+ * are, a new hash table will get built and used locally.)
*
* join_rel_level[] shouldn't be in use, so just Assert it isn't.
*/
- savelength = list_length(root->join_rel_list);
- savehash = root->join_rel_hash;
+ savelength = list_length(root->join_rel_list->items);
+ savehash = root->join_rel_list->hash;
Assert(root->join_rel_level == NULL);
- root->join_rel_hash = NULL;
+ root->join_rel_list->hash = NULL;
/* construct the best path for the given combination of relations */
joinrel = gimme_tree(root, tour, num_gene);
@@ -121,9 +121,9 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
* Restore join_rel_list to its former state, and put back original
* hashtable if any.
*/
- root->join_rel_list = list_truncate(root->join_rel_list,
- savelength);
- root->join_rel_hash = savehash;
+ root->join_rel_list->items = list_truncate(root->join_rel_list->items,
+ savelength);
+ root->join_rel_list->hash = savehash;
/* release all the memory acquired within gimme_tree */
MemoryContextSwitchTo(oldcxt);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index cc51ae1757..ffc6edd6c7 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -3415,9 +3415,10 @@ make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
* needed for these paths need have been instantiated.
*
* Note to plugin authors: the functions invoked during standard_join_search()
- * modify root->join_rel_list and root->join_rel_hash. If you want to do more
- * than one join-order search, you'll probably need to save and restore the
- * original states of those data structures. See geqo_eval() for an example.
+ * modify root->join_rel_list->items and root->join_rel_list->hash. If you
+ * want to do more than one join-order search, you'll probably need to save and
+ * restore the original states of those data structures. See geqo_eval() for
+ * an example.
*/
RelOptInfo *
standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 075d36c7ec..eb78e37317 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -64,8 +64,9 @@ query_planner(PlannerInfo *root,
* NOTE: append_rel_list was set up by subquery_planner, so do not touch
* here.
*/
- root->join_rel_list = NIL;
- root->join_rel_hash = NULL;
+ root->join_rel_list = makeNode(RelInfoList);
+ root->join_rel_list->items = NIL;
+ root->join_rel_list->hash = NULL;
root->join_rel_level = NULL;
root->join_cur_level = 0;
root->canon_pathkeys = NIL;
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index e05b21c884..8279ab0e11 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -35,11 +35,15 @@
#include "utils/lsyscache.h"
-typedef struct JoinHashEntry
+/*
+ * An entry of a hash table that we use to make lookup for RelOptInfo
+ * structures more efficient.
+ */
+typedef struct RelInfoEntry
{
- Relids join_relids; /* hash key --- MUST BE FIRST */
- RelOptInfo *join_rel;
-} JoinHashEntry;
+ Relids relids; /* hash key --- MUST BE FIRST */
+ RelOptInfo *rel;
+} RelInfoEntry;
static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *input_rel,
@@ -479,11 +483,11 @@ find_base_rel_ignore_join(PlannerInfo *root, int relid)
}
/*
- * build_join_rel_hash
- * Construct the auxiliary hash table for join relations.
+ * build_rel_hash
+ * Construct the auxiliary hash table for relations.
*/
static void
-build_join_rel_hash(PlannerInfo *root)
+build_rel_hash(RelInfoList *list)
{
HTAB *hashtab;
HASHCTL hash_ctl;
@@ -491,47 +495,49 @@ build_join_rel_hash(PlannerInfo *root)
/* Create the hash table */
hash_ctl.keysize = sizeof(Relids);
- hash_ctl.entrysize = sizeof(JoinHashEntry);
+ hash_ctl.entrysize = sizeof(RelInfoEntry);
hash_ctl.hash = bitmap_hash;
hash_ctl.match = bitmap_match;
hash_ctl.hcxt = CurrentMemoryContext;
- hashtab = hash_create("JoinRelHashTable",
+ hashtab = hash_create("RelHashTable",
256L,
&hash_ctl,
HASH_ELEM | HASH_FUNCTION | HASH_COMPARE | HASH_CONTEXT);
- /* Insert all the already-existing joinrels */
- foreach(l, root->join_rel_list)
+ /* Insert all the already-existing relations */
+ foreach(l, list->items)
{
RelOptInfo *rel = (RelOptInfo *) lfirst(l);
- JoinHashEntry *hentry;
+ RelInfoEntry *hentry;
bool found;
- hentry = (JoinHashEntry *) hash_search(hashtab,
- &(rel->relids),
- HASH_ENTER,
- &found);
+ hentry = (RelInfoEntry *) hash_search(hashtab,
+ &(rel->relids),
+ HASH_ENTER,
+ &found);
Assert(!found);
- hentry->join_rel = rel;
+ hentry->rel = rel;
}
- root->join_rel_hash = hashtab;
+ list->hash = hashtab;
}
/*
- * find_join_rel
- * Returns relation entry corresponding to 'relids' (a set of RT indexes),
- * or NULL if none exists. This is for join relations.
+ * find_rel_info
+ * Find an RelOptInfo entry.
*/
-RelOptInfo *
-find_join_rel(PlannerInfo *root, Relids relids)
+static RelOptInfo *
+find_rel_info(RelInfoList *list, Relids relids)
{
+ if (list == NULL)
+ return NULL;
+
/*
* Switch to using hash lookup when list grows "too long". The threshold
* is arbitrary and is known only here.
*/
- if (!root->join_rel_hash && list_length(root->join_rel_list) > 32)
- build_join_rel_hash(root);
+ if (!list->hash && list_length(list->items) > 32)
+ build_rel_hash(list);
/*
* Use either hashtable lookup or linear search, as appropriate.
@@ -541,23 +547,23 @@ find_join_rel(PlannerInfo *root, Relids relids)
* so would force relids out of a register and thus probably slow down the
* list-search case.
*/
- if (root->join_rel_hash)
+ if (list->hash)
{
Relids hashkey = relids;
- JoinHashEntry *hentry;
+ RelInfoEntry *hentry;
- hentry = (JoinHashEntry *) hash_search(root->join_rel_hash,
- &hashkey,
- HASH_FIND,
- NULL);
+ hentry = (RelInfoEntry *) hash_search(list->hash,
+ &hashkey,
+ HASH_FIND,
+ NULL);
if (hentry)
- return hentry->join_rel;
+ return hentry->rel;
}
else
{
ListCell *l;
- foreach(l, root->join_rel_list)
+ foreach(l, list->items)
{
RelOptInfo *rel = (RelOptInfo *) lfirst(l);
@@ -569,6 +575,54 @@ find_join_rel(PlannerInfo *root, Relids relids)
return NULL;
}
+/*
+ * find_join_rel
+ * Returns relation entry corresponding to 'relids' (a set of RT indexes),
+ * or NULL if none exists. This is for join relations.
+ */
+RelOptInfo *
+find_join_rel(PlannerInfo *root, Relids relids)
+{
+ return find_rel_info(root->join_rel_list, relids);
+}
+
+/*
+ * add_rel_info
+ * Add given relation to the given list. Also add it to the auxiliary
+ * hashtable if there is one.
+ */
+static void
+add_rel_info(RelInfoList *list, RelOptInfo *rel)
+{
+ /* GEQO requires us to append the new relation to the end of the list! */
+ list->items = lappend(list->items, rel);
+
+ /* store it into the auxiliary hashtable if there is one. */
+ if (list->hash)
+ {
+ RelInfoEntry *hentry;
+ bool found;
+
+ hentry = (RelInfoEntry *) hash_search(list->hash,
+ &(rel->relids),
+ HASH_ENTER,
+ &found);
+ Assert(!found);
+ hentry->rel = rel;
+ }
+}
+
+/*
+ * add_join_rel
+ * Add given join relation to the list of join relations in the given
+ * PlannerInfo.
+ */
+static void
+add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
+{
+ add_rel_info(root->join_rel_list, joinrel);
+}
+
/*
* set_foreign_rel_properties
* Set up foreign-join fields if outer and inner relation are foreign
@@ -618,32 +672,6 @@ set_foreign_rel_properties(RelOptInfo *joinrel, RelOptInfo *outer_rel,
}
}
-/*
- * add_join_rel
- * Add given join relation to the list of join relations in the given
- * PlannerInfo. Also add it to the auxiliary hashtable if there is one.
- */
-static void
-add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
-{
- /* GEQO requires us to append the new joinrel to the end of the list! */
- root->join_rel_list = lappend(root->join_rel_list, joinrel);
-
- /* store it into the auxiliary hashtable if there is one. */
- if (root->join_rel_hash)
- {
- JoinHashEntry *hentry;
- bool found;
-
- hentry = (JoinHashEntry *) hash_search(root->join_rel_hash,
- &(joinrel->relids),
- HASH_ENTER,
- &found);
- Assert(!found);
- hentry->join_rel = joinrel;
- }
-}
-
/*
* build_join_rel
* Returns relation entry corresponding to the union of two given rels,
@@ -1469,22 +1497,14 @@ subbuild_joinrel_joinlist(RelOptInfo *joinrel,
RelOptInfo *
fetch_upper_rel(PlannerInfo *root, UpperRelationKind kind, Relids relids)
{
+ RelInfoList *list = &root->upper_rels[kind];
RelOptInfo *upperrel;
- ListCell *lc;
-
- /*
- * For the moment, our indexing data structure is just a List for each
- * relation kind. If we ever get so many of one kind that this stops
- * working well, we can improve it. No code outside this function should
- * assume anything about how to find a particular upperrel.
- */
/* If we already made this upperrel for the query, return it */
- foreach(lc, root->upper_rels[kind])
+ if (list)
{
- upperrel = (RelOptInfo *) lfirst(lc);
-
- if (bms_equal(upperrel->relids, relids))
+ upperrel = find_rel_info(list, relids);
+ if (upperrel)
return upperrel;
}
@@ -1503,7 +1523,7 @@ fetch_upper_rel(PlannerInfo *root, UpperRelationKind kind, Relids relids)
upperrel->cheapest_unique_path = NULL;
upperrel->cheapest_parameterized_paths = NIL;
- root->upper_rels[kind] = lappend(root->upper_rels[kind], upperrel);
+ add_rel_info(&root->upper_rels[kind], upperrel);
return upperrel;
}
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index b8141f141a..c696824f5c 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -80,6 +80,25 @@ typedef enum UpperRelationKind
/* NB: UPPERREL_FINAL must be last enum entry; it's used to size arrays */
} UpperRelationKind;
+/*
+ * Hashed list to store relation specific info and to retrieve it by relids.
+ *
+ * For small problems we just scan the list to do lookups, but when there are
+ * many relations we build a hash table for faster lookups. The hash table is
+ * present and valid when 'hash' is not NULL. Note that we still maintain the
+ * list even when using the hash table for lookups; this simplifies life for
+ * GEQO.
+ */
+typedef struct RelInfoList
+{
+ pg_node_attr(no_copy_equal, no_read)
+
+ NodeTag type;
+
+ List *items;
+ struct HTAB *hash pg_node_attr(read_write_ignore);
+} RelInfoList;
+
/*----------
* PlannerGlobal
* Global information for planning/optimization
@@ -270,15 +289,9 @@ struct PlannerInfo
/*
* join_rel_list is a list of all join-relation RelOptInfos we have
- * considered in this planning run. For small problems we just scan the
- * list to do lookups, but when there are many join relations we build a
- * hash table for faster lookups. The hash table is present and valid
- * when join_rel_hash is not NULL. Note that we still maintain the list
- * even when using the hash table for lookups; this simplifies life for
- * GEQO.
+ * considered in this planning run.
*/
- List *join_rel_list;
- struct HTAB *join_rel_hash pg_node_attr(read_write_ignore);
+ RelInfoList *join_rel_list; /* list of join-relation RelOptInfos */
/*
* When doing a dynamic-programming-style join search, join_rel_level[k]
@@ -413,7 +426,7 @@ struct PlannerInfo
* Upper-rel RelOptInfos. Use fetch_upper_rel() to get any particular
* upper rel.
*/
- List *upper_rels[UPPERREL_FINAL + 1] pg_node_attr(read_write_ignore);
+ RelInfoList upper_rels[UPPERREL_FINAL + 1] pg_node_attr(read_write_ignore);
/* Result tlists chosen by grouping_planner for upper-stage processing */
struct PathTarget *upper_targets[UPPERREL_FINAL + 1] pg_node_attr(read_write_ignore);
--
2.31.0
[application/octet-stream] v6-0002-Introduce-RelAggInfo-structure-to-store-info-for-grouped-paths.patch (7.8K, 4-v6-0002-Introduce-RelAggInfo-structure-to-store-info-for-grouped-paths.patch)
download | inline diff:
From 388befb0b73fb7f1b2c6409156f322366185d3f3 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 11:12:18 +0800
Subject: [PATCH v6 2/9] Introduce RelAggInfo structure to store info for
grouped paths.
This commit introduces RelAggInfo structure to store information needed
to create grouped paths for base and join rels. It also revises the
RelInfoList related structures and functions so that they can be used
with RelAggInfos.
---
src/backend/optimizer/util/relnode.c | 66 +++++++++++++++++--------
src/include/nodes/pathnodes.h | 73 ++++++++++++++++++++++++++++
2 files changed, 118 insertions(+), 21 deletions(-)
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 8279ab0e11..8420b8936e 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -36,13 +36,13 @@
/*
- * An entry of a hash table that we use to make lookup for RelOptInfo
- * structures more efficient.
+ * An entry of a hash table that we use to make lookup for RelOptInfo or
+ * RelAggInfo structures more efficient.
*/
typedef struct RelInfoEntry
{
Relids relids; /* hash key --- MUST BE FIRST */
- RelOptInfo *rel;
+ void *data;
} RelInfoEntry;
static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
@@ -484,7 +484,7 @@ find_base_rel_ignore_join(PlannerInfo *root, int relid)
/*
* build_rel_hash
- * Construct the auxiliary hash table for relations.
+ * Construct the auxiliary hash table for relation specific data.
*/
static void
build_rel_hash(RelInfoList *list)
@@ -504,19 +504,27 @@ build_rel_hash(RelInfoList *list)
&hash_ctl,
HASH_ELEM | HASH_FUNCTION | HASH_COMPARE | HASH_CONTEXT);
- /* Insert all the already-existing relations */
+ /* Insert all the already-existing relation specific infos */
foreach(l, list->items)
{
- RelOptInfo *rel = (RelOptInfo *) lfirst(l);
+ void *item = lfirst(l);
RelInfoEntry *hentry;
bool found;
+ Relids relids;
+
+ Assert(IsA(item, RelOptInfo) || IsA(item, RelAggInfo));
+
+ if (IsA(item, RelOptInfo))
+ relids = ((RelOptInfo *) item)->relids;
+ else
+ relids = ((RelAggInfo *) item)->relids;
hentry = (RelInfoEntry *) hash_search(hashtab,
- &(rel->relids),
+ &relids,
HASH_ENTER,
&found);
Assert(!found);
- hentry->rel = rel;
+ hentry->data = item;
}
list->hash = hashtab;
@@ -524,9 +532,9 @@ build_rel_hash(RelInfoList *list)
/*
* find_rel_info
- * Find an RelOptInfo entry.
+ * Find an RelOptInfo or a RelAggInfo entry.
*/
-static RelOptInfo *
+static void *
find_rel_info(RelInfoList *list, Relids relids)
{
if (list == NULL)
@@ -557,7 +565,7 @@ find_rel_info(RelInfoList *list, Relids relids)
HASH_FIND,
NULL);
if (hentry)
- return hentry->rel;
+ return hentry->data;
}
else
{
@@ -565,10 +573,18 @@ find_rel_info(RelInfoList *list, Relids relids)
foreach(l, list->items)
{
- RelOptInfo *rel = (RelOptInfo *) lfirst(l);
+ void *item = lfirst(l);
+ Relids item_relids = NULL;
+
+ Assert(IsA(item, RelOptInfo) || IsA(item, RelAggInfo));
- if (bms_equal(rel->relids, relids))
- return rel;
+ if (IsA(item, RelOptInfo))
+ item_relids = ((RelOptInfo *) item)->relids;
+ else if (IsA(item, RelAggInfo))
+ item_relids = ((RelAggInfo *) item)->relids;
+
+ if (bms_equal(item_relids, relids))
+ return item;
}
}
@@ -583,32 +599,40 @@ find_rel_info(RelInfoList *list, Relids relids)
RelOptInfo *
find_join_rel(PlannerInfo *root, Relids relids)
{
- return find_rel_info(root->join_rel_list, relids);
+ return (RelOptInfo *) find_rel_info(root->join_rel_list, relids);
}
/*
* add_rel_info
- * Add given relation to the given list. Also add it to the auxiliary
+ * Add relation specific info to a list, and also add it to the auxiliary
* hashtable if there is one.
*/
static void
-add_rel_info(RelInfoList *list, RelOptInfo *rel)
+add_rel_info(RelInfoList *list, void *data)
{
+ Assert(IsA(data, RelOptInfo) || IsA(data, RelAggInfo));
+
/* GEQO requires us to append the new relation to the end of the list! */
- list->items = lappend(list->items, rel);
+ list->items = lappend(list->items, data);
/* store it into the auxiliary hashtable if there is one. */
if (list->hash)
{
+ Relids relids;
RelInfoEntry *hentry;
bool found;
+ if (IsA(data, RelOptInfo))
+ relids = ((RelOptInfo *) data)->relids;
+ else
+ relids = ((RelAggInfo *) data)->relids;
+
hentry = (RelInfoEntry *) hash_search(list->hash,
- &(rel->relids),
+ &relids,
HASH_ENTER,
&found);
Assert(!found);
- hentry->rel = rel;
+ hentry->data = data;
}
}
@@ -1503,7 +1527,7 @@ fetch_upper_rel(PlannerInfo *root, UpperRelationKind kind, Relids relids)
/* If we already made this upperrel for the query, return it */
if (list)
{
- upperrel = find_rel_info(list, relids);
+ upperrel = (RelOptInfo *) find_rel_info(list, relids);
if (upperrel)
return upperrel;
}
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index c696824f5c..816c41ed8c 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1074,6 +1074,79 @@ typedef struct RelOptInfo
((rel)->part_scheme && (rel)->boundinfo && (rel)->nparts > 0 && \
(rel)->part_rels && (rel)->partexprs && (rel)->nullable_partexprs)
+/*
+ * RelAggInfo
+ * Information needed to create grouped paths for base and join rels.
+ *
+ * "relids" is the set of relation identifiers (RT indexes), just like with
+ * RelOptInfo.
+ *
+ * "target" will be used as pathtarget if partial aggregation is applied to
+ * base relation or join. The same target will also --- if the relation is a
+ * join --- be used to join grouped path to a non-grouped one. This target can
+ * contain plain-Var grouping expressions and Aggref nodes.
+ *
+ * Note: There's a convention that Aggref expressions are supposed to follow
+ * the other expressions of the target. Iterations of ->exprs may rely on this
+ * arrangement.
+ *
+ * "agg_input" contains Vars used either as grouping expressions or aggregate
+ * arguments. Paths providing the aggregation plan with input data should use
+ * this target. The only difference from reltarget of the non-grouped relation
+ * is that some items can have sortgroupref initialized.
+ *
+ * "input_rows" is the estimated number of input rows for AggPath. It's
+ * actually just a workspace for users of the structure, i.e. not initialized
+ * when instance of the structure is created.
+ *
+ * "grouped_rows" is the estimated number of result rows of the AggPath.
+ *
+ * "group_clauses", "group_exprs" and "group_pathkeys" are lists of
+ * SortGroupClause, the corresponding grouping expressions and PathKey
+ * respectively.
+ *
+ * "agg_exprs" is a list of Aggref nodes for the aggregation of the relation's
+ * paths.
+ */
+typedef struct RelAggInfo
+{
+ pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+ NodeTag type;
+
+ /*
+ * the same as in RelOptInfo; set of base + OJ relids (rangetable indexes)
+ */
+ Relids relids;
+
+ /*
+ * the targetlist for Paths scanning this grouped rel; list of Vars/Exprs,
+ * cost, width
+ */
+ struct PathTarget *target;
+
+ /*
+ * the targetlist for Paths that generate input for the grouped paths
+ */
+ struct PathTarget *agg_input;
+
+ /* estimated number of input tuples for the grouped paths */
+ Cardinality input_rows;
+
+ /* estimated number of result tuples of the grouped relation*/
+ Cardinality grouped_rows;
+
+ /* a list of SortGroupClause's */
+ List *group_clauses;
+ /* a list of grouping expressions */
+ List *group_exprs;
+ /* a list of PathKeys */
+ List *group_pathkeys;
+
+ /* a list of Aggref nodes */
+ List *agg_exprs;
+} RelAggInfo;
+
/*
* IndexOptInfo
* Per-index information for planning/optimization
--
2.31.0
[application/octet-stream] v6-0003-Set-up-for-eager-aggregation-by-collecting-needed-infos.patch (14.3K, 5-v6-0003-Set-up-for-eager-aggregation-by-collecting-needed-infos.patch)
download | inline diff:
From f92e4774e00411423f22116959431ef14e392f61 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 18:40:46 +0800
Subject: [PATCH v6 3/9] Set up for eager aggregation by collecting needed
infos
This commit checks if eager aggregation is applicable, and if so, sets
up root->agg_clause_list and root->group_expr_list by collecting
suitable aggregate expressions and grouping expressions in the query.
---
src/backend/optimizer/path/allpaths.c | 1 +
src/backend/optimizer/plan/initsplan.c | 250 ++++++++++++++++++
src/backend/optimizer/plan/planmain.c | 8 +
src/backend/utils/misc/guc_tables.c | 10 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/nodes/pathnodes.h | 41 +++
src/include/optimizer/paths.h | 1 +
src/include/optimizer/planmain.h | 1 +
src/test/regress/expected/sysviews.out | 3 +-
9 files changed, 315 insertions(+), 1 deletion(-)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index ffc6edd6c7..586c0e07c0 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -77,6 +77,7 @@ typedef enum pushdown_safe_type
/* These parameters are set by GUC */
bool enable_geqo = false; /* just in case GUC doesn't set it */
+bool enable_eager_aggregate = false;
int geqo_threshold;
int min_parallel_table_scan_size;
int min_parallel_index_scan_size;
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index e2c68fe6f9..0281336469 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/nbtree.h"
#include "catalog/pg_type.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
@@ -80,6 +81,8 @@ typedef struct JoinTreeItem
} JoinTreeItem;
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
Index rtindex);
static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -327,6 +330,253 @@ add_vars_to_targetlist(PlannerInfo *root, List *vars,
}
}
+/*
+ * setup_eager_aggregation
+ * Check if eager aggregation is applicable, and if so collect suitable
+ * aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+ /*
+ * Don't apply eager aggregation if disabled by user.
+ */
+ if (!enable_eager_aggregate)
+ return;
+
+ /*
+ * Don't apply eager aggregation if there are no GROUP BY clauses.
+ */
+ if (!root->parse->groupClause)
+ return;
+
+ /*
+ * For now we don't try to support grouping sets.
+ */
+ if (root->parse->groupingSets)
+ return;
+
+ /*
+ * For now we don't try to support DISTINCT or ORDER BY aggregates.
+ */
+ if (root->numOrderedAggs > 0)
+ return;
+
+ /*
+ * If there are any aggregates that do not support partial mode, or any
+ * partial aggregates that are non-serializable, do not apply eager
+ * aggregation.
+ */
+ if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+ return;
+
+ /*
+ * SRF is not allowed in the aggregate argument and we don't even want it
+ * in the GROUP BY clause, so forbid it in general. It needs to be
+ * analyzed if evaluation of a GROUP BY clause containing SRF below the
+ * query targetlist would be correct. Currently it does not seem to be an
+ * important use case.
+ */
+ if (root->parse->hasTargetSRFs)
+ return;
+
+ /*
+ * Collect aggregate expressions that appear in targetlist and having
+ * clauses.
+ */
+ create_agg_clause_infos(root);
+
+ /*
+ * If there are no suitable aggregate expressions, we cannot apply eager
+ * aggregation.
+ */
+ if (root->agg_clause_list == NIL)
+ return;
+
+ /*
+ * Collect grouping expressions that appear in grouping clauses.
+ */
+ create_grouping_expr_infos(root);
+}
+
+/*
+ * Create AggClauseInfo for each aggregate.
+ *
+ * If any aggregate is not suitable, set root->agg_clause_list to NIL and
+ * return.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+ List *tlist_exprs;
+ ListCell *lc;
+
+ Assert(root->agg_clause_list == NIL);
+
+ tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+ PVC_INCLUDE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_RECURSE_PLACEHOLDERS);
+
+ /*
+ * For now we don't try to support GROUPING() expressions.
+ */
+ foreach(lc, tlist_exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+
+ if (IsA(expr, GroupingFunc))
+ return;
+ }
+
+ /*
+ * Aggregates within the HAVING clause need to be processed in the same way
+ * as those in the targetlist. Note that HAVING can contain Aggrefs but
+ * not WindowFuncs.
+ */
+ if (root->parse->havingQual != NULL)
+ {
+ List *having_exprs;
+
+ having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+ PVC_INCLUDE_AGGREGATES |
+ PVC_RECURSE_PLACEHOLDERS);
+ if (having_exprs != NIL)
+ {
+ tlist_exprs = list_concat(tlist_exprs, having_exprs);
+ list_free(having_exprs);
+ }
+ }
+
+ foreach(lc, tlist_exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+ Aggref *aggref;
+ AggClauseInfo *ac_info;
+
+ /*
+ * tlist_exprs may also contain Vars, but we only need Aggrefs.
+ */
+ if (IsA(expr, Var))
+ continue;
+
+ aggref = castNode(Aggref, expr);
+
+ Assert(aggref->aggorder == NIL);
+ Assert(aggref->aggdistinct == NIL);
+
+ ac_info = makeNode(AggClauseInfo);
+ ac_info->aggref = aggref;
+ ac_info->agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+ root->agg_clause_list =
+ list_append_unique(root->agg_clause_list, ac_info);
+ }
+
+ list_free(tlist_exprs);
+}
+
+/*
+ * Create GroupExprInfo for each expression usable as grouping key.
+ *
+ * If any grouping expression is not suitable, set root->group_expr_list to NIL
+ * and return.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+ List *exprs = NIL;
+ List *sortgrouprefs = NIL;
+ List *btree_opfamilies = NIL;
+ ListCell *lc,
+ *lc1,
+ *lc2,
+ *lc3;
+
+ Assert(root->group_expr_list == NIL);
+
+ foreach(lc, root->parse->groupClause)
+ {
+ SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+ TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+ TypeCacheEntry *tce;
+ Oid equalimageproc;
+ Oid eq_op;
+ List *eq_opfamilies;
+ Oid btree_opfamily;
+
+ Assert(tle->ressortgroupref > 0);
+
+ /*
+ * For now we only support plain Vars as grouping expressions.
+ */
+ if (!IsA(tle->expr, Var))
+ return;
+
+ /*
+ * Eager aggregation is only possible if equality of grouping keys
+ * per the equality operator implies bitwise equality. Otherwise, if
+ * we put keys of different byte images into the same group, we lose
+ * some information that may be needed to evaluate join clauses above
+ * the pushed-down aggregate node, or the WHERE clause.
+ *
+ * For example, the NUMERIC data type is not supported because values
+ * that fall into the same group according to the equality operator
+ * (e.g. 0 and 0.0) can have different scale.
+ */
+ tce = lookup_type_cache(exprType((Node *) tle->expr),
+ TYPECACHE_BTREE_OPFAMILY);
+ if (!OidIsValid(tce->btree_opf) ||
+ !OidIsValid(tce->btree_opintype))
+ return;
+
+ equalimageproc = get_opfamily_proc(tce->btree_opf,
+ tce->btree_opintype,
+ tce->btree_opintype,
+ BTEQUALIMAGE_PROC);
+ if (!OidIsValid(equalimageproc) ||
+ !DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+ tce->typcollation,
+ ObjectIdGetDatum(tce->btree_opintype))))
+ return;
+
+ /*
+ * Get the operator in the btree's opfamily.
+ */
+ eq_op = get_opfamily_member(tce->btree_opf,
+ tce->btree_opintype,
+ tce->btree_opintype,
+ BTEqualStrategyNumber);
+ if (!OidIsValid(eq_op))
+ return;
+ eq_opfamilies = get_mergejoin_opfamilies(eq_op);
+ if (!eq_opfamilies)
+ return;
+ btree_opfamily = linitial_oid(eq_opfamilies);
+
+ exprs = lappend(exprs, tle->expr);
+ sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+ btree_opfamilies = lappend_oid(btree_opfamilies, btree_opfamily);
+ }
+
+ /*
+ * Construct GroupExprInfo for each expression.
+ */
+ forthree(lc1, exprs, lc2, sortgrouprefs, lc3, btree_opfamilies)
+ {
+ Expr *expr = (Expr *) lfirst(lc1);
+ int sortgroupref = lfirst_int(lc2);
+ Oid btree_opfamily = lfirst_oid(lc3);
+ GroupExprInfo *ge_info;
+
+ ge_info = makeNode(GroupExprInfo);
+ ge_info->expr = (Expr *) copyObject(expr);
+ ge_info->sortgroupref = sortgroupref;
+ ge_info->btree_opfamily = btree_opfamily;
+
+ root->group_expr_list = lappend(root->group_expr_list, ge_info);
+ }
+}
/*****************************************************************************
*
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index eb78e37317..197a3f905e 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -77,6 +77,8 @@ query_planner(PlannerInfo *root,
root->placeholder_list = NIL;
root->placeholder_array = NULL;
root->placeholder_array_size = 0;
+ root->agg_clause_list = NIL;
+ root->group_expr_list = NIL;
root->fkey_list = NIL;
root->initial_rels = NIL;
@@ -263,6 +265,12 @@ query_planner(PlannerInfo *root,
*/
extract_restriction_or_clauses(root);
+ /*
+ * Check if eager aggregation is applicable, and if so, set up
+ * root->agg_clause_list and root->group_expr_list.
+ */
+ setup_eager_aggregation(root);
+
/*
* Now expand appendrels by adding "otherrels" for their children. We
* delay this to the end so that we have as much information as possible
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 3fd0b14dd8..5ed01f7914 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -929,6 +929,16 @@ struct config_bool ConfigureNamesBool[] =
false,
NULL, NULL, NULL
},
+ {
+ {"enable_eager_aggregate", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables eager aggregation."),
+ NULL,
+ GUC_EXPLAIN
+ },
+ &enable_eager_aggregate,
+ false,
+ NULL, NULL, NULL
+ },
{
{"enable_parallel_append", PGC_USERSET, QUERY_TUNING_METHOD,
gettext_noop("Enables the planner's use of parallel append plans."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 2166ea4a87..27b6515cd3 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -413,6 +413,7 @@
#enable_sort = on
#enable_tidscan = on
#enable_group_by_reordering = on
+#enable_eager_aggregate = off
# - Planner Cost Constants -
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 816c41ed8c..7c4ade0bef 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -386,6 +386,12 @@ struct PlannerInfo
/* list of PlaceHolderInfos */
List *placeholder_list;
+ /* list of AggClauseInfos */
+ List *agg_clause_list;
+
+ /* List of GroupExprInfos */
+ List *group_expr_list;
+
/* array of PlaceHolderInfos indexed by phid */
struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
/* allocated size of array */
@@ -3207,6 +3213,41 @@ typedef struct MinMaxAggInfo
Param *param;
} MinMaxAggInfo;
+/*
+ * The aggregate expressions that appear in targetlist and having clauses
+ */
+typedef struct AggClauseInfo
+{
+ pg_node_attr(no_read, no_query_jumble)
+
+ NodeTag type;
+
+ /* the Aggref expr */
+ Aggref *aggref;
+
+ /* lowest level we can evaluate this aggregate at */
+ Relids agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * The grouping expressions that appear in grouping clauses
+ */
+typedef struct GroupExprInfo
+{
+ pg_node_attr(no_read, no_query_jumble)
+
+ NodeTag type;
+
+ /* the represented expression */
+ Expr *expr;
+
+ /* the tleSortGroupRef of the corresponding SortGroupClause */
+ Index sortgroupref;
+
+ /* btree opfamily defining the ordering */
+ Oid btree_opfamily;
+} GroupExprInfo;
+
/*
* At runtime, PARAM_EXEC slots are used to pass values around from one plan
* node to another. They can be used to pass values down into subqueries (for
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 39ba461548..8f2bd60d47 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,6 +21,7 @@
* allpaths.c
*/
extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
extern PGDLLIMPORT int geqo_threshold;
extern PGDLLIMPORT int min_parallel_table_scan_size;
extern PGDLLIMPORT int min_parallel_index_scan_size;
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index f2e3fa4c2e..42e0f37859 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -73,6 +73,7 @@ extern void add_other_rels_to_query(PlannerInfo *root);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
Relids where_needed);
+extern void setup_eager_aggregation(PlannerInfo *root);
extern void find_lateral_references(PlannerInfo *root);
extern void create_lateral_join_info(PlannerInfo *root);
extern List *deconstruct_jointree(PlannerInfo *root);
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 2f3eb4e7f1..b6f4f6686c 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -136,6 +136,7 @@ select name, setting from pg_settings where name like 'enable%';
--------------------------------+---------
enable_async_append | on
enable_bitmapscan | on
+ enable_eager_aggregate | off
enable_gathermerge | on
enable_group_by_reordering | on
enable_hashagg | on
@@ -157,7 +158,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(23 rows)
+(24 rows)
-- There are always wait event descriptions for various types.
select type, count(*) > 0 as ok FROM pg_wait_events
--
2.31.0
[application/octet-stream] v6-0004-Implement-functions-that-create-RelAggInfos-if-applicable.patch (26.2K, 6-v6-0004-Implement-functions-that-create-RelAggInfos-if-applicable.patch)
download | inline diff:
From 8d16ab4f55fed03a509e5a921e7255026e7bf5fc Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 11:27:49 +0800
Subject: [PATCH v6 4/9] Implement functions that create RelAggInfos if
applicable
This commit implements the functions that check if eager aggregation is
applicable for a given relation, and if so, create RelAggInfo structure
for the relation, using the infos about aggregate expressions and
grouping expressions we collected earlier.
---
src/backend/optimizer/path/equivclass.c | 26 +-
src/backend/optimizer/plan/planmain.c | 3 +
src/backend/optimizer/util/relnode.c | 636 ++++++++++++++++++++++++
src/backend/utils/adt/selfuncs.c | 5 +-
src/include/nodes/pathnodes.h | 6 +
src/include/optimizer/pathnode.h | 5 +
src/include/optimizer/paths.h | 3 +-
7 files changed, 674 insertions(+), 10 deletions(-)
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 21ce1ae2e1..9369acf033 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -2454,15 +2454,17 @@ find_join_domain(PlannerInfo *root, Relids relids)
* Detect whether two expressions are known equal due to equivalence
* relationships.
*
- * Actually, this only shows that the expressions are equal according
- * to some opfamily's notion of equality --- but we only use it for
- * selectivity estimation, so a fuzzy idea of equality is OK.
+ * If opfamily is given, the expressions must be known equal per the semantics
+ * of that opfamily (note it has to be a btree opfamily, since those are the
+ * only opfamilies equivclass.c deals with). If opfamily is InvalidOid, we'll
+ * return true if they're equal according to any opfamily, which is fuzzy but
+ * OK for estimation purposes.
*
* Note: does not bother to check for "equal(item1, item2)"; caller must
* check that case if it's possible to pass identical items.
*/
bool
-exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
+exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2, Oid opfamily)
{
ListCell *lc1;
@@ -2477,6 +2479,17 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
if (ec->ec_has_volatile)
continue;
+ /*
+ * It's okay to consider ec_broken ECs here. Brokenness just means we
+ * couldn't derive all the implied clauses we'd have liked to; it does
+ * not invalidate our knowledge that the members are equal.
+ */
+
+ /* Ignore if this EC doesn't use specified opfamily */
+ if (OidIsValid(opfamily) &&
+ !list_member_oid(ec->ec_opfamilies, opfamily))
+ continue;
+
foreach(lc2, ec->ec_members)
{
EquivalenceMember *em = (EquivalenceMember *) lfirst(lc2);
@@ -2505,8 +2518,7 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
* (In principle there might be more than one matching eclass if multiple
* collations are involved, but since collation doesn't matter for equality,
* we ignore that fine point here.) This is much like exprs_known_equal,
- * except that we insist on the comparison operator matching the eclass, so
- * that the result is definite not approximate.
+ * except for the format of the input.
*
* On success, we also set fkinfo->eclass[colno] to the matching eclass,
* and set fkinfo->fk_eclass_member[colno] to the eclass member for the
@@ -2547,7 +2559,7 @@ match_eclasses_to_foreign_key_col(PlannerInfo *root,
/* Never match to a volatile EC */
if (ec->ec_has_volatile)
continue;
- /* Note: it seems okay to match to "broken" eclasses here */
+ /* It's okay to consider "broken" ECs here, see exprs_known_equal */
foreach(lc2, ec->ec_members)
{
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 197a3f905e..0ff0ca99cb 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -67,6 +67,9 @@ query_planner(PlannerInfo *root,
root->join_rel_list = makeNode(RelInfoList);
root->join_rel_list->items = NIL;
root->join_rel_list->hash = NULL;
+ root->agg_info_list = makeNode(RelInfoList);
+ root->agg_info_list->items = NIL;
+ root->agg_info_list->hash = NULL;
root->join_rel_level = NULL;
root->join_cur_level = 0;
root->canon_pathkeys = NIL;
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 8420b8936e..c6e2d417a8 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -87,6 +87,14 @@ static void build_child_join_reltarget(PlannerInfo *root,
RelOptInfo *childrel,
int nappinfos,
AppendRelInfo **appinfos);
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+ RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+ PathTarget *target, PathTarget *agg_input,
+ List **group_exprs_extra_p);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
/*
@@ -647,6 +655,58 @@ add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
add_rel_info(root->join_rel_list, joinrel);
}
+/*
+ * add_grouped_rel
+ * Add grouped base or join relation to the list of grouped relations in
+ * the given PlannerInfo. Also add the corresponding RelAggInfo to
+ * root->agg_info_list.
+ */
+void
+add_grouped_rel(PlannerInfo *root, RelOptInfo *rel, RelAggInfo *agg_info)
+{
+ add_rel_info(&root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG], rel);
+ add_rel_info(root->agg_info_list, agg_info);
+}
+
+/*
+ * find_grouped_rel
+ * Returns grouped relation entry (base or join relation) corresponding to
+ * 'relids' or NULL if none exists.
+ *
+ * If agg_info_p is not NULL, then also the corresponding RelAggInfo (if one
+ * exists) will be returned in *agg_info_p.
+ */
+RelOptInfo *
+find_grouped_rel(PlannerInfo *root, Relids relids, RelAggInfo **agg_info_p)
+{
+ RelOptInfo *rel;
+
+ rel = (RelOptInfo *) find_rel_info(&root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG],
+ relids);
+ if (rel == NULL)
+ {
+ if (agg_info_p)
+ *agg_info_p = NULL;
+
+ return NULL;
+ }
+
+ /* also return the corresponding RelAggInfo, if asked */
+ if (agg_info_p)
+ {
+ RelAggInfo *agg_info;
+
+ agg_info = (RelAggInfo *) find_rel_info(root->agg_info_list, relids);
+
+ /* The relation exists, so the agg_info should be there too. */
+ Assert(agg_info != NULL);
+
+ *agg_info_p = agg_info;
+ }
+
+ return rel;
+}
+
/*
* set_foreign_rel_properties
* Set up foreign-join fields if outer and inner relation are foreign
@@ -2483,3 +2543,579 @@ build_child_join_reltarget(PlannerInfo *root,
childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
childrel->reltarget->width = parentrel->reltarget->width;
}
+
+/*
+ * create_rel_agg_info
+ * Check if the given relation can produce grouped paths and return the
+ * information it'll need for it. The given relation is the non-grouped one
+ * which has the reltarget already constructed.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+ ListCell *lc;
+ RelAggInfo *result;
+ PathTarget *agg_input;
+ PathTarget *target;
+ List *grp_exprs_extra = NIL;
+ List *group_clauses_final;
+ int i;
+
+ /*
+ * The lists of aggregate expressions and grouping expressions should have
+ * been constructed.
+ */
+ Assert(root->agg_clause_list != NIL);
+ Assert(root->group_expr_list != NIL);
+
+ /*
+ * If this is a child rel, the grouped rel for its parent rel must have
+ * been created if it can. So we can just use parent's RelAggInfo if there
+ * is one, with appropriate variable substitutions.
+ */
+ if (IS_OTHER_REL(rel))
+ {
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ Assert(!bms_is_empty(rel->top_parent_relids));
+ rel_grouped = find_grouped_rel(root, rel->top_parent_relids, &agg_info);
+
+ if (rel_grouped == NULL)
+ return NULL;
+
+ Assert(agg_info != NULL);
+ /* Must do multi-level transformation */
+ agg_info = (RelAggInfo *)
+ adjust_appendrel_attrs_multilevel(root,
+ (Node *) agg_info,
+ rel,
+ rel->top_parent);
+
+ agg_info->input_rows = rel->rows;
+ agg_info->grouped_rows =
+ estimate_num_groups(root, agg_info->group_exprs,
+ agg_info->input_rows, NULL, NULL);
+
+ return agg_info;
+ }
+
+ /* Check if it's possible to produce grouped paths for this relation. */
+ if (!eager_aggregation_possible_for_relation(root, rel))
+ return NULL;
+
+ /*
+ * Create targets for the grouped paths and for the input paths of the
+ * grouped paths.
+ */
+ target = create_empty_pathtarget();
+ agg_input = create_empty_pathtarget();
+
+ /* initialize 'target' and 'agg_input' */
+ if (!init_grouping_targets(root, rel, target, agg_input, &grp_exprs_extra))
+ return NULL;
+
+ /* Eager aggregation makes no sense w/o grouping expressions */
+ if ((list_length(target->exprs) + list_length(grp_exprs_extra)) == 0)
+ return NULL;
+
+ group_clauses_final = root->parse->groupClause;
+
+ /*
+ * If the aggregation target should have extra grouping expressions (in
+ * order to emit input vars for join conditions), add them now. This step
+ * includes assignment of tleSortGroupRef's which we can generate now.
+ */
+ if (list_length(grp_exprs_extra) > 0)
+ {
+ Index sortgroupref;
+
+ /*
+ * Make a copy of the group clauses as we'll need to add some more
+ * clauses.
+ */
+ group_clauses_final = list_copy(group_clauses_final);
+
+ /* find out the current max sortgroupref */
+ sortgroupref = 0;
+ foreach(lc, root->processed_tlist)
+ {
+ Index ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+ if (ref > sortgroupref)
+ sortgroupref = ref;
+ }
+
+ /*
+ * Generate the SortGroupClause's and add the expressions to the
+ * target.
+ */
+ foreach(lc, grp_exprs_extra)
+ {
+ Var *var = lfirst_node(Var, lc);
+ SortGroupClause *cl = makeNode(SortGroupClause);
+
+ /*
+ * Initialize the SortGroupClause.
+ *
+ * As the final aggregation will not use this grouping expression,
+ * we don't care whether sortop is < or >. The value of nulls_first
+ * should not matter for the same reason.
+ */
+ cl->tleSortGroupRef = ++sortgroupref;
+ get_sort_group_operators(var->vartype,
+ false, true, false,
+ &cl->sortop, &cl->eqop, NULL,
+ &cl->hashable);
+ group_clauses_final = lappend(group_clauses_final, cl);
+ add_column_to_pathtarget(target, (Expr *) var,
+ cl->tleSortGroupRef);
+
+ /*
+ * The aggregation input target must emit this var too.
+ */
+ add_column_to_pathtarget(agg_input, (Expr *) var,
+ cl->tleSortGroupRef);
+ }
+ }
+
+ /*
+ * Build a list of grouping expressions and a list of the corresponding
+ * SortGroupClauses.
+ */
+ i = 0;
+ result = makeNode(RelAggInfo);
+ foreach(lc, target->exprs)
+ {
+ Index sortgroupref = 0;
+ SortGroupClause *cl;
+ Expr *texpr;
+
+ texpr = (Expr *) lfirst(lc);
+
+ Assert(IsA(texpr, Var));
+
+ sortgroupref = target->sortgrouprefs[i++];
+ if (sortgroupref == 0)
+ continue;
+
+ /* find the SortGroupClause in group_clauses_final */
+ cl = get_sortgroupref_clause(sortgroupref, group_clauses_final);
+
+ /* do not add this SortGroupClause if it has already been added */
+ if (list_member(result->group_clauses, cl))
+ continue;
+
+ result->group_clauses = lappend(result->group_clauses, cl);
+ result->group_exprs = list_append_unique(result->group_exprs,
+ texpr);
+ }
+
+ /*
+ * Calculate pathkeys that represent this grouping requirements.
+ */
+ result->group_pathkeys =
+ make_pathkeys_for_sortclauses(root, result->group_clauses,
+ make_tlist_from_pathtarget(target));
+
+ /*
+ * Add aggregates to the grouping target.
+ */
+ foreach(lc, root->agg_clause_list)
+ {
+ AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+ Aggref *aggref;
+
+ Assert(IsA(ac_info->aggref, Aggref));
+
+ aggref = (Aggref *) copyObject(ac_info->aggref);
+ mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+ add_column_to_pathtarget(target, (Expr *) aggref, 0);
+
+ result->agg_exprs = lappend(result->agg_exprs, aggref);
+ }
+
+ /*
+ * Since neither target nor agg_input is supposed to be identical to the
+ * source reltarget, compute the width and cost again.
+ */
+ set_pathtarget_cost_width(root, target);
+ set_pathtarget_cost_width(root, agg_input);
+
+ result->relids = bms_copy(rel->relids);
+ result->target = target;
+ result->agg_input = agg_input;
+
+ /*
+ * The number of aggregation input rows is simply the number of rows of the
+ * non-grouped relation, which should have been estimated by now.
+ */
+ result->input_rows = rel->rows;
+
+ /* Estimate the number of groups with equal grouped exprs. */
+ result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+ result->input_rows, NULL, NULL);
+
+ return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ * Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+ ListCell *lc;
+
+ /*
+ * The current implementation of eager aggregation cannot handle
+ * PlaceHolderVar (PHV).
+ *
+ * If we knew that the PHV should be evaluated in this target (and of
+ * course, if its expression matched some Aggref argument), we'd just let
+ * init_grouping_targets add that Aggref. On the other hand, if we knew
+ * that the PHV is evaluated below the current rel, we could ignore it
+ * because the referencing Aggref would take care of propagation of the
+ * value to upper joins.
+ *
+ * The problem is that the same PHV can be evaluated in the target of the
+ * current rel or in that of lower rel --- depending on the input paths.
+ * For example, consider rel->relids = {A, B, C} and if ph_eval_at = {B,
+ * C}. Path "A JOIN (B JOIN C)" implies that the PHV is evaluated by the
+ * "(B JOIN C)", while path "(A JOIN B) JOIN C" evaluates the PHV itself.
+ */
+ foreach(lc, rel->reltarget->exprs)
+ {
+ Expr *expr = lfirst(lc);
+
+ if (IsA(expr, PlaceHolderVar))
+ return false;
+ }
+
+ if (IS_SIMPLE_REL(rel))
+ {
+ RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+ /*
+ * rtekind != RTE_RELATION case is not supported yet.
+ */
+ if (rte->rtekind != RTE_RELATION)
+ return false;
+ }
+
+ /* Caller should only pass base relations or joins. */
+ Assert(rel->reloptkind == RELOPT_BASEREL ||
+ rel->reloptkind == RELOPT_JOINREL);
+
+ /*
+ * Check if all aggregate expressions can be evaluated on this relation
+ * level.
+ */
+ foreach(lc, root->agg_clause_list)
+ {
+ AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+ Assert(IsA(ac_info->aggref, Aggref));
+
+ /*
+ * Give up if any aggregate needs relations other than the current one.
+ *
+ * If the aggregate needs the current rel plus anything else, then the
+ * problem is that grouping of the current relation could make some
+ * input variables unavailable for the "higher aggregate", and it'd
+ * also decrease the number of input rows the "higher aggregate"
+ * receives.
+ *
+ * If the aggregate does not even need the current rel, then the
+ * current rel should be grouped because we do not support join of two
+ * grouped relations.
+ */
+ if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * init_grouping_targets
+ * Initialize target for grouped paths (target) as well as a target for
+ * paths that generate input for the grouped paths (agg_input).
+ *
+ * group_exprs_extra_p receives a list of Var nodes for which we need to
+ * construct SortGroupClause. Those vars will then be used as additional
+ * grouping expressions, for the sake of join clauses.
+ *
+ * Return true iff the targets could be initialized.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+ PathTarget *target, PathTarget *agg_input,
+ List **group_exprs_extra_p)
+{
+ ListCell *lc;
+ List *possibly_dependent = NIL;
+
+ foreach(lc, rel->reltarget->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+ Index sortgroupref;
+
+ /*
+ * Given that PlaceHolderVar currently prevents us from doing eager
+ * aggregation, the source target cannot contain anything more complex
+ * than a Var.
+ */
+ Assert(IsA(expr, Var));
+
+ /* Get the sortgroupref if the expr can act as grouping expression. */
+ sortgroupref = get_expression_sortgroupref(root, expr);
+ if (sortgroupref > 0)
+ {
+ /*
+ * If the target expression can be used as the grouping key, it
+ * should be emitted by the grouped paths that have been pushed
+ * down to this relation level.
+ */
+ add_column_to_pathtarget(target, expr, sortgroupref);
+
+ /*
+ * ... and it also should be emitted by the input paths
+ */
+ add_column_to_pathtarget(agg_input, expr, sortgroupref);
+ }
+ else
+ {
+ if (is_var_needed_by_join(root, (Var *) expr, rel))
+ {
+ /*
+ * The variable is needed for a join, however it's neither in
+ * the GROUP BY clause nor can it be derived from it using EC.
+ * (Otherwise it would have to be added to the targets above.)
+ * We need to construct special SortGroupClause for this
+ * variable.
+ *
+ * Note that its tleSortGroupRef needs to be unique within
+ * agg_input, so we need to postpone creation of the
+ * SortGroupClause's until we're done with the iteration of
+ * rel->reltarget->exprs. Also it makes sense for the caller to
+ * do some more check before it starts to create those
+ * SortGroupClause's.
+ */
+ *group_exprs_extra_p = lappend(*group_exprs_extra_p, expr);
+ }
+ else if (is_var_in_aggref_only(root, (Var *) expr))
+ {
+ /*
+ * Another reason we might need this variable is that some
+ * aggregate pushed down to this relation references it. In
+ * such a case, add it to "agg_input", but not to "target".
+ * However, if the aggregate is not the only reason for the var
+ * to be in the target, some more checks need to be performed
+ * below.
+ */
+ add_new_column_to_pathtarget(agg_input, expr);
+ }
+ else
+ {
+ /*
+ * The Var can be functionally dependent on another expression
+ * of the target, but we cannot check that until we've built
+ * all the expressions for the target.
+ */
+ possibly_dependent = lappend(possibly_dependent, expr);
+ }
+ }
+ }
+
+ /*
+ * Now we can check whether the expression is functionally dependent on
+ * another one.
+ */
+ foreach(lc, possibly_dependent)
+ {
+ Var *tvar;
+ List *deps = NIL;
+ RangeTblEntry *rte;
+
+ tvar = lfirst_node(Var, lc);
+ rte = root->simple_rte_array[tvar->varno];
+
+ /*
+ * Check if the Var can be in the grouping key even though it's not
+ * mentioned by the GROUP BY clause (and could not be derived using
+ * ECs).
+ */
+ if (check_functional_grouping(rte->relid, tvar->varno,
+ tvar->varlevelsup,
+ target->exprs, &deps))
+ {
+ /*
+ * The var shouldn't be actually used for grouping key evaluation
+ * (instead, the one this depends on will be), so sortgroupref
+ * should not be important.
+ */
+ add_new_column_to_pathtarget(target, (Expr *) tvar);
+ add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+ }
+ else
+ {
+ /*
+ * As long as the query is semantically correct, arriving here
+ * means that the var is referenced by a generic grouping
+ * expression but not referenced by any join.
+ *
+ * If the eager aggregation will support generic grouping
+ * expression in the future, create_rel_agg_info() will have to add
+ * this variable to "agg_input" target and also add the whole
+ * generic expression to "target".
+ */
+ return false;
+ }
+ }
+
+ return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ * Check whether the given Var appears in aggregate expressions and not
+ * elsewhere in the targetlist.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+ List *tlist_exprs;
+ ListCell *lc;
+
+ /*
+ * Search the list of aggregate expressions for the Var.
+ */
+ foreach(lc, root->agg_clause_list)
+ {
+ AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+ List *vars;
+
+ Assert(IsA(ac_info->aggref, Aggref));
+
+ if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+ continue;
+
+ vars = pull_var_clause((Node *) ac_info->aggref,
+ PVC_RECURSE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_RECURSE_PLACEHOLDERS);
+
+ if (list_member(vars, var))
+ {
+ list_free(vars);
+ break;
+ }
+
+ list_free(vars);
+ }
+
+ /*
+ * If we reached the end of the list, the Var is not referenced in
+ * aggregate expressions.
+ */
+ if (lc == NULL)
+ return false;
+
+ /*
+ * Search the targetlist to see if the Var is referenced anywhere other
+ * than in aggregate expressions.
+ */
+ tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+ PVC_INCLUDE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_RECURSE_PLACEHOLDERS);
+
+ foreach(lc, tlist_exprs)
+ {
+ Var *tlist_var = (Var *) lfirst(lc);
+
+ if (IsA(tlist_var, Aggref))
+ continue;
+
+ if (equal(tlist_var, var))
+ {
+ list_free(tlist_exprs);
+ return false;
+ }
+ }
+
+ list_free(tlist_exprs);
+
+ return true;
+}
+
+/*
+ * is_var_needed_by_join
+ * Check if the given Var is needed by joins above the current rel.
+ *
+ * Consider pushing the aggregate avg(b.y) down to relation b for the following
+ * query:
+ *
+ * SELECT a.i, avg(b.y)
+ * FROM a JOIN b ON a.j = b.j
+ * GROUP BY a.i;
+ *
+ * Column b.j needs to be used as the grouping key because otherwise it cannot
+ * find its way to the input of the join expression.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+ Relids relids;
+ int attno;
+ RelOptInfo *baserel;
+
+ /*
+ * Note that when we are checking if the Var is needed by joins above, we
+ * want to exclude the situation where the Var is only needed in final
+ * output. So include "relation 0" here.
+ */
+ relids = bms_copy(rel->relids);
+ relids = bms_add_member(relids, 0);
+
+ baserel = find_base_rel(root, var->varno);
+ attno = var->varattno - baserel->min_attr;
+
+ return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ * Return sortgroupref if the given 'expr' can be used as a grouping
+ * expression in grouped paths for base or join relations, or 0 otherwise.
+ *
+ * Note that we also need to check if the 'expr' is known equal to other exprs
+ * due to equivalence relationships that can act as grouping expressions.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+ ListCell *lc;
+
+ foreach(lc, root->group_expr_list)
+ {
+ GroupExprInfo *ge_info = lfirst_node(GroupExprInfo, lc);
+
+ Assert(IsA(ge_info->expr, Var));
+
+ if (equal(ge_info->expr, expr) ||
+ exprs_known_equal(root, (Node *) expr, (Node *) ge_info->expr,
+ ge_info->btree_opfamily))
+ {
+ Assert(ge_info->sortgroupref > 0);
+
+ return ge_info->sortgroupref;
+ }
+ }
+
+ /* The expression cannot be used as grouping key. */
+ return 0;
+}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 5f5d7959d8..877a62a62e 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3313,10 +3313,11 @@ add_unique_group_var(PlannerInfo *root, List *varinfos,
/*
* Drop known-equal vars, but only if they belong to different
- * relations (see comments for estimate_num_groups)
+ * relations (see comments for estimate_num_groups). We aren't too
+ * fussy about the semantics of "equal" here.
*/
if (vardata->rel != varinfo->rel &&
- exprs_known_equal(root, var, varinfo->var))
+ exprs_known_equal(root, var, varinfo->var, InvalidOid))
{
if (varinfo->ndistinct <= ndistinct)
{
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 7c4ade0bef..ac639abe31 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -434,6 +434,12 @@ struct PlannerInfo
*/
RelInfoList upper_rels[UPPERREL_FINAL + 1] pg_node_attr(read_write_ignore);
+ /*
+ * list of grouped relation RelAggInfos. One instance of RelAggInfo per
+ * item of the upper_rels[UPPERREL_PARTIAL_GROUP_AGG] list.
+ */
+ RelInfoList *agg_info_list;
+
/* Result tlists chosen by grouping_planner for upper-stage processing */
struct PathTarget *upper_targets[UPPERREL_FINAL + 1] pg_node_attr(read_write_ignore);
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index c5c4756b0f..d973bff8ff 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -313,6 +313,10 @@ extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
extern RelOptInfo *find_join_rel(PlannerInfo *root, Relids relids);
+extern void add_grouped_rel(PlannerInfo *root, RelOptInfo *rel,
+ RelAggInfo *agg_info);
+extern RelOptInfo *find_grouped_rel(PlannerInfo *root, Relids relids,
+ RelAggInfo **agg_info_p);
extern RelOptInfo *build_join_rel(PlannerInfo *root,
Relids joinrelids,
RelOptInfo *outer_rel,
@@ -347,4 +351,5 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
RelOptInfo *parent_joinrel, List *restrictlist,
SpecialJoinInfo *sjinfo);
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
#endif /* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 8f2bd60d47..31eed6b6a8 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -162,7 +162,8 @@ extern List *generate_join_implied_equalities_for_ecs(PlannerInfo *root,
Relids join_relids,
Relids outer_relids,
RelOptInfo *inner_rel);
-extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
+extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2,
+ Oid opfamily);
extern EquivalenceClass *match_eclasses_to_foreign_key_col(PlannerInfo *root,
ForeignKeyOptInfo *fkinfo,
int colno);
--
2.31.0
[application/octet-stream] v6-0005-Implement-functions-that-generate-paths-for-grouped-relations.patch (13.1K, 7-v6-0005-Implement-functions-that-generate-paths-for-grouped-relations.patch)
download | inline diff:
From 65cb86a4ec954786c9d6533c13ea71ba76224372 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 14:19:39 +0800
Subject: [PATCH v6 5/9] Implement functions that generate paths for grouped
relations
This commit implements the functions that generate paths for grouped
relations by adding sorted and hashed partial aggregation paths on top
of paths of the plain base or join relations.
---
src/backend/optimizer/path/allpaths.c | 307 ++++++++++++++++++++++++++
src/backend/optimizer/util/pathnode.c | 12 +-
src/include/optimizer/paths.h | 4 +
3 files changed, 315 insertions(+), 8 deletions(-)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 586c0e07c0..3f3dbc486e 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
#include "optimizer/paths.h"
#include "optimizer/plancat.h"
#include "optimizer/planner.h"
+#include "optimizer/prep.h"
#include "optimizer/tlist.h"
#include "parser/parse_clause.h"
#include "parser/parsetree.h"
@@ -47,6 +48,7 @@
#include "port/pg_bitutils.h"
#include "rewrite/rewriteManip.h"
#include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
/* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -3308,6 +3310,311 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
}
}
+/*
+ * generate_grouped_paths
+ * Generate paths for a grouped relation by adding sorted and hashed
+ * partial aggregation paths on top of paths of the plain base or join
+ * relation.
+ *
+ * The information needed are provided by the RelAggInfo structure.
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *rel_grouped,
+ RelOptInfo *rel_plain, RelAggInfo *agg_info)
+{
+ AggClauseCosts agg_costs;
+ bool can_hash;
+ bool can_sort;
+ Path *cheapest_total_path = NULL;
+ Path *cheapest_partial_path = NULL;
+ double dNumGroups = 0;
+ double dNumPartialGroups = 0;
+
+ if (IS_DUMMY_REL(rel_plain))
+ {
+ mark_dummy_rel(rel_grouped);
+ return;
+ }
+
+ MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+ get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+ /*
+ * Determine whether it's possible to perform sort-based implementations of
+ * grouping.
+ */
+ can_sort = grouping_is_sortable(agg_info->group_clauses);
+
+ /*
+ * Determine whether we should consider hash-based implementations of
+ * grouping.
+ */
+ Assert(root->numOrderedAggs == 0);
+ can_hash = (agg_info->group_clauses != NIL &&
+ grouping_is_hashable(agg_info->group_clauses));
+
+ /*
+ * Consider whether we should generate partially aggregated non-partial
+ * paths. We can only do this if we have a non-partial path.
+ */
+ if (rel_plain->pathlist != NIL)
+ {
+ cheapest_total_path = rel_plain->cheapest_total_path;
+ Assert(cheapest_total_path != NULL);
+ }
+
+ /*
+ * If parallelism is possible for rel_grouped, then we should consider
+ * generating partially-grouped partial paths. However, if the plain rel
+ * has no partial paths, then we can't.
+ */
+ if (rel_grouped->consider_parallel && rel_plain->partial_pathlist != NIL)
+ {
+ cheapest_partial_path = linitial(rel_plain->partial_pathlist);
+ Assert(cheapest_partial_path != NULL);
+ }
+
+ /* Estimate number of partial groups. */
+ if (cheapest_total_path != NULL)
+ dNumGroups = estimate_num_groups(root,
+ agg_info->group_exprs,
+ cheapest_total_path->rows,
+ NULL, NULL);
+ if (cheapest_partial_path != NULL)
+ dNumPartialGroups = estimate_num_groups(root,
+ agg_info->group_exprs,
+ cheapest_partial_path->rows,
+ NULL, NULL);
+
+ if (can_sort && cheapest_total_path != NULL)
+ {
+ ListCell *lc;
+
+ /*
+ * Use any available suitably-sorted path as input, and also consider
+ * sorting the cheapest-total path.
+ */
+ foreach(lc, rel_plain->pathlist)
+ {
+ Path *input_path = (Path *) lfirst(lc);
+ Path *path;
+ bool is_sorted;
+ int presorted_keys;
+
+ /*
+ * Since the path originates from the non-grouped relation which is
+ * not aware of eager aggregation, we must ensure that it provides
+ * the correct input for the partial aggregation.
+ */
+ path = (Path *) create_projection_path(root,
+ rel_grouped,
+ input_path,
+ agg_info->agg_input);
+
+ is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+ path->pathkeys,
+ &presorted_keys);
+ if (!is_sorted)
+ {
+ /*
+ * Try at least sorting the cheapest path and also try
+ * incrementally sorting any path which is partially sorted
+ * already (no need to deal with paths which have presorted
+ * keys when incremental sort is disabled unless it's the
+ * cheapest input path).
+ */
+ if (input_path != cheapest_total_path &&
+ (presorted_keys == 0 || !enable_incremental_sort))
+ continue;
+
+ /*
+ * We've no need to consider both a sort and incremental sort.
+ * We'll just do a sort if there are no presorted keys and an
+ * incremental sort when there are presorted keys.
+ */
+ if (presorted_keys == 0 || !enable_incremental_sort)
+ path = (Path *) create_sort_path(root,
+ rel_grouped,
+ path,
+ agg_info->group_pathkeys,
+ -1.0);
+ else
+ path = (Path *) create_incremental_sort_path(root,
+ rel_grouped,
+ path,
+ agg_info->group_pathkeys,
+ presorted_keys,
+ -1.0);
+ }
+
+ /*
+ * qual is NIL because the HAVING clause cannot be evaluated until the
+ * final value of the aggregate is known.
+ */
+ path = (Path *) create_agg_path(root,
+ rel_grouped,
+ path,
+ agg_info->target,
+ AGG_SORTED,
+ AGGSPLIT_INITIAL_SERIAL,
+ agg_info->group_clauses,
+ NIL,
+ &agg_costs,
+ dNumGroups);
+
+ add_path(rel_grouped, path);
+ }
+ }
+
+ if (can_sort && cheapest_partial_path != NULL)
+ {
+ ListCell *lc;
+
+ /* Similar to above logic, but for partial paths. */
+ foreach(lc, rel_plain->partial_pathlist)
+ {
+ Path *input_path = (Path *) lfirst(lc);
+ Path *path;
+ bool is_sorted;
+ int presorted_keys;
+
+ /*
+ * Since the path originates from the non-grouped relation which is
+ * not aware of eager aggregation, we must ensure that it provides
+ * the correct input for the partial aggregation.
+ */
+ path = (Path *) create_projection_path(root,
+ rel_grouped,
+ input_path,
+ agg_info->agg_input);
+
+ is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+ path->pathkeys,
+ &presorted_keys);
+
+ if (!is_sorted)
+ {
+ /*
+ * Try at least sorting the cheapest path and also try
+ * incrementally sorting any path which is partially sorted
+ * already (no need to deal with paths which have presorted
+ * keys when incremental sort is disabled unless it's the
+ * cheapest input path).
+ */
+ if (input_path != cheapest_partial_path &&
+ (presorted_keys == 0 || !enable_incremental_sort))
+ continue;
+
+ /*
+ * We've no need to consider both a sort and incremental sort.
+ * We'll just do a sort if there are no presorted keys and an
+ * incremental sort when there are presorted keys.
+ */
+ if (presorted_keys == 0 || !enable_incremental_sort)
+ path = (Path *) create_sort_path(root,
+ rel_grouped,
+ path,
+ agg_info->group_pathkeys,
+ -1.0);
+ else
+ path = (Path *) create_incremental_sort_path(root,
+ rel_grouped,
+ path,
+ agg_info->group_pathkeys,
+ presorted_keys,
+ -1.0);
+ }
+
+ /*
+ * qual is NIL because the HAVING clause cannot be evaluated until the
+ * final value of the aggregate is known.
+ */
+ path = (Path *) create_agg_path(root,
+ rel_grouped,
+ path,
+ agg_info->target,
+ AGG_SORTED,
+ AGGSPLIT_INITIAL_SERIAL,
+ agg_info->group_clauses,
+ NIL,
+ &agg_costs,
+ dNumPartialGroups);
+
+ add_partial_path(rel_grouped, path);
+ }
+ }
+
+ /*
+ * Add a partially-grouped HashAgg Path where possible
+ */
+ if (can_hash && cheapest_total_path != NULL)
+ {
+ Path *path;
+
+ /*
+ * Since the path originates from the non-grouped relation which is
+ * not aware of eager aggregation, we must ensure that it provides
+ * the correct input for the partial aggregation.
+ */
+ path = (Path *) create_projection_path(root,
+ rel_grouped,
+ cheapest_total_path,
+ agg_info->agg_input);
+
+ /*
+ * qual is NIL because the HAVING clause cannot be evaluated until
+ * the final value of the aggregate is known.
+ */
+ path = (Path *) create_agg_path(root,
+ rel_grouped,
+ path,
+ agg_info->target,
+ AGG_HASHED,
+ AGGSPLIT_INITIAL_SERIAL,
+ agg_info->group_clauses,
+ NIL,
+ &agg_costs,
+ dNumGroups);
+
+ add_path(rel_grouped, path);
+ }
+
+ /*
+ * Now add a partially-grouped HashAgg partial Path where possible
+ */
+ if (can_hash && cheapest_partial_path != NULL)
+ {
+ Path *path;
+
+ /*
+ * Since the path originates from the non-grouped relation which is
+ * not aware of eager aggregation, we must ensure that it provides
+ * the correct input for the partial aggregation.
+ */
+ path = (Path *) create_projection_path(root,
+ rel_grouped,
+ cheapest_partial_path,
+ agg_info->agg_input);
+
+ /*
+ * qual is NIL because the HAVING clause cannot be evaluated until
+ * the final value of the aggregate is known.
+ */
+ path = (Path *) create_agg_path(root,
+ rel_grouped,
+ path,
+ agg_info->target,
+ AGG_HASHED,
+ AGGSPLIT_INITIAL_SERIAL,
+ agg_info->group_clauses,
+ NIL,
+ &agg_costs,
+ dNumPartialGroups);
+
+ add_partial_path(rel_grouped, path);
+ }
+}
+
/*
* make_rel_from_joinlist
* Build access paths using a "joinlist" to guide the join path search.
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 3cf1dac087..70fa25a67b 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2709,8 +2709,7 @@ create_projection_path(PlannerInfo *root,
pathnode->path.pathtype = T_Result;
pathnode->path.parent = rel;
pathnode->path.pathtarget = target;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe &&
@@ -2962,8 +2961,7 @@ create_incremental_sort_path(PlannerInfo *root,
pathnode->path.parent = rel;
/* Sort doesn't project, so use source path's pathtarget */
pathnode->path.pathtarget = subpath->pathtarget;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
@@ -3009,8 +3007,7 @@ create_sort_path(PlannerInfo *root,
pathnode->path.parent = rel;
/* Sort doesn't project, so use source path's pathtarget */
pathnode->path.pathtarget = subpath->pathtarget;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
@@ -3168,8 +3165,7 @@ create_agg_path(PlannerInfo *root,
pathnode->path.pathtype = T_Agg;
pathnode->path.parent = rel;
pathnode->path.pathtarget = target;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 31eed6b6a8..947f814f4f 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -58,6 +58,10 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
bool override_rows);
extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root,
+ RelOptInfo *rel_grouped,
+ RelOptInfo *rel_plain,
+ RelAggInfo *agg_info);
extern int compute_parallel_worker(RelOptInfo *rel, double heap_pages,
double index_pages, int max_workers);
extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
--
2.31.0
[application/octet-stream] v6-0006-Build-grouped-relations-out-of-base-relations.patch (9.0K, 8-v6-0006-Build-grouped-relations-out-of-base-relations.patch)
download | inline diff:
From 667c3c7de368090dd106c3a199874c20c4639bcb Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Wed, 28 Feb 2024 10:03:41 +0800
Subject: [PATCH v6 6/9] Build grouped relations out of base relations
This commit builds grouped relations for each base relation if possible,
and generates aggregation paths for the grouped base relations.
---
src/backend/optimizer/path/allpaths.c | 91 +++++++++++++++++++++++
src/backend/optimizer/util/relnode.c | 101 ++++++++++++++++++++++++++
src/include/optimizer/pathnode.h | 4 +
3 files changed, 196 insertions(+)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 3f3dbc486e..ef699ab630 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -93,6 +93,7 @@ join_search_hook_type join_search_hook = NULL;
static void set_base_rel_consider_startup(PlannerInfo *root);
static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_base_grouped_rels(PlannerInfo *root);
static void set_base_rel_pathlists(PlannerInfo *root);
static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
Index rti, RangeTblEntry *rte);
@@ -117,6 +118,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
Index rti, RangeTblEntry *rte);
static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
List *live_childrels,
List *all_child_pathkeys);
@@ -185,6 +187,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
*/
set_base_rel_sizes(root);
+ /*
+ * Build grouped base relations for each base rel if possible.
+ */
+ setup_base_grouped_rels(root);
+
/*
* We should now have size estimates for every actual table involved in
* the query, and we also know which if any have been deleted from the
@@ -326,6 +333,59 @@ set_base_rel_sizes(PlannerInfo *root)
}
}
+/*
+ * setup_base_grouped_rels
+ * For each "plain" base relation build a grouped base relation if eager
+ * aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_base_grouped_rels(PlannerInfo *root)
+{
+ Index rti;
+
+ /*
+ * If there are no aggregate expressions or grouping expressions, eager
+ * aggregation is not possible.
+ */
+ if (root->agg_clause_list == NIL ||
+ root->group_expr_list == NIL)
+ return;
+
+ /*
+ * Eager aggregation only makes sense if there are multiple base rels in
+ * the query.
+ */
+ if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+ return;
+
+ for (rti = 1; rti < root->simple_rel_array_size; rti++)
+ {
+ RelOptInfo *rel = root->simple_rel_array[rti];
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ /* there may be empty slots corresponding to non-baserel RTEs */
+ if (rel == NULL)
+ continue;
+
+ Assert(rel->relid == rti); /* sanity check on array */
+
+ /*
+ * Ignore RTEs that are not simple rels. Note that we need to consider
+ * "other rels" here.
+ */
+ if (!IS_SIMPLE_REL(rel))
+ continue;
+
+ rel_grouped = build_simple_grouped_rel(root, rel->relid, &agg_info);
+ if (rel_grouped)
+ {
+ /* Make the grouped relation available for joining. */
+ add_grouped_rel(root, rel_grouped, agg_info);
+ }
+ }
+}
+
/*
* set_base_rel_pathlists
* Finds all paths available for scanning each base-relation entry.
@@ -562,6 +622,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
/* Now find the cheapest of the paths for this rel */
set_cheapest(rel);
+ /*
+ * If a grouped relation for this rel exists, build partial aggregation
+ * paths for it.
+ *
+ * Note that this can only happen after we've called set_cheapest() for
+ * this base rel, because we need its cheapest paths.
+ */
+ set_grouped_rel_pathlist(root, rel);
+
#ifdef OPTIMIZER_DEBUG
pprint(rel);
#endif
@@ -1289,6 +1358,28 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
add_paths_to_append_rel(root, rel, live_childrels);
}
+/*
+ * set_grouped_rel_pathlist
+ * If a grouped relation for the given 'rel' exists, build partial
+ * aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ /* Add paths to the grouped base relation if one exists. */
+ rel_grouped = find_grouped_rel(root, rel->relids,
+ &agg_info);
+ if (rel_grouped)
+ {
+ generate_grouped_paths(root, rel_grouped, rel,
+ agg_info);
+ set_cheapest(rel_grouped);
+ }
+}
+
/*
* add_paths_to_append_rel
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index c6e2d417a8..b14f99a9ea 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,7 @@
#include <limits.h>
+#include "catalog/pg_constraint.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/appendinfo.h"
@@ -27,12 +28,15 @@
#include "optimizer/paths.h"
#include "optimizer/placeholder.h"
#include "optimizer/plancat.h"
+#include "optimizer/planner.h"
#include "optimizer/restrictinfo.h"
#include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
#include "parser/parse_relation.h"
#include "rewrite/rewriteManip.h"
#include "utils/hsearch.h"
#include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
/*
@@ -418,6 +422,103 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
return rel;
}
+/*
+ * build_simple_grouped_rel
+ * Construct a new RelOptInfo for a grouped base relation out of an existing
+ * non-grouped base relation.
+ *
+ * On success, the new RelOptInfo is returned and the corresponding RelAggInfo
+ * is stored in *agg_info_p.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, int relid,
+ RelAggInfo **agg_info_p)
+{
+ RelOptInfo *rel_plain;
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ /*
+ * We should have available aggregate expressions and grouping expressions,
+ * otherwise we cannot reach here.
+ */
+ Assert(root->agg_clause_list != NIL);
+ Assert(root->group_expr_list != NIL);
+
+ rel_plain = root->simple_rel_array[relid];
+ Assert(rel_plain != NULL);
+ Assert(IS_SIMPLE_REL(rel_plain));
+
+ /* nothing to do for dummy rel */
+ if (IS_DUMMY_REL(rel_plain))
+ return NULL;
+
+ /*
+ * Prepare the information we need to create grouped paths for this base
+ * relation.
+ */
+ agg_info = create_rel_agg_info(root, rel_plain);
+ if (agg_info == NULL)
+ return NULL;
+
+ /* build a grouped relation out of the plain relation */
+ rel_grouped = build_grouped_rel(root, rel_plain);
+ rel_grouped->reltarget = agg_info->target;
+ rel_grouped->rows = agg_info->grouped_rows;
+
+ /* return the RelAggInfo structure */
+ *agg_info_p = agg_info;
+
+ return rel_grouped;
+}
+
+/*
+ * build_grouped_rel
+ * Build a grouped relation by flat copying a plain relation and resetting
+ * the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel_plain)
+{
+ RelOptInfo *rel_grouped;
+
+ rel_grouped = makeNode(RelOptInfo);
+ memcpy(rel_grouped, rel_plain, sizeof(RelOptInfo));
+
+ /*
+ * clear path info
+ */
+ rel_grouped->pathlist = NIL;
+ rel_grouped->ppilist = NIL;
+ rel_grouped->partial_pathlist = NIL;
+ rel_grouped->cheapest_startup_path = NULL;
+ rel_grouped->cheapest_total_path = NULL;
+ rel_grouped->cheapest_unique_path = NULL;
+ rel_grouped->cheapest_parameterized_paths = NIL;
+
+ /*
+ * clear partition info
+ */
+ rel_grouped->part_scheme = NULL;
+ rel_grouped->nparts = -1;
+ rel_grouped->boundinfo = NULL;
+ rel_grouped->partbounds_merged = false;
+ rel_grouped->partition_qual = NIL;
+ rel_grouped->part_rels = NULL;
+ rel_grouped->live_parts = NULL;
+ rel_grouped->all_partrels = NULL;
+ rel_grouped->partexprs = NULL;
+ rel_grouped->nullable_partexprs = NULL;
+ rel_grouped->consider_partitionwise_join = false;
+
+ /*
+ * clear size estimates
+ */
+ rel_grouped->rows = 0;
+
+ return rel_grouped;
+}
+
/*
* find_base_rel
* Find a base or otherrel relation entry, which must already exist.
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index d973bff8ff..d4b4499db3 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -309,6 +309,10 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
extern void expand_planner_arrays(PlannerInfo *root, int add_size);
extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root, int relid,
+ RelAggInfo **agg_info_p);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
+ RelOptInfo *rel_plain);
extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
--
2.31.0
[application/octet-stream] v6-0007-Build-grouped-relations-out-of-join-relations.patch (25.5K, 9-v6-0007-Build-grouped-relations-out-of-join-relations.patch)
download | inline diff:
From 552892b0b78128392d2adb6bae2d367316f07885 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 13:33:09 +0800
Subject: [PATCH v6 7/9] Build grouped relations out of join relations
This commit builds grouped relations for each just-processed join
relation if possible, and generates aggregation paths for the grouped
join relations.
The changes made to make_join_rel() are relatively minor, with the
addition of a new function make_grouped_join_rel(), which finds or
creates a grouped relation for the just-processed joinrel, and generates
grouped paths by joining a grouped input relation with a non-grouped
input relation.
The other way to generate grouped paths is by adding sorted and hashed
partial aggregation paths on top of paths of the joinrel. This occurs
in standard_join_search(), after we've run set_cheapest() for the
joinrel. The reason for performing this step after set_cheapest() is
that we need to know the joinrel's cheapest paths (see
generate_grouped_paths()).
This patch also makes the grouped relation for the topmost join rel act
as the upper rel representing the result of partial aggregation, so that
we can add the final aggregation on top of that. Additionally, this
patch extends the functionality of eager aggregation to work with
partitionwise join and geqo.
This patch also makes eager aggregation work with outer joins. With
outer joins, the aggregate cannot be pushed down if any column
referenced by grouping expressions or aggregate functions is nullable by
an outer join above the relation to which we want to apply the partial
aggregation. Thanks to Tom's outer-join-aware-Var infrastructure, we
can easily identify such situations and subsequently refrain from
pushing down the aggregates.
Starting from this patch, you should be able to see plans with eager
aggregation.
---
src/backend/optimizer/geqo/geqo_eval.c | 84 ++++++++++++----
src/backend/optimizer/path/allpaths.c | 48 ++++++++++
src/backend/optimizer/path/joinrels.c | 122 ++++++++++++++++++++++++
src/backend/optimizer/plan/planner.c | 84 +++++++++++-----
src/backend/optimizer/util/appendinfo.c | 60 ++++++++++++
src/backend/optimizer/util/relnode.c | 2 -
src/include/nodes/pathnodes.h | 6 --
7 files changed, 355 insertions(+), 51 deletions(-)
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index 1141156899..278857d767 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -60,8 +60,12 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
MemoryContext oldcxt;
RelOptInfo *joinrel;
Cost fitness;
- int savelength;
- struct HTAB *savehash;
+ int savelength_join_rel;
+ struct HTAB *savehash_join_rel;
+ int savelength_grouped_rel;
+ struct HTAB *savehash_grouped_rel;
+ int savelength_grouped_info;
+ struct HTAB *savehash_grouped_info;
/*
* Create a private memory context that will hold all temp storage
@@ -78,25 +82,38 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
oldcxt = MemoryContextSwitchTo(mycontext);
/*
- * gimme_tree will add entries to root->join_rel_list, which may or may
- * not already contain some entries. The newly added entries will be
- * recycled by the MemoryContextDelete below, so we must ensure that the
- * list is restored to its former state before exiting. We can do this by
- * truncating the list to its original length. NOTE this assumes that any
- * added entries are appended at the end!
+ * gimme_tree will add entries to root->join_rel_list, root->agg_info_list
+ * and root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG], which may or may not
+ * already contain some entries. The newly added entries will be recycled
+ * by the MemoryContextDelete below, so we must ensure that each list of
+ * the RelInfoList structures is restored to its former state before
+ * exiting. We can do this by truncating each list to its original length.
+ * NOTE this assumes that any added entries are appended at the end!
*
- * We also must take care not to mess up the outer join_rel_list->hash, if
- * there is one. We can do this by just temporarily setting the link to
- * NULL. (If we are dealing with enough join rels, which we very likely
- * are, a new hash table will get built and used locally.)
+ * We also must take care not to mess up the outer hash tables of the
+ * RelInfoList structures, if any. We can do this by just temporarily
+ * setting each link to NULL. (If we are dealing with enough join rels,
+ * which we very likely are, new hash tables will get built and used
+ * locally.)
*
* join_rel_level[] shouldn't be in use, so just Assert it isn't.
*/
- savelength = list_length(root->join_rel_list->items);
- savehash = root->join_rel_list->hash;
+ savelength_join_rel = list_length(root->join_rel_list->items);
+ savehash_join_rel = root->join_rel_list->hash;
+
+ savelength_grouped_rel =
+ list_length(root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG].items);
+ savehash_grouped_rel =
+ root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG].hash;
+
+ savelength_grouped_info = list_length(root->agg_info_list->items);
+ savehash_grouped_info = root->agg_info_list->hash;
+
Assert(root->join_rel_level == NULL);
root->join_rel_list->hash = NULL;
+ root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG].hash = NULL;
+ root->agg_info_list->hash = NULL;
/* construct the best path for the given combination of relations */
joinrel = gimme_tree(root, tour, num_gene);
@@ -118,12 +135,22 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
fitness = DBL_MAX;
/*
- * Restore join_rel_list to its former state, and put back original
- * hashtable if any.
+ * Restore each of the list in join_rel_list, agg_info_list and
+ * upper_rels[UPPERREL_PARTIAL_GROUP_AGG] to its former state, and put back
+ * original hashtable if any.
*/
root->join_rel_list->items = list_truncate(root->join_rel_list->items,
- savelength);
- root->join_rel_list->hash = savehash;
+ savelength_join_rel);
+ root->join_rel_list->hash = savehash_join_rel;
+
+ root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG].items =
+ list_truncate(root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG].items,
+ savelength_grouped_rel);
+ root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG].hash = savehash_grouped_rel;
+
+ root->agg_info_list->items = list_truncate(root->agg_info_list->items,
+ savelength_grouped_info);
+ root->agg_info_list->hash = savehash_grouped_info;
/* release all the memory acquired within gimme_tree */
MemoryContextSwitchTo(oldcxt);
@@ -279,6 +306,27 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
/* Find and save the cheapest paths for this joinrel */
set_cheapest(joinrel);
+ /*
+ * Except for the topmost scan/join rel, consider generating
+ * partial aggregation paths for the grouped relation on top of the
+ * paths of this rel. After that, we're done creating paths for
+ * the grouped relation, so run set_cheapest().
+ */
+ if (!bms_equal(joinrel->relids, root->all_query_rels))
+ {
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ rel_grouped = find_grouped_rel(root, joinrel->relids,
+ &agg_info);
+ if (rel_grouped)
+ {
+ generate_grouped_paths(root, rel_grouped, joinrel,
+ agg_info);
+ set_cheapest(rel_grouped);
+ }
+ }
+
/* Absorb new clump into old */
old_clump->joinrel = joinrel;
old_clump->size += new_clump->size;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index ef699ab630..0e2c984442 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -3866,6 +3866,10 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
*
* After that, we're done creating paths for the joinrel, so run
* set_cheapest().
+ *
+ * In addition, we also run generate_grouped_paths() for the grouped
+ * relation of each just-processed joinrel, and run set_cheapest() for
+ * the grouped relation afterwards.
*/
foreach(lc, root->join_rel_level[lev])
{
@@ -3886,6 +3890,27 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
/* Find and save the cheapest paths for this rel */
set_cheapest(rel);
+ /*
+ * Except for the topmost scan/join rel, consider generating
+ * partial aggregation paths for the grouped relation on top of the
+ * paths of this rel. After that, we're done creating paths for
+ * the grouped relation, so run set_cheapest().
+ */
+ if (!bms_equal(rel->relids, root->all_query_rels))
+ {
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ rel_grouped = find_grouped_rel(root, rel->relids,
+ &agg_info);
+ if (rel_grouped)
+ {
+ generate_grouped_paths(root, rel_grouped, rel,
+ agg_info);
+ set_cheapest(rel_grouped);
+ }
+ }
+
#ifdef OPTIMIZER_DEBUG
pprint(rel);
#endif
@@ -4754,6 +4779,29 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
if (IS_DUMMY_REL(child_rel))
continue;
+ /*
+ * Except for the topmost scan/join rel, consider generating partial
+ * aggregation paths for the grouped relation on top of the paths of
+ * this partitioned child-join. After that, we're done creating paths
+ * for the grouped relation, so run set_cheapest().
+ */
+ if (!bms_equal(IS_OTHER_REL(rel) ?
+ rel->top_parent_relids : rel->relids,
+ root->all_query_rels))
+ {
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ rel_grouped = find_grouped_rel(root, child_rel->relids,
+ &agg_info);
+ if (rel_grouped)
+ {
+ generate_grouped_paths(root, rel_grouped, child_rel,
+ agg_info);
+ set_cheapest(rel_grouped);
+ }
+ }
+
#ifdef OPTIMIZER_DEBUG
pprint(child_rel);
#endif
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index f3a9412d18..ba1d15e85a 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -16,11 +16,13 @@
#include "miscadmin.h"
#include "optimizer/appendinfo.h"
+#include "optimizer/cost.h"
#include "optimizer/joininfo.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "partitioning/partbounds.h"
#include "utils/memutils.h"
+#include "utils/selfuncs.h"
static void make_rels_by_clause_joins(PlannerInfo *root,
@@ -35,6 +37,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
static bool restriction_is_constant_false(List *restrictlist,
RelOptInfo *joinrel,
bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+ RelOptInfo *rel2, RelOptInfo *joinrel,
+ SpecialJoinInfo *sjinfo, List *restrictlist);
static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
RelOptInfo *rel2, RelOptInfo *joinrel,
SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -771,6 +776,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
return joinrel;
}
+ /* Build a grouped join relation for 'joinrel' if possible. */
+ make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+ restrictlist);
+
/* Add paths to the join relation. */
populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
restrictlist);
@@ -882,6 +891,114 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
return input_relids;
}
+/*
+ * make_grouped_join_rel
+ * Build a grouped join relation out of 'joinrel' if eager aggregation is
+ * possible and the 'joinrel' can produce grouped paths.
+ *
+ * We also generate partial aggregation paths for the grouped relation by
+ * joining the grouped paths of 'rel1' to the plain paths of 'rel2', or by
+ * joining the grouped paths of 'rel2' to the plain paths of 'rel1'.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+ RelOptInfo *rel2, RelOptInfo *joinrel,
+ SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info = NULL;
+ RelOptInfo *rel1_grouped;
+ RelOptInfo *rel2_grouped;
+ bool rel1_empty;
+ bool rel2_empty;
+
+ /*
+ * If there are no aggregate expressions or grouping expressions, eager
+ * aggregation is not possible.
+ */
+ if (root->agg_clause_list == NIL ||
+ root->group_expr_list == NIL)
+ return;
+
+ /*
+ * See if we already have a grouped joinrel for this joinrel.
+ */
+ rel_grouped = find_grouped_rel(root, joinrel->relids, &agg_info);
+
+ /*
+ * Construct a new RelOptInfo for the grouped join relation if there is no
+ * existing one.
+ */
+ if (rel_grouped == NULL)
+ {
+ /*
+ * Prepare the information we need to create grouped paths for this
+ * join relation.
+ */
+ agg_info = create_rel_agg_info(root, joinrel);
+ if (agg_info == NULL)
+ return;
+
+ /* build a grouped relation out of the plain relation */
+ rel_grouped = build_grouped_rel(root, joinrel);
+ rel_grouped->reltarget = agg_info->target;
+ rel_grouped->rows = agg_info->grouped_rows;
+
+ /*
+ * Make the grouped relation available for further joining or for
+ * acting as the upper rel representing the result of partial
+ * aggregation.
+ */
+ add_grouped_rel(root, rel_grouped, agg_info);
+ }
+
+ Assert(agg_info != NULL);
+
+ /*
+ * If we've already proven this grouped join relation is empty, we needn't
+ * consider any more paths for it.
+ */
+ if (IS_DUMMY_REL(rel_grouped))
+ return;
+
+ /* retrieve the grouped relations for the two input rels */
+ rel1_grouped = find_grouped_rel(root, rel1->relids, NULL);
+ rel2_grouped = find_grouped_rel(root, rel2->relids, NULL);
+
+ rel1_empty = (rel1_grouped == NULL || IS_DUMMY_REL(rel1_grouped));
+ rel2_empty = (rel2_grouped == NULL || IS_DUMMY_REL(rel2_grouped));
+
+ /* Nothing to do if there's no grouped relation. */
+ if (rel1_empty && rel2_empty)
+ return;
+
+ /*
+ * Join of two grouped relations is currently not supported. In such a
+ * case, grouping of one side would change the occurrence of the other
+ * side's aggregate transient states on the input of the final aggregation.
+ * This can be handled by adjusting the transient states, but it's not
+ * worth the effort for now.
+ */
+ if (!rel1_empty && !rel2_empty)
+ return;
+
+ /* generate partial aggregation paths for the grouped relation */
+ if (!rel1_empty)
+ {
+ set_joinrel_size_estimates(root, rel_grouped, rel1_grouped, rel2,
+ sjinfo, restrictlist);
+ populate_joinrel_with_paths(root, rel1_grouped, rel2, rel_grouped,
+ sjinfo, restrictlist);
+ }
+ else if (!rel2_empty)
+ {
+ set_joinrel_size_estimates(root, rel_grouped, rel1, rel2_grouped,
+ sjinfo, restrictlist);
+ populate_joinrel_with_paths(root, rel1, rel2_grouped, rel_grouped,
+ sjinfo, restrictlist);
+ }
+}
+
/*
* populate_joinrel_with_paths
* Add paths to the given joinrel for given pair of joining relations. The
@@ -1671,6 +1788,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
adjust_child_relids(joinrel->relids,
nappinfos, appinfos)));
+ /* Build a grouped join relation for 'child_joinrel' if possible */
+ make_grouped_join_rel(root, child_rel1, child_rel2,
+ child_joinrel, child_sjinfo,
+ child_restrictlist);
+
/* And make paths for the child join */
populate_joinrel_with_paths(root, child_rel1, child_rel2,
child_joinrel, child_sjinfo,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 5320da51a0..4a6386a09d 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -225,7 +225,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
RelOptInfo *partially_grouped_rel,
const AggClauseCosts *agg_costs,
grouping_sets_data *gd,
- double dNumGroups,
GroupPathExtraData *extra);
static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
RelOptInfo *grouped_rel,
@@ -3913,9 +3912,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
GroupPathExtraData *extra,
RelOptInfo **partially_grouped_rel_p)
{
- Path *cheapest_path = input_rel->cheapest_total_path;
RelOptInfo *partially_grouped_rel = NULL;
- double dNumGroups;
PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
/*
@@ -3996,23 +3993,21 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
/* Gather any partially grouped partial paths. */
if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
- {
gather_grouping_paths(root, partially_grouped_rel);
- set_cheapest(partially_grouped_rel);
- }
/*
- * Estimate number of groups.
+ * Now choose the best path(s) for partially_grouped_rel.
+ *
+ * Note that the non-partial paths can come either from the Gather above or
+ * from eager aggregation.
*/
- dNumGroups = get_number_of_groups(root,
- cheapest_path->rows,
- gd,
- extra->targetList);
+ if (partially_grouped_rel && partially_grouped_rel->pathlist)
+ set_cheapest(partially_grouped_rel);
/* Build final grouping paths */
add_paths_to_grouping_rel(root, input_rel, grouped_rel,
partially_grouped_rel, agg_costs, gd,
- dNumGroups, extra);
+ extra);
/* Give a helpful error if we failed to find any implementation */
if (grouped_rel->pathlist == NIL)
@@ -6843,16 +6838,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
RelOptInfo *grouped_rel,
RelOptInfo *partially_grouped_rel,
const AggClauseCosts *agg_costs,
- grouping_sets_data *gd, double dNumGroups,
+ grouping_sets_data *gd,
GroupPathExtraData *extra)
{
Query *parse = root->parse;
Path *cheapest_path = input_rel->cheapest_total_path;
+ Path *cheapest_partially_grouped_path = NULL;
ListCell *lc;
bool can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
bool can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
List *havingQual = (List *) extra->havingQual;
AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+ double dNumGroups = 0;
+ double dNumFinalGroups = 0;
+
+ /*
+ * Estimate number of groups for non-split aggregation.
+ */
+ dNumGroups = get_number_of_groups(root,
+ cheapest_path->rows,
+ gd,
+ extra->targetList);
+
+ if (partially_grouped_rel && partially_grouped_rel->pathlist)
+ {
+ cheapest_partially_grouped_path =
+ partially_grouped_rel->cheapest_total_path;
+
+ /*
+ * Estimate number of groups for final phase of partial aggregation.
+ */
+ dNumFinalGroups =
+ get_number_of_groups(root,
+ cheapest_partially_grouped_path->rows,
+ gd,
+ extra->targetList);
+ }
if (can_sort)
{
@@ -6964,7 +6985,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
path = make_ordered_path(root,
grouped_rel,
path,
- partially_grouped_rel->cheapest_total_path,
+ cheapest_partially_grouped_path,
info->pathkeys);
if (path == NULL)
@@ -6981,7 +7002,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
info->clauses,
havingQual,
agg_final_costs,
- dNumGroups));
+ dNumFinalGroups));
else
add_path(grouped_rel, (Path *)
create_group_path(root,
@@ -6989,7 +7010,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
path,
info->clauses,
havingQual,
- dNumGroups));
+ dNumFinalGroups));
}
}
@@ -7031,19 +7052,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
*/
if (partially_grouped_rel && partially_grouped_rel->pathlist)
{
- Path *path = partially_grouped_rel->cheapest_total_path;
-
add_path(grouped_rel, (Path *)
create_agg_path(root,
grouped_rel,
- path,
+ cheapest_partially_grouped_path,
grouped_rel->reltarget,
AGG_HASHED,
AGGSPLIT_FINAL_DESERIAL,
root->processed_groupClause,
havingQual,
agg_final_costs,
- dNumGroups));
+ dNumFinalGroups));
}
}
@@ -7093,6 +7112,13 @@ create_partial_grouping_paths(PlannerInfo *root,
bool can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
bool can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
+ /*
+ * The partially_grouped_rel could have been already created due to eager
+ * aggregation.
+ */
+ partially_grouped_rel = find_grouped_rel(root, input_rel->relids, NULL);
+ Assert(enable_eager_aggregate || partially_grouped_rel == NULL);
+
/*
* Consider whether we should generate partially aggregated non-partial
* paths. We can only do this if we have a non-partial path, and only if
@@ -7116,19 +7142,27 @@ create_partial_grouping_paths(PlannerInfo *root,
* If we can't partially aggregate partial paths, and we can't partially
* aggregate non-partial paths, then don't bother creating the new
* RelOptInfo at all, unless the caller specified force_rel_creation.
+ *
+ * Note that the partially_grouped_rel could have been already created and
+ * populated with appropriate paths by eager aggregation.
*/
if (cheapest_total_path == NULL &&
cheapest_partial_path == NULL &&
+ (partially_grouped_rel == NULL ||
+ partially_grouped_rel->pathlist == NIL) &&
!force_rel_creation)
return NULL;
/*
* Build a new upper relation to represent the result of partially
- * aggregating the rows from the input relation.
- */
- partially_grouped_rel = fetch_upper_rel(root,
- UPPERREL_PARTIAL_GROUP_AGG,
- grouped_rel->relids);
+ * aggregating the rows from the input relation. The relation may already
+ * exist due to eager aggregation, in which case we don't need to create
+ * it.
+ */
+ if (partially_grouped_rel == NULL)
+ partially_grouped_rel = fetch_upper_rel(root,
+ UPPERREL_PARTIAL_GROUP_AGG,
+ grouped_rel->relids);
partially_grouped_rel->consider_parallel =
grouped_rel->consider_parallel;
partially_grouped_rel->reloptkind = grouped_rel->reloptkind;
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index 6ba4eba224..08de77d439 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -495,6 +495,66 @@ adjust_appendrel_attrs_mutator(Node *node,
return (Node *) newinfo;
}
+ /*
+ * We have to process RelAggInfo nodes specially.
+ */
+ if (IsA(node, RelAggInfo))
+ {
+ RelAggInfo *oldinfo = (RelAggInfo *) node;
+ RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+ /* Copy all flat-copiable fields */
+ memcpy(newinfo, oldinfo, sizeof(RelAggInfo));
+
+ newinfo->relids = adjust_child_relids(oldinfo->relids,
+ context->nappinfos,
+ context->appinfos);
+
+ newinfo->target = (PathTarget *)
+ adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+ context);
+
+ newinfo->agg_input = (PathTarget *)
+ adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+ context);
+
+ newinfo->group_clauses = (List *)
+ adjust_appendrel_attrs_mutator((Node *) oldinfo->group_clauses,
+ context);
+
+ newinfo->group_exprs = (List *)
+ adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+ context);
+
+ return (Node *) newinfo;
+ }
+
+ /*
+ * We have to process PathTarget nodes specially.
+ */
+ if (IsA(node, PathTarget))
+ {
+ PathTarget *oldtarget = (PathTarget *) node;
+ PathTarget *newtarget = makeNode(PathTarget);
+
+ /* Copy all flat-copiable fields */
+ memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+ if (oldtarget->sortgrouprefs)
+ {
+ Size nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+ newtarget->exprs = (List *)
+ adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+ context);
+
+ newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+ memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+ }
+
+ return (Node *) newtarget;
+ }
+
/*
* NOTE: we do not need to recurse into sublinks, because they should
* already have been converted to subplans before we see them.
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index b14f99a9ea..6087a14a76 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -2833,8 +2833,6 @@ create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
add_column_to_pathtarget(target, (Expr *) aggref, 0);
-
- result->agg_exprs = lappend(result->agg_exprs, aggref);
}
/*
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index ac639abe31..2c93dc3241 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1116,9 +1116,6 @@ typedef struct RelOptInfo
* "group_clauses", "group_exprs" and "group_pathkeys" are lists of
* SortGroupClause, the corresponding grouping expressions and PathKey
* respectively.
- *
- * "agg_exprs" is a list of Aggref nodes for the aggregation of the relation's
- * paths.
*/
typedef struct RelAggInfo
{
@@ -1154,9 +1151,6 @@ typedef struct RelAggInfo
List *group_exprs;
/* a list of PathKeys */
List *group_pathkeys;
-
- /* a list of Aggref nodes */
- List *agg_exprs;
} RelAggInfo;
/*
--
2.31.0
[application/octet-stream] v6-0008-Add-test-cases.patch (71.5K, 10-v6-0008-Add-test-cases.patch)
download | inline diff:
From 44cbdd2b6fadf10c4f6e50665038c693e4d59977 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 13:41:22 +0800
Subject: [PATCH v6 8/9] Add test cases
---
src/test/regress/expected/eager_aggregate.out | 1293 +++++++++++++++++
src/test/regress/parallel_schedule | 2 +-
src/test/regress/sql/eager_aggregate.sql | 192 +++
3 files changed, 1486 insertions(+), 1 deletion(-)
create mode 100644 src/test/regress/expected/eager_aggregate.out
create mode 100644 src/test/regress/sql/eager_aggregate.sql
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 0000000000..7a28287522
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1293 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t1.a, avg(t2.c)
+ Group Key: t1.a
+ -> Sort
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Sort Key: t1.a
+ -> Hash Join
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Hash Cond: (t1.b = t2.b)
+ -> Seq Scan on public.eager_agg_t1 t1
+ Output: t1.a, t1.b, t1.c
+ -> Hash
+ Output: t2.b, (PARTIAL avg(t2.c))
+ -> Partial HashAggregate
+ Output: t2.b, PARTIAL avg(t2.c)
+ Group Key: t2.b
+ -> Seq Scan on public.eager_agg_t2 t2
+ Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t1.a, avg(t2.c)
+ Group Key: t1.a
+ -> Sort
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Sort Key: t1.a
+ -> Hash Join
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Hash Cond: (t1.b = t2.b)
+ -> Seq Scan on public.eager_agg_t1 t1
+ Output: t1.a, t1.b, t1.c
+ -> Hash
+ Output: t2.b, (PARTIAL avg(t2.c))
+ -> Partial GroupAggregate
+ Output: t2.b, PARTIAL avg(t2.c)
+ Group Key: t2.b
+ -> Sort
+ Output: t2.c, t2.b
+ Sort Key: t2.b
+ -> Seq Scan on public.eager_agg_t2 t2
+ Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t1.a, avg((t2.c + t3.c))
+ Group Key: t1.a
+ -> Sort
+ Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+ Sort Key: t1.a
+ -> Hash Join
+ Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+ Hash Cond: (t1.b = t2.b)
+ -> Seq Scan on public.eager_agg_t1 t1
+ Output: t1.a, t1.b, t1.c
+ -> Hash
+ Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+ -> Partial HashAggregate
+ Output: t2.b, PARTIAL avg((t2.c + t3.c))
+ Group Key: t2.b
+ -> Hash Join
+ Output: t2.c, t3.c, t2.b
+ Hash Cond: (t3.a = t2.a)
+ -> Seq Scan on public.eager_agg_t3 t3
+ Output: t3.a, t3.b, t3.c
+ -> Hash
+ Output: t2.c, t2.b, t2.a
+ -> Seq Scan on public.eager_agg_t2 t2
+ Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t1.a, avg((t2.c + t3.c))
+ Group Key: t1.a
+ -> Sort
+ Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+ Sort Key: t1.a
+ -> Hash Join
+ Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+ Hash Cond: (t1.b = t2.b)
+ -> Seq Scan on public.eager_agg_t1 t1
+ Output: t1.a, t1.b, t1.c
+ -> Hash
+ Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+ -> Partial GroupAggregate
+ Output: t2.b, PARTIAL avg((t2.c + t3.c))
+ Group Key: t2.b
+ -> Sort
+ Output: t2.c, t3.c, t2.b
+ Sort Key: t2.b
+ -> Hash Join
+ Output: t2.c, t3.c, t2.b
+ Hash Cond: (t3.a = t2.a)
+ -> Seq Scan on public.eager_agg_t3 t3
+ Output: t3.a, t3.b, t3.c
+ -> Hash
+ Output: t2.c, t2.b, t2.a
+ -> Seq Scan on public.eager_agg_t2 t2
+ Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t1.a, avg(t2.c)
+ Group Key: t1.a
+ -> Sort
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Sort Key: t1.a
+ -> Hash Right Join
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Hash Cond: (t1.b = t2.b)
+ -> Seq Scan on public.eager_agg_t1 t1
+ Output: t1.a, t1.b, t1.c
+ -> Hash
+ Output: t2.b, (PARTIAL avg(t2.c))
+ -> Partial HashAggregate
+ Output: t2.b, PARTIAL avg(t2.c)
+ Group Key: t2.b
+ -> Seq Scan on public.eager_agg_t2 t2
+ Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+ | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ QUERY PLAN
+------------------------------------------------------------
+ Sort
+ Output: t2.b, (avg(t2.c))
+ Sort Key: t2.b
+ -> HashAggregate
+ Output: t2.b, avg(t2.c)
+ Group Key: t2.b
+ -> Hash Right Join
+ Output: t2.b, t2.c
+ Hash Cond: (t2.b = t1.b)
+ -> Seq Scan on public.eager_agg_t2 t2
+ Output: t2.a, t2.b, t2.c
+ -> Hash
+ Output: t1.b
+ -> Seq Scan on public.eager_agg_t1 t1
+ Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ b | avg
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+ |
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t1.a, avg(t2.c)
+ Group Key: t1.a
+ -> Sort
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Sort Key: t1.a
+ -> Gather
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Workers Planned: 2
+ -> Parallel Hash Join
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Hash Cond: (t1.b = t2.b)
+ -> Parallel Seq Scan on public.eager_agg_t1 t1
+ Output: t1.a, t1.b, t1.c
+ -> Parallel Hash
+ Output: t2.b, (PARTIAL avg(t2.c))
+ -> Partial HashAggregate
+ Output: t2.b, PARTIAL avg(t2.c)
+ Group Key: t2.b
+ -> Parallel Seq Scan on public.eager_agg_t2 t2
+ Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (20) TO (30);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (20) TO (30);
+INSERT INTO eager_agg_tab1 SELECT i % 30, i % 20 FROM generate_series(0, 299, 2) i;
+INSERT INTO eager_agg_tab2 SELECT i % 20, i % 30 FROM generate_series(0, 299, 3) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ QUERY PLAN
+---------------------------------------------------------------------------------------
+ Sort
+ Output: t1.x, (sum(t1.y)), (count(*))
+ Sort Key: t1.x
+ -> Append
+ -> Finalize HashAggregate
+ Output: t1.x, sum(t1.y), count(*)
+ Group Key: t1.x
+ -> Hash Join
+ Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+ Hash Cond: (t2.y = t1.x)
+ -> Seq Scan on public.eager_agg_tab2_p1 t2
+ Output: t2.y
+ -> Hash
+ Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+ Group Key: t1.x
+ -> Seq Scan on public.eager_agg_tab1_p1 t1
+ Output: t1.x, t1.y
+ -> Finalize HashAggregate
+ Output: t1_1.x, sum(t1_1.y), count(*)
+ Group Key: t1_1.x
+ -> Hash Join
+ Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+ Hash Cond: (t2_1.y = t1_1.x)
+ -> Seq Scan on public.eager_agg_tab2_p2 t2_1
+ Output: t2_1.y
+ -> Hash
+ Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+ Group Key: t1_1.x
+ -> Seq Scan on public.eager_agg_tab1_p2 t1_1
+ Output: t1_1.x, t1_1.y
+ -> Finalize HashAggregate
+ Output: t1_2.x, sum(t1_2.y), count(*)
+ Group Key: t1_2.x
+ -> Hash Join
+ Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+ Hash Cond: (t2_2.y = t1_2.x)
+ -> Seq Scan on public.eager_agg_tab2_p3 t2_2
+ Output: t2_2.y
+ -> Hash
+ Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+ Group Key: t1_2.x
+ -> Seq Scan on public.eager_agg_tab1_p3 t1_2
+ Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x | sum | count
+----+------+-------
+ 0 | 500 | 100
+ 6 | 1100 | 100
+ 12 | 700 | 100
+ 18 | 1300 | 100
+ 24 | 900 | 100
+(5 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ QUERY PLAN
+---------------------------------------------------------------------------------------
+ Sort
+ Output: t2.y, (sum(t1.y)), (count(*))
+ Sort Key: t2.y
+ -> Append
+ -> Finalize HashAggregate
+ Output: t2.y, sum(t1.y), count(*)
+ Group Key: t2.y
+ -> Hash Join
+ Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+ Hash Cond: (t2.y = t1.x)
+ -> Seq Scan on public.eager_agg_tab2_p1 t2
+ Output: t2.y
+ -> Hash
+ Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+ Group Key: t1.x
+ -> Seq Scan on public.eager_agg_tab1_p1 t1
+ Output: t1.y, t1.x
+ -> Finalize HashAggregate
+ Output: t2_1.y, sum(t1_1.y), count(*)
+ Group Key: t2_1.y
+ -> Hash Join
+ Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+ Hash Cond: (t2_1.y = t1_1.x)
+ -> Seq Scan on public.eager_agg_tab2_p2 t2_1
+ Output: t2_1.y
+ -> Hash
+ Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+ Group Key: t1_1.x
+ -> Seq Scan on public.eager_agg_tab1_p2 t1_1
+ Output: t1_1.y, t1_1.x
+ -> Finalize HashAggregate
+ Output: t2_2.y, sum(t1_2.y), count(*)
+ Group Key: t2_2.y
+ -> Hash Join
+ Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+ Hash Cond: (t2_2.y = t1_2.x)
+ -> Seq Scan on public.eager_agg_tab2_p3 t2_2
+ Output: t2_2.y
+ -> Hash
+ Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+ Group Key: t1_2.x
+ -> Seq Scan on public.eager_agg_tab1_p3 t1_2
+ Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ y | sum | count
+----+------+-------
+ 0 | 500 | 100
+ 6 | 1100 | 100
+ 12 | 700 | 100
+ 18 | 1300 | 100
+ 24 | 900 | 100
+(5 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------
+ Sort
+ Output: t2.x, (sum(t1.x)), (count(*))
+ Sort Key: t2.x
+ -> Finalize HashAggregate
+ Output: t2.x, sum(t1.x), count(*)
+ Group Key: t2.x
+ Filter: (avg(t1.x) > '10'::numeric)
+ -> Append
+ -> Hash Join
+ Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+ Hash Cond: (t2_1.y = t1_1.x)
+ -> Seq Scan on public.eager_agg_tab2_p1 t2_1
+ Output: t2_1.x, t2_1.y
+ -> Hash
+ Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+ -> Partial HashAggregate
+ Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+ Group Key: t1_1.x
+ -> Seq Scan on public.eager_agg_tab1_p1 t1_1
+ Output: t1_1.x
+ -> Hash Join
+ Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+ Hash Cond: (t2_2.y = t1_2.x)
+ -> Seq Scan on public.eager_agg_tab2_p2 t2_2
+ Output: t2_2.x, t2_2.y
+ -> Hash
+ Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+ -> Partial HashAggregate
+ Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+ Group Key: t1_2.x
+ -> Seq Scan on public.eager_agg_tab1_p2 t1_2
+ Output: t1_2.x
+ -> Hash Join
+ Output: t2_3.x, (PARTIAL sum(t1_3.x)), (PARTIAL count(*)), (PARTIAL avg(t1_3.x))
+ Hash Cond: (t2_3.y = t1_3.x)
+ -> Seq Scan on public.eager_agg_tab2_p3 t2_3
+ Output: t2_3.x, t2_3.y
+ -> Hash
+ Output: t1_3.x, (PARTIAL sum(t1_3.x)), (PARTIAL count(*)), (PARTIAL avg(t1_3.x))
+ -> Partial HashAggregate
+ Output: t1_3.x, PARTIAL sum(t1_3.x), PARTIAL count(*), PARTIAL avg(t1_3.x)
+ Group Key: t1_3.x
+ -> Seq Scan on public.eager_agg_tab1_p3 t1_3
+ Output: t1_3.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+ x | sum | count
+----+------+-------
+ 2 | 600 | 50
+ 4 | 1200 | 50
+ 8 | 900 | 50
+ 12 | 600 | 50
+ 14 | 1200 | 50
+ 18 | 900 | 50
+(6 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ QUERY PLAN
+-------------------------------------------------------------------------------------------
+ Sort
+ Output: t1.x, (sum((t2.y + t3.y)))
+ Sort Key: t1.x
+ -> Append
+ -> Finalize HashAggregate
+ Output: t1.x, sum((t2.y + t3.y))
+ Group Key: t1.x
+ -> Hash Join
+ Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+ Hash Cond: (t1.x = t2.x)
+ -> Seq Scan on public.eager_agg_tab1_p1 t1
+ Output: t1.x
+ -> Hash
+ Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+ -> Partial HashAggregate
+ Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+ Group Key: t2.x
+ -> Hash Join
+ Output: t2.y, t2.x, t3.y, t3.x
+ Hash Cond: (t2.x = t3.x)
+ -> Seq Scan on public.eager_agg_tab1_p1 t2
+ Output: t2.y, t2.x
+ -> Hash
+ Output: t3.y, t3.x
+ -> Seq Scan on public.eager_agg_tab1_p1 t3
+ Output: t3.y, t3.x
+ -> Finalize HashAggregate
+ Output: t1_1.x, sum((t2_1.y + t3_1.y))
+ Group Key: t1_1.x
+ -> Hash Join
+ Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+ Hash Cond: (t1_1.x = t2_1.x)
+ -> Seq Scan on public.eager_agg_tab1_p2 t1_1
+ Output: t1_1.x
+ -> Hash
+ Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+ -> Partial HashAggregate
+ Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+ Group Key: t2_1.x
+ -> Hash Join
+ Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+ Hash Cond: (t2_1.x = t3_1.x)
+ -> Seq Scan on public.eager_agg_tab1_p2 t2_1
+ Output: t2_1.y, t2_1.x
+ -> Hash
+ Output: t3_1.y, t3_1.x
+ -> Seq Scan on public.eager_agg_tab1_p2 t3_1
+ Output: t3_1.y, t3_1.x
+ -> Finalize HashAggregate
+ Output: t1_2.x, sum((t2_2.y + t3_2.y))
+ Group Key: t1_2.x
+ -> Hash Join
+ Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+ Hash Cond: (t1_2.x = t2_2.x)
+ -> Seq Scan on public.eager_agg_tab1_p3 t1_2
+ Output: t1_2.x
+ -> Hash
+ Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+ -> Partial HashAggregate
+ Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+ Group Key: t2_2.x
+ -> Hash Join
+ Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+ Hash Cond: (t2_2.x = t3_2.x)
+ -> Seq Scan on public.eager_agg_tab1_p3 t2_2
+ Output: t2_2.y, t2_2.x
+ -> Hash
+ Output: t3_2.y, t3_2.x
+ -> Seq Scan on public.eager_agg_tab1_p3 t3_2
+ Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x | sum
+----+-------
+ 0 | 10000
+ 2 | 14000
+ 4 | 18000
+ 6 | 22000
+ 8 | 26000
+ 10 | 10000
+ 12 | 14000
+ 14 | 18000
+ 16 | 22000
+ 18 | 26000
+ 20 | 10000
+ 22 | 14000
+ 24 | 18000
+ 26 | 22000
+ 28 | 26000
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ QUERY PLAN
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t3.y, sum((t2.y + t3.y))
+ Group Key: t3.y
+ -> Sort
+ Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+ Sort Key: t3.y
+ -> Append
+ -> Hash Join
+ Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+ Hash Cond: (t2_1.x = t1_1.x)
+ -> Partial GroupAggregate
+ Output: t3_1.y, t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+ Group Key: t3_1.y, t2_1.x, t3_1.x
+ -> Sort
+ Output: t2_1.y, t3_1.y, t2_1.x, t3_1.x
+ Sort Key: t3_1.y, t2_1.x
+ -> Hash Join
+ Output: t2_1.y, t3_1.y, t2_1.x, t3_1.x
+ Hash Cond: (t2_1.x = t3_1.x)
+ -> Seq Scan on public.eager_agg_tab1_p1 t2_1
+ Output: t2_1.y, t2_1.x
+ -> Hash
+ Output: t3_1.y, t3_1.x
+ -> Seq Scan on public.eager_agg_tab1_p1 t3_1
+ Output: t3_1.y, t3_1.x
+ -> Hash
+ Output: t1_1.x
+ -> Seq Scan on public.eager_agg_tab1_p1 t1_1
+ Output: t1_1.x
+ -> Hash Join
+ Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+ Hash Cond: (t2_2.x = t1_2.x)
+ -> Partial GroupAggregate
+ Output: t3_2.y, t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+ Group Key: t3_2.y, t2_2.x, t3_2.x
+ -> Sort
+ Output: t2_2.y, t3_2.y, t2_2.x, t3_2.x
+ Sort Key: t3_2.y, t2_2.x
+ -> Hash Join
+ Output: t2_2.y, t3_2.y, t2_2.x, t3_2.x
+ Hash Cond: (t2_2.x = t3_2.x)
+ -> Seq Scan on public.eager_agg_tab1_p2 t2_2
+ Output: t2_2.y, t2_2.x
+ -> Hash
+ Output: t3_2.y, t3_2.x
+ -> Seq Scan on public.eager_agg_tab1_p2 t3_2
+ Output: t3_2.y, t3_2.x
+ -> Hash
+ Output: t1_2.x
+ -> Seq Scan on public.eager_agg_tab1_p2 t1_2
+ Output: t1_2.x
+ -> Hash Join
+ Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y)))
+ Hash Cond: (t2_3.x = t1_3.x)
+ -> Partial GroupAggregate
+ Output: t3_3.y, t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y))
+ Group Key: t3_3.y, t2_3.x, t3_3.x
+ -> Sort
+ Output: t2_3.y, t3_3.y, t2_3.x, t3_3.x
+ Sort Key: t3_3.y, t2_3.x
+ -> Hash Join
+ Output: t2_3.y, t3_3.y, t2_3.x, t3_3.x
+ Hash Cond: (t2_3.x = t3_3.x)
+ -> Seq Scan on public.eager_agg_tab1_p3 t2_3
+ Output: t2_3.y, t2_3.x
+ -> Hash
+ Output: t3_3.y, t3_3.x
+ -> Seq Scan on public.eager_agg_tab1_p3 t3_3
+ Output: t3_3.y, t3_3.x
+ -> Hash
+ Output: t1_3.x
+ -> Seq Scan on public.eager_agg_tab1_p3 t1_3
+ Output: t1_3.x
+(73 rows)
+
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y | sum
+----+-------
+ 0 | 7500
+ 2 | 13500
+ 4 | 19500
+ 6 | 25500
+ 8 | 31500
+ 10 | 22500
+ 12 | 28500
+ 14 | 34500
+ 16 | 40500
+ 18 | 46500
+(10 rows)
+
+RESET enable_hashagg;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ QUERY PLAN
+---------------------------------------------------------------------------------------
+ Sort
+ Output: t1.x, (sum(t2.y)), (count(*))
+ Sort Key: t1.x
+ -> Append
+ -> Finalize HashAggregate
+ Output: t1.x, sum(t2.y), count(*)
+ Group Key: t1.x
+ -> Hash Join
+ Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+ Hash Cond: (t1.x = t2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t1
+ Output: t1.x
+ -> Hash
+ Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+ Group Key: t2.x
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t2
+ Output: t2.y, t2.x
+ -> Finalize HashAggregate
+ Output: t1_1.x, sum(t2_1.y), count(*)
+ Group Key: t1_1.x
+ -> Hash Join
+ Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+ Hash Cond: (t1_1.x = t2_1.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+ Output: t1_1.x
+ -> Hash
+ Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+ Group Key: t2_1.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+ Output: t2_1.y, t2_1.x
+ -> Finalize HashAggregate
+ Output: t1_2.x, sum(t2_2.y), count(*)
+ Group Key: t1_2.x
+ -> Hash Join
+ Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+ Hash Cond: (t1_2.x = t2_2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+ Output: t1_2.x
+ -> Hash
+ Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+ Group Key: t2_2.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+ Output: t2_2.y, t2_2.x
+ -> Finalize HashAggregate
+ Output: t1_3.x, sum(t2_3.y), count(*)
+ Group Key: t1_3.x
+ -> Hash Join
+ Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+ Hash Cond: (t1_3.x = t2_3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+ Output: t1_3.x
+ -> Hash
+ Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+ Group Key: t2_3.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+ Output: t2_3.y, t2_3.x
+ -> Finalize HashAggregate
+ Output: t1_4.x, sum(t2_4.y), count(*)
+ Group Key: t1_4.x
+ -> Hash Join
+ Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+ Hash Cond: (t1_4.x = t2_4.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+ Output: t1_4.x
+ -> Hash
+ Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+ Group Key: t2_4.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+ Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x | sum | count
+----+-------+-------
+ 0 | 0 | 1089
+ 1 | 1156 | 1156
+ 2 | 2312 | 1156
+ 3 | 3468 | 1156
+ 4 | 4624 | 1156
+ 5 | 5780 | 1156
+ 6 | 6936 | 1156
+ 7 | 8092 | 1156
+ 8 | 9248 | 1156
+ 9 | 10404 | 1156
+ 10 | 11560 | 1156
+ 11 | 11979 | 1089
+ 12 | 13068 | 1089
+ 13 | 14157 | 1089
+ 14 | 15246 | 1089
+ 15 | 16335 | 1089
+ 16 | 17424 | 1089
+ 17 | 18513 | 1089
+ 18 | 19602 | 1089
+ 19 | 20691 | 1089
+ 20 | 21780 | 1089
+ 21 | 22869 | 1089
+ 22 | 23958 | 1089
+ 23 | 25047 | 1089
+ 24 | 26136 | 1089
+ 25 | 27225 | 1089
+ 26 | 28314 | 1089
+ 27 | 29403 | 1089
+ 28 | 30492 | 1089
+ 29 | 31581 | 1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ QUERY PLAN
+---------------------------------------------------------------------------------------
+ Sort
+ Output: t1.y, (sum(t2.y)), (count(*))
+ Sort Key: t1.y
+ -> Finalize HashAggregate
+ Output: t1.y, sum(t2.y), count(*)
+ Group Key: t1.y
+ -> Append
+ -> Hash Join
+ Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+ Hash Cond: (t1_1.x = t2_1.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t1_1
+ Output: t1_1.y, t1_1.x
+ -> Hash
+ Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+ Group Key: t2_1.x
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t2_1
+ Output: t2_1.y, t2_1.x
+ -> Hash Join
+ Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+ Hash Cond: (t1_2.x = t2_2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_2
+ Output: t1_2.y, t1_2.x
+ -> Hash
+ Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+ Group Key: t2_2.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_2
+ Output: t2_2.y, t2_2.x
+ -> Hash Join
+ Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+ Hash Cond: (t1_3.x = t2_3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_3
+ Output: t1_3.y, t1_3.x
+ -> Hash
+ Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+ Group Key: t2_3.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_3
+ Output: t2_3.y, t2_3.x
+ -> Hash Join
+ Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+ Hash Cond: (t1_4.x = t2_4.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_4
+ Output: t1_4.y, t1_4.x
+ -> Hash
+ Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+ Group Key: t2_4.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_4
+ Output: t2_4.y, t2_4.x
+ -> Hash Join
+ Output: t1_5.y, (PARTIAL sum(t2_5.y)), (PARTIAL count(*))
+ Hash Cond: (t1_5.x = t2_5.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_5
+ Output: t1_5.y, t1_5.x
+ -> Hash
+ Output: t2_5.x, (PARTIAL sum(t2_5.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_5.x, PARTIAL sum(t2_5.y), PARTIAL count(*)
+ Group Key: t2_5.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_5
+ Output: t2_5.y, t2_5.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ y | sum | count
+----+-------+-------
+ 0 | 0 | 1089
+ 1 | 1156 | 1156
+ 2 | 2312 | 1156
+ 3 | 3468 | 1156
+ 4 | 4624 | 1156
+ 5 | 5780 | 1156
+ 6 | 6936 | 1156
+ 7 | 8092 | 1156
+ 8 | 9248 | 1156
+ 9 | 10404 | 1156
+ 10 | 11560 | 1156
+ 11 | 11979 | 1089
+ 12 | 13068 | 1089
+ 13 | 14157 | 1089
+ 14 | 15246 | 1089
+ 15 | 16335 | 1089
+ 16 | 17424 | 1089
+ 17 | 18513 | 1089
+ 18 | 19602 | 1089
+ 19 | 20691 | 1089
+ 20 | 21780 | 1089
+ 21 | 22869 | 1089
+ 22 | 23958 | 1089
+ 23 | 25047 | 1089
+ 24 | 26136 | 1089
+ 25 | 27225 | 1089
+ 26 | 28314 | 1089
+ 27 | 29403 | 1089
+ 28 | 30492 | 1089
+ 29 | 31581 | 1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ QUERY PLAN
+----------------------------------------------------------------------------------------------------------
+ Sort
+ Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+ Sort Key: t1.x
+ -> Append
+ -> Finalize HashAggregate
+ Output: t1.x, sum((t2.y + t3.y)), count(*)
+ Group Key: t1.x
+ -> Hash Join
+ Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+ Hash Cond: (t1.x = t2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t1
+ Output: t1.x
+ -> Hash
+ Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+ Group Key: t2.x
+ -> Hash Join
+ Output: t2.y, t2.x, t3.y, t3.x
+ Hash Cond: (t2.x = t3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t2
+ Output: t2.y, t2.x
+ -> Hash
+ Output: t3.y, t3.x
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t3
+ Output: t3.y, t3.x
+ -> Finalize HashAggregate
+ Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+ Group Key: t1_1.x
+ -> Hash Join
+ Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+ Hash Cond: (t1_1.x = t2_1.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+ Output: t1_1.x
+ -> Hash
+ Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+ Group Key: t2_1.x
+ -> Hash Join
+ Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+ Hash Cond: (t2_1.x = t3_1.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+ Output: t2_1.y, t2_1.x
+ -> Hash
+ Output: t3_1.y, t3_1.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+ Output: t3_1.y, t3_1.x
+ -> Finalize HashAggregate
+ Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+ Group Key: t1_2.x
+ -> Hash Join
+ Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+ Hash Cond: (t1_2.x = t2_2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+ Output: t1_2.x
+ -> Hash
+ Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+ Group Key: t2_2.x
+ -> Hash Join
+ Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+ Hash Cond: (t2_2.x = t3_2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+ Output: t2_2.y, t2_2.x
+ -> Hash
+ Output: t3_2.y, t3_2.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+ Output: t3_2.y, t3_2.x
+ -> Finalize HashAggregate
+ Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+ Group Key: t1_3.x
+ -> Hash Join
+ Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+ Hash Cond: (t1_3.x = t2_3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+ Output: t1_3.x
+ -> Hash
+ Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+ Group Key: t2_3.x
+ -> Hash Join
+ Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+ Hash Cond: (t2_3.x = t3_3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+ Output: t2_3.y, t2_3.x
+ -> Hash
+ Output: t3_3.y, t3_3.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+ Output: t3_3.y, t3_3.x
+ -> Finalize HashAggregate
+ Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+ Group Key: t1_4.x
+ -> Hash Join
+ Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+ Hash Cond: (t1_4.x = t2_4.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+ Output: t1_4.x
+ -> Hash
+ Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+ Group Key: t2_4.x
+ -> Hash Join
+ Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+ Hash Cond: (t2_4.x = t3_4.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+ Output: t2_4.y, t2_4.x
+ -> Hash
+ Output: t3_4.y, t3_4.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+ Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x | sum | count
+----+---------+-------
+ 0 | 0 | 35937
+ 1 | 78608 | 39304
+ 2 | 157216 | 39304
+ 3 | 235824 | 39304
+ 4 | 314432 | 39304
+ 5 | 393040 | 39304
+ 6 | 471648 | 39304
+ 7 | 550256 | 39304
+ 8 | 628864 | 39304
+ 9 | 707472 | 39304
+ 10 | 786080 | 39304
+ 11 | 790614 | 35937
+ 12 | 862488 | 35937
+ 13 | 934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------------
+ Sort
+ Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+ Sort Key: t3.y
+ -> Finalize HashAggregate
+ Output: t3.y, sum((t2.y + t3.y)), count(*)
+ Group Key: t3.y
+ -> Append
+ -> Hash Join
+ Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+ Hash Cond: (t1_1.x = t2_1.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t1_1
+ Output: t1_1.x
+ -> Hash
+ Output: t3_1.y, t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t3_1.y, t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+ Group Key: t3_1.y, t2_1.x, t3_1.x
+ -> Hash Join
+ Output: t2_1.y, t3_1.y, t2_1.x, t3_1.x
+ Hash Cond: (t2_1.x = t3_1.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t2_1
+ Output: t2_1.y, t2_1.x
+ -> Hash
+ Output: t3_1.y, t3_1.x
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t3_1
+ Output: t3_1.y, t3_1.x
+ -> Hash Join
+ Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+ Hash Cond: (t1_2.x = t2_2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_2
+ Output: t1_2.x
+ -> Hash
+ Output: t3_2.y, t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t3_2.y, t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+ Group Key: t3_2.y, t2_2.x, t3_2.x
+ -> Hash Join
+ Output: t2_2.y, t3_2.y, t2_2.x, t3_2.x
+ Hash Cond: (t2_2.x = t3_2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_2
+ Output: t2_2.y, t2_2.x
+ -> Hash
+ Output: t3_2.y, t3_2.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_2
+ Output: t3_2.y, t3_2.x
+ -> Hash Join
+ Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+ Hash Cond: (t1_3.x = t2_3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_3
+ Output: t1_3.x
+ -> Hash
+ Output: t3_3.y, t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t3_3.y, t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+ Group Key: t3_3.y, t2_3.x, t3_3.x
+ -> Hash Join
+ Output: t2_3.y, t3_3.y, t2_3.x, t3_3.x
+ Hash Cond: (t2_3.x = t3_3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_3
+ Output: t2_3.y, t2_3.x
+ -> Hash
+ Output: t3_3.y, t3_3.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_3
+ Output: t3_3.y, t3_3.x
+ -> Hash Join
+ Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+ Hash Cond: (t1_4.x = t2_4.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_4
+ Output: t1_4.x
+ -> Hash
+ Output: t3_4.y, t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t3_4.y, t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+ Group Key: t3_4.y, t2_4.x, t3_4.x
+ -> Hash Join
+ Output: t2_4.y, t3_4.y, t2_4.x, t3_4.x
+ Hash Cond: (t2_4.x = t3_4.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_4
+ Output: t2_4.y, t2_4.x
+ -> Hash
+ Output: t3_4.y, t3_4.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_4
+ Output: t3_4.y, t3_4.x
+ -> Hash Join
+ Output: t3_5.y, (PARTIAL sum((t2_5.y + t3_5.y))), (PARTIAL count(*))
+ Hash Cond: (t1_5.x = t2_5.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_5
+ Output: t1_5.x
+ -> Hash
+ Output: t3_5.y, t2_5.x, t3_5.x, (PARTIAL sum((t2_5.y + t3_5.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t3_5.y, t2_5.x, t3_5.x, PARTIAL sum((t2_5.y + t3_5.y)), PARTIAL count(*)
+ Group Key: t3_5.y, t2_5.x, t3_5.x
+ -> Hash Join
+ Output: t2_5.y, t3_5.y, t2_5.x, t3_5.x
+ Hash Cond: (t2_5.x = t3_5.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_5
+ Output: t2_5.y, t2_5.x
+ -> Hash
+ Output: t3_5.y, t3_5.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_5
+ Output: t3_5.y, t3_5.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y | sum | count
+----+---------+-------
+ 0 | 0 | 35937
+ 1 | 78608 | 39304
+ 2 | 157216 | 39304
+ 3 | 235824 | 39304
+ 4 | 314432 | 39304
+ 5 | 393040 | 39304
+ 6 | 471648 | 39304
+ 7 | 550256 | 39304
+ 8 | 628864 | 39304
+ 9 | 707472 | 39304
+ 10 | 786080 | 39304
+ 11 | 790614 | 35937
+ 12 | 862488 | 35937
+ 13 | 934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 675c567617..0f6b3e78a8 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -119,7 +119,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
# The stats test resets stats, so nothing else needing stats access can be in
# this group.
# ----------
-test: partition_merge partition_split partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats predicate
+test: partition_merge partition_split partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats predicate eager_aggregate
# event_trigger depends on create_am and cannot run concurrently with
# any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 0000000000..4050e4df44
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,192 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (20) TO (30);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (20) TO (30);
+INSERT INTO eager_agg_tab1 SELECT i % 30, i % 20 FROM generate_series(0, 299, 2) i;
+INSERT INTO eager_agg_tab2 SELECT i % 20, i % 30 FROM generate_series(0, 299, 3) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+RESET enable_hashagg;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+
+DROP TABLE eager_agg_tab_ml;
--
2.31.0
[application/octet-stream] v6-0009-Add-README.patch (4.8K, 11-v6-0009-Add-README.patch)
download | inline diff:
From cfc9124cd774b5364925690d10627f86a16b080c Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 13:41:36 +0800
Subject: [PATCH v6 9/9] Add README
---
src/backend/optimizer/README | 88 ++++++++++++++++++++++++++++++++++++
1 file changed, 88 insertions(+)
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 2ab4f3dbf3..dae7b87f32 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1497,3 +1497,91 @@ breaking down aggregation or grouping over a partitioned relation into
aggregation or grouping over its partitions is called partitionwise
aggregation. Especially when the partition keys match the GROUP BY clause,
this can be significantly faster than the regular method.
+
+Eager aggregation
+-------------------
+
+The obvious way to evaluate aggregates is to evaluate the FROM clause of the
+SQL query (this is what query_planner does) and use the resulting paths as the
+input of Agg node. However, if the groups are large enough, it may be more
+efficient to apply the partial aggregation to the output of base relation
+scan, and finalize it when we have all relations of the query joined:
+
+ EXPLAIN (COSTS OFF)
+ SELECT a.i, avg(b.y)
+ FROM a JOIN b ON a.i = b.j
+ GROUP BY a.i;
+
+ Finalize HashAggregate
+ Group Key: a.i
+ -> Nested Loop
+ -> Partial HashAggregate
+ Group Key: b.j
+ -> Seq Scan on b
+ -> Index Only Scan using a_pkey on a
+ Index Cond: (i = b.j)
+
+Thus the join above the partial aggregate node receives fewer input rows, and
+so the number of outer-to-inner pairs of tuples to be checked can be
+significantly lower, which can in turn lead to considerably lower join cost.
+
+Note that the GROUP BY expression might not be useful for the partial
+aggregate. In the example above, the aggregate avg(b.y) references table "b",
+but the GROUP BY expression mentions "a". However, the equivalence class {a.i,
+b.j} allows us to use the b.j column as a grouping key for the partial
+aggregation of the "b" table. The equivalence class mechanism is suitable
+because it's designed to derive join clauses, and at the same time the join
+clauses determine the choice of grouping columns of the partial aggregate: the
+only way for the partial aggregate to provide upper join(s) with input values
+is to have the join input expression(s) in the grouping key; besides grouping
+columns, the partial aggregate can only produce the transient states of the
+aggregate functions, but aggregate functions cannot be referenced by the JOIN
+clauses.
+
+Regarding correctness, join node considers the output of the partial aggregate
+to be equivalent to the output of a plain (non-aggregated) relation scan. That
+is, a group (i.e. a row of the partial aggregate output) matches the other
+side of the join if and only if each row of the non-aggregate relation
+does. In other words, all rows belonging to the same group have the same value
+of the join columns (As mentioned above, a join cannot reference other output
+expressions of the partial aggregate than the grouping expressions.).
+
+However, there's a restriction from the aggregate's perspective: the aggregate
+cannot be pushed down if any column referenced by either grouping expression
+or aggregate function can be set to NULL by an outer join above the relation
+to which we want to apply the partial aggregation. The point is that those
+NULL values would not appear on the input of the pushed-down, so it could
+either put the rows into groups in a different way than the aggregate at the
+top of the plan, or it could compute wrong values of the aggregate functions.
+
+Besides base relation, the aggregation can also be pushed down to join:
+
+ EXPLAIN (COSTS OFF)
+ SELECT a.i, avg(b.y + c.z)
+ FROM a JOIN b ON a.i = b.j
+ JOIN c ON b.j = c.i
+ GROUP BY a.i;
+
+ Finalize HashAggregate
+ Group Key: a.i
+ -> Nested Loop
+ -> Partial HashAggregate
+ Group Key: b.j
+ -> Hash Join
+ Hash Cond: (b.j = c.i)
+ -> Seq Scan on b
+ -> Hash
+ -> Seq Scan on c
+ -> Index Only Scan using a_pkey on a
+ Index Cond: (i = b.j)
+
+Whether the Agg node is created out of base relation or out of join, it's
+added to a separate RelOptInfo that we call "grouped relation". Grouped
+relation can be joined to a non-grouped relation, which results in a grouped
+relation too. Join of two grouped relations does not seem to be very useful
+and is currently not supported.
+
+If query_planner produces a grouped relation that contains valid paths, these
+are simply added to the UPPERREL_PARTIAL_GROUP_AGG relation. Further
+processing of these paths then does not differ from processing of other
+partially grouped paths.
--
2.31.0
^ permalink raw reply [nested|flat] 2+ messages in thread
* Re: Eager aggregation, take 3
2024-04-30 04:06 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2024-05-20 08:12 ` Richard Guo <[email protected]>
0 siblings, 0 replies; 2+ messages in thread
From: Richard Guo @ 2024-05-20 08:12 UTC (permalink / raw)
To: Andy Fan <[email protected]>; +Cc: pgsql-hackers; [email protected]
Another rebase is needed after d1d286d83c. Also I realized that the
partially_grouped_rel generated by eager aggregation might be dummy,
such as in query:
select count(t2.c) from t t1 join t t2 on t1.b = t2.b where false group by
t1.a;
If somehow we choose this dummy path with a Finalize Agg Path on top of
it as the final cheapest path (a very rare case), we would encounter the
"Aggref found in non-Agg plan node" error. The v7 patch fixes this
issue.
Thanks
Richard
Attachments:
[application/octet-stream] v7-0001-Introduce-RelInfoList-structure.patch (14.3K, 3-v7-0001-Introduce-RelInfoList-structure.patch)
download | inline diff:
From 10ad693ef379979cd6794cfc0a805d4431ada9c9 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Mon, 19 Feb 2024 15:16:51 +0800
Subject: [PATCH v7 1/9] Introduce RelInfoList structure
This commit introduces the RelInfoList structure, which encapsulates
both a list and a hash table, so that we can leverage the hash table for
faster lookups not only for join relations but also for upper relations.
---
contrib/postgres_fdw/postgres_fdw.c | 3 +-
src/backend/optimizer/geqo/geqo_eval.c | 20 +--
src/backend/optimizer/path/allpaths.c | 7 +-
src/backend/optimizer/plan/planmain.c | 5 +-
src/backend/optimizer/util/relnode.c | 164 ++++++++++++++-----------
src/include/nodes/pathnodes.h | 31 +++--
6 files changed, 133 insertions(+), 97 deletions(-)
diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index 4053cd641c..bfced61422 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -6069,7 +6069,8 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
*/
Assert(fpinfo->relation_index == 0); /* shouldn't be set yet */
fpinfo->relation_index =
- list_length(root->parse->rtable) + list_length(root->join_rel_list);
+ list_length(root->parse->rtable) +
+ list_length(root->join_rel_list->items);
return true;
}
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index d2f7f4e5f3..1141156899 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -85,18 +85,18 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
* truncating the list to its original length. NOTE this assumes that any
* added entries are appended at the end!
*
- * We also must take care not to mess up the outer join_rel_hash, if there
- * is one. We can do this by just temporarily setting the link to NULL.
- * (If we are dealing with enough join rels, which we very likely are, a
- * new hash table will get built and used locally.)
+ * We also must take care not to mess up the outer join_rel_list->hash, if
+ * there is one. We can do this by just temporarily setting the link to
+ * NULL. (If we are dealing with enough join rels, which we very likely
+ * are, a new hash table will get built and used locally.)
*
* join_rel_level[] shouldn't be in use, so just Assert it isn't.
*/
- savelength = list_length(root->join_rel_list);
- savehash = root->join_rel_hash;
+ savelength = list_length(root->join_rel_list->items);
+ savehash = root->join_rel_list->hash;
Assert(root->join_rel_level == NULL);
- root->join_rel_hash = NULL;
+ root->join_rel_list->hash = NULL;
/* construct the best path for the given combination of relations */
joinrel = gimme_tree(root, tour, num_gene);
@@ -121,9 +121,9 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
* Restore join_rel_list to its former state, and put back original
* hashtable if any.
*/
- root->join_rel_list = list_truncate(root->join_rel_list,
- savelength);
- root->join_rel_hash = savehash;
+ root->join_rel_list->items = list_truncate(root->join_rel_list->items,
+ savelength);
+ root->join_rel_list->hash = savehash;
/* release all the memory acquired within gimme_tree */
MemoryContextSwitchTo(oldcxt);
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 4895cee994..70e2b58d8f 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -3403,9 +3403,10 @@ make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
* needed for these paths need have been instantiated.
*
* Note to plugin authors: the functions invoked during standard_join_search()
- * modify root->join_rel_list and root->join_rel_hash. If you want to do more
- * than one join-order search, you'll probably need to save and restore the
- * original states of those data structures. See geqo_eval() for an example.
+ * modify root->join_rel_list->items and root->join_rel_list->hash. If you
+ * want to do more than one join-order search, you'll probably need to save and
+ * restore the original states of those data structures. See geqo_eval() for
+ * an example.
*/
RelOptInfo *
standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index e17d31a5c3..fd8b2b0ca3 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -64,8 +64,9 @@ query_planner(PlannerInfo *root,
* NOTE: append_rel_list was set up by subquery_planner, so do not touch
* here.
*/
- root->join_rel_list = NIL;
- root->join_rel_hash = NULL;
+ root->join_rel_list = makeNode(RelInfoList);
+ root->join_rel_list->items = NIL;
+ root->join_rel_list->hash = NULL;
root->join_rel_level = NULL;
root->join_cur_level = 0;
root->canon_pathkeys = NIL;
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index e05b21c884..8279ab0e11 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -35,11 +35,15 @@
#include "utils/lsyscache.h"
-typedef struct JoinHashEntry
+/*
+ * An entry of a hash table that we use to make lookup for RelOptInfo
+ * structures more efficient.
+ */
+typedef struct RelInfoEntry
{
- Relids join_relids; /* hash key --- MUST BE FIRST */
- RelOptInfo *join_rel;
-} JoinHashEntry;
+ Relids relids; /* hash key --- MUST BE FIRST */
+ RelOptInfo *rel;
+} RelInfoEntry;
static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
RelOptInfo *input_rel,
@@ -479,11 +483,11 @@ find_base_rel_ignore_join(PlannerInfo *root, int relid)
}
/*
- * build_join_rel_hash
- * Construct the auxiliary hash table for join relations.
+ * build_rel_hash
+ * Construct the auxiliary hash table for relations.
*/
static void
-build_join_rel_hash(PlannerInfo *root)
+build_rel_hash(RelInfoList *list)
{
HTAB *hashtab;
HASHCTL hash_ctl;
@@ -491,47 +495,49 @@ build_join_rel_hash(PlannerInfo *root)
/* Create the hash table */
hash_ctl.keysize = sizeof(Relids);
- hash_ctl.entrysize = sizeof(JoinHashEntry);
+ hash_ctl.entrysize = sizeof(RelInfoEntry);
hash_ctl.hash = bitmap_hash;
hash_ctl.match = bitmap_match;
hash_ctl.hcxt = CurrentMemoryContext;
- hashtab = hash_create("JoinRelHashTable",
+ hashtab = hash_create("RelHashTable",
256L,
&hash_ctl,
HASH_ELEM | HASH_FUNCTION | HASH_COMPARE | HASH_CONTEXT);
- /* Insert all the already-existing joinrels */
- foreach(l, root->join_rel_list)
+ /* Insert all the already-existing relations */
+ foreach(l, list->items)
{
RelOptInfo *rel = (RelOptInfo *) lfirst(l);
- JoinHashEntry *hentry;
+ RelInfoEntry *hentry;
bool found;
- hentry = (JoinHashEntry *) hash_search(hashtab,
- &(rel->relids),
- HASH_ENTER,
- &found);
+ hentry = (RelInfoEntry *) hash_search(hashtab,
+ &(rel->relids),
+ HASH_ENTER,
+ &found);
Assert(!found);
- hentry->join_rel = rel;
+ hentry->rel = rel;
}
- root->join_rel_hash = hashtab;
+ list->hash = hashtab;
}
/*
- * find_join_rel
- * Returns relation entry corresponding to 'relids' (a set of RT indexes),
- * or NULL if none exists. This is for join relations.
+ * find_rel_info
+ * Find an RelOptInfo entry.
*/
-RelOptInfo *
-find_join_rel(PlannerInfo *root, Relids relids)
+static RelOptInfo *
+find_rel_info(RelInfoList *list, Relids relids)
{
+ if (list == NULL)
+ return NULL;
+
/*
* Switch to using hash lookup when list grows "too long". The threshold
* is arbitrary and is known only here.
*/
- if (!root->join_rel_hash && list_length(root->join_rel_list) > 32)
- build_join_rel_hash(root);
+ if (!list->hash && list_length(list->items) > 32)
+ build_rel_hash(list);
/*
* Use either hashtable lookup or linear search, as appropriate.
@@ -541,23 +547,23 @@ find_join_rel(PlannerInfo *root, Relids relids)
* so would force relids out of a register and thus probably slow down the
* list-search case.
*/
- if (root->join_rel_hash)
+ if (list->hash)
{
Relids hashkey = relids;
- JoinHashEntry *hentry;
+ RelInfoEntry *hentry;
- hentry = (JoinHashEntry *) hash_search(root->join_rel_hash,
- &hashkey,
- HASH_FIND,
- NULL);
+ hentry = (RelInfoEntry *) hash_search(list->hash,
+ &hashkey,
+ HASH_FIND,
+ NULL);
if (hentry)
- return hentry->join_rel;
+ return hentry->rel;
}
else
{
ListCell *l;
- foreach(l, root->join_rel_list)
+ foreach(l, list->items)
{
RelOptInfo *rel = (RelOptInfo *) lfirst(l);
@@ -569,6 +575,54 @@ find_join_rel(PlannerInfo *root, Relids relids)
return NULL;
}
+/*
+ * find_join_rel
+ * Returns relation entry corresponding to 'relids' (a set of RT indexes),
+ * or NULL if none exists. This is for join relations.
+ */
+RelOptInfo *
+find_join_rel(PlannerInfo *root, Relids relids)
+{
+ return find_rel_info(root->join_rel_list, relids);
+}
+
+/*
+ * add_rel_info
+ * Add given relation to the given list. Also add it to the auxiliary
+ * hashtable if there is one.
+ */
+static void
+add_rel_info(RelInfoList *list, RelOptInfo *rel)
+{
+ /* GEQO requires us to append the new relation to the end of the list! */
+ list->items = lappend(list->items, rel);
+
+ /* store it into the auxiliary hashtable if there is one. */
+ if (list->hash)
+ {
+ RelInfoEntry *hentry;
+ bool found;
+
+ hentry = (RelInfoEntry *) hash_search(list->hash,
+ &(rel->relids),
+ HASH_ENTER,
+ &found);
+ Assert(!found);
+ hentry->rel = rel;
+ }
+}
+
+/*
+ * add_join_rel
+ * Add given join relation to the list of join relations in the given
+ * PlannerInfo.
+ */
+static void
+add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
+{
+ add_rel_info(root->join_rel_list, joinrel);
+}
+
/*
* set_foreign_rel_properties
* Set up foreign-join fields if outer and inner relation are foreign
@@ -618,32 +672,6 @@ set_foreign_rel_properties(RelOptInfo *joinrel, RelOptInfo *outer_rel,
}
}
-/*
- * add_join_rel
- * Add given join relation to the list of join relations in the given
- * PlannerInfo. Also add it to the auxiliary hashtable if there is one.
- */
-static void
-add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
-{
- /* GEQO requires us to append the new joinrel to the end of the list! */
- root->join_rel_list = lappend(root->join_rel_list, joinrel);
-
- /* store it into the auxiliary hashtable if there is one. */
- if (root->join_rel_hash)
- {
- JoinHashEntry *hentry;
- bool found;
-
- hentry = (JoinHashEntry *) hash_search(root->join_rel_hash,
- &(joinrel->relids),
- HASH_ENTER,
- &found);
- Assert(!found);
- hentry->join_rel = joinrel;
- }
-}
-
/*
* build_join_rel
* Returns relation entry corresponding to the union of two given rels,
@@ -1469,22 +1497,14 @@ subbuild_joinrel_joinlist(RelOptInfo *joinrel,
RelOptInfo *
fetch_upper_rel(PlannerInfo *root, UpperRelationKind kind, Relids relids)
{
+ RelInfoList *list = &root->upper_rels[kind];
RelOptInfo *upperrel;
- ListCell *lc;
-
- /*
- * For the moment, our indexing data structure is just a List for each
- * relation kind. If we ever get so many of one kind that this stops
- * working well, we can improve it. No code outside this function should
- * assume anything about how to find a particular upperrel.
- */
/* If we already made this upperrel for the query, return it */
- foreach(lc, root->upper_rels[kind])
+ if (list)
{
- upperrel = (RelOptInfo *) lfirst(lc);
-
- if (bms_equal(upperrel->relids, relids))
+ upperrel = find_rel_info(list, relids);
+ if (upperrel)
return upperrel;
}
@@ -1503,7 +1523,7 @@ fetch_upper_rel(PlannerInfo *root, UpperRelationKind kind, Relids relids)
upperrel->cheapest_unique_path = NULL;
upperrel->cheapest_parameterized_paths = NIL;
- root->upper_rels[kind] = lappend(root->upper_rels[kind], upperrel);
+ add_rel_info(&root->upper_rels[kind], upperrel);
return upperrel;
}
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 14ef296ab7..4c7c6bc7a8 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -80,6 +80,25 @@ typedef enum UpperRelationKind
/* NB: UPPERREL_FINAL must be last enum entry; it's used to size arrays */
} UpperRelationKind;
+/*
+ * Hashed list to store relation specific info and to retrieve it by relids.
+ *
+ * For small problems we just scan the list to do lookups, but when there are
+ * many relations we build a hash table for faster lookups. The hash table is
+ * present and valid when 'hash' is not NULL. Note that we still maintain the
+ * list even when using the hash table for lookups; this simplifies life for
+ * GEQO.
+ */
+typedef struct RelInfoList
+{
+ pg_node_attr(no_copy_equal, no_read)
+
+ NodeTag type;
+
+ List *items;
+ struct HTAB *hash pg_node_attr(read_write_ignore);
+} RelInfoList;
+
/*----------
* PlannerGlobal
* Global information for planning/optimization
@@ -270,15 +289,9 @@ struct PlannerInfo
/*
* join_rel_list is a list of all join-relation RelOptInfos we have
- * considered in this planning run. For small problems we just scan the
- * list to do lookups, but when there are many join relations we build a
- * hash table for faster lookups. The hash table is present and valid
- * when join_rel_hash is not NULL. Note that we still maintain the list
- * even when using the hash table for lookups; this simplifies life for
- * GEQO.
+ * considered in this planning run.
*/
- List *join_rel_list;
- struct HTAB *join_rel_hash pg_node_attr(read_write_ignore);
+ RelInfoList *join_rel_list; /* list of join-relation RelOptInfos */
/*
* When doing a dynamic-programming-style join search, join_rel_level[k]
@@ -413,7 +426,7 @@ struct PlannerInfo
* Upper-rel RelOptInfos. Use fetch_upper_rel() to get any particular
* upper rel.
*/
- List *upper_rels[UPPERREL_FINAL + 1] pg_node_attr(read_write_ignore);
+ RelInfoList upper_rels[UPPERREL_FINAL + 1] pg_node_attr(read_write_ignore);
/* Result tlists chosen by grouping_planner for upper-stage processing */
struct PathTarget *upper_targets[UPPERREL_FINAL + 1] pg_node_attr(read_write_ignore);
--
2.31.0
[application/octet-stream] v7-0002-Introduce-RelAggInfo-structure-to-store-info-for-grouped-paths.patch (7.8K, 4-v7-0002-Introduce-RelAggInfo-structure-to-store-info-for-grouped-paths.patch)
download | inline diff:
From dc1f62ad81396b150c00c277cad1e1d041032707 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 11:12:18 +0800
Subject: [PATCH v7 2/9] Introduce RelAggInfo structure to store info for
grouped paths.
This commit introduces RelAggInfo structure to store information needed
to create grouped paths for base and join rels. It also revises the
RelInfoList related structures and functions so that they can be used
with RelAggInfos.
---
src/backend/optimizer/util/relnode.c | 66 +++++++++++++++++--------
src/include/nodes/pathnodes.h | 73 ++++++++++++++++++++++++++++
2 files changed, 118 insertions(+), 21 deletions(-)
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 8279ab0e11..8420b8936e 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -36,13 +36,13 @@
/*
- * An entry of a hash table that we use to make lookup for RelOptInfo
- * structures more efficient.
+ * An entry of a hash table that we use to make lookup for RelOptInfo or
+ * RelAggInfo structures more efficient.
*/
typedef struct RelInfoEntry
{
Relids relids; /* hash key --- MUST BE FIRST */
- RelOptInfo *rel;
+ void *data;
} RelInfoEntry;
static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
@@ -484,7 +484,7 @@ find_base_rel_ignore_join(PlannerInfo *root, int relid)
/*
* build_rel_hash
- * Construct the auxiliary hash table for relations.
+ * Construct the auxiliary hash table for relation specific data.
*/
static void
build_rel_hash(RelInfoList *list)
@@ -504,19 +504,27 @@ build_rel_hash(RelInfoList *list)
&hash_ctl,
HASH_ELEM | HASH_FUNCTION | HASH_COMPARE | HASH_CONTEXT);
- /* Insert all the already-existing relations */
+ /* Insert all the already-existing relation specific infos */
foreach(l, list->items)
{
- RelOptInfo *rel = (RelOptInfo *) lfirst(l);
+ void *item = lfirst(l);
RelInfoEntry *hentry;
bool found;
+ Relids relids;
+
+ Assert(IsA(item, RelOptInfo) || IsA(item, RelAggInfo));
+
+ if (IsA(item, RelOptInfo))
+ relids = ((RelOptInfo *) item)->relids;
+ else
+ relids = ((RelAggInfo *) item)->relids;
hentry = (RelInfoEntry *) hash_search(hashtab,
- &(rel->relids),
+ &relids,
HASH_ENTER,
&found);
Assert(!found);
- hentry->rel = rel;
+ hentry->data = item;
}
list->hash = hashtab;
@@ -524,9 +532,9 @@ build_rel_hash(RelInfoList *list)
/*
* find_rel_info
- * Find an RelOptInfo entry.
+ * Find an RelOptInfo or a RelAggInfo entry.
*/
-static RelOptInfo *
+static void *
find_rel_info(RelInfoList *list, Relids relids)
{
if (list == NULL)
@@ -557,7 +565,7 @@ find_rel_info(RelInfoList *list, Relids relids)
HASH_FIND,
NULL);
if (hentry)
- return hentry->rel;
+ return hentry->data;
}
else
{
@@ -565,10 +573,18 @@ find_rel_info(RelInfoList *list, Relids relids)
foreach(l, list->items)
{
- RelOptInfo *rel = (RelOptInfo *) lfirst(l);
+ void *item = lfirst(l);
+ Relids item_relids = NULL;
+
+ Assert(IsA(item, RelOptInfo) || IsA(item, RelAggInfo));
- if (bms_equal(rel->relids, relids))
- return rel;
+ if (IsA(item, RelOptInfo))
+ item_relids = ((RelOptInfo *) item)->relids;
+ else if (IsA(item, RelAggInfo))
+ item_relids = ((RelAggInfo *) item)->relids;
+
+ if (bms_equal(item_relids, relids))
+ return item;
}
}
@@ -583,32 +599,40 @@ find_rel_info(RelInfoList *list, Relids relids)
RelOptInfo *
find_join_rel(PlannerInfo *root, Relids relids)
{
- return find_rel_info(root->join_rel_list, relids);
+ return (RelOptInfo *) find_rel_info(root->join_rel_list, relids);
}
/*
* add_rel_info
- * Add given relation to the given list. Also add it to the auxiliary
+ * Add relation specific info to a list, and also add it to the auxiliary
* hashtable if there is one.
*/
static void
-add_rel_info(RelInfoList *list, RelOptInfo *rel)
+add_rel_info(RelInfoList *list, void *data)
{
+ Assert(IsA(data, RelOptInfo) || IsA(data, RelAggInfo));
+
/* GEQO requires us to append the new relation to the end of the list! */
- list->items = lappend(list->items, rel);
+ list->items = lappend(list->items, data);
/* store it into the auxiliary hashtable if there is one. */
if (list->hash)
{
+ Relids relids;
RelInfoEntry *hentry;
bool found;
+ if (IsA(data, RelOptInfo))
+ relids = ((RelOptInfo *) data)->relids;
+ else
+ relids = ((RelAggInfo *) data)->relids;
+
hentry = (RelInfoEntry *) hash_search(list->hash,
- &(rel->relids),
+ &relids,
HASH_ENTER,
&found);
Assert(!found);
- hentry->rel = rel;
+ hentry->data = data;
}
}
@@ -1503,7 +1527,7 @@ fetch_upper_rel(PlannerInfo *root, UpperRelationKind kind, Relids relids)
/* If we already made this upperrel for the query, return it */
if (list)
{
- upperrel = find_rel_info(list, relids);
+ upperrel = (RelOptInfo *) find_rel_info(list, relids);
if (upperrel)
return upperrel;
}
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 4c7c6bc7a8..9a2bf98ae2 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1074,6 +1074,79 @@ typedef struct RelOptInfo
((rel)->part_scheme && (rel)->boundinfo && (rel)->nparts > 0 && \
(rel)->part_rels && (rel)->partexprs && (rel)->nullable_partexprs)
+/*
+ * RelAggInfo
+ * Information needed to create grouped paths for base and join rels.
+ *
+ * "relids" is the set of relation identifiers (RT indexes), just like with
+ * RelOptInfo.
+ *
+ * "target" will be used as pathtarget if partial aggregation is applied to
+ * base relation or join. The same target will also --- if the relation is a
+ * join --- be used to join grouped path to a non-grouped one. This target can
+ * contain plain-Var grouping expressions and Aggref nodes.
+ *
+ * Note: There's a convention that Aggref expressions are supposed to follow
+ * the other expressions of the target. Iterations of ->exprs may rely on this
+ * arrangement.
+ *
+ * "agg_input" contains Vars used either as grouping expressions or aggregate
+ * arguments. Paths providing the aggregation plan with input data should use
+ * this target. The only difference from reltarget of the non-grouped relation
+ * is that some items can have sortgroupref initialized.
+ *
+ * "input_rows" is the estimated number of input rows for AggPath. It's
+ * actually just a workspace for users of the structure, i.e. not initialized
+ * when instance of the structure is created.
+ *
+ * "grouped_rows" is the estimated number of result rows of the AggPath.
+ *
+ * "group_clauses", "group_exprs" and "group_pathkeys" are lists of
+ * SortGroupClause, the corresponding grouping expressions and PathKey
+ * respectively.
+ *
+ * "agg_exprs" is a list of Aggref nodes for the aggregation of the relation's
+ * paths.
+ */
+typedef struct RelAggInfo
+{
+ pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+ NodeTag type;
+
+ /*
+ * the same as in RelOptInfo; set of base + OJ relids (rangetable indexes)
+ */
+ Relids relids;
+
+ /*
+ * the targetlist for Paths scanning this grouped rel; list of Vars/Exprs,
+ * cost, width
+ */
+ struct PathTarget *target;
+
+ /*
+ * the targetlist for Paths that generate input for the grouped paths
+ */
+ struct PathTarget *agg_input;
+
+ /* estimated number of input tuples for the grouped paths */
+ Cardinality input_rows;
+
+ /* estimated number of result tuples of the grouped relation*/
+ Cardinality grouped_rows;
+
+ /* a list of SortGroupClause's */
+ List *group_clauses;
+ /* a list of grouping expressions */
+ List *group_exprs;
+ /* a list of PathKeys */
+ List *group_pathkeys;
+
+ /* a list of Aggref nodes */
+ List *agg_exprs;
+} RelAggInfo;
+
/*
* IndexOptInfo
* Per-index information for planning/optimization
--
2.31.0
[application/octet-stream] v7-0003-Set-up-for-eager-aggregation-by-collecting-needed-infos.patch (14.3K, 5-v7-0003-Set-up-for-eager-aggregation-by-collecting-needed-infos.patch)
download | inline diff:
From 16456744ff3412f18ce4024913c1c82f1d28989f Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 18:40:46 +0800
Subject: [PATCH v7 3/9] Set up for eager aggregation by collecting needed
infos
This commit checks if eager aggregation is applicable, and if so, sets
up root->agg_clause_list and root->group_expr_list by collecting
suitable aggregate expressions and grouping expressions in the query.
---
src/backend/optimizer/path/allpaths.c | 1 +
src/backend/optimizer/plan/initsplan.c | 250 ++++++++++++++++++
src/backend/optimizer/plan/planmain.c | 8 +
src/backend/utils/misc/guc_tables.c | 10 +
src/backend/utils/misc/postgresql.conf.sample | 1 +
src/include/nodes/pathnodes.h | 41 +++
src/include/optimizer/paths.h | 1 +
src/include/optimizer/planmain.h | 1 +
src/test/regress/expected/sysviews.out | 3 +-
9 files changed, 315 insertions(+), 1 deletion(-)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 70e2b58d8f..d1b974367b 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -77,6 +77,7 @@ typedef enum pushdown_safe_type
/* These parameters are set by GUC */
bool enable_geqo = false; /* just in case GUC doesn't set it */
+bool enable_eager_aggregate = false;
int geqo_threshold;
int min_parallel_table_scan_size;
int min_parallel_index_scan_size;
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index e2c68fe6f9..0281336469 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
*/
#include "postgres.h"
+#include "access/nbtree.h"
#include "catalog/pg_type.h"
#include "nodes/makefuncs.h"
#include "nodes/nodeFuncs.h"
@@ -80,6 +81,8 @@ typedef struct JoinTreeItem
} JoinTreeItem;
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
Index rtindex);
static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -327,6 +330,253 @@ add_vars_to_targetlist(PlannerInfo *root, List *vars,
}
}
+/*
+ * setup_eager_aggregation
+ * Check if eager aggregation is applicable, and if so collect suitable
+ * aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+ /*
+ * Don't apply eager aggregation if disabled by user.
+ */
+ if (!enable_eager_aggregate)
+ return;
+
+ /*
+ * Don't apply eager aggregation if there are no GROUP BY clauses.
+ */
+ if (!root->parse->groupClause)
+ return;
+
+ /*
+ * For now we don't try to support grouping sets.
+ */
+ if (root->parse->groupingSets)
+ return;
+
+ /*
+ * For now we don't try to support DISTINCT or ORDER BY aggregates.
+ */
+ if (root->numOrderedAggs > 0)
+ return;
+
+ /*
+ * If there are any aggregates that do not support partial mode, or any
+ * partial aggregates that are non-serializable, do not apply eager
+ * aggregation.
+ */
+ if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+ return;
+
+ /*
+ * SRF is not allowed in the aggregate argument and we don't even want it
+ * in the GROUP BY clause, so forbid it in general. It needs to be
+ * analyzed if evaluation of a GROUP BY clause containing SRF below the
+ * query targetlist would be correct. Currently it does not seem to be an
+ * important use case.
+ */
+ if (root->parse->hasTargetSRFs)
+ return;
+
+ /*
+ * Collect aggregate expressions that appear in targetlist and having
+ * clauses.
+ */
+ create_agg_clause_infos(root);
+
+ /*
+ * If there are no suitable aggregate expressions, we cannot apply eager
+ * aggregation.
+ */
+ if (root->agg_clause_list == NIL)
+ return;
+
+ /*
+ * Collect grouping expressions that appear in grouping clauses.
+ */
+ create_grouping_expr_infos(root);
+}
+
+/*
+ * Create AggClauseInfo for each aggregate.
+ *
+ * If any aggregate is not suitable, set root->agg_clause_list to NIL and
+ * return.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+ List *tlist_exprs;
+ ListCell *lc;
+
+ Assert(root->agg_clause_list == NIL);
+
+ tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+ PVC_INCLUDE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_RECURSE_PLACEHOLDERS);
+
+ /*
+ * For now we don't try to support GROUPING() expressions.
+ */
+ foreach(lc, tlist_exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+
+ if (IsA(expr, GroupingFunc))
+ return;
+ }
+
+ /*
+ * Aggregates within the HAVING clause need to be processed in the same way
+ * as those in the targetlist. Note that HAVING can contain Aggrefs but
+ * not WindowFuncs.
+ */
+ if (root->parse->havingQual != NULL)
+ {
+ List *having_exprs;
+
+ having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+ PVC_INCLUDE_AGGREGATES |
+ PVC_RECURSE_PLACEHOLDERS);
+ if (having_exprs != NIL)
+ {
+ tlist_exprs = list_concat(tlist_exprs, having_exprs);
+ list_free(having_exprs);
+ }
+ }
+
+ foreach(lc, tlist_exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+ Aggref *aggref;
+ AggClauseInfo *ac_info;
+
+ /*
+ * tlist_exprs may also contain Vars, but we only need Aggrefs.
+ */
+ if (IsA(expr, Var))
+ continue;
+
+ aggref = castNode(Aggref, expr);
+
+ Assert(aggref->aggorder == NIL);
+ Assert(aggref->aggdistinct == NIL);
+
+ ac_info = makeNode(AggClauseInfo);
+ ac_info->aggref = aggref;
+ ac_info->agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+ root->agg_clause_list =
+ list_append_unique(root->agg_clause_list, ac_info);
+ }
+
+ list_free(tlist_exprs);
+}
+
+/*
+ * Create GroupExprInfo for each expression usable as grouping key.
+ *
+ * If any grouping expression is not suitable, set root->group_expr_list to NIL
+ * and return.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+ List *exprs = NIL;
+ List *sortgrouprefs = NIL;
+ List *btree_opfamilies = NIL;
+ ListCell *lc,
+ *lc1,
+ *lc2,
+ *lc3;
+
+ Assert(root->group_expr_list == NIL);
+
+ foreach(lc, root->parse->groupClause)
+ {
+ SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+ TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+ TypeCacheEntry *tce;
+ Oid equalimageproc;
+ Oid eq_op;
+ List *eq_opfamilies;
+ Oid btree_opfamily;
+
+ Assert(tle->ressortgroupref > 0);
+
+ /*
+ * For now we only support plain Vars as grouping expressions.
+ */
+ if (!IsA(tle->expr, Var))
+ return;
+
+ /*
+ * Eager aggregation is only possible if equality of grouping keys
+ * per the equality operator implies bitwise equality. Otherwise, if
+ * we put keys of different byte images into the same group, we lose
+ * some information that may be needed to evaluate join clauses above
+ * the pushed-down aggregate node, or the WHERE clause.
+ *
+ * For example, the NUMERIC data type is not supported because values
+ * that fall into the same group according to the equality operator
+ * (e.g. 0 and 0.0) can have different scale.
+ */
+ tce = lookup_type_cache(exprType((Node *) tle->expr),
+ TYPECACHE_BTREE_OPFAMILY);
+ if (!OidIsValid(tce->btree_opf) ||
+ !OidIsValid(tce->btree_opintype))
+ return;
+
+ equalimageproc = get_opfamily_proc(tce->btree_opf,
+ tce->btree_opintype,
+ tce->btree_opintype,
+ BTEQUALIMAGE_PROC);
+ if (!OidIsValid(equalimageproc) ||
+ !DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+ tce->typcollation,
+ ObjectIdGetDatum(tce->btree_opintype))))
+ return;
+
+ /*
+ * Get the operator in the btree's opfamily.
+ */
+ eq_op = get_opfamily_member(tce->btree_opf,
+ tce->btree_opintype,
+ tce->btree_opintype,
+ BTEqualStrategyNumber);
+ if (!OidIsValid(eq_op))
+ return;
+ eq_opfamilies = get_mergejoin_opfamilies(eq_op);
+ if (!eq_opfamilies)
+ return;
+ btree_opfamily = linitial_oid(eq_opfamilies);
+
+ exprs = lappend(exprs, tle->expr);
+ sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+ btree_opfamilies = lappend_oid(btree_opfamilies, btree_opfamily);
+ }
+
+ /*
+ * Construct GroupExprInfo for each expression.
+ */
+ forthree(lc1, exprs, lc2, sortgrouprefs, lc3, btree_opfamilies)
+ {
+ Expr *expr = (Expr *) lfirst(lc1);
+ int sortgroupref = lfirst_int(lc2);
+ Oid btree_opfamily = lfirst_oid(lc3);
+ GroupExprInfo *ge_info;
+
+ ge_info = makeNode(GroupExprInfo);
+ ge_info->expr = (Expr *) copyObject(expr);
+ ge_info->sortgroupref = sortgroupref;
+ ge_info->btree_opfamily = btree_opfamily;
+
+ root->group_expr_list = lappend(root->group_expr_list, ge_info);
+ }
+}
/*****************************************************************************
*
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index fd8b2b0ca3..5d2bca914b 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -77,6 +77,8 @@ query_planner(PlannerInfo *root,
root->placeholder_list = NIL;
root->placeholder_array = NULL;
root->placeholder_array_size = 0;
+ root->agg_clause_list = NIL;
+ root->group_expr_list = NIL;
root->fkey_list = NIL;
root->initial_rels = NIL;
@@ -258,6 +260,12 @@ query_planner(PlannerInfo *root,
*/
extract_restriction_or_clauses(root);
+ /*
+ * Check if eager aggregation is applicable, and if so, set up
+ * root->agg_clause_list and root->group_expr_list.
+ */
+ setup_eager_aggregation(root);
+
/*
* Now expand appendrels by adding "otherrels" for their children. We
* delay this to the end so that we have as much information as possible
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 46c258be28..aa7641d133 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -929,6 +929,16 @@ struct config_bool ConfigureNamesBool[] =
false,
NULL, NULL, NULL
},
+ {
+ {"enable_eager_aggregate", PGC_USERSET, QUERY_TUNING_METHOD,
+ gettext_noop("Enables eager aggregation."),
+ NULL,
+ GUC_EXPLAIN
+ },
+ &enable_eager_aggregate,
+ false,
+ NULL, NULL, NULL
+ },
{
{"enable_parallel_append", PGC_USERSET, QUERY_TUNING_METHOD,
gettext_noop("Enables the planner's use of parallel append plans."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 83d5df8e46..94ab3e6582 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -413,6 +413,7 @@
#enable_sort = on
#enable_tidscan = on
#enable_group_by_reordering = on
+#enable_eager_aggregate = off
# - Planner Cost Constants -
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 9a2bf98ae2..9e785816e6 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -386,6 +386,12 @@ struct PlannerInfo
/* list of PlaceHolderInfos */
List *placeholder_list;
+ /* list of AggClauseInfos */
+ List *agg_clause_list;
+
+ /* List of GroupExprInfos */
+ List *group_expr_list;
+
/* array of PlaceHolderInfos indexed by phid */
struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
/* allocated size of array */
@@ -3208,6 +3214,41 @@ typedef struct MinMaxAggInfo
Param *param;
} MinMaxAggInfo;
+/*
+ * The aggregate expressions that appear in targetlist and having clauses
+ */
+typedef struct AggClauseInfo
+{
+ pg_node_attr(no_read, no_query_jumble)
+
+ NodeTag type;
+
+ /* the Aggref expr */
+ Aggref *aggref;
+
+ /* lowest level we can evaluate this aggregate at */
+ Relids agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * The grouping expressions that appear in grouping clauses
+ */
+typedef struct GroupExprInfo
+{
+ pg_node_attr(no_read, no_query_jumble)
+
+ NodeTag type;
+
+ /* the represented expression */
+ Expr *expr;
+
+ /* the tleSortGroupRef of the corresponding SortGroupClause */
+ Index sortgroupref;
+
+ /* btree opfamily defining the ordering */
+ Oid btree_opfamily;
+} GroupExprInfo;
+
/*
* At runtime, PARAM_EXEC slots are used to pass values around from one plan
* node to another. They can be used to pass values down into subqueries (for
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 914d9bdef5..5181220263 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,6 +21,7 @@
* allpaths.c
*/
extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
extern PGDLLIMPORT int geqo_threshold;
extern PGDLLIMPORT int min_parallel_table_scan_size;
extern PGDLLIMPORT int min_parallel_index_scan_size;
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index aafc173792..cedcd88ebf 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -72,6 +72,7 @@ extern void add_other_rels_to_query(PlannerInfo *root);
extern void build_base_rel_tlists(PlannerInfo *root, List *final_tlist);
extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
Relids where_needed);
+extern void setup_eager_aggregation(PlannerInfo *root);
extern void find_lateral_references(PlannerInfo *root);
extern void create_lateral_join_info(PlannerInfo *root);
extern List *deconstruct_jointree(PlannerInfo *root);
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index dbfd0c13d4..5e2b19d693 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -136,6 +136,7 @@ select name, setting from pg_settings where name like 'enable%';
--------------------------------+---------
enable_async_append | on
enable_bitmapscan | on
+ enable_eager_aggregate | off
enable_gathermerge | on
enable_group_by_reordering | on
enable_hashagg | on
@@ -156,7 +157,7 @@ select name, setting from pg_settings where name like 'enable%';
enable_seqscan | on
enable_sort | on
enable_tidscan | on
-(22 rows)
+(23 rows)
-- There are always wait event descriptions for various types.
select type, count(*) > 0 as ok FROM pg_wait_events
--
2.31.0
[application/octet-stream] v7-0004-Implement-functions-that-create-RelAggInfos-if-applicable.patch (26.2K, 6-v7-0004-Implement-functions-that-create-RelAggInfos-if-applicable.patch)
download | inline diff:
From 1da20a17dde21b1061f94e59a127125b1230bfa7 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 11:27:49 +0800
Subject: [PATCH v7 4/9] Implement functions that create RelAggInfos if
applicable
This commit implements the functions that check if eager aggregation is
applicable for a given relation, and if so, create RelAggInfo structure
for the relation, using the infos about aggregate expressions and
grouping expressions we collected earlier.
---
src/backend/optimizer/path/equivclass.c | 26 +-
src/backend/optimizer/plan/planmain.c | 3 +
src/backend/optimizer/util/relnode.c | 636 ++++++++++++++++++++++++
src/backend/utils/adt/selfuncs.c | 5 +-
src/include/nodes/pathnodes.h | 6 +
src/include/optimizer/pathnode.h | 5 +
src/include/optimizer/paths.h | 3 +-
7 files changed, 674 insertions(+), 10 deletions(-)
diff --git a/src/backend/optimizer/path/equivclass.c b/src/backend/optimizer/path/equivclass.c
index 21ce1ae2e1..9369acf033 100644
--- a/src/backend/optimizer/path/equivclass.c
+++ b/src/backend/optimizer/path/equivclass.c
@@ -2454,15 +2454,17 @@ find_join_domain(PlannerInfo *root, Relids relids)
* Detect whether two expressions are known equal due to equivalence
* relationships.
*
- * Actually, this only shows that the expressions are equal according
- * to some opfamily's notion of equality --- but we only use it for
- * selectivity estimation, so a fuzzy idea of equality is OK.
+ * If opfamily is given, the expressions must be known equal per the semantics
+ * of that opfamily (note it has to be a btree opfamily, since those are the
+ * only opfamilies equivclass.c deals with). If opfamily is InvalidOid, we'll
+ * return true if they're equal according to any opfamily, which is fuzzy but
+ * OK for estimation purposes.
*
* Note: does not bother to check for "equal(item1, item2)"; caller must
* check that case if it's possible to pass identical items.
*/
bool
-exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
+exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2, Oid opfamily)
{
ListCell *lc1;
@@ -2477,6 +2479,17 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
if (ec->ec_has_volatile)
continue;
+ /*
+ * It's okay to consider ec_broken ECs here. Brokenness just means we
+ * couldn't derive all the implied clauses we'd have liked to; it does
+ * not invalidate our knowledge that the members are equal.
+ */
+
+ /* Ignore if this EC doesn't use specified opfamily */
+ if (OidIsValid(opfamily) &&
+ !list_member_oid(ec->ec_opfamilies, opfamily))
+ continue;
+
foreach(lc2, ec->ec_members)
{
EquivalenceMember *em = (EquivalenceMember *) lfirst(lc2);
@@ -2505,8 +2518,7 @@ exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2)
* (In principle there might be more than one matching eclass if multiple
* collations are involved, but since collation doesn't matter for equality,
* we ignore that fine point here.) This is much like exprs_known_equal,
- * except that we insist on the comparison operator matching the eclass, so
- * that the result is definite not approximate.
+ * except for the format of the input.
*
* On success, we also set fkinfo->eclass[colno] to the matching eclass,
* and set fkinfo->fk_eclass_member[colno] to the eclass member for the
@@ -2547,7 +2559,7 @@ match_eclasses_to_foreign_key_col(PlannerInfo *root,
/* Never match to a volatile EC */
if (ec->ec_has_volatile)
continue;
- /* Note: it seems okay to match to "broken" eclasses here */
+ /* It's okay to consider "broken" ECs here, see exprs_known_equal */
foreach(lc2, ec->ec_members)
{
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 5d2bca914b..f7217d7690 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -67,6 +67,9 @@ query_planner(PlannerInfo *root,
root->join_rel_list = makeNode(RelInfoList);
root->join_rel_list->items = NIL;
root->join_rel_list->hash = NULL;
+ root->agg_info_list = makeNode(RelInfoList);
+ root->agg_info_list->items = NIL;
+ root->agg_info_list->hash = NULL;
root->join_rel_level = NULL;
root->join_cur_level = 0;
root->canon_pathkeys = NIL;
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 8420b8936e..c6e2d417a8 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -87,6 +87,14 @@ static void build_child_join_reltarget(PlannerInfo *root,
RelOptInfo *childrel,
int nappinfos,
AppendRelInfo **appinfos);
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+ RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+ PathTarget *target, PathTarget *agg_input,
+ List **group_exprs_extra_p);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
/*
@@ -647,6 +655,58 @@ add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
add_rel_info(root->join_rel_list, joinrel);
}
+/*
+ * add_grouped_rel
+ * Add grouped base or join relation to the list of grouped relations in
+ * the given PlannerInfo. Also add the corresponding RelAggInfo to
+ * root->agg_info_list.
+ */
+void
+add_grouped_rel(PlannerInfo *root, RelOptInfo *rel, RelAggInfo *agg_info)
+{
+ add_rel_info(&root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG], rel);
+ add_rel_info(root->agg_info_list, agg_info);
+}
+
+/*
+ * find_grouped_rel
+ * Returns grouped relation entry (base or join relation) corresponding to
+ * 'relids' or NULL if none exists.
+ *
+ * If agg_info_p is not NULL, then also the corresponding RelAggInfo (if one
+ * exists) will be returned in *agg_info_p.
+ */
+RelOptInfo *
+find_grouped_rel(PlannerInfo *root, Relids relids, RelAggInfo **agg_info_p)
+{
+ RelOptInfo *rel;
+
+ rel = (RelOptInfo *) find_rel_info(&root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG],
+ relids);
+ if (rel == NULL)
+ {
+ if (agg_info_p)
+ *agg_info_p = NULL;
+
+ return NULL;
+ }
+
+ /* also return the corresponding RelAggInfo, if asked */
+ if (agg_info_p)
+ {
+ RelAggInfo *agg_info;
+
+ agg_info = (RelAggInfo *) find_rel_info(root->agg_info_list, relids);
+
+ /* The relation exists, so the agg_info should be there too. */
+ Assert(agg_info != NULL);
+
+ *agg_info_p = agg_info;
+ }
+
+ return rel;
+}
+
/*
* set_foreign_rel_properties
* Set up foreign-join fields if outer and inner relation are foreign
@@ -2483,3 +2543,579 @@ build_child_join_reltarget(PlannerInfo *root,
childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
childrel->reltarget->width = parentrel->reltarget->width;
}
+
+/*
+ * create_rel_agg_info
+ * Check if the given relation can produce grouped paths and return the
+ * information it'll need for it. The given relation is the non-grouped one
+ * which has the reltarget already constructed.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+ ListCell *lc;
+ RelAggInfo *result;
+ PathTarget *agg_input;
+ PathTarget *target;
+ List *grp_exprs_extra = NIL;
+ List *group_clauses_final;
+ int i;
+
+ /*
+ * The lists of aggregate expressions and grouping expressions should have
+ * been constructed.
+ */
+ Assert(root->agg_clause_list != NIL);
+ Assert(root->group_expr_list != NIL);
+
+ /*
+ * If this is a child rel, the grouped rel for its parent rel must have
+ * been created if it can. So we can just use parent's RelAggInfo if there
+ * is one, with appropriate variable substitutions.
+ */
+ if (IS_OTHER_REL(rel))
+ {
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ Assert(!bms_is_empty(rel->top_parent_relids));
+ rel_grouped = find_grouped_rel(root, rel->top_parent_relids, &agg_info);
+
+ if (rel_grouped == NULL)
+ return NULL;
+
+ Assert(agg_info != NULL);
+ /* Must do multi-level transformation */
+ agg_info = (RelAggInfo *)
+ adjust_appendrel_attrs_multilevel(root,
+ (Node *) agg_info,
+ rel,
+ rel->top_parent);
+
+ agg_info->input_rows = rel->rows;
+ agg_info->grouped_rows =
+ estimate_num_groups(root, agg_info->group_exprs,
+ agg_info->input_rows, NULL, NULL);
+
+ return agg_info;
+ }
+
+ /* Check if it's possible to produce grouped paths for this relation. */
+ if (!eager_aggregation_possible_for_relation(root, rel))
+ return NULL;
+
+ /*
+ * Create targets for the grouped paths and for the input paths of the
+ * grouped paths.
+ */
+ target = create_empty_pathtarget();
+ agg_input = create_empty_pathtarget();
+
+ /* initialize 'target' and 'agg_input' */
+ if (!init_grouping_targets(root, rel, target, agg_input, &grp_exprs_extra))
+ return NULL;
+
+ /* Eager aggregation makes no sense w/o grouping expressions */
+ if ((list_length(target->exprs) + list_length(grp_exprs_extra)) == 0)
+ return NULL;
+
+ group_clauses_final = root->parse->groupClause;
+
+ /*
+ * If the aggregation target should have extra grouping expressions (in
+ * order to emit input vars for join conditions), add them now. This step
+ * includes assignment of tleSortGroupRef's which we can generate now.
+ */
+ if (list_length(grp_exprs_extra) > 0)
+ {
+ Index sortgroupref;
+
+ /*
+ * Make a copy of the group clauses as we'll need to add some more
+ * clauses.
+ */
+ group_clauses_final = list_copy(group_clauses_final);
+
+ /* find out the current max sortgroupref */
+ sortgroupref = 0;
+ foreach(lc, root->processed_tlist)
+ {
+ Index ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+ if (ref > sortgroupref)
+ sortgroupref = ref;
+ }
+
+ /*
+ * Generate the SortGroupClause's and add the expressions to the
+ * target.
+ */
+ foreach(lc, grp_exprs_extra)
+ {
+ Var *var = lfirst_node(Var, lc);
+ SortGroupClause *cl = makeNode(SortGroupClause);
+
+ /*
+ * Initialize the SortGroupClause.
+ *
+ * As the final aggregation will not use this grouping expression,
+ * we don't care whether sortop is < or >. The value of nulls_first
+ * should not matter for the same reason.
+ */
+ cl->tleSortGroupRef = ++sortgroupref;
+ get_sort_group_operators(var->vartype,
+ false, true, false,
+ &cl->sortop, &cl->eqop, NULL,
+ &cl->hashable);
+ group_clauses_final = lappend(group_clauses_final, cl);
+ add_column_to_pathtarget(target, (Expr *) var,
+ cl->tleSortGroupRef);
+
+ /*
+ * The aggregation input target must emit this var too.
+ */
+ add_column_to_pathtarget(agg_input, (Expr *) var,
+ cl->tleSortGroupRef);
+ }
+ }
+
+ /*
+ * Build a list of grouping expressions and a list of the corresponding
+ * SortGroupClauses.
+ */
+ i = 0;
+ result = makeNode(RelAggInfo);
+ foreach(lc, target->exprs)
+ {
+ Index sortgroupref = 0;
+ SortGroupClause *cl;
+ Expr *texpr;
+
+ texpr = (Expr *) lfirst(lc);
+
+ Assert(IsA(texpr, Var));
+
+ sortgroupref = target->sortgrouprefs[i++];
+ if (sortgroupref == 0)
+ continue;
+
+ /* find the SortGroupClause in group_clauses_final */
+ cl = get_sortgroupref_clause(sortgroupref, group_clauses_final);
+
+ /* do not add this SortGroupClause if it has already been added */
+ if (list_member(result->group_clauses, cl))
+ continue;
+
+ result->group_clauses = lappend(result->group_clauses, cl);
+ result->group_exprs = list_append_unique(result->group_exprs,
+ texpr);
+ }
+
+ /*
+ * Calculate pathkeys that represent this grouping requirements.
+ */
+ result->group_pathkeys =
+ make_pathkeys_for_sortclauses(root, result->group_clauses,
+ make_tlist_from_pathtarget(target));
+
+ /*
+ * Add aggregates to the grouping target.
+ */
+ foreach(lc, root->agg_clause_list)
+ {
+ AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+ Aggref *aggref;
+
+ Assert(IsA(ac_info->aggref, Aggref));
+
+ aggref = (Aggref *) copyObject(ac_info->aggref);
+ mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+ add_column_to_pathtarget(target, (Expr *) aggref, 0);
+
+ result->agg_exprs = lappend(result->agg_exprs, aggref);
+ }
+
+ /*
+ * Since neither target nor agg_input is supposed to be identical to the
+ * source reltarget, compute the width and cost again.
+ */
+ set_pathtarget_cost_width(root, target);
+ set_pathtarget_cost_width(root, agg_input);
+
+ result->relids = bms_copy(rel->relids);
+ result->target = target;
+ result->agg_input = agg_input;
+
+ /*
+ * The number of aggregation input rows is simply the number of rows of the
+ * non-grouped relation, which should have been estimated by now.
+ */
+ result->input_rows = rel->rows;
+
+ /* Estimate the number of groups with equal grouped exprs. */
+ result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+ result->input_rows, NULL, NULL);
+
+ return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ * Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+ ListCell *lc;
+
+ /*
+ * The current implementation of eager aggregation cannot handle
+ * PlaceHolderVar (PHV).
+ *
+ * If we knew that the PHV should be evaluated in this target (and of
+ * course, if its expression matched some Aggref argument), we'd just let
+ * init_grouping_targets add that Aggref. On the other hand, if we knew
+ * that the PHV is evaluated below the current rel, we could ignore it
+ * because the referencing Aggref would take care of propagation of the
+ * value to upper joins.
+ *
+ * The problem is that the same PHV can be evaluated in the target of the
+ * current rel or in that of lower rel --- depending on the input paths.
+ * For example, consider rel->relids = {A, B, C} and if ph_eval_at = {B,
+ * C}. Path "A JOIN (B JOIN C)" implies that the PHV is evaluated by the
+ * "(B JOIN C)", while path "(A JOIN B) JOIN C" evaluates the PHV itself.
+ */
+ foreach(lc, rel->reltarget->exprs)
+ {
+ Expr *expr = lfirst(lc);
+
+ if (IsA(expr, PlaceHolderVar))
+ return false;
+ }
+
+ if (IS_SIMPLE_REL(rel))
+ {
+ RangeTblEntry *rte = root->simple_rte_array[rel->relid];
+
+ /*
+ * rtekind != RTE_RELATION case is not supported yet.
+ */
+ if (rte->rtekind != RTE_RELATION)
+ return false;
+ }
+
+ /* Caller should only pass base relations or joins. */
+ Assert(rel->reloptkind == RELOPT_BASEREL ||
+ rel->reloptkind == RELOPT_JOINREL);
+
+ /*
+ * Check if all aggregate expressions can be evaluated on this relation
+ * level.
+ */
+ foreach(lc, root->agg_clause_list)
+ {
+ AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+ Assert(IsA(ac_info->aggref, Aggref));
+
+ /*
+ * Give up if any aggregate needs relations other than the current one.
+ *
+ * If the aggregate needs the current rel plus anything else, then the
+ * problem is that grouping of the current relation could make some
+ * input variables unavailable for the "higher aggregate", and it'd
+ * also decrease the number of input rows the "higher aggregate"
+ * receives.
+ *
+ * If the aggregate does not even need the current rel, then the
+ * current rel should be grouped because we do not support join of two
+ * grouped relations.
+ */
+ if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+ return false;
+ }
+
+ return true;
+}
+
+/*
+ * init_grouping_targets
+ * Initialize target for grouped paths (target) as well as a target for
+ * paths that generate input for the grouped paths (agg_input).
+ *
+ * group_exprs_extra_p receives a list of Var nodes for which we need to
+ * construct SortGroupClause. Those vars will then be used as additional
+ * grouping expressions, for the sake of join clauses.
+ *
+ * Return true iff the targets could be initialized.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+ PathTarget *target, PathTarget *agg_input,
+ List **group_exprs_extra_p)
+{
+ ListCell *lc;
+ List *possibly_dependent = NIL;
+
+ foreach(lc, rel->reltarget->exprs)
+ {
+ Expr *expr = (Expr *) lfirst(lc);
+ Index sortgroupref;
+
+ /*
+ * Given that PlaceHolderVar currently prevents us from doing eager
+ * aggregation, the source target cannot contain anything more complex
+ * than a Var.
+ */
+ Assert(IsA(expr, Var));
+
+ /* Get the sortgroupref if the expr can act as grouping expression. */
+ sortgroupref = get_expression_sortgroupref(root, expr);
+ if (sortgroupref > 0)
+ {
+ /*
+ * If the target expression can be used as the grouping key, it
+ * should be emitted by the grouped paths that have been pushed
+ * down to this relation level.
+ */
+ add_column_to_pathtarget(target, expr, sortgroupref);
+
+ /*
+ * ... and it also should be emitted by the input paths
+ */
+ add_column_to_pathtarget(agg_input, expr, sortgroupref);
+ }
+ else
+ {
+ if (is_var_needed_by_join(root, (Var *) expr, rel))
+ {
+ /*
+ * The variable is needed for a join, however it's neither in
+ * the GROUP BY clause nor can it be derived from it using EC.
+ * (Otherwise it would have to be added to the targets above.)
+ * We need to construct special SortGroupClause for this
+ * variable.
+ *
+ * Note that its tleSortGroupRef needs to be unique within
+ * agg_input, so we need to postpone creation of the
+ * SortGroupClause's until we're done with the iteration of
+ * rel->reltarget->exprs. Also it makes sense for the caller to
+ * do some more check before it starts to create those
+ * SortGroupClause's.
+ */
+ *group_exprs_extra_p = lappend(*group_exprs_extra_p, expr);
+ }
+ else if (is_var_in_aggref_only(root, (Var *) expr))
+ {
+ /*
+ * Another reason we might need this variable is that some
+ * aggregate pushed down to this relation references it. In
+ * such a case, add it to "agg_input", but not to "target".
+ * However, if the aggregate is not the only reason for the var
+ * to be in the target, some more checks need to be performed
+ * below.
+ */
+ add_new_column_to_pathtarget(agg_input, expr);
+ }
+ else
+ {
+ /*
+ * The Var can be functionally dependent on another expression
+ * of the target, but we cannot check that until we've built
+ * all the expressions for the target.
+ */
+ possibly_dependent = lappend(possibly_dependent, expr);
+ }
+ }
+ }
+
+ /*
+ * Now we can check whether the expression is functionally dependent on
+ * another one.
+ */
+ foreach(lc, possibly_dependent)
+ {
+ Var *tvar;
+ List *deps = NIL;
+ RangeTblEntry *rte;
+
+ tvar = lfirst_node(Var, lc);
+ rte = root->simple_rte_array[tvar->varno];
+
+ /*
+ * Check if the Var can be in the grouping key even though it's not
+ * mentioned by the GROUP BY clause (and could not be derived using
+ * ECs).
+ */
+ if (check_functional_grouping(rte->relid, tvar->varno,
+ tvar->varlevelsup,
+ target->exprs, &deps))
+ {
+ /*
+ * The var shouldn't be actually used for grouping key evaluation
+ * (instead, the one this depends on will be), so sortgroupref
+ * should not be important.
+ */
+ add_new_column_to_pathtarget(target, (Expr *) tvar);
+ add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+ }
+ else
+ {
+ /*
+ * As long as the query is semantically correct, arriving here
+ * means that the var is referenced by a generic grouping
+ * expression but not referenced by any join.
+ *
+ * If the eager aggregation will support generic grouping
+ * expression in the future, create_rel_agg_info() will have to add
+ * this variable to "agg_input" target and also add the whole
+ * generic expression to "target".
+ */
+ return false;
+ }
+ }
+
+ return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ * Check whether the given Var appears in aggregate expressions and not
+ * elsewhere in the targetlist.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+ List *tlist_exprs;
+ ListCell *lc;
+
+ /*
+ * Search the list of aggregate expressions for the Var.
+ */
+ foreach(lc, root->agg_clause_list)
+ {
+ AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+ List *vars;
+
+ Assert(IsA(ac_info->aggref, Aggref));
+
+ if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+ continue;
+
+ vars = pull_var_clause((Node *) ac_info->aggref,
+ PVC_RECURSE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_RECURSE_PLACEHOLDERS);
+
+ if (list_member(vars, var))
+ {
+ list_free(vars);
+ break;
+ }
+
+ list_free(vars);
+ }
+
+ /*
+ * If we reached the end of the list, the Var is not referenced in
+ * aggregate expressions.
+ */
+ if (lc == NULL)
+ return false;
+
+ /*
+ * Search the targetlist to see if the Var is referenced anywhere other
+ * than in aggregate expressions.
+ */
+ tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+ PVC_INCLUDE_AGGREGATES |
+ PVC_RECURSE_WINDOWFUNCS |
+ PVC_RECURSE_PLACEHOLDERS);
+
+ foreach(lc, tlist_exprs)
+ {
+ Var *tlist_var = (Var *) lfirst(lc);
+
+ if (IsA(tlist_var, Aggref))
+ continue;
+
+ if (equal(tlist_var, var))
+ {
+ list_free(tlist_exprs);
+ return false;
+ }
+ }
+
+ list_free(tlist_exprs);
+
+ return true;
+}
+
+/*
+ * is_var_needed_by_join
+ * Check if the given Var is needed by joins above the current rel.
+ *
+ * Consider pushing the aggregate avg(b.y) down to relation b for the following
+ * query:
+ *
+ * SELECT a.i, avg(b.y)
+ * FROM a JOIN b ON a.j = b.j
+ * GROUP BY a.i;
+ *
+ * Column b.j needs to be used as the grouping key because otherwise it cannot
+ * find its way to the input of the join expression.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+ Relids relids;
+ int attno;
+ RelOptInfo *baserel;
+
+ /*
+ * Note that when we are checking if the Var is needed by joins above, we
+ * want to exclude the situation where the Var is only needed in final
+ * output. So include "relation 0" here.
+ */
+ relids = bms_copy(rel->relids);
+ relids = bms_add_member(relids, 0);
+
+ baserel = find_base_rel(root, var->varno);
+ attno = var->varattno - baserel->min_attr;
+
+ return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ * Return sortgroupref if the given 'expr' can be used as a grouping
+ * expression in grouped paths for base or join relations, or 0 otherwise.
+ *
+ * Note that we also need to check if the 'expr' is known equal to other exprs
+ * due to equivalence relationships that can act as grouping expressions.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+ ListCell *lc;
+
+ foreach(lc, root->group_expr_list)
+ {
+ GroupExprInfo *ge_info = lfirst_node(GroupExprInfo, lc);
+
+ Assert(IsA(ge_info->expr, Var));
+
+ if (equal(ge_info->expr, expr) ||
+ exprs_known_equal(root, (Node *) expr, (Node *) ge_info->expr,
+ ge_info->btree_opfamily))
+ {
+ Assert(ge_info->sortgroupref > 0);
+
+ return ge_info->sortgroupref;
+ }
+ }
+
+ /* The expression cannot be used as grouping key. */
+ return 0;
+}
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index 5f5d7959d8..877a62a62e 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -3313,10 +3313,11 @@ add_unique_group_var(PlannerInfo *root, List *varinfos,
/*
* Drop known-equal vars, but only if they belong to different
- * relations (see comments for estimate_num_groups)
+ * relations (see comments for estimate_num_groups). We aren't too
+ * fussy about the semantics of "equal" here.
*/
if (vardata->rel != varinfo->rel &&
- exprs_known_equal(root, var, varinfo->var))
+ exprs_known_equal(root, var, varinfo->var, InvalidOid))
{
if (varinfo->ndistinct <= ndistinct)
{
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 9e785816e6..1a1a1b6dfb 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -434,6 +434,12 @@ struct PlannerInfo
*/
RelInfoList upper_rels[UPPERREL_FINAL + 1] pg_node_attr(read_write_ignore);
+ /*
+ * list of grouped relation RelAggInfos. One instance of RelAggInfo per
+ * item of the upper_rels[UPPERREL_PARTIAL_GROUP_AGG] list.
+ */
+ RelInfoList *agg_info_list;
+
/* Result tlists chosen by grouping_planner for upper-stage processing */
struct PathTarget *upper_targets[UPPERREL_FINAL + 1] pg_node_attr(read_write_ignore);
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 112e7c23d4..02da68a753 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -314,6 +314,10 @@ extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
extern RelOptInfo *find_join_rel(PlannerInfo *root, Relids relids);
+extern void add_grouped_rel(PlannerInfo *root, RelOptInfo *rel,
+ RelAggInfo *agg_info);
+extern RelOptInfo *find_grouped_rel(PlannerInfo *root, Relids relids,
+ RelAggInfo **agg_info_p);
extern RelOptInfo *build_join_rel(PlannerInfo *root,
Relids joinrelids,
RelOptInfo *outer_rel,
@@ -348,4 +352,5 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
RelOptInfo *parent_joinrel, List *restrictlist,
SpecialJoinInfo *sjinfo);
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
#endif /* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 5181220263..1068ff6953 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -159,7 +159,8 @@ extern List *generate_join_implied_equalities_for_ecs(PlannerInfo *root,
Relids join_relids,
Relids outer_relids,
RelOptInfo *inner_rel);
-extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2);
+extern bool exprs_known_equal(PlannerInfo *root, Node *item1, Node *item2,
+ Oid opfamily);
extern EquivalenceClass *match_eclasses_to_foreign_key_col(PlannerInfo *root,
ForeignKeyOptInfo *fkinfo,
int colno);
--
2.31.0
[application/octet-stream] v7-0005-Implement-functions-that-generate-paths-for-grouped-relations.patch (13.1K, 7-v7-0005-Implement-functions-that-generate-paths-for-grouped-relations.patch)
download | inline diff:
From ef1b95894276bb3225a15c801201a78067a5cf4a Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 14:19:39 +0800
Subject: [PATCH v7 5/9] Implement functions that generate paths for grouped
relations
This commit implements the functions that generate paths for grouped
relations by adding sorted and hashed partial aggregation paths on top
of paths of the plain base or join relations.
---
src/backend/optimizer/path/allpaths.c | 307 ++++++++++++++++++++++++++
src/backend/optimizer/util/pathnode.c | 12 +-
src/include/optimizer/paths.h | 4 +
3 files changed, 315 insertions(+), 8 deletions(-)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index d1b974367b..0c2fae9608 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
#include "optimizer/paths.h"
#include "optimizer/plancat.h"
#include "optimizer/planner.h"
+#include "optimizer/prep.h"
#include "optimizer/tlist.h"
#include "parser/parse_clause.h"
#include "parser/parsetree.h"
@@ -47,6 +48,7 @@
#include "port/pg_bitutils.h"
#include "rewrite/rewriteManip.h"
#include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
/* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -3296,6 +3298,311 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
}
}
+/*
+ * generate_grouped_paths
+ * Generate paths for a grouped relation by adding sorted and hashed
+ * partial aggregation paths on top of paths of the plain base or join
+ * relation.
+ *
+ * The information needed are provided by the RelAggInfo structure.
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *rel_grouped,
+ RelOptInfo *rel_plain, RelAggInfo *agg_info)
+{
+ AggClauseCosts agg_costs;
+ bool can_hash;
+ bool can_sort;
+ Path *cheapest_total_path = NULL;
+ Path *cheapest_partial_path = NULL;
+ double dNumGroups = 0;
+ double dNumPartialGroups = 0;
+
+ if (IS_DUMMY_REL(rel_plain))
+ {
+ mark_dummy_rel(rel_grouped);
+ return;
+ }
+
+ MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+ get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+ /*
+ * Determine whether it's possible to perform sort-based implementations of
+ * grouping.
+ */
+ can_sort = grouping_is_sortable(agg_info->group_clauses);
+
+ /*
+ * Determine whether we should consider hash-based implementations of
+ * grouping.
+ */
+ Assert(root->numOrderedAggs == 0);
+ can_hash = (agg_info->group_clauses != NIL &&
+ grouping_is_hashable(agg_info->group_clauses));
+
+ /*
+ * Consider whether we should generate partially aggregated non-partial
+ * paths. We can only do this if we have a non-partial path.
+ */
+ if (rel_plain->pathlist != NIL)
+ {
+ cheapest_total_path = rel_plain->cheapest_total_path;
+ Assert(cheapest_total_path != NULL);
+ }
+
+ /*
+ * If parallelism is possible for rel_grouped, then we should consider
+ * generating partially-grouped partial paths. However, if the plain rel
+ * has no partial paths, then we can't.
+ */
+ if (rel_grouped->consider_parallel && rel_plain->partial_pathlist != NIL)
+ {
+ cheapest_partial_path = linitial(rel_plain->partial_pathlist);
+ Assert(cheapest_partial_path != NULL);
+ }
+
+ /* Estimate number of partial groups. */
+ if (cheapest_total_path != NULL)
+ dNumGroups = estimate_num_groups(root,
+ agg_info->group_exprs,
+ cheapest_total_path->rows,
+ NULL, NULL);
+ if (cheapest_partial_path != NULL)
+ dNumPartialGroups = estimate_num_groups(root,
+ agg_info->group_exprs,
+ cheapest_partial_path->rows,
+ NULL, NULL);
+
+ if (can_sort && cheapest_total_path != NULL)
+ {
+ ListCell *lc;
+
+ /*
+ * Use any available suitably-sorted path as input, and also consider
+ * sorting the cheapest-total path.
+ */
+ foreach(lc, rel_plain->pathlist)
+ {
+ Path *input_path = (Path *) lfirst(lc);
+ Path *path;
+ bool is_sorted;
+ int presorted_keys;
+
+ /*
+ * Since the path originates from the non-grouped relation which is
+ * not aware of eager aggregation, we must ensure that it provides
+ * the correct input for the partial aggregation.
+ */
+ path = (Path *) create_projection_path(root,
+ rel_grouped,
+ input_path,
+ agg_info->agg_input);
+
+ is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+ path->pathkeys,
+ &presorted_keys);
+ if (!is_sorted)
+ {
+ /*
+ * Try at least sorting the cheapest path and also try
+ * incrementally sorting any path which is partially sorted
+ * already (no need to deal with paths which have presorted
+ * keys when incremental sort is disabled unless it's the
+ * cheapest input path).
+ */
+ if (input_path != cheapest_total_path &&
+ (presorted_keys == 0 || !enable_incremental_sort))
+ continue;
+
+ /*
+ * We've no need to consider both a sort and incremental sort.
+ * We'll just do a sort if there are no presorted keys and an
+ * incremental sort when there are presorted keys.
+ */
+ if (presorted_keys == 0 || !enable_incremental_sort)
+ path = (Path *) create_sort_path(root,
+ rel_grouped,
+ path,
+ agg_info->group_pathkeys,
+ -1.0);
+ else
+ path = (Path *) create_incremental_sort_path(root,
+ rel_grouped,
+ path,
+ agg_info->group_pathkeys,
+ presorted_keys,
+ -1.0);
+ }
+
+ /*
+ * qual is NIL because the HAVING clause cannot be evaluated until the
+ * final value of the aggregate is known.
+ */
+ path = (Path *) create_agg_path(root,
+ rel_grouped,
+ path,
+ agg_info->target,
+ AGG_SORTED,
+ AGGSPLIT_INITIAL_SERIAL,
+ agg_info->group_clauses,
+ NIL,
+ &agg_costs,
+ dNumGroups);
+
+ add_path(rel_grouped, path);
+ }
+ }
+
+ if (can_sort && cheapest_partial_path != NULL)
+ {
+ ListCell *lc;
+
+ /* Similar to above logic, but for partial paths. */
+ foreach(lc, rel_plain->partial_pathlist)
+ {
+ Path *input_path = (Path *) lfirst(lc);
+ Path *path;
+ bool is_sorted;
+ int presorted_keys;
+
+ /*
+ * Since the path originates from the non-grouped relation which is
+ * not aware of eager aggregation, we must ensure that it provides
+ * the correct input for the partial aggregation.
+ */
+ path = (Path *) create_projection_path(root,
+ rel_grouped,
+ input_path,
+ agg_info->agg_input);
+
+ is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+ path->pathkeys,
+ &presorted_keys);
+
+ if (!is_sorted)
+ {
+ /*
+ * Try at least sorting the cheapest path and also try
+ * incrementally sorting any path which is partially sorted
+ * already (no need to deal with paths which have presorted
+ * keys when incremental sort is disabled unless it's the
+ * cheapest input path).
+ */
+ if (input_path != cheapest_partial_path &&
+ (presorted_keys == 0 || !enable_incremental_sort))
+ continue;
+
+ /*
+ * We've no need to consider both a sort and incremental sort.
+ * We'll just do a sort if there are no presorted keys and an
+ * incremental sort when there are presorted keys.
+ */
+ if (presorted_keys == 0 || !enable_incremental_sort)
+ path = (Path *) create_sort_path(root,
+ rel_grouped,
+ path,
+ agg_info->group_pathkeys,
+ -1.0);
+ else
+ path = (Path *) create_incremental_sort_path(root,
+ rel_grouped,
+ path,
+ agg_info->group_pathkeys,
+ presorted_keys,
+ -1.0);
+ }
+
+ /*
+ * qual is NIL because the HAVING clause cannot be evaluated until the
+ * final value of the aggregate is known.
+ */
+ path = (Path *) create_agg_path(root,
+ rel_grouped,
+ path,
+ agg_info->target,
+ AGG_SORTED,
+ AGGSPLIT_INITIAL_SERIAL,
+ agg_info->group_clauses,
+ NIL,
+ &agg_costs,
+ dNumPartialGroups);
+
+ add_partial_path(rel_grouped, path);
+ }
+ }
+
+ /*
+ * Add a partially-grouped HashAgg Path where possible
+ */
+ if (can_hash && cheapest_total_path != NULL)
+ {
+ Path *path;
+
+ /*
+ * Since the path originates from the non-grouped relation which is
+ * not aware of eager aggregation, we must ensure that it provides
+ * the correct input for the partial aggregation.
+ */
+ path = (Path *) create_projection_path(root,
+ rel_grouped,
+ cheapest_total_path,
+ agg_info->agg_input);
+
+ /*
+ * qual is NIL because the HAVING clause cannot be evaluated until
+ * the final value of the aggregate is known.
+ */
+ path = (Path *) create_agg_path(root,
+ rel_grouped,
+ path,
+ agg_info->target,
+ AGG_HASHED,
+ AGGSPLIT_INITIAL_SERIAL,
+ agg_info->group_clauses,
+ NIL,
+ &agg_costs,
+ dNumGroups);
+
+ add_path(rel_grouped, path);
+ }
+
+ /*
+ * Now add a partially-grouped HashAgg partial Path where possible
+ */
+ if (can_hash && cheapest_partial_path != NULL)
+ {
+ Path *path;
+
+ /*
+ * Since the path originates from the non-grouped relation which is
+ * not aware of eager aggregation, we must ensure that it provides
+ * the correct input for the partial aggregation.
+ */
+ path = (Path *) create_projection_path(root,
+ rel_grouped,
+ cheapest_partial_path,
+ agg_info->agg_input);
+
+ /*
+ * qual is NIL because the HAVING clause cannot be evaluated until
+ * the final value of the aggregate is known.
+ */
+ path = (Path *) create_agg_path(root,
+ rel_grouped,
+ path,
+ agg_info->target,
+ AGG_HASHED,
+ AGGSPLIT_INITIAL_SERIAL,
+ agg_info->group_clauses,
+ NIL,
+ &agg_costs,
+ dNumPartialGroups);
+
+ add_partial_path(rel_grouped, path);
+ }
+}
+
/*
* make_rel_from_joinlist
* Build access paths using a "joinlist" to guide the join path search.
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 3491c3af1c..977c0ea4eb 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2709,8 +2709,7 @@ create_projection_path(PlannerInfo *root,
pathnode->path.pathtype = T_Result;
pathnode->path.parent = rel;
pathnode->path.pathtarget = target;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe &&
@@ -2962,8 +2961,7 @@ create_incremental_sort_path(PlannerInfo *root,
pathnode->path.parent = rel;
/* Sort doesn't project, so use source path's pathtarget */
pathnode->path.pathtarget = subpath->pathtarget;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
@@ -3009,8 +3007,7 @@ create_sort_path(PlannerInfo *root,
pathnode->path.parent = rel;
/* Sort doesn't project, so use source path's pathtarget */
pathnode->path.pathtarget = subpath->pathtarget;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
@@ -3168,8 +3165,7 @@ create_agg_path(PlannerInfo *root,
pathnode->path.pathtype = T_Agg;
pathnode->path.parent = rel;
pathnode->path.pathtarget = target;
- /* For now, assume we are above any joins, so no parameterization */
- pathnode->path.param_info = NULL;
+ pathnode->path.param_info = subpath->param_info;
pathnode->path.parallel_aware = false;
pathnode->path.parallel_safe = rel->consider_parallel &&
subpath->parallel_safe;
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 1068ff6953..74015b4ed8 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -58,6 +58,10 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
bool override_rows);
extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root,
+ RelOptInfo *rel_grouped,
+ RelOptInfo *rel_plain,
+ RelAggInfo *agg_info);
extern int compute_parallel_worker(RelOptInfo *rel, double heap_pages,
double index_pages, int max_workers);
extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
--
2.31.0
[application/octet-stream] v7-0006-Build-grouped-relations-out-of-base-relations.patch (9.0K, 8-v7-0006-Build-grouped-relations-out-of-base-relations.patch)
download | inline diff:
From e982f62ea075a908e78dad894739e1a190cc1a5f Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Wed, 28 Feb 2024 10:03:41 +0800
Subject: [PATCH v7 6/9] Build grouped relations out of base relations
This commit builds grouped relations for each base relation if possible,
and generates aggregation paths for the grouped base relations.
---
src/backend/optimizer/path/allpaths.c | 91 +++++++++++++++++++++++
src/backend/optimizer/util/relnode.c | 101 ++++++++++++++++++++++++++
src/include/optimizer/pathnode.h | 4 +
3 files changed, 196 insertions(+)
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 0c2fae9608..9219815e3d 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -93,6 +93,7 @@ join_search_hook_type join_search_hook = NULL;
static void set_base_rel_consider_startup(PlannerInfo *root);
static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_base_grouped_rels(PlannerInfo *root);
static void set_base_rel_pathlists(PlannerInfo *root);
static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
Index rti, RangeTblEntry *rte);
@@ -117,6 +118,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
Index rti, RangeTblEntry *rte);
static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
List *live_childrels,
List *all_child_pathkeys);
@@ -185,6 +187,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
*/
set_base_rel_sizes(root);
+ /*
+ * Build grouped base relations for each base rel if possible.
+ */
+ setup_base_grouped_rels(root);
+
/*
* We should now have size estimates for every actual table involved in
* the query, and we also know which if any have been deleted from the
@@ -326,6 +333,59 @@ set_base_rel_sizes(PlannerInfo *root)
}
}
+/*
+ * setup_base_grouped_rels
+ * For each "plain" base relation build a grouped base relation if eager
+ * aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_base_grouped_rels(PlannerInfo *root)
+{
+ Index rti;
+
+ /*
+ * If there are no aggregate expressions or grouping expressions, eager
+ * aggregation is not possible.
+ */
+ if (root->agg_clause_list == NIL ||
+ root->group_expr_list == NIL)
+ return;
+
+ /*
+ * Eager aggregation only makes sense if there are multiple base rels in
+ * the query.
+ */
+ if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+ return;
+
+ for (rti = 1; rti < root->simple_rel_array_size; rti++)
+ {
+ RelOptInfo *rel = root->simple_rel_array[rti];
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ /* there may be empty slots corresponding to non-baserel RTEs */
+ if (rel == NULL)
+ continue;
+
+ Assert(rel->relid == rti); /* sanity check on array */
+
+ /*
+ * Ignore RTEs that are not simple rels. Note that we need to consider
+ * "other rels" here.
+ */
+ if (!IS_SIMPLE_REL(rel))
+ continue;
+
+ rel_grouped = build_simple_grouped_rel(root, rel->relid, &agg_info);
+ if (rel_grouped)
+ {
+ /* Make the grouped relation available for joining. */
+ add_grouped_rel(root, rel_grouped, agg_info);
+ }
+ }
+}
+
/*
* set_base_rel_pathlists
* Finds all paths available for scanning each base-relation entry.
@@ -562,6 +622,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
/* Now find the cheapest of the paths for this rel */
set_cheapest(rel);
+ /*
+ * If a grouped relation for this rel exists, build partial aggregation
+ * paths for it.
+ *
+ * Note that this can only happen after we've called set_cheapest() for
+ * this base rel, because we need its cheapest paths.
+ */
+ set_grouped_rel_pathlist(root, rel);
+
#ifdef OPTIMIZER_DEBUG
pprint(rel);
#endif
@@ -1289,6 +1358,28 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
add_paths_to_append_rel(root, rel, live_childrels);
}
+/*
+ * set_grouped_rel_pathlist
+ * If a grouped relation for the given 'rel' exists, build partial
+ * aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ /* Add paths to the grouped base relation if one exists. */
+ rel_grouped = find_grouped_rel(root, rel->relids,
+ &agg_info);
+ if (rel_grouped)
+ {
+ generate_grouped_paths(root, rel_grouped, rel,
+ agg_info);
+ set_cheapest(rel_grouped);
+ }
+}
+
/*
* add_paths_to_append_rel
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index c6e2d417a8..b14f99a9ea 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,7 @@
#include <limits.h>
+#include "catalog/pg_constraint.h"
#include "miscadmin.h"
#include "nodes/nodeFuncs.h"
#include "optimizer/appendinfo.h"
@@ -27,12 +28,15 @@
#include "optimizer/paths.h"
#include "optimizer/placeholder.h"
#include "optimizer/plancat.h"
+#include "optimizer/planner.h"
#include "optimizer/restrictinfo.h"
#include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
#include "parser/parse_relation.h"
#include "rewrite/rewriteManip.h"
#include "utils/hsearch.h"
#include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
/*
@@ -418,6 +422,103 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
return rel;
}
+/*
+ * build_simple_grouped_rel
+ * Construct a new RelOptInfo for a grouped base relation out of an existing
+ * non-grouped base relation.
+ *
+ * On success, the new RelOptInfo is returned and the corresponding RelAggInfo
+ * is stored in *agg_info_p.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, int relid,
+ RelAggInfo **agg_info_p)
+{
+ RelOptInfo *rel_plain;
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ /*
+ * We should have available aggregate expressions and grouping expressions,
+ * otherwise we cannot reach here.
+ */
+ Assert(root->agg_clause_list != NIL);
+ Assert(root->group_expr_list != NIL);
+
+ rel_plain = root->simple_rel_array[relid];
+ Assert(rel_plain != NULL);
+ Assert(IS_SIMPLE_REL(rel_plain));
+
+ /* nothing to do for dummy rel */
+ if (IS_DUMMY_REL(rel_plain))
+ return NULL;
+
+ /*
+ * Prepare the information we need to create grouped paths for this base
+ * relation.
+ */
+ agg_info = create_rel_agg_info(root, rel_plain);
+ if (agg_info == NULL)
+ return NULL;
+
+ /* build a grouped relation out of the plain relation */
+ rel_grouped = build_grouped_rel(root, rel_plain);
+ rel_grouped->reltarget = agg_info->target;
+ rel_grouped->rows = agg_info->grouped_rows;
+
+ /* return the RelAggInfo structure */
+ *agg_info_p = agg_info;
+
+ return rel_grouped;
+}
+
+/*
+ * build_grouped_rel
+ * Build a grouped relation by flat copying a plain relation and resetting
+ * the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel_plain)
+{
+ RelOptInfo *rel_grouped;
+
+ rel_grouped = makeNode(RelOptInfo);
+ memcpy(rel_grouped, rel_plain, sizeof(RelOptInfo));
+
+ /*
+ * clear path info
+ */
+ rel_grouped->pathlist = NIL;
+ rel_grouped->ppilist = NIL;
+ rel_grouped->partial_pathlist = NIL;
+ rel_grouped->cheapest_startup_path = NULL;
+ rel_grouped->cheapest_total_path = NULL;
+ rel_grouped->cheapest_unique_path = NULL;
+ rel_grouped->cheapest_parameterized_paths = NIL;
+
+ /*
+ * clear partition info
+ */
+ rel_grouped->part_scheme = NULL;
+ rel_grouped->nparts = -1;
+ rel_grouped->boundinfo = NULL;
+ rel_grouped->partbounds_merged = false;
+ rel_grouped->partition_qual = NIL;
+ rel_grouped->part_rels = NULL;
+ rel_grouped->live_parts = NULL;
+ rel_grouped->all_partrels = NULL;
+ rel_grouped->partexprs = NULL;
+ rel_grouped->nullable_partexprs = NULL;
+ rel_grouped->consider_partitionwise_join = false;
+
+ /*
+ * clear size estimates
+ */
+ rel_grouped->rows = 0;
+
+ return rel_grouped;
+}
+
/*
* find_base_rel
* Find a base or otherrel relation entry, which must already exist.
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 02da68a753..525481f296 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -310,6 +310,10 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
extern void expand_planner_arrays(PlannerInfo *root, int add_size);
extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root, int relid,
+ RelAggInfo **agg_info_p);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
+ RelOptInfo *rel_plain);
extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
--
2.31.0
[application/octet-stream] v7-0007-Build-grouped-relations-out-of-join-relations.patch (25.8K, 9-v7-0007-Build-grouped-relations-out-of-join-relations.patch)
download | inline diff:
From 2a9dc93c243ff21c74719cc67b1723b7c6dc5880 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 13:33:09 +0800
Subject: [PATCH v7 7/9] Build grouped relations out of join relations
This commit builds grouped relations for each just-processed join
relation if possible, and generates aggregation paths for the grouped
join relations.
The changes made to make_join_rel() are relatively minor, with the
addition of a new function make_grouped_join_rel(), which finds or
creates a grouped relation for the just-processed joinrel, and generates
grouped paths by joining a grouped input relation with a non-grouped
input relation.
The other way to generate grouped paths is by adding sorted and hashed
partial aggregation paths on top of paths of the joinrel. This occurs
in standard_join_search(), after we've run set_cheapest() for the
joinrel. The reason for performing this step after set_cheapest() is
that we need to know the joinrel's cheapest paths (see
generate_grouped_paths()).
This patch also makes the grouped relation for the topmost join rel act
as the upper rel representing the result of partial aggregation, so that
we can add the final aggregation on top of that. Additionally, this
patch extends the functionality of eager aggregation to work with
partitionwise join and geqo.
This patch also makes eager aggregation work with outer joins. With
outer joins, the aggregate cannot be pushed down if any column
referenced by grouping expressions or aggregate functions is nullable by
an outer join above the relation to which we want to apply the partial
aggregation. Thanks to Tom's outer-join-aware-Var infrastructure, we
can easily identify such situations and subsequently refrain from
pushing down the aggregates.
Starting from this patch, you should be able to see plans with eager
aggregation.
---
src/backend/optimizer/geqo/geqo_eval.c | 84 ++++++++++++----
src/backend/optimizer/path/allpaths.c | 48 ++++++++++
src/backend/optimizer/path/joinrels.c | 122 ++++++++++++++++++++++++
src/backend/optimizer/plan/planner.c | 92 +++++++++++++-----
src/backend/optimizer/util/appendinfo.c | 60 ++++++++++++
src/backend/optimizer/util/relnode.c | 2 -
src/include/nodes/pathnodes.h | 6 --
7 files changed, 363 insertions(+), 51 deletions(-)
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index 1141156899..278857d767 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -60,8 +60,12 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
MemoryContext oldcxt;
RelOptInfo *joinrel;
Cost fitness;
- int savelength;
- struct HTAB *savehash;
+ int savelength_join_rel;
+ struct HTAB *savehash_join_rel;
+ int savelength_grouped_rel;
+ struct HTAB *savehash_grouped_rel;
+ int savelength_grouped_info;
+ struct HTAB *savehash_grouped_info;
/*
* Create a private memory context that will hold all temp storage
@@ -78,25 +82,38 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
oldcxt = MemoryContextSwitchTo(mycontext);
/*
- * gimme_tree will add entries to root->join_rel_list, which may or may
- * not already contain some entries. The newly added entries will be
- * recycled by the MemoryContextDelete below, so we must ensure that the
- * list is restored to its former state before exiting. We can do this by
- * truncating the list to its original length. NOTE this assumes that any
- * added entries are appended at the end!
+ * gimme_tree will add entries to root->join_rel_list, root->agg_info_list
+ * and root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG], which may or may not
+ * already contain some entries. The newly added entries will be recycled
+ * by the MemoryContextDelete below, so we must ensure that each list of
+ * the RelInfoList structures is restored to its former state before
+ * exiting. We can do this by truncating each list to its original length.
+ * NOTE this assumes that any added entries are appended at the end!
*
- * We also must take care not to mess up the outer join_rel_list->hash, if
- * there is one. We can do this by just temporarily setting the link to
- * NULL. (If we are dealing with enough join rels, which we very likely
- * are, a new hash table will get built and used locally.)
+ * We also must take care not to mess up the outer hash tables of the
+ * RelInfoList structures, if any. We can do this by just temporarily
+ * setting each link to NULL. (If we are dealing with enough join rels,
+ * which we very likely are, new hash tables will get built and used
+ * locally.)
*
* join_rel_level[] shouldn't be in use, so just Assert it isn't.
*/
- savelength = list_length(root->join_rel_list->items);
- savehash = root->join_rel_list->hash;
+ savelength_join_rel = list_length(root->join_rel_list->items);
+ savehash_join_rel = root->join_rel_list->hash;
+
+ savelength_grouped_rel =
+ list_length(root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG].items);
+ savehash_grouped_rel =
+ root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG].hash;
+
+ savelength_grouped_info = list_length(root->agg_info_list->items);
+ savehash_grouped_info = root->agg_info_list->hash;
+
Assert(root->join_rel_level == NULL);
root->join_rel_list->hash = NULL;
+ root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG].hash = NULL;
+ root->agg_info_list->hash = NULL;
/* construct the best path for the given combination of relations */
joinrel = gimme_tree(root, tour, num_gene);
@@ -118,12 +135,22 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
fitness = DBL_MAX;
/*
- * Restore join_rel_list to its former state, and put back original
- * hashtable if any.
+ * Restore each of the list in join_rel_list, agg_info_list and
+ * upper_rels[UPPERREL_PARTIAL_GROUP_AGG] to its former state, and put back
+ * original hashtable if any.
*/
root->join_rel_list->items = list_truncate(root->join_rel_list->items,
- savelength);
- root->join_rel_list->hash = savehash;
+ savelength_join_rel);
+ root->join_rel_list->hash = savehash_join_rel;
+
+ root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG].items =
+ list_truncate(root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG].items,
+ savelength_grouped_rel);
+ root->upper_rels[UPPERREL_PARTIAL_GROUP_AGG].hash = savehash_grouped_rel;
+
+ root->agg_info_list->items = list_truncate(root->agg_info_list->items,
+ savelength_grouped_info);
+ root->agg_info_list->hash = savehash_grouped_info;
/* release all the memory acquired within gimme_tree */
MemoryContextSwitchTo(oldcxt);
@@ -279,6 +306,27 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
/* Find and save the cheapest paths for this joinrel */
set_cheapest(joinrel);
+ /*
+ * Except for the topmost scan/join rel, consider generating
+ * partial aggregation paths for the grouped relation on top of the
+ * paths of this rel. After that, we're done creating paths for
+ * the grouped relation, so run set_cheapest().
+ */
+ if (!bms_equal(joinrel->relids, root->all_query_rels))
+ {
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ rel_grouped = find_grouped_rel(root, joinrel->relids,
+ &agg_info);
+ if (rel_grouped)
+ {
+ generate_grouped_paths(root, rel_grouped, joinrel,
+ agg_info);
+ set_cheapest(rel_grouped);
+ }
+ }
+
/* Absorb new clump into old */
old_clump->joinrel = joinrel;
old_clump->size += new_clump->size;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 9219815e3d..359eee3486 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -3854,6 +3854,10 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
*
* After that, we're done creating paths for the joinrel, so run
* set_cheapest().
+ *
+ * In addition, we also run generate_grouped_paths() for the grouped
+ * relation of each just-processed joinrel, and run set_cheapest() for
+ * the grouped relation afterwards.
*/
foreach(lc, root->join_rel_level[lev])
{
@@ -3874,6 +3878,27 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
/* Find and save the cheapest paths for this rel */
set_cheapest(rel);
+ /*
+ * Except for the topmost scan/join rel, consider generating
+ * partial aggregation paths for the grouped relation on top of the
+ * paths of this rel. After that, we're done creating paths for
+ * the grouped relation, so run set_cheapest().
+ */
+ if (!bms_equal(rel->relids, root->all_query_rels))
+ {
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ rel_grouped = find_grouped_rel(root, rel->relids,
+ &agg_info);
+ if (rel_grouped)
+ {
+ generate_grouped_paths(root, rel_grouped, rel,
+ agg_info);
+ set_cheapest(rel_grouped);
+ }
+ }
+
#ifdef OPTIMIZER_DEBUG
pprint(rel);
#endif
@@ -4742,6 +4767,29 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
if (IS_DUMMY_REL(child_rel))
continue;
+ /*
+ * Except for the topmost scan/join rel, consider generating partial
+ * aggregation paths for the grouped relation on top of the paths of
+ * this partitioned child-join. After that, we're done creating paths
+ * for the grouped relation, so run set_cheapest().
+ */
+ if (!bms_equal(IS_OTHER_REL(rel) ?
+ rel->top_parent_relids : rel->relids,
+ root->all_query_rels))
+ {
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info;
+
+ rel_grouped = find_grouped_rel(root, child_rel->relids,
+ &agg_info);
+ if (rel_grouped)
+ {
+ generate_grouped_paths(root, rel_grouped, child_rel,
+ agg_info);
+ set_cheapest(rel_grouped);
+ }
+ }
+
#ifdef OPTIMIZER_DEBUG
pprint(child_rel);
#endif
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index f3a9412d18..ba1d15e85a 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -16,11 +16,13 @@
#include "miscadmin.h"
#include "optimizer/appendinfo.h"
+#include "optimizer/cost.h"
#include "optimizer/joininfo.h"
#include "optimizer/pathnode.h"
#include "optimizer/paths.h"
#include "partitioning/partbounds.h"
#include "utils/memutils.h"
+#include "utils/selfuncs.h"
static void make_rels_by_clause_joins(PlannerInfo *root,
@@ -35,6 +37,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
static bool restriction_is_constant_false(List *restrictlist,
RelOptInfo *joinrel,
bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+ RelOptInfo *rel2, RelOptInfo *joinrel,
+ SpecialJoinInfo *sjinfo, List *restrictlist);
static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
RelOptInfo *rel2, RelOptInfo *joinrel,
SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -771,6 +776,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
return joinrel;
}
+ /* Build a grouped join relation for 'joinrel' if possible. */
+ make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+ restrictlist);
+
/* Add paths to the join relation. */
populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
restrictlist);
@@ -882,6 +891,114 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
return input_relids;
}
+/*
+ * make_grouped_join_rel
+ * Build a grouped join relation out of 'joinrel' if eager aggregation is
+ * possible and the 'joinrel' can produce grouped paths.
+ *
+ * We also generate partial aggregation paths for the grouped relation by
+ * joining the grouped paths of 'rel1' to the plain paths of 'rel2', or by
+ * joining the grouped paths of 'rel2' to the plain paths of 'rel1'.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+ RelOptInfo *rel2, RelOptInfo *joinrel,
+ SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+ RelOptInfo *rel_grouped;
+ RelAggInfo *agg_info = NULL;
+ RelOptInfo *rel1_grouped;
+ RelOptInfo *rel2_grouped;
+ bool rel1_empty;
+ bool rel2_empty;
+
+ /*
+ * If there are no aggregate expressions or grouping expressions, eager
+ * aggregation is not possible.
+ */
+ if (root->agg_clause_list == NIL ||
+ root->group_expr_list == NIL)
+ return;
+
+ /*
+ * See if we already have a grouped joinrel for this joinrel.
+ */
+ rel_grouped = find_grouped_rel(root, joinrel->relids, &agg_info);
+
+ /*
+ * Construct a new RelOptInfo for the grouped join relation if there is no
+ * existing one.
+ */
+ if (rel_grouped == NULL)
+ {
+ /*
+ * Prepare the information we need to create grouped paths for this
+ * join relation.
+ */
+ agg_info = create_rel_agg_info(root, joinrel);
+ if (agg_info == NULL)
+ return;
+
+ /* build a grouped relation out of the plain relation */
+ rel_grouped = build_grouped_rel(root, joinrel);
+ rel_grouped->reltarget = agg_info->target;
+ rel_grouped->rows = agg_info->grouped_rows;
+
+ /*
+ * Make the grouped relation available for further joining or for
+ * acting as the upper rel representing the result of partial
+ * aggregation.
+ */
+ add_grouped_rel(root, rel_grouped, agg_info);
+ }
+
+ Assert(agg_info != NULL);
+
+ /*
+ * If we've already proven this grouped join relation is empty, we needn't
+ * consider any more paths for it.
+ */
+ if (IS_DUMMY_REL(rel_grouped))
+ return;
+
+ /* retrieve the grouped relations for the two input rels */
+ rel1_grouped = find_grouped_rel(root, rel1->relids, NULL);
+ rel2_grouped = find_grouped_rel(root, rel2->relids, NULL);
+
+ rel1_empty = (rel1_grouped == NULL || IS_DUMMY_REL(rel1_grouped));
+ rel2_empty = (rel2_grouped == NULL || IS_DUMMY_REL(rel2_grouped));
+
+ /* Nothing to do if there's no grouped relation. */
+ if (rel1_empty && rel2_empty)
+ return;
+
+ /*
+ * Join of two grouped relations is currently not supported. In such a
+ * case, grouping of one side would change the occurrence of the other
+ * side's aggregate transient states on the input of the final aggregation.
+ * This can be handled by adjusting the transient states, but it's not
+ * worth the effort for now.
+ */
+ if (!rel1_empty && !rel2_empty)
+ return;
+
+ /* generate partial aggregation paths for the grouped relation */
+ if (!rel1_empty)
+ {
+ set_joinrel_size_estimates(root, rel_grouped, rel1_grouped, rel2,
+ sjinfo, restrictlist);
+ populate_joinrel_with_paths(root, rel1_grouped, rel2, rel_grouped,
+ sjinfo, restrictlist);
+ }
+ else if (!rel2_empty)
+ {
+ set_joinrel_size_estimates(root, rel_grouped, rel1, rel2_grouped,
+ sjinfo, restrictlist);
+ populate_joinrel_with_paths(root, rel1, rel2_grouped, rel_grouped,
+ sjinfo, restrictlist);
+ }
+}
+
/*
* populate_joinrel_with_paths
* Add paths to the given joinrel for given pair of joining relations. The
@@ -1671,6 +1788,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
adjust_child_relids(joinrel->relids,
nappinfos, appinfos)));
+ /* Build a grouped join relation for 'child_joinrel' if possible */
+ make_grouped_join_rel(root, child_rel1, child_rel2,
+ child_joinrel, child_sjinfo,
+ child_restrictlist);
+
/* And make paths for the child join */
populate_joinrel_with_paths(root, child_rel1, child_rel2,
child_joinrel, child_sjinfo,
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 032818423f..64e8e5bb91 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -225,7 +225,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
RelOptInfo *partially_grouped_rel,
const AggClauseCosts *agg_costs,
grouping_sets_data *gd,
- double dNumGroups,
GroupPathExtraData *extra);
static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
RelOptInfo *grouped_rel,
@@ -3910,9 +3909,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
GroupPathExtraData *extra,
RelOptInfo **partially_grouped_rel_p)
{
- Path *cheapest_path = input_rel->cheapest_total_path;
RelOptInfo *partially_grouped_rel = NULL;
- double dNumGroups;
PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
/*
@@ -3993,23 +3990,21 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
/* Gather any partially grouped partial paths. */
if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
- {
gather_grouping_paths(root, partially_grouped_rel);
- set_cheapest(partially_grouped_rel);
- }
/*
- * Estimate number of groups.
+ * Now choose the best path(s) for partially_grouped_rel.
+ *
+ * Note that the non-partial paths can come either from the Gather above or
+ * from eager aggregation.
*/
- dNumGroups = get_number_of_groups(root,
- cheapest_path->rows,
- gd,
- extra->targetList);
+ if (partially_grouped_rel && partially_grouped_rel->pathlist)
+ set_cheapest(partially_grouped_rel);
/* Build final grouping paths */
add_paths_to_grouping_rel(root, input_rel, grouped_rel,
partially_grouped_rel, agg_costs, gd,
- dNumGroups, extra);
+ extra);
/* Give a helpful error if we failed to find any implementation */
if (grouped_rel->pathlist == NIL)
@@ -6877,16 +6872,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
RelOptInfo *grouped_rel,
RelOptInfo *partially_grouped_rel,
const AggClauseCosts *agg_costs,
- grouping_sets_data *gd, double dNumGroups,
+ grouping_sets_data *gd,
GroupPathExtraData *extra)
{
Query *parse = root->parse;
Path *cheapest_path = input_rel->cheapest_total_path;
+ Path *cheapest_partially_grouped_path = NULL;
ListCell *lc;
bool can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
bool can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
List *havingQual = (List *) extra->havingQual;
AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+ double dNumGroups = 0;
+ double dNumFinalGroups = 0;
+
+ /*
+ * Estimate number of groups for non-split aggregation.
+ */
+ dNumGroups = get_number_of_groups(root,
+ cheapest_path->rows,
+ gd,
+ extra->targetList);
+
+ if (partially_grouped_rel && partially_grouped_rel->pathlist)
+ {
+ cheapest_partially_grouped_path =
+ partially_grouped_rel->cheapest_total_path;
+
+ /*
+ * Estimate number of groups for final phase of partial aggregation.
+ */
+ dNumFinalGroups =
+ get_number_of_groups(root,
+ cheapest_partially_grouped_path->rows,
+ gd,
+ extra->targetList);
+ }
if (can_sort)
{
@@ -6998,7 +7019,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
path = make_ordered_path(root,
grouped_rel,
path,
- partially_grouped_rel->cheapest_total_path,
+ cheapest_partially_grouped_path,
info->pathkeys);
if (path == NULL)
@@ -7015,7 +7036,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
info->clauses,
havingQual,
agg_final_costs,
- dNumGroups));
+ dNumFinalGroups));
else
add_path(grouped_rel, (Path *)
create_group_path(root,
@@ -7023,7 +7044,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
path,
info->clauses,
havingQual,
- dNumGroups));
+ dNumFinalGroups));
}
}
@@ -7065,19 +7086,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
*/
if (partially_grouped_rel && partially_grouped_rel->pathlist)
{
- Path *path = partially_grouped_rel->cheapest_total_path;
-
add_path(grouped_rel, (Path *)
create_agg_path(root,
grouped_rel,
- path,
+ cheapest_partially_grouped_path,
grouped_rel->reltarget,
AGG_HASHED,
AGGSPLIT_FINAL_DESERIAL,
root->processed_groupClause,
havingQual,
agg_final_costs,
- dNumGroups));
+ dNumFinalGroups));
}
}
@@ -7127,6 +7146,21 @@ create_partial_grouping_paths(PlannerInfo *root,
bool can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
bool can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
+ /*
+ * The partially_grouped_rel could have been already created due to eager
+ * aggregation.
+ */
+ partially_grouped_rel = find_grouped_rel(root, input_rel->relids, NULL);
+ Assert(enable_eager_aggregate || partially_grouped_rel == NULL);
+
+ /*
+ * It is possible that the partially_grouped_rel created by eager
+ * aggregation is dummy. In this case we just set it to NULL. It might be
+ * created again by the following logic if possible.
+ */
+ if (partially_grouped_rel && IS_DUMMY_REL(partially_grouped_rel))
+ partially_grouped_rel = NULL;
+
/*
* Consider whether we should generate partially aggregated non-partial
* paths. We can only do this if we have a non-partial path, and only if
@@ -7150,19 +7184,27 @@ create_partial_grouping_paths(PlannerInfo *root,
* If we can't partially aggregate partial paths, and we can't partially
* aggregate non-partial paths, then don't bother creating the new
* RelOptInfo at all, unless the caller specified force_rel_creation.
+ *
+ * Note that the partially_grouped_rel could have been already created and
+ * populated with appropriate paths by eager aggregation.
*/
if (cheapest_total_path == NULL &&
cheapest_partial_path == NULL &&
+ (partially_grouped_rel == NULL ||
+ partially_grouped_rel->pathlist == NIL) &&
!force_rel_creation)
return NULL;
/*
* Build a new upper relation to represent the result of partially
- * aggregating the rows from the input relation.
- */
- partially_grouped_rel = fetch_upper_rel(root,
- UPPERREL_PARTIAL_GROUP_AGG,
- grouped_rel->relids);
+ * aggregating the rows from the input relation. The relation may already
+ * exist due to eager aggregation, in which case we don't need to create
+ * it.
+ */
+ if (partially_grouped_rel == NULL)
+ partially_grouped_rel = fetch_upper_rel(root,
+ UPPERREL_PARTIAL_GROUP_AGG,
+ grouped_rel->relids);
partially_grouped_rel->consider_parallel =
grouped_rel->consider_parallel;
partially_grouped_rel->reloptkind = grouped_rel->reloptkind;
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index 6ba4eba224..08de77d439 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -495,6 +495,66 @@ adjust_appendrel_attrs_mutator(Node *node,
return (Node *) newinfo;
}
+ /*
+ * We have to process RelAggInfo nodes specially.
+ */
+ if (IsA(node, RelAggInfo))
+ {
+ RelAggInfo *oldinfo = (RelAggInfo *) node;
+ RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+ /* Copy all flat-copiable fields */
+ memcpy(newinfo, oldinfo, sizeof(RelAggInfo));
+
+ newinfo->relids = adjust_child_relids(oldinfo->relids,
+ context->nappinfos,
+ context->appinfos);
+
+ newinfo->target = (PathTarget *)
+ adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+ context);
+
+ newinfo->agg_input = (PathTarget *)
+ adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+ context);
+
+ newinfo->group_clauses = (List *)
+ adjust_appendrel_attrs_mutator((Node *) oldinfo->group_clauses,
+ context);
+
+ newinfo->group_exprs = (List *)
+ adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+ context);
+
+ return (Node *) newinfo;
+ }
+
+ /*
+ * We have to process PathTarget nodes specially.
+ */
+ if (IsA(node, PathTarget))
+ {
+ PathTarget *oldtarget = (PathTarget *) node;
+ PathTarget *newtarget = makeNode(PathTarget);
+
+ /* Copy all flat-copiable fields */
+ memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+ if (oldtarget->sortgrouprefs)
+ {
+ Size nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+ newtarget->exprs = (List *)
+ adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+ context);
+
+ newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+ memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+ }
+
+ return (Node *) newtarget;
+ }
+
/*
* NOTE: we do not need to recurse into sublinks, because they should
* already have been converted to subplans before we see them.
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index b14f99a9ea..6087a14a76 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -2833,8 +2833,6 @@ create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
add_column_to_pathtarget(target, (Expr *) aggref, 0);
-
- result->agg_exprs = lappend(result->agg_exprs, aggref);
}
/*
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 1a1a1b6dfb..2b378665ba 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -1116,9 +1116,6 @@ typedef struct RelOptInfo
* "group_clauses", "group_exprs" and "group_pathkeys" are lists of
* SortGroupClause, the corresponding grouping expressions and PathKey
* respectively.
- *
- * "agg_exprs" is a list of Aggref nodes for the aggregation of the relation's
- * paths.
*/
typedef struct RelAggInfo
{
@@ -1154,9 +1151,6 @@ typedef struct RelAggInfo
List *group_exprs;
/* a list of PathKeys */
List *group_pathkeys;
-
- /* a list of Aggref nodes */
- List *agg_exprs;
} RelAggInfo;
/*
--
2.31.0
[application/octet-stream] v7-0008-Add-test-cases.patch (71.5K, 10-v7-0008-Add-test-cases.patch)
download | inline diff:
From 99feec748ac9b12e8973bd10a70b38274e6817a3 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 13:41:22 +0800
Subject: [PATCH v7 8/9] Add test cases
---
src/test/regress/expected/eager_aggregate.out | 1293 +++++++++++++++++
src/test/regress/parallel_schedule | 2 +-
src/test/regress/sql/eager_aggregate.sql | 192 +++
3 files changed, 1486 insertions(+), 1 deletion(-)
create mode 100644 src/test/regress/expected/eager_aggregate.out
create mode 100644 src/test/regress/sql/eager_aggregate.sql
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 0000000000..7a28287522
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1293 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t1.a, avg(t2.c)
+ Group Key: t1.a
+ -> Sort
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Sort Key: t1.a
+ -> Hash Join
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Hash Cond: (t1.b = t2.b)
+ -> Seq Scan on public.eager_agg_t1 t1
+ Output: t1.a, t1.b, t1.c
+ -> Hash
+ Output: t2.b, (PARTIAL avg(t2.c))
+ -> Partial HashAggregate
+ Output: t2.b, PARTIAL avg(t2.c)
+ Group Key: t2.b
+ -> Seq Scan on public.eager_agg_t2 t2
+ Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t1.a, avg(t2.c)
+ Group Key: t1.a
+ -> Sort
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Sort Key: t1.a
+ -> Hash Join
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Hash Cond: (t1.b = t2.b)
+ -> Seq Scan on public.eager_agg_t1 t1
+ Output: t1.a, t1.b, t1.c
+ -> Hash
+ Output: t2.b, (PARTIAL avg(t2.c))
+ -> Partial GroupAggregate
+ Output: t2.b, PARTIAL avg(t2.c)
+ Group Key: t2.b
+ -> Sort
+ Output: t2.c, t2.b
+ Sort Key: t2.b
+ -> Seq Scan on public.eager_agg_t2 t2
+ Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t1.a, avg((t2.c + t3.c))
+ Group Key: t1.a
+ -> Sort
+ Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+ Sort Key: t1.a
+ -> Hash Join
+ Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+ Hash Cond: (t1.b = t2.b)
+ -> Seq Scan on public.eager_agg_t1 t1
+ Output: t1.a, t1.b, t1.c
+ -> Hash
+ Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+ -> Partial HashAggregate
+ Output: t2.b, PARTIAL avg((t2.c + t3.c))
+ Group Key: t2.b
+ -> Hash Join
+ Output: t2.c, t3.c, t2.b
+ Hash Cond: (t3.a = t2.a)
+ -> Seq Scan on public.eager_agg_t3 t3
+ Output: t3.a, t3.b, t3.c
+ -> Hash
+ Output: t2.c, t2.b, t2.a
+ -> Seq Scan on public.eager_agg_t2 t2
+ Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t1.a, avg((t2.c + t3.c))
+ Group Key: t1.a
+ -> Sort
+ Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+ Sort Key: t1.a
+ -> Hash Join
+ Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+ Hash Cond: (t1.b = t2.b)
+ -> Seq Scan on public.eager_agg_t1 t1
+ Output: t1.a, t1.b, t1.c
+ -> Hash
+ Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+ -> Partial GroupAggregate
+ Output: t2.b, PARTIAL avg((t2.c + t3.c))
+ Group Key: t2.b
+ -> Sort
+ Output: t2.c, t3.c, t2.b
+ Sort Key: t2.b
+ -> Hash Join
+ Output: t2.c, t3.c, t2.b
+ Hash Cond: (t3.a = t2.a)
+ -> Seq Scan on public.eager_agg_t3 t3
+ Output: t3.a, t3.b, t3.c
+ -> Hash
+ Output: t2.c, t2.b, t2.a
+ -> Seq Scan on public.eager_agg_t2 t2
+ Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t1.a, avg(t2.c)
+ Group Key: t1.a
+ -> Sort
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Sort Key: t1.a
+ -> Hash Right Join
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Hash Cond: (t1.b = t2.b)
+ -> Seq Scan on public.eager_agg_t1 t1
+ Output: t1.a, t1.b, t1.c
+ -> Hash
+ Output: t2.b, (PARTIAL avg(t2.c))
+ -> Partial HashAggregate
+ Output: t2.b, PARTIAL avg(t2.c)
+ Group Key: t2.b
+ -> Seq Scan on public.eager_agg_t2 t2
+ Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+ | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ QUERY PLAN
+------------------------------------------------------------
+ Sort
+ Output: t2.b, (avg(t2.c))
+ Sort Key: t2.b
+ -> HashAggregate
+ Output: t2.b, avg(t2.c)
+ Group Key: t2.b
+ -> Hash Right Join
+ Output: t2.b, t2.c
+ Hash Cond: (t2.b = t1.b)
+ -> Seq Scan on public.eager_agg_t2 t2
+ Output: t2.a, t2.b, t2.c
+ -> Hash
+ Output: t1.b
+ -> Seq Scan on public.eager_agg_t1 t1
+ Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ b | avg
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+ |
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t1.a, avg(t2.c)
+ Group Key: t1.a
+ -> Sort
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Sort Key: t1.a
+ -> Gather
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Workers Planned: 2
+ -> Parallel Hash Join
+ Output: t1.a, (PARTIAL avg(t2.c))
+ Hash Cond: (t1.b = t2.b)
+ -> Parallel Seq Scan on public.eager_agg_t1 t1
+ Output: t1.a, t1.b, t1.c
+ -> Parallel Hash
+ Output: t2.b, (PARTIAL avg(t2.c))
+ -> Partial HashAggregate
+ Output: t2.b, PARTIAL avg(t2.c)
+ Group Key: t2.b
+ -> Parallel Seq Scan on public.eager_agg_t2 t2
+ Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (20) TO (30);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (20) TO (30);
+INSERT INTO eager_agg_tab1 SELECT i % 30, i % 20 FROM generate_series(0, 299, 2) i;
+INSERT INTO eager_agg_tab2 SELECT i % 20, i % 30 FROM generate_series(0, 299, 3) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ QUERY PLAN
+---------------------------------------------------------------------------------------
+ Sort
+ Output: t1.x, (sum(t1.y)), (count(*))
+ Sort Key: t1.x
+ -> Append
+ -> Finalize HashAggregate
+ Output: t1.x, sum(t1.y), count(*)
+ Group Key: t1.x
+ -> Hash Join
+ Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+ Hash Cond: (t2.y = t1.x)
+ -> Seq Scan on public.eager_agg_tab2_p1 t2
+ Output: t2.y
+ -> Hash
+ Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+ Group Key: t1.x
+ -> Seq Scan on public.eager_agg_tab1_p1 t1
+ Output: t1.x, t1.y
+ -> Finalize HashAggregate
+ Output: t1_1.x, sum(t1_1.y), count(*)
+ Group Key: t1_1.x
+ -> Hash Join
+ Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+ Hash Cond: (t2_1.y = t1_1.x)
+ -> Seq Scan on public.eager_agg_tab2_p2 t2_1
+ Output: t2_1.y
+ -> Hash
+ Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+ Group Key: t1_1.x
+ -> Seq Scan on public.eager_agg_tab1_p2 t1_1
+ Output: t1_1.x, t1_1.y
+ -> Finalize HashAggregate
+ Output: t1_2.x, sum(t1_2.y), count(*)
+ Group Key: t1_2.x
+ -> Hash Join
+ Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+ Hash Cond: (t2_2.y = t1_2.x)
+ -> Seq Scan on public.eager_agg_tab2_p3 t2_2
+ Output: t2_2.y
+ -> Hash
+ Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+ Group Key: t1_2.x
+ -> Seq Scan on public.eager_agg_tab1_p3 t1_2
+ Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x | sum | count
+----+------+-------
+ 0 | 500 | 100
+ 6 | 1100 | 100
+ 12 | 700 | 100
+ 18 | 1300 | 100
+ 24 | 900 | 100
+(5 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ QUERY PLAN
+---------------------------------------------------------------------------------------
+ Sort
+ Output: t2.y, (sum(t1.y)), (count(*))
+ Sort Key: t2.y
+ -> Append
+ -> Finalize HashAggregate
+ Output: t2.y, sum(t1.y), count(*)
+ Group Key: t2.y
+ -> Hash Join
+ Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+ Hash Cond: (t2.y = t1.x)
+ -> Seq Scan on public.eager_agg_tab2_p1 t2
+ Output: t2.y
+ -> Hash
+ Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+ Group Key: t1.x
+ -> Seq Scan on public.eager_agg_tab1_p1 t1
+ Output: t1.y, t1.x
+ -> Finalize HashAggregate
+ Output: t2_1.y, sum(t1_1.y), count(*)
+ Group Key: t2_1.y
+ -> Hash Join
+ Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+ Hash Cond: (t2_1.y = t1_1.x)
+ -> Seq Scan on public.eager_agg_tab2_p2 t2_1
+ Output: t2_1.y
+ -> Hash
+ Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+ Group Key: t1_1.x
+ -> Seq Scan on public.eager_agg_tab1_p2 t1_1
+ Output: t1_1.y, t1_1.x
+ -> Finalize HashAggregate
+ Output: t2_2.y, sum(t1_2.y), count(*)
+ Group Key: t2_2.y
+ -> Hash Join
+ Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+ Hash Cond: (t2_2.y = t1_2.x)
+ -> Seq Scan on public.eager_agg_tab2_p3 t2_2
+ Output: t2_2.y
+ -> Hash
+ Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+ Group Key: t1_2.x
+ -> Seq Scan on public.eager_agg_tab1_p3 t1_2
+ Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ y | sum | count
+----+------+-------
+ 0 | 500 | 100
+ 6 | 1100 | 100
+ 12 | 700 | 100
+ 18 | 1300 | 100
+ 24 | 900 | 100
+(5 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------
+ Sort
+ Output: t2.x, (sum(t1.x)), (count(*))
+ Sort Key: t2.x
+ -> Finalize HashAggregate
+ Output: t2.x, sum(t1.x), count(*)
+ Group Key: t2.x
+ Filter: (avg(t1.x) > '10'::numeric)
+ -> Append
+ -> Hash Join
+ Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+ Hash Cond: (t2_1.y = t1_1.x)
+ -> Seq Scan on public.eager_agg_tab2_p1 t2_1
+ Output: t2_1.x, t2_1.y
+ -> Hash
+ Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+ -> Partial HashAggregate
+ Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+ Group Key: t1_1.x
+ -> Seq Scan on public.eager_agg_tab1_p1 t1_1
+ Output: t1_1.x
+ -> Hash Join
+ Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+ Hash Cond: (t2_2.y = t1_2.x)
+ -> Seq Scan on public.eager_agg_tab2_p2 t2_2
+ Output: t2_2.x, t2_2.y
+ -> Hash
+ Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+ -> Partial HashAggregate
+ Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+ Group Key: t1_2.x
+ -> Seq Scan on public.eager_agg_tab1_p2 t1_2
+ Output: t1_2.x
+ -> Hash Join
+ Output: t2_3.x, (PARTIAL sum(t1_3.x)), (PARTIAL count(*)), (PARTIAL avg(t1_3.x))
+ Hash Cond: (t2_3.y = t1_3.x)
+ -> Seq Scan on public.eager_agg_tab2_p3 t2_3
+ Output: t2_3.x, t2_3.y
+ -> Hash
+ Output: t1_3.x, (PARTIAL sum(t1_3.x)), (PARTIAL count(*)), (PARTIAL avg(t1_3.x))
+ -> Partial HashAggregate
+ Output: t1_3.x, PARTIAL sum(t1_3.x), PARTIAL count(*), PARTIAL avg(t1_3.x)
+ Group Key: t1_3.x
+ -> Seq Scan on public.eager_agg_tab1_p3 t1_3
+ Output: t1_3.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+ x | sum | count
+----+------+-------
+ 2 | 600 | 50
+ 4 | 1200 | 50
+ 8 | 900 | 50
+ 12 | 600 | 50
+ 14 | 1200 | 50
+ 18 | 900 | 50
+(6 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ QUERY PLAN
+-------------------------------------------------------------------------------------------
+ Sort
+ Output: t1.x, (sum((t2.y + t3.y)))
+ Sort Key: t1.x
+ -> Append
+ -> Finalize HashAggregate
+ Output: t1.x, sum((t2.y + t3.y))
+ Group Key: t1.x
+ -> Hash Join
+ Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+ Hash Cond: (t1.x = t2.x)
+ -> Seq Scan on public.eager_agg_tab1_p1 t1
+ Output: t1.x
+ -> Hash
+ Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+ -> Partial HashAggregate
+ Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+ Group Key: t2.x
+ -> Hash Join
+ Output: t2.y, t2.x, t3.y, t3.x
+ Hash Cond: (t2.x = t3.x)
+ -> Seq Scan on public.eager_agg_tab1_p1 t2
+ Output: t2.y, t2.x
+ -> Hash
+ Output: t3.y, t3.x
+ -> Seq Scan on public.eager_agg_tab1_p1 t3
+ Output: t3.y, t3.x
+ -> Finalize HashAggregate
+ Output: t1_1.x, sum((t2_1.y + t3_1.y))
+ Group Key: t1_1.x
+ -> Hash Join
+ Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+ Hash Cond: (t1_1.x = t2_1.x)
+ -> Seq Scan on public.eager_agg_tab1_p2 t1_1
+ Output: t1_1.x
+ -> Hash
+ Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+ -> Partial HashAggregate
+ Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+ Group Key: t2_1.x
+ -> Hash Join
+ Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+ Hash Cond: (t2_1.x = t3_1.x)
+ -> Seq Scan on public.eager_agg_tab1_p2 t2_1
+ Output: t2_1.y, t2_1.x
+ -> Hash
+ Output: t3_1.y, t3_1.x
+ -> Seq Scan on public.eager_agg_tab1_p2 t3_1
+ Output: t3_1.y, t3_1.x
+ -> Finalize HashAggregate
+ Output: t1_2.x, sum((t2_2.y + t3_2.y))
+ Group Key: t1_2.x
+ -> Hash Join
+ Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+ Hash Cond: (t1_2.x = t2_2.x)
+ -> Seq Scan on public.eager_agg_tab1_p3 t1_2
+ Output: t1_2.x
+ -> Hash
+ Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+ -> Partial HashAggregate
+ Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+ Group Key: t2_2.x
+ -> Hash Join
+ Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+ Hash Cond: (t2_2.x = t3_2.x)
+ -> Seq Scan on public.eager_agg_tab1_p3 t2_2
+ Output: t2_2.y, t2_2.x
+ -> Hash
+ Output: t3_2.y, t3_2.x
+ -> Seq Scan on public.eager_agg_tab1_p3 t3_2
+ Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x | sum
+----+-------
+ 0 | 10000
+ 2 | 14000
+ 4 | 18000
+ 6 | 22000
+ 8 | 26000
+ 10 | 10000
+ 12 | 14000
+ 14 | 18000
+ 16 | 22000
+ 18 | 26000
+ 20 | 10000
+ 22 | 14000
+ 24 | 18000
+ 26 | 22000
+ 28 | 26000
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ QUERY PLAN
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+ Output: t3.y, sum((t2.y + t3.y))
+ Group Key: t3.y
+ -> Sort
+ Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+ Sort Key: t3.y
+ -> Append
+ -> Hash Join
+ Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+ Hash Cond: (t2_1.x = t1_1.x)
+ -> Partial GroupAggregate
+ Output: t3_1.y, t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+ Group Key: t3_1.y, t2_1.x, t3_1.x
+ -> Sort
+ Output: t2_1.y, t3_1.y, t2_1.x, t3_1.x
+ Sort Key: t3_1.y, t2_1.x
+ -> Hash Join
+ Output: t2_1.y, t3_1.y, t2_1.x, t3_1.x
+ Hash Cond: (t2_1.x = t3_1.x)
+ -> Seq Scan on public.eager_agg_tab1_p1 t2_1
+ Output: t2_1.y, t2_1.x
+ -> Hash
+ Output: t3_1.y, t3_1.x
+ -> Seq Scan on public.eager_agg_tab1_p1 t3_1
+ Output: t3_1.y, t3_1.x
+ -> Hash
+ Output: t1_1.x
+ -> Seq Scan on public.eager_agg_tab1_p1 t1_1
+ Output: t1_1.x
+ -> Hash Join
+ Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+ Hash Cond: (t2_2.x = t1_2.x)
+ -> Partial GroupAggregate
+ Output: t3_2.y, t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+ Group Key: t3_2.y, t2_2.x, t3_2.x
+ -> Sort
+ Output: t2_2.y, t3_2.y, t2_2.x, t3_2.x
+ Sort Key: t3_2.y, t2_2.x
+ -> Hash Join
+ Output: t2_2.y, t3_2.y, t2_2.x, t3_2.x
+ Hash Cond: (t2_2.x = t3_2.x)
+ -> Seq Scan on public.eager_agg_tab1_p2 t2_2
+ Output: t2_2.y, t2_2.x
+ -> Hash
+ Output: t3_2.y, t3_2.x
+ -> Seq Scan on public.eager_agg_tab1_p2 t3_2
+ Output: t3_2.y, t3_2.x
+ -> Hash
+ Output: t1_2.x
+ -> Seq Scan on public.eager_agg_tab1_p2 t1_2
+ Output: t1_2.x
+ -> Hash Join
+ Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y)))
+ Hash Cond: (t2_3.x = t1_3.x)
+ -> Partial GroupAggregate
+ Output: t3_3.y, t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y))
+ Group Key: t3_3.y, t2_3.x, t3_3.x
+ -> Sort
+ Output: t2_3.y, t3_3.y, t2_3.x, t3_3.x
+ Sort Key: t3_3.y, t2_3.x
+ -> Hash Join
+ Output: t2_3.y, t3_3.y, t2_3.x, t3_3.x
+ Hash Cond: (t2_3.x = t3_3.x)
+ -> Seq Scan on public.eager_agg_tab1_p3 t2_3
+ Output: t2_3.y, t2_3.x
+ -> Hash
+ Output: t3_3.y, t3_3.x
+ -> Seq Scan on public.eager_agg_tab1_p3 t3_3
+ Output: t3_3.y, t3_3.x
+ -> Hash
+ Output: t1_3.x
+ -> Seq Scan on public.eager_agg_tab1_p3 t1_3
+ Output: t1_3.x
+(73 rows)
+
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y | sum
+----+-------
+ 0 | 7500
+ 2 | 13500
+ 4 | 19500
+ 6 | 25500
+ 8 | 31500
+ 10 | 22500
+ 12 | 28500
+ 14 | 34500
+ 16 | 40500
+ 18 | 46500
+(10 rows)
+
+RESET enable_hashagg;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ QUERY PLAN
+---------------------------------------------------------------------------------------
+ Sort
+ Output: t1.x, (sum(t2.y)), (count(*))
+ Sort Key: t1.x
+ -> Append
+ -> Finalize HashAggregate
+ Output: t1.x, sum(t2.y), count(*)
+ Group Key: t1.x
+ -> Hash Join
+ Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+ Hash Cond: (t1.x = t2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t1
+ Output: t1.x
+ -> Hash
+ Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+ Group Key: t2.x
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t2
+ Output: t2.y, t2.x
+ -> Finalize HashAggregate
+ Output: t1_1.x, sum(t2_1.y), count(*)
+ Group Key: t1_1.x
+ -> Hash Join
+ Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+ Hash Cond: (t1_1.x = t2_1.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+ Output: t1_1.x
+ -> Hash
+ Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+ Group Key: t2_1.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+ Output: t2_1.y, t2_1.x
+ -> Finalize HashAggregate
+ Output: t1_2.x, sum(t2_2.y), count(*)
+ Group Key: t1_2.x
+ -> Hash Join
+ Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+ Hash Cond: (t1_2.x = t2_2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+ Output: t1_2.x
+ -> Hash
+ Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+ Group Key: t2_2.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+ Output: t2_2.y, t2_2.x
+ -> Finalize HashAggregate
+ Output: t1_3.x, sum(t2_3.y), count(*)
+ Group Key: t1_3.x
+ -> Hash Join
+ Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+ Hash Cond: (t1_3.x = t2_3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+ Output: t1_3.x
+ -> Hash
+ Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+ Group Key: t2_3.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+ Output: t2_3.y, t2_3.x
+ -> Finalize HashAggregate
+ Output: t1_4.x, sum(t2_4.y), count(*)
+ Group Key: t1_4.x
+ -> Hash Join
+ Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+ Hash Cond: (t1_4.x = t2_4.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+ Output: t1_4.x
+ -> Hash
+ Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+ Group Key: t2_4.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+ Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x | sum | count
+----+-------+-------
+ 0 | 0 | 1089
+ 1 | 1156 | 1156
+ 2 | 2312 | 1156
+ 3 | 3468 | 1156
+ 4 | 4624 | 1156
+ 5 | 5780 | 1156
+ 6 | 6936 | 1156
+ 7 | 8092 | 1156
+ 8 | 9248 | 1156
+ 9 | 10404 | 1156
+ 10 | 11560 | 1156
+ 11 | 11979 | 1089
+ 12 | 13068 | 1089
+ 13 | 14157 | 1089
+ 14 | 15246 | 1089
+ 15 | 16335 | 1089
+ 16 | 17424 | 1089
+ 17 | 18513 | 1089
+ 18 | 19602 | 1089
+ 19 | 20691 | 1089
+ 20 | 21780 | 1089
+ 21 | 22869 | 1089
+ 22 | 23958 | 1089
+ 23 | 25047 | 1089
+ 24 | 26136 | 1089
+ 25 | 27225 | 1089
+ 26 | 28314 | 1089
+ 27 | 29403 | 1089
+ 28 | 30492 | 1089
+ 29 | 31581 | 1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ QUERY PLAN
+---------------------------------------------------------------------------------------
+ Sort
+ Output: t1.y, (sum(t2.y)), (count(*))
+ Sort Key: t1.y
+ -> Finalize HashAggregate
+ Output: t1.y, sum(t2.y), count(*)
+ Group Key: t1.y
+ -> Append
+ -> Hash Join
+ Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+ Hash Cond: (t1_1.x = t2_1.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t1_1
+ Output: t1_1.y, t1_1.x
+ -> Hash
+ Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+ Group Key: t2_1.x
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t2_1
+ Output: t2_1.y, t2_1.x
+ -> Hash Join
+ Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+ Hash Cond: (t1_2.x = t2_2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_2
+ Output: t1_2.y, t1_2.x
+ -> Hash
+ Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+ Group Key: t2_2.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_2
+ Output: t2_2.y, t2_2.x
+ -> Hash Join
+ Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+ Hash Cond: (t1_3.x = t2_3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_3
+ Output: t1_3.y, t1_3.x
+ -> Hash
+ Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+ Group Key: t2_3.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_3
+ Output: t2_3.y, t2_3.x
+ -> Hash Join
+ Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+ Hash Cond: (t1_4.x = t2_4.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_4
+ Output: t1_4.y, t1_4.x
+ -> Hash
+ Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+ Group Key: t2_4.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_4
+ Output: t2_4.y, t2_4.x
+ -> Hash Join
+ Output: t1_5.y, (PARTIAL sum(t2_5.y)), (PARTIAL count(*))
+ Hash Cond: (t1_5.x = t2_5.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_5
+ Output: t1_5.y, t1_5.x
+ -> Hash
+ Output: t2_5.x, (PARTIAL sum(t2_5.y)), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_5.x, PARTIAL sum(t2_5.y), PARTIAL count(*)
+ Group Key: t2_5.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_5
+ Output: t2_5.y, t2_5.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ y | sum | count
+----+-------+-------
+ 0 | 0 | 1089
+ 1 | 1156 | 1156
+ 2 | 2312 | 1156
+ 3 | 3468 | 1156
+ 4 | 4624 | 1156
+ 5 | 5780 | 1156
+ 6 | 6936 | 1156
+ 7 | 8092 | 1156
+ 8 | 9248 | 1156
+ 9 | 10404 | 1156
+ 10 | 11560 | 1156
+ 11 | 11979 | 1089
+ 12 | 13068 | 1089
+ 13 | 14157 | 1089
+ 14 | 15246 | 1089
+ 15 | 16335 | 1089
+ 16 | 17424 | 1089
+ 17 | 18513 | 1089
+ 18 | 19602 | 1089
+ 19 | 20691 | 1089
+ 20 | 21780 | 1089
+ 21 | 22869 | 1089
+ 22 | 23958 | 1089
+ 23 | 25047 | 1089
+ 24 | 26136 | 1089
+ 25 | 27225 | 1089
+ 26 | 28314 | 1089
+ 27 | 29403 | 1089
+ 28 | 30492 | 1089
+ 29 | 31581 | 1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ QUERY PLAN
+----------------------------------------------------------------------------------------------------------
+ Sort
+ Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+ Sort Key: t1.x
+ -> Append
+ -> Finalize HashAggregate
+ Output: t1.x, sum((t2.y + t3.y)), count(*)
+ Group Key: t1.x
+ -> Hash Join
+ Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+ Hash Cond: (t1.x = t2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t1
+ Output: t1.x
+ -> Hash
+ Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+ Group Key: t2.x
+ -> Hash Join
+ Output: t2.y, t2.x, t3.y, t3.x
+ Hash Cond: (t2.x = t3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t2
+ Output: t2.y, t2.x
+ -> Hash
+ Output: t3.y, t3.x
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t3
+ Output: t3.y, t3.x
+ -> Finalize HashAggregate
+ Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+ Group Key: t1_1.x
+ -> Hash Join
+ Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+ Hash Cond: (t1_1.x = t2_1.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+ Output: t1_1.x
+ -> Hash
+ Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+ Group Key: t2_1.x
+ -> Hash Join
+ Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+ Hash Cond: (t2_1.x = t3_1.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+ Output: t2_1.y, t2_1.x
+ -> Hash
+ Output: t3_1.y, t3_1.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+ Output: t3_1.y, t3_1.x
+ -> Finalize HashAggregate
+ Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+ Group Key: t1_2.x
+ -> Hash Join
+ Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+ Hash Cond: (t1_2.x = t2_2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+ Output: t1_2.x
+ -> Hash
+ Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+ Group Key: t2_2.x
+ -> Hash Join
+ Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+ Hash Cond: (t2_2.x = t3_2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+ Output: t2_2.y, t2_2.x
+ -> Hash
+ Output: t3_2.y, t3_2.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+ Output: t3_2.y, t3_2.x
+ -> Finalize HashAggregate
+ Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+ Group Key: t1_3.x
+ -> Hash Join
+ Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+ Hash Cond: (t1_3.x = t2_3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+ Output: t1_3.x
+ -> Hash
+ Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+ Group Key: t2_3.x
+ -> Hash Join
+ Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+ Hash Cond: (t2_3.x = t3_3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+ Output: t2_3.y, t2_3.x
+ -> Hash
+ Output: t3_3.y, t3_3.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+ Output: t3_3.y, t3_3.x
+ -> Finalize HashAggregate
+ Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+ Group Key: t1_4.x
+ -> Hash Join
+ Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+ Hash Cond: (t1_4.x = t2_4.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+ Output: t1_4.x
+ -> Hash
+ Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+ Group Key: t2_4.x
+ -> Hash Join
+ Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+ Hash Cond: (t2_4.x = t3_4.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+ Output: t2_4.y, t2_4.x
+ -> Hash
+ Output: t3_4.y, t3_4.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+ Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x | sum | count
+----+---------+-------
+ 0 | 0 | 35937
+ 1 | 78608 | 39304
+ 2 | 157216 | 39304
+ 3 | 235824 | 39304
+ 4 | 314432 | 39304
+ 5 | 393040 | 39304
+ 6 | 471648 | 39304
+ 7 | 550256 | 39304
+ 8 | 628864 | 39304
+ 9 | 707472 | 39304
+ 10 | 786080 | 39304
+ 11 | 790614 | 35937
+ 12 | 862488 | 35937
+ 13 | 934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ QUERY PLAN
+------------------------------------------------------------------------------------------------------------------
+ Sort
+ Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+ Sort Key: t3.y
+ -> Finalize HashAggregate
+ Output: t3.y, sum((t2.y + t3.y)), count(*)
+ Group Key: t3.y
+ -> Append
+ -> Hash Join
+ Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+ Hash Cond: (t1_1.x = t2_1.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t1_1
+ Output: t1_1.x
+ -> Hash
+ Output: t3_1.y, t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t3_1.y, t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+ Group Key: t3_1.y, t2_1.x, t3_1.x
+ -> Hash Join
+ Output: t2_1.y, t3_1.y, t2_1.x, t3_1.x
+ Hash Cond: (t2_1.x = t3_1.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t2_1
+ Output: t2_1.y, t2_1.x
+ -> Hash
+ Output: t3_1.y, t3_1.x
+ -> Seq Scan on public.eager_agg_tab_ml_p1 t3_1
+ Output: t3_1.y, t3_1.x
+ -> Hash Join
+ Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+ Hash Cond: (t1_2.x = t2_2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_2
+ Output: t1_2.x
+ -> Hash
+ Output: t3_2.y, t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t3_2.y, t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+ Group Key: t3_2.y, t2_2.x, t3_2.x
+ -> Hash Join
+ Output: t2_2.y, t3_2.y, t2_2.x, t3_2.x
+ Hash Cond: (t2_2.x = t3_2.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_2
+ Output: t2_2.y, t2_2.x
+ -> Hash
+ Output: t3_2.y, t3_2.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_2
+ Output: t3_2.y, t3_2.x
+ -> Hash Join
+ Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+ Hash Cond: (t1_3.x = t2_3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_3
+ Output: t1_3.x
+ -> Hash
+ Output: t3_3.y, t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t3_3.y, t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+ Group Key: t3_3.y, t2_3.x, t3_3.x
+ -> Hash Join
+ Output: t2_3.y, t3_3.y, t2_3.x, t3_3.x
+ Hash Cond: (t2_3.x = t3_3.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_3
+ Output: t2_3.y, t2_3.x
+ -> Hash
+ Output: t3_3.y, t3_3.x
+ -> Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_3
+ Output: t3_3.y, t3_3.x
+ -> Hash Join
+ Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+ Hash Cond: (t1_4.x = t2_4.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_4
+ Output: t1_4.x
+ -> Hash
+ Output: t3_4.y, t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t3_4.y, t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+ Group Key: t3_4.y, t2_4.x, t3_4.x
+ -> Hash Join
+ Output: t2_4.y, t3_4.y, t2_4.x, t3_4.x
+ Hash Cond: (t2_4.x = t3_4.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_4
+ Output: t2_4.y, t2_4.x
+ -> Hash
+ Output: t3_4.y, t3_4.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_4
+ Output: t3_4.y, t3_4.x
+ -> Hash Join
+ Output: t3_5.y, (PARTIAL sum((t2_5.y + t3_5.y))), (PARTIAL count(*))
+ Hash Cond: (t1_5.x = t2_5.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_5
+ Output: t1_5.x
+ -> Hash
+ Output: t3_5.y, t2_5.x, t3_5.x, (PARTIAL sum((t2_5.y + t3_5.y))), (PARTIAL count(*))
+ -> Partial HashAggregate
+ Output: t3_5.y, t2_5.x, t3_5.x, PARTIAL sum((t2_5.y + t3_5.y)), PARTIAL count(*)
+ Group Key: t3_5.y, t2_5.x, t3_5.x
+ -> Hash Join
+ Output: t2_5.y, t3_5.y, t2_5.x, t3_5.x
+ Hash Cond: (t2_5.x = t3_5.x)
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_5
+ Output: t2_5.y, t2_5.x
+ -> Hash
+ Output: t3_5.y, t3_5.x
+ -> Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_5
+ Output: t3_5.y, t3_5.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y | sum | count
+----+---------+-------
+ 0 | 0 | 35937
+ 1 | 78608 | 39304
+ 2 | 157216 | 39304
+ 3 | 235824 | 39304
+ 4 | 314432 | 39304
+ 5 | 393040 | 39304
+ 6 | 471648 | 39304
+ 7 | 550256 | 39304
+ 8 | 628864 | 39304
+ 9 | 707472 | 39304
+ 10 | 786080 | 39304
+ 11 | 790614 | 35937
+ 12 | 862488 | 35937
+ 13 | 934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 969ced994f..06362ae1e7 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -119,7 +119,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
# The stats test resets stats, so nothing else needing stats access can be in
# this group.
# ----------
-test: partition_merge partition_split partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats predicate
+test: partition_merge partition_split partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats predicate eager_aggregate
# event_trigger depends on create_am and cannot run concurrently with
# any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 0000000000..4050e4df44
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,192 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (20) TO (30);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (20) TO (30);
+INSERT INTO eager_agg_tab1 SELECT i % 30, i % 20 FROM generate_series(0, 299, 2) i;
+INSERT INTO eager_agg_tab2 SELECT i % 20, i % 30 FROM generate_series(0, 299, 3) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+RESET enable_hashagg;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+
+DROP TABLE eager_agg_tab_ml;
--
2.31.0
[application/octet-stream] v7-0009-Add-README.patch (4.8K, 11-v7-0009-Add-README.patch)
download | inline diff:
From 897a2b1c162e340b8f6aea16ada68ed40e7bf727 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 23 Feb 2024 13:41:36 +0800
Subject: [PATCH v7 9/9] Add README
---
src/backend/optimizer/README | 88 ++++++++++++++++++++++++++++++++++++
1 file changed, 88 insertions(+)
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 2ab4f3dbf3..dae7b87f32 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1497,3 +1497,91 @@ breaking down aggregation or grouping over a partitioned relation into
aggregation or grouping over its partitions is called partitionwise
aggregation. Especially when the partition keys match the GROUP BY clause,
this can be significantly faster than the regular method.
+
+Eager aggregation
+-------------------
+
+The obvious way to evaluate aggregates is to evaluate the FROM clause of the
+SQL query (this is what query_planner does) and use the resulting paths as the
+input of Agg node. However, if the groups are large enough, it may be more
+efficient to apply the partial aggregation to the output of base relation
+scan, and finalize it when we have all relations of the query joined:
+
+ EXPLAIN (COSTS OFF)
+ SELECT a.i, avg(b.y)
+ FROM a JOIN b ON a.i = b.j
+ GROUP BY a.i;
+
+ Finalize HashAggregate
+ Group Key: a.i
+ -> Nested Loop
+ -> Partial HashAggregate
+ Group Key: b.j
+ -> Seq Scan on b
+ -> Index Only Scan using a_pkey on a
+ Index Cond: (i = b.j)
+
+Thus the join above the partial aggregate node receives fewer input rows, and
+so the number of outer-to-inner pairs of tuples to be checked can be
+significantly lower, which can in turn lead to considerably lower join cost.
+
+Note that the GROUP BY expression might not be useful for the partial
+aggregate. In the example above, the aggregate avg(b.y) references table "b",
+but the GROUP BY expression mentions "a". However, the equivalence class {a.i,
+b.j} allows us to use the b.j column as a grouping key for the partial
+aggregation of the "b" table. The equivalence class mechanism is suitable
+because it's designed to derive join clauses, and at the same time the join
+clauses determine the choice of grouping columns of the partial aggregate: the
+only way for the partial aggregate to provide upper join(s) with input values
+is to have the join input expression(s) in the grouping key; besides grouping
+columns, the partial aggregate can only produce the transient states of the
+aggregate functions, but aggregate functions cannot be referenced by the JOIN
+clauses.
+
+Regarding correctness, join node considers the output of the partial aggregate
+to be equivalent to the output of a plain (non-aggregated) relation scan. That
+is, a group (i.e. a row of the partial aggregate output) matches the other
+side of the join if and only if each row of the non-aggregate relation
+does. In other words, all rows belonging to the same group have the same value
+of the join columns (As mentioned above, a join cannot reference other output
+expressions of the partial aggregate than the grouping expressions.).
+
+However, there's a restriction from the aggregate's perspective: the aggregate
+cannot be pushed down if any column referenced by either grouping expression
+or aggregate function can be set to NULL by an outer join above the relation
+to which we want to apply the partial aggregation. The point is that those
+NULL values would not appear on the input of the pushed-down, so it could
+either put the rows into groups in a different way than the aggregate at the
+top of the plan, or it could compute wrong values of the aggregate functions.
+
+Besides base relation, the aggregation can also be pushed down to join:
+
+ EXPLAIN (COSTS OFF)
+ SELECT a.i, avg(b.y + c.z)
+ FROM a JOIN b ON a.i = b.j
+ JOIN c ON b.j = c.i
+ GROUP BY a.i;
+
+ Finalize HashAggregate
+ Group Key: a.i
+ -> Nested Loop
+ -> Partial HashAggregate
+ Group Key: b.j
+ -> Hash Join
+ Hash Cond: (b.j = c.i)
+ -> Seq Scan on b
+ -> Hash
+ -> Seq Scan on c
+ -> Index Only Scan using a_pkey on a
+ Index Cond: (i = b.j)
+
+Whether the Agg node is created out of base relation or out of join, it's
+added to a separate RelOptInfo that we call "grouped relation". Grouped
+relation can be joined to a non-grouped relation, which results in a grouped
+relation too. Join of two grouped relations does not seem to be very useful
+and is currently not supported.
+
+If query_planner produces a grouped relation that contains valid paths, these
+are simply added to the UPPERREL_PARTIAL_GROUP_AGG relation. Further
+processing of these paths then does not differ from processing of other
+partially grouped paths.
--
2.31.0
^ permalink raw reply [nested|flat] 2+ messages in thread
end of thread, other threads:[~2024-05-20 08:12 UTC | newest]
Thread overview: 2+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-04-30 04:06 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
2024-05-20 08:12 ` Richard Guo <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox