public inbox for [email protected]  
Re: Eager aggregation, take 3
70+ messages / 7 participants

* Re: Eager aggregation, take 3
@ 2024-12-17 03:42 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Richard Guo @ 2024-12-17 03:42 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Fri, Nov 1, 2024 at 2:54 PM Richard Guo <[email protected]> wrote:
> Perhaps we could introduce a GroupPathInfo into the Path structure to
> store common information for a grouped path, such as the location of
> the partial aggregation (which could be the relids of the relation on
> top of which we place the partial aggregation) and the estimated
> rowcount for this grouped path, similar to how ParamPathInfo functions
> for parameterized paths.  Then we should be able to compare the
> grouped paths of the same location apples to apples.  I haven’t
> thought this through in detail yet, though.

After thinking over this again, I think one difference from the
parameterized path case is that, for a parameterized path, the fewer
the required outer rels, the better, as more outer rels imply more
join restrictions.  Therefore, the number of required outer rels
serves as a criterion when comparing paths in add_path().
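
As a rough illustration of that criterion (a standalone toy, not planner
code), the sketch below models each path's required outer rels as a bitmask
and classifies two paths by their subset relationship, similar in spirit to
bms_subset_compare().  Relids32, SubsetResult, and compare_required_outer()
are names invented for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a relid set: bit N set means rel N is required. */
typedef uint32_t Relids32;

typedef enum
{
	SUBSET_EQUAL,				/* both paths require the same outer rels */
	SUBSET_FIRST,				/* first set is a proper subset of the second */
	SUBSET_SECOND,				/* second set is a proper subset of the first */
	SUBSET_DIFFERENT			/* neither set contains the other */
} SubsetResult;

static SubsetResult
compare_required_outer(Relids32 a, Relids32 b)
{
	if (a == b)
		return SUBSET_EQUAL;
	if ((a & ~b) == 0)
		return SUBSET_FIRST;
	if ((b & ~a) == 0)
		return SUBSET_SECOND;
	return SUBSET_DIFFERENT;
}

void
required_outer_example(void)
{
	/* Path A needs rel 1; path B needs rels 1 and 2, so A is less restricted. */
	assert(compare_required_outer(0x2, 0x6) == SUBSET_FIRST);
}
```

When the result is SUBSET_FIRST, the first path carries fewer join
restrictions, which is the sense in which "fewer is better" above.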

For a grouped path, however, we don't concern ourselves with the
location of the partial aggregation.  What matters is whether one
grouped path is preferable to another based on the current merits of
add_path().  Therefore, I think it's acceptable to compare grouped
paths for the same grouped rel, regardless of where the partial
aggregation is placed.

Note that non-grouped and grouped paths will not appear in the same
RelOptInfo.  All paths for a grouped rel are grouped paths, meaning
there is a partial aggregation node somewhere in the path tree.
Similarly, all paths for a non-grouped rel are non-grouped paths.
That is to say, it is not possible to compare a grouped path with a
non-grouped path.

Two different grouped paths for the same grouped rel can have very
different rowcount estimates, depending on where the partial
aggregation is placed in the path tree.  Therefore, for a grouped
join path, we have to calculate its rowcount estimate using its outer
and inner paths, as we do in set_joinpath_size().  This is
similar to what we do for parameterized paths: two different
parameterized paths for the same rel can also have very different
rowcount estimates, depending on which outer rels supply the
parameters.  So we calculate the rowcount estimates for parameterized
join paths for each different parameterization in
get_parameterized_joinrel_size().

set_joinpath_size() adds a special case to final_cost_nestloop(),
final_cost_mergejoin(), and final_cost_hashjoin().  For non-grouped
paths, the only extra work is one additional check,
IS_GROUPED_REL(rel), which is defined as

#define IS_GROUPED_REL(rel)  ((rel)->agg_info != NULL)

I doubt that this additional simple pointer check will cause general
performance regressions.
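
To make the shape of that check concrete, here is a simplified sketch.  It
is not the code in the patch: the structs are cut-down stand-ins, and
join_path_rows() and its flat selectivity argument are assumptions made for
illustration.  Only the IS_GROUPED_REL() definition is taken from above.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down, hypothetical stand-ins for the planner structures. */
typedef struct RelAggInfo
{
	int			placeholder;
} RelAggInfo;

typedef struct RelOptInfo
{
	double		rows;			/* rel-wide row estimate */
	RelAggInfo *agg_info;		/* non-NULL only for a grouped rel */
} RelOptInfo;

typedef struct Path
{
	double		rows;			/* row estimate of this particular path */
} Path;

#define IS_GROUPED_REL(rel)  ((rel)->agg_info != NULL)

/*
 * Toy model: a join path on a grouped rel derives its row count from its
 * own outer and inner paths; any other join path uses the rel-wide estimate.
 */
static double
join_path_rows(const RelOptInfo *joinrel, const Path *outer,
			   const Path *inner, double selectivity)
{
	assert(selectivity >= 0.0 && selectivity <= 1.0);

	if (IS_GROUPED_REL(joinrel))
		return outer->rows * inner->rows * selectivity;

	return joinrel->rows;
}
```

With this model, two grouped paths for the same grouped rel get different
estimates whenever their input paths differ in row count, while a
non-grouped path pays only for the pointer test.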

> Yeah, this patch does not get it correct here.  Basically the logic is
> that for the partial aggregation pushed down to a non-aggregated
> relation, we need to consider all columns of that relation involved in
> upper join clauses and include them in the grouping keys.  Currently,
> the patch only checks whether a column is involved in upper join
> clauses but does not verify how the column is used.  We need to ensure
> that the operator used in the join clause is at least compatible with
> the grouping operator; otherwise, the grouping operator might
> interpret the values as the same while the join operator sees them as
> different.

Hmm, I think we can prevent this issue from occurring if we ensure
that "equality implies image equality" for each grouping key used in
partial aggregation.  In such cases, if the grouping operator in
partial aggregation treats two values as equal, the join operator in
the upper join clause must also treat them as equal.

On the other hand, it’s possible that the grouping operator treats two
values as different, while the join operator treats them as equal.
This is fine, as the different partial groups will be combined during
the final aggregation.
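
To see why the image-equality requirement is not automatic, consider
floating point: 0.0 and -0.0 compare equal, yet their byte images differ.
The following is a standalone C illustration, assuming IEEE 754 doubles.  It
is not PostgreSQL code, and the function names are made up for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Equality as a comparison operator sees it. */
static bool
values_equal(double a, double b)
{
	return a == b;
}

/* Image equality: the two values have identical byte representations. */
static bool
images_equal(double a, double b)
{
	return memcmp(&a, &b, sizeof(double)) == 0;
}

/* Does "equality implies image equality" hold for this pair of values? */
static bool
equality_implies_image_equality(double a, double b)
{
	return !values_equal(a, b) || images_equal(a, b);
}

void
image_equality_example(void)
{
	/* Equal by comparison, but the sign bit differs. */
	assert(values_equal(0.0, -0.0));
	assert(!images_equal(0.0, -0.0));
}
```

A grouping key whose type behaves like this pair would fail the check and
so could not be used for partial aggregation.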

Attached is the patch rebased on the latest master.  It refines the
theoretical justification for the correctness of this transformation
in the README and the commit message.  It also adds the image-equality
check for all grouping keys used in partial aggregation, and fixes
the issue reported by Jian.  It does not yet handle the RLS case,
though.

Thanks
Richard


Attachments:

  [application/octet-stream] v14-0001-Implement-Eager-Aggregation.patch (175.4K, 2-v14-0001-Implement-Eager-Aggregation.patch)
From 8d3955e5a3c5bfa5b5de730733562b2c8e1c671b Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Tue, 11 Jun 2024 15:59:19 +0900
Subject: [PATCH v14] Implement Eager Aggregation

Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.

A plan with eager aggregation looks like:

 EXPLAIN (COSTS OFF)
 SELECT a.i, avg(b.y)
 FROM a JOIN b ON a.i = b.j
 GROUP BY a.i;

 Finalize HashAggregate
   Group Key: a.i
   ->  Nested Loop
         ->  Partial HashAggregate
               Group Key: b.j
               ->  Seq Scan on b
         ->  Index Only Scan using a_pkey on a
               Index Cond: (i = b.j)

During the construction of the join tree, we evaluate each base or
join relation to determine if eager aggregation can be applied.  If
feasible, we create a separate RelOptInfo called a "grouped relation"
and store it in a dedicated list.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths during this phase.

Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
does not seem to be very useful and is currently not supported.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
'destiny', which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is on the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

Since eager aggregation can generate many grouped relations, we
introduce a RelInfoList structure, which encapsulates both a list and
a hash table, so that we can leverage the hash table for faster
lookups not only for join relations but also for grouped relations.

Eager aggregation can use significantly more CPU time and memory than
regular planning when the query involves aggregates and many joining
relations.  However, in some cases, the resulting plan can be much
better, justifying the additional planning effort.  All the same, for
now, turn this feature off by default.
---
 contrib/postgres_fdw/postgres_fdw.c           |    3 +-
 src/backend/optimizer/README                  |   80 +
 src/backend/optimizer/geqo/geqo_eval.c        |   98 +-
 src/backend/optimizer/path/allpaths.c         |  455 +++++-
 src/backend/optimizer/path/costsize.c         |   95 +-
 src/backend/optimizer/path/joinrels.c         |  147 ++
 src/backend/optimizer/plan/initsplan.c        |  258 ++++
 src/backend/optimizer/plan/planmain.c         |   17 +-
 src/backend/optimizer/plan/planner.c          |   99 +-
 src/backend/optimizer/util/appendinfo.c       |   60 +
 src/backend/optimizer/util/pathnode.c         |   47 +-
 src/backend/optimizer/util/relnode.c          |  761 +++++++++-
 src/backend/utils/misc/guc_tables.c           |   10 +
 src/backend/utils/misc/postgresql.conf.sample |    1 +
 src/include/nodes/pathnodes.h                 |  148 +-
 src/include/optimizer/pathnode.h              |    7 +
 src/include/optimizer/paths.h                 |    5 +
 src/include/optimizer/planmain.h              |    1 +
 src/test/regress/expected/eager_aggregate.out | 1308 +++++++++++++++++
 src/test/regress/expected/sysviews.out        |    3 +-
 src/test/regress/parallel_schedule            |    2 +-
 src/test/regress/sql/eager_aggregate.sql      |  192 +++
 src/tools/pgindent/typedefs.list              |    7 +-
 23 files changed, 3646 insertions(+), 158 deletions(-)
 create mode 100644 src/test/regress/expected/eager_aggregate.out
 create mode 100644 src/test/regress/sql/eager_aggregate.sql

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index c0810fbd7c..0063f3942d 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -6089,7 +6089,8 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 	 */
 	Assert(fpinfo->relation_index == 0);	/* shouldn't be set yet */
 	fpinfo->relation_index =
-		list_length(root->parse->rtable) + list_length(root->join_rel_list);
+		list_length(root->parse->rtable) +
+		list_length(root->join_rel_list->items);
 
 	return true;
 }
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 2ab4f3dbf3..7a6de25f6e 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1497,3 +1497,83 @@ breaking down aggregation or grouping over a partitioned relation into
 aggregation or grouping over its partitions is called partitionwise
 aggregation.  Especially when the partition keys match the GROUP BY clause,
 this can be significantly faster than the regular method.
+
+Eager aggregation
+-----------------
+
+Eager aggregation is a query optimization technique that partially pushes
+aggregation past a join, and finalizes it once all the relations are joined.
+Eager aggregation may reduce the number of input rows to the join and thus
+could result in a better overall plan.
+
+For example:
+
+ EXPLAIN (COSTS OFF)
+ SELECT a.i, avg(b.y)
+ FROM a JOIN b ON a.i = b.j
+ GROUP BY a.i;
+
+ Finalize HashAggregate
+   Group Key: a.i
+   ->  Nested Loop
+         ->  Partial HashAggregate
+               Group Key: b.j
+               ->  Seq Scan on b
+         ->  Index Only Scan using a_pkey on a
+               Index Cond: (i = b.j)
+
+If the partial aggregation on table B significantly reduces the number of
+input rows, the join above will be much cheaper, leading to a more efficient
+final plan.
+
+For the partial aggregation that is pushed down to a non-aggregated relation,
+we need to consider all expressions from this relation that are involved in
+upper join clauses and include them in the grouping keys, using compatible
+operators.  This is essential to ensure that an aggregated row from the partial
+aggregation matches the other side of the join if and only if each row in the
+partial group does.  This ensures that all rows within the same partial group
+share the same 'destiny', which is crucial for maintaining correctness.
+
+One restriction is that we cannot push partial aggregation down to a relation
+that is on the nullable side of an outer join, because the NULL-extended rows
+produced by the outer join would not be available when we perform the partial
+aggregation, while with a non-eager-aggregation plan these rows are available
+for the top-level aggregation.  Pushing partial aggregation in this case may
+result in the rows being grouped differently than expected, or produce
+incorrect values from the aggregate functions.
+
+We can also apply eager aggregation to a join:
+
+ EXPLAIN (COSTS OFF)
+ SELECT a.i, avg(b.y + c.z)
+ FROM a JOIN b ON a.i = b.j
+        JOIN c ON b.j = c.i
+ GROUP BY a.i;
+
+ Finalize HashAggregate
+   Group Key: a.i
+   ->  Nested Loop
+         ->  Partial HashAggregate
+               Group Key: b.j
+               ->  Hash Join
+                     Hash Cond: (b.j = c.i)
+                     ->  Seq Scan on b
+                     ->  Hash
+                           ->  Seq Scan on c
+         ->  Index Only Scan using a_pkey on a
+               Index Cond: (i = b.j)
+
+During the construction of the join tree, we evaluate each base or join
+relation to determine if eager aggregation can be applied.  If feasible, we
+create a separate RelOptInfo called a "grouped relation" and generate grouped
+paths by adding sorted and hashed partial aggregation paths on top of the
+non-grouped paths.  To limit planning time, we consider only the cheapest or
+suitably-sorted non-grouped paths in this step.
+
+Another way to generate grouped paths is to join a grouped relation with a
+non-grouped relation.  Joining two grouped relations does not seem to be very
+useful and is currently not supported.
+
+If we have generated a grouped relation for the topmost join relation, we need
+to finalize its paths at the end.  The final paths will compete in the usual
+way with paths built from regular planning.
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index d2f7f4e5f3..cdc9543135 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -39,10 +39,20 @@ typedef struct
 	int			size;			/* number of input relations in clump */
 } Clump;
 
+/* The original length and hashtable of a RelInfoList */
+typedef struct
+{
+	int			savelength;
+	struct HTAB *savehash;
+} RelInfoListInfo;
+
 static List *merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump,
 						 int num_gene, bool force);
 static bool desirable_join(PlannerInfo *root,
 						   RelOptInfo *outer_rel, RelOptInfo *inner_rel);
+static RelInfoListInfo save_relinfolist(RelInfoList *relinfo_list);
+static void restore_relinfolist(RelInfoList *relinfo_list,
+								RelInfoListInfo *info);
 
 
 /*
@@ -60,8 +70,8 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
 	MemoryContext oldcxt;
 	RelOptInfo *joinrel;
 	Cost		fitness;
-	int			savelength;
-	struct HTAB *savehash;
+	RelInfoListInfo save_join_rel;
+	RelInfoListInfo save_grouped_rel;
 
 	/*
 	 * Create a private memory context that will hold all temp storage
@@ -78,25 +88,29 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
 	oldcxt = MemoryContextSwitchTo(mycontext);
 
 	/*
-	 * gimme_tree will add entries to root->join_rel_list, which may or may
-	 * not already contain some entries.  The newly added entries will be
-	 * recycled by the MemoryContextDelete below, so we must ensure that the
-	 * list is restored to its former state before exiting.  We can do this by
-	 * truncating the list to its original length.  NOTE this assumes that any
-	 * added entries are appended at the end!
+	 * gimme_tree will add entries to root->join_rel_list and
+	 * root->grouped_rel_list, which may or may not already contain some
+	 * entries.  The newly added entries will be recycled by the
+	 * MemoryContextDelete below, so we must ensure that each list within the
+	 * RelInfoList structures is restored to its former state before exiting.
+	 * We can do this by truncating each list to its original length.  NOTE
+	 * this assumes that any added entries are appended at the end!
 	 *
-	 * We also must take care not to mess up the outer join_rel_hash, if there
-	 * is one.  We can do this by just temporarily setting the link to NULL.
-	 * (If we are dealing with enough join rels, which we very likely are, a
-	 * new hash table will get built and used locally.)
+	 * We also must take care not to mess up the outer hash tables within the
+	 * RelInfoList structures, if any.  We can do this by just temporarily
+	 * setting each link to NULL.  (If we are dealing with enough join rels or
+	 * grouped rels, which we very likely are, new hash tables will get built
+	 * and used locally.)
 	 *
 	 * join_rel_level[] shouldn't be in use, so just Assert it isn't.
 	 */
-	savelength = list_length(root->join_rel_list);
-	savehash = root->join_rel_hash;
+	save_join_rel = save_relinfolist(root->join_rel_list);
+	save_grouped_rel = save_relinfolist(root->grouped_rel_list);
+
 	Assert(root->join_rel_level == NULL);
 
-	root->join_rel_hash = NULL;
+	root->join_rel_list->hash = NULL;
+	root->grouped_rel_list->hash = NULL;
 
 	/* construct the best path for the given combination of relations */
 	joinrel = gimme_tree(root, tour, num_gene);
@@ -118,12 +132,11 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
 		fitness = DBL_MAX;
 
 	/*
-	 * Restore join_rel_list to its former state, and put back original
-	 * hashtable if any.
+	 * Restore each of the lists in join_rel_list and grouped_rel_list to its
+	 * former state, and put back original hashtables if any.
 	 */
-	root->join_rel_list = list_truncate(root->join_rel_list,
-										savelength);
-	root->join_rel_hash = savehash;
+	restore_relinfolist(root->join_rel_list, &save_join_rel);
+	restore_relinfolist(root->grouped_rel_list, &save_grouped_rel);
 
 	/* release all the memory acquired within gimme_tree */
 	MemoryContextSwitchTo(oldcxt);
@@ -279,6 +292,27 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
 				/* Find and save the cheapest paths for this joinrel */
 				set_cheapest(joinrel);
 
+				/*
+				 * Except for the topmost scan/join rel, consider generating
+				 * partial aggregation paths for the grouped relation on top
+				 * of the paths of this rel.  After that, we're done creating
+				 * paths for the grouped relation, so run set_cheapest().
+				 */
+				if (!bms_equal(joinrel->relids, root->all_query_rels))
+				{
+					RelOptInfo *rel_grouped;
+
+					rel_grouped = find_grouped_rel(root, joinrel->relids);
+					if (rel_grouped)
+					{
+						Assert(IS_GROUPED_REL(rel_grouped));
+
+						generate_grouped_paths(root, rel_grouped, joinrel,
+											   rel_grouped->agg_info);
+						set_cheapest(rel_grouped);
+					}
+				}
+
 				/* Absorb new clump into old */
 				old_clump->joinrel = joinrel;
 				old_clump->size += new_clump->size;
@@ -336,3 +370,27 @@ desirable_join(PlannerInfo *root,
 	/* Otherwise postpone the join till later. */
 	return false;
 }
+
+/*
+ * Save the original length and hashtable of a RelInfoList.
+ */
+static RelInfoListInfo
+save_relinfolist(RelInfoList *relinfo_list)
+{
+	RelInfoListInfo info;
+
+	info.savelength = list_length(relinfo_list->items);
+	info.savehash = relinfo_list->hash;
+
+	return info;
+}
+
+/*
+ * Restore the original length and hashtable of a RelInfoList.
+ */
+static void
+restore_relinfolist(RelInfoList *relinfo_list, RelInfoListInfo *info)
+{
+	relinfo_list->items = list_truncate(relinfo_list->items, info->savelength);
+	relinfo_list->hash = info->savehash;
+}
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 172edb643a..0ac2c2d507 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -47,6 +48,7 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -77,6 +79,7 @@ typedef enum pushdown_safe_type
 
 /* These parameters are set by GUC */
 bool		enable_geqo = false;	/* just in case GUC doesn't set it */
+bool		enable_eager_aggregate = false;
 int			geqo_threshold;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
@@ -90,6 +93,7 @@ join_search_hook_type join_search_hook = NULL;
 
 static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_base_grouped_rels(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 						 Index rti, RangeTblEntry *rte);
@@ -114,6 +118,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 								Index rti, RangeTblEntry *rte);
 static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 									Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
 static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 										 List *live_childrels,
 										 List *all_child_pathkeys);
@@ -182,6 +187,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	 */
 	set_base_rel_sizes(root);
 
+	/*
+	 * Build grouped base relations for each base rel if possible.
+	 */
+	setup_base_grouped_rels(root);
+
 	/*
 	 * We should now have size estimates for every actual table involved in
 	 * the query, and we also know which if any have been deleted from the
@@ -323,6 +333,45 @@ set_base_rel_sizes(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_base_grouped_rels
+ *	  For each "plain" base relation, build a grouped base relation if eager
+ *	  aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_base_grouped_rels(PlannerInfo *root)
+{
+	Index		rti;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo *rel = root->simple_rel_array[rti];
+		RelOptInfo *rel_grouped;
+
+		/* there may be empty slots corresponding to non-baserel RTEs */
+		if (rel == NULL)
+			continue;
+
+		Assert(rel->relid == rti);	/* sanity check on array */
+		Assert(IS_SIMPLE_REL(rel)); /* sanity check on rel */
+
+		rel_grouped = build_simple_grouped_rel(root, rel);
+		if (rel_grouped)
+		{
+			/* Make the grouped relation available for joining. */
+			add_grouped_rel(root, rel_grouped);
+		}
+	}
+}
+
 /*
  * set_base_rel_pathlists
  *	  Finds all paths available for scanning each base-relation entry.
@@ -559,6 +608,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* Now find the cheapest of the paths for this rel */
 	set_cheapest(rel);
 
+	/*
+	 * If a grouped relation for this rel exists, build partial aggregation
+	 * paths for it.
+	 *
+	 * Note that this can only happen after we've called set_cheapest() for
+	 * this base rel, because we need its cheapest paths.
+	 */
+	set_grouped_rel_pathlist(root, rel);
+
 #ifdef OPTIMIZER_DEBUG
 	pprint(rel);
 #endif
@@ -1298,6 +1356,36 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	add_paths_to_append_rel(root, rel, live_childrels);
 }
 
+/*
+ * set_grouped_rel_pathlist
+ *	  If a grouped relation for the given 'rel' exists, build partial
+ *	  aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *rel_grouped;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Add paths to the grouped base relation if one exists. */
+	rel_grouped = find_grouped_rel(root, rel->relids);
+	if (rel_grouped)
+	{
+		Assert(IS_GROUPED_REL(rel_grouped));
+
+		generate_grouped_paths(root, rel_grouped, rel,
+							   rel_grouped->agg_info);
+		set_cheapest(rel_grouped);
+	}
+}
+
 
 /*
  * add_paths_to_append_rel
@@ -3306,6 +3394,318 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
 	}
 }
 
+/*
+ * generate_grouped_paths
+ *		Generate paths for a grouped relation by adding sorted and hashed
+ *		partial aggregation paths on top of paths of the plain base or join
+ *		relation.
+ *
+ * The information needed is provided by the RelAggInfo structure.
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *rel_grouped,
+					   RelOptInfo *rel_plain, RelAggInfo *agg_info)
+{
+	AggClauseCosts agg_costs;
+	bool		can_hash;
+	bool		can_sort;
+	Path	   *cheapest_total_path = NULL;
+	Path	   *cheapest_partial_path = NULL;
+	double		dNumGroups = 0;
+	double		dNumPartialGroups = 0;
+
+	if (IS_DUMMY_REL(rel_plain))
+	{
+		mark_dummy_rel(rel_grouped);
+		return;
+	}
+
+	/*
+	 * If the grouped paths for the given relation are not considered useful,
+	 * do not bother to generate them.
+	 */
+	if (!agg_info->agg_useful)
+		return;
+
+	MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+	get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+	/*
+	 * Determine whether it's possible to perform sort-based implementations
+	 * of grouping.
+	 */
+	can_sort = grouping_is_sortable(agg_info->group_clauses);
+
+	/*
+	 * Determine whether we should consider hash-based implementations of
+	 * grouping.
+	 */
+	Assert(root->numOrderedAggs == 0);
+	can_hash = (agg_info->group_clauses != NIL &&
+				grouping_is_hashable(agg_info->group_clauses));
+
+	/*
+	 * Consider whether we should generate partially aggregated non-partial
+	 * paths.  We can only do this if we have a non-partial path.
+	 */
+	if (rel_plain->pathlist != NIL)
+	{
+		cheapest_total_path = rel_plain->cheapest_total_path;
+		Assert(cheapest_total_path != NULL);
+	}
+
+	/*
+	 * If parallelism is possible for rel_grouped, then we should consider
+	 * generating partially-grouped partial paths.  However, if the plain rel
+	 * has no partial paths, then we can't.
+	 */
+	if (rel_grouped->consider_parallel && rel_plain->partial_pathlist != NIL)
+	{
+		cheapest_partial_path = linitial(rel_plain->partial_pathlist);
+		Assert(cheapest_partial_path != NULL);
+	}
+
+	/* Estimate number of partial groups. */
+	if (cheapest_total_path != NULL)
+		dNumGroups = estimate_num_groups(root,
+										 agg_info->group_exprs,
+										 cheapest_total_path->rows,
+										 NULL, NULL);
+	if (cheapest_partial_path != NULL)
+		dNumPartialGroups = estimate_num_groups(root,
+												agg_info->group_exprs,
+												cheapest_partial_path->rows,
+												NULL, NULL);
+
+	if (can_sort && cheapest_total_path != NULL)
+	{
+		ListCell   *lc;
+
+		/*
+		 * Use any available suitably-sorted path as input, and also consider
+		 * sorting the cheapest-total path.
+		 */
+		foreach(lc, rel_plain->pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   rel_grouped,
+												   input_path,
+												   agg_info->agg_input);
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													path->pathkeys,
+													&presorted_keys);
+			if (!is_sorted)
+			{
+				/*
+				 * Try at least sorting the cheapest path and also try
+				 * incrementally sorting any path which is partially sorted
+				 * already (no need to deal with paths which have presorted
+				 * keys when incremental sort is disabled unless it's the
+				 * cheapest input path).
+				 */
+				if (input_path != cheapest_total_path &&
+					(presorted_keys == 0 || !enable_incremental_sort))
+					continue;
+
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 rel_grouped,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 rel_grouped,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											rel_grouped,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumGroups);
+
+			add_path(rel_grouped, path);
+		}
+	}
+
+	if (can_sort && cheapest_partial_path != NULL)
+	{
+		ListCell   *lc;
+
+		/* Similar to above logic, but for partial paths. */
+		foreach(lc, rel_plain->partial_pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   rel_grouped,
+												   input_path,
+												   agg_info->agg_input);
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													path->pathkeys,
+													&presorted_keys);
+
+			if (!is_sorted)
+			{
+				/*
+				 * Try at least sorting the cheapest path and also try
+				 * incrementally sorting any path which is partially sorted
+				 * already (no need to deal with paths which have presorted
+				 * keys when incremental sort is disabled unless it's the
+				 * cheapest input path).
+				 */
+				if (input_path != cheapest_partial_path &&
+					(presorted_keys == 0 || !enable_incremental_sort))
+					continue;
+
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 rel_grouped,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 rel_grouped,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											rel_grouped,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumPartialGroups);
+
+			add_partial_path(rel_grouped, path);
+		}
+	}
+
+	/*
+	 * Add a partially-grouped HashAgg Path where possible
+	 */
+	if (can_hash && cheapest_total_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   rel_grouped,
+											   cheapest_total_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										rel_grouped,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumGroups);
+
+		add_path(rel_grouped, path);
+	}
+
+	/*
+	 * Now add a partially-grouped HashAgg partial Path where possible
+	 */
+	if (can_hash && cheapest_partial_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   rel_grouped,
+											   cheapest_partial_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										rel_grouped,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumPartialGroups);
+
+		add_partial_path(rel_grouped, path);
+	}
+}
+
 /*
  * make_rel_from_joinlist
  *	  Build access paths using a "joinlist" to guide the join path search.
@@ -3414,9 +3814,10 @@ make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
  * needed for these paths need have been instantiated.
  *
  * Note to plugin authors: the functions invoked during standard_join_search()
- * modify root->join_rel_list and root->join_rel_hash.  If you want to do more
- * than one join-order search, you'll probably need to save and restore the
- * original states of those data structures.  See geqo_eval() for an example.
+ * modify root->join_rel_list->items and root->join_rel_list->hash.  If you
+ * want to do more than one join-order search, you'll probably need to save and
+ * restore the original states of those data structures.  See geqo_eval() for
+ * an example.
  */
 RelOptInfo *
 standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
@@ -3465,6 +3866,10 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 		 *
 		 * After that, we're done creating paths for the joinrel, so run
 		 * set_cheapest().
+		 *
+		 * We also run generate_grouped_paths() for the grouped relation of
+		 * each just-processed joinrel, and then run set_cheapest() for that
+		 * grouped relation.
 		 */
 		foreach(lc, root->join_rel_level[lev])
 		{
@@ -3485,6 +3890,27 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 			/* Find and save the cheapest paths for this rel */
 			set_cheapest(rel);
 
+			/*
+			 * Except for the topmost scan/join rel, consider generating
+			 * partial aggregation paths for the grouped relation on top of
+			 * the paths of this rel.  After that, we're done creating paths
+			 * for the grouped relation, so run set_cheapest().
+			 */
+			if (!bms_equal(rel->relids, root->all_query_rels))
+			{
+				RelOptInfo *rel_grouped;
+
+				rel_grouped = find_grouped_rel(root, rel->relids);
+				if (rel_grouped)
+				{
+					Assert(IS_GROUPED_REL(rel_grouped));
+
+					generate_grouped_paths(root, rel_grouped, rel,
+										   rel_grouped->agg_info);
+					set_cheapest(rel_grouped);
+				}
+			}
+
 #ifdef OPTIMIZER_DEBUG
 			pprint(rel);
 #endif
@@ -4353,6 +4779,29 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 		if (IS_DUMMY_REL(child_rel))
 			continue;
 
+		/*
+		 * Except for the topmost scan/join rel, consider generating partial
+		 * aggregation paths for the grouped relation on top of the paths of
+		 * this partitioned child-join.  After that, we're done creating paths
+		 * for the grouped relation, so run set_cheapest().
+		 */
+		if (!bms_equal(IS_OTHER_REL(rel) ?
+					   rel->top_parent_relids : rel->relids,
+					   root->all_query_rels))
+		{
+			RelOptInfo *rel_grouped;
+
+			rel_grouped = find_grouped_rel(root, child_rel->relids);
+			if (rel_grouped)
+			{
+				Assert(IS_GROUPED_REL(rel_grouped));
+
+				generate_grouped_paths(root, rel_grouped, child_rel,
+									   rel_grouped->agg_info);
+				set_cheapest(rel_grouped);
+			}
+		}
+
 #ifdef OPTIMIZER_DEBUG
 		pprint(child_rel);
 #endif
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index c36687aa4d..c093b47af4 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -180,6 +180,8 @@ static bool cost_qual_eval_walker(Node *node, cost_qual_eval_context *context);
 static void get_restriction_qual_cost(PlannerInfo *root, RelOptInfo *baserel,
 									  ParamPathInfo *param_info,
 									  QualCost *qpqual_cost);
+static void set_joinpath_size(PlannerInfo *root, JoinPath *jpath,
+							  SpecialJoinInfo *sjinfo);
 static bool has_indexed_join_quals(NestPath *path);
 static double approx_tuple_count(PlannerInfo *root, JoinPath *path,
 								 List *quals);
@@ -3370,19 +3372,7 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	if (inner_path_rows <= 0)
 		inner_path_rows = 1;
 	/* Mark the path with the correct row estimate */
-	if (path->jpath.path.param_info)
-		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
-	else
-		path->jpath.path.rows = path->jpath.path.parent->rows;
-
-	/* For partial paths, scale row estimate. */
-	if (path->jpath.path.parallel_workers > 0)
-	{
-		double		parallel_divisor = get_parallel_divisor(&path->jpath.path);
-
-		path->jpath.path.rows =
-			clamp_row_est(path->jpath.path.rows / parallel_divisor);
-	}
+	set_joinpath_size(root, &path->jpath, extra->sjinfo);
 
 	/* cost of inner-relation source data (we already dealt with outer rel) */
 
@@ -3867,19 +3857,7 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 		inner_path_rows = 1;
 
 	/* Mark the path with the correct row estimate */
-	if (path->jpath.path.param_info)
-		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
-	else
-		path->jpath.path.rows = path->jpath.path.parent->rows;
-
-	/* For partial paths, scale row estimate. */
-	if (path->jpath.path.parallel_workers > 0)
-	{
-		double		parallel_divisor = get_parallel_divisor(&path->jpath.path);
-
-		path->jpath.path.rows =
-			clamp_row_est(path->jpath.path.rows / parallel_divisor);
-	}
+	set_joinpath_size(root, &path->jpath, extra->sjinfo);
 
 	/*
 	 * Compute cost of the mergequals and qpquals (other restriction clauses)
@@ -4299,19 +4277,7 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
 
 	/* Mark the path with the correct row estimate */
-	if (path->jpath.path.param_info)
-		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
-	else
-		path->jpath.path.rows = path->jpath.path.parent->rows;
-
-	/* For partial paths, scale row estimate. */
-	if (path->jpath.path.parallel_workers > 0)
-	{
-		double		parallel_divisor = get_parallel_divisor(&path->jpath.path);
-
-		path->jpath.path.rows =
-			clamp_row_est(path->jpath.path.rows / parallel_divisor);
-	}
+	set_joinpath_size(root, &path->jpath, extra->sjinfo);
 
 	/* mark the path with estimated # of batches */
 	path->num_batches = numbatches;
@@ -5061,6 +5027,57 @@ get_restriction_qual_cost(PlannerInfo *root, RelOptInfo *baserel,
 		*qpqual_cost = baserel->baserestrictcost;
 }
 
+/*
+ * set_joinpath_size
+ *	  Set the correct row estimate for the given join path.
+ *
+ * 'jpath' is the join path under consideration.
+ * 'sjinfo' is any SpecialJoinInfo relevant to this join.
+ *
+ * Note that for a grouped join relation, its paths could have very different
+ * rowcount estimates, so we need to calculate the rowcount estimate using the
+ * outer path and inner path of the given join path.
+ */
+static void
+set_joinpath_size(PlannerInfo *root, JoinPath *jpath, SpecialJoinInfo *sjinfo)
+{
+	if (IS_GROUPED_REL(jpath->path.parent))
+	{
+		Path	   *outer_path = jpath->outerjoinpath;
+		Path	   *inner_path = jpath->innerjoinpath;
+
+		/*
+		 * Estimate the number of rows of this grouped join path as the
+		 * product of the outer and inner path row counts and the selectivity
+		 * of the clauses that have ended up at this join node.
+		 */
+		jpath->path.rows = calc_joinrel_size_estimate(root,
+													  jpath->path.parent,
+													  outer_path->parent,
+													  inner_path->parent,
+													  outer_path->rows,
+													  inner_path->rows,
+													  sjinfo,
+													  jpath->joinrestrictinfo);
+	}
+	else
+	{
+		if (jpath->path.param_info)
+			jpath->path.rows = jpath->path.param_info->ppi_rows;
+		else
+			jpath->path.rows = jpath->path.parent->rows;
+
+		/* For partial paths, scale row estimate. */
+		if (jpath->path.parallel_workers > 0)
+		{
+			double		parallel_divisor = get_parallel_divisor(&jpath->path);
+
+			jpath->path.rows =
+				clamp_row_est(jpath->path.rows / parallel_divisor);
+		}
+	}
+}
+
 
 /*
  * compute_semi_anti_join_factors
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 7db5e30eef..20698e48f0 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -35,6 +35,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
 static bool restriction_is_constant_false(List *restrictlist,
 										  RelOptInfo *joinrel,
 										  bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+								  RelOptInfo *rel2, RelOptInfo *joinrel,
+								  SpecialJoinInfo *sjinfo, List *restrictlist);
 static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 										RelOptInfo *rel2, RelOptInfo *joinrel,
 										SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -771,6 +774,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 		return joinrel;
 	}
 
+	/* Build a grouped join relation for 'joinrel' if possible. */
+	make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+						  restrictlist);
+
 	/* Add paths to the join relation. */
 	populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
 								restrictlist);
@@ -882,6 +889,141 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
 	return input_relids;
 }
 
+/*
+ * make_grouped_join_rel
+ *	  Build a grouped join relation out of 'joinrel' if eager aggregation is
+ *	  possible and the 'joinrel' can produce grouped paths.
+ *
+ * We also generate partial aggregation paths for the grouped relation by
+ * joining the grouped paths of 'rel1' to the plain paths of 'rel2', or by
+ * joining the grouped paths of 'rel2' to the plain paths of 'rel1'.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+					  RelOptInfo *rel2, RelOptInfo *joinrel,
+					  SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+	RelOptInfo *rel_grouped;
+	RelOptInfo *rel1_grouped;
+	RelOptInfo *rel2_grouped;
+	bool		rel1_empty;
+	bool		rel2_empty;
+	bool		yet_to_add = false;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/*
+	 * See if we already have a grouped joinrel for this joinrel.
+	 */
+	rel_grouped = find_grouped_rel(root, joinrel->relids);
+
+	/*
+	 * Construct a new RelOptInfo for the grouped join relation if there is no
+	 * existing one.
+	 */
+	if (rel_grouped == NULL)
+	{
+		RelAggInfo *agg_info = NULL;
+
+		/*
+		 * Prepare the information needed to create grouped paths for this
+		 * join relation.
+		 */
+		agg_info = create_rel_agg_info(root, joinrel);
+		if (agg_info == NULL)
+			return;
+
+		/* build a grouped relation out of the plain relation */
+		rel_grouped = build_grouped_rel(root, joinrel);
+		rel_grouped->reltarget = agg_info->target;
+		rel_grouped->rows = agg_info->grouped_rows;
+		rel_grouped->agg_info = agg_info;
+
+		/*
+		 * If the grouped paths for the given join relation are considered
+		 * useful, add the grouped relation we just built to the PlannerInfo
+		 * to make it available for further joining or for acting as the upper
+		 * rel representing the result of partial aggregation.  Otherwise, we
+		 * need to postpone the decision on adding the grouped relation to the
+		 * PlannerInfo, as it depends on whether we can generate any grouped
+		 * paths by joining the given pair of input relations.
+		 */
+		if (agg_info->agg_useful)
+			add_grouped_rel(root, rel_grouped);
+		else
+			yet_to_add = true;
+	}
+
+	Assert(IS_GROUPED_REL(rel_grouped));
+
+	/* We may have already proven this grouped join relation to be dummy. */
+	if (IS_DUMMY_REL(rel_grouped))
+		return;
+
+	/* Retrieve the grouped relations for the two input rels */
+	rel1_grouped = find_grouped_rel(root, rel1->relids);
+	rel2_grouped = find_grouped_rel(root, rel2->relids);
+
+	rel1_empty = (rel1_grouped == NULL || IS_DUMMY_REL(rel1_grouped));
+	rel2_empty = (rel2_grouped == NULL || IS_DUMMY_REL(rel2_grouped));
+
+	/* Nothing to do if there's no grouped relation. */
+	if (rel1_empty && rel2_empty)
+		return;
+
+	/*
+	 * Joining two grouped relations is currently not supported.  Grouping one
+	 * side would alter the occurrence of the other side's aggregate transient
+	 * states in the final aggregation input.  While this issue could be
+	 * addressed by adjusting the transient states, it is not deemed
+	 * worthwhile for now.
+	 */
+	if (!rel1_empty && !rel2_empty)
+		return;
+
+	/* Generate partial aggregation paths for the grouped relation */
+	if (!rel1_empty)
+	{
+		populate_joinrel_with_paths(root, rel1_grouped, rel2, rel_grouped,
+									sjinfo, restrictlist);
+
+		/*
+		 * populate_joinrel_with_paths should never have marked rel1_grouped
+		 * as dummy due to provably constant-false join restrictions, so we
+		 * cannot end up with a plan that has an Aggref in a non-Agg plan
+		 * node.
+		 */
+		Assert(!IS_DUMMY_REL(rel1_grouped));
+	}
+	else if (!rel2_empty)
+	{
+		populate_joinrel_with_paths(root, rel1, rel2_grouped, rel_grouped,
+									sjinfo, restrictlist);
+
+		/*
+		 * populate_joinrel_with_paths should never have marked rel2_grouped
+		 * as dummy due to provably constant-false join restrictions, so we
+		 * cannot end up with a plan that has an Aggref in a non-Agg plan
+		 * node.
+		 */
+		Assert(!IS_DUMMY_REL(rel2_grouped));
+	}
+
+	/*
+	 * Since we have generated grouped paths by joining the given pair of
+	 * input relations, add the grouped relation to the PlannerInfo if we have
+	 * not already done so.
+	 */
+	if (yet_to_add)
+		add_grouped_rel(root, rel_grouped);
+}
+
 /*
  * populate_joinrel_with_paths
  *	  Add paths to the given joinrel for given pair of joining relations. The
@@ -1674,6 +1816,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 						 adjust_child_relids(joinrel->relids,
 											 nappinfos, appinfos)));
 
+		/* Build a grouped join relation for 'child_joinrel' if possible */
+		make_grouped_join_rel(root, child_rel1, child_rel2,
+							  child_joinrel, child_sjinfo,
+							  child_restrictlist);
+
 		/* And make paths for the child join */
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 5f3908be51..1f5e670dcc 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/nbtree.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
@@ -81,6 +82,8 @@ typedef struct JoinTreeItem
 } JoinTreeItem;
 
 
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
 static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
 									   Index rtindex);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -628,6 +631,261 @@ remove_useless_groupby_columns(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_eager_aggregation
+ *	  Check if eager aggregation is applicable, and if so collect suitable
+ *	  aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+	/*
+	 * Don't apply eager aggregation if disabled by user.
+	 */
+	if (!enable_eager_aggregate)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if there are no available GROUP BY
+	 * clauses.
+	 */
+	if (!root->processed_groupClause)
+		return;
+
+	/*
+	 * For now we don't try to support grouping sets.
+	 */
+	if (root->parse->groupingSets)
+		return;
+
+	/*
+	 * For now we don't try to support DISTINCT or ORDER BY aggregates.
+	 */
+	if (root->numOrderedAggs > 0)
+		return;
+
+	/*
+	 * If there are any aggregates that do not support partial mode, or any
+	 * partial aggregates that are non-serializable, do not apply eager
+	 * aggregation.
+	 */
+	if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+		return;
+
+	/*
+	 * We don't try to apply eager aggregation if there are set-returning
+	 * functions in the targetlist.
+	 */
+	if (root->parse->hasTargetSRFs)
+		return;
+
+	/*
+	 * Eager aggregation only makes sense if there are multiple base rels in
+	 * the query.
+	 */
+	if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+		return;
+
+	/*
+	 * Collect aggregate expressions and plain Vars that appear in the
+	 * targetlist and havingQual.
+	 */
+	create_agg_clause_infos(root);
+
+	/*
+	 * If there are no suitable aggregate expressions, we cannot apply eager
+	 * aggregation.
+	 */
+	if (root->agg_clause_list == NIL)
+		return;
+
+	/*
+	 * Collect grouping expressions that appear in grouping clauses.
+	 */
+	create_grouping_expr_infos(root);
+}
+
+/*
+ * create_agg_clause_infos
+ *	  Search the targetlist and havingQual for Aggrefs and plain Vars, and
+ *	  create an AggClauseInfo for each Aggref node.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+	List	   *tlist_exprs;
+	ListCell   *lc;
+
+	Assert(root->agg_clause_list == NIL);
+	Assert(root->tlist_vars == NIL);
+
+	tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+								  PVC_INCLUDE_AGGREGATES |
+								  PVC_RECURSE_WINDOWFUNCS |
+								  PVC_RECURSE_PLACEHOLDERS);
+
+	/*
+	 * Aggregates within the HAVING clause need to be processed in the same
+	 * way as those in the targetlist.  Note that HAVING can contain Aggrefs
+	 * but not WindowFuncs.
+	 */
+	if (root->parse->havingQual != NULL)
+	{
+		List	   *having_exprs;
+
+		having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+									   PVC_INCLUDE_AGGREGATES |
+									   PVC_RECURSE_PLACEHOLDERS);
+		if (having_exprs != NIL)
+		{
+			tlist_exprs = list_concat(tlist_exprs, having_exprs);
+			list_free(having_exprs);
+		}
+	}
+
+	foreach(lc, tlist_exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Aggref	   *aggref;
+		AggClauseInfo *ac_info;
+
+		/* For now we don't try to support GROUPING() expressions */
+		if (IsA(expr, GroupingFunc))
+		{
+			list_free_deep(root->agg_clause_list);
+			root->agg_clause_list = NIL;
+
+			list_free(root->tlist_vars);
+			root->tlist_vars = NIL;
+
+			return;
+		}
+
+		/* Collect plain Vars for future reference */
+		if (IsA(expr, Var))
+		{
+			root->tlist_vars = list_append_unique(root->tlist_vars, expr);
+			continue;
+		}
+
+		aggref = castNode(Aggref, expr);
+
+		Assert(aggref->aggorder == NIL);
+		Assert(aggref->aggdistinct == NIL);
+
+		ac_info = makeNode(AggClauseInfo);
+		ac_info->aggref = aggref;
+		ac_info->agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+		root->agg_clause_list =
+			list_append_unique(root->agg_clause_list, ac_info);
+	}
+
+	list_free(tlist_exprs);
+}
+
+/*
+ * create_grouping_expr_infos
+ *	  Create GroupExprInfo for each expression usable as a grouping key.
+ *
+ * If any grouping expression is not suitable, we will just return with
+ * root->group_expr_list being NIL.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+	List	   *exprs = NIL;
+	List	   *sortgrouprefs = NIL;
+	List	   *btree_opfamilies = NIL;
+	ListCell   *lc,
+			   *lc1,
+			   *lc2,
+			   *lc3;
+
+	Assert(root->group_expr_list == NIL);
+
+	foreach(lc, root->processed_groupClause)
+	{
+		SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+		TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+		TypeCacheEntry *tce;
+		Oid			equalimageproc;
+		Oid			eq_op;
+		List	   *eq_opfamilies;
+		Oid			btree_opfamily;
+
+		Assert(tle->ressortgroupref > 0);
+
+		/*
+		 * For now we only support plain Vars as grouping expressions.
+		 */
+		if (!IsA(tle->expr, Var))
+			return;
+
+		/*
+		 * Eager aggregation is only possible if equality implies image
+		 * equality for each grouping key.  Otherwise, placing keys with
+		 * different byte images into the same group may result in the loss of
+		 * information that could be necessary to evaluate upper qual clauses.
+		 *
+		 * For instance, the NUMERIC data type is not supported, as values
+		 * that are considered equal by the equality operator (e.g., 0 and
+		 * 0.0) can have different scales.
+		 */
+		tce = lookup_type_cache(exprType((Node *) tle->expr),
+								TYPECACHE_BTREE_OPFAMILY);
+		if (!OidIsValid(tce->btree_opf) ||
+			!OidIsValid(tce->btree_opintype))
+			return;
+
+		equalimageproc = get_opfamily_proc(tce->btree_opf,
+										   tce->btree_opintype,
+										   tce->btree_opintype,
+										   BTEQUALIMAGE_PROC);
+		if (!OidIsValid(equalimageproc) ||
+			!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+											   tce->typcollation,
+											   ObjectIdGetDatum(tce->btree_opintype))))
+			return;
+
+		/*
+		 * Get the equality operator in the btree opfamily.
+		 */
+		eq_op = get_opfamily_member(tce->btree_opf,
+									tce->btree_opintype,
+									tce->btree_opintype,
+									BTEqualStrategyNumber);
+		if (!OidIsValid(eq_op))
+			return;
+		eq_opfamilies = get_mergejoin_opfamilies(eq_op);
+		if (!eq_opfamilies)
+			return;
+		btree_opfamily = linitial_oid(eq_opfamilies);
+
+		exprs = lappend(exprs, tle->expr);
+		sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+		btree_opfamilies = lappend_oid(btree_opfamilies, btree_opfamily);
+	}
+
+	/*
+	 * Construct GroupExprInfo for each expression.
+	 */
+	forthree(lc1, exprs, lc2, sortgrouprefs, lc3, btree_opfamilies)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc1);
+		int			sortgroupref = lfirst_int(lc2);
+		Oid			btree_opfamily = lfirst_oid(lc3);
+		GroupExprInfo *ge_info;
+
+		ge_info = makeNode(GroupExprInfo);
+		ge_info->expr = (Expr *) copyObject(expr);
+		ge_info->sortgroupref = sortgroupref;
+		ge_info->btree_opfamily = btree_opfamily;
+
+		root->group_expr_list = lappend(root->group_expr_list, ge_info);
+	}
+}
+
 /*****************************************************************************
  *
  *	  LATERAL REFERENCES
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 735560e8ca..22df968629 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -64,8 +64,12 @@ query_planner(PlannerInfo *root,
 	 * NOTE: append_rel_list was set up by subquery_planner, so do not touch
 	 * here.
 	 */
-	root->join_rel_list = NIL;
-	root->join_rel_hash = NULL;
+	root->join_rel_list = makeNode(RelInfoList);
+	root->join_rel_list->items = NIL;
+	root->join_rel_list->hash = NULL;
+	root->grouped_rel_list = makeNode(RelInfoList);
+	root->grouped_rel_list->items = NIL;
+	root->grouped_rel_list->hash = NULL;
 	root->join_rel_level = NULL;
 	root->join_cur_level = 0;
 	root->canon_pathkeys = NIL;
@@ -76,6 +80,9 @@ query_planner(PlannerInfo *root,
 	root->placeholder_list = NIL;
 	root->placeholder_array = NULL;
 	root->placeholder_array_size = 0;
+	root->agg_clause_list = NIL;
+	root->group_expr_list = NIL;
+	root->tlist_vars = NIL;
 	root->fkey_list = NIL;
 	root->initial_rels = NIL;
 
@@ -260,6 +267,12 @@ query_planner(PlannerInfo *root,
 	 */
 	extract_restriction_or_clauses(root);
 
+	/*
+	 * Check if eager aggregation is applicable, and if so, set up
+	 * root->agg_clause_list and root->group_expr_list.
+	 */
+	setup_eager_aggregation(root);
+
 	/*
 	 * Now expand appendrels by adding "otherrels" for their children.  We
 	 * delay this to the end so that we have as much information as possible
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index a0a2de7ee4..049bb679f0 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -229,7 +229,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 									  RelOptInfo *partially_grouped_rel,
 									  const AggClauseCosts *agg_costs,
 									  grouping_sets_data *gd,
-									  double dNumGroups,
 									  GroupPathExtraData *extra);
 static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
 												 RelOptInfo *grouped_rel,
@@ -3916,9 +3915,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 							   GroupPathExtraData *extra,
 							   RelOptInfo **partially_grouped_rel_p)
 {
-	Path	   *cheapest_path = input_rel->cheapest_total_path;
 	RelOptInfo *partially_grouped_rel = NULL;
-	double		dNumGroups;
 	PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
 
 	/*
@@ -4000,23 +3997,16 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Gather any partially grouped partial paths. */
 	if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
-	{
 		gather_grouping_paths(root, partially_grouped_rel);
-		set_cheapest(partially_grouped_rel);
-	}
 
-	/*
-	 * Estimate number of groups.
-	 */
-	dNumGroups = get_number_of_groups(root,
-									  cheapest_path->rows,
-									  gd,
-									  extra->targetList);
+	/* Now choose the best path(s) for partially_grouped_rel. */
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+		set_cheapest(partially_grouped_rel);
 
 	/* Build final grouping paths */
 	add_paths_to_grouping_rel(root, input_rel, grouped_rel,
 							  partially_grouped_rel, agg_costs, gd,
-							  dNumGroups, extra);
+							  extra);
 
 	/* Give a helpful error if we failed to find any implementation */
 	if (grouped_rel->pathlist == NIL)
@@ -6906,16 +6896,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 						  RelOptInfo *grouped_rel,
 						  RelOptInfo *partially_grouped_rel,
 						  const AggClauseCosts *agg_costs,
-						  grouping_sets_data *gd, double dNumGroups,
+						  grouping_sets_data *gd,
 						  GroupPathExtraData *extra)
 {
 	Query	   *parse = root->parse;
 	Path	   *cheapest_path = input_rel->cheapest_total_path;
+	Path	   *cheapest_partially_grouped_path = NULL;
 	ListCell   *lc;
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 	List	   *havingQual = (List *) extra->havingQual;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+	double		dNumGroups = 0;
+	double		dNumFinalGroups = 0;
+
+	/*
+	 * Estimate number of groups for non-split aggregation.
+	 */
+	dNumGroups = get_number_of_groups(root,
+									  cheapest_path->rows,
+									  gd,
+									  extra->targetList);
+
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+	{
+		cheapest_partially_grouped_path =
+			partially_grouped_rel->cheapest_total_path;
+
+		/*
+		 * Estimate number of groups for final phase of partial aggregation.
+		 */
+		dNumFinalGroups =
+			get_number_of_groups(root,
+								 cheapest_partially_grouped_path->rows,
+								 gd,
+								 extra->targetList);
+	}
 
 	if (can_sort)
 	{
@@ -7028,7 +7044,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 					path = make_ordered_path(root,
 											 grouped_rel,
 											 path,
-											 partially_grouped_rel->cheapest_total_path,
+											 cheapest_partially_grouped_path,
 											 info->pathkeys,
 											 -1.0);
 
@@ -7046,7 +7062,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												 info->clauses,
 												 havingQual,
 												 agg_final_costs,
-												 dNumGroups));
+												 dNumFinalGroups));
 					else
 						add_path(grouped_rel, (Path *)
 								 create_group_path(root,
@@ -7054,7 +7070,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												   path,
 												   info->clauses,
 												   havingQual,
-												   dNumGroups));
+												   dNumFinalGroups));
 
 				}
 			}
@@ -7096,19 +7112,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 		 */
 		if (partially_grouped_rel && partially_grouped_rel->pathlist)
 		{
-			Path	   *path = partially_grouped_rel->cheapest_total_path;
-
 			add_path(grouped_rel, (Path *)
 					 create_agg_path(root,
 									 grouped_rel,
-									 path,
+									 cheapest_partially_grouped_path,
 									 grouped_rel->reltarget,
 									 AGG_HASHED,
 									 AGGSPLIT_FINAL_DESERIAL,
 									 root->processed_groupClause,
 									 havingQual,
 									 agg_final_costs,
-									 dNumGroups));
+									 dNumFinalGroups));
 		}
 	}
 
@@ -7158,6 +7172,21 @@ create_partial_grouping_paths(PlannerInfo *root,
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 
+	/*
+	 * The partially_grouped_rel could have been already created due to eager
+	 * aggregation.
+	 */
+	partially_grouped_rel = find_grouped_rel(root, input_rel->relids);
+	Assert(enable_eager_aggregate || partially_grouped_rel == NULL);
+
+	/*
+	 * It is possible that the partially_grouped_rel created by eager
+	 * aggregation is dummy.  In this case we just set it to NULL.  It might
+	 * be created again by the following logic if possible.
+	 */
+	if (partially_grouped_rel && IS_DUMMY_REL(partially_grouped_rel))
+		partially_grouped_rel = NULL;
+
 	/*
 	 * Consider whether we should generate partially aggregated non-partial
 	 * paths.  We can only do this if we have a non-partial path, and only if
@@ -7181,19 +7210,27 @@ create_partial_grouping_paths(PlannerInfo *root,
 	 * If we can't partially aggregate partial paths, and we can't partially
 	 * aggregate non-partial paths, then don't bother creating the new
 	 * RelOptInfo at all, unless the caller specified force_rel_creation.
+	 *
+	 * Note that the partially_grouped_rel could have been already created and
+	 * populated with appropriate paths by eager aggregation.
 	 */
 	if (cheapest_total_path == NULL &&
 		cheapest_partial_path == NULL &&
+		(partially_grouped_rel == NULL ||
+		 partially_grouped_rel->pathlist == NIL) &&
 		!force_rel_creation)
 		return NULL;
 
 	/*
 	 * Build a new upper relation to represent the result of partially
-	 * aggregating the rows from the input relation.
-	 */
-	partially_grouped_rel = fetch_upper_rel(root,
-											UPPERREL_PARTIAL_GROUP_AGG,
-											grouped_rel->relids);
+	 * aggregating the rows from the input relation.  The relation may already
+	 * exist due to eager aggregation, in which case we don't need to create
+	 * it.
+	 */
+	if (partially_grouped_rel == NULL)
+		partially_grouped_rel = fetch_upper_rel(root,
+												UPPERREL_PARTIAL_GROUP_AGG,
+												grouped_rel->relids);
 	partially_grouped_rel->consider_parallel =
 		grouped_rel->consider_parallel;
 	partially_grouped_rel->reloptkind = grouped_rel->reloptkind;
@@ -7202,6 +7239,14 @@ create_partial_grouping_paths(PlannerInfo *root,
 	partially_grouped_rel->useridiscurrent = grouped_rel->useridiscurrent;
 	partially_grouped_rel->fdwroutine = grouped_rel->fdwroutine;
 
+	/*
+	 * Partially-grouped partial paths may have been generated by eager
+	 * aggregation.  If we find that parallelism is not possible for
+	 * partially_grouped_rel, we need to drop these partial paths.
+	 */
+	if (!partially_grouped_rel->consider_parallel)
+		partially_grouped_rel->partial_pathlist = NIL;
+
 	/*
 	 * Build target list for partial aggregate paths.  These paths cannot just
 	 * emit the same tlist as regular aggregate paths, because (1) we must
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index 45e8b74f94..0e4c7b2b2d 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -499,6 +499,66 @@ adjust_appendrel_attrs_mutator(Node *node,
 		return (Node *) newinfo;
 	}
 
+	/*
+	 * We have to process RelAggInfo nodes specially.
+	 */
+	if (IsA(node, RelAggInfo))
+	{
+		RelAggInfo *oldinfo = (RelAggInfo *) node;
+		RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newinfo, oldinfo, sizeof(RelAggInfo));
+
+		newinfo->relids = adjust_child_relids(oldinfo->relids,
+											  context->nappinfos,
+											  context->appinfos);
+
+		newinfo->target = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+										   context);
+
+		newinfo->agg_input = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+										   context);
+
+		newinfo->group_clauses = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_clauses,
+										   context);
+
+		newinfo->group_exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+										   context);
+
+		return (Node *) newinfo;
+	}
+
+	/*
+	 * We have to process PathTarget nodes specially.
+	 */
+	if (IsA(node, PathTarget))
+	{
+		PathTarget *oldtarget = (PathTarget *) node;
+		PathTarget *newtarget = makeNode(PathTarget);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+		if (oldtarget->sortgrouprefs)
+		{
+			Size		nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+			newtarget->exprs = (List *)
+				adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+											   context);
+
+			newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+			memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+		}
+
+		return (Node *) newtarget;
+	}
+
 	/*
 	 * NOTE: we do not need to recurse into sublinks, because they should
 	 * already have been converted to subplans before we see them.
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index fc97bf6ee2..673e181b32 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -262,6 +262,12 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
  * unparameterized path, too, if there is one; the users of that list find
  * it more convenient if that's included.
  *
+ * cheapest_parameterized_paths also always includes the fewest-row
+ * unparameterized path, if there is one, for grouped relations.  Different
+ * paths of a grouped relation can have very different row counts, and in some
+ * cases the cheapest-total unparameterized path may not be the one with the
+ * fewest rows.
+ *
  * This is normally called only after we've finished constructing the path
  * list for the rel node.
  */
@@ -271,6 +277,7 @@ set_cheapest(RelOptInfo *parent_rel)
 	Path	   *cheapest_startup_path;
 	Path	   *cheapest_total_path;
 	Path	   *best_param_path;
+	Path	   *fewest_row_path;
 	List	   *parameterized_paths;
 	ListCell   *p;
 
@@ -280,6 +287,7 @@ set_cheapest(RelOptInfo *parent_rel)
 		elog(ERROR, "could not devise a query plan for the given query");
 
 	cheapest_startup_path = cheapest_total_path = best_param_path = NULL;
+	fewest_row_path = NULL;
 	parameterized_paths = NIL;
 
 	foreach(p, parent_rel->pathlist)
@@ -341,6 +349,8 @@ set_cheapest(RelOptInfo *parent_rel)
 			if (cheapest_total_path == NULL)
 			{
 				cheapest_startup_path = cheapest_total_path = path;
+				if (IS_GROUPED_REL(parent_rel))
+					fewest_row_path = path;
 				continue;
 			}
 
@@ -364,6 +374,27 @@ set_cheapest(RelOptInfo *parent_rel)
 				 compare_pathkeys(cheapest_total_path->pathkeys,
 								  path->pathkeys) == PATHKEYS_BETTER2))
 				cheapest_total_path = path;
+
+			/*
+			 * Find the fewest-row unparameterized path for a grouped
+			 * relation.  If we find two paths of the same row count, try to
+			 * keep the one with the cheaper total cost; if the costs are
+			 * identical, keep the better-sorted one.
+			 */
+			if (IS_GROUPED_REL(parent_rel))
+			{
+				if (fewest_row_path->rows > path->rows)
+					fewest_row_path = path;
+				else if (fewest_row_path->rows == path->rows)
+				{
+					cmp = compare_path_costs(fewest_row_path, path, TOTAL_COST);
+					if (cmp > 0 ||
+						(cmp == 0 &&
+						 compare_pathkeys(fewest_row_path->pathkeys,
+										  path->pathkeys) == PATHKEYS_BETTER2))
+						fewest_row_path = path;
+				}
+			}
 		}
 	}
 
@@ -371,6 +402,10 @@ set_cheapest(RelOptInfo *parent_rel)
 	if (cheapest_total_path)
 		parameterized_paths = lcons(cheapest_total_path, parameterized_paths);
 
+	/* Add fewest-row unparameterized path, if any, to parameterized_paths */
+	if (fewest_row_path && fewest_row_path != cheapest_total_path)
+		parameterized_paths = lcons(fewest_row_path, parameterized_paths);
+
 	/*
 	 * If there is no unparameterized path, use the best parameterized path as
 	 * cheapest_total_path (but not as cheapest_startup_path).
@@ -2787,8 +2822,7 @@ create_projection_path(PlannerInfo *root,
 	pathnode->path.pathtype = T_Result;
 	pathnode->path.parent = rel;
 	pathnode->path.pathtarget = target;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe &&
@@ -3043,8 +3077,7 @@ create_incremental_sort_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	/* Sort doesn't project, so use source path's pathtarget */
 	pathnode->path.pathtarget = subpath->pathtarget;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
@@ -3091,8 +3124,7 @@ create_sort_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	/* Sort doesn't project, so use source path's pathtarget */
 	pathnode->path.pathtarget = subpath->pathtarget;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
@@ -3253,8 +3285,7 @@ create_agg_path(PlannerInfo *root,
 	pathnode->path.pathtype = T_Agg;
 	pathnode->path.parent = rel;
 	pathnode->path.pathtarget = target;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index f96573eb5d..6282c10da6 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,8 @@
 
 #include <limits.h>
 
+#include "access/nbtree.h"
+#include "catalog/pg_constraint.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
@@ -27,19 +29,27 @@
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
 #include "parser/parse_relation.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/typcache.h"
 
 
-typedef struct JoinHashEntry
+/*
+ * An entry in the hash table that we use to make lookups of RelOptInfo
+ * structures more efficient.
+ */
+typedef struct RelHashEntry
 {
-	Relids		join_relids;	/* hash key --- MUST BE FIRST */
-	RelOptInfo *join_rel;
-} JoinHashEntry;
+	Relids		relids;			/* hash key --- MUST BE FIRST */
+	RelOptInfo *rel;
+} RelHashEntry;
 
 static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
 								RelOptInfo *input_rel,
@@ -83,7 +93,17 @@ static void build_child_join_reltarget(PlannerInfo *root,
 									   RelOptInfo *childrel,
 									   int nappinfos,
 									   AppendRelInfo **appinfos);
-
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+													RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+								  PathTarget *target, PathTarget *agg_input,
+								  List **group_clauses, List **group_exprs);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
+
+/* Minimum row reduction ratio at which a grouped path is considered useful */
+#define EAGER_AGGREGATE_RATIO 0.5
 
 /*
  * setup_simple_rel_arrays
@@ -276,6 +296,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	rel->joininfo = NIL;
 	rel->has_eclass_joins = false;
 	rel->consider_partitionwise_join = false;	/* might get changed later */
+	rel->agg_info = NULL;
 	rel->part_scheme = NULL;
 	rel->nparts = -1;
 	rel->boundinfo = NULL;
@@ -406,6 +427,99 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	return rel;
 }
 
+/*
+ * build_simple_grouped_rel
+ *	  Construct a new RelOptInfo for a grouped base relation out of an existing
+ *	  non-grouped base relation.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel_plain)
+{
+	RelOptInfo *rel_grouped;
+	RelAggInfo *agg_info;
+
+	/*
+	 * We should have available aggregate expressions and grouping
+	 * expressions, otherwise we cannot reach here.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/* nothing to do for dummy rel */
+	if (IS_DUMMY_REL(rel_plain))
+		return NULL;
+
+	/*
+	 * Prepare the information needed to create grouped paths for this base
+	 * relation.
+	 */
+	agg_info = create_rel_agg_info(root, rel_plain);
+	if (agg_info == NULL)
+		return NULL;
+
+	/*
+	 * If the grouped paths for the given base relation are not considered
+	 * useful, do not build the grouped relation.
+	 */
+	if (!agg_info->agg_useful)
+		return NULL;
+
+	/* build a grouped relation out of the plain relation */
+	rel_grouped = build_grouped_rel(root, rel_plain);
+	rel_grouped->reltarget = agg_info->target;
+	rel_grouped->rows = agg_info->grouped_rows;
+	rel_grouped->agg_info = agg_info;
+
+	return rel_grouped;
+}
+
+/*
+ * build_grouped_rel
+ *	  Build a grouped relation by flat copying a plain relation and resetting
+ *	  the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel_plain)
+{
+	RelOptInfo *rel_grouped;
+
+	rel_grouped = makeNode(RelOptInfo);
+	memcpy(rel_grouped, rel_plain, sizeof(RelOptInfo));
+
+	/*
+	 * clear path info
+	 */
+	rel_grouped->pathlist = NIL;
+	rel_grouped->ppilist = NIL;
+	rel_grouped->partial_pathlist = NIL;
+	rel_grouped->cheapest_startup_path = NULL;
+	rel_grouped->cheapest_total_path = NULL;
+	rel_grouped->cheapest_unique_path = NULL;
+	rel_grouped->cheapest_parameterized_paths = NIL;
+
+	/*
+	 * clear partition info
+	 */
+	rel_grouped->part_scheme = NULL;
+	rel_grouped->nparts = -1;
+	rel_grouped->boundinfo = NULL;
+	rel_grouped->partbounds_merged = false;
+	rel_grouped->partition_qual = NIL;
+	rel_grouped->part_rels = NULL;
+	rel_grouped->live_parts = NULL;
+	rel_grouped->all_partrels = NULL;
+	rel_grouped->partexprs = NULL;
+	rel_grouped->nullable_partexprs = NULL;
+	rel_grouped->consider_partitionwise_join = false;
+
+	/*
+	 * clear size estimates
+	 */
+	rel_grouped->rows = 0;
+
+	return rel_grouped;
+}
+
 /*
  * find_base_rel
  *	  Find a base or otherrel relation entry, which must already exist.
@@ -479,11 +593,11 @@ find_base_rel_ignore_join(PlannerInfo *root, int relid)
 }
 
 /*
- * build_join_rel_hash
- *	  Construct the auxiliary hash table for join relations.
+ * build_rel_hash
+ *	  Construct the auxiliary hash table for relations.
  */
 static void
-build_join_rel_hash(PlannerInfo *root)
+build_rel_hash(RelInfoList *list)
 {
 	HTAB	   *hashtab;
 	HASHCTL		hash_ctl;
@@ -491,47 +605,46 @@ build_join_rel_hash(PlannerInfo *root)
 
 	/* Create the hash table */
 	hash_ctl.keysize = sizeof(Relids);
-	hash_ctl.entrysize = sizeof(JoinHashEntry);
+	hash_ctl.entrysize = sizeof(RelHashEntry);
 	hash_ctl.hash = bitmap_hash;
 	hash_ctl.match = bitmap_match;
 	hash_ctl.hcxt = CurrentMemoryContext;
-	hashtab = hash_create("JoinRelHashTable",
+	hashtab = hash_create("RelHashTable",
 						  256L,
 						  &hash_ctl,
 						  HASH_ELEM | HASH_FUNCTION | HASH_COMPARE | HASH_CONTEXT);
 
-	/* Insert all the already-existing joinrels */
-	foreach(l, root->join_rel_list)
+	/* Insert all the already-existing RelOptInfos */
+	foreach(l, list->items)
 	{
 		RelOptInfo *rel = (RelOptInfo *) lfirst(l);
-		JoinHashEntry *hentry;
+		RelHashEntry *hentry;
 		bool		found;
 
-		hentry = (JoinHashEntry *) hash_search(hashtab,
-											   &(rel->relids),
-											   HASH_ENTER,
-											   &found);
+		hentry = (RelHashEntry *) hash_search(hashtab,
+											  &(rel->relids),
+											  HASH_ENTER,
+											  &found);
 		Assert(!found);
-		hentry->join_rel = rel;
+		hentry->rel = rel;
 	}
 
-	root->join_rel_hash = hashtab;
+	list->hash = hashtab;
 }
 
 /*
- * find_join_rel
- *	  Returns relation entry corresponding to 'relids' (a set of RT indexes),
- *	  or NULL if none exists.  This is for join relations.
+ * find_rel_info
+ *	  Find a RelOptInfo entry corresponding to 'relids'.
  */
-RelOptInfo *
-find_join_rel(PlannerInfo *root, Relids relids)
+static RelOptInfo *
+find_rel_info(RelInfoList *list, Relids relids)
 {
 	/*
 	 * Switch to using hash lookup when list grows "too long".  The threshold
 	 * is arbitrary and is known only here.
 	 */
-	if (!root->join_rel_hash && list_length(root->join_rel_list) > 32)
-		build_join_rel_hash(root);
+	if (!list->hash && list_length(list->items) > 32)
+		build_rel_hash(list);
 
 	/*
 	 * Use either hashtable lookup or linear search, as appropriate.
@@ -541,23 +654,23 @@ find_join_rel(PlannerInfo *root, Relids relids)
 	 * so would force relids out of a register and thus probably slow down the
 	 * list-search case.
 	 */
-	if (root->join_rel_hash)
+	if (list->hash)
 	{
 		Relids		hashkey = relids;
-		JoinHashEntry *hentry;
+		RelHashEntry *hentry;
 
-		hentry = (JoinHashEntry *) hash_search(root->join_rel_hash,
-											   &hashkey,
-											   HASH_FIND,
-											   NULL);
+		hentry = (RelHashEntry *) hash_search(list->hash,
+											  &hashkey,
+											  HASH_FIND,
+											  NULL);
 		if (hentry)
-			return hentry->join_rel;
+			return hentry->rel;
 	}
 	else
 	{
 		ListCell   *l;
 
-		foreach(l, root->join_rel_list)
+		foreach(l, list->items)
 		{
 			RelOptInfo *rel = (RelOptInfo *) lfirst(l);
 
@@ -569,6 +682,28 @@ find_join_rel(PlannerInfo *root, Relids relids)
 	return NULL;
 }
 
+/*
+ * find_join_rel
+ *	  Returns relation entry corresponding to 'relids' (a set of RT indexes),
+ *	  or NULL if none exists.  This is for join relations.
+ */
+RelOptInfo *
+find_join_rel(PlannerInfo *root, Relids relids)
+{
+	return find_rel_info(root->join_rel_list, relids);
+}
+
+/*
+ * find_grouped_rel
+ *	  Returns relation entry corresponding to 'relids' (a set of RT indexes),
+ *	  or NULL if none exists.  This is for grouped relations.
+ */
+RelOptInfo *
+find_grouped_rel(PlannerInfo *root, Relids relids)
+{
+	return find_rel_info(root->grouped_rel_list, relids);
+}
+
 /*
  * set_foreign_rel_properties
  *		Set up foreign-join fields if outer and inner relation are foreign
@@ -619,31 +754,53 @@ set_foreign_rel_properties(RelOptInfo *joinrel, RelOptInfo *outer_rel,
 }
 
 /*
- * add_join_rel
- *		Add given join relation to the list of join relations in the given
- *		PlannerInfo. Also add it to the auxiliary hashtable if there is one.
+ * add_rel_info
+ *		Add given relation to the list, and also add it to the auxiliary
+ *		hashtable if there is one.
  */
 static void
-add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
+add_rel_info(RelInfoList *list, RelOptInfo *rel)
 {
-	/* GEQO requires us to append the new joinrel to the end of the list! */
-	root->join_rel_list = lappend(root->join_rel_list, joinrel);
+	/* GEQO requires us to append the new relation to the end of the list! */
+	list->items = lappend(list->items, rel);
 
 	/* store it into the auxiliary hashtable if there is one. */
-	if (root->join_rel_hash)
+	if (list->hash)
 	{
-		JoinHashEntry *hentry;
+		RelHashEntry *hentry;
 		bool		found;
 
-		hentry = (JoinHashEntry *) hash_search(root->join_rel_hash,
-											   &(joinrel->relids),
-											   HASH_ENTER,
-											   &found);
+		hentry = (RelHashEntry *) hash_search(list->hash,
+											  &(rel->relids),
+											  HASH_ENTER,
+											  &found);
 		Assert(!found);
-		hentry->join_rel = joinrel;
+		hentry->rel = rel;
 	}
 }
 
+/*
+ * add_join_rel
+ *		Add given join relation to the list of join relations in the given
+ *		PlannerInfo.
+ */
+static void
+add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
+{
+	add_rel_info(root->join_rel_list, joinrel);
+}
+
+/*
+ * add_grouped_rel
+ *		Add given grouped relation to the list of grouped relations in the
+ *		given PlannerInfo.
+ */
+void
+add_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	add_rel_info(root->grouped_rel_list, rel);
+}
+
 /*
  * build_join_rel
  *	  Returns relation entry corresponding to the union of two given rels,
@@ -755,6 +912,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
 	joinrel->parent = NULL;
 	joinrel->top_parent = NULL;
 	joinrel->top_parent_relids = NULL;
@@ -939,6 +1097,7 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
 	joinrel->parent = parent_joinrel;
 	joinrel->top_parent = parent_joinrel->top_parent ? parent_joinrel->top_parent : parent_joinrel;
 	joinrel->top_parent_relids = joinrel->top_parent->relids;
@@ -2518,3 +2677,511 @@ build_child_join_reltarget(PlannerInfo *root,
 	childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
 	childrel->reltarget->width = parentrel->reltarget->width;
 }
+
+/*
+ * create_rel_agg_info
+ *	  Create the RelAggInfo structure for the given relation if it can produce
+ *	  grouped paths.  The given relation is the non-grouped one which has the
+ *	  reltarget already constructed.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	RelAggInfo *result;
+	PathTarget *agg_input;
+	PathTarget *target;
+	List	   *group_clauses = NIL;
+	List	   *group_exprs = NIL;
+
+	/*
+	 * The lists of aggregate expressions and grouping expressions should have
+	 * been constructed.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/*
+	 * If this is a child rel, the grouped rel for its top parent must already
+	 * have been created if that was possible.  So we can just use the
+	 * parent's RelAggInfo if there is one, with variable substitutions.
+	 */
+	if (IS_OTHER_REL(rel))
+	{
+		RelOptInfo *rel_grouped;
+		RelAggInfo *agg_info;
+
+		Assert(!bms_is_empty(rel->top_parent_relids));
+		rel_grouped = find_grouped_rel(root, rel->top_parent_relids);
+
+		if (rel_grouped == NULL)
+			return NULL;
+
+		Assert(IS_GROUPED_REL(rel_grouped));
+		/* Must do multi-level transformation */
+		agg_info = (RelAggInfo *)
+			adjust_appendrel_attrs_multilevel(root,
+											  (Node *) rel_grouped->agg_info,
+											  rel,
+											  rel->top_parent);
+
+		agg_info->grouped_rows =
+			estimate_num_groups(root, agg_info->group_exprs,
+								rel->rows, NULL, NULL);
+
+		/*
+		 * The grouped paths for the given relation are considered useful iff
+		 * the row reduction ratio is at least EAGER_AGGREGATE_RATIO.
+		 */
+		agg_info->agg_useful =
+			(agg_info->grouped_rows <= rel->rows * (1 - EAGER_AGGREGATE_RATIO));
+
+		return agg_info;
+	}
+
+	/* Check if it's possible to produce grouped paths for this relation. */
+	if (!eager_aggregation_possible_for_relation(root, rel))
+		return NULL;
+
+	/*
+	 * Create targets for the grouped paths and for the input paths of the
+	 * grouped paths.
+	 */
+	target = create_empty_pathtarget();
+	agg_input = create_empty_pathtarget();
+
+	/* ... and initialize these targets */
+	if (!init_grouping_targets(root, rel, target, agg_input,
+							   &group_clauses, &group_exprs))
+		return NULL;
+
+	/*
+	 * Eager aggregation is not applicable if there are no available grouping
+	 * expressions.
+	 */
+	if (list_length(group_clauses) == 0)
+		return NULL;
+
+	/* build the RelAggInfo result */
+	result = makeNode(RelAggInfo);
+
+	result->group_clauses = group_clauses;
+	result->group_exprs = group_exprs;
+
+	/* Calculate pathkeys that represent these grouping requirements */
+	result->group_pathkeys =
+		make_pathkeys_for_sortclauses(root, result->group_clauses,
+									  make_tlist_from_pathtarget(target));
+
+	/* Add aggregates to the grouping target */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		Aggref	   *aggref;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		aggref = (Aggref *) copyObject(ac_info->aggref);
+		mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+		add_column_to_pathtarget(target, (Expr *) aggref, 0);
+	}
+
+	/* Set the estimated eval cost and output width for both targets */
+	set_pathtarget_cost_width(root, target);
+	set_pathtarget_cost_width(root, agg_input);
+
+	result->relids = bms_copy(rel->relids);
+	result->target = target;
+	result->agg_input = agg_input;
+	result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+											   rel->rows, NULL, NULL);
+
+	/*
+	 * The grouped paths for the given relation are considered useful iff the
+	 * row reduction ratio is at least EAGER_AGGREGATE_RATIO.
+	 */
+	result->agg_useful =
+		(result->grouped_rows <= rel->rows * (1 - EAGER_AGGREGATE_RATIO));
+
+	return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ *	  Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	int			cur_relid;
+
+	/*
+	 * Check to see if the given relation is on the nullable side of an outer
+	 * join.  In this case, we cannot push a partial aggregation down to the
+	 * relation, because the NULL-extended rows produced by the outer join
+	 * would not be available when we perform the partial aggregation, while
+	 * with a non-eager-aggregation plan these rows are available for the
+	 * top-level aggregation.  Doing so may result in the rows being grouped
+	 * differently than expected, or produce incorrect values from the
+	 * aggregate functions.
+	 */
+	cur_relid = -1;
+	while ((cur_relid = bms_next_member(rel->relids, cur_relid)) >= 0)
+	{
+		RelOptInfo *baserel = find_base_rel_ignore_join(root, cur_relid);
+
+		if (baserel == NULL)
+			continue;			/* ignore outer joins in rel->relids */
+
+		if (!bms_is_subset(baserel->nulling_relids, rel->relids))
+			return false;
+	}
+
+	/*
+	 * For now we don't try to support PlaceHolderVars.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = lfirst(lc);
+
+		if (IsA(expr, PlaceHolderVar))
+			return false;
+	}
+
+	/* Caller should only pass base relations or joins. */
+	Assert(rel->reloptkind == RELOPT_BASEREL ||
+		   rel->reloptkind == RELOPT_JOINREL);
+
+	/*
+	 * Check if all aggregate expressions can be evaluated on this relation
+	 * level.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		/*
+		 * Give up if any aggregate needs relations other than the current
+		 * one.
+		 *
+		 * If the aggregate needs the current rel plus anything else, grouping
+		 * the current rel could make some input variables unavailable for the
+		 * higher aggregate and also reduce the number of input rows it
+		 * receives.
+		 *
+		 * If the aggregate does not need the current rel at all, then the
+		 * current rel should not be grouped, as we do not support joining two
+		 * grouped relations.
+		 */
+		if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * init_grouping_targets
+ *	  Initialize the target for grouped paths (target) as well as the target
+ *	  for paths that generate input for the grouped paths (agg_input).
+ *
+ * We also construct the list of SortGroupClauses and the list of grouping
+ * expressions for the partial aggregation, and return them in *group_clauses
+ * and *group_exprs.
+ *
+ * Return true if the targets could be initialized, false otherwise.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+					  PathTarget *target, PathTarget *agg_input,
+					  List **group_clauses, List **group_exprs)
+{
+	ListCell   *lc;
+	List	   *possibly_dependent = NIL;
+	Index		maxSortGroupRef;
+
+	/* Identify the max sortgroupref */
+	maxSortGroupRef = 0;
+	foreach(lc, root->processed_tlist)
+	{
+		Index		ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+		if (ref > maxSortGroupRef)
+			maxSortGroupRef = ref;
+	}
+
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Index		sortgroupref;
+
+		/*
+		 * Given that PlaceHolderVar currently prevents us from doing eager
+		 * aggregation, the source target cannot contain anything more complex
+		 * than a Var.
+		 */
+		Assert(IsA(expr, Var));
+
+		/* Get the sortgroupref if the expr can act as grouping expression. */
+		sortgroupref = get_expression_sortgroupref(root, expr);
+		if (sortgroupref > 0)
+		{
+			SortGroupClause *sgc;
+
+			/* Find the matching SortGroupClause */
+			sgc = get_sortgroupref_clause(sortgroupref, root->processed_groupClause);
+			Assert(sgc->tleSortGroupRef <= maxSortGroupRef);
+
+			/*
+			 * If the target expression can be used as a grouping key, it
+			 * should be emitted by the grouped paths that have been pushed
+			 * down to this relation level.
+			 */
+			add_column_to_pathtarget(target, expr, sortgroupref);
+
+			/*
+			 * ... and it also should be emitted by the input paths.
+			 */
+			add_column_to_pathtarget(agg_input, expr, sortgroupref);
+
+			/*
+			 * Record this SortGroupClause and grouping expression.  Note that
+			 * this SortGroupClause might have already been recorded.
+			 */
+			if (!list_member(*group_clauses, sgc))
+			{
+				*group_clauses = lappend(*group_clauses, sgc);
+				*group_exprs = lappend(*group_exprs, expr);
+			}
+		}
+		else if (is_var_needed_by_join(root, (Var *) expr, rel))
+		{
+			/*
+			 * The expression is needed for an upper join but is neither in
+			 * the GROUP BY clause nor derivable from it using EC (otherwise,
+			 * it would have already been included in the targets above).  We
+			 * need to create a special SortGroupClause for this expression.
+			 */
+			SortGroupClause *sgc;
+			TypeCacheEntry *tce;
+			Oid			equalimageproc;
+
+			/*
+			 * But first, check if equality implies image equality for this
+			 * expression.  If not, we cannot use it as a grouping key.  See
+			 * comments in create_grouping_expr_infos().
+			 */
+			tce = lookup_type_cache(exprType((Node *) expr),
+									TYPECACHE_BTREE_OPFAMILY);
+			if (!OidIsValid(tce->btree_opf) ||
+				!OidIsValid(tce->btree_opintype))
+				return false;
+
+			equalimageproc = get_opfamily_proc(tce->btree_opf,
+											   tce->btree_opintype,
+											   tce->btree_opintype,
+											   BTEQUALIMAGE_PROC);
+			if (!OidIsValid(equalimageproc) ||
+				!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+												   tce->typcollation,
+												   ObjectIdGetDatum(tce->btree_opintype))))
+				return false;
+
+			/* Create the SortGroupClause. */
+			sgc = makeNode(SortGroupClause);
+
+			/* Initialize the SortGroupClause. */
+			sgc->tleSortGroupRef = ++maxSortGroupRef;
+			get_sort_group_operators(exprType((Node *) expr),
+									 false, true, false,
+									 &sgc->sortop, &sgc->eqop, NULL,
+									 &sgc->hashable);
+
+			/* This expression should be emitted by the grouped paths */
+			add_column_to_pathtarget(target, expr, sgc->tleSortGroupRef);
+
+			/* ... and it also should be emitted by the input paths. */
+			add_column_to_pathtarget(agg_input, expr, sgc->tleSortGroupRef);
+
+			/* Record this SortGroupClause and grouping expression */
+			*group_clauses = lappend(*group_clauses, sgc);
+			*group_exprs = lappend(*group_exprs, expr);
+		}
+		else if (is_var_in_aggref_only(root, (Var *) expr))
+		{
+			/*
+			 * The expression is referenced by an aggregate function pushed
+			 * down to this relation and does not appear elsewhere in the
+			 * targetlist or havingQual.  Add it to 'agg_input' but not to
+			 * 'target'.
+			 */
+			add_new_column_to_pathtarget(agg_input, expr);
+		}
+		else
+		{
+			/*
+			 * The expression may be functionally dependent on other
+			 * expressions in the target, but we cannot verify this until all
+			 * target expressions have been constructed.
+			 */
+			possibly_dependent = lappend(possibly_dependent, expr);
+		}
+	}
+
+	/*
+	 * Now we can verify whether an expression is functionally dependent on
+	 * others.
+	 */
+	foreach(lc, possibly_dependent)
+	{
+		Var		   *tvar;
+		List	   *deps = NIL;
+		RangeTblEntry *rte;
+
+		tvar = lfirst_node(Var, lc);
+		rte = root->simple_rte_array[tvar->varno];
+
+		if (check_functional_grouping(rte->relid, tvar->varno,
+									  tvar->varlevelsup,
+									  target->exprs, &deps))
+		{
+			/*
+			 * The expression is functionally dependent on other target
+			 * expressions, so it can be included in the targets.  Since it
+			 * will not be used as a grouping key, a sortgroupref is not
+			 * needed for it.
+			 */
+			add_new_column_to_pathtarget(target, (Expr *) tvar);
+			add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+		}
+		else
+		{
+			/*
+			 * We may arrive here with a grouping expression that is proven
+			 * redundant by EquivalenceClass processing, such as 't1.a' in the
+			 * query below.
+			 *
+			 * select max(t1.c) from t t1, t t2 where t1.a = 1 group by t1.a,
+			 * t1.b;
+			 *
+			 * For now we just give up in this case.
+			 */
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ *	  Check whether the given Var appears in aggregate expressions and not
+ *	  elsewhere in the targetlist or havingQual.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+	ListCell   *lc;
+
+	/*
+	 * Search the list of aggregate expressions for the Var.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		List	   *vars;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+			continue;
+
+		vars = pull_var_clause((Node *) ac_info->aggref,
+							   PVC_RECURSE_AGGREGATES |
+							   PVC_RECURSE_WINDOWFUNCS |
+							   PVC_RECURSE_PLACEHOLDERS);
+
+		if (list_member(vars, var))
+		{
+			list_free(vars);
+			break;
+		}
+
+		list_free(vars);
+	}
+
+	return (lc != NULL && !list_member(root->tlist_vars, var));
+}
+
+/*
+ * is_var_needed_by_join
+ *	  Check if the given Var is needed by joins above the current rel.
+ *
+ * Consider pushing the aggregate avg(b.y) down to relation b for the following
+ * query:
+ *
+ *    SELECT a.i, avg(b.y)
+ *    FROM a JOIN b ON a.j = b.j
+ *    GROUP BY a.i;
+ *
+ * Column b.j must be used as a grouping key; otherwise the partial aggregate
+ * over b would not output it for use by the join clause a.j = b.j.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+	Relids		relids;
+	int			attno;
+	RelOptInfo *baserel;
+
+	/*
+	 * When checking whether the Var is needed by joins above, exclude the
+	 * case where it is needed only in the final output.  attr_needed marks
+	 * that case with "relation 0", so add 0 to the relids we subtract.
+	 */
+	relids = bms_copy(rel->relids);
+	relids = bms_add_member(relids, 0);
+
+	baserel = find_base_rel(root, var->varno);
+	attno = var->varattno - baserel->min_attr;
+
+	return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ *	  Return sortgroupref if the given 'expr' can be used as a grouping key in
+ *	  grouped paths for base or join relations, or 0 otherwise.
+ *
+ * We first check if 'expr' is among the grouping expressions.  If it is not,
+ * we then check if 'expr' is known equal to any of the grouping expressions
+ * due to equivalence relationships.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->group_expr_list)
+	{
+		GroupExprInfo *ge_info = lfirst_node(GroupExprInfo, lc);
+
+		Assert(IsA(ge_info->expr, Var));
+
+		if (equal(ge_info->expr, expr) ||
+			exprs_known_equal(root, (Node *) expr, (Node *) ge_info->expr,
+							  ge_info->btree_opfamily))
+		{
+			Assert(ge_info->sortgroupref > 0);
+
+			return ge_info->sortgroupref;
+		}
+	}
+
+	/* The expression cannot be used as a grouping key. */
+	return 0;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8cf1afbad2..95bd80c4dd 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -929,6 +929,16 @@ struct config_bool ConfigureNamesBool[] =
 		false,
 		NULL, NULL, NULL
 	},
+	{
+		{"enable_eager_aggregate", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables eager aggregation."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&enable_eager_aggregate,
+		false,
+		NULL, NULL, NULL
+	},
 	{
 		{"enable_parallel_append", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables the planner's use of parallel append plans."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a2ac7575ca..154fc5b1fa 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -416,6 +416,7 @@
 #enable_tidscan = on
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
+#enable_eager_aggregate = off
 
 # - Planner Cost Constants -
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 0759e00e96..6a0572d9c7 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -80,6 +80,25 @@ typedef enum UpperRelationKind
 	/* NB: UPPERREL_FINAL must be last enum entry; it's used to size arrays */
 } UpperRelationKind;
 
+/*
+ * A structure consisting of a list and a hash table to store relations.
+ *
+ * For small problems we just scan the list to do lookups, but when there are
+ * many relations we build a hash table for faster lookups.  The hash table is
+ * present and valid when 'hash' is not NULL.  Note that we still maintain the
+ * list even when using the hash table for lookups; this simplifies life for
+ * GEQO.
+ */
+typedef struct RelInfoList
+{
+	pg_node_attr(no_copy_equal, no_read)
+
+	NodeTag		type;
+
+	List	   *items;
+	struct HTAB *hash pg_node_attr(read_write_ignore);
+} RelInfoList;
+
 /*----------
  * PlannerGlobal
  *		Global information for planning/optimization
@@ -270,15 +289,16 @@ struct PlannerInfo
 
 	/*
 	 * join_rel_list is a list of all join-relation RelOptInfos we have
-	 * considered in this planning run.  For small problems we just scan the
-	 * list to do lookups, but when there are many join relations we build a
-	 * hash table for faster lookups.  The hash table is present and valid
-	 * when join_rel_hash is not NULL.  Note that we still maintain the list
-	 * even when using the hash table for lookups; this simplifies life for
-	 * GEQO.
+	 * considered in this planning run.
 	 */
-	List	   *join_rel_list;
-	struct HTAB *join_rel_hash pg_node_attr(read_write_ignore);
+	RelInfoList *join_rel_list; /* list of join-relation RelOptInfos */
+
+	/*
+	 * grouped_rel_list is a list of all grouped-relation RelOptInfos we have
+	 * considered in this planning run.  This is only used by eager
+	 * aggregation.
+	 */
+	RelInfoList *grouped_rel_list;	/* list of grouped-relation RelOptInfos */
 
 	/*
 	 * When doing a dynamic-programming-style join search, join_rel_level[k]
@@ -373,6 +393,15 @@ struct PlannerInfo
 	/* list of PlaceHolderInfos */
 	List	   *placeholder_list;
 
+	/* list of AggClauseInfos */
+	List	   *agg_clause_list;
+
+	/* list of GroupExprInfos */
+	List	   *group_expr_list;
+
+	/* list of plain Vars contained in targetlist and havingQual */
+	List	   *tlist_vars;
+
 	/* array of PlaceHolderInfos indexed by phid */
 	struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
 	/* allocated size of array */
@@ -998,6 +1027,12 @@ typedef struct RelOptInfo
 	/* consider partitionwise join paths? (if partitioned rel) */
 	bool		consider_partitionwise_join;
 
+	/*
+	 * used by eager aggregation:
+	 */
+	/* information needed to create grouped paths */
+	struct RelAggInfo *agg_info;
+
 	/*
 	 * inheritance links, if this is an otherrel (otherwise NULL):
 	 */
@@ -1071,6 +1106,68 @@ typedef struct RelOptInfo
 	((rel)->part_scheme && (rel)->boundinfo && (rel)->nparts > 0 && \
 	 (rel)->part_rels && (rel)->partexprs && (rel)->nullable_partexprs)
 
+/*
+ * Is the given relation a grouped relation?
+ */
+#define IS_GROUPED_REL(rel) \
+	((rel)->agg_info != NULL)
+
+/*
+ * RelAggInfo
+ *		Information needed to create grouped paths for base and join rels.
+ *
+ * "relids" is the set of relation identifiers (RT indexes).
+ *
+ * "target" is the output tlist for the grouped paths.
+ *
+ * "agg_input" is the output tlist for the paths that provide input to the
+ * grouped paths.  One difference from the reltarget of the non-grouped
+ * relation is that agg_input has its sortgrouprefs[] initialized.
+ *
+ * "grouped_rows" is the estimated number of result tuples of the grouped
+ * relation.
+ *
+ * "group_clauses", "group_exprs" and "group_pathkeys" are lists of
+ * SortGroupClauses, the corresponding grouping expressions and PathKeys
+ * respectively.
+ *
+ * "agg_useful" is a flag to indicate whether the grouped paths are considered
+ * useful.
+ */
+typedef struct RelAggInfo
+{
+	pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* set of base + OJ relids (rangetable indexes) */
+	Relids		relids;
+
+	/*
+	 * default result targetlist for Paths scanning this grouped relation;
+	 * list of Vars/Exprs, cost, width
+	 */
+	struct PathTarget *target;
+
+	/*
+	 * the targetlist for Paths that provide input to the grouped paths
+	 */
+	struct PathTarget *agg_input;
+
+	/* estimated number of result tuples */
+	Cardinality grouped_rows;
+
+	/* a list of SortGroupClauses */
+	List	   *group_clauses;
+	/* a list of grouping expressions */
+	List	   *group_exprs;
+	/* a list of PathKeys */
+	List	   *group_pathkeys;
+
+	/* the grouped paths are considered useful? */
+	bool		agg_useful;
+} RelAggInfo;
+
 /*
  * IndexOptInfo
  *		Per-index information for planning/optimization
@@ -3145,6 +3242,41 @@ typedef struct MinMaxAggInfo
 	Param	   *param;
 } MinMaxAggInfo;
 
+/*
+ * The aggregate expressions that appear in targetlist and having clauses
+ */
+typedef struct AggClauseInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the Aggref expr */
+	Aggref	   *aggref;
+
+	/* lowest level we can evaluate this aggregate at */
+	Relids		agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * The grouping expressions that appear in grouping clauses
+ */
+typedef struct GroupExprInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the represented expression */
+	Expr	   *expr;
+
+	/* the tleSortGroupRef of the corresponding SortGroupClause */
+	Index		sortgroupref;
+
+	/* btree opfamily defining the ordering */
+	Oid			btree_opfamily;
+} GroupExprInfo;
+
 /*
  * At runtime, PARAM_EXEC slots are used to pass values around from one plan
  * node to another.  They can be used to pass values down into subqueries (for
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 1035e6560c..d3c05a61ba 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -314,10 +314,16 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
 									RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
+											RelOptInfo *rel_plain);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
+									 RelOptInfo *rel_plain);
 extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
 extern RelOptInfo *find_join_rel(PlannerInfo *root, Relids relids);
+extern void add_grouped_rel(PlannerInfo *root, RelOptInfo *rel);
+extern RelOptInfo *find_grouped_rel(PlannerInfo *root, Relids relids);
 extern RelOptInfo *build_join_rel(PlannerInfo *root,
 								  Relids joinrelids,
 								  RelOptInfo *outer_rel,
@@ -353,4 +359,5 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										SpecialJoinInfo *sjinfo,
 										int nappinfos, AppendRelInfo **appinfos);
 
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
 #endif							/* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 54869d4401..a189b7f18c 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,6 +21,7 @@
  * allpaths.c
  */
 extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
 extern PGDLLIMPORT int geqo_threshold;
 extern PGDLLIMPORT int min_parallel_table_scan_size;
 extern PGDLLIMPORT int min_parallel_index_scan_size;
@@ -57,6 +58,10 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 								  bool override_rows);
 extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 										 bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root,
+								   RelOptInfo *rel_grouped,
+								   RelOptInfo *rel_plain,
+								   RelAggInfo *agg_info);
 extern int	compute_parallel_worker(RelOptInfo *rel, double heap_pages,
 									double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 0b6f0f7969..49614dbd75 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -75,6 +75,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
 									Relids where_needed);
 extern void remove_useless_groupby_columns(PlannerInfo *root);
+extern void setup_eager_aggregation(PlannerInfo *root);
 extern void find_lateral_references(PlannerInfo *root);
 extern void rebuild_lateral_attr_needed(PlannerInfo *root);
 extern void create_lateral_join_info(PlannerInfo *root);
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 0000000000..9f63472eff
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1308 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b
+                                 Sort Key: t2.b
+                                 ->  Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Hash Join
+                                 Output: t2.c, t2.b, t3.c
+                                 Hash Cond: (t3.a = t2.a)
+                                 ->  Seq Scan on public.eager_agg_t3 t3
+                                       Output: t3.a, t3.b, t3.c
+                                 ->  Hash
+                                       Output: t2.c, t2.b, t2.a
+                                       ->  Seq Scan on public.eager_agg_t2 t2
+                                             Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b, t3.c
+                                 Sort Key: t2.b
+                                 ->  Hash Join
+                                       Output: t2.c, t2.b, t3.c
+                                       Hash Cond: (t3.a = t2.a)
+                                       ->  Seq Scan on public.eager_agg_t3 t3
+                                             Output: t3.a, t3.b, t3.c
+                                       ->  Hash
+                                             Output: t2.c, t2.b, t2.a
+                                             ->  Seq Scan on public.eager_agg_t2 t2
+                                                   Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Right Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Sort
+   Output: t2.b, (avg(t2.c))
+   Sort Key: t2.b
+   ->  HashAggregate
+         Output: t2.b, avg(t2.c)
+         Group Key: t2.b
+         ->  Hash Right Join
+               Output: t2.b, t2.c
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on public.eager_agg_t2 t2
+                     Output: t2.a, t2.b, t2.c
+               ->  Hash
+                     Output: t1.b
+                     ->  Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ b | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   |    
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Gather Merge
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Workers Planned: 2
+         ->  Sort
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Sort Key: t1.a
+               ->  Parallel Hash Join
+                     Output: t1.a, (PARTIAL avg(t2.c))
+                     Hash Cond: (t1.b = t2.b)
+                     ->  Parallel Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.a, t1.b, t1.c
+                     ->  Parallel Hash
+                           Output: t2.b, (PARTIAL avg(t2.c))
+                           ->  Partial HashAggregate
+                                 Output: t2.b, PARTIAL avg(t2.c)
+                                 Group Key: t2.b
+                                 ->  Parallel Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (20) TO (30);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (20) TO (30);
+INSERT INTO eager_agg_tab1 SELECT i % 30, i % 20 FROM generate_series(0, 299, 2) i;
+INSERT INTO eager_agg_tab2 SELECT i % 20, i % 30 FROM generate_series(0, 299, 3) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x  | sum  | count 
+----+------+-------
+  0 |  500 |   100
+  6 | 1100 |   100
+ 12 |  700 |   100
+ 18 | 1300 |   100
+ 24 |  900 |   100
+(5 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t2.y, (sum(t1.y)), (count(*))
+   Sort Key: t2.y
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t2.y, sum(t1.y), count(*)
+               Group Key: t2.y
+               ->  Hash Join
+                     Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.y, t1.x
+         ->  Finalize HashAggregate
+               Output: t2_1.y, sum(t1_1.y), count(*)
+               Group Key: t2_1.y
+               ->  Hash Join
+                     Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.y, t1_1.x
+         ->  Finalize HashAggregate
+               Output: t2_2.y, sum(t1_2.y), count(*)
+               Group Key: t2_2.y
+               ->  Hash Join
+                     Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ y  | sum  | count 
+----+------+-------
+  0 |  500 |   100
+  6 | 1100 |   100
+ 12 |  700 |   100
+ 18 | 1300 |   100
+ 24 |  900 |   100
+(5 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+                                                 QUERY PLAN                                                 
+------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t2.x, (sum(t1.x)), (count(*))
+   Sort Key: t2.x
+   ->  Finalize HashAggregate
+         Output: t2.x, sum(t1.x), count(*)
+         Group Key: t2.x
+         Filter: (avg(t1.x) > '10'::numeric)
+         ->  Append
+               ->  Hash Join
+                     Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2_1
+                           Output: t2_1.x, t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1_1
+                                       Output: t1_1.x
+               ->  Hash Join
+                     Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_2
+                           Output: t2_2.x, t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_2
+                                       Output: t1_2.x
+               ->  Hash Join
+                     Output: t2_3.x, (PARTIAL sum(t1_3.x)), (PARTIAL count(*)), (PARTIAL avg(t1_3.x))
+                     Hash Cond: (t2_3.y = t1_3.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_3
+                           Output: t2_3.x, t2_3.y
+                     ->  Hash
+                           Output: t1_3.x, (PARTIAL sum(t1_3.x)), (PARTIAL count(*)), (PARTIAL avg(t1_3.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_3.x, PARTIAL sum(t1_3.x), PARTIAL count(*), PARTIAL avg(t1_3.x)
+                                 Group Key: t1_3.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_3
+                                       Output: t1_3.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+ x  | sum  | count 
+----+------+-------
+  2 |  600 |    50
+  4 | 1200 |    50
+  8 |  900 |    50
+ 12 |  600 |    50
+ 14 | 1200 |    50
+ 18 |  900 |    50
+(6 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y)))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y))
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y))
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y))
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  
+----+-------
+  0 | 10000
+  2 | 14000
+  4 | 18000
+  6 | 22000
+  8 | 26000
+ 10 | 10000
+ 12 | 14000
+ 14 | 18000
+ 16 | 22000
+ 18 | 26000
+ 20 | 10000
+ 22 | 14000
+ 24 | 18000
+ 26 | 22000
+ 28 | 26000
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t3.y, sum((t2.y + t3.y))
+   Group Key: t3.y
+   ->  Sort
+         Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+         Sort Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t2_1.x = t1_1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                           Group Key: t2_1.x, t3_1.y, t3_1.x
+                           ->  Incremental Sort
+                                 Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                 Sort Key: t2_1.x, t3_1.y
+                                 Presorted Key: t2_1.x
+                                 ->  Merge Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Merge Cond: (t2_1.x = t3_1.x)
+                                       ->  Sort
+                                             Output: t2_1.y, t2_1.x
+                                             Sort Key: t2_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t2_1
+                                                   Output: t2_1.y, t2_1.x
+                                       ->  Sort
+                                             Output: t3_1.y, t3_1.x
+                                             Sort Key: t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+                     ->  Hash
+                           Output: t1_1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p1 t1_1
+                                 Output: t1_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t2_2.x = t1_2.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                           Group Key: t2_2.x, t3_2.y, t3_2.x
+                           ->  Incremental Sort
+                                 Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                 Sort Key: t2_2.x, t3_2.y
+                                 Presorted Key: t2_2.x
+                                 ->  Merge Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Merge Cond: (t2_2.x = t3_2.x)
+                                       ->  Sort
+                                             Output: t2_2.y, t2_2.x
+                                             Sort Key: t2_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t2_2
+                                                   Output: t2_2.y, t2_2.x
+                                       ->  Sort
+                                             Output: t3_2.y, t3_2.x
+                                             Sort Key: t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+                     ->  Hash
+                           Output: t1_2.x
+                           ->  Seq Scan on public.eager_agg_tab1_p2 t1_2
+                                 Output: t1_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y)))
+                     Hash Cond: (t2_3.x = t1_3.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y))
+                           Group Key: t2_3.x, t3_3.y, t3_3.x
+                           ->  Incremental Sort
+                                 Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                 Sort Key: t2_3.x, t3_3.y
+                                 Presorted Key: t2_3.x
+                                 ->  Merge Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Merge Cond: (t2_3.x = t3_3.x)
+                                       ->  Sort
+                                             Output: t2_3.y, t2_3.x
+                                             Sort Key: t2_3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t2_3
+                                                   Output: t2_3.y, t2_3.x
+                                       ->  Sort
+                                             Output: t3_3.y, t3_3.x
+                                             Sort Key: t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_3
+                                                   Output: t3_3.y, t3_3.x
+                     ->  Hash
+                           Output: t1_3.x
+                           ->  Seq Scan on public.eager_agg_tab1_p3 t1_3
+                                 Output: t1_3.x
+(88 rows)
+
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |  sum  
+----+-------
+  0 |  7500
+  2 | 13500
+  4 | 19500
+  6 | 25500
+  8 | 31500
+ 10 | 22500
+ 12 | 28500
+ 14 | 34500
+ 16 | 40500
+ 18 | 46500
+(10 rows)
+
+RESET enable_hashagg;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.y, (sum(t2.y)), (count(*))
+   Sort Key: t1.y
+   ->  Finalize HashAggregate
+         Output: t1.y, sum(t2.y), count(*)
+         Group Key: t1.y
+         ->  Append
+               ->  Hash Join
+                     Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1_1
+                           Output: t1_1.y, t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2_1
+                                       Output: t2_1.y, t2_1.x
+               ->  Hash Join
+                     Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_2
+                           Output: t1_2.y, t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_2
+                                       Output: t2_2.y, t2_2.x
+               ->  Hash Join
+                     Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_3
+                           Output: t1_3.y, t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_3
+                                       Output: t2_3.y, t2_3.x
+               ->  Hash Join
+                     Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_4
+                           Output: t1_4.y, t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_4
+                                       Output: t2_4.y, t2_4.x
+               ->  Hash Join
+                     Output: t1_5.y, (PARTIAL sum(t2_5.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_5.x = t2_5.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_5
+                           Output: t1_5.y, t1_5.x
+                     ->  Hash
+                           Output: t2_5.x, (PARTIAL sum(t2_5.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_5.x, PARTIAL sum(t2_5.y), PARTIAL count(*)
+                                 Group Key: t2_5.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_5
+                                       Output: t2_5.y, t2_5.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                                QUERY PLAN                                                
+----------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y)), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t3.y
+   ->  Finalize HashAggregate
+         Output: t3.y, sum((t2.y + t3.y)), count(*)
+         Group Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.y, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x, t3_1.y, t3_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.y, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x, t3_2.y, t3_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_2
+                                                   Output: t3_2.y, t3_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.y, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x, t3_3.y, t3_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_3
+                                                   Output: t3_3.y, t3_3.x
+               ->  Hash Join
+                     Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.y, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.y, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x, t3_4.y, t3_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_4
+                                                   Output: t3_4.y, t3_4.x
+               ->  Hash Join
+                     Output: t3_5.y, (PARTIAL sum((t2_5.y + t3_5.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_5.x = t2_5.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_5
+                           Output: t1_5.x
+                     ->  Hash
+                           Output: t2_5.x, t3_5.y, t3_5.x, (PARTIAL sum((t2_5.y + t3_5.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_5.x, t3_5.y, t3_5.x, PARTIAL sum((t2_5.y + t3_5.y)), PARTIAL count(*)
+                                 Group Key: t2_5.x, t3_5.y, t3_5.x
+                                 ->  Hash Join
+                                       Output: t2_5.y, t2_5.x, t3_5.y, t3_5.x
+                                       Hash Cond: (t2_5.x = t3_5.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_5
+                                             Output: t2_5.y, t2_5.x
+                                       ->  Hash
+                                             Output: t3_5.y, t3_5.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_5
+                                                   Output: t3_5.y, t3_5.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 91089ac215..6370504377 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -151,6 +151,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_async_append            | on
  enable_bitmapscan              | on
  enable_distinct_reordering     | on
+ enable_eager_aggregate         | off
  enable_gathermerge             | on
  enable_group_by_reordering     | on
  enable_hashagg                 | on
@@ -171,7 +172,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(23 rows)
+(24 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 1edd9e45eb..4fc210e2ef 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -119,7 +119,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # The stats test resets stats, so nothing else needing stats access can be in
 # this group.
 # ----------
-test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats predicate
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats predicate eager_aggregate
 
 # event_trigger depends on create_am and cannot run concurrently with
 # any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 0000000000..4050e4df44
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,192 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (20) TO (30);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (20) TO (30);
+INSERT INTO eager_agg_tab1 SELECT i % 30, i % 20 FROM generate_series(0, 299, 2) i;
+INSERT INTO eager_agg_tab2 SELECT i % 20, i % 30 FROM generate_series(0, 299, 3) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+RESET enable_hashagg;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index ce33e55bf1..ddd669b467 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -41,6 +41,7 @@ AfterTriggersTableData
 AfterTriggersTransData
 Agg
 AggClauseCosts
+AggClauseInfo
 AggInfo
 AggPath
 AggSplit
@@ -1064,6 +1065,7 @@ GrantTargetType
 Group
 GroupByOrdering
 GroupClause
+GroupExprInfo
 GroupPath
 GroupPathExtraData
 GroupResultPath
@@ -1296,7 +1298,6 @@ Join
 JoinCostWorkspace
 JoinDomain
 JoinExpr
-JoinHashEntry
 JoinPath
 JoinPathExtraData
 JoinState
@@ -2379,13 +2380,17 @@ ReindexObjectType
 ReindexParams
 ReindexStmt
 ReindexType
+RelAggInfo
 RelFileLocator
 RelFileLocatorBackend
 RelFileNumber
+RelHashEntry
 RelIdCacheEnt
 RelIdToTypeIdCacheEntry
 RelInfo
 RelInfoArr
+RelInfoList
+RelInfoListInfo
 RelMapFile
 RelMapping
 RelOptInfo
-- 
2.43.0




* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2024-12-21 01:05 ` Richard Guo <[email protected]>
  2025-01-09 03:15   ` Re: Eager aggregation, take 3 jian he <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 2 replies; 70+ messages in thread

From: Richard Guo @ 2024-12-21 01:05 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Tue, Dec 17, 2024 at 12:42 PM Richard Guo <[email protected]> wrote:
> Attached is the patch rebased on the latest master.  It refines the
> theoretical justification for the correctness of this transformation
> in README and commit message.  It also adds the check for image
> equality for all grouping keys used in partial aggregation, and fixes
> the issue reported by Jian.  It does not yet handle the RLS case
> though.

I've looked at the RLS case.  As far as I understand, we want to
prevent any non-leakproof aggregate functions from being pushed down
past securityQuals.  I added a check in create_agg_clause_infos to
ensure that no aggregation is pushed down if securityQuals are present
along with any non-leakproof aggregate functions.  I know this might
be overly strict, but for now I want to focus on the eager aggregation
transformation itself.  We can relax this restriction in subsequent
patches, once this already large one is in.
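
To make the rule concrete, here is a minimal standalone sketch of the
decision described above.  It is not the code from the patch:
ToyAggregate, toy_eager_agg_allowed and the leakproof flag are
hypothetical names invented for illustration, and the real check in
create_agg_clause_infos works on planner structures that are not
modeled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for one aggregate call found in the query. */
typedef struct ToyAggregate
{
	const char *name;		/* illustrative label only */
	bool		leakproof;	/* assumed to mirror the function's leakproof marking */
} ToyAggregate;

/*
 * Decide whether partial aggregation may be pushed below a join.
 *
 * Deliberately strict, as in the description above: if any security
 * quals are present and at least one aggregate is not leakproof, no
 * aggregation is pushed down at all.
 */
bool
toy_eager_agg_allowed(bool has_security_quals,
					  const ToyAggregate *aggs, size_t naggs)
{
	assert(aggs != NULL || naggs == 0);

	/* Without security quals there is nothing to leak past. */
	if (!has_security_quals)
		return true;

	/* With security quals, a single non-leakproof aggregate vetoes push-down. */
	for (size_t i = 0; i < naggs; i++)
	{
		if (!aggs[i].leakproof)
			return false;
	}

	return true;
}
```

The all-or-nothing return value reflects the "overly strict" choice:
a finer-grained version could instead decide per aggregate or per
relation, which is the kind of relaxation left for later patches.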

Attached is the latest patch, which also includes some cosmetic
tweaks.  I am hoping to push this by the end of January, so that I
have enough time to react to any bugs before the feature freeze.

Thanks
Richard


Attachments:

  [application/octet-stream] v15-0001-Implement-Eager-Aggregation.patch (176.8K, 2-v15-0001-Implement-Eager-Aggregation.patch)
From 12f11079c46ee5d7ec9a285bb0d667fd461703ed Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Tue, 11 Jun 2024 15:59:19 +0900
Subject: [PATCH v15] Implement Eager Aggregation

Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.

A plan with eager aggregation looks like:

 EXPLAIN (COSTS OFF)
 SELECT a.i, avg(b.y)
 FROM a JOIN b ON a.i = b.j
 GROUP BY a.i;

 Finalize HashAggregate
   Group Key: a.i
   ->  Nested Loop
         ->  Partial HashAggregate
               Group Key: b.j
               ->  Seq Scan on b
         ->  Index Only Scan using a_pkey on a
               Index Cond: (i = b.j)

During the construction of the join tree, we evaluate each base or
join relation to determine if eager aggregation can be applied.  If
feasible, we create a separate RelOptInfo called a "grouped relation"
and store it in a dedicated list.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths during this phase.

Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
does not seem to be very useful and is currently not supported.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
'destiny', which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is on the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

Since eager aggregation can generate many grouped relations, we
introduce a RelInfoList structure, which encapsulates both a list and
a hash table, so that we can leverage the hash table for faster
lookups not only for join relations but also for grouped relations.

Eager aggregation can use significantly more CPU time and memory than
regular planning when the query involves aggregates and many joining
relations.  However, in some cases, the resulting plan can be much
better, justifying the additional planning effort.  All the same, for
now, turn this feature off by default.
---
 contrib/postgres_fdw/postgres_fdw.c           |    3 +-
 doc/src/sgml/config.sgml                      |   15 +
 src/backend/optimizer/README                  |   80 +
 src/backend/optimizer/geqo/geqo_eval.c        |   98 +-
 src/backend/optimizer/path/allpaths.c         |  455 +++++-
 src/backend/optimizer/path/costsize.c         |   95 +-
 src/backend/optimizer/path/joinrels.c         |  141 ++
 src/backend/optimizer/plan/initsplan.c        |  273 ++++
 src/backend/optimizer/plan/planmain.c         |   17 +-
 src/backend/optimizer/plan/planner.c          |   99 +-
 src/backend/optimizer/util/appendinfo.c       |   60 +
 src/backend/optimizer/util/pathnode.c         |   47 +-
 src/backend/optimizer/util/relnode.c          |  758 +++++++++-
 src/backend/utils/misc/guc_tables.c           |   10 +
 src/backend/utils/misc/postgresql.conf.sample |    1 +
 src/include/nodes/pathnodes.h                 |  148 +-
 src/include/optimizer/pathnode.h              |    7 +
 src/include/optimizer/paths.h                 |    5 +
 src/include/optimizer/planmain.h              |    1 +
 src/test/regress/expected/eager_aggregate.out | 1308 +++++++++++++++++
 src/test/regress/expected/sysviews.out        |    3 +-
 src/test/regress/parallel_schedule            |    2 +-
 src/test/regress/sql/eager_aggregate.sql      |  192 +++
 src/tools/pgindent/typedefs.list              |    7 +-
 24 files changed, 3667 insertions(+), 158 deletions(-)
 create mode 100644 src/test/regress/expected/eager_aggregate.out
 create mode 100644 src/test/regress/sql/eager_aggregate.sql

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index cf56434118..7bb36a52d1 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -6089,7 +6089,8 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 	 */
 	Assert(fpinfo->relation_index == 0);	/* shouldn't be set yet */
 	fpinfo->relation_index =
-		list_length(root->parse->rtable) + list_length(root->join_rel_list);
+		list_length(root->parse->rtable) +
+		list_length(root->join_rel_list->items);
 
 	return true;
 }
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index fbdd6ce574..3d78a5875f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5382,6 +5382,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-eager-aggregate" xreflabel="enable_eager_aggregate">
+      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_eager_aggregate</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's ability to partially push
+        aggregation past a join, and finalize it once all the relations are
+        joined. The default is <literal>off</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
       <term><varname>enable_gathermerge</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index f341d9f303..45236ca46b 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1497,3 +1497,83 @@ breaking down aggregation or grouping over a partitioned relation into
 aggregation or grouping over its partitions is called partitionwise
 aggregation.  Especially when the partition keys match the GROUP BY clause,
 this can be significantly faster than the regular method.
+
+Eager aggregation
+-----------------
+
+Eager aggregation is a query optimization technique that partially pushes
+aggregation past a join, and finalizes it once all the relations are joined.
+Eager aggregation may reduce the number of input rows to the join and thus
+could result in a better overall plan.
+
+For example:
+
+ EXPLAIN (COSTS OFF)
+ SELECT a.i, avg(b.y)
+ FROM a JOIN b ON a.i = b.j
+ GROUP BY a.i;
+
+ Finalize HashAggregate
+   Group Key: a.i
+   ->  Nested Loop
+         ->  Partial HashAggregate
+               Group Key: b.j
+               ->  Seq Scan on b
+         ->  Index Only Scan using a_pkey on a
+               Index Cond: (i = b.j)
+
+If the partial aggregation on table B significantly reduces the number of
+input rows, the join above will be much cheaper, leading to a more efficient
+final plan.
+
+For the partial aggregation that is pushed down to a non-aggregated relation,
+we need to consider all expressions from this relation that are involved in
+upper join clauses and include them in the grouping keys, using compatible
+operators.  This is essential to ensure that an aggregated row from the partial
+aggregation matches the other side of the join if and only if each row in the
+partial group does.  This ensures that all rows within the same partial group
+share the same 'destiny', which is crucial for maintaining correctness.
+
+One restriction is that we cannot push partial aggregation down to a relation
+that is on the nullable side of an outer join, because the NULL-extended rows
+produced by the outer join would not be available when we perform the partial
+aggregation, while with a non-eager-aggregation plan these rows are available
+for the top-level aggregation.  Pushing partial aggregation in this case may
+result in the rows being grouped differently than expected, or produce
+incorrect values from the aggregate functions.
+
+We can also apply eager aggregation to a join:
+
+ EXPLAIN (COSTS OFF)
+ SELECT a.i, avg(b.y + c.z)
+ FROM a JOIN b ON a.i = b.j
+        JOIN c ON b.j = c.i
+ GROUP BY a.i;
+
+ Finalize HashAggregate
+   Group Key: a.i
+   ->  Nested Loop
+         ->  Partial HashAggregate
+               Group Key: b.j
+               ->  Hash Join
+                     Hash Cond: (b.j = c.i)
+                     ->  Seq Scan on b
+                     ->  Hash
+                           ->  Seq Scan on c
+         ->  Index Only Scan using a_pkey on a
+               Index Cond: (i = b.j)
+
+During the construction of the join tree, we evaluate each base or join
+relation to determine if eager aggregation can be applied.  If feasible, we
+create a separate RelOptInfo called a "grouped relation" and generate grouped
+paths by adding sorted and hashed partial aggregation paths on top of the
+non-grouped paths.  To limit planning time, we consider only the cheapest or
+suitably-sorted non-grouped paths in this step.
+
+Another way to generate grouped paths is to join a grouped relation with a
+non-grouped relation.  Joining two grouped relations does not seem to be very
+useful and is currently not supported.
+
+If we have generated a grouped relation for the topmost join relation, we need
+to finalize its paths at the end.  The final paths will compete in the usual
+way with paths built from regular planning.
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index d2f7f4e5f3..cdc9543135 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -39,10 +39,20 @@ typedef struct
 	int			size;			/* number of input relations in clump */
 } Clump;
 
+/* The original length and hashtable of a RelInfoList */
+typedef struct
+{
+	int			savelength;
+	struct HTAB *savehash;
+} RelInfoListInfo;
+
 static List *merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump,
 						 int num_gene, bool force);
 static bool desirable_join(PlannerInfo *root,
 						   RelOptInfo *outer_rel, RelOptInfo *inner_rel);
+static RelInfoListInfo save_relinfolist(RelInfoList *relinfo_list);
+static void restore_relinfolist(RelInfoList *relinfo_list,
+								RelInfoListInfo *info);
 
 
 /*
@@ -60,8 +70,8 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
 	MemoryContext oldcxt;
 	RelOptInfo *joinrel;
 	Cost		fitness;
-	int			savelength;
-	struct HTAB *savehash;
+	RelInfoListInfo save_join_rel;
+	RelInfoListInfo save_grouped_rel;
 
 	/*
 	 * Create a private memory context that will hold all temp storage
@@ -78,25 +88,29 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
 	oldcxt = MemoryContextSwitchTo(mycontext);
 
 	/*
-	 * gimme_tree will add entries to root->join_rel_list, which may or may
-	 * not already contain some entries.  The newly added entries will be
-	 * recycled by the MemoryContextDelete below, so we must ensure that the
-	 * list is restored to its former state before exiting.  We can do this by
-	 * truncating the list to its original length.  NOTE this assumes that any
-	 * added entries are appended at the end!
+	 * gimme_tree will add entries to root->join_rel_list and
+	 * root->grouped_rel_list, which may or may not already contain some
+	 * entries.  The newly added entries will be recycled by the
+	 * MemoryContextDelete below, so we must ensure that each list within the
+	 * RelInfoList structures is restored to its former state before exiting.
+	 * We can do this by truncating each list to its original length.  NOTE
+	 * this assumes that any added entries are appended at the end!
 	 *
-	 * We also must take care not to mess up the outer join_rel_hash, if there
-	 * is one.  We can do this by just temporarily setting the link to NULL.
-	 * (If we are dealing with enough join rels, which we very likely are, a
-	 * new hash table will get built and used locally.)
+	 * We also must take care not to mess up the outer hash tables within the
+	 * RelInfoList structures, if any.  We can do this by just temporarily
+	 * setting each link to NULL.  (If we are dealing with enough join rels or
+	 * grouped rels, which we very likely are, new hash tables will get built
+	 * and used locally.)
 	 *
 	 * join_rel_level[] shouldn't be in use, so just Assert it isn't.
 	 */
-	savelength = list_length(root->join_rel_list);
-	savehash = root->join_rel_hash;
+	save_join_rel = save_relinfolist(root->join_rel_list);
+	save_grouped_rel = save_relinfolist(root->grouped_rel_list);
+
 	Assert(root->join_rel_level == NULL);
 
-	root->join_rel_hash = NULL;
+	root->join_rel_list->hash = NULL;
+	root->grouped_rel_list->hash = NULL;
 
 	/* construct the best path for the given combination of relations */
 	joinrel = gimme_tree(root, tour, num_gene);
@@ -118,12 +132,11 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
 		fitness = DBL_MAX;
 
 	/*
-	 * Restore join_rel_list to its former state, and put back original
-	 * hashtable if any.
+	 * Restore the item lists of join_rel_list and grouped_rel_list to their
+	 * former state, and put back the original hashtables, if any.
 	 */
-	root->join_rel_list = list_truncate(root->join_rel_list,
-										savelength);
-	root->join_rel_hash = savehash;
+	restore_relinfolist(root->join_rel_list, &save_join_rel);
+	restore_relinfolist(root->grouped_rel_list, &save_grouped_rel);
 
 	/* release all the memory acquired within gimme_tree */
 	MemoryContextSwitchTo(oldcxt);
@@ -279,6 +292,27 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
 				/* Find and save the cheapest paths for this joinrel */
 				set_cheapest(joinrel);
 
+				/*
+				 * Except for the topmost scan/join rel, consider generating
+				 * partial aggregation paths for the grouped relation on top
+				 * of the paths of this rel.  After that, we're done creating
+				 * paths for the grouped relation, so run set_cheapest().
+				 */
+				if (!bms_equal(joinrel->relids, root->all_query_rels))
+				{
+					RelOptInfo *rel_grouped;
+
+					rel_grouped = find_grouped_rel(root, joinrel->relids);
+					if (rel_grouped)
+					{
+						Assert(IS_GROUPED_REL(rel_grouped));
+
+						generate_grouped_paths(root, rel_grouped, joinrel,
+											   rel_grouped->agg_info);
+						set_cheapest(rel_grouped);
+					}
+				}
+
 				/* Absorb new clump into old */
 				old_clump->joinrel = joinrel;
 				old_clump->size += new_clump->size;
@@ -336,3 +370,27 @@ desirable_join(PlannerInfo *root,
 	/* Otherwise postpone the join till later. */
 	return false;
 }
+
+/*
+ * Save the original length and hashtable of a RelInfoList.
+ */
+static RelInfoListInfo
+save_relinfolist(RelInfoList *relinfo_list)
+{
+	RelInfoListInfo info;
+
+	info.savelength = list_length(relinfo_list->items);
+	info.savehash = relinfo_list->hash;
+
+	return info;
+}
+
+/*
+ * Restore the original length and hashtable of a RelInfoList.
+ */
+static void
+restore_relinfolist(RelInfoList *relinfo_list, RelInfoListInfo *info)
+{
+	relinfo_list->items = list_truncate(relinfo_list->items, info->savelength);
+	relinfo_list->hash = info->savehash;
+}
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 172edb643a..13228377a5 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -47,6 +48,7 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -77,6 +79,7 @@ typedef enum pushdown_safe_type
 
 /* These parameters are set by GUC */
 bool		enable_geqo = false;	/* just in case GUC doesn't set it */
+bool		enable_eager_aggregate = false;
 int			geqo_threshold;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
@@ -90,6 +93,7 @@ join_search_hook_type join_search_hook = NULL;
 
 static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_base_grouped_rels(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 						 Index rti, RangeTblEntry *rte);
@@ -114,6 +118,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 								Index rti, RangeTblEntry *rte);
 static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 									Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
 static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 										 List *live_childrels,
 										 List *all_child_pathkeys);
@@ -182,6 +187,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	 */
 	set_base_rel_sizes(root);
 
+	/*
+	 * Build grouped relations for base rels where possible.
+	 */
+	setup_base_grouped_rels(root);
+
 	/*
 	 * We should now have size estimates for every actual table involved in
 	 * the query, and we also know which if any have been deleted from the
@@ -323,6 +333,45 @@ set_base_rel_sizes(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_base_grouped_rels
+ *	  For each "plain" base relation, build a grouped base relation if eager
+ *	  aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_base_grouped_rels(PlannerInfo *root)
+{
+	Index		rti;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo *rel = root->simple_rel_array[rti];
+		RelOptInfo *rel_grouped;
+
+		/* there may be empty slots corresponding to non-baserel RTEs */
+		if (rel == NULL)
+			continue;
+
+		Assert(rel->relid == rti);	/* sanity check on array */
+		Assert(IS_SIMPLE_REL(rel)); /* sanity check on rel */
+
+		rel_grouped = build_simple_grouped_rel(root, rel);
+		if (rel_grouped)
+		{
+			/* Make the grouped relation available for joining. */
+			add_grouped_rel(root, rel_grouped);
+		}
+	}
+}
+
 /*
  * set_base_rel_pathlists
  *	  Finds all paths available for scanning each base-relation entry.
@@ -559,6 +608,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* Now find the cheapest of the paths for this rel */
 	set_cheapest(rel);
 
+	/*
+	 * If a grouped relation for this rel exists, build partial aggregation
+	 * paths for it.
+	 *
+	 * Note that this can only happen after we've called set_cheapest() for
+	 * this base rel, because we need its cheapest paths.
+	 */
+	set_grouped_rel_pathlist(root, rel);
+
 #ifdef OPTIMIZER_DEBUG
 	pprint(rel);
 #endif
@@ -1298,6 +1356,36 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	add_paths_to_append_rel(root, rel, live_childrels);
 }
 
+/*
+ * set_grouped_rel_pathlist
+ *	  If a grouped relation for the given 'rel' exists, build partial
+ *	  aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *rel_grouped;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Add paths to the grouped base relation if one exists. */
+	rel_grouped = find_grouped_rel(root, rel->relids);
+	if (rel_grouped)
+	{
+		Assert(IS_GROUPED_REL(rel_grouped));
+
+		generate_grouped_paths(root, rel_grouped, rel,
+							   rel_grouped->agg_info);
+		set_cheapest(rel_grouped);
+	}
+}
+
 
 /*
  * add_paths_to_append_rel
@@ -3306,6 +3394,318 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
 	}
 }
 
+/*
+ * generate_grouped_paths
+ *		Generate paths for a grouped relation by adding sorted and hashed
+ *		partial aggregation paths on top of paths of the plain base or join
+ *		relation.
+ *
+ * The information needed is provided by the RelAggInfo structure.
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *rel_grouped,
+					   RelOptInfo *rel_plain, RelAggInfo *agg_info)
+{
+	AggClauseCosts agg_costs;
+	bool		can_hash;
+	bool		can_sort;
+	Path	   *cheapest_total_path = NULL;
+	Path	   *cheapest_partial_path = NULL;
+	double		dNumGroups = 0;
+	double		dNumPartialGroups = 0;
+
+	if (IS_DUMMY_REL(rel_plain))
+	{
+		mark_dummy_rel(rel_grouped);
+		return;
+	}
+
+	/*
+	 * If the grouped paths for the given relation are not considered useful,
+	 * do not bother to generate them.
+	 */
+	if (!agg_info->agg_useful)
+		return;
+
+	MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+	get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+	/*
+	 * Determine whether it's possible to perform sort-based implementations
+	 * of grouping.
+	 */
+	can_sort = grouping_is_sortable(agg_info->group_clauses);
+
+	/*
+	 * Determine whether we should consider hash-based implementations of
+	 * grouping.
+	 */
+	Assert(root->numOrderedAggs == 0);
+	can_hash = (agg_info->group_clauses != NIL &&
+				grouping_is_hashable(agg_info->group_clauses));
+
+	/*
+	 * Consider whether we should generate partially aggregated non-partial
+	 * paths.  We can only do this if we have a non-partial path.
+	 */
+	if (rel_plain->pathlist != NIL)
+	{
+		cheapest_total_path = rel_plain->cheapest_total_path;
+		Assert(cheapest_total_path != NULL);
+	}
+
+	/*
+	 * If parallelism is possible for rel_grouped, then we should consider
+	 * generating partially-grouped partial paths.  However, if the plain rel
+	 * has no partial paths, then we can't.
+	 */
+	if (rel_grouped->consider_parallel && rel_plain->partial_pathlist != NIL)
+	{
+		cheapest_partial_path = linitial(rel_plain->partial_pathlist);
+		Assert(cheapest_partial_path != NULL);
+	}
+
+	/* Estimate number of partial groups. */
+	if (cheapest_total_path != NULL)
+		dNumGroups = estimate_num_groups(root,
+										 agg_info->group_exprs,
+										 cheapest_total_path->rows,
+										 NULL, NULL);
+	if (cheapest_partial_path != NULL)
+		dNumPartialGroups = estimate_num_groups(root,
+												agg_info->group_exprs,
+												cheapest_partial_path->rows,
+												NULL, NULL);
+
+	if (can_sort && cheapest_total_path != NULL)
+	{
+		ListCell   *lc;
+
+		/*
+		 * Use any available suitably-sorted path as input, and also consider
+		 * sorting the cheapest-total path.
+		 */
+		foreach(lc, rel_plain->pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   rel_grouped,
+												   input_path,
+												   agg_info->agg_input);
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													path->pathkeys,
+													&presorted_keys);
+			if (!is_sorted)
+			{
+				/*
+				 * Try at least sorting the cheapest path and also try
+				 * incrementally sorting any path which is partially sorted
+				 * already (no need to deal with paths which have presorted
+				 * keys when incremental sort is disabled unless it's the
+				 * cheapest input path).
+				 */
+				if (input_path != cheapest_total_path &&
+					(presorted_keys == 0 || !enable_incremental_sort))
+					continue;
+
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 rel_grouped,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 rel_grouped,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											rel_grouped,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumGroups);
+
+			add_path(rel_grouped, path);
+		}
+	}
+
+	if (can_sort && cheapest_partial_path != NULL)
+	{
+		ListCell   *lc;
+
+		/* Similar to above logic, but for partial paths. */
+		foreach(lc, rel_plain->partial_pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   rel_grouped,
+												   input_path,
+												   agg_info->agg_input);
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													path->pathkeys,
+													&presorted_keys);
+
+			if (!is_sorted)
+			{
+				/*
+				 * Try at least sorting the cheapest path and also try
+				 * incrementally sorting any path which is partially sorted
+				 * already (no need to deal with paths which have presorted
+				 * keys when incremental sort is disabled unless it's the
+				 * cheapest input path).
+				 */
+				if (input_path != cheapest_partial_path &&
+					(presorted_keys == 0 || !enable_incremental_sort))
+					continue;
+
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 rel_grouped,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 rel_grouped,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											rel_grouped,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumPartialGroups);
+
+			add_partial_path(rel_grouped, path);
+		}
+	}
+
+	/*
+	 * Add a partially-grouped HashAgg Path where possible
+	 */
+	if (can_hash && cheapest_total_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   rel_grouped,
+											   cheapest_total_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										rel_grouped,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumGroups);
+
+		add_path(rel_grouped, path);
+	}
+
+	/*
+	 * Now add a partially-grouped HashAgg partial Path where possible
+	 */
+	if (can_hash && cheapest_partial_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   rel_grouped,
+											   cheapest_partial_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										rel_grouped,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumPartialGroups);
+
+		add_partial_path(rel_grouped, path);
+	}
+}
+
 /*
  * make_rel_from_joinlist
  *	  Build access paths using a "joinlist" to guide the join path search.
@@ -3414,9 +3814,10 @@ make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
  * needed for these paths need have been instantiated.
  *
  * Note to plugin authors: the functions invoked during standard_join_search()
- * modify root->join_rel_list and root->join_rel_hash.  If you want to do more
- * than one join-order search, you'll probably need to save and restore the
- * original states of those data structures.  See geqo_eval() for an example.
+ * modify root->join_rel_list->items and root->join_rel_list->hash.  If you
+ * want to do more than one join-order search, you'll probably need to save and
+ * restore the original states of those data structures.  See geqo_eval() for
+ * an example.
  */
 RelOptInfo *
 standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
@@ -3465,6 +3866,10 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 		 *
 		 * After that, we're done creating paths for the joinrel, so run
 		 * set_cheapest().
+		 *
+		 * In addition, we also run generate_grouped_paths() for the grouped
+		 * relation of each just-processed joinrel, and run set_cheapest() for
+		 * the grouped relation afterwards.
 		 */
 		foreach(lc, root->join_rel_level[lev])
 		{
@@ -3485,6 +3890,27 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 			/* Find and save the cheapest paths for this rel */
 			set_cheapest(rel);
 
+			/*
+			 * Except for the topmost scan/join rel, consider generating
+			 * partial aggregation paths for the grouped relation on top of
+			 * the paths of this rel.  After that, we're done creating paths
+			 * for the grouped relation, so run set_cheapest().
+			 */
+			if (!bms_equal(rel->relids, root->all_query_rels))
+			{
+				RelOptInfo *rel_grouped;
+
+				rel_grouped = find_grouped_rel(root, rel->relids);
+				if (rel_grouped)
+				{
+					Assert(IS_GROUPED_REL(rel_grouped));
+
+					generate_grouped_paths(root, rel_grouped, rel,
+										   rel_grouped->agg_info);
+					set_cheapest(rel_grouped);
+				}
+			}
+
 #ifdef OPTIMIZER_DEBUG
 			pprint(rel);
 #endif
@@ -4353,6 +4779,29 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 		if (IS_DUMMY_REL(child_rel))
 			continue;
 
+		/*
+		 * Except for the topmost scan/join rel, consider generating partial
+		 * aggregation paths for the grouped relation on top of the paths of
+		 * this partitioned child-join.  After that, we're done creating paths
+		 * for the grouped relation, so run set_cheapest().
+		 */
+		if (!bms_equal(IS_OTHER_REL(rel) ?
+					   rel->top_parent_relids : rel->relids,
+					   root->all_query_rels))
+		{
+			RelOptInfo *rel_grouped;
+
+			rel_grouped = find_grouped_rel(root, child_rel->relids);
+			if (rel_grouped)
+			{
+				Assert(IS_GROUPED_REL(rel_grouped));
+
+				generate_grouped_paths(root, rel_grouped, child_rel,
+									   rel_grouped->agg_info);
+				set_cheapest(rel_grouped);
+			}
+		}
+
 #ifdef OPTIMIZER_DEBUG
 		pprint(child_rel);
 #endif
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index c36687aa4d..c093b47af4 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -180,6 +180,8 @@ static bool cost_qual_eval_walker(Node *node, cost_qual_eval_context *context);
 static void get_restriction_qual_cost(PlannerInfo *root, RelOptInfo *baserel,
 									  ParamPathInfo *param_info,
 									  QualCost *qpqual_cost);
+static void set_joinpath_size(PlannerInfo *root, JoinPath *jpath,
+							  SpecialJoinInfo *sjinfo);
 static bool has_indexed_join_quals(NestPath *path);
 static double approx_tuple_count(PlannerInfo *root, JoinPath *path,
 								 List *quals);
@@ -3370,19 +3372,7 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	if (inner_path_rows <= 0)
 		inner_path_rows = 1;
 	/* Mark the path with the correct row estimate */
-	if (path->jpath.path.param_info)
-		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
-	else
-		path->jpath.path.rows = path->jpath.path.parent->rows;
-
-	/* For partial paths, scale row estimate. */
-	if (path->jpath.path.parallel_workers > 0)
-	{
-		double		parallel_divisor = get_parallel_divisor(&path->jpath.path);
-
-		path->jpath.path.rows =
-			clamp_row_est(path->jpath.path.rows / parallel_divisor);
-	}
+	set_joinpath_size(root, &path->jpath, extra->sjinfo);
 
 	/* cost of inner-relation source data (we already dealt with outer rel) */
 
@@ -3867,19 +3857,7 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 		inner_path_rows = 1;
 
 	/* Mark the path with the correct row estimate */
-	if (path->jpath.path.param_info)
-		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
-	else
-		path->jpath.path.rows = path->jpath.path.parent->rows;
-
-	/* For partial paths, scale row estimate. */
-	if (path->jpath.path.parallel_workers > 0)
-	{
-		double		parallel_divisor = get_parallel_divisor(&path->jpath.path);
-
-		path->jpath.path.rows =
-			clamp_row_est(path->jpath.path.rows / parallel_divisor);
-	}
+	set_joinpath_size(root, &path->jpath, extra->sjinfo);
 
 	/*
 	 * Compute cost of the mergequals and qpquals (other restriction clauses)
@@ -4299,19 +4277,7 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
 
 	/* Mark the path with the correct row estimate */
-	if (path->jpath.path.param_info)
-		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
-	else
-		path->jpath.path.rows = path->jpath.path.parent->rows;
-
-	/* For partial paths, scale row estimate. */
-	if (path->jpath.path.parallel_workers > 0)
-	{
-		double		parallel_divisor = get_parallel_divisor(&path->jpath.path);
-
-		path->jpath.path.rows =
-			clamp_row_est(path->jpath.path.rows / parallel_divisor);
-	}
+	set_joinpath_size(root, &path->jpath, extra->sjinfo);
 
 	/* mark the path with estimated # of batches */
 	path->num_batches = numbatches;
@@ -5061,6 +5027,57 @@ get_restriction_qual_cost(PlannerInfo *root, RelOptInfo *baserel,
 		*qpqual_cost = baserel->baserestrictcost;
 }
 
+/*
+ * set_joinpath_size
+ *	  Set the correct row estimate for the given join path.
+ *
+ * 'jpath' is the join path under consideration.
+ * 'sjinfo' is any SpecialJoinInfo relevant to this join.
+ *
+ * Note that for a grouped join relation, its paths could have very different
+ * rowcount estimates, so we need to calculate the rowcount estimate using the
+ * outer path and inner path of the given join path.
+ */
+static void
+set_joinpath_size(PlannerInfo *root, JoinPath *jpath, SpecialJoinInfo *sjinfo)
+{
+	if (IS_GROUPED_REL(jpath->path.parent))
+	{
+		Path	   *outer_path = jpath->outerjoinpath;
+		Path	   *inner_path = jpath->innerjoinpath;
+
+		/*
+		 * Estimate the number of rows of this grouped join path as the sizes
+		 * of the outer and inner paths times the selectivity of the clauses
+		 * that have ended up at this join node.
+		 */
+		jpath->path.rows = calc_joinrel_size_estimate(root,
+													  jpath->path.parent,
+													  outer_path->parent,
+													  inner_path->parent,
+													  outer_path->rows,
+													  inner_path->rows,
+													  sjinfo,
+													  jpath->joinrestrictinfo);
+	}
+	else
+	{
+		if (jpath->path.param_info)
+			jpath->path.rows = jpath->path.param_info->ppi_rows;
+		else
+			jpath->path.rows = jpath->path.parent->rows;
+
+		/* For partial paths, scale row estimate. */
+		if (jpath->path.parallel_workers > 0)
+		{
+			double		parallel_divisor = get_parallel_divisor(&jpath->path);
+
+			jpath->path.rows =
+				clamp_row_est(jpath->path.rows / parallel_divisor);
+		}
+	}
+}
+
 
 /*
  * compute_semi_anti_join_factors
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 7db5e30eef..248aa3fffe 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -35,6 +35,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
 static bool restriction_is_constant_false(List *restrictlist,
 										  RelOptInfo *joinrel,
 										  bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+								  RelOptInfo *rel2, RelOptInfo *joinrel,
+								  SpecialJoinInfo *sjinfo, List *restrictlist);
 static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 										RelOptInfo *rel2, RelOptInfo *joinrel,
 										SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -771,6 +774,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 		return joinrel;
 	}
 
+	/* Build a grouped join relation for 'joinrel' if possible. */
+	make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+						  restrictlist);
+
 	/* Add paths to the join relation. */
 	populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
 								restrictlist);
@@ -882,6 +889,135 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
 	return input_relids;
 }
 
+/*
+ * make_grouped_join_rel
+ *	  Build a grouped join relation out of 'joinrel' if eager aggregation is
+ *	  possible and the 'joinrel' can produce grouped paths.
+ *
+ * We also generate partial aggregation paths for the grouped relation by
+ * joining the grouped paths of 'rel1' to the plain paths of 'rel2', or by
+ * joining the grouped paths of 'rel2' to the plain paths of 'rel1'.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+					  RelOptInfo *rel2, RelOptInfo *joinrel,
+					  SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+	RelOptInfo *rel_grouped;
+	RelOptInfo *rel1_grouped;
+	RelOptInfo *rel2_grouped;
+	bool		rel1_empty;
+	bool		rel2_empty;
+	bool		yet_to_add = false;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/*
+	 * See if we already have a grouped joinrel for this joinrel.
+	 */
+	rel_grouped = find_grouped_rel(root, joinrel->relids);
+
+	/*
+	 * Construct a new RelOptInfo for the grouped join relation if there is no
+	 * existing one.
+	 */
+	if (rel_grouped == NULL)
+	{
+		RelAggInfo *agg_info = NULL;
+
+		/*
+		 * Prepare the information needed to create grouped paths for this
+		 * join relation.
+		 */
+		agg_info = create_rel_agg_info(root, joinrel);
+		if (agg_info == NULL)
+			return;
+
+		/* build a grouped relation out of the plain relation */
+		rel_grouped = build_grouped_rel(root, joinrel);
+		rel_grouped->reltarget = agg_info->target;
+		rel_grouped->rows = agg_info->grouped_rows;
+		rel_grouped->agg_info = agg_info;
+
+		/*
+		 * If the grouped paths for the given join relation are considered
+		 * useful, add the grouped relation we just built to the PlannerInfo
+		 * to make it available for further joining or for acting as the upper
+		 * rel representing the result of partial aggregation.  Otherwise, we
+		 * need to postpone the decision on adding the grouped relation to the
+		 * PlannerInfo, as it depends on whether we can generate any grouped
+		 * paths by joining the given pair of input relations.
+		 */
+		if (agg_info->agg_useful)
+			add_grouped_rel(root, rel_grouped);
+		else
+			yet_to_add = true;
+	}
+
+	Assert(IS_GROUPED_REL(rel_grouped));
+
+	/* We may have already proven this grouped join relation to be dummy. */
+	if (IS_DUMMY_REL(rel_grouped))
+		return;
+
+	/* Retrieve the grouped relations for the two input rels */
+	rel1_grouped = find_grouped_rel(root, rel1->relids);
+	rel2_grouped = find_grouped_rel(root, rel2->relids);
+
+	rel1_empty = (rel1_grouped == NULL || IS_DUMMY_REL(rel1_grouped));
+	rel2_empty = (rel2_grouped == NULL || IS_DUMMY_REL(rel2_grouped));
+
+	/* Nothing to do if there's no grouped relation. */
+	if (rel1_empty && rel2_empty)
+		return;
+
+	/* Joining two grouped relations is currently not supported */
+	if (!rel1_empty && !rel2_empty)
+		return;
+
+	/* Generate partial aggregation paths for the grouped relation */
+	if (!rel1_empty)
+	{
+		populate_joinrel_with_paths(root, rel1_grouped, rel2, rel_grouped,
+									sjinfo, restrictlist);
+
+		/*
+		 * It shouldn't happen that we have marked rel1_grouped as dummy in
+		 * populate_joinrel_with_paths due to provably constant-false join
+		 * restrictions, hence we wouldn't end up with a plan that has an
+		 * Aggref in a non-Agg plan node.
+		 */
+		Assert(!IS_DUMMY_REL(rel1_grouped));
+	}
+	else if (!rel2_empty)
+	{
+		populate_joinrel_with_paths(root, rel1, rel2_grouped, rel_grouped,
+									sjinfo, restrictlist);
+
+		/*
+		 * It shouldn't happen that we have marked rel2_grouped as dummy in
+		 * populate_joinrel_with_paths due to provably constant-false join
+		 * restrictions, hence we wouldn't end up with a plan that has an
+		 * Aggref in a non-Agg plan node.
+		 */
+		Assert(!IS_DUMMY_REL(rel2_grouped));
+	}
+
+	/*
+	 * Since we have generated grouped paths by joining the given pair of
+	 * input relations, add the grouped relation to the PlannerInfo if we have
+	 * not already done so.
+	 */
+	if (yet_to_add)
+		add_grouped_rel(root, rel_grouped);
+}
+
 /*
  * populate_joinrel_with_paths
  *	  Add paths to the given joinrel for given pair of joining relations. The
@@ -1674,6 +1810,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 						 adjust_child_relids(joinrel->relids,
 											 nappinfos, appinfos)));
 
+		/* Build a grouped join relation for 'child_joinrel' if possible */
+		make_grouped_join_rel(root, child_rel1, child_rel2,
+							  child_joinrel, child_sjinfo,
+							  child_restrictlist);
+
 		/* And make paths for the child join */
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 5f3908be51..051276a73e 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/nbtree.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
@@ -81,6 +82,8 @@ typedef struct JoinTreeItem
 } JoinTreeItem;
 
 
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
 static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
 									   Index rtindex);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -628,6 +631,276 @@ remove_useless_groupby_columns(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_eager_aggregation
+ *	  Check if eager aggregation is applicable, and if so collect suitable
+ *	  aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+	/*
+	 * Don't apply eager aggregation if disabled by user.
+	 */
+	if (!enable_eager_aggregate)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if there are no available GROUP BY
+	 * clauses.
+	 */
+	if (!root->processed_groupClause)
+		return;
+
+	/*
+	 * For now we don't try to support grouping sets.
+	 */
+	if (root->parse->groupingSets)
+		return;
+
+	/*
+	 * For now we don't try to support DISTINCT or ORDER BY aggregates.
+	 */
+	if (root->numOrderedAggs > 0)
+		return;
+
+	/*
+	 * If there are any aggregates that do not support partial mode, or any
+	 * partial aggregates that are non-serializable, do not apply eager
+	 * aggregation.
+	 */
+	if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+		return;
+
+	/*
+	 * We don't try to apply eager aggregation if there are set-returning
+	 * functions in the targetlist.
+	 */
+	if (root->parse->hasTargetSRFs)
+		return;
+
+	/*
+	 * Eager aggregation only makes sense if there are multiple base rels in
+	 * the query.
+	 */
+	if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+		return;
+
+	/*
+	 * Collect aggregate expressions and plain Vars that appear in the
+	 * targetlist and havingQual.
+	 */
+	create_agg_clause_infos(root);
+
+	/*
+	 * If there are no suitable aggregate expressions, we cannot apply eager
+	 * aggregation.
+	 */
+	if (root->agg_clause_list == NIL)
+		return;
+
+	/*
+	 * Collect grouping expressions that appear in grouping clauses.
+	 */
+	create_grouping_expr_infos(root);
+}
+
+/*
+ * create_agg_clause_infos
+ *	  Search the targetlist and havingQual for Aggrefs and plain Vars, and
+ *	  create an AggClauseInfo for each Aggref node.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+	List	   *tlist_exprs;
+	List	   *agg_clause_list = NIL;
+	List	   *tlist_vars = NIL;
+	ListCell   *lc;
+
+	Assert(root->agg_clause_list == NIL);
+	Assert(root->tlist_vars == NIL);
+
+	tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+								  PVC_INCLUDE_AGGREGATES |
+								  PVC_RECURSE_WINDOWFUNCS |
+								  PVC_RECURSE_PLACEHOLDERS);
+
+	/*
+	 * Aggregates within the HAVING clause need to be processed in the same
+	 * way as those in the targetlist.  Note that HAVING can contain Aggrefs
+	 * but not WindowFuncs.
+	 */
+	if (root->parse->havingQual != NULL)
+	{
+		List	   *having_exprs;
+
+		having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+									   PVC_INCLUDE_AGGREGATES |
+									   PVC_RECURSE_PLACEHOLDERS);
+		if (having_exprs != NIL)
+		{
+			tlist_exprs = list_concat(tlist_exprs, having_exprs);
+			list_free(having_exprs);
+		}
+	}
+
+	foreach(lc, tlist_exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Aggref	   *aggref;
+		AggClauseInfo *ac_info;
+
+		/* For now we don't try to support GROUPING() expressions */
+		if (IsA(expr, GroupingFunc))
+		{
+			list_free_deep(agg_clause_list);
+			list_free(tlist_vars);
+
+			return;
+		}
+
+		/* Collect plain Vars for future reference */
+		if (IsA(expr, Var))
+		{
+			tlist_vars = list_append_unique(tlist_vars, expr);
+			continue;
+		}
+
+		aggref = castNode(Aggref, expr);
+
+		Assert(aggref->aggorder == NIL);
+		Assert(aggref->aggdistinct == NIL);
+
+		/*
+		 * If there are any securityQuals, do not try to apply eager
+		 * aggregation if any non-leakproof aggregate functions are present.
+		 * This is overly strict, but for now...
+		 */
+		if (root->qual_security_level > 0 &&
+			!get_func_leakproof(aggref->aggfnoid))
+		{
+			list_free_deep(agg_clause_list);
+			list_free(tlist_vars);
+
+			return;
+		}
+
+		ac_info = makeNode(AggClauseInfo);
+		ac_info->aggref = aggref;
+		ac_info->agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+		agg_clause_list = list_append_unique(agg_clause_list, ac_info);
+	}
+
+	list_free(tlist_exprs);
+
+	root->agg_clause_list = agg_clause_list;
+	root->tlist_vars = tlist_vars;
+}
+
+/*
+ * create_grouping_expr_infos
+ *	  Create GroupExprInfo for each expression usable as grouping key.
+ *
+ * If any grouping expression is not suitable, we will just return with
+ * root->group_expr_list being NIL.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+	List	   *exprs = NIL;
+	List	   *sortgrouprefs = NIL;
+	List	   *btree_opfamilies = NIL;
+	ListCell   *lc,
+			   *lc1,
+			   *lc2,
+			   *lc3;
+
+	Assert(root->group_expr_list == NIL);
+
+	foreach(lc, root->processed_groupClause)
+	{
+		SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+		TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+		TypeCacheEntry *tce;
+		Oid			equalimageproc;
+		Oid			eq_op;
+		List	   *eq_opfamilies;
+		Oid			btree_opfamily;
+
+		Assert(tle->ressortgroupref > 0);
+
+		/*
+		 * For now we only support plain Vars as grouping expressions.
+		 */
+		if (!IsA(tle->expr, Var))
+			return;
+
+		/*
+		 * Eager aggregation is only possible if equality implies image
+		 * equality for each grouping key.  Otherwise, placing keys with
+		 * different byte images into the same group may result in the loss of
+		 * information that could be necessary to evaluate upper qual clauses.
+		 *
+		 * For instance, the NUMERIC data type is not supported, as values
+		 * that are considered equal by the equality operator (e.g., 0 and
+		 * 0.0) can have different scales.
+		 */
+		tce = lookup_type_cache(exprType((Node *) tle->expr),
+								TYPECACHE_BTREE_OPFAMILY);
+		if (!OidIsValid(tce->btree_opf) ||
+			!OidIsValid(tce->btree_opintype))
+			return;
+
+		equalimageproc = get_opfamily_proc(tce->btree_opf,
+										   tce->btree_opintype,
+										   tce->btree_opintype,
+										   BTEQUALIMAGE_PROC);
+		if (!OidIsValid(equalimageproc) ||
+			!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+											   tce->typcollation,
+											   ObjectIdGetDatum(tce->btree_opintype))))
+			return;
+
+		/*
+		 * Get the equality operator in the btree opfamily.
+		 */
+		eq_op = get_opfamily_member(tce->btree_opf,
+									tce->btree_opintype,
+									tce->btree_opintype,
+									BTEqualStrategyNumber);
+		if (!OidIsValid(eq_op))
+			return;
+		eq_opfamilies = get_mergejoin_opfamilies(eq_op);
+		if (!eq_opfamilies)
+			return;
+		btree_opfamily = linitial_oid(eq_opfamilies);
+
+		exprs = lappend(exprs, tle->expr);
+		sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+		btree_opfamilies = lappend_oid(btree_opfamilies, btree_opfamily);
+	}
+
+	/*
+	 * Construct GroupExprInfo for each expression.
+	 */
+	forthree(lc1, exprs, lc2, sortgrouprefs, lc3, btree_opfamilies)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc1);
+		int			sortgroupref = lfirst_int(lc2);
+		Oid			btree_opfamily = lfirst_oid(lc3);
+		GroupExprInfo *ge_info;
+
+		ge_info = makeNode(GroupExprInfo);
+		ge_info->expr = (Expr *) copyObject(expr);
+		ge_info->sortgroupref = sortgroupref;
+		ge_info->btree_opfamily = btree_opfamily;
+
+		root->group_expr_list = lappend(root->group_expr_list, ge_info);
+	}
+}
+
 /*****************************************************************************
  *
  *	  LATERAL REFERENCES
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 735560e8ca..22df968629 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -64,8 +64,12 @@ query_planner(PlannerInfo *root,
 	 * NOTE: append_rel_list was set up by subquery_planner, so do not touch
 	 * here.
 	 */
-	root->join_rel_list = NIL;
-	root->join_rel_hash = NULL;
+	root->join_rel_list = makeNode(RelInfoList);
+	root->join_rel_list->items = NIL;
+	root->join_rel_list->hash = NULL;
+	root->grouped_rel_list = makeNode(RelInfoList);
+	root->grouped_rel_list->items = NIL;
+	root->grouped_rel_list->hash = NULL;
 	root->join_rel_level = NULL;
 	root->join_cur_level = 0;
 	root->canon_pathkeys = NIL;
@@ -76,6 +80,9 @@ query_planner(PlannerInfo *root,
 	root->placeholder_list = NIL;
 	root->placeholder_array = NULL;
 	root->placeholder_array_size = 0;
+	root->agg_clause_list = NIL;
+	root->group_expr_list = NIL;
+	root->tlist_vars = NIL;
 	root->fkey_list = NIL;
 	root->initial_rels = NIL;
 
@@ -260,6 +267,12 @@ query_planner(PlannerInfo *root,
 	 */
 	extract_restriction_or_clauses(root);
 
+	/*
+	 * Check if eager aggregation is applicable, and if so, set up
+	 * root->agg_clause_list and root->group_expr_list.
+	 */
+	setup_eager_aggregation(root);
+
 	/*
 	 * Now expand appendrels by adding "otherrels" for their children.  We
 	 * delay this to the end so that we have as much information as possible
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 7468961b01..99e46cc152 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -229,7 +229,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 									  RelOptInfo *partially_grouped_rel,
 									  const AggClauseCosts *agg_costs,
 									  grouping_sets_data *gd,
-									  double dNumGroups,
 									  GroupPathExtraData *extra);
 static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
 												 RelOptInfo *grouped_rel,
@@ -3915,9 +3914,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 							   GroupPathExtraData *extra,
 							   RelOptInfo **partially_grouped_rel_p)
 {
-	Path	   *cheapest_path = input_rel->cheapest_total_path;
 	RelOptInfo *partially_grouped_rel = NULL;
-	double		dNumGroups;
 	PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
 
 	/*
@@ -3999,23 +3996,16 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Gather any partially grouped partial paths. */
 	if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
-	{
 		gather_grouping_paths(root, partially_grouped_rel);
-		set_cheapest(partially_grouped_rel);
-	}
 
-	/*
-	 * Estimate number of groups.
-	 */
-	dNumGroups = get_number_of_groups(root,
-									  cheapest_path->rows,
-									  gd,
-									  extra->targetList);
+	/* Now choose the best path(s) for partially_grouped_rel. */
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+		set_cheapest(partially_grouped_rel);
 
 	/* Build final grouping paths */
 	add_paths_to_grouping_rel(root, input_rel, grouped_rel,
 							  partially_grouped_rel, agg_costs, gd,
-							  dNumGroups, extra);
+							  extra);
 
 	/* Give a helpful error if we failed to find any implementation */
 	if (grouped_rel->pathlist == NIL)
@@ -6906,16 +6896,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 						  RelOptInfo *grouped_rel,
 						  RelOptInfo *partially_grouped_rel,
 						  const AggClauseCosts *agg_costs,
-						  grouping_sets_data *gd, double dNumGroups,
+						  grouping_sets_data *gd,
 						  GroupPathExtraData *extra)
 {
 	Query	   *parse = root->parse;
 	Path	   *cheapest_path = input_rel->cheapest_total_path;
+	Path	   *cheapest_partially_grouped_path = NULL;
 	ListCell   *lc;
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 	List	   *havingQual = (List *) extra->havingQual;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+	double		dNumGroups = 0;
+	double		dNumFinalGroups = 0;
+
+	/*
+	 * Estimate number of groups for non-split aggregation.
+	 */
+	dNumGroups = get_number_of_groups(root,
+									  cheapest_path->rows,
+									  gd,
+									  extra->targetList);
+
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+	{
+		cheapest_partially_grouped_path =
+			partially_grouped_rel->cheapest_total_path;
+
+		/*
+		 * Estimate number of groups for final phase of partial aggregation.
+		 */
+		dNumFinalGroups =
+			get_number_of_groups(root,
+								 cheapest_partially_grouped_path->rows,
+								 gd,
+								 extra->targetList);
+	}
 
 	if (can_sort)
 	{
@@ -7028,7 +7044,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 					path = make_ordered_path(root,
 											 grouped_rel,
 											 path,
-											 partially_grouped_rel->cheapest_total_path,
+											 cheapest_partially_grouped_path,
 											 info->pathkeys,
 											 -1.0);
 
@@ -7046,7 +7062,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												 info->clauses,
 												 havingQual,
 												 agg_final_costs,
-												 dNumGroups));
+												 dNumFinalGroups));
 					else
 						add_path(grouped_rel, (Path *)
 								 create_group_path(root,
@@ -7054,7 +7070,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												   path,
 												   info->clauses,
 												   havingQual,
-												   dNumGroups));
+												   dNumFinalGroups));
 
 				}
 			}
@@ -7096,19 +7112,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 		 */
 		if (partially_grouped_rel && partially_grouped_rel->pathlist)
 		{
-			Path	   *path = partially_grouped_rel->cheapest_total_path;
-
 			add_path(grouped_rel, (Path *)
 					 create_agg_path(root,
 									 grouped_rel,
-									 path,
+									 cheapest_partially_grouped_path,
 									 grouped_rel->reltarget,
 									 AGG_HASHED,
 									 AGGSPLIT_FINAL_DESERIAL,
 									 root->processed_groupClause,
 									 havingQual,
 									 agg_final_costs,
-									 dNumGroups));
+									 dNumFinalGroups));
 		}
 	}
 
@@ -7158,6 +7172,21 @@ create_partial_grouping_paths(PlannerInfo *root,
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 
+	/*
+	 * The partially_grouped_rel could have been already created due to eager
+	 * aggregation.
+	 */
+	partially_grouped_rel = find_grouped_rel(root, input_rel->relids);
+	Assert(enable_eager_aggregate || partially_grouped_rel == NULL);
+
+	/*
+	 * It is possible that the partially_grouped_rel created by eager
+	 * aggregation is dummy.  In this case we just set it to NULL.  It might
+	 * be created again by the following logic if possible.
+	 */
+	if (partially_grouped_rel && IS_DUMMY_REL(partially_grouped_rel))
+		partially_grouped_rel = NULL;
+
 	/*
 	 * Consider whether we should generate partially aggregated non-partial
 	 * paths.  We can only do this if we have a non-partial path, and only if
@@ -7181,19 +7210,27 @@ create_partial_grouping_paths(PlannerInfo *root,
 	 * If we can't partially aggregate partial paths, and we can't partially
 	 * aggregate non-partial paths, then don't bother creating the new
 	 * RelOptInfo at all, unless the caller specified force_rel_creation.
+	 *
+	 * Note that the partially_grouped_rel could have been already created and
+	 * populated with appropriate paths by eager aggregation.
 	 */
 	if (cheapest_total_path == NULL &&
 		cheapest_partial_path == NULL &&
+		(partially_grouped_rel == NULL ||
+		 partially_grouped_rel->pathlist == NIL) &&
 		!force_rel_creation)
 		return NULL;
 
 	/*
 	 * Build a new upper relation to represent the result of partially
-	 * aggregating the rows from the input relation.
-	 */
-	partially_grouped_rel = fetch_upper_rel(root,
-											UPPERREL_PARTIAL_GROUP_AGG,
-											grouped_rel->relids);
+	 * aggregating the rows from the input relation.  The relation may already
+	 * exist due to eager aggregation, in which case we don't need to create
+	 * it.
+	 */
+	if (partially_grouped_rel == NULL)
+		partially_grouped_rel = fetch_upper_rel(root,
+												UPPERREL_PARTIAL_GROUP_AGG,
+												grouped_rel->relids);
 	partially_grouped_rel->consider_parallel =
 		grouped_rel->consider_parallel;
 	partially_grouped_rel->reloptkind = grouped_rel->reloptkind;
@@ -7202,6 +7239,14 @@ create_partial_grouping_paths(PlannerInfo *root,
 	partially_grouped_rel->useridiscurrent = grouped_rel->useridiscurrent;
 	partially_grouped_rel->fdwroutine = grouped_rel->fdwroutine;
 
+	/*
+	 * Partially-grouped partial paths may have been generated by eager
+	 * aggregation.  If we find that parallelism is not possible for
+	 * partially_grouped_rel, we need to drop these partial paths.
+	 */
+	if (!partially_grouped_rel->consider_parallel)
+		partially_grouped_rel->partial_pathlist = NIL;
+
 	/*
 	 * Build target list for partial aggregate paths.  These paths cannot just
 	 * emit the same tlist as regular aggregate paths, because (1) we must
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index 45e8b74f94..0e4c7b2b2d 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -499,6 +499,66 @@ adjust_appendrel_attrs_mutator(Node *node,
 		return (Node *) newinfo;
 	}
 
+	/*
+	 * We have to process RelAggInfo nodes specially.
+	 */
+	if (IsA(node, RelAggInfo))
+	{
+		RelAggInfo *oldinfo = (RelAggInfo *) node;
+		RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newinfo, oldinfo, sizeof(RelAggInfo));
+
+		newinfo->relids = adjust_child_relids(oldinfo->relids,
+											  context->nappinfos,
+											  context->appinfos);
+
+		newinfo->target = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+										   context);
+
+		newinfo->agg_input = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+										   context);
+
+		newinfo->group_clauses = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_clauses,
+										   context);
+
+		newinfo->group_exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+										   context);
+
+		return (Node *) newinfo;
+	}
+
+	/*
+	 * We have to process PathTarget nodes specially.
+	 */
+	if (IsA(node, PathTarget))
+	{
+		PathTarget *oldtarget = (PathTarget *) node;
+		PathTarget *newtarget = makeNode(PathTarget);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+		if (oldtarget->sortgrouprefs)
+		{
+			Size		nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+			newtarget->exprs = (List *)
+				adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+											   context);
+
+			newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+			memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+		}
+
+		return (Node *) newtarget;
+	}
+
 	/*
 	 * NOTE: we do not need to recurse into sublinks, because they should
 	 * already have been converted to subplans before we see them.
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 4f74cafa25..85e419160b 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -262,6 +262,12 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
  * unparameterized path, too, if there is one; the users of that list find
  * it more convenient if that's included.
  *
+ * cheapest_parameterized_paths also always includes the fewest-row
+ * unparameterized path, if there is one, for grouped relations.  Different
+ * paths of a grouped relation can have very different row counts, and in some
+ * cases the cheapest-total unparameterized path may not be the one with the
+ * fewest rows.
+ *
  * This is normally called only after we've finished constructing the path
  * list for the rel node.
  */
@@ -271,6 +277,7 @@ set_cheapest(RelOptInfo *parent_rel)
 	Path	   *cheapest_startup_path;
 	Path	   *cheapest_total_path;
 	Path	   *best_param_path;
+	Path	   *fewest_row_path;
 	List	   *parameterized_paths;
 	ListCell   *p;
 
@@ -280,6 +287,7 @@ set_cheapest(RelOptInfo *parent_rel)
 		elog(ERROR, "could not devise a query plan for the given query");
 
 	cheapest_startup_path = cheapest_total_path = best_param_path = NULL;
+	fewest_row_path = NULL;
 	parameterized_paths = NIL;
 
 	foreach(p, parent_rel->pathlist)
@@ -341,6 +349,8 @@ set_cheapest(RelOptInfo *parent_rel)
 			if (cheapest_total_path == NULL)
 			{
 				cheapest_startup_path = cheapest_total_path = path;
+				if (IS_GROUPED_REL(parent_rel))
+					fewest_row_path = path;
 				continue;
 			}
 
@@ -364,6 +374,27 @@ set_cheapest(RelOptInfo *parent_rel)
 				 compare_pathkeys(cheapest_total_path->pathkeys,
 								  path->pathkeys) == PATHKEYS_BETTER2))
 				cheapest_total_path = path;
+
+			/*
+			 * Find the fewest-row unparameterized path for a grouped
+			 * relation.  If we find two paths of the same row count, try to
+			 * keep the one with the cheaper total cost; if the costs are
+			 * identical, keep the better-sorted one.
+			 */
+			if (IS_GROUPED_REL(parent_rel))
+			{
+				if (fewest_row_path->rows > path->rows)
+					fewest_row_path = path;
+				else if (fewest_row_path->rows == path->rows)
+				{
+					cmp = compare_path_costs(fewest_row_path, path, TOTAL_COST);
+					if (cmp > 0 ||
+						(cmp == 0 &&
+						 compare_pathkeys(fewest_row_path->pathkeys,
+										  path->pathkeys) == PATHKEYS_BETTER2))
+						fewest_row_path = path;
+				}
+			}
 		}
 	}
 
@@ -371,6 +402,10 @@ set_cheapest(RelOptInfo *parent_rel)
 	if (cheapest_total_path)
 		parameterized_paths = lcons(cheapest_total_path, parameterized_paths);
 
+	/* Add fewest-row unparameterized path, if any, to parameterized_paths */
+	if (fewest_row_path && fewest_row_path != cheapest_total_path)
+		parameterized_paths = lcons(fewest_row_path, parameterized_paths);
+
 	/*
 	 * If there is no unparameterized path, use the best parameterized path as
 	 * cheapest_total_path (but not as cheapest_startup_path).
@@ -2787,8 +2822,7 @@ create_projection_path(PlannerInfo *root,
 	pathnode->path.pathtype = T_Result;
 	pathnode->path.parent = rel;
 	pathnode->path.pathtarget = target;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe &&
@@ -3043,8 +3077,7 @@ create_incremental_sort_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	/* Sort doesn't project, so use source path's pathtarget */
 	pathnode->path.pathtarget = subpath->pathtarget;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
@@ -3091,8 +3124,7 @@ create_sort_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	/* Sort doesn't project, so use source path's pathtarget */
 	pathnode->path.pathtarget = subpath->pathtarget;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
@@ -3253,8 +3285,7 @@ create_agg_path(PlannerInfo *root,
 	pathnode->path.pathtype = T_Agg;
 	pathnode->path.parent = rel;
 	pathnode->path.pathtarget = target;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index f96573eb5d..d349ae521b 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,8 @@
 
 #include <limits.h>
 
+#include "access/nbtree.h"
+#include "catalog/pg_constraint.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
@@ -27,19 +29,27 @@
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
 #include "parser/parse_relation.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/typcache.h"
 
 
-typedef struct JoinHashEntry
+/*
+ * An entry of the hash table that we use to make lookups of RelOptInfo
+ * structures more efficient.
+ */
+typedef struct RelHashEntry
 {
-	Relids		join_relids;	/* hash key --- MUST BE FIRST */
-	RelOptInfo *join_rel;
-} JoinHashEntry;
+	Relids		relids;			/* hash key --- MUST BE FIRST */
+	RelOptInfo *rel;
+} RelHashEntry;
 
 static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
 								RelOptInfo *input_rel,
@@ -83,7 +93,17 @@ static void build_child_join_reltarget(PlannerInfo *root,
 									   RelOptInfo *childrel,
 									   int nappinfos,
 									   AppendRelInfo **appinfos);
-
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+													RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+								  PathTarget *target, PathTarget *agg_input,
+								  List **group_clauses, List **group_exprs);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
+
+/* Minimum row reduction ratio at which a grouped path is considered useful */
+#define EAGER_AGGREGATE_RATIO 0.5
 
 /*
  * setup_simple_rel_arrays
@@ -276,6 +296,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	rel->joininfo = NIL;
 	rel->has_eclass_joins = false;
 	rel->consider_partitionwise_join = false;	/* might get changed later */
+	rel->agg_info = NULL;
 	rel->part_scheme = NULL;
 	rel->nparts = -1;
 	rel->boundinfo = NULL;
@@ -406,6 +427,99 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	return rel;
 }
 
+/*
+ * build_simple_grouped_rel
+ *	  Construct a new RelOptInfo for a grouped base relation out of an existing
+ *	  non-grouped base relation.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel_plain)
+{
+	RelOptInfo *rel_grouped;
+	RelAggInfo *agg_info;
+
+	/*
+	 * We should have available aggregate expressions and grouping
+	 * expressions, otherwise we cannot reach here.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/* nothing to do for dummy rel */
+	if (IS_DUMMY_REL(rel_plain))
+		return NULL;
+
+	/*
+	 * Prepare the information needed to create grouped paths for this base
+	 * relation.
+	 */
+	agg_info = create_rel_agg_info(root, rel_plain);
+	if (agg_info == NULL)
+		return NULL;
+
+	/*
+	 * If the grouped paths for the given base relation are not considered
+	 * useful, do not build the grouped relation.
+	 */
+	if (!agg_info->agg_useful)
+		return NULL;
+
+	/* build a grouped relation out of the plain relation */
+	rel_grouped = build_grouped_rel(root, rel_plain);
+	rel_grouped->reltarget = agg_info->target;
+	rel_grouped->rows = agg_info->grouped_rows;
+	rel_grouped->agg_info = agg_info;
+
+	return rel_grouped;
+}
+
+/*
+ * build_grouped_rel
+ *	  Build a grouped relation by flat copying a plain relation and resetting
+ *	  the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel_plain)
+{
+	RelOptInfo *rel_grouped;
+
+	rel_grouped = makeNode(RelOptInfo);
+	memcpy(rel_grouped, rel_plain, sizeof(RelOptInfo));
+
+	/*
+	 * clear path info
+	 */
+	rel_grouped->pathlist = NIL;
+	rel_grouped->ppilist = NIL;
+	rel_grouped->partial_pathlist = NIL;
+	rel_grouped->cheapest_startup_path = NULL;
+	rel_grouped->cheapest_total_path = NULL;
+	rel_grouped->cheapest_unique_path = NULL;
+	rel_grouped->cheapest_parameterized_paths = NIL;
+
+	/*
+	 * clear partition info
+	 */
+	rel_grouped->part_scheme = NULL;
+	rel_grouped->nparts = -1;
+	rel_grouped->boundinfo = NULL;
+	rel_grouped->partbounds_merged = false;
+	rel_grouped->partition_qual = NIL;
+	rel_grouped->part_rels = NULL;
+	rel_grouped->live_parts = NULL;
+	rel_grouped->all_partrels = NULL;
+	rel_grouped->partexprs = NULL;
+	rel_grouped->nullable_partexprs = NULL;
+	rel_grouped->consider_partitionwise_join = false;
+
+	/*
+	 * clear size estimates
+	 */
+	rel_grouped->rows = 0;
+
+	return rel_grouped;
+}
+
 /*
  * find_base_rel
  *	  Find a base or otherrel relation entry, which must already exist.
@@ -479,11 +593,11 @@ find_base_rel_ignore_join(PlannerInfo *root, int relid)
 }
 
 /*
- * build_join_rel_hash
- *	  Construct the auxiliary hash table for join relations.
+ * build_rel_hash
+ *	  Construct the auxiliary hash table for relations.
  */
 static void
-build_join_rel_hash(PlannerInfo *root)
+build_rel_hash(RelInfoList *list)
 {
 	HTAB	   *hashtab;
 	HASHCTL		hash_ctl;
@@ -491,47 +605,46 @@ build_join_rel_hash(PlannerInfo *root)
 
 	/* Create the hash table */
 	hash_ctl.keysize = sizeof(Relids);
-	hash_ctl.entrysize = sizeof(JoinHashEntry);
+	hash_ctl.entrysize = sizeof(RelHashEntry);
 	hash_ctl.hash = bitmap_hash;
 	hash_ctl.match = bitmap_match;
 	hash_ctl.hcxt = CurrentMemoryContext;
-	hashtab = hash_create("JoinRelHashTable",
+	hashtab = hash_create("RelHashTable",
 						  256L,
 						  &hash_ctl,
 						  HASH_ELEM | HASH_FUNCTION | HASH_COMPARE | HASH_CONTEXT);
 
-	/* Insert all the already-existing joinrels */
-	foreach(l, root->join_rel_list)
+	/* Insert all the already-existing RelOptInfos */
+	foreach(l, list->items)
 	{
 		RelOptInfo *rel = (RelOptInfo *) lfirst(l);
-		JoinHashEntry *hentry;
+		RelHashEntry *hentry;
 		bool		found;
 
-		hentry = (JoinHashEntry *) hash_search(hashtab,
-											   &(rel->relids),
-											   HASH_ENTER,
-											   &found);
+		hentry = (RelHashEntry *) hash_search(hashtab,
+											  &(rel->relids),
+											  HASH_ENTER,
+											  &found);
 		Assert(!found);
-		hentry->join_rel = rel;
+		hentry->rel = rel;
 	}
 
-	root->join_rel_hash = hashtab;
+	list->hash = hashtab;
 }
 
 /*
- * find_join_rel
- *	  Returns relation entry corresponding to 'relids' (a set of RT indexes),
- *	  or NULL if none exists.  This is for join relations.
+ * find_rel_info
+ *	  Find a RelOptInfo entry corresponding to 'relids'.
  */
-RelOptInfo *
-find_join_rel(PlannerInfo *root, Relids relids)
+static RelOptInfo *
+find_rel_info(RelInfoList *list, Relids relids)
 {
 	/*
 	 * Switch to using hash lookup when list grows "too long".  The threshold
 	 * is arbitrary and is known only here.
 	 */
-	if (!root->join_rel_hash && list_length(root->join_rel_list) > 32)
-		build_join_rel_hash(root);
+	if (!list->hash && list_length(list->items) > 32)
+		build_rel_hash(list);
 
 	/*
 	 * Use either hashtable lookup or linear search, as appropriate.
@@ -541,23 +654,23 @@ find_join_rel(PlannerInfo *root, Relids relids)
 	 * so would force relids out of a register and thus probably slow down the
 	 * list-search case.
 	 */
-	if (root->join_rel_hash)
+	if (list->hash)
 	{
 		Relids		hashkey = relids;
-		JoinHashEntry *hentry;
+		RelHashEntry *hentry;
 
-		hentry = (JoinHashEntry *) hash_search(root->join_rel_hash,
-											   &hashkey,
-											   HASH_FIND,
-											   NULL);
+		hentry = (RelHashEntry *) hash_search(list->hash,
+											  &hashkey,
+											  HASH_FIND,
+											  NULL);
 		if (hentry)
-			return hentry->join_rel;
+			return hentry->rel;
 	}
 	else
 	{
 		ListCell   *l;
 
-		foreach(l, root->join_rel_list)
+		foreach(l, list->items)
 		{
 			RelOptInfo *rel = (RelOptInfo *) lfirst(l);
 
@@ -569,6 +682,28 @@ find_join_rel(PlannerInfo *root, Relids relids)
 	return NULL;
 }
 
+/*
+ * find_join_rel
+ *	  Returns relation entry corresponding to 'relids' (a set of RT indexes),
+ *	  or NULL if none exists.  This is for join relations.
+ */
+RelOptInfo *
+find_join_rel(PlannerInfo *root, Relids relids)
+{
+	return find_rel_info(root->join_rel_list, relids);
+}
+
+/*
+ * find_grouped_rel
+ *	  Returns relation entry corresponding to 'relids' (a set of RT indexes),
+ *	  or NULL if none exists.  This is for grouped relations.
+ */
+RelOptInfo *
+find_grouped_rel(PlannerInfo *root, Relids relids)
+{
+	return find_rel_info(root->grouped_rel_list, relids);
+}
+
 /*
  * set_foreign_rel_properties
  *		Set up foreign-join fields if outer and inner relation are foreign
@@ -619,31 +754,53 @@ set_foreign_rel_properties(RelOptInfo *joinrel, RelOptInfo *outer_rel,
 }
 
 /*
- * add_join_rel
- *		Add given join relation to the list of join relations in the given
- *		PlannerInfo. Also add it to the auxiliary hashtable if there is one.
+ * add_rel_info
+ *		Add given relation to the list, and also add it to the auxiliary
+ *		hashtable if there is one.
  */
 static void
-add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
+add_rel_info(RelInfoList *list, RelOptInfo *rel)
 {
-	/* GEQO requires us to append the new joinrel to the end of the list! */
-	root->join_rel_list = lappend(root->join_rel_list, joinrel);
+	/* GEQO requires us to append the new relation to the end of the list! */
+	list->items = lappend(list->items, rel);
 
 	/* store it into the auxiliary hashtable if there is one. */
-	if (root->join_rel_hash)
+	if (list->hash)
 	{
-		JoinHashEntry *hentry;
+		RelHashEntry *hentry;
 		bool		found;
 
-		hentry = (JoinHashEntry *) hash_search(root->join_rel_hash,
-											   &(joinrel->relids),
-											   HASH_ENTER,
-											   &found);
+		hentry = (RelHashEntry *) hash_search(list->hash,
+											  &(rel->relids),
+											  HASH_ENTER,
+											  &found);
 		Assert(!found);
-		hentry->join_rel = joinrel;
+		hentry->rel = rel;
 	}
 }
 
+/*
+ * add_join_rel
+ *		Add given join relation to the list of join relations in the given
+ *		PlannerInfo.
+ */
+static void
+add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
+{
+	add_rel_info(root->join_rel_list, joinrel);
+}
+
+/*
+ * add_grouped_rel
+ *		Add given grouped relation to the list of grouped relations in the
+ *		given PlannerInfo.
+ */
+void
+add_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	add_rel_info(root->grouped_rel_list, rel);
+}
+
 /*
  * build_join_rel
  *	  Returns relation entry corresponding to the union of two given rels,
@@ -755,6 +912,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
 	joinrel->parent = NULL;
 	joinrel->top_parent = NULL;
 	joinrel->top_parent_relids = NULL;
@@ -939,6 +1097,7 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
 	joinrel->parent = parent_joinrel;
 	joinrel->top_parent = parent_joinrel->top_parent ? parent_joinrel->top_parent : parent_joinrel;
 	joinrel->top_parent_relids = joinrel->top_parent->relids;
@@ -2518,3 +2677,508 @@ build_child_join_reltarget(PlannerInfo *root,
 	childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
 	childrel->reltarget->width = parentrel->reltarget->width;
 }
+
+/*
+ * create_rel_agg_info
+ *	  Create the RelAggInfo structure for the given relation if it can produce
+ *	  grouped paths.  The given relation is the non-grouped one which has the
+ *	  reltarget already constructed.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	RelAggInfo *result;
+	PathTarget *agg_input;
+	PathTarget *target;
+	List	   *group_clauses = NIL;
+	List	   *group_exprs = NIL;
+
+	/*
+	 * The lists of aggregate expressions and grouping expressions should have
+	 * been constructed.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/*
+	 * If this is a child rel, the grouped rel for its parent rel must have
+	 * been created if it can.  So we can just use parent's RelAggInfo if
+	 * there is one, with appropriate variable substitutions.
+	 */
+	if (IS_OTHER_REL(rel))
+	{
+		RelOptInfo *rel_grouped;
+		RelAggInfo *agg_info;
+
+		Assert(!bms_is_empty(rel->top_parent_relids));
+		rel_grouped = find_grouped_rel(root, rel->top_parent_relids);
+
+		if (rel_grouped == NULL)
+			return NULL;
+
+		Assert(IS_GROUPED_REL(rel_grouped));
+		/* Must do multi-level transformation */
+		agg_info = (RelAggInfo *)
+			adjust_appendrel_attrs_multilevel(root,
+											  (Node *) rel_grouped->agg_info,
+											  rel,
+											  rel->top_parent);
+
+		agg_info->grouped_rows =
+			estimate_num_groups(root, agg_info->group_exprs,
+								rel->rows, NULL, NULL);
+
+		/*
+		 * The grouped paths for the given relation are considered useful iff
+		 * the row reduction ratio is at least EAGER_AGGREGATE_RATIO.
+		 */
+		agg_info->agg_useful =
+			(agg_info->grouped_rows <= rel->rows * (1 - EAGER_AGGREGATE_RATIO));
+
+		return agg_info;
+	}
+
+	/* Check if it's possible to produce grouped paths for this relation. */
+	if (!eager_aggregation_possible_for_relation(root, rel))
+		return NULL;
+
+	/*
+	 * Create targets for the grouped paths and for the input paths of the
+	 * grouped paths.
+	 */
+	target = create_empty_pathtarget();
+	agg_input = create_empty_pathtarget();
+
+	/* ... and initialize these targets */
+	if (!init_grouping_targets(root, rel, target, agg_input,
+							   &group_clauses, &group_exprs))
+		return NULL;
+
+	/*
+	 * Eager aggregation is not applicable if there are no available grouping
+	 * expressions.
+	 */
+	if (list_length(group_clauses) == 0)
+		return NULL;
+
+	/* build the RelAggInfo result */
+	result = makeNode(RelAggInfo);
+
+	result->group_clauses = group_clauses;
+	result->group_exprs = group_exprs;
+
+	/* Calculate pathkeys that represent these grouping requirements */
+	result->group_pathkeys =
+		make_pathkeys_for_sortclauses(root, result->group_clauses,
+									  make_tlist_from_pathtarget(target));
+
+	/* Add aggregates to the grouping target */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		Aggref	   *aggref;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		aggref = (Aggref *) copyObject(ac_info->aggref);
+		mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+		add_column_to_pathtarget(target, (Expr *) aggref, 0);
+	}
+
+	/* Set the estimated eval cost and output width for both targets */
+	set_pathtarget_cost_width(root, target);
+	set_pathtarget_cost_width(root, agg_input);
+
+	result->relids = bms_copy(rel->relids);
+	result->target = target;
+	result->agg_input = agg_input;
+	result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+											   rel->rows, NULL, NULL);
+
+	/*
+	 * The grouped paths for the given relation are considered useful iff the
+	 * row reduction ratio is at least EAGER_AGGREGATE_RATIO.
+	 */
+	result->agg_useful =
+		(result->grouped_rows <= rel->rows * (1 - EAGER_AGGREGATE_RATIO));
+
+	return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ *	  Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	int			cur_relid;
+
+	/*
+	 * Check to see if the given relation is in the nullable side of an outer
+	 * join.  In this case, we cannot push a partial aggregation down to the
+	 * relation, because the NULL-extended rows produced by the outer join
+	 * would not be available when we perform the partial aggregation, while
+	 * with a non-eager-aggregation plan these rows are available for the
+	 * top-level aggregation.  Doing so may result in the rows being grouped
+	 * differently than expected, or produce incorrect values from the
+	 * aggregate functions.
+	 */
+	cur_relid = -1;
+	while ((cur_relid = bms_next_member(rel->relids, cur_relid)) >= 0)
+	{
+		RelOptInfo *baserel = find_base_rel_ignore_join(root, cur_relid);
+
+		if (baserel == NULL)
+			continue;			/* ignore outer joins in rel->relids */
+
+		if (!bms_is_subset(baserel->nulling_relids, rel->relids))
+			return false;
+	}
+
+	/*
+	 * For now we don't try to support PlaceHolderVars.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = lfirst(lc);
+
+		if (IsA(expr, PlaceHolderVar))
+			return false;
+	}
+
+	/* Caller should only pass base relations or joins. */
+	Assert(rel->reloptkind == RELOPT_BASEREL ||
+		   rel->reloptkind == RELOPT_JOINREL);
+
+	/*
+	 * Check if all aggregate expressions can be evaluated on this relation
+	 * level.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		/*
+		 * Give up if any aggregate needs relations other than the current
+		 * one.
+		 *
+		 * If the aggregate needs the current rel plus anything else, grouping
+		 * the current rel could make some input variables unavailable for the
+		 * higher aggregate and also reduce the number of input rows it
+		 * receives.
+		 *
+		 * If the aggregate does not need the current rel at all, then the
+		 * current rel should not be grouped, as we do not support joining two
+		 * grouped relations.
+		 */
+		if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * init_grouping_targets
+ *	  Initialize the target for grouped paths (target) as well as the target
+ *	  for paths that generate input for the grouped paths (agg_input).
+ *
+ * We also construct the list of SortGroupClauses and the list of grouping
+ * expressions for the partial aggregation, and return them in *group_clauses
+ * and *group_exprs.
+ *
+ * Return true if the targets could be initialized, false otherwise.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+					  PathTarget *target, PathTarget *agg_input,
+					  List **group_clauses, List **group_exprs)
+{
+	ListCell   *lc;
+	List	   *possibly_dependent = NIL;
+	Index		maxSortGroupRef;
+
+	/* Identify the max sortgroupref */
+	maxSortGroupRef = 0;
+	foreach(lc, root->processed_tlist)
+	{
+		Index		ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+		if (ref > maxSortGroupRef)
+			maxSortGroupRef = ref;
+	}
+
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Index		sortgroupref;
+
+		/*
+		 * Given that PlaceHolderVar currently prevents us from doing eager
+		 * aggregation, the source target cannot contain anything more complex
+		 * than a Var.
+		 */
+		Assert(IsA(expr, Var));
+
+		/* Get the sortgroupref if the expr can act as grouping expression. */
+		sortgroupref = get_expression_sortgroupref(root, expr);
+		if (sortgroupref > 0)
+		{
+			SortGroupClause *sgc;
+
+			/* Find the matching SortGroupClause */
+			sgc = get_sortgroupref_clause(sortgroupref, root->processed_groupClause);
+			Assert(sgc->tleSortGroupRef <= maxSortGroupRef);
+
+			/*
+			 * If the target expression can be used as a grouping key, it
+			 * should be emitted by the grouped paths that have been pushed
+			 * down to this relation level.
+			 */
+			add_column_to_pathtarget(target, expr, sortgroupref);
+
+			/*
+			 * ... and it also should be emitted by the input paths.
+			 */
+			add_column_to_pathtarget(agg_input, expr, sortgroupref);
+
+			/*
+			 * Record this SortGroupClause and grouping expression.  Note that
+			 * this SortGroupClause might have already been recorded.
+			 */
+			if (!list_member(*group_clauses, sgc))
+			{
+				*group_clauses = lappend(*group_clauses, sgc);
+				*group_exprs = lappend(*group_exprs, expr);
+			}
+		}
+		else if (is_var_needed_by_join(root, (Var *) expr, rel))
+		{
+			/*
+			 * The expression is needed for an upper join but is neither in
+			 * the GROUP BY clause nor derivable from it using EC (otherwise,
+			 * it would have already been included in the targets above).  We
+			 * need to create a special SortGroupClause for this expression.
+			 *
+			 * It is important to include such expressions in the grouping
+			 * keys.  This is essential to ensure that an aggregated row from
+			 * the partial aggregation matches the other side of the join if
+			 * and only if each row in the partial group does.  This ensures
+			 * that all rows within the same partial group share the same
+			 * 'destiny', which is crucial for maintaining correctness.
+			 */
+			SortGroupClause *sgc;
+			TypeCacheEntry *tce;
+			Oid			equalimageproc;
+
+			/*
+			 * But first, check if equality implies image equality for this
+			 * expression.  If not, we cannot use it as a grouping key.  See
+			 * comments in create_grouping_expr_infos().
+			 */
+			tce = lookup_type_cache(exprType((Node *) expr),
+									TYPECACHE_BTREE_OPFAMILY);
+			if (!OidIsValid(tce->btree_opf) ||
+				!OidIsValid(tce->btree_opintype))
+				return false;
+
+			equalimageproc = get_opfamily_proc(tce->btree_opf,
+											   tce->btree_opintype,
+											   tce->btree_opintype,
+											   BTEQUALIMAGE_PROC);
+			if (!OidIsValid(equalimageproc) ||
+				!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+												   tce->typcollation,
+												   ObjectIdGetDatum(tce->btree_opintype))))
+				return false;
+
+			/* Create the SortGroupClause. */
+			sgc = makeNode(SortGroupClause);
+
+			/* Initialize the SortGroupClause. */
+			sgc->tleSortGroupRef = ++maxSortGroupRef;
+			get_sort_group_operators(exprType((Node *) expr),
+									 false, true, false,
+									 &sgc->sortop, &sgc->eqop, NULL,
+									 &sgc->hashable);
+
+			/* This expression should be emitted by the grouped paths */
+			add_column_to_pathtarget(target, expr, sgc->tleSortGroupRef);
+
+			/* ... and it also should be emitted by the input paths. */
+			add_column_to_pathtarget(agg_input, expr, sgc->tleSortGroupRef);
+
+			/* Record this SortGroupClause and grouping expression */
+			*group_clauses = lappend(*group_clauses, sgc);
+			*group_exprs = lappend(*group_exprs, expr);
+		}
+		else if (is_var_in_aggref_only(root, (Var *) expr))
+		{
+			/*
+			 * The expression is referenced by an aggregate function pushed
+			 * down to this relation and does not appear elsewhere in the
+			 * targetlist or havingQual.  Add it to 'agg_input' but not to
+			 * 'target'.
+			 */
+			add_new_column_to_pathtarget(agg_input, expr);
+		}
+		else
+		{
+			/*
+			 * The expression may be functionally dependent on other
+			 * expressions in the target, but we cannot verify this until all
+			 * target expressions have been constructed.
+			 */
+			possibly_dependent = lappend(possibly_dependent, expr);
+		}
+	}
+
+	/*
+	 * Now we can verify whether an expression is functionally dependent on
+	 * others.
+	 */
+	foreach(lc, possibly_dependent)
+	{
+		Var		   *tvar;
+		List	   *deps = NIL;
+		RangeTblEntry *rte;
+
+		tvar = lfirst_node(Var, lc);
+		rte = root->simple_rte_array[tvar->varno];
+
+		if (check_functional_grouping(rte->relid, tvar->varno,
+									  tvar->varlevelsup,
+									  target->exprs, &deps))
+		{
+			/*
+			 * The expression is functionally dependent on other target
+			 * expressions, so it can be included in the targets.  Since it
+			 * will not be used as a grouping key, a sortgroupref is not
+			 * needed for it.
+			 */
+			add_new_column_to_pathtarget(target, (Expr *) tvar);
+			add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+		}
+		else
+		{
+			/*
+			 * We may arrive here with a grouping expression that is proven
+			 * redundant by EquivalenceClass processing, such as 't1.a' in the
+			 * query below.
+			 *
+			 * select max(t1.c) from t t1, t t2 where t1.a = 1 group by t1.a,
+			 * t1.b;
+			 *
+			 * For now we just give up in this case.
+			 */
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ *	  Check whether the given Var appears in aggregate expressions and not
+ *	  elsewhere in the targetlist or havingQual.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+	ListCell   *lc;
+
+	/*
+	 * Search the list of aggregate expressions for the Var.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		List	   *vars;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+			continue;
+
+		vars = pull_var_clause((Node *) ac_info->aggref,
+							   PVC_RECURSE_AGGREGATES |
+							   PVC_RECURSE_WINDOWFUNCS |
+							   PVC_RECURSE_PLACEHOLDERS);
+
+		if (list_member(vars, var))
+		{
+			list_free(vars);
+			break;
+		}
+
+		list_free(vars);
+	}
+
+	return (lc != NULL && !list_member(root->tlist_vars, var));
+}
+
+/*
+ * is_var_needed_by_join
+ *	  Check if the given Var is needed by joins above the current rel.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+	Relids		relids;
+	int			attno;
+	RelOptInfo *baserel;
+
+	/*
+	 * Note that when checking if the Var is needed by joins above, we want to
+	 * exclude cases where the Var is only needed in the final output.  So
+	 * include "relation 0" in the check.
+	 */
+	relids = bms_copy(rel->relids);
+	relids = bms_add_member(relids, 0);
+
+	baserel = find_base_rel(root, var->varno);
+	attno = var->varattno - baserel->min_attr;
+
+	return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ *	  Return sortgroupref if the given 'expr' can be used as a grouping key in
+ *	  grouped paths for base or join relations, or 0 otherwise.
+ *
+ * We first check if 'expr' is among the grouping expressions.  If it is not,
+ * we then check if 'expr' is known equal to any of the grouping expressions
+ * due to equivalence relationships.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->group_expr_list)
+	{
+		GroupExprInfo *ge_info = lfirst_node(GroupExprInfo, lc);
+
+		Assert(IsA(ge_info->expr, Var));
+
+		if (equal(ge_info->expr, expr) ||
+			exprs_known_equal(root, (Node *) expr, (Node *) ge_info->expr,
+							  ge_info->btree_opfamily))
+		{
+			Assert(ge_info->sortgroupref > 0);
+
+			return ge_info->sortgroupref;
+		}
+	}
+
+	/* The expression cannot be used as a grouping key. */
+	return 0;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index 8cf1afbad2..95bd80c4dd 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -929,6 +929,16 @@ struct config_bool ConfigureNamesBool[] =
 		false,
 		NULL, NULL, NULL
 	},
+	{
+		{"enable_eager_aggregate", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables eager aggregation."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&enable_eager_aggregate,
+		false,
+		NULL, NULL, NULL
+	},
 	{
 		{"enable_parallel_append", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables the planner's use of parallel append plans."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a2ac7575ca..154fc5b1fa 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -416,6 +416,7 @@
 #enable_tidscan = on
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
+#enable_eager_aggregate = off
 
 # - Planner Cost Constants -
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 58748d2ca6..0b2c51f73e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -80,6 +80,25 @@ typedef enum UpperRelationKind
 	/* NB: UPPERREL_FINAL must be last enum entry; it's used to size arrays */
 } UpperRelationKind;
 
+/*
+ * A structure consisting of a list and a hash table to store relations.
+ *
+ * For small problems we just scan the list to do lookups, but when there are
+ * many relations we build a hash table for faster lookups.  The hash table is
+ * present and valid when 'hash' is not NULL.  Note that we still maintain the
+ * list even when using the hash table for lookups; this simplifies life for
+ * GEQO.
+ */
+typedef struct RelInfoList
+{
+	pg_node_attr(no_copy_equal, no_read)
+
+	NodeTag		type;
+
+	List	   *items;
+	struct HTAB *hash pg_node_attr(read_write_ignore);
+} RelInfoList;
+
 /*----------
  * PlannerGlobal
  *		Global information for planning/optimization
@@ -270,15 +289,16 @@ struct PlannerInfo
 
 	/*
 	 * join_rel_list is a list of all join-relation RelOptInfos we have
-	 * considered in this planning run.  For small problems we just scan the
-	 * list to do lookups, but when there are many join relations we build a
-	 * hash table for faster lookups.  The hash table is present and valid
-	 * when join_rel_hash is not NULL.  Note that we still maintain the list
-	 * even when using the hash table for lookups; this simplifies life for
-	 * GEQO.
+	 * considered in this planning run.
 	 */
-	List	   *join_rel_list;
-	struct HTAB *join_rel_hash pg_node_attr(read_write_ignore);
+	RelInfoList *join_rel_list; /* list of join-relation RelOptInfos */
+
+	/*
+	 * grouped_rel_list is a list of all grouped-relation RelOptInfos we have
+	 * considered in this planning run.  This is only used by eager
+	 * aggregation.
+	 */
+	RelInfoList *grouped_rel_list;	/* list of grouped-relation RelOptInfos */
 
 	/*
 	 * When doing a dynamic-programming-style join search, join_rel_level[k]
@@ -373,6 +393,15 @@ struct PlannerInfo
 	/* list of PlaceHolderInfos */
 	List	   *placeholder_list;
 
+	/* list of AggClauseInfos */
+	List	   *agg_clause_list;
+
+	/* list of GroupExprInfos */
+	List	   *group_expr_list;
+
+	/* list of plain Vars contained in targetlist and havingQual */
+	List	   *tlist_vars;
+
 	/* array of PlaceHolderInfos indexed by phid */
 	struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
 	/* allocated size of array */
@@ -998,6 +1027,12 @@ typedef struct RelOptInfo
 	/* consider partitionwise join paths? (if partitioned rel) */
 	bool		consider_partitionwise_join;
 
+	/*
+	 * used by eager aggregation:
+	 */
+	/* information needed to create grouped paths */
+	struct RelAggInfo *agg_info;
+
 	/*
 	 * inheritance links, if this is an otherrel (otherwise NULL):
 	 */
@@ -1071,6 +1106,68 @@ typedef struct RelOptInfo
 	((rel)->part_scheme && (rel)->boundinfo && (rel)->nparts > 0 && \
 	 (rel)->part_rels && (rel)->partexprs && (rel)->nullable_partexprs)
 
+/*
+ * Is the given relation a grouped relation?
+ */
+#define IS_GROUPED_REL(rel) \
+	((rel)->agg_info != NULL)
+
+/*
+ * RelAggInfo
+ *		Information needed to create grouped paths for base and join rels.
+ *
+ * "relids" is the set of relation identifiers (RT indexes).
+ *
+ * "target" is the output tlist for the grouped paths.
+ *
+ * "agg_input" is the output tlist for the paths that provide input to the
+ * grouped paths.  One difference from the reltarget of the non-grouped
+ * relation is that agg_input has its sortgrouprefs[] initialized.
+ *
+ * "grouped_rows" is the estimated number of result tuples of the grouped
+ * relation.
+ *
+ * "group_clauses", "group_exprs" and "group_pathkeys" are lists of
+ * SortGroupClauses, the corresponding grouping expressions and PathKeys
+ * respectively.
+ *
+ * "agg_useful" is a flag to indicate whether the grouped paths are considered
+ * useful.
+ */
+typedef struct RelAggInfo
+{
+	pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* set of base + OJ relids (rangetable indexes) */
+	Relids		relids;
+
+	/*
+	 * default result targetlist for Paths scanning this grouped relation;
+	 * list of Vars/Exprs, cost, width
+	 */
+	struct PathTarget *target;
+
+	/*
+	 * the targetlist for Paths that provide input to the grouped paths
+	 */
+	struct PathTarget *agg_input;
+
+	/* estimated number of result tuples */
+	Cardinality grouped_rows;
+
+	/* a list of SortGroupClauses */
+	List	   *group_clauses;
+	/* a list of grouping expressions */
+	List	   *group_exprs;
+	/* a list of PathKeys */
+	List	   *group_pathkeys;
+
+	/* the grouped paths are considered useful? */
+	bool		agg_useful;
+} RelAggInfo;
+
 /*
  * IndexOptInfo
  *		Per-index information for planning/optimization
@@ -3144,6 +3241,41 @@ typedef struct MinMaxAggInfo
 	Param	   *param;
 } MinMaxAggInfo;
 
+/*
+ * The aggregate expressions that appear in targetlist and having clauses
+ */
+typedef struct AggClauseInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the Aggref expr */
+	Aggref	   *aggref;
+
+	/* lowest level we can evaluate this aggregate at */
+	Relids		agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * The grouping expressions that appear in grouping clauses
+ */
+typedef struct GroupExprInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the represented expression */
+	Expr	   *expr;
+
+	/* the tleSortGroupRef of the corresponding SortGroupClause */
+	Index		sortgroupref;
+
+	/* btree opfamily defining the ordering */
+	Oid			btree_opfamily;
+} GroupExprInfo;
+
 /*
  * At runtime, PARAM_EXEC slots are used to pass values around from one plan
  * node to another.  They can be used to pass values down into subqueries (for
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 5a6d0350c1..8dde37cbff 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -313,10 +313,16 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
 									RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
+											RelOptInfo *rel_plain);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
+									 RelOptInfo *rel_plain);
 extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
 extern RelOptInfo *find_join_rel(PlannerInfo *root, Relids relids);
+extern void add_grouped_rel(PlannerInfo *root, RelOptInfo *rel);
+extern RelOptInfo *find_grouped_rel(PlannerInfo *root, Relids relids);
 extern RelOptInfo *build_join_rel(PlannerInfo *root,
 								  Relids joinrelids,
 								  RelOptInfo *outer_rel,
@@ -352,4 +358,5 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										SpecialJoinInfo *sjinfo,
 										int nappinfos, AppendRelInfo **appinfos);
 
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
 #endif							/* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 54869d4401..a189b7f18c 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,6 +21,7 @@
  * allpaths.c
  */
 extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
 extern PGDLLIMPORT int geqo_threshold;
 extern PGDLLIMPORT int min_parallel_table_scan_size;
 extern PGDLLIMPORT int min_parallel_index_scan_size;
@@ -57,6 +58,10 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 								  bool override_rows);
 extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 										 bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root,
+								   RelOptInfo *rel_grouped,
+								   RelOptInfo *rel_plain,
+								   RelAggInfo *agg_info);
 extern int	compute_parallel_worker(RelOptInfo *rel, double heap_pages,
 									double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 0b6f0f7969..49614dbd75 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -75,6 +75,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
 									Relids where_needed);
 extern void remove_useless_groupby_columns(PlannerInfo *root);
+extern void setup_eager_aggregation(PlannerInfo *root);
 extern void find_lateral_references(PlannerInfo *root);
 extern void rebuild_lateral_attr_needed(PlannerInfo *root);
 extern void create_lateral_join_info(PlannerInfo *root);
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 0000000000..9f63472eff
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1308 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b
+                                 Sort Key: t2.b
+                                 ->  Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Hash Join
+                                 Output: t2.c, t2.b, t3.c
+                                 Hash Cond: (t3.a = t2.a)
+                                 ->  Seq Scan on public.eager_agg_t3 t3
+                                       Output: t3.a, t3.b, t3.c
+                                 ->  Hash
+                                       Output: t2.c, t2.b, t2.a
+                                       ->  Seq Scan on public.eager_agg_t2 t2
+                                             Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b, t3.c
+                                 Sort Key: t2.b
+                                 ->  Hash Join
+                                       Output: t2.c, t2.b, t3.c
+                                       Hash Cond: (t3.a = t2.a)
+                                       ->  Seq Scan on public.eager_agg_t3 t3
+                                             Output: t3.a, t3.b, t3.c
+                                       ->  Hash
+                                             Output: t2.c, t2.b, t2.a
+                                             ->  Seq Scan on public.eager_agg_t2 t2
+                                                   Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Right Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Sort
+   Output: t2.b, (avg(t2.c))
+   Sort Key: t2.b
+   ->  HashAggregate
+         Output: t2.b, avg(t2.c)
+         Group Key: t2.b
+         ->  Hash Right Join
+               Output: t2.b, t2.c
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on public.eager_agg_t2 t2
+                     Output: t2.a, t2.b, t2.c
+               ->  Hash
+                     Output: t1.b
+                     ->  Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ b | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   |    
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Gather Merge
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Workers Planned: 2
+         ->  Sort
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Sort Key: t1.a
+               ->  Parallel Hash Join
+                     Output: t1.a, (PARTIAL avg(t2.c))
+                     Hash Cond: (t1.b = t2.b)
+                     ->  Parallel Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.a, t1.b, t1.c
+                     ->  Parallel Hash
+                           Output: t2.b, (PARTIAL avg(t2.c))
+                           ->  Partial HashAggregate
+                                 Output: t2.b, PARTIAL avg(t2.c)
+                                 Group Key: t2.b
+                                 ->  Parallel Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (20) TO (30);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (20) TO (30);
+INSERT INTO eager_agg_tab1 SELECT i % 30, i % 20 FROM generate_series(0, 299, 2) i;
+INSERT INTO eager_agg_tab2 SELECT i % 20, i % 30 FROM generate_series(0, 299, 3) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x  | sum  | count 
+----+------+-------
+  0 |  500 |   100
+  6 | 1100 |   100
+ 12 |  700 |   100
+ 18 | 1300 |   100
+ 24 |  900 |   100
+(5 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t2.y, (sum(t1.y)), (count(*))
+   Sort Key: t2.y
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t2.y, sum(t1.y), count(*)
+               Group Key: t2.y
+               ->  Hash Join
+                     Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.y, t1.x
+         ->  Finalize HashAggregate
+               Output: t2_1.y, sum(t1_1.y), count(*)
+               Group Key: t2_1.y
+               ->  Hash Join
+                     Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.y, t1_1.x
+         ->  Finalize HashAggregate
+               Output: t2_2.y, sum(t1_2.y), count(*)
+               Group Key: t2_2.y
+               ->  Hash Join
+                     Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ y  | sum  | count 
+----+------+-------
+  0 |  500 |   100
+  6 | 1100 |   100
+ 12 |  700 |   100
+ 18 | 1300 |   100
+ 24 |  900 |   100
+(5 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+                                                 QUERY PLAN                                                 
+------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t2.x, (sum(t1.x)), (count(*))
+   Sort Key: t2.x
+   ->  Finalize HashAggregate
+         Output: t2.x, sum(t1.x), count(*)
+         Group Key: t2.x
+         Filter: (avg(t1.x) > '10'::numeric)
+         ->  Append
+               ->  Hash Join
+                     Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2_1
+                           Output: t2_1.x, t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1_1
+                                       Output: t1_1.x
+               ->  Hash Join
+                     Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_2
+                           Output: t2_2.x, t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_2
+                                       Output: t1_2.x
+               ->  Hash Join
+                     Output: t2_3.x, (PARTIAL sum(t1_3.x)), (PARTIAL count(*)), (PARTIAL avg(t1_3.x))
+                     Hash Cond: (t2_3.y = t1_3.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_3
+                           Output: t2_3.x, t2_3.y
+                     ->  Hash
+                           Output: t1_3.x, (PARTIAL sum(t1_3.x)), (PARTIAL count(*)), (PARTIAL avg(t1_3.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_3.x, PARTIAL sum(t1_3.x), PARTIAL count(*), PARTIAL avg(t1_3.x)
+                                 Group Key: t1_3.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_3
+                                       Output: t1_3.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+ x  | sum  | count 
+----+------+-------
+  2 |  600 |    50
+  4 | 1200 |    50
+  8 |  900 |    50
+ 12 |  600 |    50
+ 14 | 1200 |    50
+ 18 |  900 |    50
+(6 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y)))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y))
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y))
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y))
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  
+----+-------
+  0 | 10000
+  2 | 14000
+  4 | 18000
+  6 | 22000
+  8 | 26000
+ 10 | 10000
+ 12 | 14000
+ 14 | 18000
+ 16 | 22000
+ 18 | 26000
+ 20 | 10000
+ 22 | 14000
+ 24 | 18000
+ 26 | 22000
+ 28 | 26000
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t3.y, sum((t2.y + t3.y))
+   Group Key: t3.y
+   ->  Sort
+         Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+         Sort Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t2_1.x = t1_1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                           Group Key: t2_1.x, t3_1.y, t3_1.x
+                           ->  Incremental Sort
+                                 Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                 Sort Key: t2_1.x, t3_1.y
+                                 Presorted Key: t2_1.x
+                                 ->  Merge Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Merge Cond: (t2_1.x = t3_1.x)
+                                       ->  Sort
+                                             Output: t2_1.y, t2_1.x
+                                             Sort Key: t2_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t2_1
+                                                   Output: t2_1.y, t2_1.x
+                                       ->  Sort
+                                             Output: t3_1.y, t3_1.x
+                                             Sort Key: t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+                     ->  Hash
+                           Output: t1_1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p1 t1_1
+                                 Output: t1_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t2_2.x = t1_2.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                           Group Key: t2_2.x, t3_2.y, t3_2.x
+                           ->  Incremental Sort
+                                 Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                 Sort Key: t2_2.x, t3_2.y
+                                 Presorted Key: t2_2.x
+                                 ->  Merge Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Merge Cond: (t2_2.x = t3_2.x)
+                                       ->  Sort
+                                             Output: t2_2.y, t2_2.x
+                                             Sort Key: t2_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t2_2
+                                                   Output: t2_2.y, t2_2.x
+                                       ->  Sort
+                                             Output: t3_2.y, t3_2.x
+                                             Sort Key: t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+                     ->  Hash
+                           Output: t1_2.x
+                           ->  Seq Scan on public.eager_agg_tab1_p2 t1_2
+                                 Output: t1_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y)))
+                     Hash Cond: (t2_3.x = t1_3.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y))
+                           Group Key: t2_3.x, t3_3.y, t3_3.x
+                           ->  Incremental Sort
+                                 Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                 Sort Key: t2_3.x, t3_3.y
+                                 Presorted Key: t2_3.x
+                                 ->  Merge Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Merge Cond: (t2_3.x = t3_3.x)
+                                       ->  Sort
+                                             Output: t2_3.y, t2_3.x
+                                             Sort Key: t2_3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t2_3
+                                                   Output: t2_3.y, t2_3.x
+                                       ->  Sort
+                                             Output: t3_3.y, t3_3.x
+                                             Sort Key: t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_3
+                                                   Output: t3_3.y, t3_3.x
+                     ->  Hash
+                           Output: t1_3.x
+                           ->  Seq Scan on public.eager_agg_tab1_p3 t1_3
+                                 Output: t1_3.x
+(88 rows)
+
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |  sum  
+----+-------
+  0 |  7500
+  2 | 13500
+  4 | 19500
+  6 | 25500
+  8 | 31500
+ 10 | 22500
+ 12 | 28500
+ 14 | 34500
+ 16 | 40500
+ 18 | 46500
+(10 rows)
+
+RESET enable_hashagg;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.y, (sum(t2.y)), (count(*))
+   Sort Key: t1.y
+   ->  Finalize HashAggregate
+         Output: t1.y, sum(t2.y), count(*)
+         Group Key: t1.y
+         ->  Append
+               ->  Hash Join
+                     Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1_1
+                           Output: t1_1.y, t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2_1
+                                       Output: t2_1.y, t2_1.x
+               ->  Hash Join
+                     Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_2
+                           Output: t1_2.y, t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_2
+                                       Output: t2_2.y, t2_2.x
+               ->  Hash Join
+                     Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_3
+                           Output: t1_3.y, t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_3
+                                       Output: t2_3.y, t2_3.x
+               ->  Hash Join
+                     Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_4
+                           Output: t1_4.y, t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_4
+                                       Output: t2_4.y, t2_4.x
+               ->  Hash Join
+                     Output: t1_5.y, (PARTIAL sum(t2_5.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_5.x = t2_5.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_5
+                           Output: t1_5.y, t1_5.x
+                     ->  Hash
+                           Output: t2_5.x, (PARTIAL sum(t2_5.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_5.x, PARTIAL sum(t2_5.y), PARTIAL count(*)
+                                 Group Key: t2_5.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_5
+                                       Output: t2_5.y, t2_5.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                                QUERY PLAN                                                
+----------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y)), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t3.y
+   ->  Finalize HashAggregate
+         Output: t3.y, sum((t2.y + t3.y)), count(*)
+         Group Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.y, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x, t3_1.y, t3_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.y, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x, t3_2.y, t3_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_2
+                                                   Output: t3_2.y, t3_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.y, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x, t3_3.y, t3_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_3
+                                                   Output: t3_3.y, t3_3.x
+               ->  Hash Join
+                     Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.y, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.y, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x, t3_4.y, t3_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_4
+                                                   Output: t3_4.y, t3_4.x
+               ->  Hash Join
+                     Output: t3_5.y, (PARTIAL sum((t2_5.y + t3_5.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_5.x = t2_5.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_5
+                           Output: t1_5.x
+                     ->  Hash
+                           Output: t2_5.x, t3_5.y, t3_5.x, (PARTIAL sum((t2_5.y + t3_5.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_5.x, t3_5.y, t3_5.x, PARTIAL sum((t2_5.y + t3_5.y)), PARTIAL count(*)
+                                 Group Key: t2_5.x, t3_5.y, t3_5.x
+                                 ->  Hash Join
+                                       Output: t2_5.y, t2_5.x, t3_5.y, t3_5.x
+                                       Hash Cond: (t2_5.x = t3_5.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_5
+                                             Output: t2_5.y, t2_5.x
+                                       ->  Hash
+                                             Output: t3_5.y, t3_5.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_5
+                                                   Output: t3_5.y, t3_5.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 91089ac215..6370504377 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -151,6 +151,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_async_append            | on
  enable_bitmapscan              | on
  enable_distinct_reordering     | on
+ enable_eager_aggregate         | off
  enable_gathermerge             | on
  enable_group_by_reordering     | on
  enable_hashagg                 | on
@@ -171,7 +172,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(23 rows)
+(24 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 1edd9e45eb..4fc210e2ef 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -119,7 +119,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # The stats test resets stats, so nothing else needing stats access can be in
 # this group.
 # ----------
-test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats predicate
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats predicate eager_aggregate
 
 # event_trigger depends on create_am and cannot run concurrently with
 # any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 0000000000..4050e4df44
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,192 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (20) TO (30);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (20) TO (30);
+INSERT INTO eager_agg_tab1 SELECT i % 30, i % 20 FROM generate_series(0, 299, 2) i;
+INSERT INTO eager_agg_tab2 SELECT i % 20, i % 30 FROM generate_series(0, 299, 3) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+RESET enable_hashagg;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e1c4f913f8..95be701ec3 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -41,6 +41,7 @@ AfterTriggersTableData
 AfterTriggersTransData
 Agg
 AggClauseCosts
+AggClauseInfo
 AggInfo
 AggPath
 AggSplit
@@ -1065,6 +1066,7 @@ GrantTargetType
 Group
 GroupByOrdering
 GroupClause
+GroupExprInfo
 GroupPath
 GroupPathExtraData
 GroupResultPath
@@ -1297,7 +1299,6 @@ Join
 JoinCostWorkspace
 JoinDomain
 JoinExpr
-JoinHashEntry
 JoinPath
 JoinPathExtraData
 JoinState
@@ -2383,13 +2384,17 @@ ReindexObjectType
 ReindexParams
 ReindexStmt
 ReindexType
+RelAggInfo
 RelFileLocator
 RelFileLocatorBackend
 RelFileNumber
+RelHashEntry
 RelIdCacheEnt
 RelIdToTypeIdCacheEntry
 RelInfo
 RelInfoArr
+RelInfoList
+RelInfoListInfo
 RelMapFile
 RelMapping
 RelOptInfo
-- 
2.43.0




* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-01-09 03:15   ` jian he <[email protected]>
  2025-01-09 09:27     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: jian he @ 2025-01-09 03:15 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Robert Haas <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

Hi.
In create_grouping_expr_infos:

        tce = lookup_type_cache(exprType((Node *) tle->expr),
                                TYPECACHE_BTREE_OPFAMILY);
        if (!OidIsValid(tce->btree_opf) ||
            !OidIsValid(tce->btree_opintype))
            return;
       ....
        /*
         * Get the operator in the btree's opfamily.
         */
        eq_op = get_opfamily_member(tce->btree_opf,
                                    tce->btree_opintype,
                                    tce->btree_opintype,
                                    BTEqualStrategyNumber);
        if (!OidIsValid(eq_op))
            return;
        eq_opfamilies = get_mergejoin_opfamilies(eq_op);
        if (!eq_opfamilies)
            return;
        btree_opfamily = linitial_oid(eq_opfamilies);


If eq_op is valid, do we still need to call get_mergejoin_opfamilies?
Its output will be the same as tce->btree_opf,
and we have already checked that tce->btree_opf is valid.

In other words, I think eq_op being valid implies
that tce->btree_opf is the value (btree opfamily) we need.







* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-09 03:15   ` Re: Eager aggregation, take 3 jian he <[email protected]>
@ 2025-01-09 09:27     ` Richard Guo <[email protected]>
  0 siblings, 0 replies; 70+ messages in thread

From: Richard Guo @ 2025-01-09 09:27 UTC (permalink / raw)
  To: jian he <[email protected]>; +Cc: Robert Haas <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Thu, Jan 9, 2025 at 12:15 PM jian he <[email protected]> wrote:
> Hi.
> In create_grouping_expr_infos:
>
>         tce = lookup_type_cache(exprType((Node *) tle->expr),
>                                 TYPECACHE_BTREE_OPFAMILY);
>         if (!OidIsValid(tce->btree_opf) ||
>             !OidIsValid(tce->btree_opintype))
>             return;
>        ....
>         /*
>          * Get the operator in the btree's opfamily.
>          */
>         eq_op = get_opfamily_member(tce->btree_opf,
>                                     tce->btree_opintype,
>                                     tce->btree_opintype,
>                                     BTEqualStrategyNumber);
>         if (!OidIsValid(eq_op))
>             return;
>         eq_opfamilies = get_mergejoin_opfamilies(eq_op);
>         if (!eq_opfamilies)
>             return;
>         btree_opfamily = linitial_oid(eq_opfamilies);
>
>
> If eq_op is valid, do we still need to call get_mergejoin_opfamilies?
> Its output will be the same as tce->btree_opf,
> and we have already checked that tce->btree_opf is valid.
>
> In other words, I think eq_op being valid implies
> that tce->btree_opf is the value (btree opfamily) we need.

Nice catch!  Actually, we can use tce->btree_opf directly, without
needing to check its equality operator, since we know it's a btree
opfamily and it's valid.  If it were a different opfamily (such as a
hash opfamily), we would need to look up its equality operator, and
select some btree opfamily that that operator is part of.  But in this
case, that's not necessary.
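
As a rough illustration of the simplification described above, here is a
standalone model rather than the patch's actual code.  Oid, InvalidOid and
OidIsValid mirror the PostgreSQL names, but TypeCacheModel and
choose_btree_opfamily are hypothetical stand-ins (for the two type cache
fields involved and for the relevant step in create_grouping_expr_infos), so
that the sketch builds without the PostgreSQL headers:

```c
#include <assert.h>
#include <stddef.h>

/* Minimal stand-ins so the sketch builds outside the PostgreSQL tree. */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)
#define OidIsValid(objectId) ((objectId) != InvalidOid)

/* Hypothetical model of the two type cache fields used in this step. */
typedef struct TypeCacheModel
{
    Oid         btree_opf;      /* default btree opfamily, or InvalidOid */
    Oid         btree_opintype; /* that opfamily's input type, or InvalidOid */
} TypeCacheModel;

/*
 * Hypothetical helper: return the btree opfamily to use for a grouping
 * expression, or InvalidOid if the type has no usable btree opfamily.
 */
Oid
choose_btree_opfamily(const TypeCacheModel *tce)
{
    assert(tce != NULL);

    if (!OidIsValid(tce->btree_opf) ||
        !OidIsValid(tce->btree_opintype))
        return InvalidOid;

    /*
     * The opfamily is already known to be a valid btree opfamily, so no
     * equality-operator lookup and no get_mergejoin_opfamilies() call is
     * modeled here; the cached value is returned as-is.
     */
    return tce->btree_opf;
}
```

In this model an InvalidOid result corresponds to the early "return" in the
quoted code, i.e. giving up on eager aggregation for that grouping expression.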

Thanks
Richard







* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-01-13 02:04   ` Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Richard Guo @ 2025-01-13 02:04 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Sat, Dec 21, 2024 at 10:05 AM Richard Guo <[email protected]> wrote:
> Attached is the latest patch, which also includes some cosmetic
> tweaks.  I am seeking the possibility of pushing this by the end of
> January, so that I can have enough time to react to any bugs before
> the feature freeze.

Attached is an updated version of this patch that addresses Jian's
review comments, along with some more cosmetic tweaks.  I'm going to
be looking at this patch again from the point of view of committing
it, with the plan to commit it late this week or early next week,
barring any further comments or objections.

Thanks
Richard


Attachments:

  [application/octet-stream] v16-0001-Implement-Eager-Aggregation.patch (178.3K, 2-v16-0001-Implement-Eager-Aggregation.patch)
From 939ad5d47e6fdbc260fdf41b64ffe2bdd3e4ad2c Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Tue, 11 Jun 2024 15:59:19 +0900
Subject: [PATCH v16] Implement Eager Aggregation

Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.

A plan with eager aggregation looks like:

 EXPLAIN (COSTS OFF)
 SELECT a.i, avg(b.y)
 FROM a JOIN b ON a.i = b.j
 GROUP BY a.i;

 Finalize HashAggregate
   Group Key: a.i
   ->  Nested Loop
         ->  Partial HashAggregate
               Group Key: b.j
               ->  Seq Scan on b
         ->  Index Only Scan using a_pkey on a
               Index Cond: (i = b.j)

During the construction of the join tree, we evaluate each base or
join relation to determine if eager aggregation can be applied.  If
feasible, we create a separate RelOptInfo called a "grouped relation"
and store it in a dedicated list.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths during this phase.

Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
does not seem to be very useful and is currently not supported.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
'destiny', which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

Since eager aggregation can generate many grouped relations, we
introduce a RelInfoList structure, which encapsulates both a list and
a hash table, so that we can leverage the hash table for faster
lookups not only for join relations but also for grouped relations.

Eager aggregation can use significantly more CPU time and memory than
regular planning when the query involves aggregates and many joining
relations.  However, in some cases, the resulting plan can be much
better, justifying the additional planning effort.  All the same, for
now, turn this feature off by default.

The patch was originally proposed by Antonin Houska in 2017.  This
commit reworks various important aspects and rewrites most of the
current code.  However, the original patch and reviews were very
useful.

Author: Richard Guo, Antonin Houska
Reviewed-by: Robert Haas, Jian He, Tender Wang, Paul George, Tom Lane
Reviewed-by: Tomas Vondra, Andy Fan, Ashutosh Bapat
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
---
 contrib/postgres_fdw/postgres_fdw.c           |    3 +-
 doc/src/sgml/config.sgml                      |   15 +
 src/backend/optimizer/README                  |   80 +
 src/backend/optimizer/geqo/geqo_eval.c        |   98 +-
 src/backend/optimizer/path/allpaths.c         |  455 +++++-
 src/backend/optimizer/path/costsize.c         |   95 +-
 src/backend/optimizer/path/joinrels.c         |  141 ++
 src/backend/optimizer/plan/initsplan.c        |  258 ++++
 src/backend/optimizer/plan/planmain.c         |   17 +-
 src/backend/optimizer/plan/planner.c          |   99 +-
 src/backend/optimizer/util/appendinfo.c       |   60 +
 src/backend/optimizer/util/pathnode.c         |   47 +-
 src/backend/optimizer/util/relnode.c          |  754 +++++++++-
 src/backend/utils/misc/guc_tables.c           |   10 +
 src/backend/utils/misc/postgresql.conf.sample |    1 +
 src/include/nodes/pathnodes.h                 |  157 +-
 src/include/optimizer/pathnode.h              |    7 +
 src/include/optimizer/paths.h                 |    5 +
 src/include/optimizer/planmain.h              |    1 +
 src/test/regress/expected/eager_aggregate.out | 1308 +++++++++++++++++
 src/test/regress/expected/sysviews.out        |    3 +-
 src/test/regress/parallel_schedule            |    2 +-
 src/test/regress/sql/eager_aggregate.sql      |  192 +++
 src/tools/pgindent/typedefs.list              |    7 +-
 24 files changed, 3655 insertions(+), 160 deletions(-)
 create mode 100644 src/test/regress/expected/eager_aggregate.out
 create mode 100644 src/test/regress/sql/eager_aggregate.sql

diff --git a/contrib/postgres_fdw/postgres_fdw.c b/contrib/postgres_fdw/postgres_fdw.c
index b92e2a0fc9..76f88bd3e3 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -6089,7 +6089,8 @@ foreign_join_ok(PlannerInfo *root, RelOptInfo *joinrel, JoinType jointype,
 	 */
 	Assert(fpinfo->relation_index == 0);	/* shouldn't be set yet */
 	fpinfo->relation_index =
-		list_length(root->parse->rtable) + list_length(root->join_rel_list);
+		list_length(root->parse->rtable) +
+		list_length(root->join_rel_list->items);
 
 	return true;
 }
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 3f41a17b1f..09a3c4caf2 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5241,6 +5241,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-eager-aggregate" xreflabel="enable_eager_aggregate">
+      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_eager_aggregate</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's ability to partially push
+        aggregation past a join, and finalize it once all the relations are
+        joined. The default is <literal>off</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
       <term><varname>enable_gathermerge</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index f341d9f303..45236ca46b 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1497,3 +1497,83 @@ breaking down aggregation or grouping over a partitioned relation into
 aggregation or grouping over its partitions is called partitionwise
 aggregation.  Especially when the partition keys match the GROUP BY clause,
 this can be significantly faster than the regular method.
+
+Eager aggregation
+-----------------
+
+Eager aggregation is a query optimization technique that partially pushes
+aggregation past a join, and finalizes it once all the relations are joined.
+Eager aggregation may reduce the number of input rows to the join and thus
+could result in a better overall plan.
+
+For example:
+
+ EXPLAIN (COSTS OFF)
+ SELECT a.i, avg(b.y)
+ FROM a JOIN b ON a.i = b.j
+ GROUP BY a.i;
+
+ Finalize HashAggregate
+   Group Key: a.i
+   ->  Nested Loop
+         ->  Partial HashAggregate
+               Group Key: b.j
+               ->  Seq Scan on b
+         ->  Index Only Scan using a_pkey on a
+               Index Cond: (i = b.j)
+
+If the partial aggregation on table B significantly reduces the number of
+input rows, the join above will be much cheaper, leading to a more efficient
+final plan.
+
+For the partial aggregation that is pushed down to a non-aggregated relation,
+we need to consider all expressions from this relation that are involved in
+upper join clauses and include them in the grouping keys, using compatible
+operators.  This is essential to ensure that an aggregated row from the partial
+aggregation matches the other side of the join if and only if each row in the
+partial group does.  This ensures that all rows within the same partial group
+share the same 'destiny', which is crucial for maintaining correctness.
+
+One restriction is that we cannot push partial aggregation down to a relation
+that is in the nullable side of an outer join, because the NULL-extended rows
+produced by the outer join would not be available when we perform the partial
+aggregation, while with a non-eager-aggregation plan these rows are available
+for the top-level aggregation.  Pushing partial aggregation in this case may
+result in the rows being grouped differently than expected, or produce
+incorrect values from the aggregate functions.
+
+We can also apply eager aggregation to a join:
+
+ EXPLAIN (COSTS OFF)
+ SELECT a.i, avg(b.y + c.z)
+ FROM a JOIN b ON a.i = b.j
+        JOIN c ON b.j = c.i
+ GROUP BY a.i;
+
+ Finalize HashAggregate
+   Group Key: a.i
+   ->  Nested Loop
+         ->  Partial HashAggregate
+               Group Key: b.j
+               ->  Hash Join
+                     Hash Cond: (b.j = c.i)
+                     ->  Seq Scan on b
+                     ->  Hash
+                           ->  Seq Scan on c
+         ->  Index Only Scan using a_pkey on a
+               Index Cond: (i = b.j)
+
+During the construction of the join tree, we evaluate each base or join
+relation to determine if eager aggregation can be applied.  If feasible, we
+create a separate RelOptInfo called a "grouped relation" and generate grouped
+paths by adding sorted and hashed partial aggregation paths on top of the
+non-grouped paths.  To limit planning time, we consider only the cheapest or
+suitably-sorted non-grouped paths in this step.
+
+Another way to generate grouped paths is to join a grouped relation with a
+non-grouped relation.  Joining two grouped relations does not seem to be very
+useful and is currently not supported.
+
+If we have generated a grouped relation for the topmost join relation, we need
+to finalize its paths at the end.  The final paths will compete in the usual
+way with paths built from regular planning.
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index f07d1dc8ac..e69eac9bff 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -39,10 +39,20 @@ typedef struct
 	int			size;			/* number of input relations in clump */
 } Clump;
 
+/* The original length and hashtable of a RelInfoList */
+typedef struct
+{
+	int			savelength;
+	struct HTAB *savehash;
+} RelInfoListInfo;
+
 static List *merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump,
 						 int num_gene, bool force);
 static bool desirable_join(PlannerInfo *root,
 						   RelOptInfo *outer_rel, RelOptInfo *inner_rel);
+static RelInfoListInfo save_relinfolist(RelInfoList *relinfo_list);
+static void restore_relinfolist(RelInfoList *relinfo_list,
+								RelInfoListInfo *info);
 
 
 /*
@@ -60,8 +70,8 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
 	MemoryContext oldcxt;
 	RelOptInfo *joinrel;
 	Cost		fitness;
-	int			savelength;
-	struct HTAB *savehash;
+	RelInfoListInfo save_join_rel;
+	RelInfoListInfo save_grouped_rel;
 
 	/*
 	 * Create a private memory context that will hold all temp storage
@@ -78,25 +88,29 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
 	oldcxt = MemoryContextSwitchTo(mycontext);
 
 	/*
-	 * gimme_tree will add entries to root->join_rel_list, which may or may
-	 * not already contain some entries.  The newly added entries will be
-	 * recycled by the MemoryContextDelete below, so we must ensure that the
-	 * list is restored to its former state before exiting.  We can do this by
-	 * truncating the list to its original length.  NOTE this assumes that any
-	 * added entries are appended at the end!
+	 * gimme_tree will add entries to root->join_rel_list and
+	 * root->grouped_rel_list, which may or may not already contain some
+	 * entries.  The newly added entries will be recycled by the
+	 * MemoryContextDelete below, so we must ensure that each list within the
+	 * RelInfoList structures is restored to its former state before exiting.
+	 * We can do this by truncating each list to its original length.  NOTE
+	 * this assumes that any added entries are appended at the end!
 	 *
-	 * We also must take care not to mess up the outer join_rel_hash, if there
-	 * is one.  We can do this by just temporarily setting the link to NULL.
-	 * (If we are dealing with enough join rels, which we very likely are, a
-	 * new hash table will get built and used locally.)
+	 * We also must take care not to mess up the outer hash tables within the
+	 * RelInfoList structures, if any.  We can do this by just temporarily
+	 * setting each link to NULL.  (If we are dealing with enough join rels or
+	 * grouped rels, which we very likely are, new hash tables will get built
+	 * and used locally.)
 	 *
 	 * join_rel_level[] shouldn't be in use, so just Assert it isn't.
 	 */
-	savelength = list_length(root->join_rel_list);
-	savehash = root->join_rel_hash;
+	save_join_rel = save_relinfolist(root->join_rel_list);
+	save_grouped_rel = save_relinfolist(root->grouped_rel_list);
+
 	Assert(root->join_rel_level == NULL);
 
-	root->join_rel_hash = NULL;
+	root->join_rel_list->hash = NULL;
+	root->grouped_rel_list->hash = NULL;
 
 	/* construct the best path for the given combination of relations */
 	joinrel = gimme_tree(root, tour, num_gene);
@@ -118,12 +132,11 @@ geqo_eval(PlannerInfo *root, Gene *tour, int num_gene)
 		fitness = DBL_MAX;
 
 	/*
-	 * Restore join_rel_list to its former state, and put back original
-	 * hashtable if any.
+	 * Restore each of the lists in join_rel_list and grouped_rel_list to its
+	 * former state, and put back original hashtables if any.
 	 */
-	root->join_rel_list = list_truncate(root->join_rel_list,
-										savelength);
-	root->join_rel_hash = savehash;
+	restore_relinfolist(root->join_rel_list, &save_join_rel);
+	restore_relinfolist(root->grouped_rel_list, &save_grouped_rel);
 
 	/* release all the memory acquired within gimme_tree */
 	MemoryContextSwitchTo(oldcxt);
@@ -279,6 +292,27 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
 				/* Find and save the cheapest paths for this joinrel */
 				set_cheapest(joinrel);
 
+				/*
+				 * Except for the topmost scan/join rel, consider generating
+				 * partial aggregation paths for the grouped relation on top
+				 * of the paths of this rel.  After that, we're done creating
+				 * paths for the grouped relation, so run set_cheapest().
+				 */
+				if (!bms_equal(joinrel->relids, root->all_query_rels))
+				{
+					RelOptInfo *rel_grouped;
+
+					rel_grouped = find_grouped_rel(root, joinrel->relids);
+					if (rel_grouped)
+					{
+						Assert(IS_GROUPED_REL(rel_grouped));
+
+						generate_grouped_paths(root, rel_grouped, joinrel,
+											   rel_grouped->agg_info);
+						set_cheapest(rel_grouped);
+					}
+				}
+
 				/* Absorb new clump into old */
 				old_clump->joinrel = joinrel;
 				old_clump->size += new_clump->size;
@@ -336,3 +370,27 @@ desirable_join(PlannerInfo *root,
 	/* Otherwise postpone the join till later. */
 	return false;
 }
+
+/*
+ * Save the original length and hashtable of a RelInfoList.
+ */
+static RelInfoListInfo
+save_relinfolist(RelInfoList *relinfo_list)
+{
+	RelInfoListInfo info;
+
+	info.savelength = list_length(relinfo_list->items);
+	info.savehash = relinfo_list->hash;
+
+	return info;
+}
+
+/*
+ * Restore the original length and hashtable of a RelInfoList.
+ */
+static void
+restore_relinfolist(RelInfoList *relinfo_list, RelInfoListInfo *info)
+{
+	relinfo_list->items = list_truncate(relinfo_list->items, info->savelength);
+	relinfo_list->hash = info->savehash;
+}
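Reviewer's illustration (not part of the patch): the save/restore helpers above rely on two things, namely that temporary entries are only ever appended at the end of the list, and that the outer hash table is detached while the trial runs. A rough standalone sketch of that pattern follows. ToyRelList, ToySavedState and the toy_* functions are hypothetical stand-ins, and a fixed-size int array replaces the real List.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_CAPACITY 16

/* Stand-in for a list-plus-hash structure; hash may be NULL. */
typedef struct ToyRelList
{
    int         items[TOY_CAPACITY];
    int         length;
    void       *hash;
} ToyRelList;

/* What we need to remember in order to undo a trial. */
typedef struct ToySavedState
{
    int         savelength;
    void       *savehash;
} ToySavedState;

static ToySavedState
toy_save(const ToyRelList *list)
{
    ToySavedState saved;

    saved.savelength = list->length;
    saved.savehash = list->hash;
    return saved;
}

static void
toy_append(ToyRelList *list, int item)
{
    assert(list->length < TOY_CAPACITY);
    list->items[list->length++] = item;
}

/* Truncate back to the saved length and reattach the saved hash. */
static void
toy_restore(ToyRelList *list, const ToySavedState *saved)
{
    assert(saved->savelength <= list->length);
    list->length = saved->savelength;
    list->hash = saved->savehash;
}

/* Run one trial that appends ntemp temporary entries, then undo it. */
static int
toy_trial(ToyRelList *list, int ntemp)
{
    ToySavedState saved = toy_save(list);
    int         peak;

    list->hash = NULL;          /* keep the outer hash untouched */
    for (int i = 0; i < ntemp; i++)
        toy_append(list, 1000 + i);
    peak = list->length;
    toy_restore(list, &saved);
    return peak;
}
```

Truncation is only a valid undo because every temporary entry sits after the saved length; an insertion in the middle of the list would break this scheme.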
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 3364589391..836c0bcbf5 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -47,6 +48,7 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -77,6 +79,7 @@ typedef enum pushdown_safe_type
 
 /* These parameters are set by GUC */
 bool		enable_geqo = false;	/* just in case GUC doesn't set it */
+bool		enable_eager_aggregate = false;
 int			geqo_threshold;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
@@ -90,6 +93,7 @@ join_search_hook_type join_search_hook = NULL;
 
 static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_base_grouped_rels(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 						 Index rti, RangeTblEntry *rte);
@@ -114,6 +118,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 								Index rti, RangeTblEntry *rte);
 static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 									Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
 static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 										 List *live_childrels,
 										 List *all_child_pathkeys);
@@ -182,6 +187,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	 */
 	set_base_rel_sizes(root);
 
+	/*
+	 * Build grouped relations for base rels where possible.
+	 */
+	setup_base_grouped_rels(root);
+
 	/*
 	 * We should now have size estimates for every actual table involved in
 	 * the query, and we also know which if any have been deleted from the
@@ -323,6 +333,45 @@ set_base_rel_sizes(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_base_grouped_rels
+ *	  For each "plain" base relation, build a grouped base relation if eager
+ *	  aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_base_grouped_rels(PlannerInfo *root)
+{
+	Index		rti;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo *rel = root->simple_rel_array[rti];
+		RelOptInfo *rel_grouped;
+
+		/* there may be empty slots corresponding to non-baserel RTEs */
+		if (rel == NULL)
+			continue;
+
+		Assert(rel->relid == rti);	/* sanity check on array */
+		Assert(IS_SIMPLE_REL(rel)); /* sanity check on rel */
+
+		rel_grouped = build_simple_grouped_rel(root, rel);
+		if (rel_grouped)
+		{
+			/* Make the grouped relation available for joining. */
+			add_grouped_rel(root, rel_grouped);
+		}
+	}
+}
+
 /*
  * set_base_rel_pathlists
  *	  Finds all paths available for scanning each base-relation entry.
@@ -559,6 +608,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* Now find the cheapest of the paths for this rel */
 	set_cheapest(rel);
 
+	/*
+	 * If a grouped relation for this rel exists, build partial aggregation
+	 * paths for it.
+	 *
+	 * Note that this can only happen after we've called set_cheapest() for
+	 * this base rel, because we need its cheapest paths.
+	 */
+	set_grouped_rel_pathlist(root, rel);
+
 #ifdef OPTIMIZER_DEBUG
 	pprint(rel);
 #endif
@@ -1298,6 +1356,36 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	add_paths_to_append_rel(root, rel, live_childrels);
 }
 
+/*
+ * set_grouped_rel_pathlist
+ *	  If a grouped relation for the given 'rel' exists, build partial
+ *	  aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *rel_grouped;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Add paths to the grouped base relation if one exists. */
+	rel_grouped = find_grouped_rel(root, rel->relids);
+	if (rel_grouped)
+	{
+		Assert(IS_GROUPED_REL(rel_grouped));
+
+		generate_grouped_paths(root, rel_grouped, rel,
+							   rel_grouped->agg_info);
+		set_cheapest(rel_grouped);
+	}
+}
+
 
 /*
  * add_paths_to_append_rel
@@ -3306,6 +3394,318 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
 	}
 }
 
+/*
+ * generate_grouped_paths
+ *		Generate paths for a grouped relation by adding sorted and hashed
+ *		partial aggregation paths on top of paths of the plain base or join
+ *		relation.
+ *
+ * The information needed is provided by the RelAggInfo structure.
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *rel_grouped,
+					   RelOptInfo *rel_plain, RelAggInfo *agg_info)
+{
+	AggClauseCosts agg_costs;
+	bool		can_hash;
+	bool		can_sort;
+	Path	   *cheapest_total_path = NULL;
+	Path	   *cheapest_partial_path = NULL;
+	double		dNumGroups = 0;
+	double		dNumPartialGroups = 0;
+
+	if (IS_DUMMY_REL(rel_plain))
+	{
+		mark_dummy_rel(rel_grouped);
+		return;
+	}
+
+	/*
+	 * If the grouped paths for the given relation are not considered useful,
+	 * do not bother to generate them.
+	 */
+	if (!agg_info->agg_useful)
+		return;
+
+	MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+	get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+	/*
+	 * Determine whether it's possible to perform sort-based implementations
+	 * of grouping.
+	 */
+	can_sort = grouping_is_sortable(agg_info->group_clauses);
+
+	/*
+	 * Determine whether we should consider hash-based implementations of
+	 * grouping.
+	 */
+	Assert(root->numOrderedAggs == 0);
+	can_hash = (agg_info->group_clauses != NIL &&
+				grouping_is_hashable(agg_info->group_clauses));
+
+	/*
+	 * Consider whether we should generate partially aggregated non-partial
+	 * paths.  We can only do this if we have a non-partial path.
+	 */
+	if (rel_plain->pathlist != NIL)
+	{
+		cheapest_total_path = rel_plain->cheapest_total_path;
+		Assert(cheapest_total_path != NULL);
+	}
+
+	/*
+	 * If parallelism is possible for rel_grouped, then we should consider
+	 * generating partially-grouped partial paths.  However, if the plain rel
+	 * has no partial paths, then we can't.
+	 */
+	if (rel_grouped->consider_parallel && rel_plain->partial_pathlist != NIL)
+	{
+		cheapest_partial_path = linitial(rel_plain->partial_pathlist);
+		Assert(cheapest_partial_path != NULL);
+	}
+
+	/* Estimate number of partial groups. */
+	if (cheapest_total_path != NULL)
+		dNumGroups = estimate_num_groups(root,
+										 agg_info->group_exprs,
+										 cheapest_total_path->rows,
+										 NULL, NULL);
+	if (cheapest_partial_path != NULL)
+		dNumPartialGroups = estimate_num_groups(root,
+												agg_info->group_exprs,
+												cheapest_partial_path->rows,
+												NULL, NULL);
+
+	if (can_sort && cheapest_total_path != NULL)
+	{
+		ListCell   *lc;
+
+		/*
+		 * Use any available suitably-sorted path as input, and also consider
+		 * sorting the cheapest-total path.
+		 */
+		foreach(lc, rel_plain->pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   rel_grouped,
+												   input_path,
+												   agg_info->agg_input);
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													path->pathkeys,
+													&presorted_keys);
+			if (!is_sorted)
+			{
+				/*
+				 * Try at least sorting the cheapest path and also try
+				 * incrementally sorting any path which is partially sorted
+				 * already (no need to deal with paths which have presorted
+				 * keys when incremental sort is disabled unless it's the
+				 * cheapest input path).
+				 */
+				if (input_path != cheapest_total_path &&
+					(presorted_keys == 0 || !enable_incremental_sort))
+					continue;
+
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 rel_grouped,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 rel_grouped,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											rel_grouped,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumGroups);
+
+			add_path(rel_grouped, path);
+		}
+	}
+
+	if (can_sort && cheapest_partial_path != NULL)
+	{
+		ListCell   *lc;
+
+		/* Similar to above logic, but for partial paths. */
+		foreach(lc, rel_plain->partial_pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   rel_grouped,
+												   input_path,
+												   agg_info->agg_input);
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													path->pathkeys,
+													&presorted_keys);
+
+			if (!is_sorted)
+			{
+				/*
+				 * Try at least sorting the cheapest path and also try
+				 * incrementally sorting any path which is partially sorted
+				 * already (no need to deal with paths which have presorted
+				 * keys when incremental sort is disabled unless it's the
+				 * cheapest input path).
+				 */
+				if (input_path != cheapest_partial_path &&
+					(presorted_keys == 0 || !enable_incremental_sort))
+					continue;
+
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 rel_grouped,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 rel_grouped,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											rel_grouped,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumPartialGroups);
+
+			add_partial_path(rel_grouped, path);
+		}
+	}
+
+	/*
+	 * Add a partially-grouped HashAgg Path where possible
+	 */
+	if (can_hash && cheapest_total_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   rel_grouped,
+											   cheapest_total_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										rel_grouped,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumGroups);
+
+		add_path(rel_grouped, path);
+	}
+
+	/*
+	 * Now add a partially-grouped HashAgg partial Path where possible
+	 */
+	if (can_hash && cheapest_partial_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   rel_grouped,
+											   cheapest_partial_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										rel_grouped,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumPartialGroups);
+
+		add_partial_path(rel_grouped, path);
+	}
+}
+
 /*
  * make_rel_from_joinlist
  *	  Build access paths using a "joinlist" to guide the join path search.
@@ -3414,9 +3814,10 @@ make_rel_from_joinlist(PlannerInfo *root, List *joinlist)
  * needed for these paths need have been instantiated.
  *
  * Note to plugin authors: the functions invoked during standard_join_search()
- * modify root->join_rel_list and root->join_rel_hash.  If you want to do more
- * than one join-order search, you'll probably need to save and restore the
- * original states of those data structures.  See geqo_eval() for an example.
+ * modify the items lists and hash tables of both root->join_rel_list and
+ * root->grouped_rel_list.  If you want to do more than one join-order search,
+ * you'll probably need to save and restore the original states of those data
+ * structures.  See geqo_eval() for an example.
  */
 RelOptInfo *
 standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
@@ -3465,6 +3866,10 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 		 *
 		 * After that, we're done creating paths for the joinrel, so run
 		 * set_cheapest().
+		 *
+		 * In addition, we run generate_grouped_paths() for the grouped
+		 * relation of each just-processed joinrel, and run set_cheapest() for
+		 * the grouped relation afterwards.
 		 */
 		foreach(lc, root->join_rel_level[lev])
 		{
@@ -3485,6 +3890,27 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 			/* Find and save the cheapest paths for this rel */
 			set_cheapest(rel);
 
+			/*
+			 * Except for the topmost scan/join rel, consider generating
+			 * partial aggregation paths for the grouped relation on top of
+			 * the paths of this rel.  After that, we're done creating paths
+			 * for the grouped relation, so run set_cheapest().
+			 */
+			if (!bms_equal(rel->relids, root->all_query_rels))
+			{
+				RelOptInfo *rel_grouped;
+
+				rel_grouped = find_grouped_rel(root, rel->relids);
+				if (rel_grouped)
+				{
+					Assert(IS_GROUPED_REL(rel_grouped));
+
+					generate_grouped_paths(root, rel_grouped, rel,
+										   rel_grouped->agg_info);
+					set_cheapest(rel_grouped);
+				}
+			}
+
 #ifdef OPTIMIZER_DEBUG
 			pprint(rel);
 #endif
@@ -4353,6 +4779,29 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 		if (IS_DUMMY_REL(child_rel))
 			continue;
 
+		/*
+		 * Except for the topmost scan/join rel, consider generating partial
+		 * aggregation paths for the grouped relation on top of the paths of
+		 * this partitioned child-join.  After that, we're done creating paths
+		 * for the grouped relation, so run set_cheapest().
+		 */
+		if (!bms_equal(IS_OTHER_REL(rel) ?
+					   rel->top_parent_relids : rel->relids,
+					   root->all_query_rels))
+		{
+			RelOptInfo *rel_grouped;
+
+			rel_grouped = find_grouped_rel(root, child_rel->relids);
+			if (rel_grouped)
+			{
+				Assert(IS_GROUPED_REL(rel_grouped));
+
+				generate_grouped_paths(root, rel_grouped, child_rel,
+									   rel_grouped->agg_info);
+				set_cheapest(rel_grouped);
+			}
+		}
+
 #ifdef OPTIMIZER_DEBUG
 		pprint(child_rel);
 #endif
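Reviewer's illustration (not part of the patch): the sort-based branches of generate_grouped_paths() above decide, per input path, whether to use it as is, skip it, add a full sort, or add an incremental sort. That decision can be summarized as a small pure function. The enum and function names below are hypothetical; the parameters mirror the local variables used in the patch.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum ToySortChoice
{
    TOY_USE_AS_IS,              /* input already sorted on the group keys */
    TOY_SKIP,                   /* not worth considering this input path */
    TOY_FULL_SORT,              /* add an explicit full sort */
    TOY_INCREMENTAL_SORT        /* add an incremental sort */
} ToySortChoice;

static ToySortChoice
toy_choose_sort(bool is_sorted, int presorted_keys, bool is_cheapest,
                bool enable_incremental_sort)
{
    assert(presorted_keys >= 0);

    if (is_sorted)
        return TOY_USE_AS_IS;

    /* Only the cheapest path is worth sorting from scratch. */
    if (!is_cheapest &&
        (presorted_keys == 0 || !enable_incremental_sort))
        return TOY_SKIP;

    /* Never consider both kinds of sort for the same input. */
    if (presorted_keys == 0 || !enable_incremental_sort)
        return TOY_FULL_SORT;

    return TOY_INCREMENTAL_SORT;
}
```

Read this way, a non-cheapest path survives only when it is already fully sorted or when it is partially sorted and incremental sort is enabled.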
diff --git a/src/backend/optimizer/path/costsize.c b/src/backend/optimizer/path/costsize.c
index ec004ed949..78ea6550a4 100644
--- a/src/backend/optimizer/path/costsize.c
+++ b/src/backend/optimizer/path/costsize.c
@@ -180,6 +180,8 @@ static bool cost_qual_eval_walker(Node *node, cost_qual_eval_context *context);
 static void get_restriction_qual_cost(PlannerInfo *root, RelOptInfo *baserel,
 									  ParamPathInfo *param_info,
 									  QualCost *qpqual_cost);
+static void set_joinpath_size(PlannerInfo *root, JoinPath *jpath,
+							  SpecialJoinInfo *sjinfo);
 static bool has_indexed_join_quals(NestPath *path);
 static double approx_tuple_count(PlannerInfo *root, JoinPath *path,
 								 List *quals);
@@ -3370,19 +3372,7 @@ final_cost_nestloop(PlannerInfo *root, NestPath *path,
 	if (inner_path_rows <= 0)
 		inner_path_rows = 1;
 	/* Mark the path with the correct row estimate */
-	if (path->jpath.path.param_info)
-		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
-	else
-		path->jpath.path.rows = path->jpath.path.parent->rows;
-
-	/* For partial paths, scale row estimate. */
-	if (path->jpath.path.parallel_workers > 0)
-	{
-		double		parallel_divisor = get_parallel_divisor(&path->jpath.path);
-
-		path->jpath.path.rows =
-			clamp_row_est(path->jpath.path.rows / parallel_divisor);
-	}
+	set_joinpath_size(root, &path->jpath, extra->sjinfo);
 
 	/* cost of inner-relation source data (we already dealt with outer rel) */
 
@@ -3867,19 +3857,7 @@ final_cost_mergejoin(PlannerInfo *root, MergePath *path,
 		inner_path_rows = 1;
 
 	/* Mark the path with the correct row estimate */
-	if (path->jpath.path.param_info)
-		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
-	else
-		path->jpath.path.rows = path->jpath.path.parent->rows;
-
-	/* For partial paths, scale row estimate. */
-	if (path->jpath.path.parallel_workers > 0)
-	{
-		double		parallel_divisor = get_parallel_divisor(&path->jpath.path);
-
-		path->jpath.path.rows =
-			clamp_row_est(path->jpath.path.rows / parallel_divisor);
-	}
+	set_joinpath_size(root, &path->jpath, extra->sjinfo);
 
 	/*
 	 * Compute cost of the mergequals and qpquals (other restriction clauses)
@@ -4299,19 +4277,7 @@ final_cost_hashjoin(PlannerInfo *root, HashPath *path,
 	path->jpath.path.disabled_nodes = workspace->disabled_nodes;
 
 	/* Mark the path with the correct row estimate */
-	if (path->jpath.path.param_info)
-		path->jpath.path.rows = path->jpath.path.param_info->ppi_rows;
-	else
-		path->jpath.path.rows = path->jpath.path.parent->rows;
-
-	/* For partial paths, scale row estimate. */
-	if (path->jpath.path.parallel_workers > 0)
-	{
-		double		parallel_divisor = get_parallel_divisor(&path->jpath.path);
-
-		path->jpath.path.rows =
-			clamp_row_est(path->jpath.path.rows / parallel_divisor);
-	}
+	set_joinpath_size(root, &path->jpath, extra->sjinfo);
 
 	/* mark the path with estimated # of batches */
 	path->num_batches = numbatches;
@@ -5061,6 +5027,57 @@ get_restriction_qual_cost(PlannerInfo *root, RelOptInfo *baserel,
 		*qpqual_cost = baserel->baserestrictcost;
 }
 
+/*
+ * set_joinpath_size
+ *	  Set the correct row estimate for the given join path.
+ *
+ * 'jpath' is the join path under consideration.
+ * 'sjinfo' is any SpecialJoinInfo relevant to this join.
+ *
+ * Note that for a grouped join relation, its paths could have very different
+ * rowcount estimates, so we need to calculate the rowcount estimate using the
+ * outer path and inner path of the given join path.
+ */
+static void
+set_joinpath_size(PlannerInfo *root, JoinPath *jpath, SpecialJoinInfo *sjinfo)
+{
+	if (IS_GROUPED_REL(jpath->path.parent))
+	{
+		Path	   *outer_path = jpath->outerjoinpath;
+		Path	   *inner_path = jpath->innerjoinpath;
+
+		/*
+		 * Estimate the number of rows of this grouped join path as the product
+		 * of the outer and inner path row counts and the selectivity of the
+		 * clauses that have ended up at this join node.
+		 */
+		jpath->path.rows = calc_joinrel_size_estimate(root,
+													  jpath->path.parent,
+													  outer_path->parent,
+													  inner_path->parent,
+													  outer_path->rows,
+													  inner_path->rows,
+													  sjinfo,
+													  jpath->joinrestrictinfo);
+	}
+	else
+	{
+		if (jpath->path.param_info)
+			jpath->path.rows = jpath->path.param_info->ppi_rows;
+		else
+			jpath->path.rows = jpath->path.parent->rows;
+
+		/* For partial paths, scale row estimate. */
+		if (jpath->path.parallel_workers > 0)
+		{
+			double		parallel_divisor = get_parallel_divisor(&jpath->path);
+
+			jpath->path.rows =
+				clamp_row_est(jpath->path.rows / parallel_divisor);
+		}
+	}
+}
+
 
 /*
  * compute_semi_anti_join_factors
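Reviewer's illustration (not part of the patch): the note on set_joinpath_size() says that paths of one grouped join relation can have very different row counts. A small arithmetic sketch shows why, assuming a plain inner join whose size is estimated as outer rows times inner rows times clause selectivity. The functions below are hypothetical simplifications; in particular toy_clamp_row_est() only rounds and enforces a minimum of one row, which is less than the real clamp function does.

```c
#include <assert.h>

/* Simplified clamp: at least one row, rounded to a whole number. */
static double
toy_clamp_row_est(double nrows)
{
    if (nrows <= 1.0)
        return 1.0;
    return (double) (long long) (nrows + 0.5);
}

/* Inner-join size from the row counts of the two input paths. */
static double
toy_inner_join_rows(double outer_rows, double inner_rows, double selectivity)
{
    assert(selectivity >= 0.0 && selectivity <= 1.0);
    return toy_clamp_row_est(outer_rows * inner_rows * selectivity);
}
```

With 100 rows on one side and a selectivity of 0.01, feeding the join 1000 unaggregated rows gives an estimate of 1000, while feeding it 100 partially aggregated groups gives 100. The row count therefore depends on where the partial aggregation sits below the join, so it has to be derived from the input paths rather than taken from the parent relation.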
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index c2eb300ea9..88ab272479 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -35,6 +35,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
 static bool restriction_is_constant_false(List *restrictlist,
 										  RelOptInfo *joinrel,
 										  bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+								  RelOptInfo *rel2, RelOptInfo *joinrel,
+								  SpecialJoinInfo *sjinfo, List *restrictlist);
 static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 										RelOptInfo *rel2, RelOptInfo *joinrel,
 										SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -771,6 +774,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 		return joinrel;
 	}
 
+	/* Build a grouped join relation for 'joinrel' if possible. */
+	make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+						  restrictlist);
+
 	/* Add paths to the join relation. */
 	populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
 								restrictlist);
@@ -882,6 +889,135 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
 	return input_relids;
 }
 
+/*
+ * make_grouped_join_rel
+ *	  Build a grouped join relation out of 'joinrel' if eager aggregation is
+ *	  possible and the 'joinrel' can produce grouped paths.
+ *
+ * We also generate partial aggregation paths for the grouped relation by
+ * joining the grouped paths of 'rel1' to the plain paths of 'rel2', or by
+ * joining the grouped paths of 'rel2' to the plain paths of 'rel1'.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+					  RelOptInfo *rel2, RelOptInfo *joinrel,
+					  SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+	RelOptInfo *rel_grouped;
+	RelOptInfo *rel1_grouped;
+	RelOptInfo *rel2_grouped;
+	bool		rel1_empty;
+	bool		rel2_empty;
+	bool		yet_to_add = false;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/*
+	 * See if we already have a grouped joinrel for this joinrel.
+	 */
+	rel_grouped = find_grouped_rel(root, joinrel->relids);
+
+	/*
+	 * Construct a new RelOptInfo for the grouped join relation if there is no
+	 * existing one.
+	 */
+	if (rel_grouped == NULL)
+	{
+		RelAggInfo *agg_info = NULL;
+
+		/*
+		 * Prepare the information needed to create grouped paths for this
+		 * join relation.
+		 */
+		agg_info = create_rel_agg_info(root, joinrel);
+		if (agg_info == NULL)
+			return;
+
+		/* build a grouped relation out of the plain relation */
+		rel_grouped = build_grouped_rel(root, joinrel);
+		rel_grouped->reltarget = agg_info->target;
+		rel_grouped->rows = agg_info->grouped_rows;
+		rel_grouped->agg_info = agg_info;
+
+		/*
+		 * If the grouped paths for the given join relation are considered
+		 * useful, add the grouped relation we just built to the PlannerInfo
+		 * to make it available for further joining or for acting as the upper
+		 * rel representing the result of partial aggregation.  Otherwise, we
+		 * need to postpone the decision on adding the grouped relation to the
+		 * PlannerInfo, as it depends on whether we can generate any grouped
+		 * paths by joining the given pair of input relations.
+		 */
+		if (agg_info->agg_useful)
+			add_grouped_rel(root, rel_grouped);
+		else
+			yet_to_add = true;
+	}
+
+	Assert(IS_GROUPED_REL(rel_grouped));
+
+	/* We may have already proven this grouped join relation to be dummy. */
+	if (IS_DUMMY_REL(rel_grouped))
+		return;
+
+	/* Retrieve the grouped relations for the two input rels */
+	rel1_grouped = find_grouped_rel(root, rel1->relids);
+	rel2_grouped = find_grouped_rel(root, rel2->relids);
+
+	rel1_empty = (rel1_grouped == NULL || IS_DUMMY_REL(rel1_grouped));
+	rel2_empty = (rel2_grouped == NULL || IS_DUMMY_REL(rel2_grouped));
+
+	/* Nothing to do if there's no grouped relation. */
+	if (rel1_empty && rel2_empty)
+		return;
+
+	/* Joining two grouped relations is currently not supported */
+	if (!rel1_empty && !rel2_empty)
+		return;
+
+	/* Generate partial aggregation paths for the grouped relation */
+	if (!rel1_empty)
+	{
+		populate_joinrel_with_paths(root, rel1_grouped, rel2, rel_grouped,
+									sjinfo, restrictlist);
+
+		/*
+		 * populate_joinrel_with_paths should never mark rel1_grouped as dummy
+		 * on account of provably constant-false join restrictions; if it
+		 * did, we could end up with a plan that has an Aggref in a non-Agg
+		 * plan node.
+		 */
+		Assert(!IS_DUMMY_REL(rel1_grouped));
+	}
+	else if (!rel2_empty)
+	{
+		populate_joinrel_with_paths(root, rel1, rel2_grouped, rel_grouped,
+									sjinfo, restrictlist);
+
+		/*
+		 * populate_joinrel_with_paths should never have marked rel2_grouped
+		 * as dummy due to provably constant-false join restrictions, so we
+		 * cannot end up with a plan that has an Aggref in a non-Agg plan
+		 * node.
+		 */
+		Assert(!IS_DUMMY_REL(rel2_grouped));
+	}
+
+	/*
+	 * Since we have generated grouped paths by joining the given pair of
+	 * input relations, add the grouped relation to the PlannerInfo if we have
+	 * not already done so.
+	 */
+	if (yet_to_add)
+		add_grouped_rel(root, rel_grouped);
+}
+
 /*
  * populate_joinrel_with_paths
  *	  Add paths to the given joinrel for given pair of joining relations. The
@@ -1674,6 +1810,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 						 adjust_child_relids(joinrel->relids,
 											 nappinfos, appinfos)));
 
+		/* Build a grouped join relation for 'child_joinrel' if possible */
+		make_grouped_join_rel(root, child_rel1, child_rel2,
+							  child_joinrel, child_sjinfo,
+							  child_restrictlist);
+
 		/* And make paths for the child join */
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 2cb0ae6d65..0821723754 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/nbtree.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
@@ -81,6 +82,8 @@ typedef struct JoinTreeItem
 } JoinTreeItem;
 
 
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
 static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
 									   Index rtindex);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -628,6 +631,261 @@ remove_useless_groupby_columns(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_eager_aggregation
+ *	  Check if eager aggregation is applicable, and if so collect suitable
+ *	  aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+	/*
+	 * Don't apply eager aggregation if disabled by user.
+	 */
+	if (!enable_eager_aggregate)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if there are no available GROUP BY
+	 * clauses.
+	 */
+	if (!root->processed_groupClause)
+		return;
+
+	/*
+	 * For now we don't try to support grouping sets.
+	 */
+	if (root->parse->groupingSets)
+		return;
+
+	/*
+	 * For now we don't try to support DISTINCT or ORDER BY aggregates.
+	 */
+	if (root->numOrderedAggs > 0)
+		return;
+
+	/*
+	 * If there are any aggregates that do not support partial mode, or any
+	 * partial aggregates that are non-serializable, do not apply eager
+	 * aggregation.
+	 */
+	if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+		return;
+
+	/*
+	 * We don't try to apply eager aggregation if there are set-returning
+	 * functions in targetlist.
+	 */
+	if (root->parse->hasTargetSRFs)
+		return;
+
+	/*
+	 * Eager aggregation only makes sense if there are multiple base rels in
+	 * the query.
+	 */
+	if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+		return;
+
+	/*
+	 * Collect aggregate expressions and plain Vars that appear in targetlist
+	 * and havingQual.
+	 */
+	create_agg_clause_infos(root);
+
+	/*
+	 * If there are no suitable aggregate expressions, we cannot apply eager
+	 * aggregation.
+	 */
+	if (root->agg_clause_list == NIL)
+		return;
+
+	/*
+	 * Collect grouping expressions that appear in grouping clauses.
+	 */
+	create_grouping_expr_infos(root);
+}
+
+/*
+ * create_agg_clause_infos
+ *	  Search the targetlist and havingQual for Aggrefs and plain Vars, and
+ *	  create an AggClauseInfo for each Aggref node.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+	List	   *tlist_exprs;
+	List	   *agg_clause_list = NIL;
+	List	   *tlist_vars = NIL;
+	ListCell   *lc;
+
+	Assert(root->agg_clause_list == NIL);
+	Assert(root->tlist_vars == NIL);
+
+	tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+								  PVC_INCLUDE_AGGREGATES |
+								  PVC_RECURSE_WINDOWFUNCS |
+								  PVC_RECURSE_PLACEHOLDERS);
+
+	/*
+	 * Aggregates within the HAVING clause need to be processed in the same
+	 * way as those in the targetlist.  Note that HAVING can contain Aggrefs
+	 * but not WindowFuncs.
+	 */
+	if (root->parse->havingQual != NULL)
+	{
+		List	   *having_exprs;
+
+		having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+									   PVC_INCLUDE_AGGREGATES |
+									   PVC_RECURSE_PLACEHOLDERS);
+		if (having_exprs != NIL)
+		{
+			tlist_exprs = list_concat(tlist_exprs, having_exprs);
+			list_free(having_exprs);
+		}
+	}
+
+	foreach(lc, tlist_exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Aggref	   *aggref;
+		AggClauseInfo *ac_info;
+
+		/* For now we don't try to support GROUPING() expressions */
+		if (IsA(expr, GroupingFunc))
+		{
+			list_free_deep(agg_clause_list);
+			list_free(tlist_vars);
+			list_free(tlist_exprs);
+
+			return;
+		}
+
+		/* Collect plain Vars for future reference */
+		if (IsA(expr, Var))
+		{
+			tlist_vars = list_append_unique(tlist_vars, expr);
+			continue;
+		}
+
+		aggref = castNode(Aggref, expr);
+
+		Assert(aggref->aggorder == NIL);
+		Assert(aggref->aggdistinct == NIL);
+
+		/*
+		 * If there are any securityQuals, do not try to apply eager
+		 * aggregation if any non-leakproof aggregate functions are present.
+		 * This is overly strict, but for now...
+		 */
+		if (root->qual_security_level > 0 &&
+			!get_func_leakproof(aggref->aggfnoid))
+		{
+			list_free_deep(agg_clause_list);
+			list_free(tlist_vars);
+			list_free(tlist_exprs);
+
+			return;
+		}
+
+		ac_info = makeNode(AggClauseInfo);
+		ac_info->aggref = aggref;
+		ac_info->agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+		agg_clause_list = list_append_unique(agg_clause_list, ac_info);
+	}
+
+	list_free(tlist_exprs);
+
+	root->agg_clause_list = agg_clause_list;
+	root->tlist_vars = tlist_vars;
+}
+
+/*
+ * create_grouping_expr_infos
+ *	  Create GroupExprInfo for each expression usable as grouping key.
+ *
+ * If any grouping expression is not suitable, we will just return with
+ * root->group_expr_list being NIL.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+	List	   *exprs = NIL;
+	List	   *sortgrouprefs = NIL;
+	List	   *btree_opfamilies = NIL;
+	ListCell   *lc,
+			   *lc1,
+			   *lc2,
+			   *lc3;
+
+	Assert(root->group_expr_list == NIL);
+
+	foreach(lc, root->processed_groupClause)
+	{
+		SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+		TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+		TypeCacheEntry *tce;
+		Oid			equalimageproc;
+
+		Assert(tle->ressortgroupref > 0);
+
+		/*
+		 * For now we only support plain Vars as grouping expressions.
+		 */
+		if (!IsA(tle->expr, Var))
+			return;
+
+		/*
+		 * Eager aggregation is only possible if equality implies image
+		 * equality for each grouping key.  Otherwise, placing keys with
+		 * different byte images into the same group may result in the loss of
+		 * information that could be necessary to evaluate upper qual clauses.
+		 *
+		 * For instance, the NUMERIC data type is not supported, as values
+		 * that are considered equal by the equality operator (e.g., 0 and
+		 * 0.0) can have different scales.
+		 */
+		tce = lookup_type_cache(exprType((Node *) tle->expr),
+								TYPECACHE_BTREE_OPFAMILY);
+		if (!OidIsValid(tce->btree_opf) ||
+			!OidIsValid(tce->btree_opintype))
+			return;
+
+		equalimageproc = get_opfamily_proc(tce->btree_opf,
+										   tce->btree_opintype,
+										   tce->btree_opintype,
+										   BTEQUALIMAGE_PROC);
+		if (!OidIsValid(equalimageproc) ||
+			!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+											   tce->typcollation,
+											   ObjectIdGetDatum(tce->btree_opintype))))
+			return;
+
+		exprs = lappend(exprs, tle->expr);
+		sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+		btree_opfamilies = lappend_oid(btree_opfamilies, tce->btree_opf);
+	}
+
+	/*
+	 * Construct GroupExprInfo for each expression.
+	 */
+	forthree(lc1, exprs, lc2, sortgrouprefs, lc3, btree_opfamilies)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc1);
+		int			sortgroupref = lfirst_int(lc2);
+		Oid			btree_opfamily = lfirst_oid(lc3);
+		GroupExprInfo *ge_info;
+
+		ge_info = makeNode(GroupExprInfo);
+		ge_info->expr = (Expr *) copyObject(expr);
+		ge_info->sortgroupref = sortgroupref;
+		ge_info->btree_opfamily = btree_opfamily;
+
+		root->group_expr_list = lappend(root->group_expr_list, ge_info);
+	}
+}
+
 /*****************************************************************************
  *
  *	  LATERAL REFERENCES
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index ade23fd9d5..30cec6d9b2 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -64,8 +64,12 @@ query_planner(PlannerInfo *root,
 	 * NOTE: append_rel_list was set up by subquery_planner, so do not touch
 	 * here.
 	 */
-	root->join_rel_list = NIL;
-	root->join_rel_hash = NULL;
+	root->join_rel_list = makeNode(RelInfoList);
+	root->join_rel_list->items = NIL;
+	root->join_rel_list->hash = NULL;
+	root->grouped_rel_list = makeNode(RelInfoList);
+	root->grouped_rel_list->items = NIL;
+	root->grouped_rel_list->hash = NULL;
 	root->join_rel_level = NULL;
 	root->join_cur_level = 0;
 	root->canon_pathkeys = NIL;
@@ -76,6 +80,9 @@ query_planner(PlannerInfo *root,
 	root->placeholder_list = NIL;
 	root->placeholder_array = NULL;
 	root->placeholder_array_size = 0;
+	root->agg_clause_list = NIL;
+	root->group_expr_list = NIL;
+	root->tlist_vars = NIL;
 	root->fkey_list = NIL;
 	root->initial_rels = NIL;
 
@@ -260,6 +267,12 @@ query_planner(PlannerInfo *root,
 	 */
 	extract_restriction_or_clauses(root);
 
+	/*
+	 * Check if eager aggregation is applicable, and if so, set up
+	 * root->agg_clause_list and root->group_expr_list.
+	 */
+	setup_eager_aggregation(root);
+
 	/*
 	 * Now expand appendrels by adding "otherrels" for their children.  We
 	 * delay this to the end so that we have as much information as possible
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 6803edd085..dce50e4837 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -229,7 +229,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 									  RelOptInfo *partially_grouped_rel,
 									  const AggClauseCosts *agg_costs,
 									  grouping_sets_data *gd,
-									  double dNumGroups,
 									  GroupPathExtraData *extra);
 static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
 												 RelOptInfo *grouped_rel,
@@ -3915,9 +3914,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 							   GroupPathExtraData *extra,
 							   RelOptInfo **partially_grouped_rel_p)
 {
-	Path	   *cheapest_path = input_rel->cheapest_total_path;
 	RelOptInfo *partially_grouped_rel = NULL;
-	double		dNumGroups;
 	PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
 
 	/*
@@ -3999,23 +3996,16 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Gather any partially grouped partial paths. */
 	if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
-	{
 		gather_grouping_paths(root, partially_grouped_rel);
-		set_cheapest(partially_grouped_rel);
-	}
 
-	/*
-	 * Estimate number of groups.
-	 */
-	dNumGroups = get_number_of_groups(root,
-									  cheapest_path->rows,
-									  gd,
-									  extra->targetList);
+	/* Now choose the best path(s) for partially_grouped_rel. */
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+		set_cheapest(partially_grouped_rel);
 
 	/* Build final grouping paths */
 	add_paths_to_grouping_rel(root, input_rel, grouped_rel,
 							  partially_grouped_rel, agg_costs, gd,
-							  dNumGroups, extra);
+							  extra);
 
 	/* Give a helpful error if we failed to find any implementation */
 	if (grouped_rel->pathlist == NIL)
@@ -6906,16 +6896,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 						  RelOptInfo *grouped_rel,
 						  RelOptInfo *partially_grouped_rel,
 						  const AggClauseCosts *agg_costs,
-						  grouping_sets_data *gd, double dNumGroups,
+						  grouping_sets_data *gd,
 						  GroupPathExtraData *extra)
 {
 	Query	   *parse = root->parse;
 	Path	   *cheapest_path = input_rel->cheapest_total_path;
+	Path	   *cheapest_partially_grouped_path = NULL;
 	ListCell   *lc;
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 	List	   *havingQual = (List *) extra->havingQual;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+	double		dNumGroups = 0;
+	double		dNumFinalGroups = 0;
+
+	/*
+	 * Estimate number of groups for non-split aggregation.
+	 */
+	dNumGroups = get_number_of_groups(root,
+									  cheapest_path->rows,
+									  gd,
+									  extra->targetList);
+
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+	{
+		cheapest_partially_grouped_path =
+			partially_grouped_rel->cheapest_total_path;
+
+		/*
+		 * Estimate number of groups for final phase of partial aggregation.
+		 */
+		dNumFinalGroups =
+			get_number_of_groups(root,
+								 cheapest_partially_grouped_path->rows,
+								 gd,
+								 extra->targetList);
+	}
 
 	if (can_sort)
 	{
@@ -7028,7 +7044,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 					path = make_ordered_path(root,
 											 grouped_rel,
 											 path,
-											 partially_grouped_rel->cheapest_total_path,
+											 cheapest_partially_grouped_path,
 											 info->pathkeys,
 											 -1.0);
 
@@ -7046,7 +7062,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												 info->clauses,
 												 havingQual,
 												 agg_final_costs,
-												 dNumGroups));
+												 dNumFinalGroups));
 					else
 						add_path(grouped_rel, (Path *)
 								 create_group_path(root,
@@ -7054,7 +7070,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												   path,
 												   info->clauses,
 												   havingQual,
-												   dNumGroups));
+												   dNumFinalGroups));
 
 				}
 			}
@@ -7096,19 +7112,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 		 */
 		if (partially_grouped_rel && partially_grouped_rel->pathlist)
 		{
-			Path	   *path = partially_grouped_rel->cheapest_total_path;
-
 			add_path(grouped_rel, (Path *)
 					 create_agg_path(root,
 									 grouped_rel,
-									 path,
+									 cheapest_partially_grouped_path,
 									 grouped_rel->reltarget,
 									 AGG_HASHED,
 									 AGGSPLIT_FINAL_DESERIAL,
 									 root->processed_groupClause,
 									 havingQual,
 									 agg_final_costs,
-									 dNumGroups));
+									 dNumFinalGroups));
 		}
 	}
 
@@ -7158,6 +7172,21 @@ create_partial_grouping_paths(PlannerInfo *root,
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 
+	/*
+	 * The partially_grouped_rel may already have been created by eager
+	 * aggregation.
+	 */
+	partially_grouped_rel = find_grouped_rel(root, input_rel->relids);
+	Assert(enable_eager_aggregate || partially_grouped_rel == NULL);
+
+	/*
+	 * It is possible that the partially_grouped_rel created by eager
+	 * aggregation is dummy.  In this case we just set it to NULL.  It might
+	 * be created again by the following logic if possible.
+	 */
+	if (partially_grouped_rel && IS_DUMMY_REL(partially_grouped_rel))
+		partially_grouped_rel = NULL;
+
 	/*
 	 * Consider whether we should generate partially aggregated non-partial
 	 * paths.  We can only do this if we have a non-partial path, and only if
@@ -7181,19 +7210,27 @@ create_partial_grouping_paths(PlannerInfo *root,
 	 * If we can't partially aggregate partial paths, and we can't partially
 	 * aggregate non-partial paths, then don't bother creating the new
 	 * RelOptInfo at all, unless the caller specified force_rel_creation.
+	 *
+	 * Note that the partially_grouped_rel could have been already created and
+	 * populated with appropriate paths by eager aggregation.
 	 */
 	if (cheapest_total_path == NULL &&
 		cheapest_partial_path == NULL &&
+		(partially_grouped_rel == NULL ||
+		 partially_grouped_rel->pathlist == NIL) &&
 		!force_rel_creation)
 		return NULL;
 
 	/*
 	 * Build a new upper relation to represent the result of partially
-	 * aggregating the rows from the input relation.
-	 */
-	partially_grouped_rel = fetch_upper_rel(root,
-											UPPERREL_PARTIAL_GROUP_AGG,
-											grouped_rel->relids);
+	 * aggregating the rows from the input relation.  The relation may already
+	 * exist due to eager aggregation, in which case we don't need to create
+	 * it.
+	 */
+	if (partially_grouped_rel == NULL)
+		partially_grouped_rel = fetch_upper_rel(root,
+												UPPERREL_PARTIAL_GROUP_AGG,
+												grouped_rel->relids);
 	partially_grouped_rel->consider_parallel =
 		grouped_rel->consider_parallel;
 	partially_grouped_rel->reloptkind = grouped_rel->reloptkind;
@@ -7202,6 +7239,14 @@ create_partial_grouping_paths(PlannerInfo *root,
 	partially_grouped_rel->useridiscurrent = grouped_rel->useridiscurrent;
 	partially_grouped_rel->fdwroutine = grouped_rel->fdwroutine;
 
+	/*
+	 * Partially-grouped partial paths may have been generated by eager
+	 * aggregation.  If we find that parallelism is not possible for
+	 * partially_grouped_rel, we need to drop these partial paths.
+	 */
+	if (!partially_grouped_rel->consider_parallel)
+		partially_grouped_rel->partial_pathlist = NIL;
+
 	/*
 	 * Build target list for partial aggregate paths.  These paths cannot just
 	 * emit the same tlist as regular aggregate paths, because (1) we must
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index cece3a5be7..20cfe95340 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -499,6 +499,66 @@ adjust_appendrel_attrs_mutator(Node *node,
 		return (Node *) newinfo;
 	}
 
+	/*
+	 * We have to process RelAggInfo nodes specially.
+	 */
+	if (IsA(node, RelAggInfo))
+	{
+		RelAggInfo *oldinfo = (RelAggInfo *) node;
+		RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newinfo, oldinfo, sizeof(RelAggInfo));
+
+		newinfo->relids = adjust_child_relids(oldinfo->relids,
+											  context->nappinfos,
+											  context->appinfos);
+
+		newinfo->target = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+										   context);
+
+		newinfo->agg_input = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+										   context);
+
+		newinfo->group_clauses = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_clauses,
+										   context);
+
+		newinfo->group_exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+										   context);
+
+		return (Node *) newinfo;
+	}
+
+	/*
+	 * We have to process PathTarget nodes specially.
+	 */
+	if (IsA(node, PathTarget))
+	{
+		PathTarget *oldtarget = (PathTarget *) node;
+		PathTarget *newtarget = makeNode(PathTarget);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+		newtarget->exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+										   context);
+
+		if (oldtarget->sortgrouprefs)
+		{
+			Size		nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+			newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+			memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+		}
+
+		return (Node *) newtarget;
+	}
+
 	/*
 	 * NOTE: we do not need to recurse into sublinks, because they should
 	 * already have been converted to subplans before we see them.
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 93e73cb44d..9d5df0553b 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -262,6 +262,12 @@ compare_path_costs_fuzzily(Path *path1, Path *path2, double fuzz_factor)
  * unparameterized path, too, if there is one; the users of that list find
  * it more convenient if that's included.
  *
+ * cheapest_parameterized_paths also always includes the fewest-row
+ * unparameterized path, if there is one, for grouped relations.  Different
+ * paths of a grouped relation can have very different row counts, and in some
+ * cases the cheapest-total unparameterized path may not be the one with the
+ * fewest rows.
+ *
  * This is normally called only after we've finished constructing the path
  * list for the rel node.
  */
@@ -271,6 +277,7 @@ set_cheapest(RelOptInfo *parent_rel)
 	Path	   *cheapest_startup_path;
 	Path	   *cheapest_total_path;
 	Path	   *best_param_path;
+	Path	   *fewest_row_path;
 	List	   *parameterized_paths;
 	ListCell   *p;
 
@@ -280,6 +287,7 @@ set_cheapest(RelOptInfo *parent_rel)
 		elog(ERROR, "could not devise a query plan for the given query");
 
 	cheapest_startup_path = cheapest_total_path = best_param_path = NULL;
+	fewest_row_path = NULL;
 	parameterized_paths = NIL;
 
 	foreach(p, parent_rel->pathlist)
@@ -341,6 +349,8 @@ set_cheapest(RelOptInfo *parent_rel)
 			if (cheapest_total_path == NULL)
 			{
 				cheapest_startup_path = cheapest_total_path = path;
+				if (IS_GROUPED_REL(parent_rel))
+					fewest_row_path = path;
 				continue;
 			}
 
@@ -364,6 +374,27 @@ set_cheapest(RelOptInfo *parent_rel)
 				 compare_pathkeys(cheapest_total_path->pathkeys,
 								  path->pathkeys) == PATHKEYS_BETTER2))
 				cheapest_total_path = path;
+
+			/*
+			 * Find the fewest-row unparameterized path for a grouped
+			 * relation.  If we find two paths of the same row count, try to
+			 * keep the one with the cheaper total cost; if the costs are
+			 * identical, keep the better-sorted one.
+			 */
+			if (IS_GROUPED_REL(parent_rel))
+			{
+				if (fewest_row_path->rows > path->rows)
+					fewest_row_path = path;
+				else if (fewest_row_path->rows == path->rows)
+				{
+					cmp = compare_path_costs(fewest_row_path, path, TOTAL_COST);
+					if (cmp > 0 ||
+						(cmp == 0 &&
+						 compare_pathkeys(fewest_row_path->pathkeys,
+										  path->pathkeys) == PATHKEYS_BETTER2))
+						fewest_row_path = path;
+				}
+			}
 		}
 	}
 
@@ -371,6 +402,10 @@ set_cheapest(RelOptInfo *parent_rel)
 	if (cheapest_total_path)
 		parameterized_paths = lcons(cheapest_total_path, parameterized_paths);
 
+	/* Add fewest-row unparameterized path, if any, to parameterized_paths */
+	if (fewest_row_path && fewest_row_path != cheapest_total_path)
+		parameterized_paths = lcons(fewest_row_path, parameterized_paths);
+
 	/*
 	 * If there is no unparameterized path, use the best parameterized path as
 	 * cheapest_total_path (but not as cheapest_startup_path).
@@ -2787,8 +2822,7 @@ create_projection_path(PlannerInfo *root,
 	pathnode->path.pathtype = T_Result;
 	pathnode->path.parent = rel;
 	pathnode->path.pathtarget = target;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe &&
@@ -3043,8 +3077,7 @@ create_incremental_sort_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	/* Sort doesn't project, so use source path's pathtarget */
 	pathnode->path.pathtarget = subpath->pathtarget;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
@@ -3091,8 +3124,7 @@ create_sort_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	/* Sort doesn't project, so use source path's pathtarget */
 	pathnode->path.pathtarget = subpath->pathtarget;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
@@ -3253,8 +3285,7 @@ create_agg_path(PlannerInfo *root,
 	pathnode->path.pathtype = T_Agg;
 	pathnode->path.parent = rel;
 	pathnode->path.pathtarget = target;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index ff507331a0..0f72110063 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,8 @@
 
 #include <limits.h>
 
+#include "access/nbtree.h"
+#include "catalog/pg_constraint.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
@@ -27,19 +29,27 @@
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
 #include "parser/parse_relation.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/typcache.h"
 
 
-typedef struct JoinHashEntry
+/*
+ * An entry of the hash table that we use to make lookups of RelOptInfo
+ * structures more efficient.
+ */
+typedef struct RelHashEntry
 {
-	Relids		join_relids;	/* hash key --- MUST BE FIRST */
-	RelOptInfo *join_rel;
-} JoinHashEntry;
+	Relids		relids;			/* hash key --- MUST BE FIRST */
+	RelOptInfo *rel;
+} RelHashEntry;
 
 static void build_joinrel_tlist(PlannerInfo *root, RelOptInfo *joinrel,
 								RelOptInfo *input_rel,
@@ -83,7 +93,17 @@ static void build_child_join_reltarget(PlannerInfo *root,
 									   RelOptInfo *childrel,
 									   int nappinfos,
 									   AppendRelInfo **appinfos);
-
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+													RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+								  PathTarget *target, PathTarget *agg_input,
+								  List **group_clauses, List **group_exprs);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
+
+/* Minimum row reduction ratio at which a grouped path is considered useful */
+#define EAGER_AGGREGATE_RATIO 0.5
 
 /*
  * setup_simple_rel_arrays
@@ -276,6 +296,7 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	rel->joininfo = NIL;
 	rel->has_eclass_joins = false;
 	rel->consider_partitionwise_join = false;	/* might get changed later */
+	rel->agg_info = NULL;
 	rel->part_scheme = NULL;
 	rel->nparts = -1;
 	rel->boundinfo = NULL;
@@ -406,6 +427,99 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	return rel;
 }
 
+/*
+ * build_simple_grouped_rel
+ *	  Construct a new RelOptInfo for a grouped base relation out of an existing
+ *	  non-grouped base relation.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel_plain)
+{
+	RelOptInfo *rel_grouped;
+	RelAggInfo *agg_info;
+
+	/*
+	 * We should have available aggregate expressions and grouping
+	 * expressions, otherwise we cannot reach here.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/* nothing to do for dummy rel */
+	if (IS_DUMMY_REL(rel_plain))
+		return NULL;
+
+	/*
+	 * Prepare the information needed to create grouped paths for this base
+	 * relation.
+	 */
+	agg_info = create_rel_agg_info(root, rel_plain);
+	if (agg_info == NULL)
+		return NULL;
+
+	/*
+	 * If the grouped paths for the given base relation are not considered
+	 * useful, do not build the grouped relation.
+	 */
+	if (!agg_info->agg_useful)
+		return NULL;
+
+	/* build a grouped relation out of the plain relation */
+	rel_grouped = build_grouped_rel(root, rel_plain);
+	rel_grouped->reltarget = agg_info->target;
+	rel_grouped->rows = agg_info->grouped_rows;
+	rel_grouped->agg_info = agg_info;
+
+	return rel_grouped;
+}
+
+/*
+ * build_grouped_rel
+ *	  Build a grouped relation by flat copying a plain relation and resetting
+ *	  the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel_plain)
+{
+	RelOptInfo *rel_grouped;
+
+	rel_grouped = makeNode(RelOptInfo);
+	memcpy(rel_grouped, rel_plain, sizeof(RelOptInfo));
+
+	/*
+	 * clear path info
+	 */
+	rel_grouped->pathlist = NIL;
+	rel_grouped->ppilist = NIL;
+	rel_grouped->partial_pathlist = NIL;
+	rel_grouped->cheapest_startup_path = NULL;
+	rel_grouped->cheapest_total_path = NULL;
+	rel_grouped->cheapest_unique_path = NULL;
+	rel_grouped->cheapest_parameterized_paths = NIL;
+
+	/*
+	 * clear partition info
+	 */
+	rel_grouped->part_scheme = NULL;
+	rel_grouped->nparts = -1;
+	rel_grouped->boundinfo = NULL;
+	rel_grouped->partbounds_merged = false;
+	rel_grouped->partition_qual = NIL;
+	rel_grouped->part_rels = NULL;
+	rel_grouped->live_parts = NULL;
+	rel_grouped->all_partrels = NULL;
+	rel_grouped->partexprs = NULL;
+	rel_grouped->nullable_partexprs = NULL;
+	rel_grouped->consider_partitionwise_join = false;
+
+	/*
+	 * clear size estimates
+	 */
+	rel_grouped->rows = 0;
+
+	return rel_grouped;
+}
+
 /*
  * find_base_rel
  *	  Find a base or otherrel relation entry, which must already exist.
@@ -479,11 +593,11 @@ find_base_rel_ignore_join(PlannerInfo *root, int relid)
 }
 
 /*
- * build_join_rel_hash
- *	  Construct the auxiliary hash table for join relations.
+ * build_rel_hash
+ *	  Construct the auxiliary hash table for relations.
  */
 static void
-build_join_rel_hash(PlannerInfo *root)
+build_rel_hash(RelInfoList *list)
 {
 	HTAB	   *hashtab;
 	HASHCTL		hash_ctl;
@@ -491,47 +605,46 @@ build_join_rel_hash(PlannerInfo *root)
 
 	/* Create the hash table */
 	hash_ctl.keysize = sizeof(Relids);
-	hash_ctl.entrysize = sizeof(JoinHashEntry);
+	hash_ctl.entrysize = sizeof(RelHashEntry);
 	hash_ctl.hash = bitmap_hash;
 	hash_ctl.match = bitmap_match;
 	hash_ctl.hcxt = CurrentMemoryContext;
-	hashtab = hash_create("JoinRelHashTable",
+	hashtab = hash_create("RelHashTable",
 						  256L,
 						  &hash_ctl,
 						  HASH_ELEM | HASH_FUNCTION | HASH_COMPARE | HASH_CONTEXT);
 
-	/* Insert all the already-existing joinrels */
-	foreach(l, root->join_rel_list)
+	/* Insert all the already-existing RelOptInfos */
+	foreach(l, list->items)
 	{
 		RelOptInfo *rel = (RelOptInfo *) lfirst(l);
-		JoinHashEntry *hentry;
+		RelHashEntry *hentry;
 		bool		found;
 
-		hentry = (JoinHashEntry *) hash_search(hashtab,
-											   &(rel->relids),
-											   HASH_ENTER,
-											   &found);
+		hentry = (RelHashEntry *) hash_search(hashtab,
+											  &(rel->relids),
+											  HASH_ENTER,
+											  &found);
 		Assert(!found);
-		hentry->join_rel = rel;
+		hentry->rel = rel;
 	}
 
-	root->join_rel_hash = hashtab;
+	list->hash = hashtab;
 }
 
 /*
- * find_join_rel
- *	  Returns relation entry corresponding to 'relids' (a set of RT indexes),
- *	  or NULL if none exists.  This is for join relations.
+ * find_rel_info
+ *	  Find a RelOptInfo entry corresponding to 'relids'.
  */
-RelOptInfo *
-find_join_rel(PlannerInfo *root, Relids relids)
+static RelOptInfo *
+find_rel_info(RelInfoList *list, Relids relids)
 {
 	/*
 	 * Switch to using hash lookup when list grows "too long".  The threshold
 	 * is arbitrary and is known only here.
 	 */
-	if (!root->join_rel_hash && list_length(root->join_rel_list) > 32)
-		build_join_rel_hash(root);
+	if (!list->hash && list_length(list->items) > 32)
+		build_rel_hash(list);
 
 	/*
 	 * Use either hashtable lookup or linear search, as appropriate.
@@ -541,23 +654,23 @@ find_join_rel(PlannerInfo *root, Relids relids)
 	 * so would force relids out of a register and thus probably slow down the
 	 * list-search case.
 	 */
-	if (root->join_rel_hash)
+	if (list->hash)
 	{
 		Relids		hashkey = relids;
-		JoinHashEntry *hentry;
+		RelHashEntry *hentry;
 
-		hentry = (JoinHashEntry *) hash_search(root->join_rel_hash,
-											   &hashkey,
-											   HASH_FIND,
-											   NULL);
+		hentry = (RelHashEntry *) hash_search(list->hash,
+											  &hashkey,
+											  HASH_FIND,
+											  NULL);
 		if (hentry)
-			return hentry->join_rel;
+			return hentry->rel;
 	}
 	else
 	{
 		ListCell   *l;
 
-		foreach(l, root->join_rel_list)
+		foreach(l, list->items)
 		{
 			RelOptInfo *rel = (RelOptInfo *) lfirst(l);
 
@@ -569,6 +682,28 @@ find_join_rel(PlannerInfo *root, Relids relids)
 	return NULL;
 }
 
+/*
+ * find_join_rel
+ *	  Returns relation entry corresponding to 'relids' (a set of RT indexes),
+ *	  or NULL if none exists.  This is for join relations.
+ */
+RelOptInfo *
+find_join_rel(PlannerInfo *root, Relids relids)
+{
+	return find_rel_info(root->join_rel_list, relids);
+}
+
+/*
+ * find_grouped_rel
+ *	  Returns relation entry corresponding to 'relids' (a set of RT indexes),
+ *	  or NULL if none exists.  This is for grouped relations.
+ */
+RelOptInfo *
+find_grouped_rel(PlannerInfo *root, Relids relids)
+{
+	return find_rel_info(root->grouped_rel_list, relids);
+}
+
 /*
  * set_foreign_rel_properties
  *		Set up foreign-join fields if outer and inner relation are foreign
@@ -619,31 +754,53 @@ set_foreign_rel_properties(RelOptInfo *joinrel, RelOptInfo *outer_rel,
 }
 
 /*
- * add_join_rel
- *		Add given join relation to the list of join relations in the given
- *		PlannerInfo. Also add it to the auxiliary hashtable if there is one.
+ * add_rel_info
+ *		Add given relation to the list, and also add it to the auxiliary
+ *		hashtable if there is one.
  */
 static void
-add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
+add_rel_info(RelInfoList *list, RelOptInfo *rel)
 {
-	/* GEQO requires us to append the new joinrel to the end of the list! */
-	root->join_rel_list = lappend(root->join_rel_list, joinrel);
+	/* GEQO requires us to append the new relation to the end of the list! */
+	list->items = lappend(list->items, rel);
 
 	/* store it into the auxiliary hashtable if there is one. */
-	if (root->join_rel_hash)
+	if (list->hash)
 	{
-		JoinHashEntry *hentry;
+		RelHashEntry *hentry;
 		bool		found;
 
-		hentry = (JoinHashEntry *) hash_search(root->join_rel_hash,
-											   &(joinrel->relids),
-											   HASH_ENTER,
-											   &found);
+		hentry = (RelHashEntry *) hash_search(list->hash,
+											  &(rel->relids),
+											  HASH_ENTER,
+											  &found);
 		Assert(!found);
-		hentry->join_rel = joinrel;
+		hentry->rel = rel;
 	}
 }
 
+/*
+ * add_join_rel
+ *		Add given join relation to the list of join relations in the given
+ *		PlannerInfo.
+ */
+static void
+add_join_rel(PlannerInfo *root, RelOptInfo *joinrel)
+{
+	add_rel_info(root->join_rel_list, joinrel);
+}
+
+/*
+ * add_grouped_rel
+ *		Add given grouped relation to the list of grouped relations in the
+ *		given PlannerInfo.
+ */
+void
+add_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	add_rel_info(root->grouped_rel_list, rel);
+}
+
 /*
  * build_join_rel
  *	  Returns relation entry corresponding to the union of two given rels,
@@ -755,6 +912,7 @@ build_join_rel(PlannerInfo *root,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
 	joinrel->parent = NULL;
 	joinrel->top_parent = NULL;
 	joinrel->top_parent_relids = NULL;
@@ -939,6 +1097,7 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
 	joinrel->parent = parent_joinrel;
 	joinrel->top_parent = parent_joinrel->top_parent ? parent_joinrel->top_parent : parent_joinrel;
 	joinrel->top_parent_relids = joinrel->top_parent->relids;
@@ -2518,3 +2677,504 @@ build_child_join_reltarget(PlannerInfo *root,
 	childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
 	childrel->reltarget->width = parentrel->reltarget->width;
 }
+
+/*
+ * create_rel_agg_info
+ *	  Create the RelAggInfo structure for the given relation if it can produce
+ *	  grouped paths.  The given relation is the non-grouped one which has the
+ *	  reltarget already constructed.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	RelAggInfo *result;
+	PathTarget *agg_input;
+	PathTarget *target;
+	List	   *group_clauses = NIL;
+	List	   *group_exprs = NIL;
+
+	/*
+	 * The lists of aggregate expressions and grouping expressions should have
+	 * been constructed.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/*
+	 * If this is a child rel, the grouped rel for its parent rel must already
+	 * have been created if that was possible.  So we can just use the parent's
+	 * RelAggInfo if there is one, with appropriate variable substitutions.
+	 */
+	if (IS_OTHER_REL(rel))
+	{
+		RelOptInfo *rel_grouped;
+		RelAggInfo *agg_info;
+
+		Assert(!bms_is_empty(rel->top_parent_relids));
+		rel_grouped = find_grouped_rel(root, rel->top_parent_relids);
+
+		if (rel_grouped == NULL)
+			return NULL;
+
+		Assert(IS_GROUPED_REL(rel_grouped));
+		/* Must do multi-level transformation */
+		agg_info = (RelAggInfo *)
+			adjust_appendrel_attrs_multilevel(root,
+											  (Node *) rel_grouped->agg_info,
+											  rel,
+											  rel->top_parent);
+
+		agg_info->grouped_rows =
+			estimate_num_groups(root, agg_info->group_exprs,
+								rel->rows, NULL, NULL);
+
+		/*
+		 * The grouped paths for the given relation are considered useful iff
+		 * the row reduction ratio is no less than EAGER_AGGREGATE_RATIO.
+		 */
+		agg_info->agg_useful =
+			(agg_info->grouped_rows <= rel->rows * (1 - EAGER_AGGREGATE_RATIO));
+
+		return agg_info;
+	}
+
+	/* Check if it's possible to produce grouped paths for this relation. */
+	if (!eager_aggregation_possible_for_relation(root, rel))
+		return NULL;
+
+	/*
+	 * Create targets for the grouped paths and for the input paths of the
+	 * grouped paths.
+	 */
+	target = create_empty_pathtarget();
+	agg_input = create_empty_pathtarget();
+
+	/* ... and initialize these targets */
+	if (!init_grouping_targets(root, rel, target, agg_input,
+							   &group_clauses, &group_exprs))
+		return NULL;
+
+	/*
+	 * Eager aggregation is not applicable if there are no available grouping
+	 * expressions.
+	 */
+	if (list_length(group_clauses) == 0)
+		return NULL;
+
+	/* build the RelAggInfo result */
+	result = makeNode(RelAggInfo);
+
+	result->group_clauses = group_clauses;
+	result->group_exprs = group_exprs;
+
+	/* Calculate pathkeys that represent these grouping requirements */
+	result->group_pathkeys =
+		make_pathkeys_for_sortclauses(root, result->group_clauses,
+									  make_tlist_from_pathtarget(target));
+
+	/* Add aggregates to the grouping target */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		Aggref	   *aggref;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		aggref = (Aggref *) copyObject(ac_info->aggref);
+		mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+		add_column_to_pathtarget(target, (Expr *) aggref, 0);
+	}
+
+	/* Set the estimated eval cost and output width for both targets */
+	set_pathtarget_cost_width(root, target);
+	set_pathtarget_cost_width(root, agg_input);
+
+	result->relids = bms_copy(rel->relids);
+	result->target = target;
+	result->agg_input = agg_input;
+	result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+											   rel->rows, NULL, NULL);
+
+	/*
+	 * The grouped paths for the given relation are considered useful iff the
+	 * row reduction ratio is no less than EAGER_AGGREGATE_RATIO.
+	 */
+	result->agg_useful =
+		(result->grouped_rows <= rel->rows * (1 - EAGER_AGGREGATE_RATIO));
+
+	return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ *	  Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	int			cur_relid;
+
+	/*
+	 * Check to see if the given relation is in the nullable side of an outer
+	 * join.  In this case, we cannot push a partial aggregation down to the
+	 * relation, because the NULL-extended rows produced by the outer join
+	 * would not be available when we perform the partial aggregation, while
+	 * with a non-eager-aggregation plan these rows are available for the
+	 * top-level aggregation.  Doing so may result in the rows being grouped
+	 * differently than expected, or produce incorrect values from the
+	 * aggregate functions.
+	 */
+	cur_relid = -1;
+	while ((cur_relid = bms_next_member(rel->relids, cur_relid)) >= 0)
+	{
+		RelOptInfo *baserel = find_base_rel_ignore_join(root, cur_relid);
+
+		if (baserel == NULL)
+			continue;			/* ignore outer joins in rel->relids */
+
+		if (!bms_is_subset(baserel->nulling_relids, rel->relids))
+			return false;
+	}
+
+	/*
+	 * For now we don't try to support PlaceHolderVars.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = lfirst(lc);
+
+		if (IsA(expr, PlaceHolderVar))
+			return false;
+	}
+
+	/* Caller should only pass base relations or joins. */
+	Assert(rel->reloptkind == RELOPT_BASEREL ||
+		   rel->reloptkind == RELOPT_JOINREL);
+
+	/*
+	 * Check if all aggregate expressions can be evaluated on this relation
+	 * level.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		/*
+		 * Give up if any aggregate requires relations other than the current
+		 * one.  If the aggregate requires the current relation plus
+		 * additional relations, grouping the current relation could make some
+		 * input rows unavailable for the higher aggregate and may reduce the
+		 * number of input rows it receives.  If the aggregate does not
+		 * require the current relation at all, it should not be grouped, as
+		 * we do not support joining two grouped relations.
+		 */
+		if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * init_grouping_targets
+ *	  Initialize the target for grouped paths (target) as well as the target
+ *	  for paths that generate input for the grouped paths (agg_input).
+ *
+ * We also construct the list of SortGroupClauses and the list of grouping
+ * expressions for the partial aggregation, and return them in *group_clauses
+ * and *group_exprs.
+ *
+ * Return true if the targets could be initialized, false otherwise.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+					  PathTarget *target, PathTarget *agg_input,
+					  List **group_clauses, List **group_exprs)
+{
+	ListCell   *lc;
+	List	   *possibly_dependent = NIL;
+	Index		maxSortGroupRef;
+
+	/* Identify the max sortgroupref */
+	maxSortGroupRef = 0;
+	foreach(lc, root->processed_tlist)
+	{
+		Index		ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+		if (ref > maxSortGroupRef)
+			maxSortGroupRef = ref;
+	}
+
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Index		sortgroupref;
+
+		/*
+		 * Given that PlaceHolderVar currently prevents us from doing eager
+		 * aggregation, the source target cannot contain anything more complex
+		 * than a Var.
+		 */
+		Assert(IsA(expr, Var));
+
+		/* Get the sortgroupref if the expr can act as grouping expression. */
+		sortgroupref = get_expression_sortgroupref(root, expr);
+		if (sortgroupref > 0)
+		{
+			SortGroupClause *sgc;
+
+			/* Find the matching SortGroupClause */
+			sgc = get_sortgroupref_clause(sortgroupref, root->processed_groupClause);
+			Assert(sgc->tleSortGroupRef <= maxSortGroupRef);
+
+			/*
+			 * If the target expression can be used as a grouping key, it
+			 * should be emitted by the grouped paths that have been pushed
+			 * down to this relation level.
+			 */
+			add_column_to_pathtarget(target, expr, sortgroupref);
+
+			/*
+			 * ... and it also should be emitted by the input paths.
+			 */
+			add_column_to_pathtarget(agg_input, expr, sortgroupref);
+
+			/*
+			 * Record this SortGroupClause and grouping expression.  Note that
+			 * this SortGroupClause might have already been recorded.
+			 */
+			if (!list_member(*group_clauses, sgc))
+			{
+				*group_clauses = lappend(*group_clauses, sgc);
+				*group_exprs = lappend(*group_exprs, expr);
+			}
+		}
+		else if (is_var_needed_by_join(root, (Var *) expr, rel))
+		{
+			/*
+			 * The expression is needed for an upper join but is neither in
+			 * the GROUP BY clause nor derivable from it using EC (otherwise,
+			 * it would have already been included in the targets above).  We
+			 * need to create a special SortGroupClause for this expression.
+			 *
+			 * It is important to include such expressions in the grouping
+			 * keys.  This is essential to ensure that an aggregated row from
+			 * the partial aggregation matches the other side of the join if
+			 * and only if each row in the partial group does.  This ensures
+			 * that all rows within the same partial group share the same
+			 * 'destiny', which is crucial for maintaining correctness.
+			 */
+			SortGroupClause *sgc;
+			TypeCacheEntry *tce;
+			Oid			equalimageproc;
+
+			/*
+			 * But first, check if equality implies image equality for this
+			 * expression.  If not, we cannot use it as a grouping key.  See
+			 * comments in create_grouping_expr_infos().
+			 */
+			tce = lookup_type_cache(exprType((Node *) expr),
+									TYPECACHE_BTREE_OPFAMILY);
+			if (!OidIsValid(tce->btree_opf) ||
+				!OidIsValid(tce->btree_opintype))
+				return false;
+
+			equalimageproc = get_opfamily_proc(tce->btree_opf,
+											   tce->btree_opintype,
+											   tce->btree_opintype,
+											   BTEQUALIMAGE_PROC);
+			if (!OidIsValid(equalimageproc) ||
+				!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+												   tce->typcollation,
+												   ObjectIdGetDatum(tce->btree_opintype))))
+				return false;
+
+			/* Create the SortGroupClause. */
+			sgc = makeNode(SortGroupClause);
+
+			/* Initialize the SortGroupClause. */
+			sgc->tleSortGroupRef = ++maxSortGroupRef;
+			get_sort_group_operators(exprType((Node *) expr),
+									 false, true, false,
+									 &sgc->sortop, &sgc->eqop, NULL,
+									 &sgc->hashable);
+
+			/* This expression should be emitted by the grouped paths */
+			add_column_to_pathtarget(target, expr, sgc->tleSortGroupRef);
+
+			/* ... and it also should be emitted by the input paths. */
+			add_column_to_pathtarget(agg_input, expr, sgc->tleSortGroupRef);
+
+			/* Record this SortGroupClause and grouping expression */
+			*group_clauses = lappend(*group_clauses, sgc);
+			*group_exprs = lappend(*group_exprs, expr);
+		}
+		else if (is_var_in_aggref_only(root, (Var *) expr))
+		{
+			/*
+			 * The expression is referenced by an aggregate function pushed
+			 * down to this relation and does not appear elsewhere in the
+			 * targetlist or havingQual.  Add it to 'agg_input' but not to
+			 * 'target'.
+			 */
+			add_new_column_to_pathtarget(agg_input, expr);
+		}
+		else
+		{
+			/*
+			 * The expression may be functionally dependent on other
+			 * expressions in the target, but we cannot verify this until all
+			 * target expressions have been constructed.
+			 */
+			possibly_dependent = lappend(possibly_dependent, expr);
+		}
+	}
+
+	/*
+	 * Now we can verify whether an expression is functionally dependent on
+	 * others.
+	 */
+	foreach(lc, possibly_dependent)
+	{
+		Var		   *tvar;
+		List	   *deps = NIL;
+		RangeTblEntry *rte;
+
+		tvar = lfirst_node(Var, lc);
+		rte = root->simple_rte_array[tvar->varno];
+
+		if (check_functional_grouping(rte->relid, tvar->varno,
+									  tvar->varlevelsup,
+									  target->exprs, &deps))
+		{
+			/*
+			 * The expression is functionally dependent on other target
+			 * expressions, so it can be included in the targets.  Since it
+			 * will not be used as a grouping key, a sortgroupref is not
+			 * needed for it.
+			 */
+			add_new_column_to_pathtarget(target, (Expr *) tvar);
+			add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+		}
+		else
+		{
+			/*
+			 * We may arrive here with a grouping expression that is proven
+			 * redundant by EquivalenceClass processing, such as 't1.a' in the
+			 * query below.
+			 *
+			 * select max(t1.c) from t t1, t t2 where t1.a = 1 group by t1.a,
+			 * t1.b;
+			 *
+			 * For now we just give up in this case.
+			 */
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ *	  Check whether the given Var appears in aggregate expressions and not
+ *	  elsewhere in the targetlist or havingQual.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+	ListCell   *lc;
+
+	/*
+	 * Search the list of aggregate expressions for the Var.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		List	   *vars;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+			continue;
+
+		vars = pull_var_clause((Node *) ac_info->aggref,
+							   PVC_RECURSE_AGGREGATES |
+							   PVC_RECURSE_WINDOWFUNCS |
+							   PVC_RECURSE_PLACEHOLDERS);
+
+		if (list_member(vars, var))
+		{
+			list_free(vars);
+			break;
+		}
+
+		list_free(vars);
+	}
+
+	return (lc != NULL && !list_member(root->tlist_vars, var));
+}
+
+/*
+ * is_var_needed_by_join
+ *	  Check if the given Var is needed by joins above the current rel.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+	Relids		relids;
+	int			attno;
+	RelOptInfo *baserel;
+
+	/*
+	 * Note that when checking if the Var is needed by joins above, we want to
+	 * exclude cases where the Var is only needed in the final output.  So
+	 * include "relation 0" in the check.
+	 */
+	relids = bms_copy(rel->relids);
+	relids = bms_add_member(relids, 0);
+
+	baserel = find_base_rel(root, var->varno);
+	attno = var->varattno - baserel->min_attr;
+
+	return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ *	  Return sortgroupref if the given 'expr' can act as grouping expression,
+ *	  or 0 otherwise.
+ *
+ * We first check if 'expr' is among the grouping expressions.  If it is not,
+ * we then check if 'expr' is known equal to any of the grouping expressions
+ * due to equivalence relationships.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->group_expr_list)
+	{
+		GroupExprInfo *ge_info = lfirst_node(GroupExprInfo, lc);
+
+		Assert(IsA(ge_info->expr, Var));
+
+		if (equal(ge_info->expr, expr) ||
+			exprs_known_equal(root, (Node *) expr, (Node *) ge_info->expr,
+							  ge_info->btree_opfamily))
+		{
+			Assert(ge_info->sortgroupref > 0);
+
+			return ge_info->sortgroupref;
+		}
+	}
+
+	/* The expression cannot act as grouping expression. */
+	return 0;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index c9d8cd796a..2286b981c3 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -929,6 +929,16 @@ struct config_bool ConfigureNamesBool[] =
 		false,
 		NULL, NULL, NULL
 	},
+	{
+		{"enable_eager_aggregate", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables eager aggregation."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&enable_eager_aggregate,
+		false,
+		NULL, NULL, NULL
+	},
 	{
 		{"enable_parallel_append", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables the planner's use of parallel append plans."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index b2bc43383d..e142d37c70 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -416,6 +416,7 @@
 #enable_tidscan = on
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
+#enable_eager_aggregate = off
 
 # - Planner Cost Constants -
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 54ee17697e..44728e5522 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -80,6 +80,25 @@ typedef enum UpperRelationKind
 	/* NB: UPPERREL_FINAL must be last enum entry; it's used to size arrays */
 } UpperRelationKind;
 
+/*
+ * A structure consisting of a list and a hash table to store relations.
+ *
+ * For small problems we just scan the list to do lookups, but when there are
+ * many relations we build a hash table for faster lookups.  The hash table is
+ * present and valid when 'hash' is not NULL.  Note that we still maintain the
+ * list even when using the hash table for lookups; this simplifies life for
+ * GEQO.
+ */
+typedef struct RelInfoList
+{
+	pg_node_attr(no_copy_equal, no_read)
+
+	NodeTag		type;
+
+	List	   *items;
+	struct HTAB *hash pg_node_attr(read_write_ignore);
+} RelInfoList;
+
 /*----------
  * PlannerGlobal
  *		Global information for planning/optimization
@@ -270,15 +289,16 @@ struct PlannerInfo
 
 	/*
 	 * join_rel_list is a list of all join-relation RelOptInfos we have
-	 * considered in this planning run.  For small problems we just scan the
-	 * list to do lookups, but when there are many join relations we build a
-	 * hash table for faster lookups.  The hash table is present and valid
-	 * when join_rel_hash is not NULL.  Note that we still maintain the list
-	 * even when using the hash table for lookups; this simplifies life for
-	 * GEQO.
+	 * considered in this planning run.
 	 */
-	List	   *join_rel_list;
-	struct HTAB *join_rel_hash pg_node_attr(read_write_ignore);
+	RelInfoList *join_rel_list; /* list of join-relation RelOptInfos */
+
+	/*
+	 * grouped_rel_list is a list of all grouped-relation RelOptInfos we have
+	 * considered in this planning run.  This is only used by eager
+	 * aggregation.
+	 */
+	RelInfoList *grouped_rel_list;	/* list of grouped-relation RelOptInfos */
 
 	/*
 	 * When doing a dynamic-programming-style join search, join_rel_level[k]
@@ -373,6 +393,15 @@ struct PlannerInfo
 	/* list of PlaceHolderInfos */
 	List	   *placeholder_list;
 
+	/* list of AggClauseInfos */
+	List	   *agg_clause_list;
+
+	/* list of GroupExprInfos */
+	List	   *group_expr_list;
+
+	/* list of plain Vars contained in targetlist and havingQual */
+	List	   *tlist_vars;
+
 	/* array of PlaceHolderInfos indexed by phid */
 	struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
 	/* allocated size of array */
@@ -614,7 +643,9 @@ typedef struct PartitionSchemeData *PartitionScheme;
  * the set of RT indexes for its component baserels, along with RT indexes
  * for any outer joins it has computed.  We create RelOptInfo nodes for each
  * baserel and joinrel, and store them in the PlannerInfo's simple_rel_array
- * and join_rel_list respectively.
+ * and join_rel_list respectively.  We also create RelOptInfo nodes for each
+ * grouped relation when eager aggregation is enabled, and store them in the
+ * PlannerInfo's grouped_rel_list.
  *
  * Note that there is only one joinrel for any given set of component
  * baserels, no matter what order we assemble them in; so an unordered
@@ -679,7 +710,10 @@ typedef struct PartitionSchemeData *PartitionScheme;
  *		cheapest_unique_path - for caching cheapest path to produce unique
  *			(no duplicates) output from relation; NULL if not yet requested
  *		cheapest_parameterized_paths - best paths for their parameterizations;
- *			always includes cheapest_total_path, even if that's unparameterized
+ *			always includes cheapest_total_path, even if that's unparameterized;
+ *			in the grouped relation case, always includes the unparameterized
+ *			path with the fewest rows, if there is one and it is not
+ *			cheapest_total_path
  *		direct_lateral_relids - rels this rel has direct LATERAL references to
  *		lateral_relids - required outer rels for LATERAL, as a Relids set
  *			(includes both direct and indirect lateral references)
@@ -998,6 +1032,12 @@ typedef struct RelOptInfo
 	/* consider partitionwise join paths? (if partitioned rel) */
 	bool		consider_partitionwise_join;
 
+	/*
+	 * used by eager aggregation:
+	 */
+	/* information needed to create grouped paths */
+	struct RelAggInfo *agg_info;
+
 	/*
 	 * inheritance links, if this is an otherrel (otherwise NULL):
 	 */
@@ -1071,6 +1111,68 @@ typedef struct RelOptInfo
 	((rel)->part_scheme && (rel)->boundinfo && (rel)->nparts > 0 && \
 	 (rel)->part_rels && (rel)->partexprs && (rel)->nullable_partexprs)
 
+/*
+ * Is the given relation a grouped relation?
+ */
+#define IS_GROUPED_REL(rel) \
+	((rel)->agg_info != NULL)
+
+/*
+ * RelAggInfo
+ *		Information needed to create grouped paths for base and join rels.
+ *
+ * "relids" is the set of relation identifiers (RT indexes).
+ *
+ * "target" is the output tlist for the grouped paths.
+ *
+ * "agg_input" is the output tlist for the paths that provide input to the
+ * grouped paths.  One difference from the reltarget of the non-grouped
+ * relation is that agg_input has its sortgrouprefs[] initialized.
+ *
+ * "grouped_rows" is the estimated number of result tuples of the grouped
+ * relation.
+ *
+ * "group_clauses", "group_exprs" and "group_pathkeys" are lists of
+ * SortGroupClauses, the corresponding grouping expressions and PathKeys
+ * respectively.
+ *
+ * "agg_useful" is a flag to indicate whether the grouped paths are considered
+ * useful.
+ */
+typedef struct RelAggInfo
+{
+	pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* set of base + OJ relids (rangetable indexes) */
+	Relids		relids;
+
+	/*
+	 * default result targetlist for Paths scanning this grouped relation;
+	 * list of Vars/Exprs, cost, width
+	 */
+	struct PathTarget *target;
+
+	/*
+	 * the targetlist for Paths that provide input to the grouped paths
+	 */
+	struct PathTarget *agg_input;
+
+	/* estimated number of result tuples */
+	Cardinality grouped_rows;
+
+	/* a list of SortGroupClauses */
+	List	   *group_clauses;
+	/* a list of grouping expressions */
+	List	   *group_exprs;
+	/* a list of PathKeys */
+	List	   *group_pathkeys;
+
+	/* the grouped paths are considered useful? */
+	bool		agg_useful;
+} RelAggInfo;
+
 /*
  * IndexOptInfo
  *		Per-index information for planning/optimization
@@ -3144,6 +3246,41 @@ typedef struct MinMaxAggInfo
 	Param	   *param;
 } MinMaxAggInfo;
 
+/*
+ * The aggregate expressions that appear in targetlist and having clauses
+ */
+typedef struct AggClauseInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the Aggref expr */
+	Aggref	   *aggref;
+
+	/* lowest level we can evaluate this aggregate at */
+	Relids		agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * The grouping expressions that appear in grouping clauses
+ */
+typedef struct GroupExprInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the represented expression */
+	Expr	   *expr;
+
+	/* the tleSortGroupRef of the corresponding SortGroupClause */
+	Index		sortgroupref;
+
+	/* btree opfamily defining the ordering */
+	Oid			btree_opfamily;
+} GroupExprInfo;
+
 /*
  * At runtime, PARAM_EXEC slots are used to pass values around from one plan
  * node to another.  They can be used to pass values down into subqueries (for
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 719be3897f..7747fb3397 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -313,10 +313,16 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
 									RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
+											RelOptInfo *rel_plain);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
+									 RelOptInfo *rel_plain);
 extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
 extern RelOptInfo *find_join_rel(PlannerInfo *root, Relids relids);
+extern void add_grouped_rel(PlannerInfo *root, RelOptInfo *rel);
+extern RelOptInfo *find_grouped_rel(PlannerInfo *root, Relids relids);
 extern RelOptInfo *build_join_rel(PlannerInfo *root,
 								  Relids joinrelids,
 								  RelOptInfo *outer_rel,
@@ -352,4 +358,5 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										SpecialJoinInfo *sjinfo,
 										int nappinfos, AppendRelInfo **appinfos);
 
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
 #endif							/* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 46955d128f..5e9d9597b9 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,6 +21,7 @@
  * allpaths.c
  */
 extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
 extern PGDLLIMPORT int geqo_threshold;
 extern PGDLLIMPORT int min_parallel_table_scan_size;
 extern PGDLLIMPORT int min_parallel_index_scan_size;
@@ -57,6 +58,10 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 								  bool override_rows);
 extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 										 bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root,
+								   RelOptInfo *rel_grouped,
+								   RelOptInfo *rel_plain,
+								   RelAggInfo *agg_info);
 extern int	compute_parallel_worker(RelOptInfo *rel, double heap_pages,
 									double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index fee3378bbe..9fc4550158 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -75,6 +75,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
 									Relids where_needed);
 extern void remove_useless_groupby_columns(PlannerInfo *root);
+extern void setup_eager_aggregation(PlannerInfo *root);
 extern void find_lateral_references(PlannerInfo *root);
 extern void rebuild_lateral_attr_needed(PlannerInfo *root);
 extern void create_lateral_join_info(PlannerInfo *root);
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 0000000000..9f63472eff
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1308 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b
+                                 Sort Key: t2.b
+                                 ->  Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Hash Join
+                                 Output: t2.c, t2.b, t3.c
+                                 Hash Cond: (t3.a = t2.a)
+                                 ->  Seq Scan on public.eager_agg_t3 t3
+                                       Output: t3.a, t3.b, t3.c
+                                 ->  Hash
+                                       Output: t2.c, t2.b, t2.a
+                                       ->  Seq Scan on public.eager_agg_t2 t2
+                                             Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b, t3.c
+                                 Sort Key: t2.b
+                                 ->  Hash Join
+                                       Output: t2.c, t2.b, t3.c
+                                       Hash Cond: (t3.a = t2.a)
+                                       ->  Seq Scan on public.eager_agg_t3 t3
+                                             Output: t3.a, t3.b, t3.c
+                                       ->  Hash
+                                             Output: t2.c, t2.b, t2.a
+                                             ->  Seq Scan on public.eager_agg_t2 t2
+                                                   Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Right Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Sort
+   Output: t2.b, (avg(t2.c))
+   Sort Key: t2.b
+   ->  HashAggregate
+         Output: t2.b, avg(t2.c)
+         Group Key: t2.b
+         ->  Hash Right Join
+               Output: t2.b, t2.c
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on public.eager_agg_t2 t2
+                     Output: t2.a, t2.b, t2.c
+               ->  Hash
+                     Output: t1.b
+                     ->  Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ b | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   |    
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Gather Merge
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Workers Planned: 2
+         ->  Sort
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Sort Key: t1.a
+               ->  Parallel Hash Join
+                     Output: t1.a, (PARTIAL avg(t2.c))
+                     Hash Cond: (t1.b = t2.b)
+                     ->  Parallel Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.a, t1.b, t1.c
+                     ->  Parallel Hash
+                           Output: t2.b, (PARTIAL avg(t2.c))
+                           ->  Partial HashAggregate
+                                 Output: t2.b, PARTIAL avg(t2.c)
+                                 Group Key: t2.b
+                                 ->  Parallel Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (20) TO (30);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (20) TO (30);
+INSERT INTO eager_agg_tab1 SELECT i % 30, i % 20 FROM generate_series(0, 299, 2) i;
+INSERT INTO eager_agg_tab2 SELECT i % 20, i % 30 FROM generate_series(0, 299, 3) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x  | sum  | count 
+----+------+-------
+  0 |  500 |   100
+  6 | 1100 |   100
+ 12 |  700 |   100
+ 18 | 1300 |   100
+ 24 |  900 |   100
+(5 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t2.y, (sum(t1.y)), (count(*))
+   Sort Key: t2.y
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t2.y, sum(t1.y), count(*)
+               Group Key: t2.y
+               ->  Hash Join
+                     Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.y, t1.x
+         ->  Finalize HashAggregate
+               Output: t2_1.y, sum(t1_1.y), count(*)
+               Group Key: t2_1.y
+               ->  Hash Join
+                     Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.y, t1_1.x
+         ->  Finalize HashAggregate
+               Output: t2_2.y, sum(t1_2.y), count(*)
+               Group Key: t2_2.y
+               ->  Hash Join
+                     Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ y  | sum  | count 
+----+------+-------
+  0 |  500 |   100
+  6 | 1100 |   100
+ 12 |  700 |   100
+ 18 | 1300 |   100
+ 24 |  900 |   100
+(5 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+                                                 QUERY PLAN                                                 
+------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t2.x, (sum(t1.x)), (count(*))
+   Sort Key: t2.x
+   ->  Finalize HashAggregate
+         Output: t2.x, sum(t1.x), count(*)
+         Group Key: t2.x
+         Filter: (avg(t1.x) > '10'::numeric)
+         ->  Append
+               ->  Hash Join
+                     Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2_1
+                           Output: t2_1.x, t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1_1
+                                       Output: t1_1.x
+               ->  Hash Join
+                     Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_2
+                           Output: t2_2.x, t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_2
+                                       Output: t1_2.x
+               ->  Hash Join
+                     Output: t2_3.x, (PARTIAL sum(t1_3.x)), (PARTIAL count(*)), (PARTIAL avg(t1_3.x))
+                     Hash Cond: (t2_3.y = t1_3.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_3
+                           Output: t2_3.x, t2_3.y
+                     ->  Hash
+                           Output: t1_3.x, (PARTIAL sum(t1_3.x)), (PARTIAL count(*)), (PARTIAL avg(t1_3.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_3.x, PARTIAL sum(t1_3.x), PARTIAL count(*), PARTIAL avg(t1_3.x)
+                                 Group Key: t1_3.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_3
+                                       Output: t1_3.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+ x  | sum  | count 
+----+------+-------
+  2 |  600 |    50
+  4 | 1200 |    50
+  8 |  900 |    50
+ 12 |  600 |    50
+ 14 | 1200 |    50
+ 18 |  900 |    50
+(6 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y)))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y))
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y))
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y))
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  
+----+-------
+  0 | 10000
+  2 | 14000
+  4 | 18000
+  6 | 22000
+  8 | 26000
+ 10 | 10000
+ 12 | 14000
+ 14 | 18000
+ 16 | 22000
+ 18 | 26000
+ 20 | 10000
+ 22 | 14000
+ 24 | 18000
+ 26 | 22000
+ 28 | 26000
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t3.y, sum((t2.y + t3.y))
+   Group Key: t3.y
+   ->  Sort
+         Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+         Sort Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t2_1.x = t1_1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                           Group Key: t2_1.x, t3_1.y, t3_1.x
+                           ->  Incremental Sort
+                                 Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                 Sort Key: t2_1.x, t3_1.y
+                                 Presorted Key: t2_1.x
+                                 ->  Merge Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Merge Cond: (t2_1.x = t3_1.x)
+                                       ->  Sort
+                                             Output: t2_1.y, t2_1.x
+                                             Sort Key: t2_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t2_1
+                                                   Output: t2_1.y, t2_1.x
+                                       ->  Sort
+                                             Output: t3_1.y, t3_1.x
+                                             Sort Key: t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+                     ->  Hash
+                           Output: t1_1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p1 t1_1
+                                 Output: t1_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t2_2.x = t1_2.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                           Group Key: t2_2.x, t3_2.y, t3_2.x
+                           ->  Incremental Sort
+                                 Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                 Sort Key: t2_2.x, t3_2.y
+                                 Presorted Key: t2_2.x
+                                 ->  Merge Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Merge Cond: (t2_2.x = t3_2.x)
+                                       ->  Sort
+                                             Output: t2_2.y, t2_2.x
+                                             Sort Key: t2_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t2_2
+                                                   Output: t2_2.y, t2_2.x
+                                       ->  Sort
+                                             Output: t3_2.y, t3_2.x
+                                             Sort Key: t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+                     ->  Hash
+                           Output: t1_2.x
+                           ->  Seq Scan on public.eager_agg_tab1_p2 t1_2
+                                 Output: t1_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y)))
+                     Hash Cond: (t2_3.x = t1_3.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y))
+                           Group Key: t2_3.x, t3_3.y, t3_3.x
+                           ->  Incremental Sort
+                                 Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                 Sort Key: t2_3.x, t3_3.y
+                                 Presorted Key: t2_3.x
+                                 ->  Merge Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Merge Cond: (t2_3.x = t3_3.x)
+                                       ->  Sort
+                                             Output: t2_3.y, t2_3.x
+                                             Sort Key: t2_3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t2_3
+                                                   Output: t2_3.y, t2_3.x
+                                       ->  Sort
+                                             Output: t3_3.y, t3_3.x
+                                             Sort Key: t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_3
+                                                   Output: t3_3.y, t3_3.x
+                     ->  Hash
+                           Output: t1_3.x
+                           ->  Seq Scan on public.eager_agg_tab1_p3 t1_3
+                                 Output: t1_3.x
+(88 rows)
+
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |  sum  
+----+-------
+  0 |  7500
+  2 | 13500
+  4 | 19500
+  6 | 25500
+  8 | 31500
+ 10 | 22500
+ 12 | 28500
+ 14 | 34500
+ 16 | 40500
+ 18 | 46500
+(10 rows)
+
+RESET enable_hashagg;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.y, (sum(t2.y)), (count(*))
+   Sort Key: t1.y
+   ->  Finalize HashAggregate
+         Output: t1.y, sum(t2.y), count(*)
+         Group Key: t1.y
+         ->  Append
+               ->  Hash Join
+                     Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1_1
+                           Output: t1_1.y, t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2_1
+                                       Output: t2_1.y, t2_1.x
+               ->  Hash Join
+                     Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_2
+                           Output: t1_2.y, t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_2
+                                       Output: t2_2.y, t2_2.x
+               ->  Hash Join
+                     Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_3
+                           Output: t1_3.y, t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_3
+                                       Output: t2_3.y, t2_3.x
+               ->  Hash Join
+                     Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_4
+                           Output: t1_4.y, t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_4
+                                       Output: t2_4.y, t2_4.x
+               ->  Hash Join
+                     Output: t1_5.y, (PARTIAL sum(t2_5.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_5.x = t2_5.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_5
+                           Output: t1_5.y, t1_5.x
+                     ->  Hash
+                           Output: t2_5.x, (PARTIAL sum(t2_5.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_5.x, PARTIAL sum(t2_5.y), PARTIAL count(*)
+                                 Group Key: t2_5.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_5
+                                       Output: t2_5.y, t2_5.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                                QUERY PLAN                                                
+----------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y)), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t3.y
+   ->  Finalize HashAggregate
+         Output: t3.y, sum((t2.y + t3.y)), count(*)
+         Group Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.y, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x, t3_1.y, t3_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.y, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x, t3_2.y, t3_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_2
+                                                   Output: t3_2.y, t3_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.y, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x, t3_3.y, t3_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_3
+                                                   Output: t3_3.y, t3_3.x
+               ->  Hash Join
+                     Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.y, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.y, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x, t3_4.y, t3_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_4
+                                                   Output: t3_4.y, t3_4.x
+               ->  Hash Join
+                     Output: t3_5.y, (PARTIAL sum((t2_5.y + t3_5.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_5.x = t2_5.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_5
+                           Output: t1_5.x
+                     ->  Hash
+                           Output: t2_5.x, t3_5.y, t3_5.x, (PARTIAL sum((t2_5.y + t3_5.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_5.x, t3_5.y, t3_5.x, PARTIAL sum((t2_5.y + t3_5.y)), PARTIAL count(*)
+                                 Group Key: t2_5.x, t3_5.y, t3_5.x
+                                 ->  Hash Join
+                                       Output: t2_5.y, t2_5.x, t3_5.y, t3_5.x
+                                       Hash Cond: (t2_5.x = t3_5.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_5
+                                             Output: t2_5.y, t2_5.x
+                                       ->  Hash
+                                             Output: t3_5.y, t3_5.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_5
+                                                   Output: t3_5.y, t3_5.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 91089ac215..6370504377 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -151,6 +151,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_async_append            | on
  enable_bitmapscan              | on
  enable_distinct_reordering     | on
+ enable_eager_aggregate         | off
  enable_gathermerge             | on
  enable_group_by_reordering     | on
  enable_hashagg                 | on
@@ -171,7 +172,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(23 rows)
+(24 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 1edd9e45eb..4fc210e2ef 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -119,7 +119,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # The stats test resets stats, so nothing else needing stats access can be in
 # this group.
 # ----------
-test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats predicate
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats predicate eager_aggregate
 
 # event_trigger depends on create_am and cannot run concurrently with
 # any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 0000000000..4050e4df44
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,192 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (20) TO (30);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (20);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (20) TO (30);
+INSERT INTO eager_agg_tab1 SELECT i % 30, i % 20 FROM generate_series(0, 299, 2) i;
+INSERT INTO eager_agg_tab2 SELECT i % 20, i % 30 FROM generate_series(0, 299, 3) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 10 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+RESET enable_hashagg;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index eb93debe10..af551da13e 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -41,6 +41,7 @@ AfterTriggersTableData
 AfterTriggersTransData
 Agg
 AggClauseCosts
+AggClauseInfo
 AggInfo
 AggPath
 AggSplit
@@ -1066,6 +1067,7 @@ GrantTargetType
 Group
 GroupByOrdering
 GroupClause
+GroupExprInfo
 GroupPath
 GroupPathExtraData
 GroupResultPath
@@ -1298,7 +1300,6 @@ Join
 JoinCostWorkspace
 JoinDomain
 JoinExpr
-JoinHashEntry
 JoinPath
 JoinPathExtraData
 JoinState
@@ -2384,13 +2385,17 @@ ReindexObjectType
 ReindexParams
 ReindexStmt
 ReindexType
+RelAggInfo
 RelFileLocator
 RelFileLocatorBackend
 RelFileNumber
+RelHashEntry
 RelIdCacheEnt
 RelIdToTypeIdCacheEntry
 RelInfo
 RelInfoArr
+RelInfoList
+RelInfoListInfo
 RelMapFile
 RelMapping
 RelOptInfo
-- 
2.43.0

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-01-14 15:07     ` Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Robert Haas @ 2025-01-14 15:07 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; Tom Lane <[email protected]>; +Cc: Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Sun, Jan 12, 2025 at 9:04 PM Richard Guo <[email protected]> wrote:
> Attached is an updated version of this patch that addresses Jian's
> review comments, along with some more cosmetic tweaks.  I'm going to
> be looking at this patch again from the point of view of committing
> it, with the plan to commit it late this week or early next week,
> barring any further comments or objections.

I feel this is rushed. This is a pretty big patch touching a sensitive
area of the code. I'm the only senior hacker who has reviewed it, and
I would say that I've only reviewed it pretty lightly, and that the
concerns I raised were fairly substantial. I don't think it's
customary to go from that point to commit after one more patch
revision. This really deserves to be looked at by multiple senior
hackers familiar with the planner; or at least by Tom.

My core concerns here are still what they were in the first email I
posted to the thread: it's unclear that the cost model will deliver
meaningful answers for the grouped rels, and it doesn't seem like
you've done enough to limit the overhead of the feature.

With regard to the first, I reiterate that we are in general quite bad
at having meaningful statistics for anything above an aggregate, and
this patch greatly expands how much of a query could be above an
aggregate. I felt back in August when I did my first review, and still
feel now, that when faced with a query where aggregation could be done
at any of several levels, the chances of picking the right one are not
much better than random. Why do you think otherwise?

With regard to the second, I suggested several lines of thinking back
in August that could lead to limiting the number of grouped_rels that
we create, but it doesn't really look like much of anything has
changed. We're still creating a partially grouped rel for every
baserel in the query, and every joinrel in the query. I'm not very
happy with "let's just turn it off by default" as the answer to that
concern. A lot of people won't enable the feature, which will mean it
doesn't have much value to our users, and those who do will still see
a lot of overhead. Maybe I'm wrong, but I bet with some good
heuristics the planning cost of this could be reduced by an order of
magnitude or more. If that were done, we could imagine eventually (or
maybe even immediately) enabling this by default; without that, we
still have the burden of maintaining this code and keeping it working,
but almost nobody will benefit.

+      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
+       <para>
+        Enables or disables the query planner's ability to partially push
+        aggregation past a join, and finalize it once all the relations are
+        joined. The default is <literal>off</literal>.

I'm a bit concerned about the naming here. I feel like we're adding an
increasing number of planner features with an increasing number of
disabling GUCs that are all a bit random. I kind of wonder if this
should be called enable_incremental_aggregate. Maybe that's worse,
because "eager" is a word we're not using for anything yet, so using
it here improves greppability and perhaps understandability. On the
other hand, the aggregate that is pushed down by this feature is
always partial (I believe) so we still need a finalize step later,
which means we're aggregating incrementally. There's some nice parity
with incremental sort, too, perhaps.
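
To make the "partial now, finalize later" shape concrete, here is a
hand-written sketch of roughly what the transformation amounts to, using
the eager_agg_t1 and eager_agg_t2 tables from the patch's regression
test. The subquery alias p and the column names sum_c, cnt_c and avg_c
are made up for illustration, and this only approximates the idea; it is
not what the planner actually emits:

```sql
-- Original query: join first, then aggregate.
SELECT t1.a, avg(t2.c)
FROM eager_agg_t1 t1
JOIN eager_agg_t2 t2 ON t1.b = t2.b
GROUP BY t1.a
ORDER BY t1.a;

-- Hand-written approximation of eager aggregation:
-- partially aggregate t2 by its join key, join, then finalize.
SELECT t1.a,
       sum(p.sum_c) / NULLIF(sum(p.cnt_c), 0) AS avg_c  -- finalize step
FROM eager_agg_t1 t1
JOIN (SELECT b, sum(c) AS sum_c, count(c) AS cnt_c      -- partial step
      FROM eager_agg_t2
      GROUP BY b) p ON t1.b = p.b
GROUP BY t1.a
ORDER BY t1.a;
```

The two forms should return the same rows up to floating-point rounding.
The inner aggregate is never final on its own, which is the sense in
which the aggregation proceeds incrementally.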

+/* The original length and hashtable of a RelInfoList */
+typedef struct
+{
+ int savelength;
+ struct HTAB *savehash;
+} RelInfoListInfo;

Both the comment and the name of the data type are completely meaningless.

+ /*
+ * Try at least sorting the cheapest path and also try
+ * incrementally sorting any path which is partially sorted
+ * already (no need to deal with paths which have presorted
+ * keys when incremental sort is disabled unless it's the
+ * cheapest input path).
+ */

This would be the fifth copy of this comment. It's not entirely this
patch's fault, of course, but some kind of refactoring or cleanup is
probably needed here.

+ * cheapest_parameterized_paths also always includes the fewest-row
+ * unparameterized path, if there is one, for grouped relations.  Different
+ * paths of a grouped relation can have very different row counts, and in some
+ * cases the cheapest-total unparameterized path may not be the one with the
+ * fewest row.

As I said back in October, this seems like mixing together in one
RelOptInfo paths that really belong to two different RelOptInfos.

--
Robert Haas
EDB: http://www.enterprisedb.com

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
@ 2025-01-15 06:58       ` Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Richard Guo @ 2025-01-15 06:58 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Wed, Jan 15, 2025 at 12:07 AM Robert Haas <[email protected]> wrote:
> On Sun, Jan 12, 2025 at 9:04 PM Richard Guo <[email protected]> wrote:
> > Attached is an updated version of this patch that addresses Jian's
> > review comments, along with some more cosmetic tweaks.  I'm going to
> > be looking at this patch again from the point of view of committing
> > it, with the plan to commit it late this week or early next week,
> > barring any further comments or objections.
>
> I feel this is rushed. This is a pretty big patch touching a sensitive
> area of the code. I'm the only senior hacker who has reviewed it, and
> I would say that I've only reviewed it pretty lightly, and that the
> concerns I raised were fairly substantial. I don't think it's
> customary to go from that point to commit after one more patch
> revision. This really deserves to be looked at by multiple senior
> hackers familiar with the planner; or at least by Tom.

Thank you for your input.  In fact, there have been several changes
since your last review, as I mentioned in the off-list email.
However, I agree that it would be great if someone else, especially
Tom, could take a look at this patch.

> My core concerns here are still what they were in the first email I
> posted to the thread: it's unclear that the cost model will deliver
> meaningful answers for the grouped rels, and it doesn't seem like
> you've done enough to limit the overhead of the feature.
>
> With regard to the first, I reiterate that we are in general quite bad
> at having meaningful statistics for anything above an aggregate, and
> this patch greatly expands how much of a query could be above an
> aggregate. I felt back in August when I did my first review, and still
> feel now, that when faced with a query where aggregation could be done
> at any of several levels, the chances of picking the right one are not
> much better than random. Why do you think otherwise?

I understand that we're currently quite bad at estimating the number
of groups after aggregation.  In fact, it's not just aggregation
estimates — we're also bad at join estimates in some cases.  This is a
reality we have to face.  Here's what I think: we should be trying our
best to cost each node type as accurately as possible, and then build
the upper nodes based on those costs.  We should not conclude that,
because we are unable to accurately cost one node type, we should
avoid any cost-based optimizations above that node.

Actually, performing aggregation before joins is not a new concept;
consider JOIN_UNIQUE_OUTER/INNER, for example:

explain (costs off)
select * from t t1 join t t2 on t1.b = t2.b
where (t1.a, t1.b) in
    (select t3.a, t3.b from t t3, t t4 where t3.a > t4.a);
                      QUERY PLAN
------------------------------------------------------
 Hash Join
   Hash Cond: ((t2.b = t1.b) AND (t3.a = t1.a))
   ->  Hash Join
         Hash Cond: (t2.b = t3.b)
         ->  Seq Scan on t t2
         ->  Hash
               ->  HashAggregate
                     Group Key: t3.a, t3.b
                     ->  Nested Loop
                           Join Filter: (t3.a > t4.a)
                           ->  Seq Scan on t t3
                           ->  Materialize
                                 ->  Seq Scan on t t4
   ->  Hash
         ->  Seq Scan on t t1
(15 rows)

I believe the HashAggregate node in this plan faces the same problem
with inaccurate estimates.  However, I don't think it's reasonable to
say that, because we cannot accurately cost the Aggregate node, we
should disregard considering JOIN_UNIQUE_OUTER/INNER.

Back in August, I responded to this issue by "Maybe we can run some
benchmarks first and investigate the regressions discovered on a
case-by-case basis".  In October, I ran the TPC-DS benchmark at scale
10 and observed that eager aggregation was applied in 7 queries, with
no notable regressions.  In contrast, Q4 and Q11 showed performance
improvements of 3–4 times.  Please see [1].

> With regard to the second, I suggested several lines of thinking back
> in August that could lead to limiting the number of grouped_rels that
> we create, but it doesn't really look like much of anything has
> changed. We're still creating a partially grouped rel for every
> baserel in the query, and every joinrel in the query. I'm not very
> happy with "let's just turn it off by default" as the answer to that
> concern. A lot of people won't enable the feature, which will mean it
> doesn't have much value to our users, and those who do will still see
> a lot of overhead. Maybe I'm wrong, but I bet with some good
> heuristics the planning cost of this could be reduced by an order of
> magnitude or more. If that were done, we could imagine eventually (or
> maybe even immediately) enabling this by default; without that, we
> still have the burden of maintaining this code and keeping it working,
> but almost nobody will benefit.

Actually, I introduced the EAGER_AGGREGATE_RATIO mechanism in October
to limit the planning effort for eager aggregation.  For more details,
please see [2].

And I don't think it's correct to say that we create a partially
grouped rel for every baserel and every joinrel.  This patch includes
a bunch of logic to determine whether it's appropriate to create a
grouped rel for a base or join rel.  Furthermore, with the
EAGER_AGGREGATE_RATIO mechanism, even if creating a grouped rel is
possible, we will skip it if the grouped paths are considered not
useful.  All of these measures help reduce the number of grouped
paths as well as the grouped relations in many cases where eager
aggregation would not help a lot.

Based on the TPC-DS benchmark results, I don't see "a lot of overhead"
in the planning cost, at least for the 7 queries where eager
aggregation is applied.  As I said in [1], "For the planning time, I
do not see notable regressions for any of the seven queries".  In
fact, I initially thought that we might consider enabling this by
default, given the positive benchmark results, but I just couldn't
summon the courage to do it.  Perhaps we should reconsider enabling it
by default, so users can benefit from the new feature and help
identify any potential bugs.
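
For anyone who wants to check the planning overhead on their own
queries, a minimal sketch follows. It assumes a build with this patch
applied (so that enable_eager_aggregate exists) and reuses a query and
the tables from the patch's regression test. EXPLAIN's SUMMARY option
reports planning time without executing the query:

```sql
-- Baseline: plan with eager aggregation disabled.
SET enable_eager_aggregate TO off;
EXPLAIN (COSTS OFF, SUMMARY ON)
SELECT t1.a, avg(t2.c + t3.c)
FROM eager_agg_t1 t1
JOIN eager_agg_t2 t2 ON t1.b = t2.b
JOIN eager_agg_t3 t3 ON t2.a = t3.a
GROUP BY t1.a ORDER BY t1.a;

-- Same query with eager aggregation enabled; compare "Planning Time".
SET enable_eager_aggregate TO on;
EXPLAIN (COSTS OFF, SUMMARY ON)
SELECT t1.a, avg(t2.c + t3.c)
FROM eager_agg_t1 t1
JOIN eager_agg_t2 t2 ON t1.b = t2.b
JOIN eager_agg_t3 t3 ON t2.a = t3.a
GROUP BY t1.a ORDER BY t1.a;

RESET enable_eager_aggregate;
```

A single run is noisy, so each variant should be repeated several times
before drawing any conclusion from the difference.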

> +      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
> +       <para>
> +        Enables or disables the query planner's ability to partially push
> +        aggregation past a join, and finalize it once all the relations are
> +        joined. The default is <literal>off</literal>.
>
> I'm a bit concerned about the naming here. I feel like we're adding an
> increasing number of planner features with an increasing number of
> disabling GUCs that are all a bit random. I kind of wonder if this
> should be called enable_incremental_aggregate. Maybe that's worse,
> because "eager" is a word we're not using for anything yet, so using
> it here improves greppability and perhaps understandability. On the
> other hand, the aggregate that is pushed down by this feature is
> always partial (I believe) so we still need a finalize step later,
> which means we're aggregating incrementally. There's some nice parity
> with incremental sort, too, perhaps.

As I mentioned in [3], the name "Eager Aggregation" is inherited from
the paper "Eager Aggregation and Lazy Aggregation" [4], from which
many of the ideas in this feature are derived.  Personally, I like
this name a lot, but I'm open to other names if others find it
unreasonable.

> +/* The original length and hashtable of a RelInfoList */
> +typedef struct
> +{
> + int savelength;
> + struct HTAB *savehash;
> +} RelInfoListInfo;
>
> Both the comment and the name of the data type are completely meaningless.

Thanks.  Will fix the comment and the name for this data type.

> + /*
> + * Try at least sorting the cheapest path and also try
> + * incrementally sorting any path which is partially sorted
> + * already (no need to deal with paths which have presorted
> + * keys when incremental sort is disabled unless it's the
> + * cheapest input path).
> + */
>
> This would be the fifth copy of this comment. It's not entirely this
> patch's fault, of course, but some kind of refactoring or cleanup is
> probably needed here.

Agreed.  However, I think it would be better to refactor this in a
separate patch.  This issue also exists on master, and I'd prefer to
avoid introducing such refactors in this already large patch.

> + * cheapest_parameterized_paths also always includes the fewest-row
> + * unparameterized path, if there is one, for grouped relations.  Different
> + * paths of a grouped relation can have very different row counts, and in some
> + * cases the cheapest-total unparameterized path may not be the one with the
> + * fewest row.
>
> As I said back in October, this seems like mixing together in one
> RelOptInfo paths that really belong to two different RelOptInfos.

I understand what you said in October about the design where
"PartialAgg(t1 JOIN t2) and t1 JOIN PartialAgg(t2) get separate
RelOptInfos", because "it's less clear whether it's fair to compare
across the two categories".  I've shared my thoughts on this in [5].

Furthermore, even if we separate these grouped paths into two
different RelOptInfos, we still face the issue that "different paths
of a grouped relation can have very different row counts", and we need
a way to handle this.  One could argue that we can separate the
grouped paths where partial aggregation is placed at different
locations into different RelOptInfos, but this would lead to an
explosion in the number of RelOptInfos for grouped relations as we
climb up the join tree.  I think this is neither realistic nor
necessary.

[1] https://postgr.es/m/CAMbWs49DrR8Gkp3TUwFJV_1ShtmLzQUq3mOYD+GyF+Y3AmmrFw@mail.gmail.com
[2] https://postgr.es/m/CAMbWs48OS3Z0G5u3fhar1=H_ucmEcUaX0tRUNpcLQxHt=z4Y7w@mail.gmail.com
[3] https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
[4] https://www.vldb.org/conf/1995/P345.PDF
[5] https://postgr.es/m/CAMbWs49dLjSSQRWeud+KSN0G531ciZdYoLBd5qktXA+3JQm_UQ@mail.gmail.com

Thanks
Richard

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-01-15 14:40         ` Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Robert Haas @ 2025-01-15 14:40 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Wed, Jan 15, 2025 at 1:58 AM Richard Guo <[email protected]> wrote:
> I understand that we're currently quite bad at estimating the number
> of groups after aggregation.  In fact, it's not just aggregation
> estimates — we're also bad at join estimates in some cases.  This is a
> reality we have to face.  Here's what I think: we should be trying our
> best to cost each node type as accurately as possible, and then build
> the upper nodes based on those costs.  We should not conclude that,
> because we are unable to accurately cost one node type, we should
> avoid any cost-based optimizations above that node.

Well, I agree with that last sentence, for sure. But I don't think
it's true that the situations with joins and aggregates are
comparable. We are much better able to estimate the number of rows
that will come out of a join than we are to estimate the number of
rows that come out of an aggregate. It's certainly true that in some
cases we get join estimates badly wrong, and I'd like to see us do
better there, but our estimates of the number of distinct values that
exist in a column are the least reliable part of our statistics system
by far.

Also, we look at the underlying statistics for a column variable even
after joins and aggregates and assume (not having any other
information) that the distribution after that operation is likely to
be similar to the distribution before that operation. Consider a table
A with columns x and y. Let's say x is a unique ID and y is a
dependent value with some distribution over a finite range of
possibilities (e.g. a person's age). If we join table A to some other
table B on A.x = B.x and filter out some of the rows via that join,
the distribution of values in column y is likely to be altered. If the
rows are removed at random, the original distribution will prevail,
but often it won't be random and so the distribution will change in a
way we can't predict. However, guessing that the pre-join distribution
of A.y still prevails isn't crazy, and it's better than assuming we can
say nothing about the distribution.

But now let's say that after joining to B, we perform an aggregation
operation, computing the minimum value of A.y for each value of B.z. At
this point, we have no usable statistics for either output column. The
result must be unique on B.z, and the distribution of MIN(A.y) is
going to be entirely different from the distribution of A.y. Any
future joins that we perform here will have to be estimated without
any MCVs, which is going to reduce the accuracy of the estimation by a
lot. In summary, the join makes relying on our MCV information less
likely to be accurate, but the aggregate makes it impossible to rely
on our MCV information at all. In terms of the accuracy of our
results, that is a lot worse.
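
As a rough illustration of this example, assuming made-up tables a and b
(the data generation and the y < 30 filter are invented purely to make
the join non-random):

```sql
-- Hypothetical tables for the example above; names and data are made up.
CREATE TABLE a (x int PRIMARY KEY, y int);  -- y: dependent value, e.g. age
CREATE TABLE b (x int, z int);

INSERT INTO a SELECT i, 18 + i % 60 FROM generate_series(1, 10000) i;
-- Non-random filter: b only references rows of a whose y is below 30.
INSERT INTO b SELECT i, i % 100 FROM generate_series(1, 10000) i
WHERE 18 + i % 60 < 30;

ANALYZE a;
ANALYZE b;

-- Base-column statistics for a.y; these are all the planner has for
-- a.y, even above the join.
SELECT n_distinct, most_common_vals
FROM pg_stats
WHERE tablename = 'a' AND attname = 'y';

-- Actual distribution of a.y after the join: only y values below 30 survive.
SELECT a.y, count(*) FROM a JOIN b ON a.x = b.x GROUP BY a.y ORDER BY a.y;

-- After aggregation: the result is unique on b.z, and no pg_stats row
-- describes the min(a.y) output column.
SELECT b.z, min(a.y) FROM a JOIN b ON a.x = b.x GROUP BY b.z ORDER BY b.z;
```

Comparing the pg_stats row with the second query shows how far the
post-join distribution can drift from the stored one. The third query
produces columns for which nothing is stored at all.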

> I believe the HashAggregate node in this plan faces the same problem
> with inaccurate estimates.  However, I don't think it's reasonable to
> say that, because we cannot accurately cost the Aggregate node, we
> should disregard considering JOIN_UNIQUE_OUTER/INNER.

Fair point.

> Back in August, I responded to this issue by "Maybe we can run some
> benchmarks first and investigate the regressions discovered on a
> case-by-case basis".  In October, I ran the TPC-DS benchmark at scale
> 10 and observed that eager aggregation was applied in 7 queries, with
> no notable regressions.  In contrast, Q4 and Q11 showed performance
> improvements of 3–4 times.  Please see [1].

I had forgotten about that, and again, fair point, but I'm concerned
that it might not be a broad enough base of queries to test against.
(7 isn't a very large number.)

> Actually, I introduced the EAGER_AGGREGATE_RATIO mechanism in October
> to limit the planning effort for eager aggregation.  For more details,
> please see [2].

OK, I missed this, but...

> And I don't think it's correct to say that we create a partially
> grouped rel for every baserel and every joinrel.  This patch includes
> a bunch of logic to determine whether it's appropriate to create a
> grouped rel for a base or join rel.  Furthermore, with the
> EAGER_AGGREGATE_RATIO mechanism, even if creating a grouped rel is
> possible, we will skip it if the grouped paths are considered not
> useful.  All of these measures help reduce the number of grouped
> paths as well as the grouped relations in many cases where eager
> aggregation would not help a lot.

...it looks to me like EAGER_AGGREGATE_RATIO is used to set the
RelAggInfo's agg_useful field, which seems like it happens after the
RelOptInfo has already been created. I had been looking for something
that would avoid creating the RelOptInfo in the first place and I
didn't see it.

> Based on the TPC-DS benchmark results, I don't see "a lot of overhead"
> in the planning cost, at least for the 7 queries where eager
> aggregation is applied.  As I said in [1], "For the planning time, I
> do not see notable regressions for any of the seven queries".  In
> fact, I initially thought that we might consider enabling this by
> default, given the positive benchmark results, but I just couldn't
> summon the courage to do it.  Perhaps we should reconsider enabling it
> by default, so users can benefit from the new feature and help
> identify any potential bugs.

If you're going to commit this, I think it would be a good idea to
enable it by default at least for now. If there are problems, it's
better to find out about them sooner rather than later. If they are
minor they can be fixed; if they are major, we can consider whether it
is better to fix them, disable the feature by default, or revert. We
can add an open item to reconsider the default setting during beta.

> > As I said back in October, this seems like mixing together in one
> > RelOptInfo paths that really belong to two different RelOptInfos.
>
> I understand that you said about the design in October where
> "PartialAgg(t1 JOIN t2) and t1 JOIN PartialAgg(t2) get separate
> RelOptInfos", because "it's less clear whether it's fair to compare
> across the two categories".  I've shared my thoughts on this in [5].
>
> Furthermore, even if we separate these grouped paths into two
> different RelOptInfos, we still face the issue that "different paths
> of a grouped relation can have very different row counts", and we need
> a way to handle this.  One could argue that we can separate the
> grouped paths where partial aggregation is placed at different
> locations into different RelOptInfos, but this would lead to an
> explosion in the number of RelOptInfos for grouped relations as we
> climb up the join tree.  I think this is neither realistic nor
> necessary.

It's possible you're right, but it does make me nervous. I do agree
that making the number of RelOptInfos explode would be really bad.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
@ 2025-01-16 08:18           ` Richard Guo <[email protected]>
  2025-01-16 21:40             ` Re: Eager aggregation, take 3 Tom Lane <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  0 siblings, 2 replies; 70+ messages in thread

From: Richard Guo @ 2025-01-16 08:18 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Wed, Jan 15, 2025 at 11:40 PM Robert Haas <[email protected]> wrote:
> On Wed, Jan 15, 2025 at 1:58 AM Richard Guo <[email protected]> wrote:
> > I understand that we're currently quite bad at estimating the number
> > of groups after aggregation.  In fact, it's not just aggregation
> > estimates — we're also bad at join estimates in some cases.  This is a
> > reality we have to face.  Here's what I think: we should be trying our
> > best to cost each node type as accurately as possible, and then build
> > the upper nodes based on those costs.  We should not conclude that,
> > because we are unable to accurately cost one node type, we should
> > avoid any cost-based optimizations above that node.
>
> Well, I agree with that last sentence, for sure. But I don't think
> it's true that the situations with joins and aggregates are
> comparable. We are much better able to estimate the number of rows
> that will come out of a join than we are to estimate the number of
> rows that come out of an aggregate. It's certainly true that in some
> cases we get join estimates badly wrong, and I'd like to see us do
> better there, but our estimates of the number of distinct values that
> exist in a column are the least reliable part of our statistics system
> by far.

I totally understand that the situation with joins is better than with
aggregates, which is why I said that we're also bad at join estimates
"in some cases", especially in the cases where we fall back to using
default selectivity estimates.  A simple example:

create table t1 (a int, b int);
create table t2 (a int, b int);

insert into t1 select i, i from generate_series(1,1000)i;
insert into t2 select i, i from generate_series(1000, 1999)i;

analyze t1, t2;

explain analyze select * from t1 join t2 on t1.a > t2.a;

And here is what I got:

 Nested Loop  (cost=0.00..15032.50 rows=333333 width=16)
              (actual time=392.841..392.854 rows=0 loops=1)

If this t1/t2 join is part of a larger SELECT query, I think the cost
estimates for the upper join nodes would likely be quite inaccurate.
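
The rows=333333 figure is consistent with applying a default selectivity
of one third to the 1000 x 1000 cross product. A minimal sketch of that
arithmetic follows; the constant is an assumption meant to mirror the
planner's default for inequality clauses, and the function names are
hypothetical.

```python
# Assumed default selectivity for an inequality clause with no usable stats.
DEFAULT_INEQ_SEL = 1.0 / 3.0

def estimated_join_rows(outer_rows, inner_rows, selectivity=DEFAULT_INEQ_SEL):
    """Row estimate as cross-product size times clause selectivity."""
    return round(outer_rows * inner_rows * selectivity)

def actual_join_rows():
    """Count rows satisfying t1.a > t2.a for the data in the example."""
    t1 = range(1, 1001)      # generate_series(1, 1000)
    t2 = range(1000, 2000)   # generate_series(1000, 1999)
    return sum(1 for a in t1 for b in t2 if a > b)
```

The largest t1.a equals the smallest t2.a, so no pair satisfies the
clause, while the estimate stays at a third of a million rows.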

> > I believe the HashAggregate node in this plan faces the same problem
> > with inaccurate estimates.  However, I don't think it's reasonable to
> > say that, because we cannot accurately cost the Aggregate node, we
> > should disregard considering JOIN_UNIQUE_OUTER/INNER.
>
> Fair point.
>
> > Back in August, I responded to this issue by "Maybe we can run some
> > benchmarks first and investigate the regressions discovered on a
> > case-by-case basis".  In October, I ran the TPC-DS benchmark at scale
> > 10 and observed that eager aggregation was applied in 7 queries, with
> > no notable regressions.  In contrast, Q4 and Q11 showed performance
> > improvements of 3–4 times.  Please see [1].
>
> I had forgotten about that, and again, fair point, but I'm concerned
> that it might not be a broad enough base of queries to test against.
> (7 isn't a very large number.)

Yeah, I know 7 is not a large number, but this is the result I got
from running the TPC-DS benchmark.  For the remaining 92 queries in
the benchmark, either the logic in this patch determines that eager
aggregation is not applicable, or the path with eager aggregation is
not the optimal one.  I'd be more than happy if a benchmark query
showed a significant performance regression, as that would provide an
opportunity to investigate how the cost estimates are negatively
impacting the final plan and explore ways to avoid or improve that.
If anyone can provide such a benchmark query, I'd be very grateful.

Perhaps this is another reason why we should enable this feature by
default, so we can identify such regression issues sooner rather than
later.

> > Actually, I introduced the EAGER_AGGREGATE_RATIO mechanism in October
> > to limit the planning effort for eager aggregation.  For more details,
> > please see [2].
>
> OK, I missed this, but...
>
> > And I don't think it's correct to say that we create a partially
> > grouped rel for every baserel and every joinrel.  This patch includes
> > a bunch of logic to determine whether it's appropriate to create a
> > grouped rel for a base or join rel.  Furthermore, with the
> > EAGER_AGGREGATE_RATIO mechanism, even if creating a grouped rel is
> > possible, we will skip it if the grouped paths are considered not
> > useful.  All of these measures help reduce the number of grouped
> > paths as well as the grouped relations in many cases where eager
> > aggregation would not help a lot.
>
> ...it looks to me like EAGER_AGGREGATE_RATIO is used to set the
> RelAggInfo's agg_useful field, which seems like it happens after the
> RelOptInfo has already been created. I had been looking for something
> that would avoid creating the RelOptInfo in the first place and I
> didn't see it.

Well, from the perspective of planning effort, what really matters is
whether the RelOptInfo for the grouped relation is added to the
PlannerInfo, as it is only then available for further joining in the
join search routine, not whether the RelOptInfo is built or not.
Building the RelOptInfo for a grouped relation is simply a makeNode
call followed by a flat copy; it doesn't require going through the
full process of determining its target list, or constructing its
restrict and join clauses, or calculating size estimates, etc.

Now, let's take a look at how the EAGER_AGGREGATE_RATIO mechanism is
used.  As you mentioned, EAGER_AGGREGATE_RATIO is used to set the
agg_useful field of the RelAggInfo.  For a base rel where we've
decided that aggregation can be pushed down, if agg_useful is false,
we skip building the grouped relation for it in the first place, not
to mention adding the grouped relation to the PlannerInfo.  For a join
rel where aggregation can be pushed down, if agg_useful is false, we
will create a temporary RelOptInfo for its grouped relation, but we
only add this RelOptInfo to the PlannerInfo if we can generate any
grouped paths by joining its input relations.  We could easily modify
make_grouped_join_rel() to create this temporary RelOptInfo only when
needed, but I'm not sure if that's necessary, since I don't have data
to suggest that the creation of this temporary RelOptInfo is a factor
in causing planning regressions.

> > Based on the TPC-DS benchmark results, I don't see "a lot of overhead"
> > in the planning cost, at least for the 7 queries where eager
> > aggregation is applied.  As I said in [1], "For the planning time, I
> > do not see notable regressions for any of the seven queries".  In
> > fact, I initially thought that we might consider enabling this by
> > default, given the positive benchmark results, but I just couldn't
> > summon the courage to do it.  Perhaps we should reconsider enabling it
> > by default, so users can benefit from the new feature and help
> > identify any potential bugs.
>
> If you're going to commit this, I think it would be a good idea to
> enable it by default at least for now. If there are problems, it's
> better to find out about them sooner rather than later. If they are
> minor they can be fixed; if they are major, we can consider whether it
> is better to fix them, disable the feature by default, or revert. We
> can add an open item to reconsider the default setting during beta.

Agreed.  And I like the suggestion of adding an open item about the
default setting during beta.

> > > As I said back in October, this seems like mixing together in one
> > > RelOptInfo paths that really belong to two different RelOptInfos.
> >
> > I understand that you said about the design in October where
> > "PartialAgg(t1 JOIN t2) and t1 JOIN PartialAgg(t2) get separate
> > RelOptInfos", because "it's less clear whether it's fair to compare
> > across the two categories".  I've shared my thoughts on this in [5].
> >
> > Furthermore, even if we separate these grouped paths into two
> > different RelOptInfos, we still face the issue that "different paths
> > of a grouped relation can have very different row counts", and we need
> > a way to handle this.  One could argue that we can separate the
> > grouped paths where partial aggregation is placed at different
> > locations into different RelOptInfos, but this would lead to an
> > explosion in the number of RelOptInfos for grouped relations as we
> > climb up the join tree.  I think this is neither realistic nor
> > necessary.
>
> It's possible you're right, but it does make me nervous. I do agree
> that making the number of RelOptInfos explode would be really bad.

Based on my explanation in [1], I think it's acceptable to compare
grouped paths for the same grouped rel, regardless of where the
partial aggregation is placed.

I fully understand that I could be wrong about this, but I don't think
it would break anything in regular planning (i.e., planning without
eager aggregation).  We would never compare a grouped path with a
non-grouped path during scan/join planning.  As far as I can see, the
only consequence in that case would be that we might fail to select
the optimal grouped path and miss out on fully leveraging the benefits
of eager aggregation.

Back in November, I considered the possibility of introducing a
GroupPathInfo into the Path structure to store the location of the
partial aggregation as well as the estimated rowcount for this grouped
path, similar to how ParamPathInfo functions for parameterized paths.
However, after some exploration, I determined that this was
unnecessary.

But in any case, I don't think it's an option to separate the grouped
paths of the same grouped relation into different RelOptInfos based on
the location of the partial aggregation within the path tree.

[1] https://postgr.es/m/CAMbWs49dLjSSQRWeud+KSN0G531ciZdYoLBd5qktXA+3JQm_UQ@mail.gmail.com

Thanks
Richard

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-01-16 21:40             ` Tom Lane <[email protected]>
  2025-01-17 12:19               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Tom Lane @ 2025-01-16 21:40 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Robert Haas <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; [email protected]

I'm very sorry for not having had any time to look at this patch
before --- it's been on my radar screen for a while, but $LIFE has
been rather demanding lately.

Anyway, I've now read through the mail thread and portions of the
v16 patch, and I have to concur with Robert's qualms about whether
this is ready.  A few observations:

* The README addition, and the basically identical text in the
commit message, don't even provide a reason to believe that the
transformation is correct let alone that it will result in faster
execution.  I don't understand why it's so hard to provide a solid
correctness argument.  This work was supposedly based on an academic
paper; surely that paper must have included a correctness proof?
PG might need a few refinements, like being specific about what we
expect from the equality operators.  But an EXPLAIN plan is not an
argument.

* As for the performance aspect, we're given

 Finalize HashAggregate
   Group Key: a.i
   ->  Nested Loop
         ->  Partial HashAggregate
               Group Key: b.j
               ->  Seq Scan on b
         ->  Index Only Scan using a_pkey on a
               Index Cond: (i = b.j)

As far as I can see, this will require aggregation to be performed
across every row of "b", whereas the naive way would have aggregated
across only rows having join partners in "a".  If most "b" rows lack
a join partner then this will be far slower than the naive way.
I do see that it can be better if most "b" rows have multiple join
partners, because we'll re-use partial aggregation results instead
of (effectively) recalculating them.  But the README text makes it
sound like this is an unconditional win, which is not the right
mindset.  (In fact, in this specific example where a.i is presumed
unique, how's it a win at all?)

* I'm also concerned about what happens with aggregates that can have
large partial-aggregation values, such as string_agg().  With the
existing usage of partial aggregation for parallel queries, it's
possible to be confident that there are not many partial-aggregation
values in existence at the same time.  I don't think that holds for
pushed-down aggregates: for example, I wouldn't be surprised if the
planner chooses a join plan that requires stuffing all those values
into a hash table, or "materializes" the output of the partial
aggregation step.  Do we have logic that will avoid blowing out
memory during such queries?

* I am just as worried as Robert is about the notion of different
paths for the same RelOptInfo having different rowcount estimates.
That is an extremely fundamental violation of basic planner
assumptions.  We did bend it for parameterized paths by restating
those assumptions as (from optimizer/README):

  To keep cost estimation rules relatively simple, we make an implementation
  restriction that all paths for a given relation of the same parameterization
  (i.e., the same set of outer relations supplying parameters) must have the
  same rowcount estimate.  This is justified by insisting that each such path
  apply *all* join clauses that are available with the named outer relations.

I don't see any corresponding statement here, and it's not clear
to me that the point has been thought through adequately.

Another aspect that bothers me is that a RelOptInfo is understood
to contain a bunch of paths that all yield the same data (the same
set of columns), and it seems like that might not be the case here.
Certainly partially-aggregated paths will output something different
than unaggregated ones, but mightn't different join orders mutate the
column set even further?

I think that we might be better off building a separate RelOptInfo for
each way of pushing down the aggregates, in order to preserve the
principle that all the paths in any one RelOptInfo have the same
output.  This'll mean more RelOptInfos, but not more paths, so
I doubt it adds that much performance overhead.

Richard Guo <[email protected]> writes:
> Back in November, I considered the possibility of introducing a
> GroupPathInfo into the Path structure to store the location of the
> partial aggregation as well as the estimated rowcount for this grouped
> path, similar to how ParamPathInfo functions for parameterized paths.
> However, after some exploration, I determined that this was
> unnecessary.

Why did you determine that was unnecessary?  The principal function
of ParamPathInfo IMV is to ensure that we use exactly the same
rowcount estimate for all the paths that should have the same
estimate, and that problem seems to exist here as well.  If you
don't have a forcing mechanism then paths' estimates will diverge
as a result of things like different roundoff errors in different
join sequences.

Anyway, I agree with Robert that this isn't ready.  I don't feel
that I can even review it adequately without a lot better internal
documentation, specifically a clearer statement of what query shapes
the optimization applies to and what's the rationale for the
transformation being correct.  The commentary in pathnodes.h for the
new data structures is likewise so skimpy as to be near useless.

			regards, tom lane

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-16 21:40             ` Re: Eager aggregation, take 3 Tom Lane <[email protected]>
@ 2025-01-17 12:19               ` Richard Guo <[email protected]>
  0 siblings, 0 replies; 70+ messages in thread

From: Richard Guo @ 2025-01-17 12:19 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Robert Haas <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; [email protected]

On Fri, Jan 17, 2025 at 6:40 AM Tom Lane <[email protected]> wrote:
> * The README addition, and the basically identical text in the
> commit message, don't even provide a reason to believe that the
> transformation is correct let alone that it will result in faster
> execution.  I don't understand why it's so hard to provide a solid
> correctness argument.  This work was supposedly based on an academic
> paper; surely that paper must have included a correctness proof?
> PG might need a few refinements, like being specific about what we
> expect from the equality operators.  But an EXPLAIN plan is not an
> argument.

Thank you for taking a look at this patch!

In the README, I provided the justification for the correctness of this
transformation as follows:

  For the partial aggregation that is pushed down to a non-aggregated
  relation, we need to consider all expressions from this relation that
  are involved in upper join clauses and include them in the grouping
  keys, using compatible operators.  This is essential to ensure that an
  aggregated row from the partial aggregation matches the other side of
  the join if and only if each row in the partial group does.  This
  ensures that all rows within the same partial group share the same
  'destiny', which is crucial for maintaining correctness.

I believed that this explanation would make it clear why this
transformation is correct.

Yeah, this work implements one of the transformations introduced in
the paper "Eager Aggregation and Lazy Aggregation".  In the paper, Section
4 presents the formalism, Section 5 proves the main theorem, and
Section 6 introduces corollaries related to this specific
transformation.  I'm just not sure how to translate these theorems and
corollaries into natural language that would be suitable to be
included in the README.  I can give it another try if you find the
above justification not clear enough, but it would be really helpful
if I could get some assistance with this.

And I'd like to clarify that the EXPLAIN plan included in the README
is only meant to illustrate what this transformation looks like, and is
not intended to serve as an argument for its correctness.

> * As for the performance aspect, we're given
>
>  Finalize HashAggregate
>    Group Key: a.i
>    ->  Nested Loop
>          ->  Partial HashAggregate
>                Group Key: b.j
>                ->  Seq Scan on b
>          ->  Index Only Scan using a_pkey on a
>                Index Cond: (i = b.j)
>
> As far as I can see, this will require aggregation to be performed
> across every row of "b", whereas the naive way would have aggregated
> across only rows having join partners in "a".

Yes, that's correct.

> If most "b" rows lack
> a join partner then this will be far slower than the naive way.

No, this is not correct.  The partial aggregation may reduce the
number of input rows to the join, and the resulting data reduction
could justify the cost of performing the partial aggregation.  As an
example, please consider:

create table t1 (a int, b int, c int);
create table t2 (a int, b int, c int);

insert into t1 select i%3, i%3, i from generate_series(1,1000000)i;
insert into t2 select i%3+3, i%3+3, i from generate_series(1,1000000)i;

analyze t1, t2;

explain analyze
select sum(t2.c) from t1 join t2 on t1.b = t2.b group by t1.a;

So for this query, most (actually all) t2 rows lack a join partner.

Running it with and without eager aggregation, I got (best of 3):

-- with eager aggregation
 Execution Time: 496.856 ms

-- without eager aggregation
 Execution Time: 1723.844 ms
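
A rough Python emulation of this t1/t2 query, only to show where the row
reduction comes from. It is a sketch, not the executor: the function names
are hypothetical, and the second value each function returns is simply the
number of t2-side rows that the join has to hold.

```python
from collections import defaultdict

def make_tables(n):
    """Rows are (a, b, c), following the INSERT statements above."""
    t1 = [(i % 3, i % 3, i) for i in range(1, n + 1)]
    t2 = [(i % 3 + 3, i % 3 + 3, i) for i in range(1, n + 1)]
    return t1, t2

def naive(t1, t2):
    """Join on t1.b = t2.b first, then compute sum(t2.c) GROUP BY t1.a."""
    by_b = defaultdict(list)
    for _, b, c in t2:
        by_b[b].append(c)            # every t2 row is kept for the join
    out = defaultdict(int)
    for a, b, _ in t1:
        for c2 in by_b.get(b, ()):
            out[a] += c2
    return dict(out), len(t2)

def eager(t1, t2):
    """Partially aggregate t2 by the join key, then join and finalize."""
    partial = defaultdict(int)
    for _, b, c in t2:
        partial[b] += c              # one partial sum per distinct t2.b
    out = defaultdict(int)
    for a, b, _ in t1:
        if b in partial:
            out[a] += partial[b]
    return dict(out), len(partial)
```

For the generated data both variants return an empty result, but the
eager variant carries only three t2-side rows into the join instead of
one per input row.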

> I do see that it can be better if most "b" rows have multiple join
> partners, because we'll re-use partial aggregation results instead
> of (effectively) recalculating them.

Not only because we'll re-use partial aggregation results, but also
(and perhaps more importantly) because the number of input rows to the
join could be significantly reduced.

> But the README text makes it
> sound like this is an unconditional win, which is not the right
> mindset.

I'm sorry if the README text gives that impression.  The README says:

 If the partial aggregation on table B significantly reduces the number
 of input rows, the join above will be much cheaper, leading to a more
 efficient final plan.

Perhaps I should use "could" or "might" instead of "will" to make it
less misleading.

But as you can see from the implementation, the decision is entirely
based on cost, not on rules.  There is no part of the code that ever
assumes this transformation is an unconditional win.

> (In fact, in this specific example where a.i is presumed
> unique, how's it a win at all?)

It seems to me that whether it's a win depends on whether b.j is a
column with low cardinality (i.e., relatively few unique values).  I
don't really see how a.i being unique would change that.  Please
see the example below:

create table a (i int primary key, x int);
create table b (j int, y int);

insert into a select i, i%3 from generate_series(1,10000)i;
insert into b select i%3, i from generate_series(1,10000)i;

analyze a, b;

set enable_eager_aggregate to off;

EXPLAIN (ANALYZE, COSTS OFF)
SELECT a.i, avg(b.y)
FROM a JOIN b ON a.i > b.j
GROUP BY a.i;
                                            QUERY PLAN
--------------------------------------------------------------------------------------------------
 HashAggregate (actual time=100257.254..100268.841 rows=10000 loops=1)
   Group Key: a.i
   Batches: 1  Memory Usage: 2193kB
   Buffers: shared hit=133
   ->  Nested Loop (actual time=2.629..40849.630 rows=99990000 loops=1)
         Buffers: shared hit=133
         ->  Seq Scan on b (actual time=0.450..10.066 rows=10000 loops=1)
               Buffers: shared hit=45
         ->  Memoize (actual time=0.002..0.752 rows=9999 loops=10000)
               Cache Key: b.j
               Cache Mode: binary
               Hits: 9997  Misses: 3  Evictions: 0  Overflows: 0  Memory Usage: 1055kB
               Buffers: shared hit=88
               ->  Index Only Scan using a_pkey on a (actual time=0.752..8.100 rows=9999 loops=3)
                     Index Cond: (i > b.j)
                     Heap Fetches: 0
                     Buffers: shared hit=88
 Planning Time: 1.681 ms
 Execution Time: 100273.011 ms
(19 rows)

set enable_eager_aggregate to on;

EXPLAIN (ANALYZE, COSTS OFF)
SELECT a.i, avg(b.y)
FROM a JOIN b ON a.i > b.j
GROUP BY a.i;
                                         QUERY PLAN
--------------------------------------------------------------------------------------------
 Finalize HashAggregate (actual time=77.701..90.680 rows=10000 loops=1)
   Group Key: a.i
   Batches: 1  Memory Usage: 2193kB
   Buffers: shared hit=133
   ->  Nested Loop (actual time=27.586..52.352 rows=29997 loops=1)
         Buffers: shared hit=133
         ->  Partial HashAggregate (actual time=27.408..27.419 rows=3 loops=1)
               Group Key: b.j
               Batches: 1  Memory Usage: 24kB
               Buffers: shared hit=45
               ->  Seq Scan on b (actual time=0.173..3.767 rows=10000 loops=1)
                     Buffers: shared hit=45
         ->  Index Only Scan using a_pkey on a (actual time=0.108..5.277 rows=9999 loops=3)
               Index Cond: (i > b.j)
               Heap Fetches: 0
               Buffers: shared hit=88
 Planning Time: 1.739 ms
 Execution Time: 93.003 ms
(18 rows)

There is a performance improvement of ~1000 times, even though a.i is
unique.

# select 100273.011/93.003;
       ?column?
-----------------------
 1078.1696396890422890
(1 row)

(I used 'a.i > b.j' instead of 'a.i = b.j' to make the performance
difference more noticeable.  I believe this is fine, as it doesn't
undermine the fact that a.i is unique.)
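
The join row counts in the two plans (99990000 versus 29997) can be
reproduced with a small Python emulation. This is only a sketch under the
assumption that avg() is carried as a (sum, count) partial state; all
function names are hypothetical. Grouping b by b.j before the join is safe
here because every b row with the same j passes or fails "a.i > b.j"
together.

```python
from collections import defaultdict

def make_tables(n):
    a = list(range(1, n + 1))                  # a.i, unique
    b = [(i % 3, i) for i in range(1, n + 1)]  # (b.j, b.y)
    return a, b

def naive_plan(a, b):
    """Join every a row with every b row, then aggregate by a.i."""
    acc = defaultdict(lambda: [0, 0])          # a.i -> [sum, count]
    join_rows = 0
    for i in a:
        for j, y in b:
            if i > j:
                join_rows += 1
                acc[i][0] += y
                acc[i][1] += 1
    return {i: s / c for i, (s, c) in acc.items()}, join_rows

def eager_plan(a, b):
    """Partially aggregate b by b.j, join, then combine partial states."""
    partial = defaultdict(lambda: [0, 0])      # b.j -> [sum, count]
    for j, y in b:
        partial[j][0] += y
        partial[j][1] += 1
    acc = defaultdict(lambda: [0, 0])
    join_rows = 0
    for i in a:
        for j, (s, c) in partial.items():
            if i > j:
                join_rows += 1
                acc[i][0] += s
                acc[i][1] += c
    return {i: s / c for i, (s, c) in acc.items()}, join_rows

def expected_join_rows(n):
    """Join output sizes for n rows per table, without running the join."""
    counts = [sum(1 for i in range(1, n + 1) if i % 3 == j) for j in range(3)]
    naive = sum(cnt * (n - j) for j, cnt in enumerate(counts))
    eager = sum(n - j for j in range(3))
    return naive, eager
```

Calling expected_join_rows(10000) gives (99990000, 29997), matching the
Nested Loop row counts in the two EXPLAIN outputs, and on small inputs the
two plans produce identical averages.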

> * I'm also concerned about what happens with aggregates that can have
> large partial-aggregation values, such as string_agg().  With the
> existing usage of partial aggregation for parallel queries, it's
> possible to be confident that there are not many partial-aggregation
> values in existence at the same time.  I don't think that holds for
> pushed-down aggregates: for example, I wouldn't be surprised if the
> planner chooses a join plan that requires stuffing all those values
> into a hash table, or "materializes" the output of the partial
> aggregation step.  Do we have logic that will avoid blowing out
> memory during such queries?

Good point!  Thank you for bringing this up.  I hadn't considered it
before, and it seems no one else has raised this issue.  I'll look
into it.

> * I am just as worried as Robert is about the notion of different
> paths for the same RelOptInfo having different rowcount estimates.
> That is an extremely fundamental violation of basic planner
> assumptions.  We did bend it for parameterized paths by restating
> those assumptions as (from optimizer/README):
>
>   To keep cost estimation rules relatively simple, we make an implementation
>   restriction that all paths for a given relation of the same parameterization
>   (i.e., the same set of outer relations supplying parameters) must have the
>   same rowcount estimate.  This is justified by insisting that each such path
>   apply *all* join clauses that are available with the named outer relations.
>
> I don't see any corresponding statement here, and it's not clear
> to me that the point has been thought through adequately.
>
> Another aspect that bothers me is that a RelOptInfo is understood
> to contain a bunch of paths that all yield the same data (the same
> set of columns), and it seems like that might not be the case here.
> Certainly partially-aggregated paths will output something different
> than unaggregated ones, but mightn't different join orders mutate the
> column set even further?
>
> I think that we might be better off building a separate RelOptInfo for
> each way of pushing down the aggregates, in order to preserve the
> principle that all the paths in any one RelOptInfo have the same
> output.  This'll mean more RelOptInfos, but not more paths, so
> I doubt it adds that much performance overhead.

Hmm, IIUC, this means that we would separate the grouped paths of the
same grouped relation into different RelOptInfos based on the location
of the partial aggregation within the path tree.  Let's define the
"location" as the relids of the relation on top of which we place the
partial aggregation.  For grouped relation {A B C D}, if we perform
some aggregation on C, we would end up with 8 different grouped paths:

{A B D PartialAgg(C)}
{B D PartialAgg(A C)}
{A D PartialAgg(B C)}
{A B PartialAgg(D C)}
{D PartialAgg(A B C)}
{B PartialAgg(A D C)}
{A PartialAgg(B D C)}
{PartialAgg(A B D C)}

That means we would need to create 8 RelOptInfos for this grouped
relation.  If my math doesn't fail me, for a relation containing n
base rels, we would need to create 2^(n-1) different RelOptInfos.
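
As a quick sanity check of that count, here is a small Python sketch
(not planner code; the function name is made up for illustration) that
enumerates every relid set containing the aggregated rel, each of
which is one possible location for the partial aggregation:

```python
from itertools import combinations

def partial_agg_locations(rels, agg_rel):
    """Return every relid set that contains agg_rel.

    Each set is one possible "location", i.e. the relation on top of
    which the partial aggregation could be placed.
    """
    others = [r for r in rels if r != agg_rel]
    locations = []
    for k in range(len(others) + 1):
        for extra in combinations(others, k):
            locations.append(frozenset((agg_rel,) + extra))
    return locations

locs = partial_agg_locations(["A", "B", "C", "D"], "C")
for loc in locs:
    print("PartialAgg(%s)" % " ".join(sorted(loc)))
print(len(locs))
```

For {A B C D} with the aggregation on C this lists 8 locations, and
each additional base rel doubles the count.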

When building grouped relation {A B C D E} by joining {A B C D} with
{E}, we would need to call make_grouped_join_rel() 8 times, each time
joining {E} with one of the 8 RelOptInfos mentioned above.  And at
finally, considering other join orders such as joining {A B C E} with
{D}, this new grouped relation would end up with 16 new RelOptInfos.

And then we proceed with building grouped relation {A B C D E F}, and
end up with 32 new RelOptInfos, and this process continues...

It seems to me that this doesn't only result in more RelOptInfos, it
can also lead to more paths.  Consider two grouped paths, say P1 and
P2, for the same grouped relation, but with different locations of the
partial aggregation.  Suppose P1 is cheaper, at least as well ordered,
generates no more rows, requires no outer rels not required by P2, and
is no less parallel-safe.  If these two paths are kept in the same
RelOptInfo, P2 will be discarded and not considered in further
planning.  However, if P1 and P2 are separated into different
RelOptInfos, and P2 happens to have survived the add_path() tournament
for the RelOptInfo it is in, then it will be considered in subsequent
planning steps.
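
To make that concrete, here is a toy Python model of the tournament.
It is only a sketch: the real add_path() also looks at startup cost,
uses fuzzy cost comparisons, and so on, and every name and number
below is invented for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class GroupedPath:
    name: str
    total_cost: float
    rows: float
    sorted_keys: int           # crude stand-in for "how well ordered"
    required_outer: frozenset  # outer rels the path depends on
    parallel_safe: bool

def dominates(p1, p2):
    """True if p1 is at least as good as p2 on every criterion."""
    return (p1.total_cost <= p2.total_cost
            and p1.sorted_keys >= p2.sorted_keys
            and p1.rows <= p2.rows
            and p1.required_outer <= p2.required_outer
            and p1.parallel_safe >= p2.parallel_safe)

def add_path(pathlist, new):
    """Keep 'new' unless an existing path dominates it; evict losers."""
    if any(dominates(old, new) for old in pathlist):
        return pathlist
    return [old for old in pathlist if not dominates(new, old)] + [new]

p1 = GroupedPath("P1", 100.0, 50, 0, frozenset(), True)
p2 = GroupedPath("P2", 120.0, 80, 0, frozenset(), True)

# One RelOptInfo for the grouped rel: P2 loses to P1 and is dropped.
shared = add_path(add_path([], p1), p2)

# One RelOptInfo per aggregation location: each path wins its own list.
separate = {"loc1": add_path([], p1), "loc2": add_path([], p2)}

print([p.name for p in shared])
print(sum(len(paths) for paths in separate.values()))
```

With the shared list only P1 survives, while the per-location lists
carry both paths forward into later join levels.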

So in any case, this doesn't seem like a feasible approach to me.

I also have some thoughts on grouped paths and parameterized paths,
but I've run out of time for today.  I'll send a separate email.

I'm really glad you're taking a look at this patch.  Thank you!

Thanks
Richard

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-01-17 21:16             ` Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Robert Haas @ 2025-01-17 21:16 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Thu, Jan 16, 2025 at 3:18 AM Richard Guo <[email protected]> wrote:
> If this t1/t2 join is part of a larger SELECT query, I think the cost
> estimates for the upper join nodes would likely be quite inaccurate.

That's definitely true. However, the question is not whether the
planner has problems today (it definitely does) but whether it's OK to
make this change without improving our ability to estimate the effects
of aggregation operations. I understand that you (quite rightly) don't
want to get sucked into fixing unrelated planner problems, and I'm
also not sure to what extent these problems are actually fixable.
However, major projects sometimes require such work. For instance,
commit 5edc63bda68a77c4d38f0cbeae1c4271f9ef4100 was motivated by the
discovery that it was too easy to get a Parallel Bitmap Heap Scan plan
even when it wasn't best. The fact that the costing wasn't right
wasn't the fault of parallel query, but parallel query still needed to
do something about it to get good results.

> Yeah, I know 7 is not a large number, but this is the result I got
> from running the TPC-DS benchmark.  For the remaining 92 queries in
> the benchmark, either the logic in this patch determines that eager
> aggregation is not applicable, or the path with eager aggregation is
> not the optimal one.  I'd be more than happy if a benchmark query
> showed significant performance regression, so it would provide an
> opportunity to investigate how the cost estimates are negatively
> impacting the final plan and explore ways to avoid or improve that.
> If anyone can provide such a benchmark query, I'd be very grateful.

Yes, having more people test this and look for regressions would be
quite valuable.

> Well, from the perspective of planning effort, what really matters is
> whether the RelOptInfo for the grouped relation is added to the
> PlannerInfo, as it is only then available for further joining in the
> join search routine, not whether the RelOptInfo is built or not.
> Building the RelOptInfo for a grouped relation is simply a makeNode
> call followed by a flat copy; it doesn't require going through the
> full process of determining its target list, or constructing its
> restrict and join clauses, or calculating size estimates, etc.

That's probably mostly true, but the overhead of memory allocations in
planner routines is not trivial. There are previous cases of changing
things, or declining to change them, purely because of the number of
palloc cycles involved.

> > It's possible you're right, but it does make me nervous. I do agree
> > that making the number of RelOptInfos explode would be really bad.
>
> Based on my explanation in [1], I think it's acceptable to compare
> grouped paths for the same grouped rel, regardless of where the
> partial aggregation is placed.
>
> I fully understand that I could be wrong about this, but I don't think
> it would break anything in regular planning (i.e., planning without
> eager aggregation).

I think you might be taking too narrow a view of the problem. As Tom
says, the issue is that this breaks a bunch of assumptions that hold
elsewhere. One place that shows up in the patch is in the special-case
logic you've added to set_cheapest(), but I fear that won't be the end
of it. It seems a bit surprising to me that you didn't also need to
adjust add_path(), for example. Even if you don't, there's lots of
places that rely on the assumption that all paths for a RelOptInfo are
returning the same set of rows. If it turns out that a bunch of those
places need to be adjusted to work with this, then the code could
potentially end up quite messy, and that might also have performance
consequences, even when this feature is disabled. Many of the code
paths that deal with paths in the planner are quite hot.

To say that another way, I'm not so much worried about the possibility
that the patch contains a bug. Patches contain bugs all the time and
we can just fix them. It's not wonderful, but that's how software
development goes. What I am worried about is whether the architecture
is right. If we go with one RelOptInfo when the "right answer" is
many, or for that matter if we go with many when the right answer is
one, those are things that cannot be easily and reasonably patched
post-commit, and especially not post-release.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
@ 2025-01-19 12:53               ` Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-20 17:57                 ` Re: Eager aggregation, take 3 Tom Lane <[email protected]>
  0 siblings, 2 replies; 70+ messages in thread

From: Richard Guo @ 2025-01-19 12:53 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Sat, Jan 18, 2025 at 6:16 AM Robert Haas <[email protected]> wrote:
> On Thu, Jan 16, 2025 at 3:18 AM Richard Guo <[email protected]> wrote:
> > If this t1/t2 join is part of a larger SELECT query, I think the cost
> > estimates for the upper join nodes would likely be quite inaccurate.
>
> That's definitely true. However, the question is not whether the
> planner has problems today (it definitely does) but whether it's OK to
> make this change without improving our ability to estimate the effects
> of aggregation operations. I understand that you (quite rightly) don't
> want to get sucked into fixing unrelated planner problems, and I'm
> also not sure to what extent these problems are actually fixable.
> However, major projects sometimes require such work. For instance,
> commit 5edc63bda68a77c4d38f0cbeae1c4271f9ef4100 was motivated by the
> discovery that it was too easy to get a Parallel Bitmap Heap Scan plan
> even when it wasn't best. The fact that the costing wasn't right
> wasn't the fault of parallel query, but parallel query still needed to
> do something about it to get good results.

Yeah, it's true that we have problems in aggregate estimates today.
And it has been the case for a long time.  In the past, we made some
improvements in this area, such as in 84f9a35e3, where we adopted a
new formula that is based on the random selection probability,
inspired by two papers from Yao and Dell'Era.  But we still have
problems with aggregate estimates.  I'm not sure when we could fix
these problems, but I doubt that it will happen in the near future.
(Sorry to be pessimistic.)

If, in the end, the conclusion of this discussion is that we should not
apply this change until we fix those problems in aggregate estimates,
I'd be very sad.  This conclusion is absolutely correct, for sure, in
an ideal world, but in the real world, it feels like a death sentence
for this patch, and for any future patches that attempt to apply some
optimizations above aggregate nodes - unless, of course, the day
arrives when we finally fix those aggregate estimate problems, which
doesn't seem likely in the near future.

And if that's the case, can I then argue that the same principle
should apply to joins?  Specifically, should we refrain from applying
any optimizations above join nodes until we've fixed the join estimate
problems, particularly in cases where we fall back on default
selectivity estimates?

Please do not get me wrong.  I'm not saying that we should not fix the
problems in our current aggregate estimates.  I think, as I said
previously, that the realistic approach is to first identify some
real-world queries where this patch causes significant performance
regressions.  This would give us the opportunity to investigate these
regressions and understand how the bad cost estimates contributed to
them.  From there, we could figure out where to start fixing the cost
estimates.  And if we find that the problem is not entirely fixable,
we could then explore the possibility of introducing new heuristics to
avoid the performance regressions as much as possible.  In my opinion,
it's not really possible to make cost estimation perfect in all cases.
In a sense, cost estimation is an art of compromise.

I believe this is also the approach that commit 5edc63bda followed.
First, it was found that Bitmap Heap Scans caused performance
regressions in many TPCH queries in cases where work_mem was low.
Then, this issue was thoroughly discussed, and eventually it was
figured out that the impact of lossy pages needed to be accounted for
when estimating the cost of bitmap scans, which became 5edc63bda.

> > Well, from the perspective of planning effort, what really matters is
> > whether the RelOptInfo for the grouped relation is added to the
> > PlannerInfo, as it is only then available for further joining in the
> > join search routine, not whether the RelOptInfo is built or not.
> > Building the RelOptInfo for a grouped relation is simply a makeNode
> > call followed by a flat copy; it doesn't require going through the
> > full process of determining its target list, or constructing its
> > restrict and join clauses, or calculating size estimates, etc.
>
> That's probably mostly true, but the overhead of memory allocations in
> planner routines is not trivial. There are previous cases of changing
> things, or declining to change them, purely because of the number of
> palloc cycles involved.

Hmm, I think you are right.  I will modify make_grouped_join_rel() to
create the RelOptInfo for a grouped join relation only if we can
generate any grouped paths by joining its input relations.

> > > It's possible you're right, but it does make me nervous. I do agree
> > > that making the number of RelOptInfos explode would be really bad.
> >
> > Based on my explanation in [1], I think it's acceptable to compare
> > grouped paths for the same grouped rel, regardless of where the
> > partial aggregation is placed.
> >
> > I fully understand that I could be wrong about this, but I don't think
> > it would break anything in regular planning (i.e., planning without
> > eager aggregation).
>
> I think you might be taking too narrow a view of the problem. As Tom
> says, the issue is that this breaks a bunch of assumptions that hold
> elsewhere. One place that shows up in the patch is in the special-case
> logic you've added to set_cheapest(), but I fear that won't be the end
> of it. It seems a bit surprising to me that you didn't also need to
> adjust add_path(), for example. Even if you don't, there's lots of
> places that rely on the assumption that all paths for a RelOptInfo are
> returning the same set of rows. If it turns out that a bunch of those
> places need to be adjusted to work with this, then the code could
> potentially end up quite messy, and that might also have performance
> consequences, even when this feature is disabled. Many of the code
> paths that deal with paths in the planner are quite hot.

Yeah, one of the basic assumptions in the planner is that all paths
for a given RelOptInfo return the same set of rows.  One exception
to this is parameterized paths.  As an example, please consider:

create table t (a int, b int);
create table t3 (a int, b int);

insert into t select i, i from generate_series(1,1000)i;
insert into t3 select i, i from generate_series(1,1000)i;

create index on t3(a, b);
analyze t, t3;

explain (costs off)
select * from t t1 join t t2 on true join t3 on t3.a > t1.a and t3.b > t2.b;

With gdb, I found the following 4 paths in the pathlist of RelOptInfo
of {t3}:

   {INDEXPATH
   :path.pathtype 341
   :parent_relids (b 4)
   :required_outer (b 1 2)
   :path.parallel_aware false
   :path.parallel_safe true
   :path.parallel_workers 0
   :path.rows 111
   :path.disabled_nodes 0
   :path.startup_cost 0.275
   :path.total_cost 4.755000000000001

   {INDEXPATH
   :path.pathtype 341
   :parent_relids (b 4)
   :required_outer (b 1)
   :path.parallel_aware false
   :path.parallel_safe true
   :path.parallel_workers 0
   :path.rows 333
   :path.disabled_nodes 0
   :path.startup_cost 0.275
   :path.total_cost 6.1425

   {INDEXPATH
   :path.pathtype 341
   :parent_relids (b 4)
   :required_outer (b 2)
   :path.parallel_aware false
   :path.parallel_safe true
   :path.parallel_workers 0
   :path.rows 333
   :path.disabled_nodes 0
   :path.startup_cost 0.275
   :path.total_cost 11.145

   {PATH
   :pathtype 338
   :parent_relids (b 4)
   :required_outer (b)
   :parallel_aware false
   :parallel_safe true
   :parallel_workers 0
   :rows 1000
   :disabled_nodes 0
   :startup_cost 0
   :total_cost 15

None of them are returning the same set of rows.  This is fine because
we have revised the assumption to say that all paths for a RelOptInfo of
the same parameterization return the same set of rows.  That is to
say, it's OK that paths for the same RelOptInfo return different sets
of rows if they have different parameterizations.
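
As a rough cross-check of those row counts, here is a Python sketch
(not planner code).  It assumes that each of the two join clauses is
estimated with a default inequality selectivity of 1/3, which is what
I'd expect for clauses like t3.a > t1.a; the helper names are made up.

```python
from collections import defaultdict

DEFAULT_INEQ_SEL = 1.0 / 3.0  # assumed selectivity per inequality clause
BASE_ROWS = 1000              # rows in t3

# (required_outer relids, estimated rows), copied from the gdb output
paths = [
    (frozenset({1, 2}), 111),
    (frozenset({1}), 333),
    (frozenset({2}), 333),
    (frozenset(), 1000),
]

def rows_by_parameterization(paths):
    """Collect the distinct row estimates seen for each parameterization."""
    seen = defaultdict(set)
    for required_outer, rows in paths:
        seen[required_outer].add(rows)
    return seen

def expected_rows(required_outer):
    # One inequality join clause is applied per required outer rel here.
    return round(BASE_ROWS * DEFAULT_INEQ_SEL ** len(required_outer))

seen = rows_by_parameterization(paths)
for required_outer, rows in paths:
    print(sorted(required_outer), rows, expected_rows(required_outer))
```

Under that assumption the estimates depend only on which outer rels
supply parameters, so each parameterization maps to exactly one row
count, as the restated assumption requires.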

Now we have the grouped paths.  I had previously considered further
revising this assumption to say that all paths for a RelOptInfo of the
same parameterization and the same location of partial aggregation
return the same set of rows.  That's why, back in November, I proposed
the idea of introducing a GroupPathInfo into the Path structure to
store the location of the partial aggregation and the estimated
rowcount for each grouped path, similar to how ParamPathInfo functions
for parameterized paths.

However, I gave up on this idea in December after realizing an
important difference from the parameterized path case.  For a
parameterized path, the fewer the required outer rels, the better, as
more outer rels imply more join restrictions.  In other words, the
number of required outer rels is an important factor when comparing
two paths in add_path().  For a grouped path, however, the location of
partial aggregation does not impose such restrictions for further
planning.  As long as one grouped path is cheaper than another based
on the current merits of add_path(), we don't really care where the
partial aggregation is placed within the path tree.

I can take up the idea of GroupPathInfo again.  Before I start
implementing it (which is not trivial), I'd like to hear others'
thoughts on this approach - whether it's necessary and whether this is
the right direction to pursue.

> To say that another way, I'm not so much worried about the possibility
> that the patch contains a bug. Patches contain bugs all the time and
> we can just fix them. It's not wonderful, but that's how software
> development goes. What I am worried about is whether the architecture
> is right. If we go with one RelOptInfo when the "right answer" is
> many, or for that matter if we go with many when the right answer is
> one, those are things that cannot be easily and reasonably patched
> post-commit, and especially not post-release.

Fair point.  We should make sure the architecture of this patch is
solid before committing it.

Regarding whether we should use a single RelOptInfo or separate
RelOptInfos for the same grouped relation: If we choose to separate
the grouped paths of the same grouped relation into different
RelOptInfos based on the location of the partial aggregation within
the path tree, then, based on my calculation from the previous email,
for a relation containing n base rels, we would need to create 2^(n-1)
different RelOptInfos, not to mention that this can also lead to more
paths.  I still struggle to see how this is feasible.  Could you
please elaborate on why you believe this is a viable option?

Thanks
Richard

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-01-20 16:28                 ` Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Robert Haas @ 2025-01-20 16:28 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Sun, Jan 19, 2025 at 7:53 AM Richard Guo <[email protected]> wrote:
> If, in the end, the conclusion of this discussion is that we should not
> apply this change until we fix those problems in aggregate estimates,
> I'd be very sad.  This conclusion is absolutely correct, for sure, in
> an ideal world, but in the real world, it feels like a death sentence
> for this patch, and for any future patches that attempt to apply some
> optimizations above aggregate nodes - unless, of course, the day
> arrives when we finally fix those aggregate estimate problems, which
> doesn't seem likely in the near future.

Well, such conclusions should be based on evidence. So far, the
evidence you've presented suggests that the optimization works, so
there's no reason to leap to the conclusion that we shouldn't move
forward. On the other hand, the amount of evidence you've presented
does not seem to me to be all that large. And I'm not sure that you've
gone looking for adversarial cases.

> And if that's the case, can I then argue that the same principle
> should apply to joins?  Specifically, should we refrain from applying
> any optimizations above join nodes until we've fixed the join estimate
> problems, particularly in cases where we fall back on default
> selectivity estimates?

I am having a hard time figuring out how to write back to this. I
mean, I don't think that what you write here is a serious proposal,
and I think you already know that I was not proposing any such thing.
But it upsets me that you think that this hypothetical argument is
equivalent to the ones I've actually been making. Apparently, you
consider my concerns quite groundless and foolish.

> Yeah, one of the basic assumptions in the planner is that all paths
> for a given RelOptInfo return the same set of rows.  One exception
> to this is parameterized paths.

Good point. I had not considered this parallel.

> Now we have the grouped paths.  I had previously considered further
> revising this assumption to say that all paths for a RelOptInfo of the
> same parameterization and the same location of partial aggregation
> return the same set of rows.  That's why, back in November, I proposed
> the idea of introducing a GroupPathInfo into the Path structure to
> store the location of the partial aggregation and the estimated
> rowcount for each grouped path, similar to how ParamPathInfo functions
> for parameterized paths.

Interesting.

> However, I gave up on this idea in December after realizing an
> important difference from the parameterized path case.  For a
> parameterized path, the fewer the required outer rels, the better, as
> more outer rels imply more join restrictions.  In other words, the
> number of required outer rels is an important factor when comparing
> two paths in add_path().  For a grouped path, however, the location of
> partial aggregation does not impose such restrictions for further
> planning.  As long as one grouped path is cheaper than another based
> on the current merits of add_path(), we don't really care where the
> partial aggregation is placed within the path tree.
>
> I can take up the idea of GroupPathInfo again.  Before I start
> implementing it (which is not trivial), I'd like to hear others'
> thoughts on this approach - whether it's necessary and whether this is
> the right direction to pursue.

Yes, I would, too. Tom, do you have any thoughts on this point? Anybody else?

An advantage of this approach could be that it would avoid any
explosion in the number of RelOptInfo structures, since presumably all
the partially aggregated paths could be attached to the same
RelOptInfo as the unaggregated paths, just with a GroupPathInfo to
mark them as partially aggregated. I have to admit that I'm not sure
it was the right idea to mix parameterized and unparameterized paths
in the same path list, and I'm even less sure that it would be a good
idea to mix in partially-aggregated paths. That's because a
parameterized path behaves like a regular path with a join
order/method restriction: as long as we only create valid joins from
parameterized paths, we'll eventually end up with unparameterized
paths without doing anything else. A partially aggregated path behaves
more like a partial path, which requires a Gather or Gather Merge node
to terminate parallelism. Likewise, a partially aggregated path will
require a FinalizeAggregate step to complete the aggregation. Maybe
that's the wrong way of thinking about it, though, since the
FinalizeAggregate node must (I think) go at the top of the join tree,
whereas a Gather can go anywhere.

I felt it best when implementing parallel query to put partial paths
into a separate list, rather than mixing them into the regular path
list. I am vaguely under the impression that Tom thinks that was a
poor decision on my part. And I can sort of see that there is a
problem brewing here. If we handled this case like that one, then we'd
go from 2 lists to 4: normal paths, paths needing a FinalizeAggregate,
paths needing a Gather(Merge), paths needing both. And if we handled
one more future thing in the same way, then the number of combinations
doubles again to 8. Clearly, that way lies madness. On the other hand,
there's another kind of madness in thinking that we can just stick a
whole bunch of paths that are different from each other in an
increasing number of ways into a single path list and suffer no
adverse consequences. The growing complexity of add_path() is one
fairly obvious one.
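
To spell out the doubling, here is a tiny Python sketch (illustrative
only; the function name is invented) that lists one path list per
combination of "still needs this finishing step" flags:

```python
from itertools import product

def path_list_kinds(requirements):
    """One separate path list per combination of pending requirements."""
    kinds = []
    for flags in product([False, True], repeat=len(requirements)):
        needed = [req for req, flag in zip(requirements, flags) if flag]
        kinds.append(tuple(needed))
    return kinds

kinds = path_list_kinds(["FinalizeAggregate", "Gather(Merge)"])
for kind in kinds:
    print(" + ".join(kind) if kind else "normal paths")
print(len(kinds))
```

Two such properties give 4 lists, and a third hypothetical property
would give 8.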

So I don't quite know which way to jump here. It now seems to me that
we have three similar features with three different designs.
Parameterization added non-comparable paths to the same path list;
parallel query added them to a different path list in the same
RelOptInfo; and this patch currently adds them to a separate RelOptInfo.
That's quite a bit of diversity. Really, if we wanted to stick
strictly to the idea of paths associated with the same RelOptInfo
being directly comparable, then parameterization should have spawned a
separate RelOptInfo for each workable parameterization, but that
wasn't done, possibly (though I'm not sure) for the same reasons that
you don't want to do it here.

> Regarding whether we should use a single RelOptInfo or separate
> RelOptInfos for the same grouped relation: If we choose to separate
> the grouped paths of the same grouped relation into different
> RelOptInfos based on the location of the partial aggregation within
> the path tree, then, based on my calculation from the previous email,
> for a relation containing n base rels, we would need to create 2^(n-1)
> different RelOptInfos, not to mention that this can also lead to more
> paths.  I still struggle to see how this is feasible.  Could you
> please elaborate on why you believe this is a viable option?

I agree that creating an exponential number of RelOptInfos is not
going to work out well. I haven't been quite as certain as you seem to
be that it's an unavoidable reality, but maybe it is. For instance, my
intuition is that if PartialAgg(t1) JOIN t2 and PartialAgg(t1 JOIN t2)
produce very different numbers of rows, we could probably just take
the one with the smaller row count regardless of cost, because the
whole selling point of this optimization is that we reduce the number
of rows that are being fed to higher level plan nodes. I don't quite
see how it can make sense to keep a less costly path that produces
more rows on the theory that maybe it's going to work out better later
on. Why is the path cheaper, after all? It feels like the savings must
come from not reducing the row count so much, but that is a cost we're
going to have to repay at a higher plan level. Moreover, we'll be
repaying it with interest, because more rows will have filtered
through every level of plan over which we postponed partial
aggregation.

I admit it's not so clear-cut when the row counts are close. If
PartialAgg(t1 JOIN t2) JOIN t3 has a very similar row count to PartialAgg(t1
JOIN t3) JOIN t2, can we categorically pick whichever one has the
lower row count and forget about the other? I'm not sure. But I have
an uncomfortable feeling that if we can't, we're going to have an
explosion in the number of paths we have to generate even if we avoid
an explosion in the number of RelOptInfos we generate.

For example, consider:

SELECT ... FROM fact f, dim1, dim2, dim3, dim4
WHERE f.dim1_id = dim1.id AND f.dim2_id = dim2.id
AND f.dim3_id = dim3.id AND f.dim4_id = dim4.id
GROUP BY f.something;

Let's assume that each dimN table has PRIMARY KEY (id). Because of the
primary keys, it's only sensible to consider partial aggregation for
subsets of rels that include f; and it doesn't make sense to consider
partially aggregating after joining all 5 tables because at that point
we should just do a single-step aggregation. So, the partially
grouped-rel for {f,dim1,dim2,dim3,dim4} can contain paths generated in
15 different ways, because we can join f to any proper subset of
{dim1,dim2,dim3,dim4} before partially aggregating and then to the
remainder after partially aggregating. But that feels like we're
re-performing essentially the same join search 16 times which seems
super-expensive. I can't quite say that the work is useless or that I
have a better idea, but I guess there will be a lot of cases where all
16 join searches produce the same results, or most of them do. It
doesn't feel to me like checking through all of those possibilities is
a good expenditure of planner effort.
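
For what it's worth, the count of 15 can be checked with a short
Python sketch (not planner code; the function name is invented) that
walks the proper subsets of the dimension tables joined to f before
the partial aggregation:

```python
from itertools import combinations

def partial_agg_placements(dims):
    """Split dims into (joined before PartialAgg, joined after it)."""
    placements = []
    for k in range(len(dims)):  # k < len(dims): proper subsets only
        for before in combinations(dims, k):
            after = tuple(d for d in dims if d not in before)
            placements.append((before, after))
    return placements

placements = partial_agg_placements(["dim1", "dim2", "dim3", "dim4"])
for before, after in placements:
    inner = " JOIN ".join(("f",) + before)
    print("PartialAgg(%s) JOIN %s" % (inner, " JOIN ".join(after)))
print(len(placements))
```

Each printed line is one way of producing a path for the partially
grouped rel, and each one implies its own pass over the remaining
join orders.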

I took a look at the paper you linked in the original post, but
unfortunately it doesn't seem to say much about how to search the plan
space efficiently. I wonder if other systems perform a search that is as
exhaustive as the one that you are proposing to perform here or
whether they apply some heuristics to limit the search space, and if
so, what those heuristics are.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
@ 2025-01-21 08:33                   ` Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Richard Guo @ 2025-01-21 08:33 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Tue, Jan 21, 2025 at 1:28 AM Robert Haas <[email protected]> wrote:
> On Sun, Jan 19, 2025 at 7:53 AM Richard Guo <[email protected]> wrote:
> > If, at last, the conclusion of this discussion is that we should not
> > apply this change until we fix those problems in aggregate estimates,
> > I'd be very sad.  This conclusion is absolutely correct, for sure, in
> > an ideal world, but in the real world, it feels like a death sentence
> > for this patch, and for any future patches that attempt to apply some
> > optimizations above aggregate nodes - unless, of course, the day
> > arrives when we finally fix those aggregate estimate problems, which
> > doesn't seem likely in the near future.
>
> Well, such conclusions should be based on evidence. So far, the
> evidence you've presented suggests that the optimization works, so
> there's no reason to leap to the conclusion that we shouldn't move
> forward. On the other hand, the amount of evidence you've presented
> does not seem to me to be all that large. And I'm not sure that you've
> gone looking for adversarial cases.
>
> > And if that's the case, can I then argue that the same principle
> > should apply to joins?  Specifically, should we refrain from applying
> > any optimizations above join nodes until we've fixed the join estimate
> > problems, particularly in cases where we fall back on default
> > selectivity estimates?
>
> I am having a hard time figuring out how to write back to this. I
> mean, I don't think that what you write here is a serious proposal,
> and I think you already know that I was not proposing any such thing.
> But it upsets me that you think that this hypothetical argument is
> equivalent to the ones I've actually been making. Apparently, you
> consider my concerns quite groundless and foolish.

I'm really sorry if my previous response upset you or gave the wrong
impression.  That was never my intention, and I certainly do not
consider your concerns to be groundless or foolish.  I can see how my
message may have come across differently than I intended.  To clarify,
I wasn't suggesting that your concerns about the estimates weren't
valid.  Rather, I was trying to express that it might be more
effective to fix the cost estimates based on specific regressions.

> > Regarding whether we should use a single RelOptInfo or separate
> > RelOptInfos for the same grouped relation: If we choose to separate
> > the grouped paths of the same grouped relation into different
> > RelOptInfos based on the location of the partial aggregation within
> > the path tree, then, based on my calculation from the previous email,
> > for a relation containing n base rels, we would need to create 2^(n-1)
> > different RelOptInfos, not to mention that this can also lead to more
> > paths.  I still struggle to see how this is feasible.  Could you
> > please elaborate on why you believe this is a viable option?
>
> I agree that creating an exponential number of RelOptInfos is not
> going to work out well. I haven't been quite as certain as you seem to
> be that it's an unavoidable reality, but maybe it is. For instance, my
> intuition is that if PartialAgg(t1) JOIN t2 and PartialAgg(t1 JOIN t2)
> produce very different numbers of rows, we could probably just take
> the one with the smaller row count regardless of cost, because the
> whole selling point of this optimization is that we reduce the number
> of rows that are being fed to higher level plan nodes. I don't quite
> see how it can make sense to keep a less costly path that produces
> more rows on the theory that maybe it's going to work out better later
> on. Why is the path cheaper, after all? It feels like the savings must
> come from not reducing the row count so much, but that is a cost we're
> going to have to repay at a higher plan level. Moreover, we'll be
> repaying it with interest, because more rows will have filtered
> through every level of plan over which we postponed partial
> aggregation.

I've been thinking about this proposal, and it's quite appealing.  It
would significantly reduce both the planning effort and implementation
complexity, while still yielding reasonable planning results.

One concern I have with this proposal is that, as we climb up higher
and higher in the join tree, the assumption that a path with smaller
row count and higher cost is better than one with larger row count and
lower cost may gradually no longer hold.  It's true that a path with a
smaller row count is generally better for upper join nodes, as it
feeds fewer rows to upper join nodes.  However, as there are fewer and
fewer upper join nodes left, the efficiency gained from the smaller
row count could likely no longer justify the high cost of that path
itself.

Here's an example I found that can help illustrate what I mean.

create table t (a int, b int, c int);
insert into t select i%3, i%3, i from generate_series(1,500)i;
analyze t;
set enable_eager_aggregate to on;

And here are two plans for the same query:

-- Plan 1
explain (costs on)
select sum(t4.c) from t t1 join
  (t t2 join t t3 on t2.b != t3.b join t t4 on t3.b = t4.b)
  on t1.b = t2.b
group by t1.a;
                                        QUERY PLAN
------------------------------------------------------------------------------------------
 Finalize HashAggregate  (cost=4135.19..4135.22 rows=3 width=12)
   Group Key: t1.a
   ->  Hash Join  (cost=1392.13..3301.85 rows=166668 width=12)
         Hash Cond: (t2.b = t1.b)
         ->  Nested Loop  (cost=1377.88..1409.66 rows=1000 width=12)
               Join Filter: (t2.b <> t3.b)
               ->  Partial HashAggregate  (cost=1377.88..1377.91 rows=3 width=12)
                     Group Key: t3.b
                     ->  Hash Join  (cost=14.25..961.22 rows=83334 width=8)
                           Hash Cond: (t3.b = t4.b)
                           ->  Seq Scan on t t3  (cost=0.00..8.00 rows=500 width=4)
                           ->  Hash  (cost=8.00..8.00 rows=500 width=8)
                                 ->  Seq Scan on t t4  (cost=0.00..8.00 rows=500 width=8)
               ->  Materialize  (cost=0.00..10.50 rows=500 width=4)
                     ->  Seq Scan on t t2  (cost=0.00..8.00 rows=500 width=4)
         ->  Hash  (cost=8.00..8.00 rows=500 width=8)
               ->  Seq Scan on t t1  (cost=0.00..8.00 rows=500 width=8)
(17 rows)

-- Plan 2
explain (costs on)
select sum(t4.c) from t t1 join
  (t t2 join t t3 on t2.b != t3.b join t t4 on t3.b = t4.b)
  on t1.b = t2.b
group by t1.a;
                                           QUERY PLAN
------------------------------------------------------------------------------------------------
 Finalize HashAggregate  (cost=455675.44..455675.47 rows=3 width=12)
   Group Key: t1.a
   ->  Hash Join  (cost=455658.07..455672.94 rows=500 width=12)
         Hash Cond: (t1.b = t2.b)
         ->  Seq Scan on t t1  (cost=0.00..8.00 rows=500 width=8)
         ->  Hash  (cost=455658.03..455658.03 rows=3 width=12)
               ->  Partial HashAggregate  (cost=455658.00..455658.03 rows=3 width=12)
                     Group Key: t2.b
                     ->  Hash Join  (cost=14.25..316768.56 rows=27777887 width=8)
                           Hash Cond: (t3.b = t4.b)
                           ->  Nested Loop  (cost=0.00..3767.25 rows=166666 width=8)
                                 Join Filter: (t2.b <> t3.b)
                                 ->  Seq Scan on t t2  (cost=0.00..8.00 rows=500 width=4)
                                 ->  Materialize  (cost=0.00..10.50 rows=500 width=4)
                                       ->  Seq Scan on t t3  (cost=0.00..8.00 rows=500 width=4)
                           ->  Hash  (cost=8.00..8.00 rows=500 width=8)
                                 ->  Seq Scan on t t4  (cost=0.00..8.00 rows=500 width=8)
(17 rows)

For the grouped relation {t2 t3 t4}, Plan 1 chose the path
"PartialAgg(t3/t4) JOIN t2", while Plan 2 chose the path
"PartialAgg(t2/t3/t4)".

The first path has larger row count (1000) and lower cost (1409.66).
The second path has smaller row count (3) and higher cost (455658.03).
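
The conflict between the two criteria can be sketched in a few lines of
Python. This is a toy model, not the logic of add_path(): the ToyPath
class and the dominates() helper are hypothetical, and only the row
counts and total costs are taken from the plans above.

```python
from dataclasses import dataclass

@dataclass
class ToyPath:
    name: str
    rows: float
    total_cost: float

def dominates(a, b):
    """True if a is no worse than b on both rows and cost,
    and strictly better on at least one of them."""
    no_worse = a.rows <= b.rows and a.total_cost <= b.total_cost
    better = a.rows < b.rows or a.total_cost < b.total_cost
    return no_worse and better

path1 = ToyPath("PartialAgg(t3/t4) JOIN t2", rows=1000, total_cost=1409.66)
path2 = ToyPath("PartialAgg(t2/t3/t4)", rows=3, total_cost=455658.03)

# Neither path dominates the other, so a rows-only rule and a
# cost-only rule pick different winners.
print(dominates(path1, path2), dominates(path2, path1))
by_rows = min((path1, path2), key=lambda p: p.rows)
by_cost = min((path1, path2), key=lambda p: p.total_cost)
print(by_rows.name, "|", by_cost.name)
```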

Executing these two plans shows that Plan 2 is slower than Plan 1.

-- Plan 1
 Execution Time: 286.860 ms

-- Plan 2
 Execution Time: 27109.744 ms

I think we may need to take the position in the join tree into account
when applying this heuristic.  At lower levels, we should prefer paths
with smaller row counts, while at higher levels, we should prefer
paths with lower costs.  However, it's unclear to me how we should
define "lower" and "higher" - how low is 'low' and how high is 'high'.

> I admit it's not so clear-cut when the row counts are close. If
> PartialAgg(t1 JOIN t2) JOIN t3 has a very similar row count to
> PartialAgg(t1 JOIN t3) JOIN t2, can we categorically pick whichever one
> has the lower row count and forget about the other? I'm not sure. But I have
> an uncomfortable feeling that if we can't, we're going to have an
> explosion in the number of paths we have to generate even if we avoid
> an explosion in the number of RelOptInfos we generate.
>
> For example, consider:
>
> SELECT ... FROM fact f, dim1, dim2, dim3, dim4
> WHERE f.dim1_id = dim1.id AND f.dim2_id = dim2.id
> AND f.dim3_id = dim3.id AND f.dim4_id = dim4.id
> GROUP BY f.something;
>
> Let's assume that each dimN table has PRIMARY KEY (id). Because of the
> primary keys, it's only sensible to consider partial aggregation for
> subsets of rels that include f; and it doesn't make sense to consider
> partially aggregating after joining all 5 tables because at that point
> we should just do a single-step aggregation. So, the partially
> grouped-rel for {f,dim1,dim2,dim3,dim4} can contain paths generated in
> 15 different ways, because we can join f to any proper subset of
> {dim1,dim2,dim3,dim4} before partially aggregating and then to the
> remainder after partially aggregating. But that feels like we're
> re-performing essentially the same join search 16 times which seems
> super-expensive. I can't quite say that the work is useless or that I
> have a better idea, but I guess there will be a lot of cases where all
> 16 join searches produce the same results, or most of them do. It
> doesn't feel to me like checking through all of those possibilities is
> a good expenditure of planner effort.

Yeah, you're right that the join search process for grouped paths
basically mirrors what we do for non-grouped paths, which indeed
involves a lot of planner effort.  I've been exploring potential
heuristics to limit the search space for grouped paths, but so far, I
haven't found any effective solutions.  Currently, the heuristic used
in the patch is to only consider grouped paths that dramatically
reduce the number of rows.  All others are just discarded.  The
rationale is that if a grouped path does not reduce the number of rows
enough, it is highly unlikely to result in a competitive final plan
during the upper planning stages, so it doesn't make much sense to
consider it.  The current threshold is set to 50%, meaning that if the
number of rows returned by PartialAgg(t1 JOIN t2) is not less than 50%
of the rows returned by (t1 JOIN t2), no Aggregate paths will be
generated on top of the t1/t2 join.  If we notice significant
regressions in planning time, we might consider further increasing
this threshold, say, to 80%, so that only grouped paths that reduce
the rows by more than 80% will be considered.  This heuristic also
ensures that, once a plan with eager aggregation is chosen, it is
highly likely to result in performance improvements, due to the
significant data reduction before joins.
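
A minimal sketch of the cutoff described above, written in Python for
brevity. The function and parameter names are hypothetical and do not
correspond to identifiers in the patch; only the 50% and 80% figures
come from the text.

```python
def worth_partial_agg(input_rows, grouped_rows, min_reduction=0.5):
    """Return True if partial aggregation removes more than
    min_reduction (a fraction in [0, 1)) of the input rows."""
    if input_rows <= 0:
        return False
    reduction = 1.0 - grouped_rows / input_rows
    return reduction > min_reduction

# Row estimates from Plan 1: 83334 joined rows aggregate to 3 groups.
print(worth_partial_agg(83334, 3))
# Exactly half the rows remain: not "less than 50%", so rejected.
print(worth_partial_agg(1000, 500))
# A 70% reduction passes the 50% cutoff but not an 80% cutoff.
print(worth_partial_agg(1000, 300))
print(worth_partial_agg(1000, 300, min_reduction=0.8))
```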

> I took a look at the paper you linked in the original post, but
> unfortunately it doesn't seem to say much about how to search the plan
> space efficiently. I wonder if other systems perform a search that is as
> exhaustive as the one that you are proposing to perform here or
> whether they apply some heuristics to limit the search space, and if
> so, what those heuristics are.

Unfortunately, I don't have much knowledge about other systems.  It
would be really helpful if anyone could share some insights on how
other systems handle this.

Thanks
Richard






^ permalink  raw  reply  [nested|flat] 70+ messages in thread

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-01-21 16:36                     ` Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Robert Haas @ 2025-01-21 16:36 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Tue, Jan 21, 2025 at 3:33 AM Richard Guo <[email protected]> wrote:
> I've been thinking about this proposal, and it's quite appealing.  It
> would significantly reduce both the planning effort and implementation
> complexity, while still yielding reasonable planning results.
>
> One concern I have with this proposal is that, as we climb up higher
> and higher in the join tree, the assumption that a path with smaller
> row count and higher cost is better than one with larger row count and
> lower cost may gradually no longer hold.  It's true that a path with a
> smaller row count is generally better for upper join nodes, as it
> feeds fewer rows to upper join nodes.  However, as there are fewer and
> fewer upper join nodes left, the efficiency gained from the smaller
> row count could likely no longer justify the high cost of that path
> itself.
>
> Here's an example I found that can help illustrate what I mean.

Thanks for the example. What seems to be happening here is that each
of the three joins increases the number of rows by a multiple of
either 166 or 333. Aggregating reduces the number of rows to 3. I am
not sure that we should be too concerned about this kind of case,
because I don't think it will be common to have multiple joins that
dramatically increase the row count. If you did have that, you must
want to aggregate multiple times. We don't have the code for an
IntermediateAggregate or CombineAggregate node right now, I believe,
but in this query it would likely make sense to apply such a step
after every join; then you'd never have more than three rows.
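
Those multipliers can be recovered from the row estimates in the plans
quoted upthread. A small Python check, using only numbers that appear in
the EXPLAIN output (the variable names are just for illustration):

```python
base_rows = 500                 # rows in each scan of t

# Planner row estimates taken from the two plans.
equality_join_rows = 83334      # t3.b = t4.b    (Plan 1)
inequality_join_rows = 166666   # t2.b <> t3.b   (Plan 2)

# Rows produced per input row, truncated to an integer.
eq_multiplier = int(equality_join_rows / base_rows)
ne_multiplier = int(inequality_join_rows / base_rows)

print(eq_multiplier)   # 166
print(ne_multiplier)   # 333
```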

Honestly, I'm not sure how much we should worry about a case like
this. I think that if a user is writing queries that use joins to
vastly inflate the row count and then aggregate the result, perhaps
they need to think about rewriting the queries. In this instance, it
feels a bit like the user is emulating multiplication using an
iterated SUM(), which is probably never going to work out all that
well.

But I bet it's possible to construct an example using only
row-reducing joins. Let's say we start with 10k rows that aggregate to
10 rows; after performing a join, we end up with 9k rows that
aggregate to 9 rows. So if we partially aggregate first, we have to
aggregate 1000 extra rows, but if we join first, we have to join 1000
extra rows. I don't think we can say a priori which will be cheaper,
but my idea would make the path that partially aggregates after the
join win unconditionally.

> Yeah, you're right that the join search process for grouped paths
> basically mirrors what we do for non-grouped paths, which indeed
> involves a lot of planner effort.  I've been exploring potential
> heuristics to limit the search space for grouped paths, but so far, I
> haven't found any effective solutions.  Currently, the heuristic used
> in the patch is to only consider grouped paths that dramatically
> reduce the number of rows.  All others are just discarded.  The
> rationale is that if a grouped path does not reduce the number of rows
> enough, it is highly unlikely to result in a competitive final plan
> during the upper planning stages, so it doesn't make much sense to
> consider it.  The current threshold is set to 50%, meaning that if the
> number of rows returned by PartialAgg(t1 JOIN t2) is not less than 50%
> of the rows returned by (t1 JOIN t2), no Aggregate paths will be
> generated on top of the t1/t2 join.  If we notice significant
> regressions in planning time, we might consider further increasing
> this threshold, say, to 80%, so that only grouped paths that reduce
> the rows by more than 80% will be considered.  This heuristic also
> ensures that, once a plan with eager aggregation is chosen, it is
> highly likely to result in performance improvements, due to the
> significant data reduction before joins.

To be honest, I was quite surprised this was a percentage like 50% or
80% and not a multiple like 2 or 5. And I had thought the multiplier
might even be larger, like 10 or more. The thing is, 50% means we only
have to form 2-item groups in order to justify aggregating twice.
Maybe SUM() is cheap enough to justify that treatment, but a more
expensive aggregate might not be, especially things like string_agg()
or array_agg() where aggregation creates bigger objects.

Another thing to consider is that when the number of groups is small
enough that we don't need to do a Sort+GroupAggregate, it doesn't seem
so bad to perform marginally-useful partial aggregation, but sometimes
that won't be the case. For example, imagine that the user wants to
join orders to order_lines and then compute SUM(order_lines.quantity)
for each orders.customer_id. If the size of the order_lines table is
large relative to work_mem, we're going to need to sort it in order
to partially aggregate, which is expensive. If it turns out that the
orders table is also quite big, then maybe we'll end up performing a
merge join and the same sort order can be used for both operations,
but if not, we could've just done a hash join with orders as the build
table. In that kind of case, partial aggregation has to save quite a
lot to justify itself.

Now, maybe we shouldn't worry about that when applying this heuristic
cutoff; after all, it's the job of the cost model to understand that
sorting is expensive, and this cutoff should just be there to make
sure we don't even try the cost model in cases where it's clearly
unpromising. But I do suspect that in queries where the average group
size is 2, this will often be a marginal technique. In addition to the
problems already mentioned, it could be that the average group size is
2 but a lot of groups are actually of size 1 and then there are some
larger groups. In such cases I'm even less sure that the partial
aggregation technique will be a winner. Building many 1-element groups
sounds inefficient.
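
To illustrate the skew concern with made-up numbers (the group sizes
below are invented for the example, not measured), here is a short
Python sketch of a distribution whose average group size is 2 even
though most groups hold a single row:

```python
# Hypothetical distribution: 90 groups with a single row each and
# 10 groups with 11 rows each.
group_sizes = [1] * 90 + [11] * 10

total_rows = sum(group_sizes)               # 200
num_groups = len(group_sizes)               # 100
average_size = total_rows / num_groups

singleton_groups = sum(1 for s in group_sizes if s == 1)
rows_in_singletons = singleton_groups       # one row per singleton group

print(average_size)                         # 2.0
print(singleton_groups / num_groups)        # 0.9
print(rows_in_singletons / total_rows)      # 0.45
```

In this invented case, 90% of the groups and 45% of the input rows gain
nothing from being aggregated early.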

-- 
Robert Haas
EDB: http://www.enterprisedb.com






^ permalink  raw  reply  [nested|flat] 70+ messages in thread

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
@ 2025-01-22 06:48                       ` Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-10-09 02:13                         ` Re: Eager aggregation, take 3 Tom Lane <[email protected]>
  0 siblings, 2 replies; 70+ messages in thread

From: Richard Guo @ 2025-01-22 06:48 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Wed, Jan 22, 2025 at 1:36 AM Robert Haas <[email protected]> wrote:
> Thanks for the example. What seems to be happening here is that each
> of the three joins increases the number of rows by a multiple of
> either 166 or 333. Aggregating reduces the number of rows to 3. I am
> not sure that we should be too concerned about this kind of case,
> because I don't think it will be common to have multiple joins that
> dramatically increase the row count. If you did have that, you must
> want to aggregate multiple times. We don't have the code for an
> IntermediateAggregate or CombineAggregate node right now, I believe,
> but in this query it would likely make sense to apply such a step
> after every join; then you'd never have more than three rows.

Haha, I did once think about the concept of multi-stage aggregations
while working on this patch.  While testing this patch and trying to
figure out where placing the partial aggregation would bring the most
benefit, I noticed that a potentially effective approach could be
this: every time the row count increases to a certain point as we join
more and more tables, we perform one aggregation to deflate it, and
then wait for it to grow again before deflating it once more.

This approach would require injecting multiple intermediate
aggregation nodes into the path tree, for which we currently lack the
necessary architecture.  As a result, I didn't pursue this idea
further.  However, I'm really glad you mentioned this approach, though
it's still unclear whether it's a feasible or reasonable idea.
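
Purely as a thought experiment, the deflate-and-regrow idea can be
simulated in Python. Nothing here reflects existing planner code: the
function, the row limit, the rounded join multipliers, and the
assumption that every aggregation step collapses the input to a fixed
number of groups are all hypothetical.

```python
def simulate(start_rows, join_multipliers, num_groups, row_limit=None):
    """Track the row count through a chain of joins.

    If row_limit is set, an intermediate aggregation collapses the
    rows to num_groups whenever the count exceeds row_limit.
    Returns (peak row count, number of aggregation steps).
    """
    rows = start_rows
    peak = rows
    agg_steps = 0
    for multiplier in join_multipliers:
        rows *= multiplier
        peak = max(peak, rows)
        if row_limit is not None and rows > row_limit:
            rows = num_groups
            agg_steps += 1
    return peak, agg_steps

# Three joins that each multiply the row count (rounded, assumed values).
multipliers = [167, 333, 167]
print(simulate(500, multipliers, num_groups=3))
print(simulate(500, multipliers, num_groups=3, row_limit=10_000))
```

With these assumed inputs the peak row count drops from billions to
under two hundred thousand at the price of two aggregation steps; the
model says nothing about whether those steps would be cheap enough.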

> Honestly, I'm not sure how much we should worry about a case like
> this. I think that if a user is writing queries that use joins to
> vastly inflate the row count and then aggregate the result, perhaps
> they need to think about rewriting the queries. In this instance, it
> feels a bit like the user is emulating multiplication using an
> iterated SUM(), which is probably never going to work out all that
> well.

I don't have much experience with end-user scenarios, so I'm not sure
if it's common to have queries where the row count increases with more
and more tables joined.

> But I bet it's possible to construct an example using only
> row-reducing joins. Let's say we start with 10k rows that aggregate to
> 10 rows; after performing a join, we end up with 9k rows that
> aggregate to 9 rows. So if we partially aggregate first, we have to
> aggregate 1000 extra rows, but if we join first, we have to join 1000
> extra rows. I don't think we can say a priori which will be cheaper,
> but my idea would make the path that partially aggregates after the
> join win unconditionally.

Yeah, this is the concern I raised upthread: the efficiency gained
from a path having a smaller row count may not always justify the high
cost of the path itself, especially as we move higher in the join
tree.

> To be honest, I was quite surprised this was a percentage like 50% or
> 80% and not a multiple like 2 or 5. And I had thought the multiplier
> might even be larger, like 10 or more. The thing is, 50% means we only
> have to form 2-item groups in order to justify aggregating twice.
> Maybe SUM() is cheap enough to justify that treatment, but a more
> expensive aggregate might not be, especially things like string_agg()
> or array_agg() where aggregation creates bigger objects.

Hmm, if I understand correctly, the "percentage" and the "multiple"
work in the same way.  Percentage 50% and multiple 2 both mean that
the average group size is 2, and percentage 90% and multiple 10 both
mean that the average group size is 10.  In general, this relationship
should hold: percentage = 1 - 1/multiple.  However, I might not have
grasped your point correctly.
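
For reference, the conversion in both directions as a small Python
sketch (the function names are only illustrative):

```python
def multiple_to_percentage(multiple):
    """Row reduction (as a fraction) implied by an average group size."""
    return 1.0 - 1.0 / multiple

def percentage_to_multiple(percentage):
    """Average group size implied by a row-reduction fraction."""
    return 1.0 / (1.0 - percentage)

print(multiple_to_percentage(2))     # 0.5
print(multiple_to_percentage(10))
print(percentage_to_multiple(0.5))   # 2.0
```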

> Another thing to consider is that when the number of groups is small
> enough that we don't need to do a Sort+GroupAggregate, it doesn't seem
> so bad to perform marginally-useful partial aggregation, but sometimes
> that won't be the case. For example, imagine that the user wants to
> join orders to order_lines and then compute SUM(order_lines.quantity)
> for each orders.customer_id. If the size of the order_lines table is
> large relative to work_mem, we're going to need to sort it in order
> to partially aggregate, which is expensive. If it turns out that the
> orders table is also quite big, then maybe we'll end up performing a
> merge join and the same sort order can be used for both operations,
> but if not, we could've just done a hash join with orders as the build
> table. In that kind of case, partial aggregation has to save quite a
> lot to justify itself.
>
> Now, maybe we shouldn't worry about that when applying this heuristic
> cutoff; after all, it's the job of the cost model to understand that
> sorting is expensive, and this cutoff should just be there to make
> sure we don't even try the cost model in cases where it's clearly
> unpromising. But I do suspect that in queries where the average group
> size is 2, this will often be a marginal technique. In addition to the
> problems already mentioned, it could be that the average group size is
> 2 but a lot of groups are actually of size 1 and then there are some
> larger groups. In such cases I'm even less sure that the partial
> aggregation technique will be a winner. Building many 1-element groups
> sounds inefficient.

Yeah, as you summarized, this heuristic is primarily used to discard
unpromising paths, ensuring they aren't considered further.  For the
paths that pass this heuristic, the cost model will then determine the
appropriate aggregation and join methods.  If we take this into
consideration when applying the heuristic, it seems to me that we
would essentially be duplicating the work that the cost model
performs, which doesn't seem necessary.

I think you are right that in cases where a lot of groups are actually
of size 1 and then there are some larger groups, the partial
aggregation may not be a win.  Perhaps we could do better here if we
had techniques to estimate the distribution of data across different
groups or to predict how skewed the data might be.  It seems that we
don't have such techniques at the moment.  This also reminds
me of a similar challenge when calculating the startup cost of
incremental sort.  I looked into cost_incremental_sort() and found
that we're currently using the average group size to estimate the
startup cost (please correct me if I'm wrong).

    group_tuples = input_tuples / input_groups;

I think this may also suffer from data skew across different groups.
With the mentioned techniques, I believe we could improve the cost
estimation for incremental sort as well.

If I understand correctly, your main concern is the threshold being
set to 2, rather than the heuristic itself, right?  Do you think
increasing this threshold to 10 or a larger value would help mitigate
the issue?

Thanks
Richard






^ permalink  raw  reply  [nested|flat] 70+ messages in thread

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-01-24 20:53                         ` Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Robert Haas @ 2025-01-24 20:53 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Wed, Jan 22, 2025 at 1:48 AM Richard Guo <[email protected]> wrote:
> This approach would require injecting multiple intermediate
> aggregation nodes into the path tree, for which we currently lack the
> necessary architecture.  As a result, I didn't pursue this idea
> further.  However, I'm really glad you mentioned this approach, though
> it's still unclear whether it's a feasible or reasonable idea.

I think the biggest question in my mind is really whether we can
accurately judge when such a strategy is likely to be a win. In this
instance it looks like we could have figured it out, but as we've
discussed, I fear a lot of estimates will be inaccurate. If we knew
they were going to be good, then I see no reason not to apply the
technique when it's sensible.

> I don't have much experience with end-user scenarios, so I'm not sure
> if it's common to have queries where the row count increases with more
> and more tables joined.

I don't think it's very common to see it increase as dramatically as
in your test case.

> > To be honest, I was quite surprised this was a percentage like 50% or
> > 80% and not a multiple like 2 or 5. And I had thought the multiplier
> > might even be larger, like 10 or more. The thing is, 50% means we only
> > have to form 2-item groups in order to justify aggregating twice.
> > Maybe SUM() is cheap enough to justify that treatment, but a more
> > expensive aggregate might not be, especially things like string_agg()
> > or array_agg() where aggregation creates bigger objects.
>
> Hmm, if I understand correctly, the "percentage" and the "multiple"
> work in the same way.  Percentage 50% and multiple 2 both mean that
> the average group size is 2, and percentage 90% and multiple 10 both
> mean that the average group size is 10.  In general, this relationship
> should hold: percentage = 1 - 1/multiple.  However, I might not have
> grasped your point correctly.

Yes, they're equivalent. However, a percentage to me suggests that we
think that the meaningful values might be something like 20%, 50%,
80%; whereas with a multiplier someone might be more inclined to think
of values like 10, 100, 1000. You can definitely write those values as
90%, 99%, 99.9%; however, it seems less natural to me to express it
that way when we think the value will be quite close to 1. The fact
that you chose a percentage suggested to me that you were aiming for a
less-strict threshold than I had supposed we would want.
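
To make the equivalence concrete, here is a minimal C sketch of the conversion in both directions, following the relationship quoted above (percentage = 1 - 1/multiple). The function names are invented for illustration and do not appear in the patch.

```c
#include <assert.h>

/*
 * Average group size ("multiple") implied by a row-reduction fraction.
 * E.g. a reduction of 0.5 (50%) corresponds to 2-row groups on average.
 * Hypothetical helper, not part of the patch.
 */
double
reduction_to_group_size(double reduction)
{
    /* A reduction of 1.0 would mean infinitely large groups. */
    assert(reduction >= 0.0 && reduction < 1.0);
    return 1.0 / (1.0 - reduction);
}

/*
 * Row-reduction fraction implied by an average group size ("multiple").
 * Hypothetical helper, not part of the patch.
 */
double
group_size_to_reduction(double group_size)
{
    /* Groups contain at least one row each. */
    assert(group_size >= 1.0);
    return 1.0 - 1.0 / group_size;
}
```

Seen this way, thresholds such as 10, 100, or 1000 rows per group all land within the last ten percentage points of the scale, which is why the multiplier form reads more naturally for strict thresholds.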

> Yeah, as you summarized, this heuristic is primarily used to discard
> unpromising paths, ensuring they aren't considered further.  For the
> paths that pass this heuristic, the cost model will then determine the
> appropriate aggregation and join methods.  If we take this into
> consideration when applying the heuristic, it seems to me that we
> would essentially be duplicating the work that the cost model
> performs, which doesn't seem necessary.

Well, I think we do ideally want heuristics that can reject
unpromising paths earlier. The planning cost of this is really quite
high. But I'm not sure how far we can get with this particular
heuristic. True, we could raise it to a larger value, and that might
help to rule out unpromising paths earlier. But I fear you'll quickly
find examples where it also rules out promising paths early. A good
heuristic is easy to compute and highly accurate. This heuristic is
easy to compute, but the accuracy is questionable.

-- 
Robert Haas
EDB: http://www.enterprisedb.com






^ permalink  raw  reply  [nested|flat] 70+ messages in thread

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
@ 2025-06-13 07:41                           ` Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:09                             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  0 siblings, 2 replies; 70+ messages in thread

From: Richard Guo @ 2025-06-13 07:41 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

I've switched back to this thread and will begin by working through
the key concerns that were previously raised.

The first concern is the lack of a proof demonstrating the correctness
of this transformation.  To address this, I plan to include a detailed
proof in the README, along the lines of the following.

====== proof start ======
To prove that the transformation is correct, we partition the tables
in the FROM clause into two groups: those that contain at least one
aggregation column, and those that do not contain any aggregation
columns.  Each group can be treated as a single relation formed by the
Cartesian product of the tables within that group.  Therefore, without
loss of generality, we can assume that the FROM clause contains
exactly two relations, R1 and R2, where R1 represents the relation
containing all aggregation columns, and R2 represents the relation
without any aggregation columns.

Let the query be of the form:

SELECT G, AGG(A)
FROM R1 JOIN R2 ON J
GROUP BY G;

where G is the set of grouping keys that may include columns from R1
and/or R2; AGG(A) is an aggregate function over columns A from R1; J
is the join condition between R1 and R2.

The transformation of eager aggregation is:

    GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
    =
    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 ON J)

This equivalence holds under the following conditions:

1) AGG is decomposable, meaning that it can be computed in two stages:
a partial aggregation followed by a final aggregation;
2) The set G1 used in the pre-aggregation of R1 includes:
    * all columns from R1 that are part of the grouping keys G, and
    * all columns from R1 that appear in the join condition J.
3) The grouping operator for any column in G1 must be compatible with
the operator used for that column in the join condition J.

Since G1 includes all columns from R1 that appear in either the
grouping keys G or the join condition J, all rows within each partial
group have identical values for both the grouping keys and the
join-relevant columns from R1, assuming compatible operators are used.
As a result, the rows within a partial group are indistinguishable in
terms of their contribution to the aggregation and their behavior in
the join.  This ensures that all rows in the same partial group share
the same "destiny": they either all match or all fail to match a given
row in R2.  Because the aggregate function AGG is decomposable,
aggregating the partial results after the join yields the same final
result as aggregating after the full join, thereby preserving query
semantics.

Q.E.D.
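
As an informal sanity check of the argument, the following C sketch evaluates SUM(a) for one group key both ways over small in-memory arrays: once by joining first and aggregating afterwards, and once by pre-aggregating R1 on G1 = (g, j), joining the partial rows with R2, and then finalizing. The row types, column names, and functions are assumptions made up for this illustration; none of this is planner code, and SUM stands in for any decomposable aggregate.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical R1 row: g is a grouping column, j a join column, a is aggregated. */
typedef struct
{
    int  g;
    int  j;
    long a;
} R1Row;

/* Hypothetical R2 row: only the join column matters here. */
typedef struct
{
    int j;
} R2Row;

/* Regular plan: join R1 and R2 on j, then compute SUM(a) for group key g. */
long
sum_after_join(const R1Row *r1, size_t n1, const R2Row *r2, size_t n2, int g)
{
    long total = 0;

    assert(r1 != NULL || n1 == 0);
    for (size_t i = 0; i < n1; i++)
        for (size_t k = 0; k < n2; k++)
            if (r1[i].g == g && r1[i].j == r2[k].j)
                total += r1[i].a;
    return total;
}

/* Eager plan: partial SUM per (g, j) on R1, join with R2, then final SUM. */
long
sum_eager(const R1Row *r1, size_t n1, const R2Row *r2, size_t n2, int g)
{
    long total = 0;

    assert(r1 != NULL || n1 == 0);
    for (size_t i = 0; i < n1; i++)
    {
        int  first = 1;
        long partial = 0;

        if (r1[i].g != g)
            continue;

        /* Handle each (g, j) partial group once, at its first row. */
        for (size_t p = 0; p < i; p++)
            if (r1[p].g == r1[i].g && r1[p].j == r1[i].j)
            {
                first = 0;
                break;
            }
        if (!first)
            continue;

        /* Partial aggregation: collapse the group into one partial row. */
        for (size_t p = i; p < n1; p++)
            if (r1[p].g == r1[i].g && r1[p].j == r1[i].j)
                partial += r1[p].a;

        /* Join the partial row with R2; each match feeds the final SUM. */
        for (size_t k = 0; k < n2; k++)
            if (r2[k].j == r1[i].j)
                total += partial;
    }
    return total;
}
```

Because every row of a partial group carries the same j, the group as a whole matches a given R2 row either once per member or not at all, so both functions return the same value for any input.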

The second concern is that a RelOptInfo representing a grouped
relation may include paths that produce different row sets due to
partial aggregation being applied at different join levels.  This
potentially violates a fundamental assumption in the planner.

Additionally, the patch currently performs an exhaustive search by
exploring partial aggregation at every possible join level, leading to
excessive planning effort, which may not be justified by the
cost-benefit ratio.

To address these concerns, I'm thinking that maybe we can adopt a
strategy where partial aggregation is only pushed to the lowest
possible level in the join tree that is deemed useful.  In other
words, if we can build a grouped path like "AGG(B) JOIN A" -- and
AGG(B) yields a significant reduction in row count -- we skip
exploring alternatives like "AGG(A JOIN B)".

This is somewhat analogous to how we handle qual clauses: we only push
a qual clause down to the lowest scan or join level that includes all
the relations it references -- following the "filter early, join late"
principle.  For example, if predicate Pb only references B, we only
consider "A JOIN sigma[Pb](B)" and skip "sigma[Pb](A JOIN B)".  (Note
that if Pb involves costly functions and the join is highly selective,
we may want to apply the predicate after the join.)

This ensures that all grouped paths for the same grouped relation
produce the same set of rows (e.g., consider "A JOIN AGG(B) JOIN C"
vs. "AGG(B) JOIN C JOIN A").  As a result, we avoid the complexity of
comparing costs between different grouped paths of the same grouped
relation, and also eliminate the need for special handling of row
estimates on join paths.  It also significantly reduces planning
effort.
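
A rough sketch of the selection rule described above, in C. It assumes the candidate levels have already been ordered from lowest to highest in the join tree and that row estimates are available for each; the struct, the function name, and the min_reduction parameter are all hypothetical and only illustrate the idea, not the patch's actual logic.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-level estimates: rows fed into and produced by partial aggregation. */
typedef struct
{
    double input_rows;
    double grouped_rows;
} LevelEstimate;

/*
 * Return the index of the lowest level at which partial aggregation
 * reduces the row count by at least min_reduction (a fraction in [0, 1]),
 * or -1 if no level qualifies.  Higher levels are not examined once a
 * qualifying level has been found.
 */
int
choose_apply_level(const LevelEstimate *levels, size_t nlevels,
                   double min_reduction)
{
    assert(min_reduction >= 0.0 && min_reduction <= 1.0);

    for (size_t i = 0; i < nlevels; i++)
    {
        double reduction;

        /* Skip levels without a usable input estimate. */
        if (levels[i].input_rows <= 0.0)
            continue;

        reduction = 1.0 - levels[i].grouped_rows / levels[i].input_rows;
        if (reduction >= min_reduction)
            return (int) i;
    }
    return -1;
}
```

Stopping at the first qualifying level is what keeps all grouped paths of one grouped relation on the same row set, at the price of never considering a higher level that might have reduced rows even further.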

While this approach may miss potentially more efficient plans where
applying partial aggregation at a higher join level would yield better
performance, it strikes a practical balance: we can still find plans
that outperform those without eager aggregation, without incurring
excessive planning overhead.  As discussed earlier, it's uncommon in
practice to encounter multiple joins that dramatically inflate row
counts.  So in most cases, pushing partial aggregation to the lowest
level where it offers a significant row count reduction tends to be
the most efficient strategy.

I think this heuristic serves as a good starting point, and we can
look into extending it with more advanced strategies as the feature
evolves.

Any thoughts?

Thanks
Richard





^ permalink  raw  reply  [nested|flat] 70+ messages in thread

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-06-26 02:01                             ` Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Richard Guo @ 2025-06-26 02:01 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Fri, Jun 13, 2025 at 4:41 PM Richard Guo <[email protected]> wrote:
> I've switched back to this thread and will begin by working through
> the key concerns that were previously raised.
>
> The first concern is the lack of a proof demonstrating the correctness
> of this transformation.  To address this, I plan to include a detailed
> proof in the README, along the lines of the following.

> The second concern is that a RelOptInfo representing a grouped
> relation may include paths that produce different row sets due to
> partial aggregation being applied at different join levels.  This
> potentially violates a fundamental assumption in the planner.
>
> Additionally, the patch currently performs an exhaustive search by
> exploring partial aggregation at every possible join level, leading to
> excessive planning effort, which may not be justified by the
> cost-benefit ratio.
>
> To address these concerns, I'm thinking that maybe we can adopt a
> strategy where partial aggregation is only pushed to the lowest
> possible level in the join tree that is deemed useful.  In other
> words, if we can build a grouped path like "AGG(B) JOIN A" -- and
> AGG(B) yields a significant reduction in row count -- we skip
> exploring alternatives like "AGG(A JOIN B)".

Here is the patch based on the proposed ideas.  It includes the proof
of correctness in the README and implements the strategy of pushing
partial aggregation only to the lowest applicable join level where it
is deemed useful.  This is done by introducing a "Relids apply_at"
field to track that level and ensuring that partial aggregation is
applied only at the recorded "apply_at" level.

Additionally, this patch changes how grouped relations are stored.
Since each grouped relation represents a partially aggregated version
of a non-grouped relation, we now associate each grouped relation with
the RelOptInfo of the corresponding non-grouped relation.  This
eliminates the need for a dedicated list of all grouped relations and
avoids list searches when retrieving a grouped relation.

It also addresses other previously raised concerns, such as the
potential memory blowout risks with large partial-aggregation values,
and includes improvements to comments and the commit message.

Another change is that this feature is now enabled by default.

Thanks
Richard


Attachments:

  [application/octet-stream] v17-0001-Implement-Eager-Aggregation.patch (165.3K, 2-v17-0001-Implement-Eager-Aggregation.patch)
  download | inline diff:
From fcdd75d824bc9ee65078ad2dc7337cca22eccf50 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Tue, 11 Jun 2024 15:59:19 +0900
Subject: [PATCH v17] Implement Eager Aggregation

Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.

In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations.  To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node.  During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied.  If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
is currently not supported.

To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

The patch was originally proposed by Antonin Houska in 2017.  This
commit reworks various important aspects and rewrites most of the
current code.  However, the original patch and reviews were very
useful.

Author: Richard Guo, Antonin Houska
Reviewed-by: Robert Haas, Jian He, Tender Wang, Paul George, Tom Lane
Reviewed-by: Tomas Vondra, Andy Fan, Ashutosh Bapat
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
---
 .../postgres_fdw/expected/postgres_fdw.out    |   49 +-
 doc/src/sgml/config.sgml                      |   15 +
 src/backend/optimizer/README                  |   89 ++
 src/backend/optimizer/geqo/geqo_eval.c        |   21 +
 src/backend/optimizer/path/allpaths.c         |  443 ++++++
 src/backend/optimizer/path/joinrels.c         |  193 +++
 src/backend/optimizer/plan/initsplan.c        |  313 ++++
 src/backend/optimizer/plan/planmain.c         |    9 +
 src/backend/optimizer/plan/planner.c          |  124 +-
 src/backend/optimizer/util/appendinfo.c       |   59 +
 src/backend/optimizer/util/pathnode.c         |   12 +-
 src/backend/optimizer/util/relnode.c          |  636 ++++++++
 src/backend/utils/misc/guc_tables.c           |   10 +
 src/backend/utils/misc/postgresql.conf.sample |    1 +
 src/include/nodes/pathnodes.h                 |  130 ++
 src/include/optimizer/pathnode.h              |    5 +
 src/include/optimizer/paths.h                 |    5 +
 src/include/optimizer/planmain.h              |    1 +
 src/test/regress/expected/eager_aggregate.out | 1334 +++++++++++++++++
 src/test/regress/expected/sysviews.out        |    3 +-
 src/test/regress/parallel_schedule            |    2 +-
 src/test/regress/sql/eager_aggregate.sql      |  194 +++
 src/tools/pgindent/typedefs.list              |    3 +
 23 files changed, 3588 insertions(+), 63 deletions(-)
 create mode 100644 src/test/regress/expected/eager_aggregate.out
 create mode 100644 src/test/regress/sql/eager_aggregate.sql

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 2185b42bb4f..b9f767df05d 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3692,30 +3692,33 @@ select count(t1.c3) from ft2 t1 left join ft2 t2 on (t1.c1 = random() * t2.c2);
 -- Subquery in FROM clause having aggregate
 explain (verbose, costs off)
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
-                                          QUERY PLAN                                           
------------------------------------------------------------------------------------------------
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
  Sort
-   Output: (count(*)), x.b
-   Sort Key: (count(*)), x.b
-   ->  HashAggregate
-         Output: count(*), x.b
-         Group Key: x.b
-         ->  Hash Join
-               Output: x.b
-               Inner Unique: true
-               Hash Cond: (ft1.c2 = x.a)
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.c2
-                     Remote SQL: SELECT c2 FROM "S 1"."T 1"
-               ->  Hash
-                     Output: x.b, x.a
-                     ->  Subquery Scan on x
-                           Output: x.b, x.a
-                           ->  Foreign Scan
-                                 Output: ft1_1.c2, (sum(ft1_1.c1))
-                                 Relations: Aggregate on (public.ft1 ft1_1)
-                                 Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
-(21 rows)
+   Output: (count(*)), (sum(ft1_1.c1))
+   Sort Key: (count(*)), (sum(ft1_1.c1))
+   ->  Finalize GroupAggregate
+         Output: count(*), (sum(ft1_1.c1))
+         Group Key: (sum(ft1_1.c1))
+         ->  Sort
+               Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+               Sort Key: (sum(ft1_1.c1))
+               ->  Hash Join
+                     Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+                     Hash Cond: (ft1_1.c2 = ft1.c2)
+                     ->  Foreign Scan
+                           Output: ft1_1.c2, (sum(ft1_1.c1))
+                           Relations: Aggregate on (public.ft1 ft1_1)
+                           Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
+                     ->  Hash
+                           Output: ft1.c2, (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: ft1.c2, PARTIAL count(*)
+                                 Group Key: ft1.c2
+                                 ->  Foreign Scan on public.ft1
+                                       Output: ft1.c2
+                                       Remote SQL: SELECT c2 FROM "S 1"."T 1"
+(24 rows)
 
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
  count |   b   
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 59a0874528a..780b4a9fed1 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5470,6 +5470,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-eager-aggregate" xreflabel="enable_eager_aggregate">
+      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_eager_aggregate</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's ability to partially push
+        aggregation past a join, and finalize it once all the relations are
+        joined. The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
       <term><varname>enable_gathermerge</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 9c724ccfabf..48a575c5bda 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1501,3 +1501,92 @@ breaking down aggregation or grouping over a partitioned relation into
 aggregation or grouping over its partitions is called partitionwise
 aggregation.  Especially when the partition keys match the GROUP BY clause,
 this can be significantly faster than the regular method.
+
+Eager aggregation
+-----------------
+
+Eager aggregation is a query optimization technique that partially
+pushes aggregation past a join, and finalizes it once all the
+relations are joined.  Eager aggregation may reduce the number of
+input rows to the join and thus could result in a better overall plan.
+
+To prove that the transformation is correct, we partition the tables
+in the FROM clause into two groups: those that contain at least one
+aggregation column, and those that do not contain any aggregation
+columns.  Each group can be treated as a single relation formed by the
+Cartesian product of the tables within that group.  Therefore, without
+loss of generality, we can assume that the FROM clause contains
+exactly two relations, R1 and R2, where R1 represents the relation
+containing all aggregation columns, and R2 represents the relation
+without any aggregation columns.
+
+Let the query be of the form:
+
+SELECT G, AGG(A)
+FROM R1 JOIN R2 ON J
+GROUP BY G;
+
+where G is the set of grouping keys that may include columns from R1
+and/or R2; AGG(A) is an aggregate function over columns A from R1; J
+is the join condition between R1 and R2.
+
+The transformation of eager aggregation is:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 ON J)
+
+This equivalence holds under the following conditions:
+
+1) AGG is decomposable, meaning that it can be computed in two stages:
+a partial aggregation followed by a final aggregation;
+2) The set G1 used in the pre-aggregation of R1 includes:
+    * all columns from R1 that are part of the grouping keys G, and
+    * all columns from R1 that appear in the join condition J.
+3) The grouping operator for any column in G1 must be compatible with
+the operator used for that column in the join condition J.
+
+Since G1 includes all columns from R1 that appear in either the
+grouping keys G or the join condition J, all rows within each partial
+group have identical values for both the grouping keys and the
+join-relevant columns from R1, assuming compatible operators are used.
+As a result, the rows within a partial group are indistinguishable in
+terms of their contribution to the aggregation and their behavior in
+the join.  This ensures that all rows in the same partial group share
+the same "destiny": they either all match or all fail to match a given
+row in R2.  Because the aggregate function AGG is decomposable,
+aggregating the partial results after the join yields the same final
+result as aggregating after the full join, thereby preserving query
+semantics.  Q.E.D.
+
+One restriction is that we cannot push partial aggregation down to a
+relation that is in the nullable side of an outer join, because the
+NULL-extended rows produced by the outer join would not be available
+when we perform the partial aggregation, while with a
+non-eager-aggregation plan these rows are available for the top-level
+aggregation.  Pushing partial aggregation in this case may result in
+the rows being grouped differently than expected, or produce incorrect
+values from the aggregate functions.
+
+During the construction of the join tree, we evaluate each base or
+join relation to determine if eager aggregation can be applied.  If
+feasible, we create a separate RelOptInfo called a "grouped relation"
+and generate grouped paths by adding sorted and hashed partial
+aggregation paths on top of the non-grouped paths.  To limit planning
+time, we consider only the cheapest or suitably-sorted non-grouped
+paths in this step.
+
+Another way to generate grouped paths is to join a grouped relation
+with a non-grouped relation.  Joining two grouped relations is
+currently not supported.
+
+To further limit planning time, we currently adopt a strategy where
+partial aggregation is pushed only to the lowest feasible level in the
+join tree where it provides a significant reduction in row count.
+This strategy also helps ensure that all grouped paths for the same
+grouped relation produce the same set of rows, which is important to
+support a fundamental assumption of the planner.
+
+If we have generated a grouped relation for the topmost join relation,
+we need to finalize its paths at the end.  The final paths will
+compete in the usual way with paths built from regular planning.
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index f07d1dc8ac6..4a65f955ca6 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -279,6 +279,27 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
 				/* Find and save the cheapest paths for this joinrel */
 				set_cheapest(joinrel);
 
+				/*
+				 * Except for the topmost scan/join rel, consider generating
+				 * partial aggregation paths for the grouped relation on top
+				 * of the paths of this rel.  After that, we're done creating
+				 * paths for the grouped relation, so run set_cheapest().
+				 */
+				if (!bms_equal(joinrel->relids, root->all_query_rels))
+				{
+					RelOptInfo *grouped_rel;
+
+					grouped_rel = joinrel->grouped_rel;
+					if (grouped_rel)
+					{
+						Assert(IS_GROUPED_REL(grouped_rel));
+
+						generate_grouped_paths(root, grouped_rel, joinrel,
+											   grouped_rel->agg_info);
+						set_cheapest(grouped_rel);
+					}
+				}
+
 				/* Absorb new clump into old */
 				old_clump->joinrel = joinrel;
 				old_clump->size += new_clump->size;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 6cc6966b060..e75bb41b58d 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -47,6 +48,7 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -77,6 +79,7 @@ typedef enum pushdown_safe_type
 
 /* These parameters are set by GUC */
 bool		enable_geqo = false;	/* just in case GUC doesn't set it */
+bool		enable_eager_aggregate = true;
 int			geqo_threshold;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
@@ -90,6 +93,7 @@ join_search_hook_type join_search_hook = NULL;
 
 static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_base_grouped_rels(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 						 Index rti, RangeTblEntry *rte);
@@ -114,6 +118,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 								Index rti, RangeTblEntry *rte);
 static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 									Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
 static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 										 List *live_childrels,
 										 List *all_child_pathkeys);
@@ -182,6 +187,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	 */
 	set_base_rel_sizes(root);
 
+	/*
+	 * Build grouped relations for base rels where possible.
+	 */
+	setup_base_grouped_rels(root);
+
 	/*
 	 * We should now have size estimates for every actual table involved in
 	 * the query, and we also know which if any have been deleted from the
@@ -323,6 +333,39 @@ set_base_rel_sizes(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_base_grouped_rels
+ *	  For each base relation, build a grouped base relation if eager
+ *	  aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_base_grouped_rels(PlannerInfo *root)
+{
+	Index		rti;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo *rel = root->simple_rel_array[rti];
+
+		/* there may be empty slots corresponding to non-baserel RTEs */
+		if (rel == NULL)
+			continue;
+
+		Assert(rel->relid == rti);	/* sanity check on array */
+		Assert(IS_SIMPLE_REL(rel)); /* sanity check on rel */
+
+		(void) build_simple_grouped_rel(root, rel);
+	}
+}
+
 /*
  * set_base_rel_pathlists
  *	  Finds all paths available for scanning each base-relation entry.
@@ -559,6 +602,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* Now find the cheapest of the paths for this rel */
 	set_cheapest(rel);
 
+	/*
+	 * If a grouped relation for this rel exists, build partial aggregation
+	 * paths for it.
+	 *
+	 * Note that this can only happen after we've called set_cheapest() for
+	 * this base rel, because we need its cheapest paths.
+	 */
+	set_grouped_rel_pathlist(root, rel);
+
 #ifdef OPTIMIZER_DEBUG
 	pprint(rel);
 #endif
@@ -1305,6 +1357,36 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	add_paths_to_append_rel(root, rel, live_childrels);
 }
 
+/*
+ * set_grouped_rel_pathlist
+ *	  If a grouped relation for the given 'rel' exists, build partial
+ *	  aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Add paths to the grouped base relation if one exists. */
+	grouped_rel = rel->grouped_rel;
+	if (grouped_rel)
+	{
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		generate_grouped_paths(root, grouped_rel, rel,
+							   grouped_rel->agg_info);
+		set_cheapest(grouped_rel);
+	}
+}
+
 
 /*
  * add_paths_to_append_rel
@@ -3335,6 +3417,319 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
 	}
 }
 
+/*
+ * generate_grouped_paths
+ *		Generate paths for a grouped relation by adding sorted and hashed
+ *		partial aggregation paths on top of paths of the ungrouped base or join
+ *		relation.
+ *
+ * The needed information is provided by the RelAggInfo structure.
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *grouped_rel,
+					   RelOptInfo *rel, RelAggInfo *agg_info)
+{
+	AggClauseCosts agg_costs;
+	bool		can_hash;
+	bool		can_sort;
+	Path	   *cheapest_total_path = NULL;
+	Path	   *cheapest_partial_path = NULL;
+	double		dNumGroups = 0;
+	double		dNumPartialGroups = 0;
+
+	if (IS_DUMMY_REL(rel))
+	{
+		mark_dummy_rel(grouped_rel);
+		return;
+	}
+
+	/*
+	 * We push partial aggregation only to the lowest possible level in the
+	 * join tree that is deemed useful.
+	 */
+	if (!bms_equal(agg_info->apply_at, rel->relids) ||
+		!agg_info->agg_useful)
+		return;
+
+	MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+	get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+	/*
+	 * Determine whether it's possible to perform sort-based implementations
+	 * of grouping.
+	 */
+	can_sort = grouping_is_sortable(agg_info->group_clauses);
+
+	/*
+	 * Determine whether we should consider hash-based implementations of
+	 * grouping.
+	 */
+	Assert(root->numOrderedAggs == 0);
+	can_hash = (agg_info->group_clauses != NIL &&
+				grouping_is_hashable(agg_info->group_clauses));
+
+	/*
+	 * Consider whether we should generate partially aggregated non-partial
+	 * paths.  We can only do this if we have a non-partial path.
+	 */
+	if (rel->pathlist != NIL)
+	{
+		cheapest_total_path = rel->cheapest_total_path;
+		Assert(cheapest_total_path != NULL);
+	}
+
+	/*
+	 * If parallelism is possible for grouped_rel, then we should consider
+	 * generating partially-grouped partial paths.  However, if the ungrouped
+	 * rel has no partial paths, then we can't.
+	 */
+	if (grouped_rel->consider_parallel && rel->partial_pathlist != NIL)
+	{
+		cheapest_partial_path = linitial(rel->partial_pathlist);
+		Assert(cheapest_partial_path != NULL);
+	}
+
+	/* Estimate number of partial groups. */
+	if (cheapest_total_path != NULL)
+		dNumGroups = estimate_num_groups(root,
+										 agg_info->group_exprs,
+										 cheapest_total_path->rows,
+										 NULL, NULL);
+	if (cheapest_partial_path != NULL)
+		dNumPartialGroups = estimate_num_groups(root,
+												agg_info->group_exprs,
+												cheapest_partial_path->rows,
+												NULL, NULL);
+
+	if (can_sort && cheapest_total_path != NULL)
+	{
+		ListCell   *lc;
+
+		/*
+		 * Use any available suitably-sorted path as input, and also consider
+		 * sorting the cheapest-total path.
+		 */
+		foreach(lc, rel->pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													path->pathkeys,
+													&presorted_keys);
+			if (!is_sorted)
+			{
+				/*
+				 * Try at least sorting the cheapest path and also try
+				 * incrementally sorting any path which is partially sorted
+				 * already (no need to deal with paths which have presorted
+				 * keys when incremental sort is disabled unless it's the
+				 * cheapest input path).
+				 */
+				if (input_path != cheapest_total_path &&
+					(presorted_keys == 0 || !enable_incremental_sort))
+					continue;
+
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumGroups);
+
+			add_path(grouped_rel, path);
+		}
+	}
+
+	if (can_sort && cheapest_partial_path != NULL)
+	{
+		ListCell   *lc;
+
+		/* Similar to above logic, but for partial paths. */
+		foreach(lc, rel->partial_pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													path->pathkeys,
+													&presorted_keys);
+
+			if (!is_sorted)
+			{
+				/*
+				 * Try at least sorting the cheapest path and also try
+				 * incrementally sorting any path which is partially sorted
+				 * already (no need to deal with paths which have presorted
+				 * keys when incremental sort is disabled unless it's the
+				 * cheapest input path).
+				 */
+				if (input_path != cheapest_partial_path &&
+					(presorted_keys == 0 || !enable_incremental_sort))
+					continue;
+
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumPartialGroups);
+
+			add_partial_path(grouped_rel, path);
+		}
+	}
+
+	/*
+	 * Add a partially-grouped HashAgg Path where possible
+	 */
+	if (can_hash && cheapest_total_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_total_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumGroups);
+
+		add_path(grouped_rel, path);
+	}
+
+	/*
+	 * Now add a partially-grouped HashAgg partial Path where possible
+	 */
+	if (can_hash && cheapest_partial_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_partial_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumPartialGroups);
+
+		add_partial_path(grouped_rel, path);
+	}
+}
+
 /*
  * make_rel_from_joinlist
  *	  Build access paths using a "joinlist" to guide the join path search.
@@ -3494,6 +3889,10 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 		 *
 		 * After that, we're done creating paths for the joinrel, so run
 		 * set_cheapest().
+		 *
+		 * In addition, we also run generate_grouped_paths() for the grouped
+		 * relation of each just-processed joinrel, and run set_cheapest() for
+		 * the grouped relation afterwards.
 		 */
 		foreach(lc, root->join_rel_level[lev])
 		{
@@ -3514,6 +3913,27 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 			/* Find and save the cheapest paths for this rel */
 			set_cheapest(rel);
 
+			/*
+			 * Except for the topmost scan/join rel, consider generating
+			 * partial aggregation paths for the grouped relation on top of
+			 * the paths of this rel.  After that, we're done creating paths
+			 * for the grouped relation, so run set_cheapest().
+			 */
+			if (!bms_equal(rel->relids, root->all_query_rels))
+			{
+				RelOptInfo *grouped_rel;
+
+				grouped_rel = rel->grouped_rel;
+				if (grouped_rel)
+				{
+					Assert(IS_GROUPED_REL(grouped_rel));
+
+					generate_grouped_paths(root, grouped_rel, rel,
+										   grouped_rel->agg_info);
+					set_cheapest(grouped_rel);
+				}
+			}
+
 #ifdef OPTIMIZER_DEBUG
 			pprint(rel);
 #endif
@@ -4383,6 +4803,29 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 		if (IS_DUMMY_REL(child_rel))
 			continue;
 
+		/*
+		 * Except for the topmost scan/join rel, consider generating partial
+		 * aggregation paths for the grouped relation on top of the paths of
+		 * this partitioned child-join.  After that, we're done creating paths
+		 * for the grouped relation, so run set_cheapest().
+		 */
+		if (!bms_equal(IS_OTHER_REL(rel) ?
+					   rel->top_parent_relids : rel->relids,
+					   root->all_query_rels))
+		{
+			RelOptInfo *grouped_rel;
+
+			grouped_rel = child_rel->grouped_rel;
+			if (grouped_rel)
+			{
+				Assert(IS_GROUPED_REL(grouped_rel));
+
+				generate_grouped_paths(root, grouped_rel, child_rel,
+									   grouped_rel->agg_info);
+				set_cheapest(grouped_rel);
+			}
+		}
+
 #ifdef OPTIMIZER_DEBUG
 		pprint(child_rel);
 #endif
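
Note on the sort handling in generate_grouped_paths() above: for each input
path, the code either uses it as is, skips it, adds a full sort, or adds an
incremental sort. The standalone sketch below restates only that decision. It
is not part of the patch; the enum and the function name choose_sort() are
hypothetical, and the four booleans/integers stand in for is_sorted, the
"input_path == cheapest path" test, presorted_keys, and
enable_incremental_sort.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the four outcomes; not PostgreSQL identifiers. */
typedef enum SortChoice
{
	SORT_NOT_NEEDED,			/* path already satisfies group_pathkeys */
	SORT_SKIP_PATH,				/* not worth building an Agg on this path */
	SORT_FULL,					/* add an explicit full sort */
	SORT_INCREMENTAL			/* add an incremental sort */
} SortChoice;

/*
 * Mirror of the branch structure used for both the non-partial and the
 * partial input paths in the hunk above.
 */
static SortChoice
choose_sort(bool is_sorted, bool is_cheapest, int presorted_keys,
			bool incremental_sort_enabled)
{
	bool		full_sort_only;

	assert(presorted_keys >= 0);

	if (is_sorted)
		return SORT_NOT_NEEDED;

	/* A full sort is the only option without usable presorted keys. */
	full_sort_only = (presorted_keys == 0 || !incremental_sort_enabled);

	/* Only the cheapest input path is worth fully sorting. */
	if (!is_cheapest && full_sort_only)
		return SORT_SKIP_PATH;

	return full_sort_only ? SORT_FULL : SORT_INCREMENTAL;
}
```

Read this way, a full sort is only ever placed on the cheapest input path,
while any partially sorted path may get an incremental sort as long as
incremental sort is enabled.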
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index aad41b94009..477b0bc3b84 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -16,6 +16,7 @@
 
 #include "miscadmin.h"
 #include "optimizer/appendinfo.h"
+#include "optimizer/cost.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -35,6 +36,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
 static bool restriction_is_constant_false(List *restrictlist,
 										  RelOptInfo *joinrel,
 										  bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+								  RelOptInfo *rel2, RelOptInfo *joinrel,
+								  SpecialJoinInfo *sjinfo, List *restrictlist);
 static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 										RelOptInfo *rel2, RelOptInfo *joinrel,
 										SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -763,6 +767,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 		return joinrel;
 	}
 
+	/* Build a grouped join relation for 'joinrel' if possible. */
+	make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+						  restrictlist);
+
 	/* Add paths to the join relation. */
 	populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
 								restrictlist);
@@ -874,6 +882,186 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
 	return input_relids;
 }
 
+/*
+ * make_grouped_join_rel
+ *	  Build a grouped join relation for the given "joinrel" if eager
+ *	  aggregation is applicable and the resulting grouped paths are considered
+ *	  useful.
+ *
+ * There are two strategies for generating grouped paths for a join relation:
+ *
+ * 1. Join a grouped (partially aggregated) input relation with a non-grouped
+ * input (e.g., AGG(B) JOIN A).
+ *
+ * 2. Apply partial aggregation (sorted or hashed) on top of existing
+ * non-grouped join paths (e.g., AGG(A JOIN B)).
+ *
+ * To limit planning effort and avoid an explosion of alternatives, we adopt a
+ * strategy where partial aggregation is only pushed to the lowest possible
+ * level in the join tree that is deemed useful.  That is, if grouped paths can
+ * be built using the first strategy, we skip consideration of the second
+ * strategy for the same join level.
+ *
+ * Additionally, if there are multiple lowest useful levels where partial
+ * aggregation could be applied, such as in a join tree with relations A, B,
+ * and C where both "AGG(A JOIN B) JOIN C" and "A JOIN AGG(B JOIN C)" are valid
+ * placements, we choose only the first one encountered during join search.
+ * This avoids generating multiple versions of the same grouped relation based
+ * on different aggregation placements.
+ *
+ * These heuristics also ensure that all grouped paths for the same grouped
+ * relation produce the same set of rows, which is a basic assumption in the
+ * planner.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+					  RelOptInfo *rel2, RelOptInfo *joinrel,
+					  SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+	RelOptInfo *grouped_rel;
+	RelOptInfo *grouped_rel1;
+	RelOptInfo *grouped_rel2;
+	bool		rel1_empty;
+	bool		rel2_empty;
+	Relids		agg_apply_at;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Retrieve the grouped relations for the two input rels */
+	grouped_rel1 = rel1->grouped_rel;
+	grouped_rel2 = rel2->grouped_rel;
+
+	rel1_empty = (grouped_rel1 == NULL || IS_DUMMY_REL(grouped_rel1));
+	rel2_empty = (grouped_rel2 == NULL || IS_DUMMY_REL(grouped_rel2));
+
+	/* Find or construct a grouped joinrel for this joinrel */
+	grouped_rel = joinrel->grouped_rel;
+	if (grouped_rel == NULL)
+	{
+		RelAggInfo *agg_info = NULL;
+
+		/*
+		 * Prepare the information needed to create grouped paths for this
+		 * join relation.
+		 */
+		agg_info = create_rel_agg_info(root, joinrel);
+		if (agg_info == NULL)
+			return;
+
+		/*
+		 * If grouped paths for the given join relation are not considered
+		 * useful, and no grouped paths can be built by joining grouped input
+		 * relations, skip building the grouped join relation.
+		 */
+		if (!agg_info->agg_useful &&
+			(rel1_empty == rel2_empty))
+			return;
+
+		/* build the grouped relation */
+		grouped_rel = build_grouped_rel(root, joinrel);
+		grouped_rel->reltarget = agg_info->target;
+
+		if (rel1_empty != rel2_empty)
+		{
+			/*
+			 * If there is exactly one grouped input relation, then we can
+			 * build grouped paths by joining the input relations.  Set size
+			 * estimates for the grouped join relation based on the input
+			 * relations, and update the lowest join level where partial
+			 * aggregation is applied to that of the grouped input relation.
+			 */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			agg_info->apply_at = rel1_empty ?
+				grouped_rel2->agg_info->apply_at :
+				grouped_rel1->agg_info->apply_at;
+		}
+		else
+		{
+			/*
+			 * Otherwise, grouped paths can be built by applying partial
+			 * aggregation on top of existing non-grouped join paths.  Set
+			 * size estimates for the grouped join relation based on the
+			 * estimated number of groups, and track the lowest join level
+			 * where partial aggregation is applied.  Note that these values
+			 * may be updated later if it is determined that grouped paths can
+			 * be constructed by joining other input relations.
+			 */
+			grouped_rel->rows = agg_info->grouped_rows;
+			agg_info->apply_at = bms_copy(joinrel->relids);
+		}
+
+		grouped_rel->agg_info = agg_info;
+		joinrel->grouped_rel = grouped_rel;
+	}
+
+	Assert(IS_GROUPED_REL(grouped_rel));
+
+	/* We may have already proven this grouped join relation to be dummy. */
+	if (IS_DUMMY_REL(grouped_rel))
+		return;
+
+	/*
+	 * Nothing to do if there's no grouped input relation.  Also, joining two
+	 * grouped relations is not currently supported.
+	 */
+	if (rel1_empty == rel2_empty)
+		return;
+
+	/*
+	 * Get the lowest join level where partial aggregation is applied among
+	 * the given input relations.
+	 */
+	agg_apply_at = rel1_empty ?
+		grouped_rel2->agg_info->apply_at :
+		grouped_rel1->agg_info->apply_at;
+
+	/*
+	 * If it's not the designated level, skip building grouped paths.
+	 *
+	 * One exception is when it is a subset of the previously recorded level.
+	 * In that case, we need to update the designated level to this one, and
+	 * adjust the size estimates for the grouped join relation accordingly.
+	 * For example, suppose partial aggregation can be applied on top of (B
+	 * JOIN C).  If we first construct the join as ((A JOIN B) JOIN C), we'd
+	 * record the designated level as including all three relations (A B C).
+	 * Later, when we consider (A JOIN (B JOIN C)), we encounter the smaller
+	 * (B C) join level directly.  Since this is a subset of the previous
+	 * level and still valid for partial aggregation, we update the designated
+	 * level to (B C), and adjust the size estimates accordingly.
+	 */
+	if (!bms_equal(agg_apply_at, grouped_rel->agg_info->apply_at))
+	{
+		if (bms_is_subset(agg_apply_at, grouped_rel->agg_info->apply_at))
+		{
+			/* Adjust the size estimates for the grouped join relation. */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			grouped_rel->agg_info->apply_at = agg_apply_at;
+		}
+		else
+			return;
+	}
+
+	/* Make paths for the grouped join relation. */
+	populate_joinrel_with_paths(root,
+								rel1_empty ? rel1 : grouped_rel1,
+								rel2_empty ? rel2 : grouped_rel2,
+								grouped_rel,
+								sjinfo,
+								restrictlist);
+}
+
 /*
  * populate_joinrel_with_paths
  *	  Add paths to the given joinrel for given pair of joining relations. The
@@ -1615,6 +1803,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 						 adjust_child_relids(joinrel->relids,
 											 nappinfos, appinfos)));
 
+		/* Build a grouped join relation for 'child_joinrel' if possible */
+		make_grouped_join_rel(root, child_rel1, child_rel2,
+							  child_joinrel, child_sjinfo,
+							  child_restrictlist);
+
 		/* And make paths for the child join */
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 01804b085b3..7fa1e5099b1 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/nbtree.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
@@ -81,6 +82,9 @@ typedef struct JoinTreeItem
 } JoinTreeItem;
 
 
+static bool has_internal_aggtranstype(PlannerInfo *root);
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
 static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
 									   Index rtindex);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -628,6 +632,315 @@ remove_useless_groupby_columns(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_eager_aggregation
+ *	  Check if eager aggregation is applicable, and if so collect suitable
+ *	  aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+	/*
+	 * Don't apply eager aggregation if disabled by the user.
+	 */
+	if (!enable_eager_aggregate)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if there are no available GROUP BY
+	 * clauses.
+	 */
+	if (!root->processed_groupClause)
+		return;
+
+	/*
+	 * For now we don't try to support grouping sets.
+	 */
+	if (root->parse->groupingSets)
+		return;
+
+	/*
+	 * For now we don't try to support DISTINCT or ORDER BY aggregates.
+	 */
+	if (root->numOrderedAggs > 0)
+		return;
+
+	/*
+	 * If there are any aggregates that do not support partial mode, or any
+	 * partial aggregates that are non-serializable, do not apply eager
+	 * aggregation.
+	 */
+	if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+		return;
+
+	/*
+	 * We don't try to apply eager aggregation if there are set-returning
+	 * functions in the targetlist.
+	 */
+	if (root->parse->hasTargetSRFs)
+		return;
+
+	/*
+	 * Eager aggregation only makes sense if there are multiple base rels in
+	 * the query.
+	 */
+	if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if any aggregate uses INTERNAL transition
+	 * type.
+	 *
+	 * Although INTERNAL is marked as pass-by-value, it usually points to a
+	 * large internal data structure (like those used by string_agg or
+	 * array_agg).  These transition states can grow large and their size is
+	 * hard to estimate.  Applying eager aggregation in such cases risks high
+	 * memory usage since partial aggregation results might be stored in join
+	 * hash tables or materialized nodes.
+	 */
+	if (has_internal_aggtranstype(root))
+		return;
+
+	/*
+	 * Collect aggregate expressions and plain Vars that appear in the
+	 * targetlist and havingQual.
+	 */
+	create_agg_clause_infos(root);
+
+	/*
+	 * If there are no suitable aggregate expressions, we cannot apply eager
+	 * aggregation.
+	 */
+	if (root->agg_clause_list == NIL)
+		return;
+
+	/*
+	 * Collect grouping expressions that appear in grouping clauses.
+	 */
+	create_grouping_expr_infos(root);
+}
+
+/*
+ * has_internal_aggtranstype
+ *	  Checks if any aggregate uses INTERNAL transition type.
+ */
+static bool
+has_internal_aggtranstype(PlannerInfo *root)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->aggtransinfos)
+	{
+		AggTransInfo *transinfo = lfirst_node(AggTransInfo, lc);
+
+		if (transinfo->aggtranstype == INTERNALOID)
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * create_agg_clause_infos
+ *	  Search the targetlist and havingQual for Aggrefs and plain Vars, and
+ *	  create an AggClauseInfo for each Aggref node.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+	List	   *tlist_exprs;
+	List	   *agg_clause_list = NIL;
+	List	   *tlist_vars = NIL;
+	Relids		aggregate_relids = NULL;
+	bool		eager_agg_applicable = true;
+	ListCell   *lc;
+
+	Assert(root->agg_clause_list == NIL);
+	Assert(root->tlist_vars == NIL);
+
+	tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+								  PVC_INCLUDE_AGGREGATES |
+								  PVC_RECURSE_WINDOWFUNCS |
+								  PVC_RECURSE_PLACEHOLDERS);
+
+	/*
+	 * Aggregates within the HAVING clause need to be processed in the same
+	 * way as those in the targetlist.  Note that HAVING can contain Aggrefs
+	 * but not WindowFuncs.
+	 */
+	if (root->parse->havingQual != NULL)
+	{
+		List	   *having_exprs;
+
+		having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+									   PVC_INCLUDE_AGGREGATES |
+									   PVC_RECURSE_PLACEHOLDERS);
+		if (having_exprs != NIL)
+		{
+			tlist_exprs = list_concat(tlist_exprs, having_exprs);
+			list_free(having_exprs);
+		}
+	}
+
+	foreach(lc, tlist_exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Aggref	   *aggref;
+		Relids		agg_eval_at;
+		AggClauseInfo *ac_info;
+
+		/* For now we don't try to support GROUPING() expressions */
+		if (IsA(expr, GroupingFunc))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* Collect plain Vars for future reference */
+		if (IsA(expr, Var))
+		{
+			tlist_vars = list_append_unique(tlist_vars, expr);
+			continue;
+		}
+
+		aggref = castNode(Aggref, expr);
+
+		Assert(aggref->aggorder == NIL);
+		Assert(aggref->aggdistinct == NIL);
+
+		/*
+		 * If there are any securityQuals, do not try to apply eager
+		 * aggregation if any non-leakproof aggregate functions are present.
+		 * This is overly strict, but for now...
+		 */
+		if (root->qual_security_level > 0 &&
+			!get_func_leakproof(aggref->aggfnoid))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+		/*
+		 * If all base relations in the query are referenced by aggregate
+		 * functions, then eager aggregation is not applicable.
+		 */
+		aggregate_relids = bms_add_members(aggregate_relids, agg_eval_at);
+		if (bms_is_subset(root->all_baserels, aggregate_relids))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* OK, create the AggClauseInfo node */
+		ac_info = makeNode(AggClauseInfo);
+		ac_info->aggref = aggref;
+		ac_info->agg_eval_at = agg_eval_at;
+
+		/* ... and add it to the list */
+		agg_clause_list = list_append_unique(agg_clause_list, ac_info);
+	}
+
+	list_free(tlist_exprs);
+
+	if (eager_agg_applicable)
+	{
+		root->agg_clause_list = agg_clause_list;
+		root->tlist_vars = tlist_vars;
+	}
+	else
+	{
+		list_free_deep(agg_clause_list);
+		list_free(tlist_vars);
+	}
+}
+
+/*
+ * create_grouping_expr_infos
+ *	  Create a GroupingExprInfo for each expression usable as a grouping key.
+ *
+ * If any grouping expression is not suitable, we will just return with
+ * root->group_expr_list being NIL.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+	List	   *exprs = NIL;
+	List	   *sortgrouprefs = NIL;
+	List	   *btree_opfamilies = NIL;
+	ListCell   *lc,
+			   *lc1,
+			   *lc2,
+			   *lc3;
+
+	Assert(root->group_expr_list == NIL);
+
+	foreach(lc, root->processed_groupClause)
+	{
+		SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+		TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+		TypeCacheEntry *tce;
+		Oid			equalimageproc;
+
+		Assert(tle->ressortgroupref > 0);
+
+		/*
+		 * For now we only support plain Vars as grouping expressions.
+		 */
+		if (!IsA(tle->expr, Var))
+			return;
+
+		/*
+		 * Eager aggregation is only possible if equality implies image
+		 * equality for each grouping key.  Otherwise, placing keys with
+		 * different byte images into the same group may result in the loss of
+		 * information that could be necessary to evaluate upper qual clauses.
+		 *
+		 * For instance, the NUMERIC data type is not supported, as values
+		 * that are considered equal by the equality operator (e.g., 0 and
+		 * 0.0) can have different scales.
+		 */
+		tce = lookup_type_cache(exprType((Node *) tle->expr),
+								TYPECACHE_BTREE_OPFAMILY);
+		if (!OidIsValid(tce->btree_opf) ||
+			!OidIsValid(tce->btree_opintype))
+			return;
+
+		equalimageproc = get_opfamily_proc(tce->btree_opf,
+										   tce->btree_opintype,
+										   tce->btree_opintype,
+										   BTEQUALIMAGE_PROC);
+		if (!OidIsValid(equalimageproc) ||
+			!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+											   tce->typcollation,
+											   ObjectIdGetDatum(tce->btree_opintype))))
+			return;
+
+		exprs = lappend(exprs, tle->expr);
+		sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+		btree_opfamilies = lappend_oid(btree_opfamilies, tce->btree_opf);
+	}
+
+	/*
+	 * Construct a GroupingExprInfo for each expression.
+	 */
+	forthree(lc1, exprs, lc2, sortgrouprefs, lc3, btree_opfamilies)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc1);
+		int			sortgroupref = lfirst_int(lc2);
+		Oid			btree_opfamily = lfirst_oid(lc3);
+		GroupingExprInfo *ge_info;
+
+		ge_info = makeNode(GroupingExprInfo);
+		ge_info->expr = (Expr *) copyObject(expr);
+		ge_info->sortgroupref = sortgroupref;
+		ge_info->btree_opfamily = btree_opfamily;
+
+		root->group_expr_list = lappend(root->group_expr_list, ge_info);
+	}
+}
+
 /*****************************************************************************
  *
  *	  LATERAL REFERENCES
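
Note on create_agg_clause_infos() above: the loop gives up on eager
aggregation as soon as the relations referenced by the aggregates seen so far
cover every base relation. The sketch below isolates that check. It is not
from the patch: relid sets are modelled as bitmasks, and RelidSet and
aggregates_leave_room() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for Relids: one bit per base relation. */
typedef unsigned int RelidSet;

/*
 * Return true if at least one base relation is not referenced by any
 * aggregate.  agg_eval_at[i] plays the role of pull_varnos() applied to the
 * i-th Aggref.
 */
static bool
aggregates_leave_room(RelidSet all_baserels,
					  const RelidSet *agg_eval_at, size_t naggs)
{
	RelidSet	aggregate_relids = 0;
	size_t		i;

	assert(agg_eval_at != NULL || naggs == 0);

	for (i = 0; i < naggs; i++)
	{
		aggregate_relids |= agg_eval_at[i];

		/* Same test as bms_is_subset(all_baserels, aggregate_relids). */
		if ((all_baserels & ~aggregate_relids) == 0)
			return false;
	}

	return true;
}
```

For a two-table query, an aggregate over only one table passes the check,
while aggregates that together reference both tables do not.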
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 5467e094ca7..eefc486a566 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -76,6 +76,9 @@ query_planner(PlannerInfo *root,
 	root->placeholder_list = NIL;
 	root->placeholder_array = NULL;
 	root->placeholder_array_size = 0;
+	root->agg_clause_list = NIL;
+	root->group_expr_list = NIL;
+	root->tlist_vars = NIL;
 	root->fkey_list = NIL;
 	root->initial_rels = NIL;
 
@@ -265,6 +268,12 @@ query_planner(PlannerInfo *root,
 	 */
 	extract_restriction_or_clauses(root);
 
+	/*
+	 * Check if eager aggregation is applicable, and if so, set up
+	 * root->agg_clause_list and root->group_expr_list.
+	 */
+	setup_eager_aggregation(root);
+
 	/*
 	 * Now expand appendrels by adding "otherrels" for their children.  We
 	 * delay this to the end so that we have as much information as possible
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 549aedcfa99..6289902fc93 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -231,7 +231,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 									  RelOptInfo *partially_grouped_rel,
 									  const AggClauseCosts *agg_costs,
 									  grouping_sets_data *gd,
-									  double dNumGroups,
 									  GroupPathExtraData *extra);
 static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
 												 RelOptInfo *grouped_rel,
@@ -3982,9 +3981,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 							   GroupPathExtraData *extra,
 							   RelOptInfo **partially_grouped_rel_p)
 {
-	Path	   *cheapest_path = input_rel->cheapest_total_path;
 	RelOptInfo *partially_grouped_rel = NULL;
-	double		dNumGroups;
 	PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
 
 	/*
@@ -4066,23 +4063,16 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Gather any partially grouped partial paths. */
 	if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
-	{
 		gather_grouping_paths(root, partially_grouped_rel);
-		set_cheapest(partially_grouped_rel);
-	}
 
-	/*
-	 * Estimate number of groups.
-	 */
-	dNumGroups = get_number_of_groups(root,
-									  cheapest_path->rows,
-									  gd,
-									  extra->targetList);
+	/* Now choose the best path(s) for partially_grouped_rel. */
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+		set_cheapest(partially_grouped_rel);
 
 	/* Build final grouping paths */
 	add_paths_to_grouping_rel(root, input_rel, grouped_rel,
 							  partially_grouped_rel, agg_costs, gd,
-							  dNumGroups, extra);
+							  extra);
 
 	/* Give a helpful error if we failed to find any implementation */
 	if (grouped_rel->pathlist == NIL)
@@ -7027,16 +7017,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 						  RelOptInfo *grouped_rel,
 						  RelOptInfo *partially_grouped_rel,
 						  const AggClauseCosts *agg_costs,
-						  grouping_sets_data *gd, double dNumGroups,
+						  grouping_sets_data *gd,
 						  GroupPathExtraData *extra)
 {
 	Query	   *parse = root->parse;
 	Path	   *cheapest_path = input_rel->cheapest_total_path;
+	Path	   *cheapest_partially_grouped_path = NULL;
 	ListCell   *lc;
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 	List	   *havingQual = (List *) extra->havingQual;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+	double		dNumGroups = 0;
+	double		dNumFinalGroups = 0;
+
+	/*
+	 * Estimate number of groups for non-split aggregation.
+	 */
+	dNumGroups = get_number_of_groups(root,
+									  cheapest_path->rows,
+									  gd,
+									  extra->targetList);
+
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+	{
+		cheapest_partially_grouped_path =
+			partially_grouped_rel->cheapest_total_path;
+
+		/*
+		 * Estimate number of groups for final phase of partial aggregation.
+		 */
+		dNumFinalGroups =
+			get_number_of_groups(root,
+								 cheapest_partially_grouped_path->rows,
+								 gd,
+								 extra->targetList);
+	}
 
 	if (can_sort)
 	{
@@ -7149,7 +7165,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 					path = make_ordered_path(root,
 											 grouped_rel,
 											 path,
-											 partially_grouped_rel->cheapest_total_path,
+											 cheapest_partially_grouped_path,
 											 info->pathkeys,
 											 -1.0);
 
@@ -7167,7 +7183,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												 info->clauses,
 												 havingQual,
 												 agg_final_costs,
-												 dNumGroups));
+												 dNumFinalGroups));
 					else
 						add_path(grouped_rel, (Path *)
 								 create_group_path(root,
@@ -7175,7 +7191,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												   path,
 												   info->clauses,
 												   havingQual,
-												   dNumGroups));
+												   dNumFinalGroups));
 
 				}
 			}
@@ -7217,19 +7233,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 		 */
 		if (partially_grouped_rel && partially_grouped_rel->pathlist)
 		{
-			Path	   *path = partially_grouped_rel->cheapest_total_path;
-
 			add_path(grouped_rel, (Path *)
 					 create_agg_path(root,
 									 grouped_rel,
-									 path,
+									 cheapest_partially_grouped_path,
 									 grouped_rel->reltarget,
 									 AGG_HASHED,
 									 AGGSPLIT_FINAL_DESERIAL,
 									 root->processed_groupClause,
 									 havingQual,
 									 agg_final_costs,
-									 dNumGroups));
+									 dNumFinalGroups));
 		}
 	}
 
@@ -7269,6 +7283,7 @@ create_partial_grouping_paths(PlannerInfo *root,
 {
 	Query	   *parse = root->parse;
 	RelOptInfo *partially_grouped_rel;
+	RelOptInfo *eager_agg_rel = NULL;
 	AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
 	Path	   *cheapest_partial_path = NULL;
@@ -7279,6 +7294,15 @@ create_partial_grouping_paths(PlannerInfo *root,
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 
+	/*
+	 * Check whether any partially aggregated paths have been generated
+	 * through eager aggregation.
+	 */
+	if (input_rel->grouped_rel &&
+		!IS_DUMMY_REL(input_rel->grouped_rel) &&
+		input_rel->grouped_rel->pathlist != NIL)
+		eager_agg_rel = input_rel->grouped_rel;
+
 	/*
 	 * Consider whether we should generate partially aggregated non-partial
 	 * paths.  We can only do this if we have a non-partial path, and only if
@@ -7300,11 +7324,13 @@ create_partial_grouping_paths(PlannerInfo *root,
 
 	/*
 	 * If we can't partially aggregate partial paths, and we can't partially
-	 * aggregate non-partial paths, then don't bother creating the new
+	 * aggregate non-partial paths, and no partially aggregated paths were
+	 * generated by eager aggregation, then don't bother creating the new
 	 * RelOptInfo at all, unless the caller specified force_rel_creation.
 	 */
 	if (cheapest_total_path == NULL &&
 		cheapest_partial_path == NULL &&
+		eager_agg_rel == NULL &&
 		!force_rel_creation)
 		return NULL;
 
@@ -7529,6 +7555,51 @@ create_partial_grouping_paths(PlannerInfo *root,
 										 dNumPartialPartialGroups));
 	}
 
+	/*
+	 * Add any partially aggregated paths generated by eager aggregation to
+	 * the new upper relation after applying projection steps as needed.
+	 */
+	if (eager_agg_rel)
+	{
+		/* Add the paths */
+		foreach(lc, eager_agg_rel->pathlist)
+		{
+			Path	   *path = (Path *) lfirst(lc);
+
+			/* Shouldn't have any parameterized paths anymore */
+			Assert(path->param_info == NULL);
+
+			path = (Path *) create_projection_path(root,
+												   partially_grouped_rel,
+												   path,
+												   partially_grouped_rel->reltarget);
+
+			add_path(partially_grouped_rel, path);
+		}
+
+		/*
+		 * Likewise add the partial paths, but only if parallelism is possible
+		 * for partially_grouped_rel.
+		 */
+		if (partially_grouped_rel->consider_parallel)
+		{
+			foreach(lc, eager_agg_rel->partial_pathlist)
+			{
+				Path	   *path = (Path *) lfirst(lc);
+
+				/* Shouldn't have any parameterized paths anymore */
+				Assert(path->param_info == NULL);
+
+				path = (Path *) create_projection_path(root,
+													   partially_grouped_rel,
+													   path,
+													   partially_grouped_rel->reltarget);
+
+				add_partial_path(partially_grouped_rel, path);
+			}
+		}
+	}
+
 	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding partially grouped ForeignPaths.
@@ -8092,13 +8163,6 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
 
 		add_paths_to_append_rel(root, partially_grouped_rel,
 								partially_grouped_live_children);
-
-		/*
-		 * We need call set_cheapest, since the finalization step will use the
-		 * cheapest path from the rel.
-		 */
-		if (partially_grouped_rel->pathlist)
-			set_cheapest(partially_grouped_rel);
 	}
 
 	/* If possible, create append paths for fully grouped children. */
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index 5b3dc0d8653..11c0eb0d180 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -516,6 +516,65 @@ adjust_appendrel_attrs_mutator(Node *node,
 		return (Node *) newinfo;
 	}
 
+	/*
+	 * We have to process RelAggInfo nodes specially.
+	 */
+	if (IsA(node, RelAggInfo))
+	{
+		RelAggInfo *oldinfo = (RelAggInfo *) node;
+		RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newinfo, oldinfo, sizeof(RelAggInfo));
+
+		newinfo->relids = adjust_child_relids(oldinfo->relids,
+											  nappinfos, appinfos);
+
+		newinfo->target = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+										   context);
+
+		newinfo->agg_input = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+										   context);
+
+		newinfo->group_clauses = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_clauses,
+										   context);
+
+		newinfo->group_exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+										   context);
+
+		return (Node *) newinfo;
+	}
+
+	/*
+	 * We have to process PathTarget nodes specially.
+	 */
+	if (IsA(node, PathTarget))
+	{
+		PathTarget *oldtarget = (PathTarget *) node;
+		PathTarget *newtarget = makeNode(PathTarget);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+		newtarget->exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+										   context);
+
+		if (oldtarget->sortgrouprefs)
+		{
+			Size		nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+			newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+			memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+		}
+
+		return (Node *) newtarget;
+	}
+
 	/*
 	 * NOTE: we do not need to recurse into sublinks, because they should
 	 * already have been converted to subplans before we see them.
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index e0192d4a491..26127eb07d1 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2790,8 +2790,7 @@ create_projection_path(PlannerInfo *root,
 	pathnode->path.pathtype = T_Result;
 	pathnode->path.parent = rel;
 	pathnode->path.pathtarget = target;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe &&
@@ -3046,8 +3045,7 @@ create_incremental_sort_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	/* Sort doesn't project, so use source path's pathtarget */
 	pathnode->path.pathtarget = subpath->pathtarget;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
@@ -3094,8 +3092,7 @@ create_sort_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	/* Sort doesn't project, so use source path's pathtarget */
 	pathnode->path.pathtarget = subpath->pathtarget;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
@@ -3256,8 +3253,7 @@ create_agg_path(PlannerInfo *root,
 	pathnode->path.pathtype = T_Agg;
 	pathnode->path.parent = rel;
 	pathnode->path.pathtarget = target;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index ff507331a06..c4054b5d03f 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,8 @@
 
 #include <limits.h>
 
+#include "access/nbtree.h"
+#include "catalog/pg_constraint.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
@@ -27,12 +29,16 @@
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
 #include "parser/parse_relation.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/typcache.h"
 
 
 typedef struct JoinHashEntry
@@ -83,7 +89,22 @@ static void build_child_join_reltarget(PlannerInfo *root,
 									   RelOptInfo *childrel,
 									   int nappinfos,
 									   AppendRelInfo **appinfos);
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+													RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+								  PathTarget *target, PathTarget *agg_input,
+								  List **group_clauses, List **group_exprs);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
 
+/*
+ * Minimum average group size required to consider applying eager aggregation.
+ *
+ * This helps avoid the overhead of eager aggregation when it does not offer
+ * significant row count reduction.
+ */
+#define EAGER_AGG_MIN_GROUP_SIZE 20.0
 
 /*
  * setup_simple_rel_arrays
@@ -276,6 +297,8 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	rel->joininfo = NIL;
 	rel->has_eclass_joins = false;
 	rel->consider_partitionwise_join = false;	/* might get changed later */
+	rel->agg_info = NULL;
+	rel->grouped_rel = NULL;
 	rel->part_scheme = NULL;
 	rel->nparts = -1;
 	rel->boundinfo = NULL;
@@ -406,6 +429,104 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	return rel;
 }
 
+/*
+ * build_simple_grouped_rel
+ *	  Construct a new RelOptInfo representing a grouped version of the input
+ *	  base relation.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+	RelAggInfo *agg_info;
+
+	/*
+	 * We should have available aggregate expressions and grouping
+	 * expressions, otherwise we cannot reach here.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/* nothing to do for dummy rel */
+	if (IS_DUMMY_REL(rel))
+		return NULL;
+
+	/*
+	 * Prepare the information needed to create grouped paths for this base
+	 * relation.
+	 */
+	agg_info = create_rel_agg_info(root, rel);
+	if (agg_info == NULL)
+		return NULL;
+
+	/*
+	 * If grouped paths for the given base relation are not considered useful,
+	 * skip building the grouped relation.
+	 */
+	if (!agg_info->agg_useful)
+		return NULL;
+
+	/* Tracks the lowest join level at which partial aggregation is applied */
+	agg_info->apply_at = bms_copy(rel->relids);
+
+	/* build the grouped relation */
+	grouped_rel = build_grouped_rel(root, rel);
+	grouped_rel->reltarget = agg_info->target;
+	grouped_rel->rows = agg_info->grouped_rows;
+	grouped_rel->agg_info = agg_info;
+
+	rel->grouped_rel = grouped_rel;
+
+	return grouped_rel;
+}
+
+/*
+ * build_grouped_rel
+ *	  Build a grouped relation by flat copying the input relation and resetting
+ *	  the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	grouped_rel = makeNode(RelOptInfo);
+	memcpy(grouped_rel, rel, sizeof(RelOptInfo));
+
+	/*
+	 * clear path info
+	 */
+	grouped_rel->pathlist = NIL;
+	grouped_rel->ppilist = NIL;
+	grouped_rel->partial_pathlist = NIL;
+	grouped_rel->cheapest_startup_path = NULL;
+	grouped_rel->cheapest_total_path = NULL;
+	grouped_rel->cheapest_unique_path = NULL;
+	grouped_rel->cheapest_parameterized_paths = NIL;
+
+	/*
+	 * clear partition info
+	 */
+	grouped_rel->part_scheme = NULL;
+	grouped_rel->nparts = -1;
+	grouped_rel->boundinfo = NULL;
+	grouped_rel->partbounds_merged = false;
+	grouped_rel->partition_qual = NIL;
+	grouped_rel->part_rels = NULL;
+	grouped_rel->live_parts = NULL;
+	grouped_rel->all_partrels = NULL;
+	grouped_rel->partexprs = NULL;
+	grouped_rel->nullable_partexprs = NULL;
+	grouped_rel->consider_partitionwise_join = false;
+
+	/*
+	 * clear size estimates
+	 */
+	grouped_rel->rows = 0;
+
+	return grouped_rel;
+}
+
 /*
  * find_base_rel
  *	  Find a base or otherrel relation entry, which must already exist.
@@ -755,6 +876,8 @@ build_join_rel(PlannerInfo *root,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = NULL;
 	joinrel->top_parent = NULL;
 	joinrel->top_parent_relids = NULL;
@@ -939,6 +1062,8 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = parent_joinrel;
 	joinrel->top_parent = parent_joinrel->top_parent ? parent_joinrel->top_parent : parent_joinrel;
 	joinrel->top_parent_relids = joinrel->top_parent->relids;
@@ -2518,3 +2643,514 @@ build_child_join_reltarget(PlannerInfo *root,
 	childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
 	childrel->reltarget->width = parentrel->reltarget->width;
 }
+
+/*
+ * create_rel_agg_info
+ *	  Create the RelAggInfo structure for the given relation if it can produce
+ *	  grouped paths.  The given relation is the non-grouped one which has the
+ *	  reltarget already constructed.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	RelAggInfo *result;
+	PathTarget *agg_input;
+	PathTarget *target;
+	List	   *group_clauses = NIL;
+	List	   *group_exprs = NIL;
+
+	/*
+	 * The lists of aggregate expressions and grouping expressions should have
+	 * been constructed.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/*
+	 * If this is a child rel, the grouped rel for its parent must already
+	 * have been created if possible.  So we can just use the parent's
+	 * RelAggInfo if there is one, with appropriate variable substitutions.
+	 */
+	if (IS_OTHER_REL(rel))
+	{
+		RelOptInfo *grouped_rel;
+		RelAggInfo *agg_info;
+
+		grouped_rel = rel->top_parent->grouped_rel;
+		if (grouped_rel == NULL)
+			return NULL;
+
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		/* Must do multi-level transformation */
+		agg_info = (RelAggInfo *)
+			adjust_appendrel_attrs_multilevel(root,
+											  (Node *) grouped_rel->agg_info,
+											  rel,
+											  rel->top_parent);
+
+		agg_info->grouped_rows =
+			estimate_num_groups(root, agg_info->group_exprs,
+								rel->rows, NULL, NULL);
+
+		agg_info->apply_at = NULL;	/* caller will change this later */
+
+		/*
+		 * The grouped paths for the given relation are considered useful iff
+		 * the average group size is no less than EAGER_AGG_MIN_GROUP_SIZE.
+		 */
+		agg_info->agg_useful =
+			(rel->rows / agg_info->grouped_rows) >= EAGER_AGG_MIN_GROUP_SIZE;
+
+		return agg_info;
+	}
+
+	/* Check if it's possible to produce grouped paths for this relation. */
+	if (!eager_aggregation_possible_for_relation(root, rel))
+		return NULL;
+
+	/*
+	 * Create targets for the grouped paths and for the input paths of the
+	 * grouped paths.
+	 */
+	target = create_empty_pathtarget();
+	agg_input = create_empty_pathtarget();
+
+	/* ... and initialize these targets */
+	if (!init_grouping_targets(root, rel, target, agg_input,
+							   &group_clauses, &group_exprs))
+		return NULL;
+
+	/*
+	 * Eager aggregation is not applicable if there are no available grouping
+	 * expressions.
+	 */
+	if (list_length(group_clauses) == 0)
+		return NULL;
+
+	/* build the RelAggInfo result */
+	result = makeNode(RelAggInfo);
+
+	result->group_clauses = group_clauses;
+	result->group_exprs = group_exprs;
+
+	/* Calculate pathkeys that represent these grouping requirements */
+	result->group_pathkeys =
+		make_pathkeys_for_sortclauses(root, result->group_clauses,
+									  make_tlist_from_pathtarget(target));
+
+	/* Add aggregates to the grouping target */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		Aggref	   *aggref;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		aggref = (Aggref *) copyObject(ac_info->aggref);
+		mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+		add_column_to_pathtarget(target, (Expr *) aggref, 0);
+	}
+
+	/* Set the estimated eval cost and output width for both targets */
+	set_pathtarget_cost_width(root, target);
+	set_pathtarget_cost_width(root, agg_input);
+
+	result->relids = bms_copy(rel->relids);
+	result->target = target;
+	result->agg_input = agg_input;
+	result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+											   rel->rows, NULL, NULL);
+	result->apply_at = NULL;	/* caller will change this later */
+
+	/*
+	 * The grouped paths for the given relation are considered useful iff the
+	 * average group size is no less than EAGER_AGG_MIN_GROUP_SIZE.
+	 */
+	result->agg_useful =
+		(rel->rows / result->grouped_rows) >= EAGER_AGG_MIN_GROUP_SIZE;
+
+	return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ *	  Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	int			cur_relid;
+
+	/*
+	 * Check to see if the given relation is on the nullable side of an outer
+	 * join.  In this case, we cannot push a partial aggregation down to the
+	 * relation, because the NULL-extended rows produced by the outer join
+	 * would not be available when we perform the partial aggregation, while
+	 * with a non-eager-aggregation plan these rows are available for the
+	 * top-level aggregation.  Doing so may result in the rows being grouped
+	 * differently than expected, or produce incorrect values from the
+	 * aggregate functions.
+	 */
+	cur_relid = -1;
+	while ((cur_relid = bms_next_member(rel->relids, cur_relid)) >= 0)
+	{
+		RelOptInfo *baserel = find_base_rel_ignore_join(root, cur_relid);
+
+		if (baserel == NULL)
+			continue;			/* ignore outer joins in rel->relids */
+
+		if (!bms_is_subset(baserel->nulling_relids, rel->relids))
+			return false;
+	}
+
+	/*
+	 * For now we don't try to support PlaceHolderVars.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = lfirst(lc);
+
+		if (IsA(expr, PlaceHolderVar))
+			return false;
+	}
+
+	/* Caller should only pass base relations or joins. */
+	Assert(rel->reloptkind == RELOPT_BASEREL ||
+		   rel->reloptkind == RELOPT_JOINREL);
+
+	/*
+	 * Check if all aggregate expressions can be evaluated on this relation
+	 * level.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		/*
+		 * Give up if any aggregate requires relations other than the current
+		 * one.  If the aggregate requires the current relation plus
+		 * additional relations, grouping the current relation could make some
+		 * input rows unavailable for the higher aggregate and may reduce the
+		 * number of input rows it receives.  If the aggregate does not
+		 * require the current relation at all, it should not be grouped, as
+		 * we do not support joining two grouped relations.
+		 */
+		if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * init_grouping_targets
+ *	  Initialize the target for grouped paths (target) as well as the target
+ *	  for paths that generate input for the grouped paths (agg_input).
+ *
+ * We also construct the list of SortGroupClauses and the list of grouping
+ * expressions for the partial aggregation, and return them in *group_clauses
+ * and *group_exprs.
+ *
+ * Return true if the targets could be initialized, false otherwise.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+					  PathTarget *target, PathTarget *agg_input,
+					  List **group_clauses, List **group_exprs)
+{
+	ListCell   *lc;
+	List	   *possibly_dependent = NIL;
+	Index		maxSortGroupRef;
+
+	/* Identify the max sortgroupref */
+	maxSortGroupRef = 0;
+	foreach(lc, root->processed_tlist)
+	{
+		Index		ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+		if (ref > maxSortGroupRef)
+			maxSortGroupRef = ref;
+	}
+
+	/*
+	 * At this point, all Vars from this relation that are needed by upper
+	 * joins or are required in the final targetlist should already be present
+	 * in its reltarget.  Therefore, we can safely iterate over this
+	 * relation's reltarget->exprs to construct the PathTarget and grouping
+	 * clauses for the grouped paths.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Index		sortgroupref;
+
+		/*
+		 * Given that PlaceHolderVar currently prevents us from doing eager
+		 * aggregation, the source target cannot contain anything more complex
+		 * than a Var.
+		 */
+		Assert(IsA(expr, Var));
+
+		/*
+		 * Get the sortgroupref of the expr if it is found among, or can be
+		 * deduced from, the original grouping expressions.
+		 */
+		sortgroupref = get_expression_sortgroupref(root, expr);
+		if (sortgroupref > 0)
+		{
+			SortGroupClause *sgc;
+
+			/* Find the matching SortGroupClause */
+			sgc = get_sortgroupref_clause(sortgroupref, root->processed_groupClause);
+			Assert(sgc->tleSortGroupRef <= maxSortGroupRef);
+
+			/*
+			 * If the target expression is to be used as a grouping key, it
+			 * should be emitted by the grouped paths that have been pushed
+			 * down to this relation level.
+			 */
+			add_column_to_pathtarget(target, expr, sortgroupref);
+
+			/*
+			 * ... and it also should be emitted by the input paths.
+			 */
+			add_column_to_pathtarget(agg_input, expr, sortgroupref);
+
+			/*
+			 * Record this SortGroupClause and grouping expression.  Note that
+			 * this SortGroupClause might have already been recorded.
+			 */
+			if (!list_member(*group_clauses, sgc))
+			{
+				*group_clauses = lappend(*group_clauses, sgc);
+				*group_exprs = lappend(*group_exprs, expr);
+			}
+		}
+		else if (is_var_needed_by_join(root, (Var *) expr, rel))
+		{
+			/*
+			 * The expression is needed for an upper join but is neither in
+			 * the GROUP BY clause nor derivable from it using EC (otherwise,
+			 * it would have already been included in the targets above).  We
+			 * need to create a special SortGroupClause for this expression.
+			 *
+			 * It is important to include such expressions in the grouping
+			 * keys.  This is essential to ensure that an aggregated row from
+			 * the partial aggregation matches the other side of the join if
+			 * and only if each row in the partial group does.  This ensures
+			 * that all rows within the same partial group share the same
+			 * 'destiny', which is crucial for maintaining correctness.
+			 */
+			SortGroupClause *sgc;
+			TypeCacheEntry *tce;
+			Oid			equalimageproc;
+
+			/*
+			 * But first, check if equality implies image equality for this
+			 * expression.  If not, we cannot use it as a grouping key.  See
+			 * comments in create_grouping_expr_infos().
+			 */
+			tce = lookup_type_cache(exprType((Node *) expr),
+									TYPECACHE_BTREE_OPFAMILY);
+			if (!OidIsValid(tce->btree_opf) ||
+				!OidIsValid(tce->btree_opintype))
+				return false;
+
+			equalimageproc = get_opfamily_proc(tce->btree_opf,
+											   tce->btree_opintype,
+											   tce->btree_opintype,
+											   BTEQUALIMAGE_PROC);
+			if (!OidIsValid(equalimageproc) ||
+				!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+												   tce->typcollation,
+												   ObjectIdGetDatum(tce->btree_opintype))))
+				return false;
+
+			/* Create the SortGroupClause. */
+			sgc = makeNode(SortGroupClause);
+
+			/* Initialize the SortGroupClause. */
+			sgc->tleSortGroupRef = ++maxSortGroupRef;
+			get_sort_group_operators(exprType((Node *) expr),
+									 false, true, false,
+									 &sgc->sortop, &sgc->eqop, NULL,
+									 &sgc->hashable);
+
+			/* This expression should be emitted by the grouped paths */
+			add_column_to_pathtarget(target, expr, sgc->tleSortGroupRef);
+
+			/* ... and it also should be emitted by the input paths. */
+			add_column_to_pathtarget(agg_input, expr, sgc->tleSortGroupRef);
+
+			/* Record this SortGroupClause and grouping expression */
+			*group_clauses = lappend(*group_clauses, sgc);
+			*group_exprs = lappend(*group_exprs, expr);
+		}
+		else if (is_var_in_aggref_only(root, (Var *) expr))
+		{
+			/*
+			 * The expression is referenced by an aggregate function pushed
+			 * down to this relation and does not appear elsewhere in the
+			 * targetlist or havingQual.  Add it to 'agg_input' but not to
+			 * 'target'.
+			 */
+			add_new_column_to_pathtarget(agg_input, expr);
+		}
+		else
+		{
+			/*
+			 * The expression may be functionally dependent on other
+			 * expressions in the target, but we cannot verify this until all
+			 * target expressions have been constructed.
+			 */
+			possibly_dependent = lappend(possibly_dependent, expr);
+		}
+	}
+
+	/*
+	 * Now we can verify whether an expression is functionally dependent on
+	 * others.
+	 */
+	foreach(lc, possibly_dependent)
+	{
+		Var		   *tvar;
+		List	   *deps = NIL;
+		RangeTblEntry *rte;
+
+		tvar = lfirst_node(Var, lc);
+		rte = root->simple_rte_array[tvar->varno];
+
+		if (check_functional_grouping(rte->relid, tvar->varno,
+									  tvar->varlevelsup,
+									  target->exprs, &deps))
+		{
+			/*
+			 * The expression is functionally dependent on other target
+			 * expressions, so it can be included in the targets.  Since it
+			 * will not be used as a grouping key, a sortgroupref is not
+			 * needed for it.
+			 */
+			add_new_column_to_pathtarget(target, (Expr *) tvar);
+			add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+		}
+		else
+		{
+			/*
+			 * We may arrive here with a grouping expression that is proven
+			 * redundant by EquivalenceClass processing, such as 't1.a' in the
+			 * query below.
+			 *
+			 * select max(t1.c) from t t1, t t2 where t1.a = 1 group by t1.a,
+			 * t1.b;
+			 *
+			 * For now we just give up in this case.
+			 */
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ *	  Check whether the given Var appears in aggregate expressions and not
+ *	  elsewhere in the targetlist or havingQual.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+	ListCell   *lc;
+
+	/*
+	 * Search the list of aggregate expressions for the Var.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		List	   *vars;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+			continue;
+
+		vars = pull_var_clause((Node *) ac_info->aggref,
+							   PVC_RECURSE_AGGREGATES |
+							   PVC_RECURSE_WINDOWFUNCS |
+							   PVC_RECURSE_PLACEHOLDERS);
+
+		if (list_member(vars, var))
+		{
+			list_free(vars);
+			break;
+		}
+
+		list_free(vars);
+	}
+
+	return (lc != NULL && !list_member(root->tlist_vars, var));
+}
+
+/*
+ * is_var_needed_by_join
+ *	  Check if the given Var is needed by joins above the current rel.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+	Relids		relids;
+	int			attno;
+	RelOptInfo *baserel;
+
+	/*
+	 * Note that when checking if the Var is needed by joins above, we want to
+	 * exclude cases where the Var is only needed in the final targetlist.  So
+	 * include "relation 0" in the check.
+	 */
+	relids = bms_copy(rel->relids);
+	relids = bms_add_member(relids, 0);
+
+	baserel = find_base_rel(root, var->varno);
+	attno = var->varattno - baserel->min_attr;
+
+	return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ *	  Return the sortgroupref of the given "expr" if it is found among the
+ *	  original grouping expressions, or is known equal to any of the original
+ *	  grouping expressions due to equivalence relationships.  Return 0 if no
+ *	  match is found.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->group_expr_list)
+	{
+		GroupingExprInfo *ge_info = lfirst_node(GroupingExprInfo, lc);
+
+		Assert(IsA(ge_info->expr, Var));
+
+		if (equal(ge_info->expr, expr) ||
+			exprs_known_equal(root, (Node *) expr, (Node *) ge_info->expr,
+							  ge_info->btree_opfamily))
+		{
+			Assert(ge_info->sortgroupref > 0);
+
+			return ge_info->sortgroupref;
+		}
+	}
+
+	/* no match is found */
+	return 0;
+}
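
Reviewer note on is_var_needed_by_join() above: the "relation 0" trick may be easier to follow with a toy model. The sketch below is not PostgreSQL's Bitmapset API; it assumes relids fit in a 64-bit mask, and every name prefixed with toy_ is hypothetical. Bit 0 stands for "needed in the final targetlist", as the comment in the patch describes.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy stand-in for Relids: bit i set means "relation i is a member". */
typedef uint64_t ToyRelids;

/* Build a one-member set (hypothetical helper, valid for 0 <= relid < 64). */
static ToyRelids
toy_singleton(int relid)
{
	return UINT64_C(1) << relid;
}

/* True if 'a' has at least one member that is not in 'b'. */
static bool
toy_nonempty_difference(ToyRelids a, ToyRelids b)
{
	return (a & ~b) != 0;
}

/*
 * attr_needed: the places where the column is needed (bit 0 = final output).
 * rel_relids:  the relations already covered by the rel under consideration.
 *
 * Adding member 0 to the rel's own relids means "needed only by the final
 * targetlist" no longer counts as "needed by a join above".
 */
static bool
toy_var_needed_by_join(ToyRelids attr_needed, ToyRelids rel_relids)
{
	ToyRelids	relids = rel_relids | toy_singleton(0);

	return toy_nonempty_difference(attr_needed, relids);
}
```

With a rel covering {1, 2}, a column whose needed-set is {0} gives false, while a needed-set of {0, 3} gives true, because relation 3 lies above the current rel.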
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f04bfedb2fd..5a6a3b7406e 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -949,6 +949,16 @@ struct config_bool ConfigureNamesBool[] =
 		false,
 		NULL, NULL, NULL
 	},
+	{
+		{"enable_eager_aggregate", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables eager aggregation."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&enable_eager_aggregate,
+		true,
+		NULL, NULL, NULL
+	},
 	{
 		{"enable_parallel_append", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables the planner's use of parallel append plans."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 341f88adc87..00eaf4869e0 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -428,6 +428,7 @@
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
 #enable_self_join_elimination = on
+#enable_eager_aggregate = on
 
 # - Planner Cost Constants -
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 6567759595d..1b03b5f03cf 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -394,6 +394,15 @@ struct PlannerInfo
 	/* list of PlaceHolderInfos */
 	List	   *placeholder_list;
 
+	/* list of AggClauseInfos */
+	List	   *agg_clause_list;
+
+	/* list of GroupingExprInfos */
+	List	   *group_expr_list;
+
+	/* list of plain Vars contained in targetlist and havingQual */
+	List	   *tlist_vars;
+
 	/* array of PlaceHolderInfos indexed by phid */
 	struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
 	/* allocated size of array */
@@ -1022,6 +1031,14 @@ typedef struct RelOptInfo
 	/* consider partitionwise join paths? (if partitioned rel) */
 	bool		consider_partitionwise_join;
 
+	/*
+	 * used by eager aggregation:
+	 */
+	/* information needed to create grouped paths */
+	struct RelAggInfo *agg_info;
+	/* the partially-aggregated version of the relation */
+	struct RelOptInfo *grouped_rel;
+
 	/*
 	 * inheritance links, if this is an otherrel (otherwise NULL):
 	 */
@@ -1095,6 +1112,75 @@ typedef struct RelOptInfo
 	((rel)->part_scheme && (rel)->boundinfo && (rel)->nparts > 0 && \
 	 (rel)->part_rels && (rel)->partexprs && (rel)->nullable_partexprs)
 
+/*
+ * Is the given relation a grouped relation?
+ */
+#define IS_GROUPED_REL(rel) \
+	((rel)->agg_info != NULL)
+
+/*
+ * RelAggInfo
+ *		Information needed to create grouped paths for base and join rels.
+ *
+ * "relids" is the set of relation identifiers (RT indexes).
+ *
+ * "target" is the output tlist for the grouped paths.
+ *
+ * "agg_input" is the output tlist for the paths that provide input to the
+ * grouped paths.  One difference from the reltarget of the non-grouped
+ * relation is that agg_input has its sortgrouprefs[] initialized.
+ *
+ * "grouped_rows" is the estimated number of result tuples of the grouped
+ * relation.
+ *
+ * "group_clauses", "group_exprs" and "group_pathkeys" are lists of
+ * SortGroupClauses, the corresponding grouping expressions and PathKeys
+ * respectively.
+ *
+ * "apply_at" tracks the lowest join level at which partial aggregation is
+ * applied.
+ *
+ * "agg_useful" is a flag to indicate whether the grouped paths are considered
+ * useful.  It is set true if the average partial group size is no less than
+ * EAGER_AGG_MIN_GROUP_SIZE, suggesting a significant row count reduction.
+ */
+typedef struct RelAggInfo
+{
+	pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* set of base + OJ relids (rangetable indexes) */
+	Relids		relids;
+
+	/*
+	 * default result targetlist for Paths scanning this grouped relation;
+	 * list of Vars/Exprs, cost, width
+	 */
+	struct PathTarget *target;
+
+	/*
+	 * the targetlist for Paths that provide input to the grouped paths
+	 */
+	struct PathTarget *agg_input;
+
+	/* estimated number of result tuples */
+	Cardinality grouped_rows;
+
+	/* a list of SortGroupClauses */
+	List	   *group_clauses;
+	/* a list of grouping expressions */
+	List	   *group_exprs;
+	/* a list of PathKeys */
+	List	   *group_pathkeys;
+
+	/* lowest level partial aggregation is applied at */
+	Relids		apply_at;
+
+	/* the grouped paths are considered useful? */
+	bool		agg_useful;
+} RelAggInfo;
+
 /*
  * IndexOptInfo
  *		Per-index information for planning/optimization
@@ -3274,6 +3360,50 @@ typedef struct MinMaxAggInfo
 	Param	   *param;
 } MinMaxAggInfo;
 
+/*
+ * For each distinct Aggref node that appears in the targetlist and HAVING
+ * clauses, we store an AggClauseInfo node in the PlannerInfo node's
+ * agg_clause_list.  Each AggClauseInfo records the set of relations referenced
+ * by the aggregate expression.  This information is used to determine how far
+ * the aggregate can be safely pushed down in the join tree.
+ */
+typedef struct AggClauseInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the Aggref expr */
+	Aggref	   *aggref;
+
+	/* lowest level we can evaluate this aggregate at */
+	Relids		agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * For each grouping expression that appears in grouping clauses, we store a
+ * GroupingExprInfo node in the PlannerInfo node's group_expr_list.  Each
+ * GroupingExprInfo records the expression being grouped on, its sortgroupref,
+ * and the btree opfamily used for equality comparison.  This information is
+ * necessary to reproduce correct grouping semantics at different levels of the
+ * join tree.
+ */
+typedef struct GroupingExprInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the represented expression */
+	Expr	   *expr;
+
+	/* the tleSortGroupRef of the corresponding SortGroupClause */
+	Index		sortgroupref;
+
+	/* btree opfamily defining the ordering */
+	Oid			btree_opfamily;
+} GroupingExprInfo;
+
 /*
  * At runtime, PARAM_EXEC slots are used to pass values around from one plan
  * node to another.  They can be used to pass values down into subqueries (for
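
Reviewer note on the "agg_useful" flag documented in RelAggInfo above: as I read the comment, the flag is true when the average partial group size is at least EAGER_AGG_MIN_GROUP_SIZE. A minimal sketch of that test follows. The threshold value here is a placeholder I made up (the real constant is not visible in this part of the patch), and toy_agg_useful is a hypothetical name, not a function from the patch.

```c
#include <stdbool.h>

/* Placeholder threshold; NOT the patch's EAGER_AGG_MIN_GROUP_SIZE value. */
#define TOY_MIN_GROUP_SIZE 8.0

/*
 * input_rows:   estimated rows fed into the partial aggregation.
 * grouped_rows: estimated rows coming out of it (one per partial group).
 *
 * The average partial group size is input_rows / grouped_rows; partial
 * aggregation is treated as useful when that ratio reaches the threshold.
 */
static bool
toy_agg_useful(double input_rows, double grouped_rows)
{
	if (grouped_rows <= 0.0)
		return false;			/* no sensible estimate, so be conservative */

	return (input_rows / grouped_rows) >= TOY_MIN_GROUP_SIZE;
}
```

For a table shaped like eager_agg_t2 in the regression test (1000 rows, 10 distinct values of b), the average group size is 100, so this sketch reports the grouped paths as useful. Grouping on a unique column gives a group size of 1 and they are not.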
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 60dcdb77e41..01a3532dc2e 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -314,6 +314,10 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
 									RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
+											RelOptInfo *rel_plain);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
+									 RelOptInfo *rel_plain);
 extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
@@ -353,4 +357,5 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										SpecialJoinInfo *sjinfo,
 										int nappinfos, AppendRelInfo **appinfos);
 
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
 #endif							/* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 8410531f2d6..b62f22237b7 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,6 +21,7 @@
  * allpaths.c
  */
 extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
 extern PGDLLIMPORT int geqo_threshold;
 extern PGDLLIMPORT int min_parallel_table_scan_size;
 extern PGDLLIMPORT int min_parallel_index_scan_size;
@@ -57,6 +58,10 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 								  bool override_rows);
 extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 										 bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root,
+								   RelOptInfo *rel_grouped,
+								   RelOptInfo *rel_plain,
+								   RelAggInfo *agg_info);
 extern int	compute_parallel_worker(RelOptInfo *rel, double heap_pages,
 									double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 9d3debcab28..09b48b26f8f 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -76,6 +76,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
 									Relids where_needed);
 extern void remove_useless_groupby_columns(PlannerInfo *root);
+extern void setup_eager_aggregation(PlannerInfo *root);
 extern void find_lateral_references(PlannerInfo *root);
 extern void rebuild_lateral_attr_needed(PlannerInfo *root);
 extern void create_lateral_join_info(PlannerInfo *root);
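
Reviewer note before the regression output: the plans that follow split avg() into a "Partial" step below the join and a "Finalize" step above it. The sketch below illustrates why that split preserves the result, using a toy (sum, count) state. It is only a model under that assumption; PostgreSQL's actual transition states and support functions differ, and all toy_ names are hypothetical.

```c
#include <stdbool.h>

/* Toy transition state for avg(): running sum and row count. */
typedef struct ToyAvgState
{
	double		sum;
	long		count;
} ToyAvgState;

/* Start an empty state. */
static ToyAvgState
toy_avg_init(void)
{
	ToyAvgState s = {0.0, 0};

	return s;
}

/* Partial step: fold one input value into a state. */
static void
toy_avg_accum(ToyAvgState *s, double value)
{
	s->sum += value;
	s->count += 1;
}

/* Combine step: merge one partial state into another. */
static void
toy_avg_combine(ToyAvgState *dst, const ToyAvgState *src)
{
	dst->sum += src->sum;
	dst->count += src->count;
}

/*
 * Finalize step: produce the average.  Returns false for an empty state,
 * which plays the role of SQL NULL here.
 */
static bool
toy_avg_final(const ToyAvgState *s, double *result)
{
	if (s->count == 0)
		return false;

	*result = s->sum / (double) s->count;
	return true;
}
```

Accumulating the values 1, 11, ..., 991 into two separate states and combining them gives 496, the same as accumulating them all into one state. That matches the first row of the avg(t2.c) results in the test.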
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 00000000000..f02ff0b30a3
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1334 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Make sure eager aggregation is enabled (it is on by default).
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b
+                                 Sort Key: t2.b
+                                 ->  Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Hash Join
+                                 Output: t2.c, t2.b, t3.c
+                                 Hash Cond: (t3.a = t2.a)
+                                 ->  Seq Scan on public.eager_agg_t3 t3
+                                       Output: t3.a, t3.b, t3.c
+                                 ->  Hash
+                                       Output: t2.c, t2.b, t2.a
+                                       ->  Seq Scan on public.eager_agg_t2 t2
+                                             Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b, t3.c
+                                 Sort Key: t2.b
+                                 ->  Hash Join
+                                       Output: t2.c, t2.b, t3.c
+                                       Hash Cond: (t3.a = t2.a)
+                                       ->  Seq Scan on public.eager_agg_t3 t3
+                                             Output: t3.a, t3.b, t3.c
+                                       ->  Hash
+                                             Output: t2.c, t2.b, t2.a
+                                             ->  Seq Scan on public.eager_agg_t2 t2
+                                                   Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Right Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Sort
+   Output: t2.b, (avg(t2.c))
+   Sort Key: t2.b
+   ->  HashAggregate
+         Output: t2.b, avg(t2.c)
+         Group Key: t2.b
+         ->  Hash Right Join
+               Output: t2.b, t2.c
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on public.eager_agg_t2 t2
+                     Output: t2.a, t2.b, t2.c
+               ->  Hash
+                     Output: t1.b
+                     ->  Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ b | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   |    
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Gather Merge
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Workers Planned: 2
+         ->  Sort
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Sort Key: t1.a
+               ->  Parallel Hash Join
+                     Output: t1.a, (PARTIAL avg(t2.c))
+                     Hash Cond: (t1.b = t2.b)
+                     ->  Parallel Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.a, t1.b, t1.c
+                     ->  Parallel Hash
+                           Output: t2.b, (PARTIAL avg(t2.c))
+                           ->  Partial HashAggregate
+                                 Output: t2.b, PARTIAL avg(t2.c)
+                                 Group Key: t2.b
+                                 ->  Parallel Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t2.y, (sum(t1.y)), (count(*))
+   Sort Key: t2.y
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t2.y, sum(t1.y), count(*)
+               Group Key: t2.y
+               ->  Hash Join
+                     Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.y, t1.x
+         ->  Finalize HashAggregate
+               Output: t2_1.y, sum(t1_1.y), count(*)
+               Group Key: t2_1.y
+               ->  Hash Join
+                     Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.y, t1_1.x
+         ->  Finalize HashAggregate
+               Output: t2_2.y, sum(t1_2.y), count(*)
+               Group Key: t2_2.y
+               ->  Hash Join
+                     Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+                                                 QUERY PLAN                                                 
+------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t2.x, (sum(t1.x)), (count(*))
+   Sort Key: t2.x
+   ->  Finalize HashAggregate
+         Output: t2.x, sum(t1.x), count(*)
+         Group Key: t2.x
+         Filter: (avg(t1.x) > '5'::numeric)
+         ->  Append
+               ->  Hash Join
+                     Output: t2.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.x, t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.x), PARTIAL count(*), PARTIAL avg(t1.x)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x
+               ->  Hash Join
+                     Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.x, t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x
+               ->  Hash Join
+                     Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.x, t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+ x |  sum  | count 
+---+-------+-------
+ 0 | 33835 |  6667
+ 1 | 39502 |  6667
+ 2 | 46169 |  6667
+ 3 | 52836 |  6667
+ 4 | 59503 |  6667
+ 5 | 33500 |  6667
+ 6 | 39837 |  6667
+ 7 | 46504 |  6667
+ 8 | 53171 |  6667
+ 9 | 59838 |  6667
+(10 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y)))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y))
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y))
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y))
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   
+----+---------
+  0 | 1437480
+  1 | 2082896
+  2 | 2684422
+  3 | 3285948
+  4 | 3887474
+  5 | 1526260
+  6 | 2127786
+  7 | 2729312
+  8 | 3330838
+  9 | 3932364
+ 10 | 1481370
+ 11 | 2012472
+ 12 | 2587464
+ 13 | 3162456
+ 14 | 3737448
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t3.y, sum((t2.y + t3.y))
+   Group Key: t3.y
+   ->  Sort
+         Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+         Sort Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t2.x = t1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y))
+                           Group Key: t2.x, t3.y, t3.x
+                           ->  Incremental Sort
+                                 Output: t2.y, t2.x, t3.y, t3.x
+                                 Sort Key: t2.x, t3.y
+                                 Presorted Key: t2.x
+                                 ->  Merge Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Merge Cond: (t2.x = t3.x)
+                                       ->  Sort
+                                             Output: t2.y, t2.x
+                                             Sort Key: t2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                                   Output: t2.y, t2.x
+                                       ->  Sort
+                                             Output: t3.y, t3.x
+                                             Sort Key: t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+                     ->  Hash
+                           Output: t1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                 Output: t1.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t2_1.x = t1_1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                           Group Key: t2_1.x, t3_1.y, t3_1.x
+                           ->  Incremental Sort
+                                 Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                 Sort Key: t2_1.x, t3_1.y
+                                 Presorted Key: t2_1.x
+                                 ->  Merge Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Merge Cond: (t2_1.x = t3_1.x)
+                                       ->  Sort
+                                             Output: t2_1.y, t2_1.x
+                                             Sort Key: t2_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                                   Output: t2_1.y, t2_1.x
+                                       ->  Sort
+                                             Output: t3_1.y, t3_1.x
+                                             Sort Key: t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+                     ->  Hash
+                           Output: t1_1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                 Output: t1_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t2_2.x = t1_2.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                           Group Key: t2_2.x, t3_2.y, t3_2.x
+                           ->  Incremental Sort
+                                 Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                 Sort Key: t2_2.x, t3_2.y
+                                 Presorted Key: t2_2.x
+                                 ->  Merge Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Merge Cond: (t2_2.x = t3_2.x)
+                                       ->  Sort
+                                             Output: t2_2.y, t2_2.x
+                                             Sort Key: t2_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                                   Output: t2_2.y, t2_2.x
+                                       ->  Sort
+                                             Output: t3_2.y, t3_2.x
+                                             Sort Key: t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+                     ->  Hash
+                           Output: t1_2.x
+                           ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                 Output: t1_2.x
+(88 rows)
+
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y |   sum   
+---+---------
+ 0 | 1111110
+ 1 | 2000132
+ 2 | 2889154
+ 3 | 3778176
+ 4 | 4667198
+ 5 | 3334000
+ 6 | 4223022
+ 7 | 5112044
+ 8 | 6001066
+ 9 | 6890088
+(10 rows)
+
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.y, (sum(t2.y)), (count(*))
+   Sort Key: t1.y
+   ->  Finalize HashAggregate
+         Output: t1.y, sum(t2.y), count(*)
+         Group Key: t1.y
+         ->  Append
+               ->  Hash Join
+                     Output: t1.y, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.y, t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+               ->  Hash Join
+                     Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.y, t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+               ->  Hash Join
+                     Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.y, t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+               ->  Hash Join
+                     Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.y, t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+               ->  Hash Join
+                     Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.y, t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                                QUERY PLAN                                                
+----------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y)), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t3.y
+   ->  Finalize HashAggregate
+         Output: t3.y, sum((t2.y + t3.y)), count(*)
+         Group Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.y, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x, t3.y, t3.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.y, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x, t3_1.y, t3_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.y, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x, t3_2.y, t3_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.y, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x, t3_3.y, t3_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+               ->  Hash Join
+                     Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.y, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.y, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x, t3_4.y, t3_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 83228cfca29..3b37fafa65b 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -151,6 +151,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_async_append            | on
  enable_bitmapscan              | on
  enable_distinct_reordering     | on
+ enable_eager_aggregate         | on
  enable_gathermerge             | on
  enable_group_by_reordering     | on
  enable_hashagg                 | on
@@ -172,7 +173,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(24 rows)
+(25 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index a424be2a6bf..929cab14c47 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -123,7 +123,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # The stats test resets stats, so nothing else needing stats access can be in
 # this group.
 # ----------
-test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats predicate numa
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression memoize stats predicate numa eager_aggregate
 
 # event_trigger depends on create_am and cannot run concurrently with
 # any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 00000000000..5da8749a6cb
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,194 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 32d6e718adc..61b7e6ea049 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -42,6 +42,7 @@ AfterTriggersTableData
 AfterTriggersTransData
 Agg
 AggClauseCosts
+AggClauseInfo
 AggInfo
 AggPath
 AggSplit
@@ -1111,6 +1112,7 @@ GroupPathExtraData
 GroupResultPath
 GroupState
 GroupVarInfo
+GroupingExprInfo
 GroupingFunc
 GroupingSet
 GroupingSetData
@@ -2464,6 +2466,7 @@ ReindexObjectType
 ReindexParams
 ReindexStmt
 ReindexType
+RelAggInfo
 RelFileLocator
 RelFileLocatorBackend
 RelFileNumber
-- 
2.43.0




* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-07-24 03:21                               ` Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Richard Guo @ 2025-07-24 03:21 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Thu, Jun 26, 2025 at 11:01 AM Richard Guo <[email protected]> wrote:
> Here is the patch based on the proposed ideas.  It includes the proof
> of correctness in the README and implements the strategy of pushing
> partial aggregation only to the lowest applicable join level where it
> is deemed useful.  This is done by introducing a "Relids apply_at"
> field to track that level and ensuring that partial aggregation is
> applied only at the recorded "apply_at" level.
>
> Additionally, this patch changes how grouped relations are stored.
> Since each grouped relation represents a partially aggregated version
> of a non-grouped relation, we now associate each grouped relation with
> the RelOptInfo of the corresponding non-grouped relation.  This
> eliminates the need for a dedicated list of all grouped relations and
> avoids list searches when retrieving a grouped relation.
>
> It also addresses other previously raised concerns, such as the
> potential memory blowout risks with large partial-aggregation values,
> and includes improvements to comments and the commit message.
>
> Another change is that this feature is now enabled by default.

This patch no longer applies; here's a rebased version.  Nothing
essential has changed.

Thanks
Richard


Attachments:

  [application/octet-stream] v18-0001-Implement-Eager-Aggregation.patch (165.5K, 2-v18-0001-Implement-Eager-Aggregation.patch)
From 23ab3a8c476e130a93b843c6afcba149641169fb Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Tue, 11 Jun 2024 15:59:19 +0900
Subject: [PATCH v18] Implement Eager Aggregation

Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.
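
As a rough illustration of the transformation described above (not taken
from the patch), the sketch below writes both query shapes by hand and
runs them against an in-memory SQLite database.  The table names t1 and
t2, the subquery alias p, and the use of SQLite instead of PostgreSQL are
assumptions made to keep the example self-contained; the data mirrors
eager_agg_t1 and eager_agg_t2 from the regression test.  Splitting avg()
into sum() and count() is only a stand-in for the planner's partial and
finalize aggregate states.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (a int, b int, c real);
    CREATE TABLE t2 (a int, b int, c real);
""")
conn.executemany("INSERT INTO t1 VALUES (?, ?, ?)",
                 [(i, i, float(i)) for i in range(1, 1001)])
conn.executemany("INSERT INTO t2 VALUES (?, ?, ?)",
                 [(i, i % 10, float(i)) for i in range(1, 1001)])

# Conventional shape: join first, aggregate afterwards.
late = conn.execute("""
    SELECT t1.a, avg(t2.c)
    FROM t1 JOIN t2 ON t1.b = t2.b
    GROUP BY t1.a ORDER BY t1.a
""").fetchall()

# Hand-written eager shape: partially aggregate t2 by the join key,
# join the smaller result to t1, then finalize.
eager = conn.execute("""
    SELECT t1.a, sum(p.s) / sum(p.n)
    FROM t1
    JOIN (SELECT b, sum(c) AS s, count(c) AS n
          FROM t2 GROUP BY b) p ON t1.b = p.b
    GROUP BY t1.a ORDER BY t1.a
""").fetchall()

# Rows fed into the join from the t2 side in each shape.
t2_rows = conn.execute("SELECT count(*) FROM t2").fetchone()[0]
partial_rows = conn.execute(
    "SELECT count(*) FROM (SELECT b FROM t2 GROUP BY b)").fetchone()[0]

print(late == eager, t2_rows, partial_rows)
```

With this data the two shapes return the same rows, while the t2 side of
the join shrinks from 1000 rows to one row per distinct value of b.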

In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations.  To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node.  During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied.  If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
is currently not supported.

To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.
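
The "same destiny" requirement can be sketched with a small hand-written
example.  Everything here is hypothetical and chosen for illustration
only: the tables t1(k) and t2(k, g, v), the alias p, and the use of
SQLite.  The second variant deliberately leaves the join key out of the
partial grouping keys and uses min(k) as an arbitrary substitute, which
is not something the planner would do; it is there to show what goes
wrong.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (k int);
    CREATE TABLE t2 (k int, g text, v int);
    INSERT INTO t1 VALUES (1), (1), (2);
    INSERT INTO t2 VALUES (1, 'x', 10), (2, 'x', 20), (3, 'x', 40), (1, 'y', 5);
""")


def run(sql):
    return conn.execute(sql).fetchall()


# Reference: aggregate after the join.
late = run("""
    SELECT t2.g, sum(t2.v) FROM t1 JOIN t2 ON t1.k = t2.k
    GROUP BY t2.g ORDER BY t2.g
""")

# Partial groups keyed by (g, k): every row in a group has the same k,
# so the whole group either joins or does not.
with_join_key = run("""
    SELECT p.g, sum(p.s)
    FROM t1 JOIN (SELECT g, k, sum(v) AS s FROM t2 GROUP BY g, k) p
         ON t1.k = p.k
    GROUP BY p.g ORDER BY p.g
""")

# Partial groups keyed by g only: rows with k = 1, 2 and 3 are merged,
# and min(k) stands in for a join key the group no longer has.
without_join_key = run("""
    SELECT p.g, sum(p.s)
    FROM t1 JOIN (SELECT g, min(k) AS k, sum(v) AS s FROM t2 GROUP BY g) p
         ON t1.k = p.k
    GROUP BY p.g ORDER BY p.g
""")

print(late, with_join_key, without_join_key)
```

Only the variant that keeps k in the partial grouping keys agrees with
the reference.  In the other one, the 'x' group carries values from rows
with k = 3 that should never have joined, and its merged sum is counted
once for each matching t1 row.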

One restriction is that we cannot push partial aggregation down to a
relation that is on the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.
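
A minimal sketch of the problem, again hand-written and run on SQLite
purely for illustration (the tables t1(b) and t2(b), the alias p, and
the choice of count(*) are assumptions, not part of the patch).  The
"pushed" query aggregates the nullable side before the LEFT JOIN, so the
NULL-extended rows never reach an aggregate.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE t1 (b int);
    CREATE TABLE t2 (b int);
    INSERT INTO t1 VALUES (1), (2), (3), (3);
    INSERT INTO t2 VALUES (1), (1);
""")

# Reference: the outer join runs first, so the three unmatched t1 rows
# are NULL-extended and counted in the NULL group.
late = conn.execute("""
    SELECT t2.b, count(*) FROM t1 LEFT JOIN t2 ON t1.b = t2.b
    GROUP BY t2.b ORDER BY t2.b
""").fetchall()

# Aggregation pushed to the nullable side: the partial counts exist only
# for real t2 rows, so the NULL group has nothing to sum.
pushed = conn.execute("""
    SELECT p.b, sum(p.n)
    FROM t1 LEFT JOIN (SELECT b, count(*) AS n FROM t2 GROUP BY b) p
         ON t1.b = p.b
    GROUP BY p.b ORDER BY p.b
""").fetchall()

print(late, pushed)
```

The two results agree for b = 1 but differ for the NULL group: the
reference query counts the unmatched rows, while the pushed-down variant
yields NULL there.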

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

The patch was originally proposed by Antonin Houska in 2017.  This
commit reworks various important aspects and rewrites most of the
current code.  However, the original patch and reviews were very
useful.

Author: Richard Guo, Antonin Houska
Reviewed-by: Robert Haas, Jian He, Tender Wang, Paul George, Tom Lane
Reviewed-by: Tomas Vondra, Andy Fan, Ashutosh Bapat
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
---
 .../postgres_fdw/expected/postgres_fdw.out    |   49 +-
 doc/src/sgml/config.sgml                      |   15 +
 src/backend/optimizer/README                  |   89 ++
 src/backend/optimizer/geqo/geqo_eval.c        |   21 +
 src/backend/optimizer/path/allpaths.c         |  452 ++++++
 src/backend/optimizer/path/joinrels.c         |  193 +++
 src/backend/optimizer/plan/initsplan.c        |  313 ++++
 src/backend/optimizer/plan/planmain.c         |    9 +
 src/backend/optimizer/plan/planner.c          |  124 +-
 src/backend/optimizer/util/appendinfo.c       |   59 +
 src/backend/optimizer/util/pathnode.c         |   12 +-
 src/backend/optimizer/util/relnode.c          |  636 ++++++++
 src/backend/utils/misc/guc_tables.c           |   10 +
 src/backend/utils/misc/postgresql.conf.sample |    1 +
 src/include/nodes/pathnodes.h                 |  130 ++
 src/include/optimizer/pathnode.h              |    5 +
 src/include/optimizer/paths.h                 |    5 +
 src/include/optimizer/planmain.h              |    1 +
 src/test/regress/expected/eager_aggregate.out | 1334 +++++++++++++++++
 src/test/regress/expected/sysviews.out        |    3 +-
 src/test/regress/parallel_schedule            |    2 +-
 src/test/regress/sql/eager_aggregate.sql      |  194 +++
 src/tools/pgindent/typedefs.list              |    3 +
 23 files changed, 3597 insertions(+), 63 deletions(-)
 create mode 100644 src/test/regress/expected/eager_aggregate.out
 create mode 100644 src/test/regress/sql/eager_aggregate.sql

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 4b6e49a5d95..8dea3dee667 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3713,30 +3713,33 @@ select count(t1.c3) from ft2 t1 left join ft2 t2 on (t1.c1 = random() * t2.c2);
 -- Subquery in FROM clause having aggregate
 explain (verbose, costs off)
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
-                                          QUERY PLAN                                           
------------------------------------------------------------------------------------------------
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
  Sort
-   Output: (count(*)), x.b
-   Sort Key: (count(*)), x.b
-   ->  HashAggregate
-         Output: count(*), x.b
-         Group Key: x.b
-         ->  Hash Join
-               Output: x.b
-               Inner Unique: true
-               Hash Cond: (ft1.c2 = x.a)
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.c2
-                     Remote SQL: SELECT c2 FROM "S 1"."T 1"
-               ->  Hash
-                     Output: x.b, x.a
-                     ->  Subquery Scan on x
-                           Output: x.b, x.a
-                           ->  Foreign Scan
-                                 Output: ft1_1.c2, (sum(ft1_1.c1))
-                                 Relations: Aggregate on (public.ft1 ft1_1)
-                                 Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
-(21 rows)
+   Output: (count(*)), (sum(ft1_1.c1))
+   Sort Key: (count(*)), (sum(ft1_1.c1))
+   ->  Finalize GroupAggregate
+         Output: count(*), (sum(ft1_1.c1))
+         Group Key: (sum(ft1_1.c1))
+         ->  Sort
+               Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+               Sort Key: (sum(ft1_1.c1))
+               ->  Hash Join
+                     Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+                     Hash Cond: (ft1_1.c2 = ft1.c2)
+                     ->  Foreign Scan
+                           Output: ft1_1.c2, (sum(ft1_1.c1))
+                           Relations: Aggregate on (public.ft1 ft1_1)
+                           Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
+                     ->  Hash
+                           Output: ft1.c2, (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: ft1.c2, PARTIAL count(*)
+                                 Group Key: ft1.c2
+                                 ->  Foreign Scan on public.ft1
+                                       Output: ft1.c2
+                                       Remote SQL: SELECT c2 FROM "S 1"."T 1"
+(24 rows)
 
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
  count |   b   
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 20ccb2d6b54..395bca6cf95 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5474,6 +5474,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-eager-aggregate" xreflabel="enable_eager_aggregate">
+      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_eager_aggregate</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's ability to partially push
+        aggregation past a join, and finalize it once all the relations are
+        joined. The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
       <term><varname>enable_gathermerge</varname> (<type>boolean</type>)
       <indexterm>
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 9c724ccfabf..48a575c5bda 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1501,3 +1501,92 @@ breaking down aggregation or grouping over a partitioned relation into
 aggregation or grouping over its partitions is called partitionwise
 aggregation.  Especially when the partition keys match the GROUP BY clause,
 this can be significantly faster than the regular method.
+
+Eager aggregation
+-----------------
+
+Eager aggregation is a query optimization technique that partially
+pushes aggregation past a join, and finalizes it once all the
+relations are joined.  Eager aggregation may reduce the number of
+input rows to the join and thus could result in a better overall plan.
+
+To prove that the transformation is correct, we partition the tables
+in the FROM clause into two groups: those that contain at least one
+aggregation column, and those that do not contain any aggregation
+columns.  Each group can be treated as a single relation formed by the
+Cartesian product of the tables within that group.  Therefore, without
+loss of generality, we can assume that the FROM clause contains
+exactly two relations, R1 and R2, where R1 represents the relation
+containing all aggregation columns, and R2 represents the relation
+without any aggregation columns.
+
+Let the query be of the form:
+
+SELECT G, AGG(A)
+FROM R1 JOIN R2 ON J
+GROUP BY G;
+
+where G is the set of grouping keys that may include columns from R1
+and/or R2; AGG(A) is an aggregate function over columns A from R1; J
+is the join condition between R1 and R2.
+
+The transformation of eager aggregation is:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 ON J)
+
+This equivalence holds under the following conditions:
+
+1) AGG is decomposable, meaning that it can be computed in two stages:
+a partial aggregation followed by a final aggregation;
+2) The set G1 used in the pre-aggregation of R1 includes:
+    * all columns from R1 that are part of the grouping keys G, and
+    * all columns from R1 that appear in the join condition J.
+3) The grouping operator for any column in G1 must be compatible with
+the operator used for that column in the join condition J.
+
+Since G1 includes all columns from R1 that appear in either the
+grouping keys G or the join condition J, all rows within each partial
+group have identical values for both the grouping keys and the
+join-relevant columns from R1, assuming compatible operators are used.
+As a result, the rows within a partial group are indistinguishable in
+terms of their contribution to the aggregation and their behavior in
+the join.  This ensures that all rows in the same partial group share
+the same "destiny": they either all match or all fail to match a given
+row in R2.  Because the aggregate function AGG is decomposable,
+aggregating the partial results after the join yields the same final
+result as aggregating after the full join, thereby preserving query
+semantics.  Q.E.D.
+
+One restriction is that we cannot push partial aggregation down to a
+relation that is on the nullable side of an outer join, because the
+NULL-extended rows produced by the outer join would not be available
+when we perform the partial aggregation, while with a
+non-eager-aggregation plan these rows are available for the top-level
+aggregation.  Pushing partial aggregation in this case may result in
+the rows being grouped differently than expected, or produce incorrect
+values from the aggregate functions.
+
+During the construction of the join tree, we evaluate each base or
+join relation to determine if eager aggregation can be applied.  If
+feasible, we create a separate RelOptInfo called a "grouped relation"
+and generate grouped paths by adding sorted and hashed partial
+aggregation paths on top of the non-grouped paths.  To limit planning
+time, we consider only the cheapest or suitably-sorted non-grouped
+paths in this step.
+
+Another way to generate grouped paths is to join a grouped relation
+with a non-grouped relation.  Joining two grouped relations is
+currently not supported.
+
+To further limit planning time, we currently adopt a strategy where
+partial aggregation is pushed only to the lowest feasible level in the
+join tree where it provides a significant reduction in row count.
+This strategy also helps ensure that all grouped paths for the same
+grouped relation produce the same set of rows, which is important to
+support a fundamental assumption of the planner.
+
+If we have generated a grouped relation for the topmost join relation,
+we need to finalize its paths at the end.  The final paths will
+compete in the usual way with paths built from regular planning.
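
A reader's sketch, not part of the patch: the equivalence argued in the
README text above can be checked by hand on toy data.  The code below
assumes AGG is SUM, the join condition J is equality on a single column j,
G1 = {j}, and the grouping key G is a column g of R2.  The names R1Row,
R2Row, NKEYS, agg_after_join and agg_eager are hypothetical and exist
only for this illustration; nothing here mirrors planner code.

```c
#include <stddef.h>

/* Assumption: join keys and group ids are small integers in [0, NKEYS). */
#define NKEYS 4

typedef struct { int j; long a; } R1Row;	/* R1: join column j, aggregated column a */
typedef struct { int j; int g; } R2Row;		/* R2: join column j, grouping column g */

/*
 * SELECT g, SUM(a) FROM R1 JOIN R2 USING (j) GROUP BY g,
 * computed the regular way: join first, aggregate afterwards.
 * Groups with no rows are reported as 0 here rather than omitted.
 */
static void
agg_after_join(const R1Row *r1, size_t n1,
			   const R2Row *r2, size_t n2, long out[NKEYS])
{
	for (int g = 0; g < NKEYS; g++)
		out[g] = 0;

	for (size_t i = 0; i < n1; i++)
		for (size_t k = 0; k < n2; k++)
			if (r1[i].j == r2[k].j)
				out[r2[k].g] += r1[i].a;
}

/*
 * Same query, eager style: partially aggregate R1 grouped by G1 = {j},
 * join the partial groups to R2, then finalize by summing the partial
 * sums per g.
 */
static void
agg_eager(const R1Row *r1, size_t n1,
		  const R2Row *r2, size_t n2, long out[NKEYS])
{
	long		partial[NKEYS] = {0};
	int			seen[NKEYS] = {0};

	/* Partial aggregation below the join: one row per distinct j. */
	for (size_t i = 0; i < n1; i++)
	{
		partial[r1[i].j] += r1[i].a;
		seen[r1[i].j] = 1;
	}

	for (int g = 0; g < NKEYS; g++)
		out[g] = 0;

	/* Join each partial group to R2 and combine the partial sums. */
	for (int j = 0; j < NKEYS; j++)
	{
		if (!seen[j])
			continue;
		for (size_t k = 0; k < n2; k++)
			if (r2[k].j == j)
				out[r2[k].g] += partial[j];
	}
}
```

Both functions should fill out[] identically for any input that respects
the key range.  In the eager variant the join sees one row per distinct j
instead of every R1 row, which is the row-count reduction the README
describes.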
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index f07d1dc8ac6..4a65f955ca6 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -279,6 +279,27 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
 				/* Find and save the cheapest paths for this joinrel */
 				set_cheapest(joinrel);
 
+				/*
+				 * Except for the topmost scan/join rel, consider generating
+				 * partial aggregation paths for the grouped relation on top
+				 * of the paths of this rel.  After that, we're done creating
+				 * paths for the grouped relation, so run set_cheapest().
+				 */
+				if (!bms_equal(joinrel->relids, root->all_query_rels))
+				{
+					RelOptInfo *grouped_rel;
+
+					grouped_rel = joinrel->grouped_rel;
+					if (grouped_rel)
+					{
+						Assert(IS_GROUPED_REL(grouped_rel));
+
+						generate_grouped_paths(root, grouped_rel, joinrel,
+											   grouped_rel->agg_info);
+						set_cheapest(grouped_rel);
+					}
+				}
+
 				/* Absorb new clump into old */
 				old_clump->joinrel = joinrel;
 				old_clump->size += new_clump->size;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 6cc6966b060..ac922dbf56a 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -47,6 +48,7 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -77,6 +79,7 @@ typedef enum pushdown_safe_type
 
 /* These parameters are set by GUC */
 bool		enable_geqo = false;	/* just in case GUC doesn't set it */
+bool		enable_eager_aggregate = true;
 int			geqo_threshold;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
@@ -90,6 +93,7 @@ join_search_hook_type join_search_hook = NULL;
 
 static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_base_grouped_rels(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 						 Index rti, RangeTblEntry *rte);
@@ -114,6 +118,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 								Index rti, RangeTblEntry *rte);
 static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 									Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
 static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 										 List *live_childrels,
 										 List *all_child_pathkeys);
@@ -182,6 +187,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	 */
 	set_base_rel_sizes(root);
 
+	/*
+	 * Build grouped relations for base rels where possible.
+	 */
+	setup_base_grouped_rels(root);
+
 	/*
 	 * We should now have size estimates for every actual table involved in
 	 * the query, and we also know which if any have been deleted from the
@@ -323,6 +333,39 @@ set_base_rel_sizes(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_base_grouped_rels
+ *	  For each base relation, build a grouped base relation if eager
+ *	  aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_base_grouped_rels(PlannerInfo *root)
+{
+	Index		rti;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo *rel = root->simple_rel_array[rti];
+
+		/* there may be empty slots corresponding to non-baserel RTEs */
+		if (rel == NULL)
+			continue;
+
+		Assert(rel->relid == rti);	/* sanity check on array */
+		Assert(IS_SIMPLE_REL(rel)); /* sanity check on rel */
+
+		(void) build_simple_grouped_rel(root, rel);
+	}
+}
+
 /*
  * set_base_rel_pathlists
  *	  Finds all paths available for scanning each base-relation entry.
@@ -559,6 +602,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* Now find the cheapest of the paths for this rel */
 	set_cheapest(rel);
 
+	/*
+	 * If a grouped relation for this rel exists, build partial aggregation
+	 * paths for it.
+	 *
+	 * Note that this can only happen after we've called set_cheapest() for
+	 * this base rel, because we need its cheapest paths.
+	 */
+	set_grouped_rel_pathlist(root, rel);
+
 #ifdef OPTIMIZER_DEBUG
 	pprint(rel);
 #endif
@@ -1305,6 +1357,36 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	add_paths_to_append_rel(root, rel, live_childrels);
 }
 
+/*
+ * set_grouped_rel_pathlist
+ *	  If a grouped relation for the given 'rel' exists, build partial
+ *	  aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Add paths to the grouped base relation if one exists. */
+	grouped_rel = rel->grouped_rel;
+	if (grouped_rel)
+	{
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		generate_grouped_paths(root, grouped_rel, rel,
+							   grouped_rel->agg_info);
+		set_cheapest(grouped_rel);
+	}
+}
+
 
 /*
  * add_paths_to_append_rel
@@ -3335,6 +3417,328 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
 	}
 }
 
+/*
+ * generate_grouped_paths
+ *		Generate paths for a grouped relation by adding sorted and hashed
+ *		partial aggregation paths on top of paths of the ungrouped base or join
+ *		relation.
+ *
+ * The information needed is provided by the RelAggInfo structure.
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *grouped_rel,
+					   RelOptInfo *rel, RelAggInfo *agg_info)
+{
+	AggClauseCosts agg_costs;
+	bool		can_hash;
+	bool		can_sort;
+	Path	   *cheapest_total_path = NULL;
+	Path	   *cheapest_partial_path = NULL;
+	double		dNumGroups = 0;
+	double		dNumPartialGroups = 0;
+
+	if (IS_DUMMY_REL(rel))
+	{
+		mark_dummy_rel(grouped_rel);
+		return;
+	}
+
+	/*
+	 * We push partial aggregation only to the lowest possible level in the
+	 * join tree that is deemed useful.
+	 */
+	if (!bms_equal(agg_info->apply_at, rel->relids) ||
+		!agg_info->agg_useful)
+		return;
+
+	MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+	get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+	/*
+	 * Determine whether it's possible to perform sort-based implementations
+	 * of grouping.
+	 */
+	can_sort = grouping_is_sortable(agg_info->group_clauses);
+
+	/*
+	 * Determine whether we should consider hash-based implementations of
+	 * grouping.
+	 */
+	Assert(root->numOrderedAggs == 0);
+	can_hash = (agg_info->group_clauses != NIL &&
+				grouping_is_hashable(agg_info->group_clauses));
+
+	/*
+	 * Consider whether we should generate partially aggregated non-partial
+	 * paths.  We can only do this if we have a non-partial path.
+	 */
+	if (rel->pathlist != NIL)
+	{
+		cheapest_total_path = rel->cheapest_total_path;
+		Assert(cheapest_total_path != NULL);
+	}
+
+	/*
+	 * If parallelism is possible for grouped_rel, then we should consider
+	 * generating partially-grouped partial paths.  However, if the ungrouped
+	 * rel has no partial paths, then we can't.
+	 */
+	if (grouped_rel->consider_parallel && rel->partial_pathlist != NIL)
+	{
+		cheapest_partial_path = linitial(rel->partial_pathlist);
+		Assert(cheapest_partial_path != NULL);
+	}
+
+	/* Estimate number of partial groups. */
+	if (cheapest_total_path != NULL)
+		dNumGroups = estimate_num_groups(root,
+										 agg_info->group_exprs,
+										 cheapest_total_path->rows,
+										 NULL, NULL);
+	if (cheapest_partial_path != NULL)
+		dNumPartialGroups = estimate_num_groups(root,
+												agg_info->group_exprs,
+												cheapest_partial_path->rows,
+												NULL, NULL);
+
+	if (can_sort && cheapest_total_path != NULL)
+	{
+		ListCell   *lc;
+
+		/*
+		 * Use any available suitably-sorted path as input, and also consider
+		 * sorting the cheapest-total path and incremental sort on any paths
+		 * with presorted keys.
+		 *
+		 * To save planning time, we ignore parameterized input paths unless
+		 * they are the cheapest-total path.
+		 */
+		foreach(lc, rel->pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Ignore parameterized paths that are not the cheapest-total
+			 * path.
+			 */
+			if (input_path->param_info &&
+				input_path != cheapest_total_path)
+				continue;
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest total path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_total_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumGroups);
+
+			add_path(grouped_rel, path);
+		}
+	}
+
+	if (can_sort && cheapest_partial_path != NULL)
+	{
+		ListCell   *lc;
+
+		/* Similar to above logic, but for partial paths. */
+		foreach(lc, rel->partial_pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest partial path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_partial_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumPartialGroups);
+
+			add_partial_path(grouped_rel, path);
+		}
+	}
+
+	/*
+	 * Add a partially-grouped HashAgg Path where possible
+	 */
+	if (can_hash && cheapest_total_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_total_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumGroups);
+
+		add_path(grouped_rel, path);
+	}
+
+	/*
+	 * Now add a partially-grouped HashAgg partial Path where possible
+	 */
+	if (can_hash && cheapest_partial_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_partial_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumPartialGroups);
+
+		add_partial_path(grouped_rel, path);
+	}
+}
+
 /*
  * make_rel_from_joinlist
  *	  Build access paths using a "joinlist" to guide the join path search.
@@ -3494,6 +3898,10 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 		 *
 		 * After that, we're done creating paths for the joinrel, so run
 		 * set_cheapest().
+		 *
+		 * In addition, we also run generate_grouped_paths() for the grouped
+		 * relation of each just-processed joinrel, and run set_cheapest() for
+		 * the grouped relation afterwards.
 		 */
 		foreach(lc, root->join_rel_level[lev])
 		{
@@ -3514,6 +3922,27 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 			/* Find and save the cheapest paths for this rel */
 			set_cheapest(rel);
 
+			/*
+			 * Except for the topmost scan/join rel, consider generating
+			 * partial aggregation paths for the grouped relation on top of
+			 * the paths of this rel.  After that, we're done creating paths
+			 * for the grouped relation, so run set_cheapest().
+			 */
+			if (!bms_equal(rel->relids, root->all_query_rels))
+			{
+				RelOptInfo *grouped_rel;
+
+				grouped_rel = rel->grouped_rel;
+				if (grouped_rel)
+				{
+					Assert(IS_GROUPED_REL(grouped_rel));
+
+					generate_grouped_paths(root, grouped_rel, rel,
+										   grouped_rel->agg_info);
+					set_cheapest(grouped_rel);
+				}
+			}
+
 #ifdef OPTIMIZER_DEBUG
 			pprint(rel);
 #endif
@@ -4383,6 +4812,29 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 		if (IS_DUMMY_REL(child_rel))
 			continue;
 
+		/*
+		 * Except for the topmost scan/join rel, consider generating partial
+		 * aggregation paths for the grouped relation on top of the paths of
+		 * this partitioned child-join.  After that, we're done creating paths
+		 * for the grouped relation, so run set_cheapest().
+		 */
+		if (!bms_equal(IS_OTHER_REL(rel) ?
+					   rel->top_parent_relids : rel->relids,
+					   root->all_query_rels))
+		{
+			RelOptInfo *grouped_rel;
+
+			grouped_rel = child_rel->grouped_rel;
+			if (grouped_rel)
+			{
+				Assert(IS_GROUPED_REL(grouped_rel));
+
+				generate_grouped_paths(root, grouped_rel, child_rel,
+									   grouped_rel->agg_info);
+				set_cheapest(grouped_rel);
+			}
+		}
+
 #ifdef OPTIMIZER_DEBUG
 		pprint(child_rel);
 #endif
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index aad41b94009..477b0bc3b84 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -16,6 +16,7 @@
 
 #include "miscadmin.h"
 #include "optimizer/appendinfo.h"
+#include "optimizer/cost.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -35,6 +36,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
 static bool restriction_is_constant_false(List *restrictlist,
 										  RelOptInfo *joinrel,
 										  bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+								  RelOptInfo *rel2, RelOptInfo *joinrel,
+								  SpecialJoinInfo *sjinfo, List *restrictlist);
 static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 										RelOptInfo *rel2, RelOptInfo *joinrel,
 										SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -763,6 +767,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 		return joinrel;
 	}
 
+	/* Build a grouped join relation for 'joinrel' if possible. */
+	make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+						  restrictlist);
+
 	/* Add paths to the join relation. */
 	populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
 								restrictlist);
@@ -874,6 +882,186 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
 	return input_relids;
 }
 
+/*
+ * make_grouped_join_rel
+ *	  Build a grouped join relation for the given "joinrel" if eager
+ *	  aggregation is applicable and the resulting grouped paths are considered
+ *	  useful.
+ *
+ * There are two strategies for generating grouped paths for a join relation:
+ *
+ * 1. Join a grouped (partially aggregated) input relation with a non-grouped
+ * input (e.g., AGG(B) JOIN A).
+ *
+ * 2. Apply partial aggregation (sorted or hashed) on top of existing
+ * non-grouped join paths (e.g., AGG(A JOIN B)).
+ *
+ * To limit planning effort and avoid an explosion of alternatives, we adopt a
+ * strategy where partial aggregation is only pushed to the lowest possible
+ * level in the join tree that is deemed useful.  That is, if grouped paths can
+ * be built using the first strategy, we skip consideration of the second
+ * strategy for the same join level.
+ *
+ * Additionally, if there are multiple lowest useful levels where partial
+ * aggregation could be applied, such as in a join tree with relations A, B,
+ * and C where both "AGG(A JOIN B) JOIN C" and "A JOIN AGG(B JOIN C)" are valid
+ * placements, we choose only the first one encountered during join search.
+ * This avoids generating multiple versions of the same grouped relation based
+ * on different aggregation placements.
+ *
+ * These heuristics also ensure that all grouped paths for the same grouped
+ * relation produce the same set of rows, which is a basic assumption in the
+ * planner.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+					  RelOptInfo *rel2, RelOptInfo *joinrel,
+					  SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+	RelOptInfo *grouped_rel;
+	RelOptInfo *grouped_rel1;
+	RelOptInfo *grouped_rel2;
+	bool		rel1_empty;
+	bool		rel2_empty;
+	Relids		agg_apply_at;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Retrieve the grouped relations for the two input rels */
+	grouped_rel1 = rel1->grouped_rel;
+	grouped_rel2 = rel2->grouped_rel;
+
+	rel1_empty = (grouped_rel1 == NULL || IS_DUMMY_REL(grouped_rel1));
+	rel2_empty = (grouped_rel2 == NULL || IS_DUMMY_REL(grouped_rel2));
+
+	/* Find or construct a grouped joinrel for this joinrel */
+	grouped_rel = joinrel->grouped_rel;
+	if (grouped_rel == NULL)
+	{
+		RelAggInfo *agg_info = NULL;
+
+		/*
+		 * Prepare the information needed to create grouped paths for this
+		 * join relation.
+		 */
+		agg_info = create_rel_agg_info(root, joinrel);
+		if (agg_info == NULL)
+			return;
+
+		/*
+		 * If grouped paths for the given join relation are not considered
+		 * useful, and no grouped paths can be built by joining grouped input
+		 * relations, skip building the grouped join relation.
+		 */
+		if (!agg_info->agg_useful &&
+			(rel1_empty == rel2_empty))
+			return;
+
+		/* build the grouped relation */
+		grouped_rel = build_grouped_rel(root, joinrel);
+		grouped_rel->reltarget = agg_info->target;
+
+		if (rel1_empty != rel2_empty)
+		{
+			/*
+			 * If there is exactly one grouped input relation, then we can
+			 * build grouped paths by joining the input relations.  Set size
+			 * estimates for the grouped join relation based on the input
+			 * relations, and update the lowest join level where partial
+			 * aggregation is applied to that of the grouped input relation.
+			 */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			agg_info->apply_at = rel1_empty ?
+				grouped_rel2->agg_info->apply_at :
+				grouped_rel1->agg_info->apply_at;
+		}
+		else
+		{
+			/*
+			 * Otherwise, grouped paths can be built by applying partial
+			 * aggregation on top of existing non-grouped join paths.  Set
+			 * size estimates for the grouped join relation based on the
+			 * estimated number of groups, and track the lowest join level
+			 * where partial aggregation is applied.  Note that these values
+			 * may be updated later if it is determined that grouped paths can
+			 * be constructed by joining other input relations.
+			 */
+			grouped_rel->rows = agg_info->grouped_rows;
+			agg_info->apply_at = bms_copy(joinrel->relids);
+		}
+
+		grouped_rel->agg_info = agg_info;
+		joinrel->grouped_rel = grouped_rel;
+	}
+
+	Assert(IS_GROUPED_REL(grouped_rel));
+
+	/* We may have already proven this grouped join relation to be dummy. */
+	if (IS_DUMMY_REL(grouped_rel))
+		return;
+
+	/*
+	 * Nothing to do if there's no grouped input relation.  Also, joining two
+	 * grouped relations is not currently supported.
+	 */
+	if (rel1_empty == rel2_empty)
+		return;
+
+	/*
+	 * Get the lowest join level where partial aggregation is applied among
+	 * the given input relations.
+	 */
+	agg_apply_at = rel1_empty ?
+		grouped_rel2->agg_info->apply_at :
+		grouped_rel1->agg_info->apply_at;
+
+	/*
+	 * If it's not the designated level, skip building grouped paths.
+	 *
+	 * One exception is when it is a subset of the previously recorded level.
+	 * In that case, we need to update the designated level to this one, and
+	 * adjust the size estimates for the grouped join relation accordingly.
+	 * For example, suppose partial aggregation can be applied on top of (B
+	 * JOIN C).  If we first construct the join as ((A JOIN B) JOIN C), we'd
+	 * record the designated level as including all three relations (A B C).
+	 * Later, when we consider (A JOIN (B JOIN C)), we encounter the smaller
+	 * (B C) join level directly.  Since this is a subset of the previous
+	 * level and still valid for partial aggregation, we update the designated
+	 * level to (B C), and adjust the size estimates accordingly.
+	 */
+	if (!bms_equal(agg_apply_at, grouped_rel->agg_info->apply_at))
+	{
+		if (bms_is_subset(agg_apply_at, grouped_rel->agg_info->apply_at))
+		{
+			/* Adjust the size estimates for the grouped join relation. */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			grouped_rel->agg_info->apply_at = agg_apply_at;
+		}
+		else
+			return;
+	}
+
+	/* Make paths for the grouped join relation. */
+	populate_joinrel_with_paths(root,
+								rel1_empty ? rel1 : grouped_rel1,
+								rel2_empty ? rel2 : grouped_rel2,
+								grouped_rel,
+								sjinfo,
+								restrictlist);
+}
+
 /*
  * populate_joinrel_with_paths
  *	  Add paths to the given joinrel for given pair of joining relations. The
@@ -1615,6 +1803,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 						 adjust_child_relids(joinrel->relids,
 											 nappinfos, appinfos)));
 
+		/* Build a grouped join relation for 'child_joinrel' if possible */
+		make_grouped_join_rel(root, child_rel1, child_rel2,
+							  child_joinrel, child_sjinfo,
+							  child_restrictlist);
+
 		/* And make paths for the child join */
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 3e3fec89252..3fbccc67190 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/nbtree.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
@@ -81,6 +82,9 @@ typedef struct JoinTreeItem
 } JoinTreeItem;
 
 
+static bool has_internal_aggtranstype(PlannerInfo *root);
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
 static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
 									   Index rtindex);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -628,6 +632,315 @@ remove_useless_groupby_columns(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_eager_aggregation
+ *	  Check if eager aggregation is applicable, and if so collect suitable
+ *	  aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+	/*
+	 * Don't apply eager aggregation if disabled by user.
+	 */
+	if (!enable_eager_aggregate)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if there are no available GROUP BY
+	 * clauses.
+	 */
+	if (!root->processed_groupClause)
+		return;
+
+	/*
+	 * For now we don't try to support grouping sets.
+	 */
+	if (root->parse->groupingSets)
+		return;
+
+	/*
+	 * For now we don't try to support DISTINCT or ORDER BY aggregates.
+	 */
+	if (root->numOrderedAggs > 0)
+		return;
+
+	/*
+	 * If there are any aggregates that do not support partial mode, or any
+	 * partial aggregates that are non-serializable, do not apply eager
+	 * aggregation.
+	 */
+	if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+		return;
+
+	/*
+	 * We don't try to apply eager aggregation if there are set-returning
+	 * functions in targetlist.
+	 */
+	if (root->parse->hasTargetSRFs)
+		return;
+
+	/*
+	 * Eager aggregation only makes sense if there are multiple base rels in
+	 * the query.
+	 */
+	if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if any aggregate uses INTERNAL transition
+	 * type.
+	 *
+	 * Although INTERNAL is marked as pass-by-value, it usually points to a
+	 * large internal data structure (like those used by string_agg or
+	 * array_agg).  These transition states can grow large and their size is
+	 * hard to estimate.  Applying eager aggregation in such cases risks high
+	 * memory usage since partial aggregation results might be stored in join
+	 * hash tables or materialized nodes.
+	 */
+	if (has_internal_aggtranstype(root))
+		return;
+
+	/*
+	 * Collect aggregate expressions and plain Vars that appear in the
+	 * targetlist and havingQual.
+	 */
+	create_agg_clause_infos(root);
+
+	/*
+	 * If there are no suitable aggregate expressions, we cannot apply eager
+	 * aggregation.
+	 */
+	if (root->agg_clause_list == NIL)
+		return;
+
+	/*
+	 * Collect grouping expressions that appear in grouping clauses.
+	 */
+	create_grouping_expr_infos(root);
+}
+
+/*
+ * has_internal_aggtranstype
+ *	  Checks if any aggregate uses INTERNAL transition type.
+ */
+static bool
+has_internal_aggtranstype(PlannerInfo *root)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->aggtransinfos)
+	{
+		AggTransInfo *transinfo = lfirst_node(AggTransInfo, lc);
+
+		if (transinfo->aggtranstype == INTERNALOID)
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * create_agg_clause_infos
+ *	  Search the targetlist and havingQual for Aggrefs and plain Vars, and
+ *	  create an AggClauseInfo for each Aggref node.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+	List	   *tlist_exprs;
+	List	   *agg_clause_list = NIL;
+	List	   *tlist_vars = NIL;
+	Relids		aggregate_relids = NULL;
+	bool		eager_agg_applicable = true;
+	ListCell   *lc;
+
+	Assert(root->agg_clause_list == NIL);
+	Assert(root->tlist_vars == NIL);
+
+	tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+								  PVC_INCLUDE_AGGREGATES |
+								  PVC_RECURSE_WINDOWFUNCS |
+								  PVC_RECURSE_PLACEHOLDERS);
+
+	/*
+	 * Aggregates within the HAVING clause need to be processed in the same
+	 * way as those in the targetlist.  Note that HAVING can contain Aggrefs
+	 * but not WindowFuncs.
+	 */
+	if (root->parse->havingQual != NULL)
+	{
+		List	   *having_exprs;
+
+		having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+									   PVC_INCLUDE_AGGREGATES |
+									   PVC_RECURSE_PLACEHOLDERS);
+		if (having_exprs != NIL)
+		{
+			tlist_exprs = list_concat(tlist_exprs, having_exprs);
+			list_free(having_exprs);
+		}
+	}
+
+	foreach(lc, tlist_exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Aggref	   *aggref;
+		Relids		agg_eval_at;
+		AggClauseInfo *ac_info;
+
+		/* For now we don't try to support GROUPING() expressions */
+		if (IsA(expr, GroupingFunc))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* Collect plain Vars for future reference */
+		if (IsA(expr, Var))
+		{
+			tlist_vars = list_append_unique(tlist_vars, expr);
+			continue;
+		}
+
+		aggref = castNode(Aggref, expr);
+
+		Assert(aggref->aggorder == NIL);
+		Assert(aggref->aggdistinct == NIL);
+
+		/*
+		 * If there are any securityQuals, do not try to apply eager
+		 * aggregation if any non-leakproof aggregate functions are present.
+		 * This is overly strict, but for now...
+		 */
+		if (root->qual_security_level > 0 &&
+			!get_func_leakproof(aggref->aggfnoid))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+		/*
+		 * If all base relations in the query are referenced by aggregate
+		 * functions, then eager aggregation is not applicable.
+		 */
+		aggregate_relids = bms_add_members(aggregate_relids, agg_eval_at);
+		if (bms_is_subset(root->all_baserels, aggregate_relids))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* OK, create the AggClauseInfo node */
+		ac_info = makeNode(AggClauseInfo);
+		ac_info->aggref = aggref;
+		ac_info->agg_eval_at = agg_eval_at;
+
+		/* ... and add it to the list */
+		agg_clause_list = list_append_unique(agg_clause_list, ac_info);
+	}
+
+	list_free(tlist_exprs);
+
+	if (eager_agg_applicable)
+	{
+		root->agg_clause_list = agg_clause_list;
+		root->tlist_vars = tlist_vars;
+	}
+	else
+	{
+		list_free_deep(agg_clause_list);
+		list_free(tlist_vars);
+	}
+}
+
+/*
+ * create_grouping_expr_infos
+ *	  Create a GroupingExprInfo for each expression usable as grouping key.
+ *
+ * If any grouping expression is not suitable, we will just return with
+ * root->group_expr_list being NIL.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+	List	   *exprs = NIL;
+	List	   *sortgrouprefs = NIL;
+	List	   *btree_opfamilies = NIL;
+	ListCell   *lc,
+			   *lc1,
+			   *lc2,
+			   *lc3;
+
+	Assert(root->group_expr_list == NIL);
+
+	foreach(lc, root->processed_groupClause)
+	{
+		SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+		TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+		TypeCacheEntry *tce;
+		Oid			equalimageproc;
+
+		Assert(tle->ressortgroupref > 0);
+
+		/*
+		 * For now we only support plain Vars as grouping expressions.
+		 */
+		if (!IsA(tle->expr, Var))
+			return;
+
+		/*
+		 * Eager aggregation is only possible if equality implies image
+		 * equality for each grouping key.  Otherwise, placing keys with
+		 * different byte images into the same group may result in the loss of
+		 * information that could be necessary to evaluate upper qual clauses.
+		 *
+		 * For instance, the NUMERIC data type is not supported, as values
+		 * that are considered equal by the equality operator (e.g., 0 and
+		 * 0.0) can have different scales.
+		 */
+		tce = lookup_type_cache(exprType((Node *) tle->expr),
+								TYPECACHE_BTREE_OPFAMILY);
+		if (!OidIsValid(tce->btree_opf) ||
+			!OidIsValid(tce->btree_opintype))
+			return;
+
+		equalimageproc = get_opfamily_proc(tce->btree_opf,
+										   tce->btree_opintype,
+										   tce->btree_opintype,
+										   BTEQUALIMAGE_PROC);
+		if (!OidIsValid(equalimageproc) ||
+			!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+											   tce->typcollation,
+											   ObjectIdGetDatum(tce->btree_opintype))))
+			return;
+
+		exprs = lappend(exprs, tle->expr);
+		sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+		btree_opfamilies = lappend_oid(btree_opfamilies, tce->btree_opf);
+	}
+
+	/*
+	 * Construct a GroupingExprInfo for each expression.
+	 */
+	forthree(lc1, exprs, lc2, sortgrouprefs, lc3, btree_opfamilies)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc1);
+		int			sortgroupref = lfirst_int(lc2);
+		Oid			btree_opfamily = lfirst_oid(lc3);
+		GroupingExprInfo *ge_info;
+
+		ge_info = makeNode(GroupingExprInfo);
+		ge_info->expr = (Expr *) copyObject(expr);
+		ge_info->sortgroupref = sortgroupref;
+		ge_info->btree_opfamily = btree_opfamily;
+
+		root->group_expr_list = lappend(root->group_expr_list, ge_info);
+	}
+}
+
 /*****************************************************************************
  *
  *	  LATERAL REFERENCES
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 5467e094ca7..eefc486a566 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -76,6 +76,9 @@ query_planner(PlannerInfo *root,
 	root->placeholder_list = NIL;
 	root->placeholder_array = NULL;
 	root->placeholder_array_size = 0;
+	root->agg_clause_list = NIL;
+	root->group_expr_list = NIL;
+	root->tlist_vars = NIL;
 	root->fkey_list = NIL;
 	root->initial_rels = NIL;
 
@@ -265,6 +268,12 @@ query_planner(PlannerInfo *root,
 	 */
 	extract_restriction_or_clauses(root);
 
+	/*
+	 * Check if eager aggregation is applicable, and if so, set up
+	 * root->agg_clause_list and root->group_expr_list.
+	 */
+	setup_eager_aggregation(root);
+
 	/*
 	 * Now expand appendrels by adding "otherrels" for their children.  We
 	 * delay this to the end so that we have as much information as possible
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index c989e72cac5..6e1d01adbfa 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -231,7 +231,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 									  RelOptInfo *partially_grouped_rel,
 									  const AggClauseCosts *agg_costs,
 									  grouping_sets_data *gd,
-									  double dNumGroups,
 									  GroupPathExtraData *extra);
 static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
 												 RelOptInfo *grouped_rel,
@@ -3970,9 +3969,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 							   GroupPathExtraData *extra,
 							   RelOptInfo **partially_grouped_rel_p)
 {
-	Path	   *cheapest_path = input_rel->cheapest_total_path;
 	RelOptInfo *partially_grouped_rel = NULL;
-	double		dNumGroups;
 	PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
 
 	/*
@@ -4054,23 +4051,16 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Gather any partially grouped partial paths. */
 	if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
-	{
 		gather_grouping_paths(root, partially_grouped_rel);
-		set_cheapest(partially_grouped_rel);
-	}
 
-	/*
-	 * Estimate number of groups.
-	 */
-	dNumGroups = get_number_of_groups(root,
-									  cheapest_path->rows,
-									  gd,
-									  extra->targetList);
+	/* Now choose the best path(s) for partially_grouped_rel. */
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+		set_cheapest(partially_grouped_rel);
 
 	/* Build final grouping paths */
 	add_paths_to_grouping_rel(root, input_rel, grouped_rel,
 							  partially_grouped_rel, agg_costs, gd,
-							  dNumGroups, extra);
+							  extra);
 
 	/* Give a helpful error if we failed to find any implementation */
 	if (grouped_rel->pathlist == NIL)
@@ -7015,16 +7005,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 						  RelOptInfo *grouped_rel,
 						  RelOptInfo *partially_grouped_rel,
 						  const AggClauseCosts *agg_costs,
-						  grouping_sets_data *gd, double dNumGroups,
+						  grouping_sets_data *gd,
 						  GroupPathExtraData *extra)
 {
 	Query	   *parse = root->parse;
 	Path	   *cheapest_path = input_rel->cheapest_total_path;
+	Path	   *cheapest_partially_grouped_path = NULL;
 	ListCell   *lc;
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 	List	   *havingQual = (List *) extra->havingQual;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+	double		dNumGroups = 0;
+	double		dNumFinalGroups = 0;
+
+	/*
+	 * Estimate number of groups for non-split aggregation.
+	 */
+	dNumGroups = get_number_of_groups(root,
+									  cheapest_path->rows,
+									  gd,
+									  extra->targetList);
+
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+	{
+		cheapest_partially_grouped_path =
+			partially_grouped_rel->cheapest_total_path;
+
+		/*
+		 * Estimate number of groups for final phase of partial aggregation.
+		 */
+		dNumFinalGroups =
+			get_number_of_groups(root,
+								 cheapest_partially_grouped_path->rows,
+								 gd,
+								 extra->targetList);
+	}
 
 	if (can_sort)
 	{
@@ -7137,7 +7153,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 					path = make_ordered_path(root,
 											 grouped_rel,
 											 path,
-											 partially_grouped_rel->cheapest_total_path,
+											 cheapest_partially_grouped_path,
 											 info->pathkeys,
 											 -1.0);
 
@@ -7155,7 +7171,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												 info->clauses,
 												 havingQual,
 												 agg_final_costs,
-												 dNumGroups));
+												 dNumFinalGroups));
 					else
 						add_path(grouped_rel, (Path *)
 								 create_group_path(root,
@@ -7163,7 +7179,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												   path,
 												   info->clauses,
 												   havingQual,
-												   dNumGroups));
+												   dNumFinalGroups));
 
 				}
 			}
@@ -7205,19 +7221,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 		 */
 		if (partially_grouped_rel && partially_grouped_rel->pathlist)
 		{
-			Path	   *path = partially_grouped_rel->cheapest_total_path;
-
 			add_path(grouped_rel, (Path *)
 					 create_agg_path(root,
 									 grouped_rel,
-									 path,
+									 cheapest_partially_grouped_path,
 									 grouped_rel->reltarget,
 									 AGG_HASHED,
 									 AGGSPLIT_FINAL_DESERIAL,
 									 root->processed_groupClause,
 									 havingQual,
 									 agg_final_costs,
-									 dNumGroups));
+									 dNumFinalGroups));
 		}
 	}
 
@@ -7257,6 +7271,7 @@ create_partial_grouping_paths(PlannerInfo *root,
 {
 	Query	   *parse = root->parse;
 	RelOptInfo *partially_grouped_rel;
+	RelOptInfo *eager_agg_rel = NULL;
 	AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
 	Path	   *cheapest_partial_path = NULL;
@@ -7267,6 +7282,15 @@ create_partial_grouping_paths(PlannerInfo *root,
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 
+	/*
+	 * Check whether any partially aggregated paths have been generated
+	 * through eager aggregation.
+	 */
+	if (input_rel->grouped_rel &&
+		!IS_DUMMY_REL(input_rel->grouped_rel) &&
+		input_rel->grouped_rel->pathlist != NIL)
+		eager_agg_rel = input_rel->grouped_rel;
+
 	/*
 	 * Consider whether we should generate partially aggregated non-partial
 	 * paths.  We can only do this if we have a non-partial path, and only if
@@ -7288,11 +7312,13 @@ create_partial_grouping_paths(PlannerInfo *root,
 
 	/*
 	 * If we can't partially aggregate partial paths, and we can't partially
-	 * aggregate non-partial paths, then don't bother creating the new
+	 * aggregate non-partial paths, and no partially aggregated paths were
+	 * generated by eager aggregation, then don't bother creating the new
 	 * RelOptInfo at all, unless the caller specified force_rel_creation.
 	 */
 	if (cheapest_total_path == NULL &&
 		cheapest_partial_path == NULL &&
+		eager_agg_rel == NULL &&
 		!force_rel_creation)
 		return NULL;
 
@@ -7517,6 +7543,51 @@ create_partial_grouping_paths(PlannerInfo *root,
 										 dNumPartialPartialGroups));
 	}
 
+	/*
+	 * Add any partially aggregated paths generated by eager aggregation to
+	 * the new upper relation after applying projection steps as needed.
+	 */
+	if (eager_agg_rel)
+	{
+		/* Add the paths */
+		foreach(lc, eager_agg_rel->pathlist)
+		{
+			Path	   *path = (Path *) lfirst(lc);
+
+			/* Shouldn't have any parameterized paths anymore */
+			Assert(path->param_info == NULL);
+
+			path = (Path *) create_projection_path(root,
+												   partially_grouped_rel,
+												   path,
+												   partially_grouped_rel->reltarget);
+
+			add_path(partially_grouped_rel, path);
+		}
+
+		/*
+		 * Likewise add the partial paths, but only if parallelism is possible
+		 * for partially_grouped_rel.
+		 */
+		if (partially_grouped_rel->consider_parallel)
+		{
+			foreach(lc, eager_agg_rel->partial_pathlist)
+			{
+				Path	   *path = (Path *) lfirst(lc);
+
+				/* Shouldn't have any parameterized paths anymore */
+				Assert(path->param_info == NULL);
+
+				path = (Path *) create_projection_path(root,
+													   partially_grouped_rel,
+													   path,
+													   partially_grouped_rel->reltarget);
+
+				add_partial_path(partially_grouped_rel, path);
+			}
+		}
+	}
+
 	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding partially grouped ForeignPaths.
@@ -8080,13 +8151,6 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
 
 		add_paths_to_append_rel(root, partially_grouped_rel,
 								partially_grouped_live_children);
-
-		/*
-		 * We need call set_cheapest, since the finalization step will use the
-		 * cheapest path from the rel.
-		 */
-		if (partially_grouped_rel->pathlist)
-			set_cheapest(partially_grouped_rel);
 	}
 
 	/* If possible, create append paths for fully grouped children. */
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index 5b3dc0d8653..11c0eb0d180 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -516,6 +516,65 @@ adjust_appendrel_attrs_mutator(Node *node,
 		return (Node *) newinfo;
 	}
 
+	/*
+	 * We have to process RelAggInfo nodes specially.
+	 */
+	if (IsA(node, RelAggInfo))
+	{
+		RelAggInfo *oldinfo = (RelAggInfo *) node;
+		RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newinfo, oldinfo, sizeof(RelAggInfo));
+
+		newinfo->relids = adjust_child_relids(oldinfo->relids,
+											  nappinfos, appinfos);
+
+		newinfo->target = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+										   context);
+
+		newinfo->agg_input = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+										   context);
+
+		newinfo->group_clauses = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_clauses,
+										   context);
+
+		newinfo->group_exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+										   context);
+
+		return (Node *) newinfo;
+	}
+
+	/*
+	 * We have to process PathTarget nodes specially.
+	 */
+	if (IsA(node, PathTarget))
+	{
+		PathTarget *oldtarget = (PathTarget *) node;
+		PathTarget *newtarget = makeNode(PathTarget);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+		newtarget->exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+										   context);
+
+		if (oldtarget->sortgrouprefs)
+		{
+			Size		nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+			newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+			memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+		}
+
+		return (Node *) newtarget;
+	}
+
 	/*
 	 * NOTE: we do not need to recurse into sublinks, because they should
 	 * already have been converted to subplans before we see them.
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index 9cc602788ea..71d1096012c 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2813,8 +2813,7 @@ create_projection_path(PlannerInfo *root,
 	pathnode->path.pathtype = T_Result;
 	pathnode->path.parent = rel;
 	pathnode->path.pathtarget = target;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe &&
@@ -3069,8 +3068,7 @@ create_incremental_sort_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	/* Sort doesn't project, so use source path's pathtarget */
 	pathnode->path.pathtarget = subpath->pathtarget;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
@@ -3117,8 +3115,7 @@ create_sort_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	/* Sort doesn't project, so use source path's pathtarget */
 	pathnode->path.pathtarget = subpath->pathtarget;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
@@ -3279,8 +3276,7 @@ create_agg_path(PlannerInfo *root,
 	pathnode->path.pathtype = T_Agg;
 	pathnode->path.parent = rel;
 	pathnode->path.pathtarget = target;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index ff507331a06..c4054b5d03f 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,8 @@
 
 #include <limits.h>
 
+#include "access/nbtree.h"
+#include "catalog/pg_constraint.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
@@ -27,12 +29,16 @@
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
 #include "parser/parse_relation.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/typcache.h"
 
 
 typedef struct JoinHashEntry
@@ -83,7 +89,22 @@ static void build_child_join_reltarget(PlannerInfo *root,
 									   RelOptInfo *childrel,
 									   int nappinfos,
 									   AppendRelInfo **appinfos);
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+													RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+								  PathTarget *target, PathTarget *agg_input,
+								  List **group_clauses, List **group_exprs);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
 
+/*
+ * Minimum average group size required to consider applying eager aggregation.
+ *
+ * This helps avoid the overhead of eager aggregation when it does not offer
+ * significant row count reduction.
+ */
+#define EAGER_AGG_MIN_GROUP_SIZE 20.0
 
 /*
  * setup_simple_rel_arrays
@@ -276,6 +297,8 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	rel->joininfo = NIL;
 	rel->has_eclass_joins = false;
 	rel->consider_partitionwise_join = false;	/* might get changed later */
+	rel->agg_info = NULL;
+	rel->grouped_rel = NULL;
 	rel->part_scheme = NULL;
 	rel->nparts = -1;
 	rel->boundinfo = NULL;
@@ -406,6 +429,104 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	return rel;
 }
 
+/*
+ * build_simple_grouped_rel
+ *	  Construct a new RelOptInfo representing a grouped version of the input
+ *	  base relation.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+	RelAggInfo *agg_info;
+
+	/*
+	 * Aggregate expressions and grouping expressions must be available;
+	 * otherwise we would not have reached here.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/* nothing to do for dummy rel */
+	if (IS_DUMMY_REL(rel))
+		return NULL;
+
+	/*
+	 * Prepare the information needed to create grouped paths for this base
+	 * relation.
+	 */
+	agg_info = create_rel_agg_info(root, rel);
+	if (agg_info == NULL)
+		return NULL;
+
+	/*
+	 * If grouped paths for the given base relation are not considered useful,
+	 * skip building the grouped relation.
+	 */
+	if (!agg_info->agg_useful)
+		return NULL;
+
+	/* Tracks the lowest join level at which partial aggregation is applied */
+	agg_info->apply_at = bms_copy(rel->relids);
+
+	/* build the grouped relation */
+	grouped_rel = build_grouped_rel(root, rel);
+	grouped_rel->reltarget = agg_info->target;
+	grouped_rel->rows = agg_info->grouped_rows;
+	grouped_rel->agg_info = agg_info;
+
+	rel->grouped_rel = grouped_rel;
+
+	return grouped_rel;
+}
+
+/*
+ * build_grouped_rel
+ *	  Build a grouped relation by flat copying the input relation and resetting
+ *	  the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	grouped_rel = makeNode(RelOptInfo);
+	memcpy(grouped_rel, rel, sizeof(RelOptInfo));
+
+	/*
+	 * clear path info
+	 */
+	grouped_rel->pathlist = NIL;
+	grouped_rel->ppilist = NIL;
+	grouped_rel->partial_pathlist = NIL;
+	grouped_rel->cheapest_startup_path = NULL;
+	grouped_rel->cheapest_total_path = NULL;
+	grouped_rel->cheapest_unique_path = NULL;
+	grouped_rel->cheapest_parameterized_paths = NIL;
+
+	/*
+	 * clear partition info
+	 */
+	grouped_rel->part_scheme = NULL;
+	grouped_rel->nparts = -1;
+	grouped_rel->boundinfo = NULL;
+	grouped_rel->partbounds_merged = false;
+	grouped_rel->partition_qual = NIL;
+	grouped_rel->part_rels = NULL;
+	grouped_rel->live_parts = NULL;
+	grouped_rel->all_partrels = NULL;
+	grouped_rel->partexprs = NULL;
+	grouped_rel->nullable_partexprs = NULL;
+	grouped_rel->consider_partitionwise_join = false;
+
+	/*
+	 * clear size estimates
+	 */
+	grouped_rel->rows = 0;
+
+	return grouped_rel;
+}
+
 /*
  * find_base_rel
  *	  Find a base or otherrel relation entry, which must already exist.
@@ -755,6 +876,8 @@ build_join_rel(PlannerInfo *root,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = NULL;
 	joinrel->top_parent = NULL;
 	joinrel->top_parent_relids = NULL;
@@ -939,6 +1062,8 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = parent_joinrel;
 	joinrel->top_parent = parent_joinrel->top_parent ? parent_joinrel->top_parent : parent_joinrel;
 	joinrel->top_parent_relids = joinrel->top_parent->relids;
@@ -2518,3 +2643,514 @@ build_child_join_reltarget(PlannerInfo *root,
 	childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
 	childrel->reltarget->width = parentrel->reltarget->width;
 }
+
+/*
+ * create_rel_agg_info
+ *	  Create the RelAggInfo structure for the given relation if it can produce
+ *	  grouped paths.  The given relation is the non-grouped one which has the
+ *	  reltarget already constructed.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	RelAggInfo *result;
+	PathTarget *agg_input;
+	PathTarget *target;
+	List	   *group_clauses = NIL;
+	List	   *group_exprs = NIL;
+
+	/*
+	 * The lists of aggregate expressions and grouping expressions should have
+	 * been constructed.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/*
+	 * If this is a child rel, the grouped rel for its top parent must already
+	 * have been created if that was possible.  So we can just use the parent's
+	 * RelAggInfo if there is one, with appropriate variable substitutions.
+	 */
+	if (IS_OTHER_REL(rel))
+	{
+		RelOptInfo *grouped_rel;
+		RelAggInfo *agg_info;
+
+		grouped_rel = rel->top_parent->grouped_rel;
+		if (grouped_rel == NULL)
+			return NULL;
+
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		/* Must do multi-level transformation */
+		agg_info = (RelAggInfo *)
+			adjust_appendrel_attrs_multilevel(root,
+											  (Node *) grouped_rel->agg_info,
+											  rel,
+											  rel->top_parent);
+
+		agg_info->grouped_rows =
+			estimate_num_groups(root, agg_info->group_exprs,
+								rel->rows, NULL, NULL);
+
+		agg_info->apply_at = NULL;	/* caller will change this later */
+
+		/*
+		 * The grouped paths for the given relation are considered useful iff
+		 * the average group size is no less than EAGER_AGG_MIN_GROUP_SIZE.
+		 */
+		agg_info->agg_useful =
+			(rel->rows / agg_info->grouped_rows) >= EAGER_AGG_MIN_GROUP_SIZE;
+
+		return agg_info;
+	}
+
+	/* Check if it's possible to produce grouped paths for this relation. */
+	if (!eager_aggregation_possible_for_relation(root, rel))
+		return NULL;
+
+	/*
+	 * Create targets for the grouped paths and for the input paths of the
+	 * grouped paths.
+	 */
+	target = create_empty_pathtarget();
+	agg_input = create_empty_pathtarget();
+
+	/* ... and initialize these targets */
+	if (!init_grouping_targets(root, rel, target, agg_input,
+							   &group_clauses, &group_exprs))
+		return NULL;
+
+	/*
+	 * Eager aggregation is not applicable if there are no available grouping
+	 * expressions.
+	 */
+	if (list_length(group_clauses) == 0)
+		return NULL;
+
+	/* build the RelAggInfo result */
+	result = makeNode(RelAggInfo);
+
+	result->group_clauses = group_clauses;
+	result->group_exprs = group_exprs;
+
+	/* Calculate pathkeys that represent these grouping requirements */
+	result->group_pathkeys =
+		make_pathkeys_for_sortclauses(root, result->group_clauses,
+									  make_tlist_from_pathtarget(target));
+
+	/* Add aggregates to the grouping target */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		Aggref	   *aggref;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		aggref = (Aggref *) copyObject(ac_info->aggref);
+		mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+		add_column_to_pathtarget(target, (Expr *) aggref, 0);
+	}
+
+	/* Set the estimated eval cost and output width for both targets */
+	set_pathtarget_cost_width(root, target);
+	set_pathtarget_cost_width(root, agg_input);
+
+	result->relids = bms_copy(rel->relids);
+	result->target = target;
+	result->agg_input = agg_input;
+	result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+											   rel->rows, NULL, NULL);
+	result->apply_at = NULL;	/* caller will change this later */
+
+	/*
+	 * The grouped paths for the given relation are considered useful iff the
+	 * average group size is no less than EAGER_AGG_MIN_GROUP_SIZE.
+	 */
+	result->agg_useful =
+		(rel->rows / result->grouped_rows) >= EAGER_AGG_MIN_GROUP_SIZE;
+
+	return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ *	  Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	int			cur_relid;
+
+	/*
+	 * Check to see if the given relation is on the nullable side of an outer
+	 * join.  In this case, we cannot push a partial aggregation down to the
+	 * relation, because the NULL-extended rows produced by the outer join
+	 * would not be available when we perform the partial aggregation, while
+	 * with a non-eager-aggregation plan these rows are available for the
+	 * top-level aggregation.  Doing so may result in the rows being grouped
+	 * differently than expected, or produce incorrect values from the
+	 * aggregate functions.
+	 */
+	cur_relid = -1;
+	while ((cur_relid = bms_next_member(rel->relids, cur_relid)) >= 0)
+	{
+		RelOptInfo *baserel = find_base_rel_ignore_join(root, cur_relid);
+
+		if (baserel == NULL)
+			continue;			/* ignore outer joins in rel->relids */
+
+		if (!bms_is_subset(baserel->nulling_relids, rel->relids))
+			return false;
+	}
+
+	/*
+	 * For now we don't try to support PlaceHolderVars.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = lfirst(lc);
+
+		if (IsA(expr, PlaceHolderVar))
+			return false;
+	}
+
+	/* Caller should only pass base relations or joins. */
+	Assert(rel->reloptkind == RELOPT_BASEREL ||
+		   rel->reloptkind == RELOPT_JOINREL);
+
+	/*
+	 * Check if all aggregate expressions can be evaluated on this relation
+	 * level.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		/*
+		 * Give up if any aggregate requires relations other than the current
+		 * one.  If the aggregate requires the current relation plus
+		 * additional relations, grouping the current relation could make some
+		 * input rows unavailable for the higher aggregate and may reduce the
+		 * number of input rows it receives.  If the aggregate does not
+		 * require the current relation at all, it should not be grouped, as
+		 * we do not support joining two grouped relations.
+		 */
+		if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * init_grouping_targets
+ *	  Initialize the target for grouped paths (target) as well as the target
+ *	  for paths that generate input for the grouped paths (agg_input).
+ *
+ * We also construct the list of SortGroupClauses and the list of grouping
+ * expressions for the partial aggregation, and return them in *group_clauses
+ * and *group_exprs.
+ *
+ * Return true if the targets could be initialized, false otherwise.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+					  PathTarget *target, PathTarget *agg_input,
+					  List **group_clauses, List **group_exprs)
+{
+	ListCell   *lc;
+	List	   *possibly_dependent = NIL;
+	Index		maxSortGroupRef;
+
+	/* Identify the max sortgroupref */
+	maxSortGroupRef = 0;
+	foreach(lc, root->processed_tlist)
+	{
+		Index		ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+		if (ref > maxSortGroupRef)
+			maxSortGroupRef = ref;
+	}
+
+	/*
+	 * At this point, all Vars from this relation that are needed by upper
+	 * joins or are required in the final targetlist should already be present
+	 * in its reltarget.  Therefore, we can safely iterate over this
+	 * relation's reltarget->exprs to construct the PathTarget and grouping
+	 * clauses for the grouped paths.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Index		sortgroupref;
+
+		/*
+		 * Given that PlaceHolderVar currently prevents us from doing eager
+		 * aggregation, the source target cannot contain anything more complex
+		 * than a Var.
+		 */
+		Assert(IsA(expr, Var));
+
+		/*
+		 * Get the sortgroupref of the expr if it is found among, or can be
+		 * deduced from, the original grouping expressions.
+		 */
+		sortgroupref = get_expression_sortgroupref(root, expr);
+		if (sortgroupref > 0)
+		{
+			SortGroupClause *sgc;
+
+			/* Find the matching SortGroupClause */
+			sgc = get_sortgroupref_clause(sortgroupref, root->processed_groupClause);
+			Assert(sgc->tleSortGroupRef <= maxSortGroupRef);
+
+			/*
+			 * If the target expression is to be used as a grouping key, it
+			 * should be emitted by the grouped paths that have been pushed
+			 * down to this relation level.
+			 */
+			add_column_to_pathtarget(target, expr, sortgroupref);
+
+			/*
+			 * ... and it also should be emitted by the input paths.
+			 */
+			add_column_to_pathtarget(agg_input, expr, sortgroupref);
+
+			/*
+			 * Record this SortGroupClause and grouping expression.  Note that
+			 * this SortGroupClause might have already been recorded.
+			 */
+			if (!list_member(*group_clauses, sgc))
+			{
+				*group_clauses = lappend(*group_clauses, sgc);
+				*group_exprs = lappend(*group_exprs, expr);
+			}
+		}
+		else if (is_var_needed_by_join(root, (Var *) expr, rel))
+		{
+			/*
+			 * The expression is needed for an upper join but is neither in
+			 * the GROUP BY clause nor derivable from it using EC (otherwise,
+			 * it would have already been included in the targets above).  We
+			 * need to create a special SortGroupClause for this expression.
+			 *
+			 * It is important to include such expressions in the grouping
+			 * keys.  This is essential to ensure that an aggregated row from
+			 * the partial aggregation matches the other side of the join if
+			 * and only if each row in the partial group does.  This ensures
+			 * that all rows within the same partial group share the same
+			 * 'destiny', which is crucial for maintaining correctness.
+			 */
+			SortGroupClause *sgc;
+			TypeCacheEntry *tce;
+			Oid			equalimageproc;
+
+			/*
+			 * But first, check if equality implies image equality for this
+			 * expression.  If not, we cannot use it as a grouping key.  See
+			 * comments in create_grouping_expr_infos().
+			 */
+			tce = lookup_type_cache(exprType((Node *) expr),
+									TYPECACHE_BTREE_OPFAMILY);
+			if (!OidIsValid(tce->btree_opf) ||
+				!OidIsValid(tce->btree_opintype))
+				return false;
+
+			equalimageproc = get_opfamily_proc(tce->btree_opf,
+											   tce->btree_opintype,
+											   tce->btree_opintype,
+											   BTEQUALIMAGE_PROC);
+			if (!OidIsValid(equalimageproc) ||
+				!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+												   tce->typcollation,
+												   ObjectIdGetDatum(tce->btree_opintype))))
+				return false;
+
+			/* Create the SortGroupClause. */
+			sgc = makeNode(SortGroupClause);
+
+			/* Initialize the SortGroupClause. */
+			sgc->tleSortGroupRef = ++maxSortGroupRef;
+			get_sort_group_operators(exprType((Node *) expr),
+									 false, true, false,
+									 &sgc->sortop, &sgc->eqop, NULL,
+									 &sgc->hashable);
+
+			/* This expression should be emitted by the grouped paths */
+			add_column_to_pathtarget(target, expr, sgc->tleSortGroupRef);
+
+			/* ... and it also should be emitted by the input paths. */
+			add_column_to_pathtarget(agg_input, expr, sgc->tleSortGroupRef);
+
+			/* Record this SortGroupClause and grouping expression */
+			*group_clauses = lappend(*group_clauses, sgc);
+			*group_exprs = lappend(*group_exprs, expr);
+		}
+		else if (is_var_in_aggref_only(root, (Var *) expr))
+		{
+			/*
+			 * The expression is referenced by an aggregate function pushed
+			 * down to this relation and does not appear elsewhere in the
+			 * targetlist or havingQual.  Add it to 'agg_input' but not to
+			 * 'target'.
+			 */
+			add_new_column_to_pathtarget(agg_input, expr);
+		}
+		else
+		{
+			/*
+			 * The expression may be functionally dependent on other
+			 * expressions in the target, but we cannot verify this until all
+			 * target expressions have been constructed.
+			 */
+			possibly_dependent = lappend(possibly_dependent, expr);
+		}
+	}
+
+	/*
+	 * Now we can verify whether an expression is functionally dependent on
+	 * others.
+	 */
+	foreach(lc, possibly_dependent)
+	{
+		Var		   *tvar;
+		List	   *deps = NIL;
+		RangeTblEntry *rte;
+
+		tvar = lfirst_node(Var, lc);
+		rte = root->simple_rte_array[tvar->varno];
+
+		if (check_functional_grouping(rte->relid, tvar->varno,
+									  tvar->varlevelsup,
+									  target->exprs, &deps))
+		{
+			/*
+			 * The expression is functionally dependent on other target
+			 * expressions, so it can be included in the targets.  Since it
+			 * will not be used as a grouping key, a sortgroupref is not
+			 * needed for it.
+			 */
+			add_new_column_to_pathtarget(target, (Expr *) tvar);
+			add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+		}
+		else
+		{
+			/*
+			 * We may arrive here with a grouping expression that is proven
+			 * redundant by EquivalenceClass processing, such as 't1.a' in the
+			 * query below.
+			 *
+			 * select max(t1.c) from t t1, t t2 where t1.a = 1 group by t1.a,
+			 * t1.b;
+			 *
+			 * For now we just give up in this case.
+			 */
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ *	  Check whether the given Var appears in aggregate expressions and not
+ *	  elsewhere in the targetlist or havingQual.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+	ListCell   *lc;
+
+	/*
+	 * Search the list of aggregate expressions for the Var.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		List	   *vars;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+			continue;
+
+		vars = pull_var_clause((Node *) ac_info->aggref,
+							   PVC_RECURSE_AGGREGATES |
+							   PVC_RECURSE_WINDOWFUNCS |
+							   PVC_RECURSE_PLACEHOLDERS);
+
+		if (list_member(vars, var))
+		{
+			list_free(vars);
+			break;
+		}
+
+		list_free(vars);
+	}
+
+	return (lc != NULL && !list_member(root->tlist_vars, var));
+}
+
+/*
+ * is_var_needed_by_join
+ *	  Check if the given Var is needed by joins above the current rel.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+	Relids		relids;
+	int			attno;
+	RelOptInfo *baserel;
+
+	/*
+	 * Note that when checking if the Var is needed by joins above, we want to
+	 * exclude cases where the Var is only needed in the final targetlist.  So
+	 * include "relation 0" in the check.
+	 */
+	relids = bms_copy(rel->relids);
+	relids = bms_add_member(relids, 0);
+
+	baserel = find_base_rel(root, var->varno);
+	attno = var->varattno - baserel->min_attr;
+
+	return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ *	  Return the sortgroupref of the given "expr" if it is found among the
+ *	  original grouping expressions, or is known equal to any of the original
+ *	  grouping expressions due to equivalence relationships.  Return 0 if no
+ *	  match is found.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->group_expr_list)
+	{
+		GroupingExprInfo *ge_info = lfirst_node(GroupingExprInfo, lc);
+
+		Assert(IsA(ge_info->expr, Var));
+
+		if (equal(ge_info->expr, expr) ||
+			exprs_known_equal(root, (Node *) expr, (Node *) ge_info->expr,
+							  ge_info->btree_opfamily))
+		{
+			Assert(ge_info->sortgroupref > 0);
+
+			return ge_info->sortgroupref;
+		}
+	}
+
+	/* no match is found */
+	return 0;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..5ef8b824a7b 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -949,6 +949,16 @@ struct config_bool ConfigureNamesBool[] =
 		false,
 		NULL, NULL, NULL
 	},
+	{
+		{"enable_eager_aggregate", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables eager aggregation."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&enable_eager_aggregate,
+		true,
+		NULL, NULL, NULL
+	},
 	{
 		{"enable_parallel_append", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables the planner's use of parallel append plans."),
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a9d8293474a..0eb755d61da 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -428,6 +428,7 @@
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
 #enable_self_join_elimination = on
+#enable_eager_aggregate = on
 
 # - Planner Cost Constants -
 
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index e5dd15098f6..c9df12aa38e 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -397,6 +397,15 @@ struct PlannerInfo
 	/* list of PlaceHolderInfos */
 	List	   *placeholder_list;
 
+	/* list of AggClauseInfos */
+	List	   *agg_clause_list;
+
+	/* list of GroupingExprInfos */
+	List	   *group_expr_list;
+
+	/* list of plain Vars contained in targetlist and havingQual */
+	List	   *tlist_vars;
+
 	/* array of PlaceHolderInfos indexed by phid */
 	struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
 	/* allocated size of array */
@@ -1024,6 +1033,14 @@ typedef struct RelOptInfo
 	/* consider partitionwise join paths? (if partitioned rel) */
 	bool		consider_partitionwise_join;
 
+	/*
+	 * used by eager aggregation:
+	 */
+	/* information needed to create grouped paths */
+	struct RelAggInfo *agg_info;
+	/* the partially-aggregated version of the relation */
+	struct RelOptInfo *grouped_rel;
+
 	/*
 	 * inheritance links, if this is an otherrel (otherwise NULL):
 	 */
@@ -1097,6 +1114,75 @@ typedef struct RelOptInfo
 	((rel)->part_scheme && (rel)->boundinfo && (rel)->nparts > 0 && \
 	 (rel)->part_rels && (rel)->partexprs && (rel)->nullable_partexprs)
 
+/*
+ * Is the given relation a grouped relation?
+ */
+#define IS_GROUPED_REL(rel) \
+	((rel)->agg_info != NULL)
+
+/*
+ * RelAggInfo
+ *		Information needed to create grouped paths for base and join rels.
+ *
+ * "relids" is the set of relation identifiers (RT indexes).
+ *
+ * "target" is the output tlist for the grouped paths.
+ *
+ * "agg_input" is the output tlist for the paths that provide input to the
+ * grouped paths.  One difference from the reltarget of the non-grouped
+ * relation is that agg_input has its sortgrouprefs[] initialized.
+ *
+ * "grouped_rows" is the estimated number of result tuples of the grouped
+ * relation.
+ *
+ * "group_clauses", "group_exprs" and "group_pathkeys" are lists of
+ * SortGroupClauses, the corresponding grouping expressions and PathKeys
+ * respectively.
+ *
+ * "apply_at" tracks the lowest join level at which partial aggregation is
+ * applied.
+ *
+ * "agg_useful" is a flag to indicate whether the grouped paths are considered
+ * useful.  It is set true if the average partial group size is no less than
+ * EAGER_AGG_MIN_GROUP_SIZE, suggesting a significant row count reduction.
+ */
+typedef struct RelAggInfo
+{
+	pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* set of base + OJ relids (rangetable indexes) */
+	Relids		relids;
+
+	/*
+	 * default result targetlist for Paths scanning this grouped relation;
+	 * list of Vars/Exprs, cost, width
+	 */
+	struct PathTarget *target;
+
+	/*
+	 * the targetlist for Paths that provide input to the grouped paths
+	 */
+	struct PathTarget *agg_input;
+
+	/* estimated number of result tuples */
+	Cardinality grouped_rows;
+
+	/* a list of SortGroupClauses */
+	List	   *group_clauses;
+	/* a list of grouping expressions */
+	List	   *group_exprs;
+	/* a list of PathKeys */
+	List	   *group_pathkeys;
+
+	/* lowest level partial aggregation is applied at */
+	Relids		apply_at;
+
+	/* the grouped paths are considered useful? */
+	bool		agg_useful;
+} RelAggInfo;
+
 /*
  * IndexOptInfo
  *		Per-index information for planning/optimization
@@ -3276,6 +3362,50 @@ typedef struct MinMaxAggInfo
 	Param	   *param;
 } MinMaxAggInfo;
 
+/*
+ * For each distinct Aggref node that appears in the targetlist and HAVING
+ * clauses, we store an AggClauseInfo node in the PlannerInfo node's
+ * agg_clause_list.  Each AggClauseInfo records the set of relations referenced
+ * by the aggregate expression.  This information is used to determine how far
+ * the aggregate can be safely pushed down in the join tree.
+ */
+typedef struct AggClauseInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the Aggref expr */
+	Aggref	   *aggref;
+
+	/* lowest level we can evaluate this aggregate at */
+	Relids		agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * For each grouping expression that appears in grouping clauses, we store a
+ * GroupingExprInfo node in the PlannerInfo node's group_expr_list.  Each
+ * GroupingExprInfo records the expression being grouped on, its sortgroupref,
+ * and the btree opfamily used for equality comparison.  This information is
+ * necessary to reproduce correct grouping semantics at different levels of the
+ * join tree.
+ */
+typedef struct GroupingExprInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the represented expression */
+	Expr	   *expr;
+
+	/* the tleSortGroupRef of the corresponding SortGroupClause */
+	Index		sortgroupref;
+
+	/* btree opfamily defining the ordering */
+	Oid			btree_opfamily;
+} GroupingExprInfo;
+
 /*
  * At runtime, PARAM_EXEC slots are used to pass values around from one plan
  * node to another.  They can be used to pass values down into subqueries (for
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 60dcdb77e41..01a3532dc2e 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -314,6 +314,10 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
 									RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
+											RelOptInfo *rel_plain);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
+									 RelOptInfo *rel_plain);
 extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
@@ -353,4 +357,5 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										SpecialJoinInfo *sjinfo,
 										int nappinfos, AppendRelInfo **appinfos);
 
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
 #endif							/* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 8410531f2d6..b62f22237b7 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,6 +21,7 @@
  * allpaths.c
  */
 extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
 extern PGDLLIMPORT int geqo_threshold;
 extern PGDLLIMPORT int min_parallel_table_scan_size;
 extern PGDLLIMPORT int min_parallel_index_scan_size;
@@ -57,6 +58,10 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 								  bool override_rows);
 extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 										 bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root,
+								   RelOptInfo *rel_grouped,
+								   RelOptInfo *rel_plain,
+								   RelAggInfo *agg_info);
 extern int	compute_parallel_worker(RelOptInfo *rel, double heap_pages,
 									double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 9d3debcab28..09b48b26f8f 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -76,6 +76,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
 									Relids where_needed);
 extern void remove_useless_groupby_columns(PlannerInfo *root);
+extern void setup_eager_aggregation(PlannerInfo *root);
 extern void find_lateral_references(PlannerInfo *root);
 extern void rebuild_lateral_attr_needed(PlannerInfo *root);
 extern void create_lateral_join_info(PlannerInfo *root);
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 00000000000..f02ff0b30a3
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1334 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b
+                                 Sort Key: t2.b
+                                 ->  Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Hash Join
+                                 Output: t2.c, t2.b, t3.c
+                                 Hash Cond: (t3.a = t2.a)
+                                 ->  Seq Scan on public.eager_agg_t3 t3
+                                       Output: t3.a, t3.b, t3.c
+                                 ->  Hash
+                                       Output: t2.c, t2.b, t2.a
+                                       ->  Seq Scan on public.eager_agg_t2 t2
+                                             Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b, t3.c
+                                 Sort Key: t2.b
+                                 ->  Hash Join
+                                       Output: t2.c, t2.b, t3.c
+                                       Hash Cond: (t3.a = t2.a)
+                                       ->  Seq Scan on public.eager_agg_t3 t3
+                                             Output: t3.a, t3.b, t3.c
+                                       ->  Hash
+                                             Output: t2.c, t2.b, t2.a
+                                             ->  Seq Scan on public.eager_agg_t2 t2
+                                                   Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Right Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Sort
+   Output: t2.b, (avg(t2.c))
+   Sort Key: t2.b
+   ->  HashAggregate
+         Output: t2.b, avg(t2.c)
+         Group Key: t2.b
+         ->  Hash Right Join
+               Output: t2.b, t2.c
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on public.eager_agg_t2 t2
+                     Output: t2.a, t2.b, t2.c
+               ->  Hash
+                     Output: t1.b
+                     ->  Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ b | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   |    
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Gather Merge
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Workers Planned: 2
+         ->  Sort
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Sort Key: t1.a
+               ->  Parallel Hash Join
+                     Output: t1.a, (PARTIAL avg(t2.c))
+                     Hash Cond: (t1.b = t2.b)
+                     ->  Parallel Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.a, t1.b, t1.c
+                     ->  Parallel Hash
+                           Output: t2.b, (PARTIAL avg(t2.c))
+                           ->  Partial HashAggregate
+                                 Output: t2.b, PARTIAL avg(t2.c)
+                                 Group Key: t2.b
+                                 ->  Parallel Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t2.y, (sum(t1.y)), (count(*))
+   Sort Key: t2.y
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t2.y, sum(t1.y), count(*)
+               Group Key: t2.y
+               ->  Hash Join
+                     Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.y, t1.x
+         ->  Finalize HashAggregate
+               Output: t2_1.y, sum(t1_1.y), count(*)
+               Group Key: t2_1.y
+               ->  Hash Join
+                     Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.y, t1_1.x
+         ->  Finalize HashAggregate
+               Output: t2_2.y, sum(t1_2.y), count(*)
+               Group Key: t2_2.y
+               ->  Hash Join
+                     Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+                                                 QUERY PLAN                                                 
+------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t2.x, (sum(t1.x)), (count(*))
+   Sort Key: t2.x
+   ->  Finalize HashAggregate
+         Output: t2.x, sum(t1.x), count(*)
+         Group Key: t2.x
+         Filter: (avg(t1.x) > '5'::numeric)
+         ->  Append
+               ->  Hash Join
+                     Output: t2.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.x, t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.x), PARTIAL count(*), PARTIAL avg(t1.x)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x
+               ->  Hash Join
+                     Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.x, t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x
+               ->  Hash Join
+                     Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.x, t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+ x |  sum  | count 
+---+-------+-------
+ 0 | 33835 |  6667
+ 1 | 39502 |  6667
+ 2 | 46169 |  6667
+ 3 | 52836 |  6667
+ 4 | 59503 |  6667
+ 5 | 33500 |  6667
+ 6 | 39837 |  6667
+ 7 | 46504 |  6667
+ 8 | 53171 |  6667
+ 9 | 59838 |  6667
+(10 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y)))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y))
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y))
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y))
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   
+----+---------
+  0 | 1437480
+  1 | 2082896
+  2 | 2684422
+  3 | 3285948
+  4 | 3887474
+  5 | 1526260
+  6 | 2127786
+  7 | 2729312
+  8 | 3330838
+  9 | 3932364
+ 10 | 1481370
+ 11 | 2012472
+ 12 | 2587464
+ 13 | 3162456
+ 14 | 3737448
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t3.y, sum((t2.y + t3.y))
+   Group Key: t3.y
+   ->  Sort
+         Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+         Sort Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t2.x = t1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y))
+                           Group Key: t2.x, t3.y, t3.x
+                           ->  Incremental Sort
+                                 Output: t2.y, t2.x, t3.y, t3.x
+                                 Sort Key: t2.x, t3.y
+                                 Presorted Key: t2.x
+                                 ->  Merge Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Merge Cond: (t2.x = t3.x)
+                                       ->  Sort
+                                             Output: t2.y, t2.x
+                                             Sort Key: t2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                                   Output: t2.y, t2.x
+                                       ->  Sort
+                                             Output: t3.y, t3.x
+                                             Sort Key: t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+                     ->  Hash
+                           Output: t1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                 Output: t1.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t2_1.x = t1_1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                           Group Key: t2_1.x, t3_1.y, t3_1.x
+                           ->  Incremental Sort
+                                 Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                 Sort Key: t2_1.x, t3_1.y
+                                 Presorted Key: t2_1.x
+                                 ->  Merge Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Merge Cond: (t2_1.x = t3_1.x)
+                                       ->  Sort
+                                             Output: t2_1.y, t2_1.x
+                                             Sort Key: t2_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                                   Output: t2_1.y, t2_1.x
+                                       ->  Sort
+                                             Output: t3_1.y, t3_1.x
+                                             Sort Key: t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+                     ->  Hash
+                           Output: t1_1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                 Output: t1_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t2_2.x = t1_2.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                           Group Key: t2_2.x, t3_2.y, t3_2.x
+                           ->  Incremental Sort
+                                 Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                 Sort Key: t2_2.x, t3_2.y
+                                 Presorted Key: t2_2.x
+                                 ->  Merge Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Merge Cond: (t2_2.x = t3_2.x)
+                                       ->  Sort
+                                             Output: t2_2.y, t2_2.x
+                                             Sort Key: t2_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                                   Output: t2_2.y, t2_2.x
+                                       ->  Sort
+                                             Output: t3_2.y, t3_2.x
+                                             Sort Key: t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+                     ->  Hash
+                           Output: t1_2.x
+                           ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                 Output: t1_2.x
+(88 rows)
+
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y |   sum   
+---+---------
+ 0 | 1111110
+ 1 | 2000132
+ 2 | 2889154
+ 3 | 3778176
+ 4 | 4667198
+ 5 | 3334000
+ 6 | 4223022
+ 7 | 5112044
+ 8 | 6001066
+ 9 | 6890088
+(10 rows)
+
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.y, (sum(t2.y)), (count(*))
+   Sort Key: t1.y
+   ->  Finalize HashAggregate
+         Output: t1.y, sum(t2.y), count(*)
+         Group Key: t1.y
+         ->  Append
+               ->  Hash Join
+                     Output: t1.y, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.y, t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+               ->  Hash Join
+                     Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.y, t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+               ->  Hash Join
+                     Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.y, t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+               ->  Hash Join
+                     Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.y, t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+               ->  Hash Join
+                     Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.y, t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                                QUERY PLAN                                                
+----------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y)), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t3.y
+   ->  Finalize HashAggregate
+         Output: t3.y, sum((t2.y + t3.y)), count(*)
+         Group Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.y, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x, t3.y, t3.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.y, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x, t3_1.y, t3_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.y, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x, t3_2.y, t3_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.y, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x, t3_3.y, t3_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+               ->  Hash Join
+                     Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.y, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.y, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x, t3_4.y, t3_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 83228cfca29..3b37fafa65b 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -151,6 +151,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_async_append            | on
  enable_bitmapscan              | on
  enable_distinct_reordering     | on
+ enable_eager_aggregate         | on
  enable_gathermerge             | on
  enable_group_by_reordering     | on
  enable_hashagg                 | on
@@ -172,7 +173,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(24 rows)
+(25 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fbffc67ae60..f9450cdc477 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -123,7 +123,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # The stats test resets stats, so nothing else needing stats access can be in
 # this group.
 # ----------
-test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa eager_aggregate
 
 # event_trigger depends on create_am and cannot run concurrently with
 # any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 00000000000..5da8749a6cb
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,194 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a8656419cb6..37053d9d769 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -42,6 +42,7 @@ AfterTriggersTableData
 AfterTriggersTransData
 Agg
 AggClauseCosts
+AggClauseInfo
 AggInfo
 AggPath
 AggSplit
@@ -1110,6 +1111,7 @@ GroupPathExtraData
 GroupResultPath
 GroupState
 GroupVarInfo
+GroupingExprInfo
 GroupingFunc
 GroupingSet
 GroupingSetData
@@ -2471,6 +2473,7 @@ ReindexObjectType
 ReindexParams
 ReindexStmt
 ReindexType
+RelAggInfo
 RelFileLocator
 RelFileLocatorBackend
 RelFileNumber
-- 
2.43.0




* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-08-06 07:52                                 ` Richard Guo <[email protected]>
  2025-08-06 13:44                                   ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-05 14:50                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  0 siblings, 3 replies; 70+ messages in thread

From: Richard Guo @ 2025-08-06 07:52 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Thu, Jul 24, 2025 at 12:21 PM Richard Guo <[email protected]> wrote:
> This patch no longer applies; here's a rebased version.  Nothing
> essential has changed.

Based on some off-list testing by Matheus (CC'ed), several TPC-DS
queries that used to apply eager aggregation no longer do, which
suggests that the v18 patch is too strict about when eager aggregation
can be used.

I looked into query 4 and query 11, and found two reasons why they no
longer apply eager aggregation with v18.

* The has_internal_aggtranstype() check.

To avoid potential memory blowout risks from large partial aggregation
values, v18 avoids applying eager aggregation if any aggregate uses an
INTERNAL transition type, as this typically indicates a large internal
data structure (as in string_agg or array_agg).  However, this also
excludes aggregates like avg(numeric) and sum(numeric), which are
actually safe to use with eager aggregation.

What we really want to exclude are aggregate functions that can
produce large transition values by accumulating or concatenating input
rows.  So I'm wondering if we could instead check the transfn_oid
directly and explicitly exclude only F_ARRAY_AGG_TRANSFN and
F_STRING_AGG_TRANSFN.  We don't need to worry about json_agg,
jsonb_agg, or xmlagg, since they don't support partial aggregation
anyway.
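
As a rough illustration of which aggregates the INTERNAL check touches,
the catalog can be asked for every aggregate whose transition type is
INTERNAL, together with whether it has a combine function (that is,
whether it supports partial aggregation at all).  This is only a sketch
against pg_aggregate on a stock server, not something taken from the
patch; the column aliases are made up for readability.

```sql
-- Illustrative only: aggregates with an INTERNAL transition type.
-- aggcombinefn is zero when the aggregate cannot be partially aggregated.
SELECT a.aggfnoid::regprocedure  AS aggregate,
       a.aggtransfn              AS transition_fn,
       a.aggcombinefn::oid <> 0  AS has_combine_fn
FROM pg_aggregate a
WHERE a.aggtranstype = 'internal'::regtype
ORDER BY a.aggfnoid::regprocedure::text;
```

Rows with has_combine_fn = false are not candidates for eager
aggregation regardless of this check.  The rows that remain are the
ones an INTERNAL-based test rejects wholesale.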

* The EAGER_AGG_MIN_GROUP_SIZE threshold

This threshold defines the minimum average group size required to
consider applying eager aggregation.  It was previously set to 2, but
in v18 it was increased to 20 to be cautious about planning overhead.
This change was a snap decision though, without any profiling or data
to back it.

Looking at TPC-DS queries 4 and 11, a threshold of 10 is the minimum
needed to consider eager aggregation for them.  The resulting plans
show nice performance improvements without any measurable increase in
planning time.  So, I'm inclined to lower the threshold to 10 for now.
(Wondering whether we should make this threshold a GUC, so users can
adjust it based on their needs.)
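
To make "average group size" concrete, here is a small self-contained
sketch that computes it for a grouping key: input rows divided by the
number of distinct groups.  The data shape (i % 15 over 1000 rows)
mirrors the regression-test tables.  Note that the planner works from
row estimates rather than exact counts, so this only approximates what
the threshold is compared against.

```sql
-- Sketch: average group size = input rows / number of groups.
WITH t(x) AS (
    SELECT i % 15 FROM generate_series(1, 1000) i
)
SELECT count(*)                                        AS input_rows,
       count(DISTINCT x)                               AS groups,
       round(count(*)::numeric / count(DISTINCT x), 1) AS avg_group_size
FROM t;
-- input_rows = 1000, groups = 15, avg_group_size = 66.7
```

A key like this clears a threshold of either 10 or 20.  The two
settings only differ for keys whose average group size falls between
them.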


With these two changes, here are the planning and execution times for
queries 4 and 11 (scale factor 1) on my snail-paced machine, with and
without eager aggregation.

query 4:
-- without eager aggregation
 Planning Time: 6.765 ms
 Execution Time: 34941.713 ms
-- with eager aggregation
 Planning Time: 6.674 ms
 Execution Time: 13994.183 ms

query 11:
-- without eager aggregation
 Planning Time: 3.757 ms
 Execution Time: 20888.076 ms
-- with eager aggregation
 Planning Time: 3.747 ms
 Execution Time: 7449.522 ms
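
For reference, the execution-time ratios implied by the numbers above
can be worked out directly.  This is plain arithmetic on the figures as
reported, nothing more:

```sql
-- Speedup = execution time without / execution time with eager aggregation.
SELECT round(34941.713 / 13994.183, 2) AS q4_speedup,
       round(20888.076 / 7449.522, 2)  AS q11_speedup;
-- q4_speedup = 2.50, q11_speedup = 2.80
```

Planning time differs by less than 0.1 ms in both cases.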

Any comments on these two changes?

Thanks
Richard






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-08-06 13:44                                   ` Matheus Alcantara <[email protected]>
  2025-08-09 01:32                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2 siblings, 1 reply; 70+ messages in thread

From: Matheus Alcantara @ 2025-08-06 13:44 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Wed Aug 6, 2025 at 4:52 AM -03, Richard Guo wrote:
> On Thu, Jul 24, 2025 at 12:21 PM Richard Guo <[email protected]> wrote:
>> This patch no longer applies; here's a rebased version.  Nothing
>> essential has changed.
>
> Based on some off-list testing by Matheus (CC'ed), several TPC-DS
> queries that used to apply eager aggregation no longer do, which
> suggests that the v18 patch is too strict about when eager aggregation
> can be used.
>
> I looked into query 4 and query 11, and found two reasons why they no
> longer apply eager aggregation with v18.
>
> * The has_internal_aggtranstype() check.
>
> To avoid potential memory blowout risks from large partial aggregation
> values, v18 avoids applying eager aggregation if any aggregate uses an
> INTERNAL transition type, as this typically indicates a large internal
> data structure (as in string_agg or array_agg).  However, this also
> excludes aggregates like avg(numeric) and sum(numeric), which are
> actually safe to use with eager aggregation.
>
> What we really want to exclude are aggregate functions that can
> produce large transition values by accumulating or concatenating input
> rows.  So I'm wondering if we could instead check the transfn_oid
> directly and explicitly exclude only F_ARRAY_AGG_TRANSFN and
> F_STRING_AGG_TRANSFN.  We don't need to worry about json_agg,
> jsonb_agg, or xmlagg, since they don't support partial aggregation
> anyway.
>
I think that makes sense. I'm just wondering whether we should follow an
"allow" or a "don't-allow" strategy. I mean, instead of listing the
aggregate functions that are not allowed, we could list the functions that
are allowed to use eager aggregation, so that we know eager aggregation
works properly for every function that is enabled.

> * The EAGER_AGG_MIN_GROUP_SIZE threshold
>
> This threshold defines the minimum average group size required to
> consider applying eager aggregation.  It was previously set to 2, but
> in v18 it was increased to 20 to be cautious about planning overhead.
> This change was a snap decision though, without any profiling or data
> to back it.
>
> Looking at TPC-DS queries 4 and 11, a threshold of 10 is the minimum
> needed to consider eager aggregation for them.  The resulting plans
> show nice performance improvements without any measurable increase in
> planning time.  So, I'm inclined to lower the threshold to 10 for now.
> (Wondering whether we should make this threshold a GUC, so users can
> adjust it based on their needs.)
>
Having a GUC sounds like a good idea to me, TBH. This threshold may
vary from workload to workload (?).

>
> With these two changes, here are the planning and execution times for
> queries 4 and 11 (scale factor 1) on my snail-paced machine, with and
> without eager aggregation.
>
> query 4:
> -- without eager aggregation
>  Planning Time: 6.765 ms
>  Execution Time: 34941.713 ms
> -- with eager aggregation
>  Planning Time: 6.674 ms
>  Execution Time: 13994.183 ms
>
> query 11:
> -- without eager aggregation
>  Planning Time: 3.757 ms
>  Execution Time: 20888.076 ms
> -- with eager aggregation
>  Planning Time: 3.747 ms
>  Execution Time: 7449.522 ms
>
> Any comments on these two changes?
>
That sounds like a good way to go to me. I'm looking forward to the next
patch version so that I can run some more tests.

Thanks

--
Matheus Alcantara






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 13:44                                   ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
@ 2025-08-09 01:32                                     ` Richard Guo <[email protected]>
  2025-08-14 19:22                                       ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
  2025-09-01 01:32                                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 2 replies; 70+ messages in thread

From: Richard Guo @ 2025-08-09 01:32 UTC (permalink / raw)
  To: Matheus Alcantara <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Wed, Aug 6, 2025 at 10:44 PM Matheus Alcantara
<[email protected]> wrote:
> On Wed Aug 6, 2025 at 4:52 AM -03, Richard Guo wrote:
> > * The has_internal_aggtranstype() check.
> >
> > To avoid potential memory blowout risks from large partial aggregation
> > values, v18 avoids applying eager aggregation if any aggregate uses an
> > INTERNAL transition type, as this typically indicates a large internal
> > data structure (as in string_agg or array_agg).  However, this also
> > excludes aggregates like avg(numeric) and sum(numeric), which are
> > actually safe to use with eager aggregation.
> >
> > What we really want to exclude are aggregate functions that can
> > produce large transition values by accumulating or concatenating input
> > rows.  So I'm wondering if we could instead check the transfn_oid
> > directly and explicitly exclude only F_ARRAY_AGG_TRANSFN and
> > F_STRING_AGG_TRANSFN.  We don't need to worry about json_agg,
> > jsonb_agg, or xmlagg, since they don't support partial aggregation
> > anyway.

> I think that makes sense. I'm just wondering whether we should follow an
> "allow" or a "don't-allow" strategy. I mean, instead of listing the
> aggregate functions that are not allowed, we could list the functions that
> are allowed to use eager aggregation, so that we know eager aggregation
> works properly for every function that is enabled.

I ended up still checking for INTERNAL transition types, but
explicitly exempted aggregates that use the F_NUMERIC_AVG_ACCUM
transition function from that check, on the assumption that
avg(numeric) and sum(numeric) are safe in this context.  This might
still be overly strict, but I prefer to be on the safe side for now.

> > * The EAGER_AGG_MIN_GROUP_SIZE threshold
> >
> > This threshold defines the minimum average group size required to
> > consider applying eager aggregation.  It was previously set to 2, but
> > in v18 it was increased to 20 to be cautious about planning overhead.
> > This change was a snap decision though, without any profiling or data
> > to back it.
> >
> > Looking at TPC-DS queries 4 and 11, a threshold of 10 is the minimum
> > needed to consider eager aggregation for them.  The resulting plans
> > show nice performance improvements without any measurable increase in
> > planning time.  So, I'm inclined to lower the threshold to 10 for now.
> > (Wondering whether we should make this threshold a GUC, so users can
> > adjust it based on their needs.)

> Having a GUC sounds like a good idea to me, TBH. This threshold may
> vary from workload to workload (?).

I've made this threshold a GUC, with a default value of 8 (further
benchmark testing showed that a value of 10 is still too strict for
TPC-DS query 4).
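
For anyone who wants to try the setting, a session might look like the
sketch below.  It assumes a server built with the v19 patch, since
enable_eager_aggregate and min_eager_agg_group_size do not exist
otherwise.  The tables t1 and t2 are hypothetical, and no particular
plan output is claimed here.

```sql
-- Sketch only: needs a server built with the v19 patch.
-- t1 and t2 are hypothetical tables used purely for illustration.
CREATE TEMP TABLE t1 (a int, b int);
CREATE TEMP TABLE t2 (b int, c int);
INSERT INTO t1 SELECT i % 10, i % 100 FROM generate_series(1, 1000) i;
INSERT INTO t2 SELECT i % 100, i FROM generate_series(1, 10000) i;
ANALYZE t1;
ANALYZE t2;

SET enable_eager_aggregate = on;
SET min_eager_agg_group_size = 8;    -- the v19 default

-- t2 has about 100 rows per distinct b, well above the threshold, so a
-- partial aggregate on t2 below the join is a candidate.
EXPLAIN (COSTS OFF)
SELECT t1.a, sum(t2.c) FROM t1 JOIN t2 ON t1.b = t2.b GROUP BY t1.a;

-- Going by the description of the setting, a threshold above the
-- average group size should take eager aggregation out of consideration.
SET min_eager_agg_group_size = 1000;
EXPLAIN (COSTS OFF)
SELECT t1.a, sum(t2.c) FROM t1 JOIN t2 ON t1.b = t2.b GROUP BY t1.a;

RESET min_eager_agg_group_size;
RESET enable_eager_aggregate;
```

Comparing the two EXPLAIN outputs is a quick way to check whether the
threshold is what keeps a given query from using eager aggregation.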

> > Any comments on these two changes?

> That sounds like a good way to go to me. I'm looking forward to the next
> patch version so that I can run some more tests.

OK.  Here it is.

Thanks
Richard


Attachments:

  [application/octet-stream] v19-0001-Implement-Eager-Aggregation.patch (174.0K, 2-v19-0001-Implement-Eager-Aggregation.patch)
  download | inline diff:
From 22999025da5f400b4b780df13dce008665c5c372 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Tue, 11 Jun 2024 15:59:19 +0900
Subject: [PATCH v19] Implement Eager Aggregation

Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.

In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations.  To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node.  During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied.  If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
is currently not supported.

To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

The patch was originally proposed by Antonin Houska in 2017.  This
commit reworks various important aspects and rewrites most of the
current code.  However, the original patch and reviews were very
useful.

Author: Richard Guo, Antonin Houska
Reviewed-by: Robert Haas, Jian He, Tender Wang, Paul George, Tom Lane
Reviewed-by: Tomas Vondra, Andy Fan, Ashutosh Bapat
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
---
 .../postgres_fdw/expected/postgres_fdw.out    |   49 +-
 doc/src/sgml/config.sgml                      |   31 +
 src/backend/optimizer/README                  |   89 ++
 src/backend/optimizer/geqo/geqo_eval.c        |   21 +
 src/backend/optimizer/path/allpaths.c         |  453 ++++++
 src/backend/optimizer/path/joinrels.c         |  193 +++
 src/backend/optimizer/plan/initsplan.c        |  322 ++++
 src/backend/optimizer/plan/planmain.c         |    9 +
 src/backend/optimizer/plan/planner.c          |  124 +-
 src/backend/optimizer/util/appendinfo.c       |   59 +
 src/backend/optimizer/util/pathnode.c         |   12 +-
 src/backend/optimizer/util/relnode.c          |  629 ++++++++
 src/backend/utils/misc/guc_tables.c           |   21 +
 src/backend/utils/misc/postgresql.conf.sample |    2 +
 src/include/nodes/pathnodes.h                 |  130 ++
 src/include/optimizer/pathnode.h              |    5 +
 src/include/optimizer/paths.h                 |    6 +
 src/include/optimizer/planmain.h              |    1 +
 .../regress/expected/collate.icu.utf8.out     |   32 +-
 src/test/regress/expected/eager_aggregate.out | 1334 +++++++++++++++++
 src/test/regress/expected/join.out            |   12 +-
 .../regress/expected/partition_aggregate.out  |    2 +
 src/test/regress/expected/sysviews.out        |    3 +-
 src/test/regress/parallel_schedule            |    2 +-
 src/test/regress/sql/eager_aggregate.sql      |  194 +++
 src/test/regress/sql/partition_aggregate.sql  |    2 +
 src/tools/pgindent/typedefs.list              |    3 +
 27 files changed, 3658 insertions(+), 82 deletions(-)
 create mode 100644 src/test/regress/expected/eager_aggregate.out
 create mode 100644 src/test/regress/sql/eager_aggregate.sql

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index a434eb1395e..e05dcb44947 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3713,30 +3713,33 @@ select count(t1.c3) from ft2 t1 left join ft2 t2 on (t1.c1 = random() * t2.c2);
 -- Subquery in FROM clause having aggregate
 explain (verbose, costs off)
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
-                                          QUERY PLAN                                           
------------------------------------------------------------------------------------------------
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
  Sort
-   Output: (count(*)), x.b
-   Sort Key: (count(*)), x.b
-   ->  HashAggregate
-         Output: count(*), x.b
-         Group Key: x.b
-         ->  Hash Join
-               Output: x.b
-               Inner Unique: true
-               Hash Cond: (ft1.c2 = x.a)
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.c2
-                     Remote SQL: SELECT c2 FROM "S 1"."T 1"
-               ->  Hash
-                     Output: x.b, x.a
-                     ->  Subquery Scan on x
-                           Output: x.b, x.a
-                           ->  Foreign Scan
-                                 Output: ft1_1.c2, (sum(ft1_1.c1))
-                                 Relations: Aggregate on (public.ft1 ft1_1)
-                                 Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
-(21 rows)
+   Output: (count(*)), (sum(ft1_1.c1))
+   Sort Key: (count(*)), (sum(ft1_1.c1))
+   ->  Finalize GroupAggregate
+         Output: count(*), (sum(ft1_1.c1))
+         Group Key: (sum(ft1_1.c1))
+         ->  Sort
+               Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+               Sort Key: (sum(ft1_1.c1))
+               ->  Hash Join
+                     Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+                     Hash Cond: (ft1_1.c2 = ft1.c2)
+                     ->  Foreign Scan
+                           Output: ft1_1.c2, (sum(ft1_1.c1))
+                           Relations: Aggregate on (public.ft1 ft1_1)
+                           Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
+                     ->  Hash
+                           Output: ft1.c2, (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: ft1.c2, PARTIAL count(*)
+                                 Group Key: ft1.c2
+                                 ->  Foreign Scan on public.ft1
+                                       Output: ft1.c2
+                                       Remote SQL: SELECT c2 FROM "S 1"."T 1"
+(24 rows)
 
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
  count |   b   
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 20ccb2d6b54..5400bd8f18f 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5474,6 +5474,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-eager-aggregate" xreflabel="enable_eager_aggregate">
+      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_eager_aggregate</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's ability to partially push
+        aggregation past a join, and finalize it once all the relations are
+        joined. The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
       <term><varname>enable_gathermerge</varname> (<type>boolean</type>)
       <indexterm>
@@ -6094,6 +6109,22 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-min-eager-agg-group-size" xreflabel="min_eager_agg_group_size">
+      <term><varname>min_eager_agg_group_size</varname> (<type>floating point</type>)
+      <indexterm>
+       <primary><varname>min_eager_agg_group_size</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Sets the minimum average group size required to consider applying
+        eager aggregation. This helps avoid the overhead of eager
+        aggregation when it does not offer significant row count reduction.
+        The default is <literal>8</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-jit-above-cost" xreflabel="jit_above_cost">
       <term><varname>jit_above_cost</varname> (<type>floating point</type>)
       <indexterm>
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 9c724ccfabf..48a575c5bda 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1501,3 +1501,92 @@ breaking down aggregation or grouping over a partitioned relation into
 aggregation or grouping over its partitions is called partitionwise
 aggregation.  Especially when the partition keys match the GROUP BY clause,
 this can be significantly faster than the regular method.
+
+Eager aggregation
+-----------------
+
+Eager aggregation is a query optimization technique that partially
+pushes aggregation past a join, and finalizes it once all the
+relations are joined.  Eager aggregation may reduce the number of
+input rows to the join and thus could result in a better overall plan.
+
+To prove that the transformation is correct, we partition the tables
+in the FROM clause into two groups: those that contain at least one
+aggregation column, and those that do not contain any aggregation
+columns.  Each group can be treated as a single relation formed by the
+Cartesian product of the tables within that group.  Therefore, without
+loss of generality, we can assume that the FROM clause contains
+exactly two relations, R1 and R2, where R1 represents the relation
+containing all aggregation columns, and R2 represents the relation
+without any aggregation columns.
+
+Let the query be of the form:
+
+SELECT G, AGG(A)
+FROM R1 JOIN R2 ON J
+GROUP BY G;
+
+where G is the set of grouping keys that may include columns from R1
+and/or R2; AGG(A) is an aggregate function over columns A from R1; J
+is the join condition between R1 and R2.
+
+The transformation of eager aggregation is:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 ON J)
+
+This equivalence holds under the following conditions:
+
+1) AGG is decomposable, meaning that it can be computed in two stages:
+a partial aggregation followed by a final aggregation;
+2) The set G1 used in the pre-aggregation of R1 includes:
+    * all columns from R1 that are part of the grouping keys G, and
+    * all columns from R1 that appear in the join condition J.
+3) The grouping operator for any column in G1 must be compatible with
+the operator used for that column in the join condition J.
+
+Since G1 includes all columns from R1 that appear in either the
+grouping keys G or the join condition J, all rows within each partial
+group have identical values for both the grouping keys and the
+join-relevant columns from R1, assuming compatible operators are used.
+As a result, the rows within a partial group are indistinguishable in
+terms of their contribution to the aggregation and their behavior in
+the join.  This ensures that all rows in the same partial group share
+the same "destiny": they either all match or all fail to match a given
+row in R2.  Because the aggregate function AGG is decomposable,
+aggregating the partial results after the join yields the same final
+result as aggregating after the full join, thereby preserving query
+semantics.  Q.E.D.
+
+One restriction is that we cannot push partial aggregation down to a
+relation that is on the nullable side of an outer join, because the
+NULL-extended rows produced by the outer join would not be available
+when we perform the partial aggregation, while with a
+non-eager-aggregation plan these rows are available for the top-level
+aggregation.  Pushing partial aggregation in this case may result in
+the rows being grouped differently than expected, or produce incorrect
+values from the aggregate functions.
+
+During the construction of the join tree, we evaluate each base or
+join relation to determine if eager aggregation can be applied.  If
+feasible, we create a separate RelOptInfo called a "grouped relation"
+and generate grouped paths by adding sorted and hashed partial
+aggregation paths on top of the non-grouped paths.  To limit planning
+time, we consider only the cheapest or suitably-sorted non-grouped
+paths in this step.
+
+Another way to generate grouped paths is to join a grouped relation
+with a non-grouped relation.  Joining two grouped relations is
+currently not supported.
+
+To further limit planning time, we currently adopt a strategy where
+partial aggregation is pushed only to the lowest feasible level in the
+join tree where it provides a significant reduction in row count.
+This strategy also helps ensure that all grouped paths for the same
+grouped relation produce the same set of rows, which is important to
+support a fundamental assumption of the planner.
+
+If we have generated a grouped relation for the topmost join relation,
+we need to finalize its paths at the end.  The final paths will
+compete in the usual way with paths built from regular planning.
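The equivalence argued in the README text above can be checked on toy data. The following is a minimal, self-contained sketch and not planner code. The row structs, the helper names sum_after_join and sum_eager, and the MAX_KEYS bound are hypothetical. SUM stands in for a decomposable AGG, G is a single column of R2, and G1 is the join column of R1.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_KEYS 16     /* hypothetical bound: keys lie in [0, MAX_KEYS) */

typedef struct { int j; long a; } R1Row;    /* j: join column, a: aggregated */
typedef struct { int j; int g; } R2Row;     /* j: join column, g: grouping */

/* Regular plan: join R1 and R2 on j, then compute SUM(a) GROUP BY g. */
static void
sum_after_join(const R1Row *r1, size_t n1, const R2Row *r2, size_t n2,
               long out[MAX_KEYS])
{
    for (int g = 0; g < MAX_KEYS; g++)
        out[g] = 0;             /* absent groups are simply left at 0 here */

    for (size_t i = 0; i < n1; i++)
        for (size_t k = 0; k < n2; k++)
            if (r1[i].j == r2[k].j)
            {
                assert(r2[k].g >= 0 && r2[k].g < MAX_KEYS);
                out[r2[k].g] += r1[i].a;
            }
}

/*
 * Eager plan: partial SUM(a) GROUP BY j on R1, join the partial groups to
 * R2 on j, then combine the partial sums GROUP BY g.
 */
static void
sum_eager(const R1Row *r1, size_t n1, const R2Row *r2, size_t n2,
          long out[MAX_KEYS])
{
    long partial[MAX_KEYS] = {0};
    int  seen[MAX_KEYS] = {0};

    /* Partial aggregation below the join; G1 = { j }. */
    for (size_t i = 0; i < n1; i++)
    {
        assert(r1[i].j >= 0 && r1[i].j < MAX_KEYS);
        partial[r1[i].j] += r1[i].a;
        seen[r1[i].j] = 1;
    }

    for (int g = 0; g < MAX_KEYS; g++)
        out[g] = 0;

    /* Join each partial group to R2 and finalize by combining sums. */
    for (int j = 0; j < MAX_KEYS; j++)
    {
        if (!seen[j])
            continue;
        for (size_t k = 0; k < n2; k++)
            if (r2[k].j == j)
            {
                assert(r2[k].g >= 0 && r2[k].g < MAX_KEYS);
                out[r2[k].g] += partial[j];
            }
    }
}
```

Calling both functions on the same inputs and comparing the two output arrays element by element exercises the "same destiny" argument: every R1 row in a partial group shares its join column value, so the group as a whole either matches a given R2 row or does not.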
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index f07d1dc8ac6..4a65f955ca6 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -279,6 +279,27 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
 				/* Find and save the cheapest paths for this joinrel */
 				set_cheapest(joinrel);
 
+				/*
+				 * Except for the topmost scan/join rel, consider generating
+				 * partial aggregation paths for the grouped relation on top
+				 * of the paths of this rel.  After that, we're done creating
+				 * paths for the grouped relation, so run set_cheapest().
+				 */
+				if (!bms_equal(joinrel->relids, root->all_query_rels))
+				{
+					RelOptInfo *grouped_rel;
+
+					grouped_rel = joinrel->grouped_rel;
+					if (grouped_rel)
+					{
+						Assert(IS_GROUPED_REL(grouped_rel));
+
+						generate_grouped_paths(root, grouped_rel, joinrel,
+											   grouped_rel->agg_info);
+						set_cheapest(grouped_rel);
+					}
+				}
+
 				/* Absorb new clump into old */
 				old_clump->joinrel = joinrel;
 				old_clump->size += new_clump->size;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 6cc6966b060..7b349a4570e 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -47,6 +48,7 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -77,7 +79,9 @@ typedef enum pushdown_safe_type
 
 /* These parameters are set by GUC */
 bool		enable_geqo = false;	/* just in case GUC doesn't set it */
+bool		enable_eager_aggregate = true;
 int			geqo_threshold;
+double		min_eager_agg_group_size;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
 
@@ -90,6 +94,7 @@ join_search_hook_type join_search_hook = NULL;
 
 static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_base_grouped_rels(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 						 Index rti, RangeTblEntry *rte);
@@ -114,6 +119,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 								Index rti, RangeTblEntry *rte);
 static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 									Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
 static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 										 List *live_childrels,
 										 List *all_child_pathkeys);
@@ -182,6 +188,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	 */
 	set_base_rel_sizes(root);
 
+	/*
+	 * Build grouped relations for base rels where possible.
+	 */
+	setup_base_grouped_rels(root);
+
 	/*
 	 * We should now have size estimates for every actual table involved in
 	 * the query, and we also know which if any have been deleted from the
@@ -323,6 +334,39 @@ set_base_rel_sizes(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_base_grouped_rels
+ *	  For each base relation, build a grouped base relation if eager
+ *	  aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_base_grouped_rels(PlannerInfo *root)
+{
+	Index		rti;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo *rel = root->simple_rel_array[rti];
+
+		/* there may be empty slots corresponding to non-baserel RTEs */
+		if (rel == NULL)
+			continue;
+
+		Assert(rel->relid == rti);	/* sanity check on array */
+		Assert(IS_SIMPLE_REL(rel)); /* sanity check on rel */
+
+		(void) build_simple_grouped_rel(root, rel);
+	}
+}
+
 /*
  * set_base_rel_pathlists
  *	  Finds all paths available for scanning each base-relation entry.
@@ -559,6 +603,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* Now find the cheapest of the paths for this rel */
 	set_cheapest(rel);
 
+	/*
+	 * If a grouped relation for this rel exists, build partial aggregation
+	 * paths for it.
+	 *
+	 * Note that this can only happen after we've called set_cheapest() for
+	 * this base rel, because we need its cheapest paths.
+	 */
+	set_grouped_rel_pathlist(root, rel);
+
 #ifdef OPTIMIZER_DEBUG
 	pprint(rel);
 #endif
@@ -1305,6 +1358,36 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	add_paths_to_append_rel(root, rel, live_childrels);
 }
 
+/*
+ * set_grouped_rel_pathlist
+ *	  If a grouped relation for the given 'rel' exists, build partial
+ *	  aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Add paths to the grouped base relation if one exists. */
+	grouped_rel = rel->grouped_rel;
+	if (grouped_rel)
+	{
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		generate_grouped_paths(root, grouped_rel, rel,
+							   grouped_rel->agg_info);
+		set_cheapest(grouped_rel);
+	}
+}
+
 
 /*
  * add_paths_to_append_rel
@@ -3335,6 +3418,328 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
 	}
 }
 
+/*
+ * generate_grouped_paths
+ *		Generate paths for a grouped relation by adding sorted and hashed
+ *		partial aggregation paths on top of paths of the ungrouped base or join
+ *		relation.
+ *
+ * The information needed is provided by the RelAggInfo structure.
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *grouped_rel,
+					   RelOptInfo *rel, RelAggInfo *agg_info)
+{
+	AggClauseCosts agg_costs;
+	bool		can_hash;
+	bool		can_sort;
+	Path	   *cheapest_total_path = NULL;
+	Path	   *cheapest_partial_path = NULL;
+	double		dNumGroups = 0;
+	double		dNumPartialGroups = 0;
+
+	if (IS_DUMMY_REL(rel))
+	{
+		mark_dummy_rel(grouped_rel);
+		return;
+	}
+
+	/*
+	 * We push partial aggregation only to the lowest possible level in the
+	 * join tree that is deemed useful.
+	 */
+	if (!bms_equal(agg_info->apply_at, rel->relids) ||
+		!agg_info->agg_useful)
+		return;
+
+	MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+	get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+	/*
+	 * Determine whether it's possible to perform sort-based implementations
+	 * of grouping.
+	 */
+	can_sort = grouping_is_sortable(agg_info->group_clauses);
+
+	/*
+	 * Determine whether we should consider hash-based implementations of
+	 * grouping.
+	 */
+	Assert(root->numOrderedAggs == 0);
+	can_hash = (agg_info->group_clauses != NIL &&
+				grouping_is_hashable(agg_info->group_clauses));
+
+	/*
+	 * Consider whether we should generate partially aggregated non-partial
+	 * paths.  We can only do this if we have a non-partial path.
+	 */
+	if (rel->pathlist != NIL)
+	{
+		cheapest_total_path = rel->cheapest_total_path;
+		Assert(cheapest_total_path != NULL);
+	}
+
+	/*
+	 * If parallelism is possible for grouped_rel, then we should consider
+	 * generating partially-grouped partial paths.  However, if the ungrouped
+	 * rel has no partial paths, then we can't.
+	 */
+	if (grouped_rel->consider_parallel && rel->partial_pathlist != NIL)
+	{
+		cheapest_partial_path = linitial(rel->partial_pathlist);
+		Assert(cheapest_partial_path != NULL);
+	}
+
+	/* Estimate number of partial groups. */
+	if (cheapest_total_path != NULL)
+		dNumGroups = estimate_num_groups(root,
+										 agg_info->group_exprs,
+										 cheapest_total_path->rows,
+										 NULL, NULL);
+	if (cheapest_partial_path != NULL)
+		dNumPartialGroups = estimate_num_groups(root,
+												agg_info->group_exprs,
+												cheapest_partial_path->rows,
+												NULL, NULL);
+
+	if (can_sort && cheapest_total_path != NULL)
+	{
+		ListCell   *lc;
+
+		/*
+		 * Use any available suitably-sorted path as input, and also consider
+		 * sorting the cheapest-total path and incremental sort on any paths
+		 * with presorted keys.
+		 *
+		 * To save planning time, we ignore parameterized input paths unless
+		 * they are the cheapest-total path.
+		 */
+		foreach(lc, rel->pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Ignore parameterized paths that are not the cheapest-total
+			 * path.
+			 */
+			if (input_path->param_info &&
+				input_path != cheapest_total_path)
+				continue;
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest total path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_total_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumGroups);
+
+			add_path(grouped_rel, path);
+		}
+	}
+
+	if (can_sort && cheapest_partial_path != NULL)
+	{
+		ListCell   *lc;
+
+		/* Similar to above logic, but for partial paths. */
+		foreach(lc, rel->partial_pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest partial path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_partial_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumPartialGroups);
+
+			add_partial_path(grouped_rel, path);
+		}
+	}
+
+	/*
+	 * Add a partially-grouped HashAgg Path where possible
+	 */
+	if (can_hash && cheapest_total_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_total_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumGroups);
+
+		add_path(grouped_rel, path);
+	}
+
+	/*
+	 * Now add a partially-grouped HashAgg partial Path where possible
+	 */
+	if (can_hash && cheapest_partial_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_partial_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumPartialGroups);
+
+		add_partial_path(grouped_rel, path);
+	}
+}
+
 /*
  * make_rel_from_joinlist
  *	  Build access paths using a "joinlist" to guide the join path search.
@@ -3494,6 +3899,10 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 		 *
 		 * After that, we're done creating paths for the joinrel, so run
 		 * set_cheapest().
+		 *
+		 * In addition, we also run generate_grouped_paths() for the grouped
+		 * relation of each just-processed joinrel, and run set_cheapest() for
+		 * the grouped relation afterwards.
 		 */
 		foreach(lc, root->join_rel_level[lev])
 		{
@@ -3514,6 +3923,27 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 			/* Find and save the cheapest paths for this rel */
 			set_cheapest(rel);
 
+			/*
+			 * Except for the topmost scan/join rel, consider generating
+			 * partial aggregation paths for the grouped relation on top of
+			 * the paths of this rel.  After that, we're done creating paths
+			 * for the grouped relation, so run set_cheapest().
+			 */
+			if (!bms_equal(rel->relids, root->all_query_rels))
+			{
+				RelOptInfo *grouped_rel;
+
+				grouped_rel = rel->grouped_rel;
+				if (grouped_rel)
+				{
+					Assert(IS_GROUPED_REL(grouped_rel));
+
+					generate_grouped_paths(root, grouped_rel, rel,
+										   grouped_rel->agg_info);
+					set_cheapest(grouped_rel);
+				}
+			}
+
 #ifdef OPTIMIZER_DEBUG
 			pprint(rel);
 #endif
@@ -4383,6 +4813,29 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 		if (IS_DUMMY_REL(child_rel))
 			continue;
 
+		/*
+		 * Except for the topmost scan/join rel, consider generating partial
+		 * aggregation paths for the grouped relation on top of the paths of
+		 * this partitioned child-join.  After that, we're done creating paths
+		 * for the grouped relation, so run set_cheapest().
+		 */
+		if (!bms_equal(IS_OTHER_REL(rel) ?
+					   rel->top_parent_relids : rel->relids,
+					   root->all_query_rels))
+		{
+			RelOptInfo *grouped_rel;
+
+			grouped_rel = child_rel->grouped_rel;
+			if (grouped_rel)
+			{
+				Assert(IS_GROUPED_REL(grouped_rel));
+
+				generate_grouped_paths(root, grouped_rel, child_rel,
+									   grouped_rel->agg_info);
+				set_cheapest(grouped_rel);
+			}
+		}
+
 #ifdef OPTIMIZER_DEBUG
 		pprint(child_rel);
 #endif
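The per-path decision made in the two sorted-aggregation loops of generate_grouped_paths() above can be summarized as a small pure function. This is a sketch for illustration only: the enum SortChoice and the function choose_sort_strategy are hypothetical names, and the separate check that skips parameterized paths is left out.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for what happens to one input path. */
typedef enum
{
    INPUT_SKIP,                 /* path is not considered at all */
    INPUT_USE_AS_IS,            /* already sorted on the grouping keys */
    INPUT_FULL_SORT,            /* add an explicit full sort */
    INPUT_INCREMENTAL_SORT      /* add an incremental sort */
} SortChoice;

static SortChoice
choose_sort_strategy(bool is_sorted, int presorted_keys,
                     bool is_cheapest, bool incremental_sort_enabled)
{
    /* True when an incremental sort cannot be used for this path. */
    bool no_incremental = (presorted_keys == 0 || !incremental_sort_enabled);

    assert(presorted_keys >= 0);

    if (is_sorted)
        return INPUT_USE_AS_IS;

    /* Unsorted paths survive only if cheapest or usefully presorted. */
    if (!is_cheapest && no_incremental)
        return INPUT_SKIP;

    return no_incremental ? INPUT_FULL_SORT : INPUT_INCREMENTAL_SORT;
}
```

Under this reading, the cheapest path is always kept and gets a full sort when it has no presorted keys, while any other unsorted path is kept only when an incremental sort is both possible and enabled.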
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index aad41b94009..477b0bc3b84 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -16,6 +16,7 @@
 
 #include "miscadmin.h"
 #include "optimizer/appendinfo.h"
+#include "optimizer/cost.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -35,6 +36,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
 static bool restriction_is_constant_false(List *restrictlist,
 										  RelOptInfo *joinrel,
 										  bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+								  RelOptInfo *rel2, RelOptInfo *joinrel,
+								  SpecialJoinInfo *sjinfo, List *restrictlist);
 static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 										RelOptInfo *rel2, RelOptInfo *joinrel,
 										SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -763,6 +767,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 		return joinrel;
 	}
 
+	/* Build a grouped join relation for 'joinrel' if possible. */
+	make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+						  restrictlist);
+
 	/* Add paths to the join relation. */
 	populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
 								restrictlist);
@@ -874,6 +882,186 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
 	return input_relids;
 }
 
+/*
+ * make_grouped_join_rel
+ *	  Build a grouped join relation for the given "joinrel" if eager
+ *	  aggregation is applicable and the resulting grouped paths are considered
+ *	  useful.
+ *
+ * There are two strategies for generating grouped paths for a join relation:
+ *
+ * 1. Join a grouped (partially aggregated) input relation with a non-grouped
+ * input (e.g., AGG(B) JOIN A).
+ *
+ * 2. Apply partial aggregation (sorted or hashed) on top of existing
+ * non-grouped join paths (e.g., AGG(A JOIN B)).
+ *
+ * To limit planning effort and avoid an explosion of alternatives, we adopt a
+ * strategy where partial aggregation is only pushed to the lowest possible
+ * level in the join tree that is deemed useful.  That is, if grouped paths can
+ * be built using the first strategy, we skip consideration of the second
+ * strategy for the same join level.
+ *
+ * Additionally, if there are multiple lowest useful levels where partial
+ * aggregation could be applied, such as in a join tree with relations A, B,
+ * and C where both "AGG(A JOIN B) JOIN C" and "A JOIN AGG(B JOIN C)" are valid
+ * placements, we choose only the first one encountered during join search.
+ * This avoids generating multiple versions of the same grouped relation based
+ * on different aggregation placements.
+ *
+ * These heuristics also ensure that all grouped paths for the same grouped
+ * relation produce the same set of rows, which is a basic assumption in the
+ * planner.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+					  RelOptInfo *rel2, RelOptInfo *joinrel,
+					  SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+	RelOptInfo *grouped_rel;
+	RelOptInfo *grouped_rel1;
+	RelOptInfo *grouped_rel2;
+	bool		rel1_empty;
+	bool		rel2_empty;
+	Relids		agg_apply_at;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Retrieve the grouped relations for the two input rels */
+	grouped_rel1 = rel1->grouped_rel;
+	grouped_rel2 = rel2->grouped_rel;
+
+	rel1_empty = (grouped_rel1 == NULL || IS_DUMMY_REL(grouped_rel1));
+	rel2_empty = (grouped_rel2 == NULL || IS_DUMMY_REL(grouped_rel2));
+
+	/* Find or construct a grouped joinrel for this joinrel */
+	grouped_rel = joinrel->grouped_rel;
+	if (grouped_rel == NULL)
+	{
+		RelAggInfo *agg_info = NULL;
+
+		/*
+		 * Prepare the information needed to create grouped paths for this
+		 * join relation.
+		 */
+		agg_info = create_rel_agg_info(root, joinrel);
+		if (agg_info == NULL)
+			return;
+
+		/*
+		 * If grouped paths for the given join relation are not considered
+		 * useful, and no grouped paths can be built by joining grouped input
+		 * relations, skip building the grouped join relation.
+		 */
+		if (!agg_info->agg_useful &&
+			(rel1_empty == rel2_empty))
+			return;
+
+		/* build the grouped relation */
+		grouped_rel = build_grouped_rel(root, joinrel);
+		grouped_rel->reltarget = agg_info->target;
+
+		if (rel1_empty != rel2_empty)
+		{
+			/*
+			 * If there is exactly one grouped input relation, then we can
+			 * build grouped paths by joining the input relations.  Set size
+			 * estimates for the grouped join relation based on the input
+			 * relations, and update the lowest join level where partial
+			 * aggregation is applied to that of the grouped input relation.
+			 */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			agg_info->apply_at = rel1_empty ?
+				grouped_rel2->agg_info->apply_at :
+				grouped_rel1->agg_info->apply_at;
+		}
+		else
+		{
+			/*
+			 * Otherwise, grouped paths can be built by applying partial
+			 * aggregation on top of existing non-grouped join paths.  Set
+			 * size estimates for the grouped join relation based on the
+			 * estimated number of groups, and track the lowest join level
+			 * where partial aggregation is applied.  Note that these values
+			 * may be updated later if it is determined that grouped paths can
+			 * be constructed by joining other input relations.
+			 */
+			grouped_rel->rows = agg_info->grouped_rows;
+			agg_info->apply_at = bms_copy(joinrel->relids);
+		}
+
+		grouped_rel->agg_info = agg_info;
+		joinrel->grouped_rel = grouped_rel;
+	}
+
+	Assert(IS_GROUPED_REL(grouped_rel));
+
+	/* We may have already proven this grouped join relation to be dummy. */
+	if (IS_DUMMY_REL(grouped_rel))
+		return;
+
+	/*
+	 * Nothing to do if there's no grouped input relation.  Also, joining two
+	 * grouped relations is not currently supported.
+	 */
+	if (rel1_empty == rel2_empty)
+		return;
+
+	/*
+	 * Get the lowest join level where partial aggregation is applied among
+	 * the given input relations.
+	 */
+	agg_apply_at = rel1_empty ?
+		grouped_rel2->agg_info->apply_at :
+		grouped_rel1->agg_info->apply_at;
+
+	/*
+	 * If it's not the designated level, skip building grouped paths.
+	 *
+	 * One exception is when it is a subset of the previously recorded level.
+	 * In that case, we need to update the designated level to this one, and
+	 * adjust the size estimates for the grouped join relation accordingly.
+	 * For example, suppose partial aggregation can be applied on top of (B
+	 * JOIN C).  If we first construct the join as ((A JOIN B) JOIN C), we'd
+	 * record the designated level as including all three relations (A B C).
+	 * Later, when we consider (A JOIN (B JOIN C)), we encounter the smaller
+	 * (B C) join level directly.  Since this is a subset of the previous
+	 * level and still valid for partial aggregation, we update the designated
+	 * level to (B C), and adjust the size estimates accordingly.
+	 */
+	if (!bms_equal(agg_apply_at, grouped_rel->agg_info->apply_at))
+	{
+		if (bms_is_subset(agg_apply_at, grouped_rel->agg_info->apply_at))
+		{
+			/* Adjust the size estimates for the grouped join relation. */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			grouped_rel->agg_info->apply_at = agg_apply_at;
+		}
+		else
+			return;
+	}
+
+	/* Make paths for the grouped join relation. */
+	populate_joinrel_with_paths(root,
+								rel1_empty ? rel1 : grouped_rel1,
+								rel2_empty ? rel2 : grouped_rel2,
+								grouped_rel,
+								sjinfo,
+								restrictlist);
+}
+
 /*
  * populate_joinrel_with_paths
  *	  Add paths to the given joinrel for given pair of joining relations. The
@@ -1615,6 +1803,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 						 adjust_child_relids(joinrel->relids,
 											 nappinfos, appinfos)));
 
+		/* Build a grouped join relation for 'child_joinrel' if possible */
+		make_grouped_join_rel(root, child_rel1, child_rel2,
+							  child_joinrel, child_sjinfo,
+							  child_restrictlist);
+
 		/* And make paths for the child join */
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
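The "designated level" rule described in make_grouped_join_rel() above can be illustrated with plain bitmasks. This is a sketch under stated assumptions: RelSet is a hypothetical stand-in for Relids, accept_apply_level is a hypothetical helper, and the size-estimate adjustment that the patch performs when the level is lowered is omitted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t RelSet;        /* hypothetical stand-in for Relids */

/* True if every member of 'a' is also a member of 'b'. */
static bool
relset_is_subset(RelSet a, RelSet b)
{
    return (a & ~b) == 0;
}

/*
 * Decide whether grouped join paths should be built for an input pair whose
 * partial aggregation sits at level 'candidate'.  If 'candidate' is a subset
 * of the recorded level, the recorded level is lowered to it.
 */
static bool
accept_apply_level(RelSet *designated, RelSet candidate)
{
    if (candidate == *designated)
        return true;

    if (relset_is_subset(candidate, *designated))
    {
        *designated = candidate;
        return true;
    }

    return false;               /* a different, non-nested placement */
}
```

With relations A, B and C as bits 1, 2 and 4, a level first recorded as (A B C) is lowered to (B C) once the (A JOIN (B JOIN C)) order is considered, matching the example in the comment. A later candidate of (A B) is then rejected because it is not a subset of (B C).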
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 3e3fec89252..9cc8c558ccf 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/nbtree.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
@@ -31,6 +32,7 @@
 #include "optimizer/restrictinfo.h"
 #include "parser/analyze.h"
 #include "rewrite/rewriteManip.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/typcache.h"
@@ -81,6 +83,9 @@ typedef struct JoinTreeItem
 } JoinTreeItem;
 
 
+static bool is_partial_agg_memory_risky(PlannerInfo *root);
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
 static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
 									   Index rtindex);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -628,6 +633,323 @@ remove_useless_groupby_columns(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_eager_aggregation
+ *	  Check if eager aggregation is applicable, and if so collect suitable
+ *	  aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+	/*
+	 * Don't apply eager aggregation if disabled by user.
+	 */
+	if (!enable_eager_aggregate)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if there are no available GROUP BY
+	 * clauses.
+	 */
+	if (!root->processed_groupClause)
+		return;
+
+	/*
+	 * For now we don't try to support grouping sets.
+	 */
+	if (root->parse->groupingSets)
+		return;
+
+	/*
+	 * For now we don't try to support DISTINCT or ORDER BY aggregates.
+	 */
+	if (root->numOrderedAggs > 0)
+		return;
+
+	/*
+	 * If there are any aggregates that do not support partial mode, or any
+	 * partial aggregates that are non-serializable, do not apply eager
+	 * aggregation.
+	 */
+	if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+		return;
+
+	/*
+	 * We don't try to apply eager aggregation if there are set-returning
+	 * functions in the targetlist.
+	 */
+	if (root->parse->hasTargetSRFs)
+		return;
+
+	/*
+	 * Eager aggregation only makes sense if there are multiple base rels in
+	 * the query.
+	 */
+	if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if any aggregate poses a risk of
+	 * excessive memory usage during partial aggregation.
+	 */
+	if (is_partial_agg_memory_risky(root))
+		return;
+
+	/*
+	 * Collect aggregate expressions and plain Vars that appear in the
+	 * targetlist and havingQual.
+	 */
+	create_agg_clause_infos(root);
+
+	/*
+	 * If there are no suitable aggregate expressions, we cannot apply eager
+	 * aggregation.
+	 */
+	if (root->agg_clause_list == NIL)
+		return;
+
+	/*
+	 * Collect grouping expressions that appear in grouping clauses.
+	 */
+	create_grouping_expr_infos(root);
+}
+
+/*
+ * is_partial_agg_memory_risky
+ *	  Checks if any aggregate poses a risk of excessive memory usage during
+ *	  partial aggregation.
+ *
+ * We check if any aggregate uses INTERNAL transition type.  Although INTERNAL
+ * is marked as pass-by-value, it usually points to a large internal data
+ * structure (like those used by string_agg or array_agg).  These transition
+ * states can grow large and their size is hard to estimate.  Applying eager
+ * aggregation in such cases risks high memory usage since partial aggregation
+ * results might be stored in join hash tables or materialized nodes.
+ *
+ * We explicitly exclude aggregates with F_NUMERIC_AVG_ACCUM transition
+ * function from this check, based on the assumption that avg(numeric) and
+ * sum(numeric) are safe in this context.
+ */
+static bool
+is_partial_agg_memory_risky(PlannerInfo *root)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->aggtransinfos)
+	{
+		AggTransInfo *transinfo = lfirst_node(AggTransInfo, lc);
+
+		if (transinfo->transfn_oid == F_NUMERIC_AVG_ACCUM)
+			continue;
+
+		if (transinfo->aggtranstype == INTERNALOID)
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * create_agg_clause_infos
+ *	  Search the targetlist and havingQual for Aggrefs and plain Vars, and
+ *	  create an AggClauseInfo for each Aggref node.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+	List	   *tlist_exprs;
+	List	   *agg_clause_list = NIL;
+	List	   *tlist_vars = NIL;
+	Relids		aggregate_relids = NULL;
+	bool		eager_agg_applicable = true;
+	ListCell   *lc;
+
+	Assert(root->agg_clause_list == NIL);
+	Assert(root->tlist_vars == NIL);
+
+	tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+								  PVC_INCLUDE_AGGREGATES |
+								  PVC_RECURSE_WINDOWFUNCS |
+								  PVC_RECURSE_PLACEHOLDERS);
+
+	/*
+	 * Aggregates within the HAVING clause need to be processed in the same
+	 * way as those in the targetlist.  Note that HAVING can contain Aggrefs
+	 * but not WindowFuncs.
+	 */
+	if (root->parse->havingQual != NULL)
+	{
+		List	   *having_exprs;
+
+		having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+									   PVC_INCLUDE_AGGREGATES |
+									   PVC_RECURSE_PLACEHOLDERS);
+		if (having_exprs != NIL)
+		{
+			tlist_exprs = list_concat(tlist_exprs, having_exprs);
+			list_free(having_exprs);
+		}
+	}
+
+	foreach(lc, tlist_exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Aggref	   *aggref;
+		Relids		agg_eval_at;
+		AggClauseInfo *ac_info;
+
+		/* For now we don't try to support GROUPING() expressions */
+		if (IsA(expr, GroupingFunc))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* Collect plain Vars for future reference */
+		if (IsA(expr, Var))
+		{
+			tlist_vars = list_append_unique(tlist_vars, expr);
+			continue;
+		}
+
+		aggref = castNode(Aggref, expr);
+
+		Assert(aggref->aggorder == NIL);
+		Assert(aggref->aggdistinct == NIL);
+
+		/*
+		 * If there are any securityQuals, do not try to apply eager
+		 * aggregation if any non-leakproof aggregate functions are present.
+		 * This is overly strict, but for now...
+		 */
+		if (root->qual_security_level > 0 &&
+			!get_func_leakproof(aggref->aggfnoid))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+		/*
+		 * If all base relations in the query are referenced by aggregate
+		 * functions, then eager aggregation is not applicable.
+		 */
+		aggregate_relids = bms_add_members(aggregate_relids, agg_eval_at);
+		if (bms_is_subset(root->all_baserels, aggregate_relids))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* OK, create the AggClauseInfo node */
+		ac_info = makeNode(AggClauseInfo);
+		ac_info->aggref = aggref;
+		ac_info->agg_eval_at = agg_eval_at;
+
+		/* ... and add it to the list */
+		agg_clause_list = list_append_unique(agg_clause_list, ac_info);
+	}
+
+	list_free(tlist_exprs);
+
+	if (eager_agg_applicable)
+	{
+		root->agg_clause_list = agg_clause_list;
+		root->tlist_vars = tlist_vars;
+	}
+	else
+	{
+		list_free_deep(agg_clause_list);
+		list_free(tlist_vars);
+	}
+}
+
+/*
+ * create_grouping_expr_infos
+ *	  Create a GroupingExprInfo for each expression usable as grouping key.
+ *
+ * If any grouping expression is not suitable, we will just return with
+ * root->group_expr_list being NIL.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+	List	   *exprs = NIL;
+	List	   *sortgrouprefs = NIL;
+	List	   *btree_opfamilies = NIL;
+	ListCell   *lc,
+			   *lc1,
+			   *lc2,
+			   *lc3;
+
+	Assert(root->group_expr_list == NIL);
+
+	foreach(lc, root->processed_groupClause)
+	{
+		SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+		TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+		TypeCacheEntry *tce;
+		Oid			equalimageproc;
+
+		Assert(tle->ressortgroupref > 0);
+
+		/*
+		 * For now we only support plain Vars as grouping expressions.
+		 */
+		if (!IsA(tle->expr, Var))
+			return;
+
+		/*
+		 * Eager aggregation is only possible if equality implies image
+		 * equality for each grouping key.  Otherwise, placing keys with
+		 * different byte images into the same group may result in the loss of
+		 * information that could be necessary to evaluate upper qual clauses.
+		 *
+		 * For instance, the NUMERIC data type is not supported, as values
+		 * that are considered equal by the equality operator (e.g., 0 and
+		 * 0.0) can have different scales.
+		 */
+		tce = lookup_type_cache(exprType((Node *) tle->expr),
+								TYPECACHE_BTREE_OPFAMILY);
+		if (!OidIsValid(tce->btree_opf) ||
+			!OidIsValid(tce->btree_opintype))
+			return;
+
+		equalimageproc = get_opfamily_proc(tce->btree_opf,
+										   tce->btree_opintype,
+										   tce->btree_opintype,
+										   BTEQUALIMAGE_PROC);
+		if (!OidIsValid(equalimageproc) ||
+			!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+											   tce->typcollation,
+											   ObjectIdGetDatum(tce->btree_opintype))))
+			return;
+
+		exprs = lappend(exprs, tle->expr);
+		sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+		btree_opfamilies = lappend_oid(btree_opfamilies, tce->btree_opf);
+	}
+
+	/*
+	 * Construct a GroupingExprInfo for each expression.
+	 */
+	forthree(lc1, exprs, lc2, sortgrouprefs, lc3, btree_opfamilies)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc1);
+		int			sortgroupref = lfirst_int(lc2);
+		Oid			btree_opfamily = lfirst_oid(lc3);
+		GroupingExprInfo *ge_info;
+
+		ge_info = makeNode(GroupingExprInfo);
+		ge_info->expr = (Expr *) copyObject(expr);
+		ge_info->sortgroupref = sortgroupref;
+		ge_info->btree_opfamily = btree_opfamily;
+
+		root->group_expr_list = lappend(root->group_expr_list, ge_info);
+	}
+}
+
 /*****************************************************************************
  *
  *	  LATERAL REFERENCES
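[Reader's note on the image-equality test in create_grouping_expr_infos() above.]
The comment there says that grouping is only safe when equality implies image equality, and gives NUMERIC 0 versus 0.0 as the counterexample. The following is a minimal standalone sketch of that distinction, not PostgreSQL code: Dec, pow10_i64, dec_equal and dec_image_equal are hypothetical names, and the toy decimal only loosely imitates a value that carries a display scale.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical toy decimal: value = digits / 10^scale (not NUMERIC itself). */
typedef struct Dec
{
	int64_t		digits;
	int			scale;
} Dec;

/* Small integer power of ten; only meant for the tiny scales used here. */
static int64_t
pow10_i64(int n)
{
	int64_t		r = 1;

	assert(n >= 0 && n <= 18);
	while (n-- > 0)
		r *= 10;
	return r;
}

/* Equality as an "=" operator would see it: compare the numeric values. */
static bool
dec_equal(Dec a, Dec b)
{
	int			s = (a.scale > b.scale) ? a.scale : b.scale;

	return a.digits * pow10_i64(s - a.scale) ==
		b.digits * pow10_i64(s - b.scale);
}

/* Image equality: every stored field must match. */
static bool
dec_image_equal(Dec a, Dec b)
{
	return a.digits == b.digits && a.scale == b.scale;
}
```

With {0, 0} standing for 0 and {0, 1} standing for 0.0, dec_equal() reports a match while dec_image_equal() does not. If both rows were collapsed into one group below a join, an upper qual that inspects the scale could no longer tell them apart, which is the information loss the patch comment describes.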
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 5467e094ca7..eefc486a566 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -76,6 +76,9 @@ query_planner(PlannerInfo *root,
 	root->placeholder_list = NIL;
 	root->placeholder_array = NULL;
 	root->placeholder_array_size = 0;
+	root->agg_clause_list = NIL;
+	root->group_expr_list = NIL;
+	root->tlist_vars = NIL;
 	root->fkey_list = NIL;
 	root->initial_rels = NIL;
 
@@ -265,6 +268,12 @@ query_planner(PlannerInfo *root,
 	 */
 	extract_restriction_or_clauses(root);
 
+	/*
+	 * Check if eager aggregation is applicable, and if so, set up
+	 * root->agg_clause_list and root->group_expr_list.
+	 */
+	setup_eager_aggregation(root);
+
 	/*
 	 * Now expand appendrels by adding "otherrels" for their children.  We
 	 * delay this to the end so that we have as much information as possible
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index d59d6e4c6a0..d361319d0b5 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -231,7 +231,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 									  RelOptInfo *partially_grouped_rel,
 									  const AggClauseCosts *agg_costs,
 									  grouping_sets_data *gd,
-									  double dNumGroups,
 									  GroupPathExtraData *extra);
 static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
 												 RelOptInfo *grouped_rel,
@@ -3971,9 +3970,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 							   GroupPathExtraData *extra,
 							   RelOptInfo **partially_grouped_rel_p)
 {
-	Path	   *cheapest_path = input_rel->cheapest_total_path;
 	RelOptInfo *partially_grouped_rel = NULL;
-	double		dNumGroups;
 	PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
 
 	/*
@@ -4055,23 +4052,16 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Gather any partially grouped partial paths. */
 	if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
-	{
 		gather_grouping_paths(root, partially_grouped_rel);
-		set_cheapest(partially_grouped_rel);
-	}
 
-	/*
-	 * Estimate number of groups.
-	 */
-	dNumGroups = get_number_of_groups(root,
-									  cheapest_path->rows,
-									  gd,
-									  extra->targetList);
+	/* Now choose the best path(s) for partially_grouped_rel. */
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+		set_cheapest(partially_grouped_rel);
 
 	/* Build final grouping paths */
 	add_paths_to_grouping_rel(root, input_rel, grouped_rel,
 							  partially_grouped_rel, agg_costs, gd,
-							  dNumGroups, extra);
+							  extra);
 
 	/* Give a helpful error if we failed to find any implementation */
 	if (grouped_rel->pathlist == NIL)
@@ -7016,16 +7006,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 						  RelOptInfo *grouped_rel,
 						  RelOptInfo *partially_grouped_rel,
 						  const AggClauseCosts *agg_costs,
-						  grouping_sets_data *gd, double dNumGroups,
+						  grouping_sets_data *gd,
 						  GroupPathExtraData *extra)
 {
 	Query	   *parse = root->parse;
 	Path	   *cheapest_path = input_rel->cheapest_total_path;
+	Path	   *cheapest_partially_grouped_path = NULL;
 	ListCell   *lc;
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 	List	   *havingQual = (List *) extra->havingQual;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+	double		dNumGroups = 0;
+	double		dNumFinalGroups = 0;
+
+	/*
+	 * Estimate number of groups for non-split aggregation.
+	 */
+	dNumGroups = get_number_of_groups(root,
+									  cheapest_path->rows,
+									  gd,
+									  extra->targetList);
+
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+	{
+		cheapest_partially_grouped_path =
+			partially_grouped_rel->cheapest_total_path;
+
+		/*
+		 * Estimate number of groups for final phase of partial aggregation.
+		 */
+		dNumFinalGroups =
+			get_number_of_groups(root,
+								 cheapest_partially_grouped_path->rows,
+								 gd,
+								 extra->targetList);
+	}
 
 	if (can_sort)
 	{
@@ -7138,7 +7154,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 					path = make_ordered_path(root,
 											 grouped_rel,
 											 path,
-											 partially_grouped_rel->cheapest_total_path,
+											 cheapest_partially_grouped_path,
 											 info->pathkeys,
 											 -1.0);
 
@@ -7156,7 +7172,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												 info->clauses,
 												 havingQual,
 												 agg_final_costs,
-												 dNumGroups));
+												 dNumFinalGroups));
 					else
 						add_path(grouped_rel, (Path *)
 								 create_group_path(root,
@@ -7164,7 +7180,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												   path,
 												   info->clauses,
 												   havingQual,
-												   dNumGroups));
+												   dNumFinalGroups));
 
 				}
 			}
@@ -7206,19 +7222,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 		 */
 		if (partially_grouped_rel && partially_grouped_rel->pathlist)
 		{
-			Path	   *path = partially_grouped_rel->cheapest_total_path;
-
 			add_path(grouped_rel, (Path *)
 					 create_agg_path(root,
 									 grouped_rel,
-									 path,
+									 cheapest_partially_grouped_path,
 									 grouped_rel->reltarget,
 									 AGG_HASHED,
 									 AGGSPLIT_FINAL_DESERIAL,
 									 root->processed_groupClause,
 									 havingQual,
 									 agg_final_costs,
-									 dNumGroups));
+									 dNumFinalGroups));
 		}
 	}
 
@@ -7258,6 +7272,7 @@ create_partial_grouping_paths(PlannerInfo *root,
 {
 	Query	   *parse = root->parse;
 	RelOptInfo *partially_grouped_rel;
+	RelOptInfo *eager_agg_rel = NULL;
 	AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
 	Path	   *cheapest_partial_path = NULL;
@@ -7268,6 +7283,15 @@ create_partial_grouping_paths(PlannerInfo *root,
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 
+	/*
+	 * Check whether any partially aggregated paths have been generated
+	 * through eager aggregation.
+	 */
+	if (input_rel->grouped_rel &&
+		!IS_DUMMY_REL(input_rel->grouped_rel) &&
+		input_rel->grouped_rel->pathlist != NIL)
+		eager_agg_rel = input_rel->grouped_rel;
+
 	/*
 	 * Consider whether we should generate partially aggregated non-partial
 	 * paths.  We can only do this if we have a non-partial path, and only if
@@ -7289,11 +7313,13 @@ create_partial_grouping_paths(PlannerInfo *root,
 
 	/*
 	 * If we can't partially aggregate partial paths, and we can't partially
-	 * aggregate non-partial paths, then don't bother creating the new
+	 * aggregate non-partial paths, and no partially aggregated paths were
+	 * generated by eager aggregation, then don't bother creating the new
 	 * RelOptInfo at all, unless the caller specified force_rel_creation.
 	 */
 	if (cheapest_total_path == NULL &&
 		cheapest_partial_path == NULL &&
+		eager_agg_rel == NULL &&
 		!force_rel_creation)
 		return NULL;
 
@@ -7518,6 +7544,51 @@ create_partial_grouping_paths(PlannerInfo *root,
 										 dNumPartialPartialGroups));
 	}
 
+	/*
+	 * Add any partially aggregated paths generated by eager aggregation to
+	 * the new upper relation after applying projection steps as needed.
+	 */
+	if (eager_agg_rel)
+	{
+		/* Add the paths */
+		foreach(lc, eager_agg_rel->pathlist)
+		{
+			Path	   *path = (Path *) lfirst(lc);
+
+			/* Shouldn't have any parameterized paths anymore */
+			Assert(path->param_info == NULL);
+
+			path = (Path *) create_projection_path(root,
+												   partially_grouped_rel,
+												   path,
+												   partially_grouped_rel->reltarget);
+
+			add_path(partially_grouped_rel, path);
+		}
+
+		/*
+		 * Likewise add the partial paths, but only if parallelism is possible
+		 * for partially_grouped_rel.
+		 */
+		if (partially_grouped_rel->consider_parallel)
+		{
+			foreach(lc, eager_agg_rel->partial_pathlist)
+			{
+				Path	   *path = (Path *) lfirst(lc);
+
+				/* Shouldn't have any parameterized paths anymore */
+				Assert(path->param_info == NULL);
+
+				path = (Path *) create_projection_path(root,
+													   partially_grouped_rel,
+													   path,
+													   partially_grouped_rel->reltarget);
+
+				add_partial_path(partially_grouped_rel, path);
+			}
+		}
+	}
+
 	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding partially grouped ForeignPaths.
@@ -8081,13 +8152,6 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
 
 		add_paths_to_append_rel(root, partially_grouped_rel,
 								partially_grouped_live_children);
-
-		/*
-		 * We need call set_cheapest, since the finalization step will use the
-		 * cheapest path from the rel.
-		 */
-		if (partially_grouped_rel->pathlist)
-			set_cheapest(partially_grouped_rel);
 	}
 
 	/* If possible, create append paths for fully grouped children. */
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index 5b3dc0d8653..11c0eb0d180 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -516,6 +516,65 @@ adjust_appendrel_attrs_mutator(Node *node,
 		return (Node *) newinfo;
 	}
 
+	/*
+	 * We have to process RelAggInfo nodes specially.
+	 */
+	if (IsA(node, RelAggInfo))
+	{
+		RelAggInfo *oldinfo = (RelAggInfo *) node;
+		RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newinfo, oldinfo, sizeof(RelAggInfo));
+
+		newinfo->relids = adjust_child_relids(oldinfo->relids,
+											  nappinfos, appinfos);
+
+		newinfo->target = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+										   context);
+
+		newinfo->agg_input = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+										   context);
+
+		newinfo->group_clauses = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_clauses,
+										   context);
+
+		newinfo->group_exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+										   context);
+
+		return (Node *) newinfo;
+	}
+
+	/*
+	 * We have to process PathTarget nodes specially.
+	 */
+	if (IsA(node, PathTarget))
+	{
+		PathTarget *oldtarget = (PathTarget *) node;
+		PathTarget *newtarget = makeNode(PathTarget);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+		newtarget->exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+										   context);
+
+		if (oldtarget->sortgrouprefs)
+		{
+			Size		nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+			newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+			memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+		}
+
+		return (Node *) newtarget;
+	}
+
 	/*
 	 * NOTE: we do not need to recurse into sublinks, because they should
 	 * already have been converted to subplans before we see them.
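[Reader's note on the PathTarget branch of adjust_appendrel_attrs_mutator() above.]
That branch flat-copies the node with memcpy() and then replaces the pointer fields, including a fresh copy of the sortgrouprefs array, so the child target does not share storage with the parent. The following is a minimal standalone sketch of that pattern under simplifying assumptions: ToyTarget and toy_target_copy are hypothetical names, and malloc() stands in for palloc().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a node with one out-of-line array. */
typedef struct ToyTarget
{
	int			nexprs;
	unsigned int *sortgrouprefs;	/* may be NULL */
} ToyTarget;

/* Flat-copy the struct, then give the copy its own sortgrouprefs array. */
static ToyTarget *
toy_target_copy(const ToyTarget *old)
{
	ToyTarget  *copy = malloc(sizeof(ToyTarget));

	assert(copy != NULL);
	memcpy(copy, old, sizeof(ToyTarget));

	if (old->sortgrouprefs != NULL && old->nexprs > 0)
	{
		size_t		nbytes = (size_t) old->nexprs * sizeof(unsigned int);

		copy->sortgrouprefs = malloc(nbytes);
		assert(copy->sortgrouprefs != NULL);
		memcpy(copy->sortgrouprefs, old->sortgrouprefs, nbytes);
	}
	else
		copy->sortgrouprefs = NULL;

	return copy;
}
```

Without the second step the memcpy() alone would leave both structs pointing at the same array, so a later change to one target's sortgrouprefs would silently affect the other.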
diff --git a/src/backend/optimizer/util/pathnode.c b/src/backend/optimizer/util/pathnode.c
index a4c5867cdcb..5a2e723bc29 100644
--- a/src/backend/optimizer/util/pathnode.c
+++ b/src/backend/optimizer/util/pathnode.c
@@ -2818,8 +2818,7 @@ create_projection_path(PlannerInfo *root,
 	pathnode->path.pathtype = T_Result;
 	pathnode->path.parent = rel;
 	pathnode->path.pathtarget = target;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe &&
@@ -3074,8 +3073,7 @@ create_incremental_sort_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	/* Sort doesn't project, so use source path's pathtarget */
 	pathnode->path.pathtarget = subpath->pathtarget;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
@@ -3122,8 +3120,7 @@ create_sort_path(PlannerInfo *root,
 	pathnode->path.parent = rel;
 	/* Sort doesn't project, so use source path's pathtarget */
 	pathnode->path.pathtarget = subpath->pathtarget;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
@@ -3284,8 +3281,7 @@ create_agg_path(PlannerInfo *root,
 	pathnode->path.pathtype = T_Agg;
 	pathnode->path.parent = rel;
 	pathnode->path.pathtarget = target;
-	/* For now, assume we are above any joins, so no parameterization */
-	pathnode->path.param_info = NULL;
+	pathnode->path.param_info = subpath->param_info;
 	pathnode->path.parallel_aware = false;
 	pathnode->path.parallel_safe = rel->consider_parallel &&
 		subpath->parallel_safe;
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index ff507331a06..bd28687dc81 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,8 @@
 
 #include <limits.h>
 
+#include "access/nbtree.h"
+#include "catalog/pg_constraint.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
@@ -27,12 +29,16 @@
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
 #include "parser/parse_relation.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/typcache.h"
 
 
 typedef struct JoinHashEntry
@@ -83,6 +89,14 @@ static void build_child_join_reltarget(PlannerInfo *root,
 									   RelOptInfo *childrel,
 									   int nappinfos,
 									   AppendRelInfo **appinfos);
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+													RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+								  PathTarget *target, PathTarget *agg_input,
+								  List **group_clauses, List **group_exprs);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
 
 
 /*
@@ -276,6 +290,8 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	rel->joininfo = NIL;
 	rel->has_eclass_joins = false;
 	rel->consider_partitionwise_join = false;	/* might get changed later */
+	rel->agg_info = NULL;
+	rel->grouped_rel = NULL;
 	rel->part_scheme = NULL;
 	rel->nparts = -1;
 	rel->boundinfo = NULL;
@@ -406,6 +422,104 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	return rel;
 }
 
+/*
+ * build_simple_grouped_rel
+ *	  Construct a new RelOptInfo representing a grouped version of the input
+ *	  base relation.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+	RelAggInfo *agg_info;
+
+	/*
+	 * We should have available aggregate expressions and grouping
+	 * expressions, otherwise we cannot reach here.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/* nothing to do for dummy rel */
+	if (IS_DUMMY_REL(rel))
+		return NULL;
+
+	/*
+	 * Prepare the information needed to create grouped paths for this base
+	 * relation.
+	 */
+	agg_info = create_rel_agg_info(root, rel);
+	if (agg_info == NULL)
+		return NULL;
+
+	/*
+	 * If grouped paths for the given base relation are not considered useful,
+	 * skip building the grouped relation.
+	 */
+	if (!agg_info->agg_useful)
+		return NULL;
+
+	/* Tracks the lowest join level at which partial aggregation is applied */
+	agg_info->apply_at = bms_copy(rel->relids);
+
+	/* build the grouped relation */
+	grouped_rel = build_grouped_rel(root, rel);
+	grouped_rel->reltarget = agg_info->target;
+	grouped_rel->rows = agg_info->grouped_rows;
+	grouped_rel->agg_info = agg_info;
+
+	rel->grouped_rel = grouped_rel;
+
+	return grouped_rel;
+}
+
+/*
+ * build_grouped_rel
+ *	  Build a grouped relation by flat copying the input relation and resetting
+ *	  the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	grouped_rel = makeNode(RelOptInfo);
+	memcpy(grouped_rel, rel, sizeof(RelOptInfo));
+
+	/*
+	 * clear path info
+	 */
+	grouped_rel->pathlist = NIL;
+	grouped_rel->ppilist = NIL;
+	grouped_rel->partial_pathlist = NIL;
+	grouped_rel->cheapest_startup_path = NULL;
+	grouped_rel->cheapest_total_path = NULL;
+	grouped_rel->cheapest_unique_path = NULL;
+	grouped_rel->cheapest_parameterized_paths = NIL;
+
+	/*
+	 * clear partition info
+	 */
+	grouped_rel->part_scheme = NULL;
+	grouped_rel->nparts = -1;
+	grouped_rel->boundinfo = NULL;
+	grouped_rel->partbounds_merged = false;
+	grouped_rel->partition_qual = NIL;
+	grouped_rel->part_rels = NULL;
+	grouped_rel->live_parts = NULL;
+	grouped_rel->all_partrels = NULL;
+	grouped_rel->partexprs = NULL;
+	grouped_rel->nullable_partexprs = NULL;
+	grouped_rel->consider_partitionwise_join = false;
+
+	/*
+	 * clear size estimates
+	 */
+	grouped_rel->rows = 0;
+
+	return grouped_rel;
+}
+
 /*
  * find_base_rel
  *	  Find a base or otherrel relation entry, which must already exist.
@@ -755,6 +869,8 @@ build_join_rel(PlannerInfo *root,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = NULL;
 	joinrel->top_parent = NULL;
 	joinrel->top_parent_relids = NULL;
@@ -939,6 +1055,8 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = parent_joinrel;
 	joinrel->top_parent = parent_joinrel->top_parent ? parent_joinrel->top_parent : parent_joinrel;
 	joinrel->top_parent_relids = joinrel->top_parent->relids;
@@ -2518,3 +2636,514 @@ build_child_join_reltarget(PlannerInfo *root,
 	childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
 	childrel->reltarget->width = parentrel->reltarget->width;
 }
+
+/*
+ * create_rel_agg_info
+ *	  Create the RelAggInfo structure for the given relation if it can produce
+ *	  grouped paths.  The given relation is the non-grouped one which has the
+ *	  reltarget already constructed.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	RelAggInfo *result;
+	PathTarget *agg_input;
+	PathTarget *target;
+	List	   *group_clauses = NIL;
+	List	   *group_exprs = NIL;
+
+	/*
+	 * The lists of aggregate expressions and grouping expressions should have
+	 * been constructed.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/*
+	 * If this is a child rel, the grouped rel for its parent rel must already
+	 * have been created if that was possible.  So we can just use the parent's
+	 * RelAggInfo if there is one, with appropriate variable substitutions.
+	 */
+	if (IS_OTHER_REL(rel))
+	{
+		RelOptInfo *grouped_rel;
+		RelAggInfo *agg_info;
+
+		grouped_rel = rel->top_parent->grouped_rel;
+		if (grouped_rel == NULL)
+			return NULL;
+
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		/* Must do multi-level transformation */
+		agg_info = (RelAggInfo *)
+			adjust_appendrel_attrs_multilevel(root,
+											  (Node *) grouped_rel->agg_info,
+											  rel,
+											  rel->top_parent);
+
+		agg_info->grouped_rows =
+			estimate_num_groups(root, agg_info->group_exprs,
+								rel->rows, NULL, NULL);
+
+		agg_info->apply_at = NULL;	/* caller will change this later */
+
+		/*
+		 * The grouped paths for the given relation are considered useful iff
+		 * the average group size is no less than min_eager_agg_group_size.
+		 */
+		agg_info->agg_useful =
+			(rel->rows / agg_info->grouped_rows) >= min_eager_agg_group_size;
+
+		return agg_info;
+	}
+
+	/* Check if it's possible to produce grouped paths for this relation. */
+	if (!eager_aggregation_possible_for_relation(root, rel))
+		return NULL;
+
+	/*
+	 * Create targets for the grouped paths and for the input paths of the
+	 * grouped paths.
+	 */
+	target = create_empty_pathtarget();
+	agg_input = create_empty_pathtarget();
+
+	/* ... and initialize these targets */
+	if (!init_grouping_targets(root, rel, target, agg_input,
+							   &group_clauses, &group_exprs))
+		return NULL;
+
+	/*
+	 * Eager aggregation is not applicable if there are no available grouping
+	 * expressions.
+	 */
+	if (list_length(group_clauses) == 0)
+		return NULL;
+
+	/* build the RelAggInfo result */
+	result = makeNode(RelAggInfo);
+
+	result->group_clauses = group_clauses;
+	result->group_exprs = group_exprs;
+
+	/* Calculate pathkeys that represent these grouping requirements */
+	result->group_pathkeys =
+		make_pathkeys_for_sortclauses(root, result->group_clauses,
+									  make_tlist_from_pathtarget(target));
+
+	/* Add aggregates to the grouping target */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		Aggref	   *aggref;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		aggref = (Aggref *) copyObject(ac_info->aggref);
+		mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+		add_column_to_pathtarget(target, (Expr *) aggref, 0);
+	}
+
+	/* Set the estimated eval cost and output width for both targets */
+	set_pathtarget_cost_width(root, target);
+	set_pathtarget_cost_width(root, agg_input);
+
+	result->relids = bms_copy(rel->relids);
+	result->target = target;
+	result->agg_input = agg_input;
+	result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+											   rel->rows, NULL, NULL);
+	result->apply_at = NULL;	/* caller will change this later */
+
+	/*
+	 * The grouped paths for the given relation are considered useful iff the
+	 * average group size is no less than min_eager_agg_group_size.
+	 */
+	result->agg_useful =
+		(rel->rows / result->grouped_rows) >= min_eager_agg_group_size;
+
+	return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ *	  Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	int			cur_relid;
+
+	/*
+	 * Check to see if the given relation is on the nullable side of an outer
+	 * join.  In this case, we cannot push a partial aggregation down to the
+	 * relation, because the NULL-extended rows produced by the outer join
+	 * would not be available when we perform the partial aggregation, while
+	 * with a non-eager-aggregation plan these rows are available for the
+	 * top-level aggregation.  Doing so may result in the rows being grouped
+	 * differently than expected, or produce incorrect values from the
+	 * aggregate functions.
+	 */
+	cur_relid = -1;
+	while ((cur_relid = bms_next_member(rel->relids, cur_relid)) >= 0)
+	{
+		RelOptInfo *baserel = find_base_rel_ignore_join(root, cur_relid);
+
+		if (baserel == NULL)
+			continue;			/* ignore outer joins in rel->relids */
+
+		if (!bms_is_subset(baserel->nulling_relids, rel->relids))
+			return false;
+	}
+
+	/*
+	 * For now we don't try to support PlaceHolderVars.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = lfirst(lc);
+
+		if (IsA(expr, PlaceHolderVar))
+			return false;
+	}
+
+	/* Caller should only pass base relations or joins. */
+	Assert(rel->reloptkind == RELOPT_BASEREL ||
+		   rel->reloptkind == RELOPT_JOINREL);
+
+	/*
+	 * Check if all aggregate expressions can be evaluated on this relation
+	 * level.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		/*
+		 * Give up if any aggregate requires relations other than the current
+		 * one.  If the aggregate requires the current relation plus
+		 * additional relations, grouping the current relation could make some
+		 * input rows unavailable for the higher aggregate and may reduce the
+		 * number of input rows it receives.  If the aggregate does not
+		 * require the current relation at all, it should not be grouped, as
+		 * we do not support joining two grouped relations.
+		 */
+		if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * init_grouping_targets
+ *	  Initialize the target for grouped paths (target) as well as the target
+ *	  for paths that generate input for the grouped paths (agg_input).
+ *
+ * We also construct the list of SortGroupClauses and the list of grouping
+ * expressions for the partial aggregation, and return them in *group_clauses
+ * and *group_exprs.
+ *
+ * Return true if the targets could be initialized, false otherwise.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+					  PathTarget *target, PathTarget *agg_input,
+					  List **group_clauses, List **group_exprs)
+{
+	ListCell   *lc;
+	List	   *possibly_dependent = NIL;
+	Index		maxSortGroupRef;
+
+	/* Identify the max sortgroupref */
+	maxSortGroupRef = 0;
+	foreach(lc, root->processed_tlist)
+	{
+		Index		ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+		if (ref > maxSortGroupRef)
+			maxSortGroupRef = ref;
+	}
+
+	/*
+	 * At this point, all Vars from this relation that are needed by upper
+	 * joins or are required in the final targetlist should already be present
+	 * in its reltarget.  Therefore, we can safely iterate over this
+	 * relation's reltarget->exprs to construct the PathTarget and grouping
+	 * clauses for the grouped paths.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Index		sortgroupref;
+
+		/*
+		 * Given that PlaceHolderVar currently prevents us from doing eager
+		 * aggregation, the source target cannot contain anything more complex
+		 * than a Var.
+		 */
+		Assert(IsA(expr, Var));
+
+		/*
+		 * Get the sortgroupref of the expr if it is found among, or can be
+		 * deduced from, the original grouping expressions.
+		 */
+		sortgroupref = get_expression_sortgroupref(root, expr);
+		if (sortgroupref > 0)
+		{
+			SortGroupClause *sgc;
+
+			/* Find the matching SortGroupClause */
+			sgc = get_sortgroupref_clause(sortgroupref, root->processed_groupClause);
+			Assert(sgc->tleSortGroupRef <= maxSortGroupRef);
+
+			/*
+			 * If the target expression is to be used as a grouping key, it
+			 * should be emitted by the grouped paths that have been pushed
+			 * down to this relation level.
+			 */
+			add_column_to_pathtarget(target, expr, sortgroupref);
+
+			/*
+			 * ... and it also should be emitted by the input paths.
+			 */
+			add_column_to_pathtarget(agg_input, expr, sortgroupref);
+
+			/*
+			 * Record this SortGroupClause and grouping expression.  Note that
+			 * this SortGroupClause might have already been recorded.
+			 */
+			if (!list_member(*group_clauses, sgc))
+			{
+				*group_clauses = lappend(*group_clauses, sgc);
+				*group_exprs = lappend(*group_exprs, expr);
+			}
+		}
+		else if (is_var_needed_by_join(root, (Var *) expr, rel))
+		{
+			/*
+			 * The expression is needed for an upper join but is neither in
+			 * the GROUP BY clause nor derivable from it using EC (otherwise,
+			 * it would have already been included in the targets above).  We
+			 * need to create a special SortGroupClause for this expression.
+			 *
+			 * It is important to include such expressions in the grouping
+			 * keys.  This is essential to ensure that an aggregated row from
+			 * the partial aggregation matches the other side of the join if
+			 * and only if each row in the partial group does.  This ensures
+			 * that all rows within the same partial group share the same
+			 * 'destiny', which is crucial for maintaining correctness.
+			 */
+			SortGroupClause *sgc;
+			TypeCacheEntry *tce;
+			Oid			equalimageproc;
+
+			/*
+			 * But first, check if equality implies image equality for this
+			 * expression.  If not, we cannot use it as a grouping key.  See
+			 * comments in create_grouping_expr_infos().
+			 */
+			tce = lookup_type_cache(exprType((Node *) expr),
+									TYPECACHE_BTREE_OPFAMILY);
+			if (!OidIsValid(tce->btree_opf) ||
+				!OidIsValid(tce->btree_opintype))
+				return false;
+
+			equalimageproc = get_opfamily_proc(tce->btree_opf,
+											   tce->btree_opintype,
+											   tce->btree_opintype,
+											   BTEQUALIMAGE_PROC);
+			if (!OidIsValid(equalimageproc) ||
+				!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+												   tce->typcollation,
+												   ObjectIdGetDatum(tce->btree_opintype))))
+				return false;
+
+			/* Create the SortGroupClause. */
+			sgc = makeNode(SortGroupClause);
+
+			/* Initialize the SortGroupClause. */
+			sgc->tleSortGroupRef = ++maxSortGroupRef;
+			get_sort_group_operators(exprType((Node *) expr),
+									 false, true, false,
+									 &sgc->sortop, &sgc->eqop, NULL,
+									 &sgc->hashable);
+
+			/* This expression should be emitted by the grouped paths */
+			add_column_to_pathtarget(target, expr, sgc->tleSortGroupRef);
+
+			/* ... and it also should be emitted by the input paths. */
+			add_column_to_pathtarget(agg_input, expr, sgc->tleSortGroupRef);
+
+			/* Record this SortGroupClause and grouping expression */
+			*group_clauses = lappend(*group_clauses, sgc);
+			*group_exprs = lappend(*group_exprs, expr);
+		}
+		else if (is_var_in_aggref_only(root, (Var *) expr))
+		{
+			/*
+			 * The expression is referenced by an aggregate function pushed
+			 * down to this relation and does not appear elsewhere in the
+			 * targetlist or havingQual.  Add it to 'agg_input' but not to
+			 * 'target'.
+			 */
+			add_new_column_to_pathtarget(agg_input, expr);
+		}
+		else
+		{
+			/*
+			 * The expression may be functionally dependent on other
+			 * expressions in the target, but we cannot verify this until all
+			 * target expressions have been constructed.
+			 */
+			possibly_dependent = lappend(possibly_dependent, expr);
+		}
+	}
+
+	/*
+	 * Now we can verify whether an expression is functionally dependent on
+	 * others.
+	 */
+	foreach(lc, possibly_dependent)
+	{
+		Var		   *tvar;
+		List	   *deps = NIL;
+		RangeTblEntry *rte;
+
+		tvar = lfirst_node(Var, lc);
+		rte = root->simple_rte_array[tvar->varno];
+
+		if (check_functional_grouping(rte->relid, tvar->varno,
+									  tvar->varlevelsup,
+									  target->exprs, &deps))
+		{
+			/*
+			 * The expression is functionally dependent on other target
+			 * expressions, so it can be included in the targets.  Since it
+			 * will not be used as a grouping key, a sortgroupref is not
+			 * needed for it.
+			 */
+			add_new_column_to_pathtarget(target, (Expr *) tvar);
+			add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+		}
+		else
+		{
+			/*
+			 * We may arrive here with a grouping expression that is proven
+			 * redundant by EquivalenceClass processing, such as 't1.a' in the
+			 * query below.
+			 *
+			 * select max(t1.c) from t t1, t t2 where t1.a = 1 group by t1.a,
+			 * t1.b;
+			 *
+			 * For now we just give up in this case.
+			 */
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ *	  Check whether the given Var appears in aggregate expressions and not
+ *	  elsewhere in the targetlist or havingQual.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+	ListCell   *lc;
+
+	/*
+	 * Search the list of aggregate expressions for the Var.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		List	   *vars;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+			continue;
+
+		vars = pull_var_clause((Node *) ac_info->aggref,
+							   PVC_RECURSE_AGGREGATES |
+							   PVC_RECURSE_WINDOWFUNCS |
+							   PVC_RECURSE_PLACEHOLDERS);
+
+		if (list_member(vars, var))
+		{
+			list_free(vars);
+			break;
+		}
+
+		list_free(vars);
+	}
+
+	return (lc != NULL && !list_member(root->tlist_vars, var));
+}
+
+/*
+ * is_var_needed_by_join
+ *	  Check if the given Var is needed by joins above the current rel.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+	Relids		relids;
+	int			attno;
+	RelOptInfo *baserel;
+
+	/*
+	 * Note that when checking if the Var is needed by joins above, we want to
+	 * exclude cases where the Var is only needed in the final targetlist.  So
+	 * include "relation 0" in the check.
+	 */
+	relids = bms_copy(rel->relids);
+	relids = bms_add_member(relids, 0);
+
+	baserel = find_base_rel(root, var->varno);
+	attno = var->varattno - baserel->min_attr;
+
+	return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ *	  Return the sortgroupref of the given "expr" if it is found among the
+ *	  original grouping expressions, or is known equal to any of the original
+ *	  grouping expressions due to equivalence relationships.  Return 0 if no
+ *	  match is found.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->group_expr_list)
+	{
+		GroupingExprInfo *ge_info = lfirst_node(GroupingExprInfo, lc);
+
+		Assert(IsA(ge_info->expr, Var));
+
+		if (equal(ge_info->expr, expr) ||
+			exprs_known_equal(root, (Node *) expr, (Node *) ge_info->expr,
+							  ge_info->btree_opfamily))
+		{
+			Assert(ge_info->sortgroupref > 0);
+
+			return ge_info->sortgroupref;
+		}
+	}
+
+	/* no match is found */
+	return 0;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index d14b1678e7f..cdf8da02960 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -949,6 +949,16 @@ struct config_bool ConfigureNamesBool[] =
 		false,
 		NULL, NULL, NULL
 	},
+	{
+		{"enable_eager_aggregate", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables eager aggregation."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&enable_eager_aggregate,
+		true,
+		NULL, NULL, NULL
+	},
 	{
 		{"enable_parallel_append", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables the planner's use of parallel append plans."),
@@ -3980,6 +3990,17 @@ struct config_real ConfigureNamesReal[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"min_eager_agg_group_size", PGC_USERSET, QUERY_TUNING_COST,
+			gettext_noop("Sets the minimum average group size required to consider applying eager aggregation."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&min_eager_agg_group_size,
+		8.0, 0.0, DBL_MAX,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"cursor_tuple_fraction", PGC_USERSET, QUERY_TUNING_OTHER,
 			gettext_noop("Sets the planner's estimate of the fraction of "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a9d8293474a..e3cdfe11992 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -428,6 +428,7 @@
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
 #enable_self_join_elimination = on
+#enable_eager_aggregate = on
 
 # - Planner Cost Constants -
 
@@ -441,6 +442,7 @@
 #min_parallel_table_scan_size = 8MB
 #min_parallel_index_scan_size = 512kB
 #effective_cache_size = 4GB
+#min_eager_agg_group_size = 8.0
 
 #jit_above_cost = 100000		# perform JIT compilation if available
 					# and query more expensive than this;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index ad2726f026f..a6175cbecaf 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -397,6 +397,15 @@ struct PlannerInfo
 	/* list of PlaceHolderInfos */
 	List	   *placeholder_list;
 
+	/* list of AggClauseInfos */
+	List	   *agg_clause_list;
+
+	/* list of GroupingExprInfos */
+	List	   *group_expr_list;
+
+	/* list of plain Vars contained in targetlist and havingQual */
+	List	   *tlist_vars;
+
 	/* array of PlaceHolderInfos indexed by phid */
 	struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
 	/* allocated size of array */
@@ -1024,6 +1033,14 @@ typedef struct RelOptInfo
 	/* consider partitionwise join paths? (if partitioned rel) */
 	bool		consider_partitionwise_join;
 
+	/*
+	 * used by eager aggregation:
+	 */
+	/* information needed to create grouped paths */
+	struct RelAggInfo *agg_info;
+	/* the partially-aggregated version of the relation */
+	struct RelOptInfo *grouped_rel;
+
 	/*
 	 * inheritance links, if this is an otherrel (otherwise NULL):
 	 */
@@ -1097,6 +1114,75 @@ typedef struct RelOptInfo
 	((rel)->part_scheme && (rel)->boundinfo && (rel)->nparts > 0 && \
 	 (rel)->part_rels && (rel)->partexprs && (rel)->nullable_partexprs)
 
+/*
+ * Is the given relation a grouped relation?
+ */
+#define IS_GROUPED_REL(rel) \
+	((rel)->agg_info != NULL)
+
+/*
+ * RelAggInfo
+ *		Information needed to create grouped paths for base and join rels.
+ *
+ * "relids" is the set of relation identifiers (RT indexes).
+ *
+ * "target" is the output tlist for the grouped paths.
+ *
+ * "agg_input" is the output tlist for the paths that provide input to the
+ * grouped paths.  One difference from the reltarget of the non-grouped
+ * relation is that agg_input has its sortgrouprefs[] initialized.
+ *
+ * "grouped_rows" is the estimated number of result tuples of the grouped
+ * relation.
+ *
+ * "group_clauses", "group_exprs" and "group_pathkeys" are lists of
+ * SortGroupClauses, the corresponding grouping expressions and PathKeys
+ * respectively.
+ *
+ * "apply_at" tracks the lowest join level at which partial aggregation is
+ * applied.
+ *
+ * "agg_useful" is a flag to indicate whether the grouped paths are considered
+ * useful.  It is set true if the average partial group size is no less than
+ * min_eager_agg_group_size, suggesting a significant row count reduction.
+ */
+typedef struct RelAggInfo
+{
+	pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* set of base + OJ relids (rangetable indexes) */
+	Relids		relids;
+
+	/*
+	 * default result targetlist for Paths scanning this grouped relation;
+	 * list of Vars/Exprs, cost, width
+	 */
+	struct PathTarget *target;
+
+	/*
+	 * the targetlist for Paths that provide input to the grouped paths
+	 */
+	struct PathTarget *agg_input;
+
+	/* estimated number of result tuples */
+	Cardinality grouped_rows;
+
+	/* a list of SortGroupClauses */
+	List	   *group_clauses;
+	/* a list of grouping expressions */
+	List	   *group_exprs;
+	/* a list of PathKeys */
+	List	   *group_pathkeys;
+
+	/* lowest level partial aggregation is applied at */
+	Relids		apply_at;
+
+	/* the grouped paths are considered useful? */
+	bool		agg_useful;
+} RelAggInfo;
+
 /*
  * IndexOptInfo
  *		Per-index information for planning/optimization
@@ -3278,6 +3364,50 @@ typedef struct MinMaxAggInfo
 	Param	   *param;
 } MinMaxAggInfo;
 
+/*
+ * For each distinct Aggref node that appears in the targetlist and HAVING
+ * clauses, we store an AggClauseInfo node in the PlannerInfo node's
+ * agg_clause_list.  Each AggClauseInfo records the set of relations referenced
+ * by the aggregate expression.  This information is used to determine how far
+ * the aggregate can be safely pushed down in the join tree.
+ */
+typedef struct AggClauseInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the Aggref expr */
+	Aggref	   *aggref;
+
+	/* lowest level we can evaluate this aggregate at */
+	Relids		agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * For each grouping expression that appears in grouping clauses, we store a
+ * GroupingExprInfo node in the PlannerInfo node's group_expr_list.  Each
+ * GroupingExprInfo records the expression being grouped on, its sortgroupref,
+ * and the btree opfamily used for equality comparison.  This information is
+ * necessary to reproduce correct grouping semantics at different levels of the
+ * join tree.
+ */
+typedef struct GroupingExprInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the represented expression */
+	Expr	   *expr;
+
+	/* the tleSortGroupRef of the corresponding SortGroupClause */
+	Index		sortgroupref;
+
+	/* btree opfamily defining the ordering */
+	Oid			btree_opfamily;
+} GroupingExprInfo;
+
 /*
  * At runtime, PARAM_EXEC slots are used to pass values around from one plan
  * node to another.  They can be used to pass values down into subqueries (for
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 58936e963cb..cbdbc4978f6 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -314,6 +314,10 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
 									RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
+											RelOptInfo *rel_plain);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
+									 RelOptInfo *rel_plain);
 extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
@@ -353,4 +357,5 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										SpecialJoinInfo *sjinfo,
 										int nappinfos, AppendRelInfo **appinfos);
 
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
 #endif							/* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index 8410531f2d6..9f6bad1faca 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,7 +21,9 @@
  * allpaths.c
  */
 extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
 extern PGDLLIMPORT int geqo_threshold;
+extern PGDLLIMPORT double min_eager_agg_group_size;
 extern PGDLLIMPORT int min_parallel_table_scan_size;
 extern PGDLLIMPORT int min_parallel_index_scan_size;
 extern PGDLLIMPORT bool enable_group_by_reordering;
@@ -57,6 +59,10 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 								  bool override_rows);
 extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 										 bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root,
+								   RelOptInfo *rel_grouped,
+								   RelOptInfo *rel_plain,
+								   RelAggInfo *agg_info);
 extern int	compute_parallel_worker(RelOptInfo *rel, double heap_pages,
 									double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 9d3debcab28..09b48b26f8f 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -76,6 +76,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
 									Relids where_needed);
 extern void remove_useless_groupby_columns(PlannerInfo *root);
+extern void setup_eager_aggregation(PlannerInfo *root);
 extern void find_lateral_references(PlannerInfo *root);
 extern void rebuild_lateral_attr_needed(PlannerInfo *root);
 extern void create_lateral_join_info(PlannerInfo *root);
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 69805d4b9ec..ef79d6f1ded 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -2437,11 +2437,11 @@ SELECT c collate "C", count(c) FROM pagg_tab3 GROUP BY c collate "C" ORDER BY 1;
 SET enable_partitionwise_join TO false;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2449,10 +2449,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
@@ -2464,11 +2466,11 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
 SET enable_partitionwise_join TO true;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2476,10 +2478,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 00000000000..f02ff0b30a3
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1334 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b
+                                 Sort Key: t2.b
+                                 ->  Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Hash Join
+                                 Output: t2.c, t2.b, t3.c
+                                 Hash Cond: (t3.a = t2.a)
+                                 ->  Seq Scan on public.eager_agg_t3 t3
+                                       Output: t3.a, t3.b, t3.c
+                                 ->  Hash
+                                       Output: t2.c, t2.b, t2.a
+                                       ->  Seq Scan on public.eager_agg_t2 t2
+                                             Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b, t3.c
+                                 Sort Key: t2.b
+                                 ->  Hash Join
+                                       Output: t2.c, t2.b, t3.c
+                                       Hash Cond: (t3.a = t2.a)
+                                       ->  Seq Scan on public.eager_agg_t3 t3
+                                             Output: t3.a, t3.b, t3.c
+                                       ->  Hash
+                                             Output: t2.c, t2.b, t2.a
+                                             ->  Seq Scan on public.eager_agg_t2 t2
+                                                   Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Right Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Sort
+   Output: t2.b, (avg(t2.c))
+   Sort Key: t2.b
+   ->  HashAggregate
+         Output: t2.b, avg(t2.c)
+         Group Key: t2.b
+         ->  Hash Right Join
+               Output: t2.b, t2.c
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on public.eager_agg_t2 t2
+                     Output: t2.a, t2.b, t2.c
+               ->  Hash
+                     Output: t1.b
+                     ->  Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ b | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   |    
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Gather Merge
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Workers Planned: 2
+         ->  Sort
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Sort Key: t1.a
+               ->  Parallel Hash Join
+                     Output: t1.a, (PARTIAL avg(t2.c))
+                     Hash Cond: (t1.b = t2.b)
+                     ->  Parallel Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.a, t1.b, t1.c
+                     ->  Parallel Hash
+                           Output: t2.b, (PARTIAL avg(t2.c))
+                           ->  Partial HashAggregate
+                                 Output: t2.b, PARTIAL avg(t2.c)
+                                 Group Key: t2.b
+                                 ->  Parallel Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t2.y, (sum(t1.y)), (count(*))
+   Sort Key: t2.y
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t2.y, sum(t1.y), count(*)
+               Group Key: t2.y
+               ->  Hash Join
+                     Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.y, t1.x
+         ->  Finalize HashAggregate
+               Output: t2_1.y, sum(t1_1.y), count(*)
+               Group Key: t2_1.y
+               ->  Hash Join
+                     Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.y, t1_1.x
+         ->  Finalize HashAggregate
+               Output: t2_2.y, sum(t1_2.y), count(*)
+               Group Key: t2_2.y
+               ->  Hash Join
+                     Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+                                                 QUERY PLAN                                                 
+------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t2.x, (sum(t1.x)), (count(*))
+   Sort Key: t2.x
+   ->  Finalize HashAggregate
+         Output: t2.x, sum(t1.x), count(*)
+         Group Key: t2.x
+         Filter: (avg(t1.x) > '5'::numeric)
+         ->  Append
+               ->  Hash Join
+                     Output: t2.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.x, t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.x), PARTIAL count(*), PARTIAL avg(t1.x)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x
+               ->  Hash Join
+                     Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.x, t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x
+               ->  Hash Join
+                     Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.x, t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+ x |  sum  | count 
+---+-------+-------
+ 0 | 33835 |  6667
+ 1 | 39502 |  6667
+ 2 | 46169 |  6667
+ 3 | 52836 |  6667
+ 4 | 59503 |  6667
+ 5 | 33500 |  6667
+ 6 | 39837 |  6667
+ 7 | 46504 |  6667
+ 8 | 53171 |  6667
+ 9 | 59838 |  6667
+(10 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y)))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y))
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y))
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y))
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   
+----+---------
+  0 | 1437480
+  1 | 2082896
+  2 | 2684422
+  3 | 3285948
+  4 | 3887474
+  5 | 1526260
+  6 | 2127786
+  7 | 2729312
+  8 | 3330838
+  9 | 3932364
+ 10 | 1481370
+ 11 | 2012472
+ 12 | 2587464
+ 13 | 3162456
+ 14 | 3737448
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t3.y, sum((t2.y + t3.y))
+   Group Key: t3.y
+   ->  Sort
+         Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+         Sort Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t2.x = t1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y))
+                           Group Key: t2.x, t3.y, t3.x
+                           ->  Incremental Sort
+                                 Output: t2.y, t2.x, t3.y, t3.x
+                                 Sort Key: t2.x, t3.y
+                                 Presorted Key: t2.x
+                                 ->  Merge Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Merge Cond: (t2.x = t3.x)
+                                       ->  Sort
+                                             Output: t2.y, t2.x
+                                             Sort Key: t2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                                   Output: t2.y, t2.x
+                                       ->  Sort
+                                             Output: t3.y, t3.x
+                                             Sort Key: t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+                     ->  Hash
+                           Output: t1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                 Output: t1.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t2_1.x = t1_1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                           Group Key: t2_1.x, t3_1.y, t3_1.x
+                           ->  Incremental Sort
+                                 Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                 Sort Key: t2_1.x, t3_1.y
+                                 Presorted Key: t2_1.x
+                                 ->  Merge Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Merge Cond: (t2_1.x = t3_1.x)
+                                       ->  Sort
+                                             Output: t2_1.y, t2_1.x
+                                             Sort Key: t2_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                                   Output: t2_1.y, t2_1.x
+                                       ->  Sort
+                                             Output: t3_1.y, t3_1.x
+                                             Sort Key: t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+                     ->  Hash
+                           Output: t1_1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                 Output: t1_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t2_2.x = t1_2.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                           Group Key: t2_2.x, t3_2.y, t3_2.x
+                           ->  Incremental Sort
+                                 Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                 Sort Key: t2_2.x, t3_2.y
+                                 Presorted Key: t2_2.x
+                                 ->  Merge Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Merge Cond: (t2_2.x = t3_2.x)
+                                       ->  Sort
+                                             Output: t2_2.y, t2_2.x
+                                             Sort Key: t2_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                                   Output: t2_2.y, t2_2.x
+                                       ->  Sort
+                                             Output: t3_2.y, t3_2.x
+                                             Sort Key: t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+                     ->  Hash
+                           Output: t1_2.x
+                           ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                 Output: t1_2.x
+(88 rows)
+
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y |   sum   
+---+---------
+ 0 | 1111110
+ 1 | 2000132
+ 2 | 2889154
+ 3 | 3778176
+ 4 | 4667198
+ 5 | 3334000
+ 6 | 4223022
+ 7 | 5112044
+ 8 | 6001066
+ 9 | 6890088
+(10 rows)
+
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.y, (sum(t2.y)), (count(*))
+   Sort Key: t1.y
+   ->  Finalize HashAggregate
+         Output: t1.y, sum(t2.y), count(*)
+         Group Key: t1.y
+         ->  Append
+               ->  Hash Join
+                     Output: t1.y, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.y, t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+               ->  Hash Join
+                     Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.y, t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+               ->  Hash Join
+                     Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.y, t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+               ->  Hash Join
+                     Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.y, t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+               ->  Hash Join
+                     Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.y, t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                                QUERY PLAN                                                
+----------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y)), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t3.y
+   ->  Finalize HashAggregate
+         Output: t3.y, sum((t2.y + t3.y)), count(*)
+         Group Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.y, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x, t3.y, t3.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.y, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x, t3_1.y, t3_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.y, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x, t3_2.y, t3_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.y, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x, t3_3.y, t3_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+               ->  Hash Join
+                     Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.y, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.y, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x, t3_4.y, t3_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 4d5d35d0727..b764284d9c0 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -2837,20 +2837,22 @@ select x.thousand, x.twothousand, count(*)
 from tenk1 x inner join tenk1 y on x.thousand = y.thousand
 group by x.thousand, x.twothousand
 order by x.thousand desc, x.twothousand;
-                                    QUERY PLAN                                    
-----------------------------------------------------------------------------------
- GroupAggregate
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Finalize GroupAggregate
    Group Key: x.thousand, x.twothousand
    ->  Incremental Sort
          Sort Key: x.thousand DESC, x.twothousand
          Presorted Key: x.thousand
          ->  Merge Join
                Merge Cond: (y.thousand = x.thousand)
-               ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
+               ->  Partial GroupAggregate
+                     Group Key: y.thousand
+                     ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
                ->  Sort
                      Sort Key: x.thousand DESC
                      ->  Seq Scan on tenk1 x
-(11 rows)
+(13 rows)
 
 reset enable_hashagg;
 reset enable_nestloop;
diff --git a/src/test/regress/expected/partition_aggregate.out b/src/test/regress/expected/partition_aggregate.out
index 5f2c0cf5786..1f56f55155b 100644
--- a/src/test/regress/expected/partition_aggregate.out
+++ b/src/test/regress/expected/partition_aggregate.out
@@ -13,6 +13,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation.
+SET enable_eager_aggregate TO off;
 --
 -- Tests for list partitioned tables.
 --
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 83228cfca29..3b37fafa65b 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -151,6 +151,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_async_append            | on
  enable_bitmapscan              | on
  enable_distinct_reordering     | on
+ enable_eager_aggregate         | on
  enable_gathermerge             | on
  enable_group_by_reordering     | on
  enable_hashagg                 | on
@@ -172,7 +173,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(24 rows)
+(25 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fbffc67ae60..f9450cdc477 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -123,7 +123,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # The stats test resets stats, so nothing else needing stats access can be in
 # this group.
 # ----------
-test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa eager_aggregate
 
 # event_trigger depends on create_am and cannot run concurrently with
 # any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 00000000000..5da8749a6cb
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,194 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/sql/partition_aggregate.sql b/src/test/regress/sql/partition_aggregate.sql
index ab070fee244..124cc260461 100644
--- a/src/test/regress/sql/partition_aggregate.sql
+++ b/src/test/regress/sql/partition_aggregate.sql
@@ -14,6 +14,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation.
+SET enable_eager_aggregate TO off;
 
 --
 -- Tests for list partitioned tables.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index e6f2e93b2d6..052e6b7b920 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -42,6 +42,7 @@ AfterTriggersTableData
 AfterTriggersTransData
 Agg
 AggClauseCosts
+AggClauseInfo
 AggInfo
 AggPath
 AggSplit
@@ -1110,6 +1111,7 @@ GroupPathExtraData
 GroupResultPath
 GroupState
 GroupVarInfo
+GroupingExprInfo
 GroupingFunc
 GroupingSet
 GroupingSetData
@@ -2472,6 +2474,7 @@ ReindexObjectType
 ReindexParams
 ReindexStmt
 ReindexType
+RelAggInfo
 RelFileLocator
 RelFileLocatorBackend
 RelFileNumber
-- 
2.43.0



* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 13:44                                   ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
  2025-08-09 01:32                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-08-14 19:22                                       ` Matheus Alcantara <[email protected]>
  2025-08-15 01:41                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Matheus Alcantara @ 2025-08-14 19:22 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On 08/08/25 22:32, Richard Guo wrote:
>> It sounds like a good way to go for me, looking forward to the next
>> patch version to perform some other tests.
>
> OK.  Here it is.
>
Thanks! I can confirm now that I can see the eager aggregate in action
in some of these queries that I've tested on the TPC-DS benchmark.

I have a few questions regarding the new version:

I've noticed that when a query has a WHERE clause that filters a column
of the relation being aggregated using the "=" operator, the Partial and
Finalize aggregation nodes are not present in the EXPLAIN output, even
though none of the early-exit checks in setup_eager_aggregation() fire
and RelAggInfo->agg_useful is true. For example, consider this query,
which is used in the eager aggregation paper and uses some tables from
the TPC-H benchmark:

tpch=# show enable_eager_aggregate ;
 enable_eager_aggregate
------------------------
 on
(1 row)

tpch=# set max_parallel_workers_per_gather to 0;
SET

tpch=# EXPLAIN(COSTS OFF) SELECT O_CLERK,
       SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)) AS LOSS
FROM LINEITEM
JOIN ORDERS ON L_ORDERKEY = O_ORDERKEY
WHERE L_RETURNFLAG = 'R'
GROUP BY O_CLERK;
                          QUERY PLAN
--------------------------------------------------------------
 HashAggregate
   Group Key: orders.o_clerk
   ->  Hash Join
         Hash Cond: (lineitem.l_orderkey = orders.o_orderkey)
         ->  Seq Scan on lineitem
               Filter: (l_returnflag = 'R'::bpchar)
         ->  Hash
               ->  Seq Scan on orders
(8 rows)

Debugging this query shows that all the if conditions in
setup_eager_aggregation() evaluate to false, so create_agg_clause_infos()
and create_grouping_expr_infos() are called. RelAggInfo->agg_useful
is also set to true, so I would expect to see Finalize and Partial
agg nodes. Is this correct, or am I missing something here?

After removing the WHERE clause, I can see the Finalize and Partial agg nodes:

tpch=# EXPLAIN(COSTS OFF) SELECT O_CLERK,
       SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)) AS LOSS
FROM LINEITEM
JOIN ORDERS ON L_ORDERKEY = O_ORDERKEY
GROUP BY O_CLERK;
                              QUERY PLAN
----------------------------------------------------------------------
 Finalize HashAggregate
   Group Key: orders.o_clerk
   ->  Merge Join
         Merge Cond: (lineitem.l_orderkey = orders.o_orderkey)
         ->  Partial GroupAggregate
               Group Key: lineitem.l_orderkey
               ->  Index Scan using idx_lineitem_orderkey on lineitem
         ->  Index Scan using orders_pkey on orders
(8 rows)

This can also be reproduced by adding a WHERE clause to some of the
tests in eager_aggregate.sql:

postgres=# EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.a, avg(t2.c)
FROM eager_agg_t1 t1
JOIN eager_agg_t2 t2
    ON t1.b = t2.b
WHERE t2.c = 5
GROUP BY t1.a
ORDER BY t1.a;
                            QUERY PLAN
------------------------------------------------------------------
 GroupAggregate
   Output: t1.a, avg(t2.c)
   Group Key: t1.a
   ->  Sort
         Output: t1.a, t2.c
         Sort Key: t1.a
         ->  Hash Join
               Output: t1.a, t2.c
               Hash Cond: (t1.b = t2.b)
               ->  Seq Scan on public.eager_agg_t1 t1
                     Output: t1.a, t1.b, t1.c
               ->  Hash
                     Output: t2.c, t2.b
                     ->  Seq Scan on public.eager_agg_t2 t2
                           Output: t2.c, t2.b
                           Filter: (t2.c = '5'::double precision)
(16 rows)


Note that if I use the ">" operator, for example, this doesn't happen:
SELECT t1.a, avg(t2.c)
FROM eager_agg_t1 t1
JOIN eager_agg_t2 t2
    ON t1.b = t2.b
WHERE t2.c > 5
GROUP BY t1.a
ORDER BY t1.a;
                               QUERY PLAN
------------------------------------------------------------------------
 Finalize GroupAggregate
   Output: t1.a, avg(t2.c)
   Group Key: t1.a
   ->  Sort
         Output: t1.a, (PARTIAL avg(t2.c))
         Sort Key: t1.a
         ->  Hash Join
               Output: t1.a, (PARTIAL avg(t2.c))
               Hash Cond: (t1.b = t2.b)
               ->  Seq Scan on public.eager_agg_t1 t1
                     Output: t1.a, t1.b, t1.c
               ->  Hash
                     Output: t2.b, (PARTIAL avg(t2.c))
                     ->  Partial HashAggregate
                           Output: t2.b, PARTIAL avg(t2.c)
                           Group Key: t2.b
                           ->  Seq Scan on public.eager_agg_t2 t2
                                 Output: t2.a, t2.b, t2.c
                                 Filter: (t2.c > '5'::double precision)
(19 rows)


Is this behavior correct? If it is, would it be possible to check for
this limitation in setup_eager_aggregation() and maybe skip all the
other work?

--
Matheus Alcantara





* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 13:44                                   ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
  2025-08-09 01:32                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-14 19:22                                       ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
@ 2025-08-15 01:41                                         ` Richard Guo <[email protected]>
  0 siblings, 0 replies; 70+ messages in thread

From: Richard Guo @ 2025-08-15 01:41 UTC (permalink / raw)
  To: Matheus Alcantara <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Fri, Aug 15, 2025 at 4:22 AM Matheus Alcantara
<[email protected]> wrote:
> Debugging this query shows that all the if conditions in
> setup_eager_aggregation() evaluate to false, so create_agg_clause_infos()
> and create_grouping_expr_infos() are called. RelAggInfo->agg_useful
> is also set to true, so I would expect to see Finalize and Partial
> agg nodes. Is this correct, or am I missing something here?

Well, just because eager aggregation *can* be applied does not mean
that it *will* be; it depends on whether it produces a lower-cost
execution plan.  This transformation is cost-based, so it's not the
right mindset to assume that it will always be applied when possible.

In your case, with the filter "t2.c = 5", the row estimate for t2 is
just 1 after the filter has been applied.  The planner decides that
adding a partial aggregation on top of such a small result set doesn't
offer much benefit, which seems reasonable to me.

->  Hash  (cost=18.50..18.50 rows=1 width=12)
          (actual time=0.864..0.865 rows=1.00 loops=1)
      Buckets: 1024  Batches: 1  Memory Usage: 9kB
      ->  Seq Scan on eager_agg_t2 t2  (cost=0.00..18.50 rows=1 width=12)
                                       (actual time=0.060..0.851 rows=1.00 loops=1)
            Filter: (c = '5'::double precision)
            Rows Removed by Filter: 999


With the filter "t2.c > 5", the row estimate for t2 is 995 after
filtering.  A partial aggregation can reduce that to 10 rows, so the
planner decides that adding a partial aggregation is beneficial -- and
does so.  That also seems reasonable to me.

->  Partial HashAggregate  (cost=23.48..23.58 rows=10 width=36)
                           (actual time=2.427..2.438 rows=10.00 loops=1)
      Group Key: t2.b
      Batches: 1  Memory Usage: 32kB
      ->  Seq Scan on eager_agg_t2 t2  (cost=0.00..18.50 rows=995 width=12)
                                       (actual time=0.053..0.989 rows=995.00 loops=1)
            Filter: (c > '5'::double precision)
            Rows Removed by Filter: 5
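
A minimal way to see the cost-based choice directly, assuming the
eager_agg_t1 and eager_agg_t2 tables from eager_aggregate.sql are loaded
and analyzed on a build with this patch applied, is to run EXPLAIN with
costs shown for the same query with enable_eager_aggregate on and then
off, and compare the top-level total cost.  This is only a sketch for
inspection, not part of the patch:

```sql
-- Assumption: eager_agg_t1/eager_agg_t2 exist as in eager_aggregate.sql
-- and have been ANALYZEd; enable_eager_aggregate is the GUC added by
-- this patch.

-- Plan and estimated cost with eager aggregation considered (default).
SET enable_eager_aggregate TO on;
EXPLAIN
SELECT t1.a, avg(t2.c)
FROM eager_agg_t1 t1
JOIN eager_agg_t2 t2 ON t1.b = t2.b
WHERE t2.c > 5
GROUP BY t1.a
ORDER BY t1.a;

-- Same query with the transformation disabled, for comparison.
SET enable_eager_aggregate TO off;
EXPLAIN
SELECT t1.a, avg(t2.c)
FROM eager_agg_t1 t1
JOIN eager_agg_t2 t2 ON t1.b = t2.b
WHERE t2.c > 5
GROUP BY t1.a
ORDER BY t1.a;

RESET enable_eager_aggregate;
```

If the planner behaves as described above, the "> 5" query should show
a lower top-level cost with the setting on, while repeating the
comparison with "t2.c = 5" should yield the same plan either way.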

> Is this behavior correct? If it is, would it be possible to check for
> this limitation in setup_eager_aggregation() and maybe skip all the
> other work?

Hmm, I wouldn't consider this a limitation; it's just the result of
the planner's cost-based tournament for path selection.

Thanks
Richard





* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 13:44                                   ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
  2025-08-09 01:32                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-09-01 01:32                                       ` Richard Guo <[email protected]>
  2025-09-05 07:35                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Richard Guo @ 2025-09-01 01:32 UTC (permalink / raw)
  To: Matheus Alcantara <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Sat, Aug 9, 2025 at 10:32 AM Richard Guo <[email protected]> wrote:
> OK.  Here it is.

This patch needs a rebase; here it is.  No changes were made.

- Richard


Attachments:

  [application/octet-stream] v20-0001-Implement-Eager-Aggregation.patch (171.8K, 2-v20-0001-Implement-Eager-Aggregation.patch)
From 63378cda1912f8bca3455e374638ba02ce1ad651 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Tue, 11 Jun 2024 15:59:19 +0900
Subject: [PATCH v20] Implement Eager Aggregation

Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.

In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations.  To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node.  During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied.  If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
is currently not supported.

To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

The patch was originally proposed by Antonin Houska in 2017.  This
commit reworks various important aspects and rewrites most of the
current code.  However, the original patch and reviews were very
useful.

Author: Richard Guo, Antonin Houska
Reviewed-by: Robert Haas, Jian He, Tender Wang, Paul George, Tom Lane
Reviewed-by: Tomas Vondra, Andy Fan, Ashutosh Bapat
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
---
 .../postgres_fdw/expected/postgres_fdw.out    |   49 +-
 doc/src/sgml/config.sgml                      |   31 +
 src/backend/optimizer/README                  |   89 ++
 src/backend/optimizer/geqo/geqo_eval.c        |   21 +
 src/backend/optimizer/path/allpaths.c         |  453 ++++++
 src/backend/optimizer/path/joinrels.c         |  193 +++
 src/backend/optimizer/plan/initsplan.c        |  322 ++++
 src/backend/optimizer/plan/planmain.c         |    9 +
 src/backend/optimizer/plan/planner.c          |  124 +-
 src/backend/optimizer/util/appendinfo.c       |   59 +
 src/backend/optimizer/util/relnode.c          |  628 ++++++++
 src/backend/utils/misc/guc_tables.c           |   21 +
 src/backend/utils/misc/postgresql.conf.sample |    2 +
 src/include/nodes/pathnodes.h                 |  130 ++
 src/include/optimizer/pathnode.h              |    5 +
 src/include/optimizer/paths.h                 |    6 +
 src/include/optimizer/planmain.h              |    1 +
 .../regress/expected/collate.icu.utf8.out     |   32 +-
 src/test/regress/expected/eager_aggregate.out | 1334 +++++++++++++++++
 src/test/regress/expected/join.out            |   12 +-
 .../regress/expected/partition_aggregate.out  |    2 +
 src/test/regress/expected/sysviews.out        |    3 +-
 src/test/regress/parallel_schedule            |    2 +-
 src/test/regress/sql/eager_aggregate.sql      |  194 +++
 src/test/regress/sql/partition_aggregate.sql  |    2 +
 src/tools/pgindent/typedefs.list              |    3 +
 26 files changed, 3653 insertions(+), 74 deletions(-)
 create mode 100644 src/test/regress/expected/eager_aggregate.out
 create mode 100644 src/test/regress/sql/eager_aggregate.sql

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 78b8367d289..b6c892bdb51 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3701,30 +3701,33 @@ select count(t1.c3) from ft2 t1 left join ft2 t2 on (t1.c1 = random() * t2.c2);
 -- Subquery in FROM clause having aggregate
 explain (verbose, costs off)
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
-                                          QUERY PLAN                                           
------------------------------------------------------------------------------------------------
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
  Sort
-   Output: (count(*)), x.b
-   Sort Key: (count(*)), x.b
-   ->  HashAggregate
-         Output: count(*), x.b
-         Group Key: x.b
-         ->  Hash Join
-               Output: x.b
-               Inner Unique: true
-               Hash Cond: (ft1.c2 = x.a)
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.c2
-                     Remote SQL: SELECT c2 FROM "S 1"."T 1"
-               ->  Hash
-                     Output: x.b, x.a
-                     ->  Subquery Scan on x
-                           Output: x.b, x.a
-                           ->  Foreign Scan
-                                 Output: ft1_1.c2, (sum(ft1_1.c1))
-                                 Relations: Aggregate on (public.ft1 ft1_1)
-                                 Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
-(21 rows)
+   Output: (count(*)), (sum(ft1_1.c1))
+   Sort Key: (count(*)), (sum(ft1_1.c1))
+   ->  Finalize GroupAggregate
+         Output: count(*), (sum(ft1_1.c1))
+         Group Key: (sum(ft1_1.c1))
+         ->  Sort
+               Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+               Sort Key: (sum(ft1_1.c1))
+               ->  Hash Join
+                     Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+                     Hash Cond: (ft1_1.c2 = ft1.c2)
+                     ->  Foreign Scan
+                           Output: ft1_1.c2, (sum(ft1_1.c1))
+                           Relations: Aggregate on (public.ft1 ft1_1)
+                           Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
+                     ->  Hash
+                           Output: ft1.c2, (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: ft1.c2, PARTIAL count(*)
+                                 Group Key: ft1.c2
+                                 ->  Foreign Scan on public.ft1
+                                       Output: ft1.c2
+                                       Remote SQL: SELECT c2 FROM "S 1"."T 1"
+(24 rows)
 
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
  count |   b   
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0a4b3e55ba5..aab91625daf 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5475,6 +5475,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-eager-aggregate" xreflabel="enable_eager_aggregate">
+      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_eager_aggregate</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's ability to partially push
+        aggregation past a join, and finalize it once all the relations are
+        joined. The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
       <term><varname>enable_gathermerge</varname> (<type>boolean</type>)
       <indexterm>
@@ -6095,6 +6110,22 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-min-eager-agg-group-size" xreflabel="min_eager_agg_group_size">
+      <term><varname>min_eager_agg_group_size</varname> (<type>floating point</type>)
+      <indexterm>
+       <primary><varname>min_eager_agg_group_size</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Sets the minimum average group size required to consider applying
+        eager aggregation. This helps avoid the overhead of eager
+        aggregation when it does not offer significant row count reduction.
+        The default is <literal>8</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-jit-above-cost" xreflabel="jit_above_cost">
       <term><varname>jit_above_cost</varname> (<type>floating point</type>)
       <indexterm>
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 843368096fd..5af3ced5750 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1500,3 +1500,92 @@ breaking down aggregation or grouping over a partitioned relation into
 aggregation or grouping over its partitions is called partitionwise
 aggregation.  Especially when the partition keys match the GROUP BY clause,
 this can be significantly faster than the regular method.
+
+Eager aggregation
+-----------------
+
+Eager aggregation is a query optimization technique that partially
+pushes aggregation past a join, and finalizes it once all the
+relations are joined.  Eager aggregation may reduce the number of
+input rows to the join and thus could result in a better overall plan.
+
+To prove that the transformation is correct, we partition the tables
+in the FROM clause into two groups: those that contain at least one
+aggregation column, and those that do not contain any aggregation
+columns.  Each group can be treated as a single relation formed by the
+Cartesian product of the tables within that group.  Therefore, without
+loss of generality, we can assume that the FROM clause contains
+exactly two relations, R1 and R2, where R1 represents the relation
+containing all aggregation columns, and R2 represents the relation
+without any aggregation columns.
+
+Let the query be of the form:
+
+SELECT G, AGG(A)
+FROM R1 JOIN R2 ON J
+GROUP BY G;
+
+where G is the set of grouping keys that may include columns from R1
+and/or R2; AGG(A) is an aggregate function over columns A from R1; J
+is the join condition between R1 and R2.
+
+The transformation of eager aggregation is:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 ON J)
+
+This equivalence holds under the following conditions:
+
+1) AGG is decomposable, meaning that it can be computed in two stages:
+a partial aggregation followed by a final aggregation;
+2) The set G1 used in the pre-aggregation of R1 includes:
+    * all columns from R1 that are part of the grouping keys G, and
+    * all columns from R1 that appear in the join condition J.
+3) The grouping operator for any column in G1 must be compatible with
+the operator used for that column in the join condition J.
+
+Since G1 includes all columns from R1 that appear in either the
+grouping keys G or the join condition J, all rows within each partial
+group have identical values for both the grouping keys and the
+join-relevant columns from R1, assuming compatible operators are used.
+As a result, the rows within a partial group are indistinguishable in
+terms of their contribution to the aggregation and their behavior in
+the join.  This ensures that all rows in the same partial group share
+the same "destiny": they either all match or all fail to match a given
+row in R2.  Because the aggregate function AGG is decomposable,
+aggregating the partial results after the join yields the same final
+result as aggregating after the full join, thereby preserving query
+semantics.  Q.E.D.
+
+One restriction is that we cannot push partial aggregation down to a
+relation that is on the nullable side of an outer join, because the
+NULL-extended rows produced by the outer join would not be available
+when we perform the partial aggregation, while with a
+non-eager-aggregation plan these rows are available for the top-level
+aggregation.  Pushing partial aggregation in this case may result in
+the rows being grouped differently than expected, or produce incorrect
+values from the aggregate functions.
+
+During the construction of the join tree, we evaluate each base or
+join relation to determine if eager aggregation can be applied.  If
+feasible, we create a separate RelOptInfo called a "grouped relation"
+and generate grouped paths by adding sorted and hashed partial
+aggregation paths on top of the non-grouped paths.  To limit planning
+time, we consider only the cheapest or suitably-sorted non-grouped
+paths in this step.
+
+Another way to generate grouped paths is to join a grouped relation
+with a non-grouped relation.  Joining two grouped relations is
+currently not supported.
+
+To further limit planning time, we currently adopt a strategy where
+partial aggregation is pushed only to the lowest feasible level in the
+join tree where it provides a significant reduction in row count.
+This strategy also helps ensure that all grouped paths for the same
+grouped relation produce the same set of rows, which is important to
+support a fundamental assumption of the planner.
+
+If we have generated a grouped relation for the topmost join relation,
+we need to finalize its paths at the end.  The final paths will
+compete in the usual way with paths built from regular planning.
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index f07d1dc8ac6..4a65f955ca6 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -279,6 +279,27 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
 				/* Find and save the cheapest paths for this joinrel */
 				set_cheapest(joinrel);
 
+				/*
+				 * Except for the topmost scan/join rel, consider generating
+				 * partial aggregation paths for the grouped relation on top
+				 * of the paths of this rel.  After that, we're done creating
+				 * paths for the grouped relation, so run set_cheapest().
+				 */
+				if (!bms_equal(joinrel->relids, root->all_query_rels))
+				{
+					RelOptInfo *grouped_rel;
+
+					grouped_rel = joinrel->grouped_rel;
+					if (grouped_rel)
+					{
+						Assert(IS_GROUPED_REL(grouped_rel));
+
+						generate_grouped_paths(root, grouped_rel, joinrel,
+											   grouped_rel->agg_info);
+						set_cheapest(grouped_rel);
+					}
+				}
+
 				/* Absorb new clump into old */
 				old_clump->joinrel = joinrel;
 				old_clump->size += new_clump->size;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 6cc6966b060..7b349a4570e 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -47,6 +48,7 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -77,7 +79,9 @@ typedef enum pushdown_safe_type
 
 /* These parameters are set by GUC */
 bool		enable_geqo = false;	/* just in case GUC doesn't set it */
+bool		enable_eager_aggregate = true;
 int			geqo_threshold;
+double		min_eager_agg_group_size;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
 
@@ -90,6 +94,7 @@ join_search_hook_type join_search_hook = NULL;
 
 static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_base_grouped_rels(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 						 Index rti, RangeTblEntry *rte);
@@ -114,6 +119,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 								Index rti, RangeTblEntry *rte);
 static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 									Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
 static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 										 List *live_childrels,
 										 List *all_child_pathkeys);
@@ -182,6 +188,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	 */
 	set_base_rel_sizes(root);
 
+	/*
+	 * Build grouped relations for base rels where possible.
+	 */
+	setup_base_grouped_rels(root);
+
 	/*
 	 * We should now have size estimates for every actual table involved in
 	 * the query, and we also know which if any have been deleted from the
@@ -323,6 +334,39 @@ set_base_rel_sizes(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_base_grouped_rels
+ *	  For each base relation, build a grouped base relation if eager
+ *	  aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_base_grouped_rels(PlannerInfo *root)
+{
+	Index		rti;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo *rel = root->simple_rel_array[rti];
+
+		/* there may be empty slots corresponding to non-baserel RTEs */
+		if (rel == NULL)
+			continue;
+
+		Assert(rel->relid == rti);	/* sanity check on array */
+		Assert(IS_SIMPLE_REL(rel)); /* sanity check on rel */
+
+		(void) build_simple_grouped_rel(root, rel);
+	}
+}
+
 /*
  * set_base_rel_pathlists
  *	  Finds all paths available for scanning each base-relation entry.
@@ -559,6 +603,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* Now find the cheapest of the paths for this rel */
 	set_cheapest(rel);
 
+	/*
+	 * If a grouped relation for this rel exists, build partial aggregation
+	 * paths for it.
+	 *
+	 * Note that this can only happen after we've called set_cheapest() for
+	 * this base rel, because we need its cheapest paths.
+	 */
+	set_grouped_rel_pathlist(root, rel);
+
 #ifdef OPTIMIZER_DEBUG
 	pprint(rel);
 #endif
@@ -1305,6 +1358,36 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	add_paths_to_append_rel(root, rel, live_childrels);
 }
 
+/*
+ * set_grouped_rel_pathlist
+ *	  If a grouped relation for the given 'rel' exists, build partial
+ *	  aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Add paths to the grouped base relation if one exists. */
+	grouped_rel = rel->grouped_rel;
+	if (grouped_rel)
+	{
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		generate_grouped_paths(root, grouped_rel, rel,
+							   grouped_rel->agg_info);
+		set_cheapest(grouped_rel);
+	}
+}
+
 
 /*
  * add_paths_to_append_rel
@@ -3335,6 +3418,328 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
 	}
 }
 
+/*
+ * generate_grouped_paths
+ *		Generate paths for a grouped relation by adding sorted and hashed
+ *		partial aggregation paths on top of paths of the ungrouped base or join
+ *		relation.
+ *
+ * The information needed is provided by the RelAggInfo structure.
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *grouped_rel,
+					   RelOptInfo *rel, RelAggInfo *agg_info)
+{
+	AggClauseCosts agg_costs;
+	bool		can_hash;
+	bool		can_sort;
+	Path	   *cheapest_total_path = NULL;
+	Path	   *cheapest_partial_path = NULL;
+	double		dNumGroups = 0;
+	double		dNumPartialGroups = 0;
+
+	if (IS_DUMMY_REL(rel))
+	{
+		mark_dummy_rel(grouped_rel);
+		return;
+	}
+
+	/*
+	 * We push partial aggregation only to the lowest possible level in the
+	 * join tree that is deemed useful.
+	 */
+	if (!bms_equal(agg_info->apply_at, rel->relids) ||
+		!agg_info->agg_useful)
+		return;
+
+	MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+	get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+	/*
+	 * Determine whether it's possible to perform sort-based implementations
+	 * of grouping.
+	 */
+	can_sort = grouping_is_sortable(agg_info->group_clauses);
+
+	/*
+	 * Determine whether we should consider hash-based implementations of
+	 * grouping.
+	 */
+	Assert(root->numOrderedAggs == 0);
+	can_hash = (agg_info->group_clauses != NIL &&
+				grouping_is_hashable(agg_info->group_clauses));
+
+	/*
+	 * Consider whether we should generate partially aggregated non-partial
+	 * paths.  We can only do this if we have a non-partial path.
+	 */
+	if (rel->pathlist != NIL)
+	{
+		cheapest_total_path = rel->cheapest_total_path;
+		Assert(cheapest_total_path != NULL);
+	}
+
+	/*
+	 * If parallelism is possible for grouped_rel, then we should consider
+	 * generating partially-grouped partial paths.  However, if the ungrouped
+	 * rel has no partial paths, then we can't.
+	 */
+	if (grouped_rel->consider_parallel && rel->partial_pathlist != NIL)
+	{
+		cheapest_partial_path = linitial(rel->partial_pathlist);
+		Assert(cheapest_partial_path != NULL);
+	}
+
+	/* Estimate number of partial groups. */
+	if (cheapest_total_path != NULL)
+		dNumGroups = estimate_num_groups(root,
+										 agg_info->group_exprs,
+										 cheapest_total_path->rows,
+										 NULL, NULL);
+	if (cheapest_partial_path != NULL)
+		dNumPartialGroups = estimate_num_groups(root,
+												agg_info->group_exprs,
+												cheapest_partial_path->rows,
+												NULL, NULL);
+
+	if (can_sort && cheapest_total_path != NULL)
+	{
+		ListCell   *lc;
+
+		/*
+		 * Use any available suitably-sorted path as input, and also consider
+		 * sorting the cheapest-total path and incremental sort on any paths
+		 * with presorted keys.
+		 *
+		 * To save planning time, we ignore parameterized input paths unless
+		 * they are the cheapest-total path.
+		 */
+		foreach(lc, rel->pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Ignore parameterized paths that are not the cheapest-total
+			 * path.
+			 */
+			if (input_path->param_info &&
+				input_path != cheapest_total_path)
+				continue;
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest total path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_total_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumGroups);
+
+			add_path(grouped_rel, path);
+		}
+	}
+
+	if (can_sort && cheapest_partial_path != NULL)
+	{
+		ListCell   *lc;
+
+		/* Similar to above logic, but for partial paths. */
+		foreach(lc, rel->partial_pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest partial path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_partial_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumPartialGroups);
+
+			add_partial_path(grouped_rel, path);
+		}
+	}
+
+	/*
+	 * Add a partially-grouped HashAgg Path where possible
+	 */
+	if (can_hash && cheapest_total_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_total_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumGroups);
+
+		add_path(grouped_rel, path);
+	}
+
+	/*
+	 * Now add a partially-grouped HashAgg partial Path where possible
+	 */
+	if (can_hash && cheapest_partial_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_partial_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumPartialGroups);
+
+		add_partial_path(grouped_rel, path);
+	}
+}
+
 /*
  * make_rel_from_joinlist
  *	  Build access paths using a "joinlist" to guide the join path search.
@@ -3494,6 +3899,10 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 		 *
 		 * After that, we're done creating paths for the joinrel, so run
 		 * set_cheapest().
+		 *
+		 * In addition, we also run generate_grouped_paths() for the grouped
+		 * relation of each just-processed joinrel, and run set_cheapest() for
+		 * the grouped relation afterwards.
 		 */
 		foreach(lc, root->join_rel_level[lev])
 		{
@@ -3514,6 +3923,27 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 			/* Find and save the cheapest paths for this rel */
 			set_cheapest(rel);
 
+			/*
+			 * Except for the topmost scan/join rel, consider generating
+			 * partial aggregation paths for the grouped relation on top of
+			 * the paths of this rel.  After that, we're done creating paths
+			 * for the grouped relation, so run set_cheapest().
+			 */
+			if (!bms_equal(rel->relids, root->all_query_rels))
+			{
+				RelOptInfo *grouped_rel;
+
+				grouped_rel = rel->grouped_rel;
+				if (grouped_rel)
+				{
+					Assert(IS_GROUPED_REL(grouped_rel));
+
+					generate_grouped_paths(root, grouped_rel, rel,
+										   grouped_rel->agg_info);
+					set_cheapest(grouped_rel);
+				}
+			}
+
 #ifdef OPTIMIZER_DEBUG
 			pprint(rel);
 #endif
@@ -4383,6 +4813,29 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 		if (IS_DUMMY_REL(child_rel))
 			continue;
 
+		/*
+		 * Except for the topmost scan/join rel, consider generating partial
+		 * aggregation paths for the grouped relation on top of the paths of
+		 * this partitioned child-join.  After that, we're done creating paths
+		 * for the grouped relation, so run set_cheapest().
+		 */
+		if (!bms_equal(IS_OTHER_REL(rel) ?
+					   rel->top_parent_relids : rel->relids,
+					   root->all_query_rels))
+		{
+			RelOptInfo *grouped_rel;
+
+			grouped_rel = child_rel->grouped_rel;
+			if (grouped_rel)
+			{
+				Assert(IS_GROUPED_REL(grouped_rel));
+
+				generate_grouped_paths(root, grouped_rel, child_rel,
+									   grouped_rel->agg_info);
+				set_cheapest(grouped_rel);
+			}
+		}
+
 #ifdef OPTIMIZER_DEBUG
 		pprint(child_rel);
 #endif
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 535248aa525..04cbbcea2a4 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -16,6 +16,7 @@
 
 #include "miscadmin.h"
 #include "optimizer/appendinfo.h"
+#include "optimizer/cost.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -36,6 +37,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
 static bool restriction_is_constant_false(List *restrictlist,
 										  RelOptInfo *joinrel,
 										  bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+								  RelOptInfo *rel2, RelOptInfo *joinrel,
+								  SpecialJoinInfo *sjinfo, List *restrictlist);
 static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 										RelOptInfo *rel2, RelOptInfo *joinrel,
 										SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -762,6 +766,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 		return joinrel;
 	}
 
+	/* Build a grouped join relation for 'joinrel' if possible. */
+	make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+						  restrictlist);
+
 	/* Add paths to the join relation. */
 	populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
 								restrictlist);
@@ -873,6 +881,186 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
 	return input_relids;
 }
 
+/*
+ * make_grouped_join_rel
+ *	  Build a grouped join relation for the given "joinrel" if eager
+ *	  aggregation is applicable and the resulting grouped paths are considered
+ *	  useful.
+ *
+ * There are two strategies for generating grouped paths for a join relation:
+ *
+ * 1. Join a grouped (partially aggregated) input relation with a non-grouped
+ * input (e.g., AGG(B) JOIN A).
+ *
+ * 2. Apply partial aggregation (sorted or hashed) on top of existing
+ * non-grouped join paths (e.g., AGG(A JOIN B)).
+ *
+ * To limit planning effort and avoid an explosion of alternatives, we adopt a
+ * strategy where partial aggregation is only pushed to the lowest possible
+ * level in the join tree that is deemed useful.  That is, if grouped paths can
+ * be built using the first strategy, we skip consideration of the second
+ * strategy for the same join level.
+ *
+ * Additionally, if there are multiple lowest useful levels where partial
+ * aggregation could be applied, such as in a join tree with relations A, B,
+ * and C where both "AGG(A JOIN B) JOIN C" and "A JOIN AGG(B JOIN C)" are valid
+ * placements, we choose only the first one encountered during join search.
+ * This avoids generating multiple versions of the same grouped relation based
+ * on different aggregation placements.
+ *
+ * These heuristics also ensure that all grouped paths for the same grouped
+ * relation produce the same set of rows, which is a basic assumption in the
+ * planner.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+					  RelOptInfo *rel2, RelOptInfo *joinrel,
+					  SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+	RelOptInfo *grouped_rel;
+	RelOptInfo *grouped_rel1;
+	RelOptInfo *grouped_rel2;
+	bool		rel1_empty;
+	bool		rel2_empty;
+	Relids		agg_apply_at;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Retrieve the grouped relations for the two input rels */
+	grouped_rel1 = rel1->grouped_rel;
+	grouped_rel2 = rel2->grouped_rel;
+
+	rel1_empty = (grouped_rel1 == NULL || IS_DUMMY_REL(grouped_rel1));
+	rel2_empty = (grouped_rel2 == NULL || IS_DUMMY_REL(grouped_rel2));
+
+	/* Find or construct a grouped joinrel for this joinrel */
+	grouped_rel = joinrel->grouped_rel;
+	if (grouped_rel == NULL)
+	{
+		RelAggInfo *agg_info = NULL;
+
+		/*
+		 * Prepare the information needed to create grouped paths for this
+		 * join relation.
+		 */
+		agg_info = create_rel_agg_info(root, joinrel);
+		if (agg_info == NULL)
+			return;
+
+		/*
+		 * If grouped paths for the given join relation are not considered
+		 * useful, and no grouped paths can be built by joining grouped input
+		 * relations, skip building the grouped join relation.
+		 */
+		if (!agg_info->agg_useful &&
+			(rel1_empty == rel2_empty))
+			return;
+
+		/* build the grouped relation */
+		grouped_rel = build_grouped_rel(root, joinrel);
+		grouped_rel->reltarget = agg_info->target;
+
+		if (rel1_empty != rel2_empty)
+		{
+			/*
+			 * If there is exactly one grouped input relation, then we can
+			 * build grouped paths by joining the input relations.  Set size
+			 * estimates for the grouped join relation based on the input
+			 * relations, and update the lowest join level where partial
+			 * aggregation is applied to that of the grouped input relation.
+			 */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			agg_info->apply_at = rel1_empty ?
+				grouped_rel2->agg_info->apply_at :
+				grouped_rel1->agg_info->apply_at;
+		}
+		else
+		{
+			/*
+			 * Otherwise, grouped paths can be built by applying partial
+			 * aggregation on top of existing non-grouped join paths.  Set
+			 * size estimates for the grouped join relation based on the
+			 * estimated number of groups, and track the lowest join level
+			 * where partial aggregation is applied.  Note that these values
+			 * may be updated later if it is determined that grouped paths can
+			 * be constructed by joining other input relations.
+			 */
+			grouped_rel->rows = agg_info->grouped_rows;
+			agg_info->apply_at = bms_copy(joinrel->relids);
+		}
+
+		grouped_rel->agg_info = agg_info;
+		joinrel->grouped_rel = grouped_rel;
+	}
+
+	Assert(IS_GROUPED_REL(grouped_rel));
+
+	/* We may have already proven this grouped join relation to be dummy. */
+	if (IS_DUMMY_REL(grouped_rel))
+		return;
+
+	/*
+	 * Nothing to do if there's no grouped input relation.  Also, joining two
+	 * grouped relations is not currently supported.
+	 */
+	if (rel1_empty == rel2_empty)
+		return;
+
+	/*
+	 * Get the lowest join level where partial aggregation is applied among
+	 * the given input relations.
+	 */
+	agg_apply_at = rel1_empty ?
+		grouped_rel2->agg_info->apply_at :
+		grouped_rel1->agg_info->apply_at;
+
+	/*
+	 * If it's not the designated level, skip building grouped paths.
+	 *
+	 * One exception is when it is a subset of the previously recorded level.
+	 * In that case, we need to update the designated level to this one, and
+	 * adjust the size estimates for the grouped join relation accordingly.
+	 * For example, suppose partial aggregation can be applied on top of (B
+	 * JOIN C).  If we first construct the join as ((A JOIN B) JOIN C), we'd
+	 * record the designated level as including all three relations (A B C).
+	 * Later, when we consider (A JOIN (B JOIN C)), we encounter the smaller
+	 * (B C) join level directly.  Since this is a subset of the previous
+	 * level and still valid for partial aggregation, we update the designated
+	 * level to (B C), and adjust the size estimates accordingly.
+	 */
+	if (!bms_equal(agg_apply_at, grouped_rel->agg_info->apply_at))
+	{
+		if (bms_is_subset(agg_apply_at, grouped_rel->agg_info->apply_at))
+		{
+			/* Adjust the size estimates for the grouped join relation. */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			grouped_rel->agg_info->apply_at = agg_apply_at;
+		}
+		else
+			return;
+	}
+
+	/* Make paths for the grouped join relation. */
+	populate_joinrel_with_paths(root,
+								rel1_empty ? rel1 : grouped_rel1,
+								rel2_empty ? rel2 : grouped_rel2,
+								grouped_rel,
+								sjinfo,
+								restrictlist);
+}
+
 /*
  * populate_joinrel_with_paths
  *	  Add paths to the given joinrel for given pair of joining relations. The
@@ -1615,6 +1803,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 						 adjust_child_relids(joinrel->relids,
 											 nappinfos, appinfos)));
 
+		/* Build a grouped join relation for 'child_joinrel' if possible */
+		make_grouped_join_rel(root, child_rel1, child_rel2,
+							  child_joinrel, child_sjinfo,
+							  child_restrictlist);
+
 		/* And make paths for the child join */
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 3e3fec89252..9cc8c558ccf 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/nbtree.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
@@ -31,6 +32,7 @@
 #include "optimizer/restrictinfo.h"
 #include "parser/analyze.h"
 #include "rewrite/rewriteManip.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/typcache.h"
@@ -81,6 +83,9 @@ typedef struct JoinTreeItem
 } JoinTreeItem;
 
 
+static bool is_partial_agg_memory_risky(PlannerInfo *root);
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
 static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
 									   Index rtindex);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -628,6 +633,323 @@ remove_useless_groupby_columns(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_eager_aggregation
+ *	  Check if eager aggregation is applicable, and if so collect suitable
+ *	  aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+	/*
+	 * Don't apply eager aggregation if disabled by user.
+	 */
+	if (!enable_eager_aggregate)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if there are no available GROUP BY
+	 * clauses.
+	 */
+	if (!root->processed_groupClause)
+		return;
+
+	/*
+	 * For now we don't try to support grouping sets.
+	 */
+	if (root->parse->groupingSets)
+		return;
+
+	/*
+	 * For now we don't try to support DISTINCT or ORDER BY aggregates.
+	 */
+	if (root->numOrderedAggs > 0)
+		return;
+
+	/*
+	 * If there are any aggregates that do not support partial mode, or any
+	 * partial aggregates that are non-serializable, do not apply eager
+	 * aggregation.
+	 */
+	if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+		return;
+
+	/*
+	 * We don't try to apply eager aggregation if there are set-returning
+	 * functions in targetlist.
+	 */
+	if (root->parse->hasTargetSRFs)
+		return;
+
+	/*
+	 * Eager aggregation only makes sense if there are multiple base rels in
+	 * the query.
+	 */
+	if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if any aggregate poses a risk of
+	 * excessive memory usage during partial aggregation.
+	 */
+	if (is_partial_agg_memory_risky(root))
+		return;
+
+	/*
+	 * Collect aggregate expressions and plain Vars that appear in the
+	 * targetlist and havingQual.
+	 */
+	create_agg_clause_infos(root);
+
+	/*
+	 * If there are no suitable aggregate expressions, we cannot apply eager
+	 * aggregation.
+	 */
+	if (root->agg_clause_list == NIL)
+		return;
+
+	/*
+	 * Collect grouping expressions that appear in grouping clauses.
+	 */
+	create_grouping_expr_infos(root);
+}
+
+/*
+ * is_partial_agg_memory_risky
+ *	  Checks if any aggregate poses a risk of excessive memory usage during
+ *	  partial aggregation.
+ *
+ * We check if any aggregate uses INTERNAL transition type.  Although INTERNAL
+ * is marked as pass-by-value, it usually points to a large internal data
+ * structure (like those used by string_agg or array_agg).  These transition
+ * states can grow large and their size is hard to estimate.  Applying eager
+ * aggregation in such cases risks high memory usage since partial aggregation
+ * results might be stored in join hash tables or materialized nodes.
+ *
+ * We explicitly exclude aggregates with F_NUMERIC_AVG_ACCUM transition
+ * function from this check, based on the assumption that avg(numeric) and
+ * sum(numeric) are safe in this context.
+ */
+static bool
+is_partial_agg_memory_risky(PlannerInfo *root)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->aggtransinfos)
+	{
+		AggTransInfo *transinfo = lfirst_node(AggTransInfo, lc);
+
+		if (transinfo->transfn_oid == F_NUMERIC_AVG_ACCUM)
+			continue;
+
+		if (transinfo->aggtranstype == INTERNALOID)
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * create_agg_clause_infos
+ *	  Search the targetlist and havingQual for Aggrefs and plain Vars, and
+ *	  create an AggClauseInfo for each Aggref node.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+	List	   *tlist_exprs;
+	List	   *agg_clause_list = NIL;
+	List	   *tlist_vars = NIL;
+	Relids		aggregate_relids = NULL;
+	bool		eager_agg_applicable = true;
+	ListCell   *lc;
+
+	Assert(root->agg_clause_list == NIL);
+	Assert(root->tlist_vars == NIL);
+
+	tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+								  PVC_INCLUDE_AGGREGATES |
+								  PVC_RECURSE_WINDOWFUNCS |
+								  PVC_RECURSE_PLACEHOLDERS);
+
+	/*
+	 * Aggregates within the HAVING clause need to be processed in the same
+	 * way as those in the targetlist.  Note that HAVING can contain Aggrefs
+	 * but not WindowFuncs.
+	 */
+	if (root->parse->havingQual != NULL)
+	{
+		List	   *having_exprs;
+
+		having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+									   PVC_INCLUDE_AGGREGATES |
+									   PVC_RECURSE_PLACEHOLDERS);
+		if (having_exprs != NIL)
+		{
+			tlist_exprs = list_concat(tlist_exprs, having_exprs);
+			list_free(having_exprs);
+		}
+	}
+
+	foreach(lc, tlist_exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Aggref	   *aggref;
+		Relids		agg_eval_at;
+		AggClauseInfo *ac_info;
+
+		/* For now we don't try to support GROUPING() expressions */
+		if (IsA(expr, GroupingFunc))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* Collect plain Vars for future reference */
+		if (IsA(expr, Var))
+		{
+			tlist_vars = list_append_unique(tlist_vars, expr);
+			continue;
+		}
+
+		aggref = castNode(Aggref, expr);
+
+		Assert(aggref->aggorder == NIL);
+		Assert(aggref->aggdistinct == NIL);
+
+		/*
+		 * If there are any securityQuals, do not try to apply eager
+		 * aggregation if any non-leakproof aggregate functions are present.
+		 * This is overly strict, but for now...
+		 */
+		if (root->qual_security_level > 0 &&
+			!get_func_leakproof(aggref->aggfnoid))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+		/*
+		 * If all base relations in the query are referenced by aggregate
+		 * functions, then eager aggregation is not applicable.
+		 */
+		aggregate_relids = bms_add_members(aggregate_relids, agg_eval_at);
+		if (bms_is_subset(root->all_baserels, aggregate_relids))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* OK, create the AggClauseInfo node */
+		ac_info = makeNode(AggClauseInfo);
+		ac_info->aggref = aggref;
+		ac_info->agg_eval_at = agg_eval_at;
+
+		/* ... and add it to the list */
+		agg_clause_list = list_append_unique(agg_clause_list, ac_info);
+	}
+
+	list_free(tlist_exprs);
+
+	if (eager_agg_applicable)
+	{
+		root->agg_clause_list = agg_clause_list;
+		root->tlist_vars = tlist_vars;
+	}
+	else
+	{
+		list_free_deep(agg_clause_list);
+		list_free(tlist_vars);
+	}
+}
+
+/*
+ * create_grouping_expr_infos
+ *	  Create a GroupingExprInfo for each expression usable as grouping key.
+ *
+ * If any grouping expression is not suitable, we will just return with
+ * root->group_expr_list being NIL.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+	List	   *exprs = NIL;
+	List	   *sortgrouprefs = NIL;
+	List	   *btree_opfamilies = NIL;
+	ListCell   *lc,
+			   *lc1,
+			   *lc2,
+			   *lc3;
+
+	Assert(root->group_expr_list == NIL);
+
+	foreach(lc, root->processed_groupClause)
+	{
+		SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+		TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+		TypeCacheEntry *tce;
+		Oid			equalimageproc;
+
+		Assert(tle->ressortgroupref > 0);
+
+		/*
+		 * For now we only support plain Vars as grouping expressions.
+		 */
+		if (!IsA(tle->expr, Var))
+			return;
+
+		/*
+		 * Eager aggregation is only possible if equality implies image
+		 * equality for each grouping key.  Otherwise, placing keys with
+		 * different byte images into the same group may result in the loss of
+		 * information that could be necessary to evaluate upper qual clauses.
+		 *
+		 * For instance, the NUMERIC data type is not supported, as values
+		 * that are considered equal by the equality operator (e.g., 0 and
+		 * 0.0) can have different scales.
+		 */
+		tce = lookup_type_cache(exprType((Node *) tle->expr),
+								TYPECACHE_BTREE_OPFAMILY);
+		if (!OidIsValid(tce->btree_opf) ||
+			!OidIsValid(tce->btree_opintype))
+			return;
+
+		equalimageproc = get_opfamily_proc(tce->btree_opf,
+										   tce->btree_opintype,
+										   tce->btree_opintype,
+										   BTEQUALIMAGE_PROC);
+		if (!OidIsValid(equalimageproc) ||
+			!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+											   tce->typcollation,
+											   ObjectIdGetDatum(tce->btree_opintype))))
+			return;
+
+		exprs = lappend(exprs, tle->expr);
+		sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+		btree_opfamilies = lappend_oid(btree_opfamilies, tce->btree_opf);
+	}
+
+	/*
+	 * Construct a GroupingExprInfo for each expression.
+	 */
+	forthree(lc1, exprs, lc2, sortgrouprefs, lc3, btree_opfamilies)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc1);
+		int			sortgroupref = lfirst_int(lc2);
+		Oid			btree_opfamily = lfirst_oid(lc3);
+		GroupingExprInfo *ge_info;
+
+		ge_info = makeNode(GroupingExprInfo);
+		ge_info->expr = (Expr *) copyObject(expr);
+		ge_info->sortgroupref = sortgroupref;
+		ge_info->btree_opfamily = btree_opfamily;
+
+		root->group_expr_list = lappend(root->group_expr_list, ge_info);
+	}
+}
+
 /*****************************************************************************
  *
  *	  LATERAL REFERENCES
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 5467e094ca7..eefc486a566 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -76,6 +76,9 @@ query_planner(PlannerInfo *root,
 	root->placeholder_list = NIL;
 	root->placeholder_array = NULL;
 	root->placeholder_array_size = 0;
+	root->agg_clause_list = NIL;
+	root->group_expr_list = NIL;
+	root->tlist_vars = NIL;
 	root->fkey_list = NIL;
 	root->initial_rels = NIL;
 
@@ -265,6 +268,12 @@ query_planner(PlannerInfo *root,
 	 */
 	extract_restriction_or_clauses(root);
 
+	/*
+	 * Check if eager aggregation is applicable, and if so, set up
+	 * root->agg_clause_list and root->group_expr_list.
+	 */
+	setup_eager_aggregation(root);
+
 	/*
 	 * Now expand appendrels by adding "otherrels" for their children.  We
 	 * delay this to the end so that we have as much information as possible
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 41bd8353430..462c5335589 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -232,7 +232,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 									  RelOptInfo *partially_grouped_rel,
 									  const AggClauseCosts *agg_costs,
 									  grouping_sets_data *gd,
-									  double dNumGroups,
 									  GroupPathExtraData *extra);
 static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
 												 RelOptInfo *grouped_rel,
@@ -4010,9 +4009,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 							   GroupPathExtraData *extra,
 							   RelOptInfo **partially_grouped_rel_p)
 {
-	Path	   *cheapest_path = input_rel->cheapest_total_path;
 	RelOptInfo *partially_grouped_rel = NULL;
-	double		dNumGroups;
 	PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
 
 	/*
@@ -4094,23 +4091,16 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Gather any partially grouped partial paths. */
 	if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
-	{
 		gather_grouping_paths(root, partially_grouped_rel);
-		set_cheapest(partially_grouped_rel);
-	}
 
-	/*
-	 * Estimate number of groups.
-	 */
-	dNumGroups = get_number_of_groups(root,
-									  cheapest_path->rows,
-									  gd,
-									  extra->targetList);
+	/* Now choose the best path(s) for partially_grouped_rel. */
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+		set_cheapest(partially_grouped_rel);
 
 	/* Build final grouping paths */
 	add_paths_to_grouping_rel(root, input_rel, grouped_rel,
 							  partially_grouped_rel, agg_costs, gd,
-							  dNumGroups, extra);
+							  extra);
 
 	/* Give a helpful error if we failed to find any implementation */
 	if (grouped_rel->pathlist == NIL)
@@ -7055,16 +7045,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 						  RelOptInfo *grouped_rel,
 						  RelOptInfo *partially_grouped_rel,
 						  const AggClauseCosts *agg_costs,
-						  grouping_sets_data *gd, double dNumGroups,
+						  grouping_sets_data *gd,
 						  GroupPathExtraData *extra)
 {
 	Query	   *parse = root->parse;
 	Path	   *cheapest_path = input_rel->cheapest_total_path;
+	Path	   *cheapest_partially_grouped_path = NULL;
 	ListCell   *lc;
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 	List	   *havingQual = (List *) extra->havingQual;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+	double		dNumGroups = 0;
+	double		dNumFinalGroups = 0;
+
+	/*
+	 * Estimate number of groups for non-split aggregation.
+	 */
+	dNumGroups = get_number_of_groups(root,
+									  cheapest_path->rows,
+									  gd,
+									  extra->targetList);
+
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+	{
+		cheapest_partially_grouped_path =
+			partially_grouped_rel->cheapest_total_path;
+
+		/*
+		 * Estimate number of groups for final phase of partial aggregation.
+		 */
+		dNumFinalGroups =
+			get_number_of_groups(root,
+								 cheapest_partially_grouped_path->rows,
+								 gd,
+								 extra->targetList);
+	}
 
 	if (can_sort)
 	{
@@ -7177,7 +7193,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 					path = make_ordered_path(root,
 											 grouped_rel,
 											 path,
-											 partially_grouped_rel->cheapest_total_path,
+											 cheapest_partially_grouped_path,
 											 info->pathkeys,
 											 -1.0);
 
@@ -7195,7 +7211,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												 info->clauses,
 												 havingQual,
 												 agg_final_costs,
-												 dNumGroups));
+												 dNumFinalGroups));
 					else
 						add_path(grouped_rel, (Path *)
 								 create_group_path(root,
@@ -7203,7 +7219,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												   path,
 												   info->clauses,
 												   havingQual,
-												   dNumGroups));
+												   dNumFinalGroups));
 
 				}
 			}
@@ -7245,19 +7261,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 		 */
 		if (partially_grouped_rel && partially_grouped_rel->pathlist)
 		{
-			Path	   *path = partially_grouped_rel->cheapest_total_path;
-
 			add_path(grouped_rel, (Path *)
 					 create_agg_path(root,
 									 grouped_rel,
-									 path,
+									 cheapest_partially_grouped_path,
 									 grouped_rel->reltarget,
 									 AGG_HASHED,
 									 AGGSPLIT_FINAL_DESERIAL,
 									 root->processed_groupClause,
 									 havingQual,
 									 agg_final_costs,
-									 dNumGroups));
+									 dNumFinalGroups));
 		}
 	}
 
@@ -7297,6 +7311,7 @@ create_partial_grouping_paths(PlannerInfo *root,
 {
 	Query	   *parse = root->parse;
 	RelOptInfo *partially_grouped_rel;
+	RelOptInfo *eager_agg_rel = NULL;
 	AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
 	Path	   *cheapest_partial_path = NULL;
@@ -7307,6 +7322,15 @@ create_partial_grouping_paths(PlannerInfo *root,
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 
+	/*
+	 * Check whether any partially aggregated paths have been generated
+	 * through eager aggregation.
+	 */
+	if (input_rel->grouped_rel &&
+		!IS_DUMMY_REL(input_rel->grouped_rel) &&
+		input_rel->grouped_rel->pathlist != NIL)
+		eager_agg_rel = input_rel->grouped_rel;
+
 	/*
 	 * Consider whether we should generate partially aggregated non-partial
 	 * paths.  We can only do this if we have a non-partial path, and only if
@@ -7328,11 +7352,13 @@ create_partial_grouping_paths(PlannerInfo *root,
 
 	/*
 	 * If we can't partially aggregate partial paths, and we can't partially
-	 * aggregate non-partial paths, then don't bother creating the new
+	 * aggregate non-partial paths, and no partially aggregated paths were
+	 * generated by eager aggregation, then don't bother creating the new
 	 * RelOptInfo at all, unless the caller specified force_rel_creation.
 	 */
 	if (cheapest_total_path == NULL &&
 		cheapest_partial_path == NULL &&
+		eager_agg_rel == NULL &&
 		!force_rel_creation)
 		return NULL;
 
@@ -7557,6 +7583,51 @@ create_partial_grouping_paths(PlannerInfo *root,
 										 dNumPartialPartialGroups));
 	}
 
+	/*
+	 * Add any partially aggregated paths generated by eager aggregation to
+	 * the new upper relation after applying projection steps as needed.
+	 */
+	if (eager_agg_rel)
+	{
+		/* Add the paths */
+		foreach(lc, eager_agg_rel->pathlist)
+		{
+			Path	   *path = (Path *) lfirst(lc);
+
+			/* Shouldn't have any parameterized paths anymore */
+			Assert(path->param_info == NULL);
+
+			path = (Path *) create_projection_path(root,
+												   partially_grouped_rel,
+												   path,
+												   partially_grouped_rel->reltarget);
+
+			add_path(partially_grouped_rel, path);
+		}
+
+		/*
+		 * Likewise add the partial paths, but only if parallelism is possible
+		 * for partially_grouped_rel.
+		 */
+		if (partially_grouped_rel->consider_parallel)
+		{
+			foreach(lc, eager_agg_rel->partial_pathlist)
+			{
+				Path	   *path = (Path *) lfirst(lc);
+
+				/* Shouldn't have any parameterized paths anymore */
+				Assert(path->param_info == NULL);
+
+				path = (Path *) create_projection_path(root,
+													   partially_grouped_rel,
+													   path,
+													   partially_grouped_rel->reltarget);
+
+				add_partial_path(partially_grouped_rel, path);
+			}
+		}
+	}
+
 	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding partially grouped ForeignPaths.
@@ -8120,13 +8191,6 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
 
 		add_paths_to_append_rel(root, partially_grouped_rel,
 								partially_grouped_live_children);
-
-		/*
-		 * We need call set_cheapest, since the finalization step will use the
-		 * cheapest path from the rel.
-		 */
-		if (partially_grouped_rel->pathlist)
-			set_cheapest(partially_grouped_rel);
 	}
 
 	/* If possible, create append paths for fully grouped children. */
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index 5b3dc0d8653..11c0eb0d180 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -516,6 +516,65 @@ adjust_appendrel_attrs_mutator(Node *node,
 		return (Node *) newinfo;
 	}
 
+	/*
+	 * We have to process RelAggInfo nodes specially.
+	 */
+	if (IsA(node, RelAggInfo))
+	{
+		RelAggInfo *oldinfo = (RelAggInfo *) node;
+		RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newinfo, oldinfo, sizeof(RelAggInfo));
+
+		newinfo->relids = adjust_child_relids(oldinfo->relids,
+											  nappinfos, appinfos);
+
+		newinfo->target = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+										   context);
+
+		newinfo->agg_input = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+										   context);
+
+		newinfo->group_clauses = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_clauses,
+										   context);
+
+		newinfo->group_exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+										   context);
+
+		return (Node *) newinfo;
+	}
+
+	/*
+	 * We have to process PathTarget nodes specially.
+	 */
+	if (IsA(node, PathTarget))
+	{
+		PathTarget *oldtarget = (PathTarget *) node;
+		PathTarget *newtarget = makeNode(PathTarget);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+		newtarget->exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+										   context);
+
+		if (oldtarget->sortgrouprefs)
+		{
+			Size		nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+			newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+			memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+		}
+
+		return (Node *) newtarget;
+	}
+
 	/*
 	 * NOTE: we do not need to recurse into sublinks, because they should
 	 * already have been converted to subplans before we see them.
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 0e523d2eb5b..faa44e46594 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,8 @@
 
 #include <limits.h>
 
+#include "access/nbtree.h"
+#include "catalog/pg_constraint.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
@@ -27,12 +29,16 @@
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
 #include "parser/parse_relation.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/typcache.h"
 
 
 typedef struct JoinHashEntry
@@ -83,6 +89,14 @@ static void build_child_join_reltarget(PlannerInfo *root,
 									   RelOptInfo *childrel,
 									   int nappinfos,
 									   AppendRelInfo **appinfos);
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+													RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+								  PathTarget *target, PathTarget *agg_input,
+								  List **group_clauses, List **group_exprs);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
 
 
 /*
@@ -278,6 +292,8 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	rel->joininfo = NIL;
 	rel->has_eclass_joins = false;
 	rel->consider_partitionwise_join = false;	/* might get changed later */
+	rel->agg_info = NULL;
+	rel->grouped_rel = NULL;
 	rel->part_scheme = NULL;
 	rel->nparts = -1;
 	rel->boundinfo = NULL;
@@ -408,6 +424,103 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	return rel;
 }
 
+/*
+ * build_simple_grouped_rel
+ *	  Construct a new RelOptInfo representing a grouped version of the input
+ *	  base relation.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+	RelAggInfo *agg_info;
+
+	/*
+	 * We should have available aggregate expressions and grouping
+	 * expressions, otherwise we cannot reach here.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/* nothing to do for dummy rel */
+	if (IS_DUMMY_REL(rel))
+		return NULL;
+
+	/*
+	 * Prepare the information needed to create grouped paths for this base
+	 * relation.
+	 */
+	agg_info = create_rel_agg_info(root, rel);
+	if (agg_info == NULL)
+		return NULL;
+
+	/*
+	 * If grouped paths for the given base relation are not considered useful,
+	 * skip building the grouped relation.
+	 */
+	if (!agg_info->agg_useful)
+		return NULL;
+
+	/* Tracks the lowest join level at which partial aggregation is applied */
+	agg_info->apply_at = bms_copy(rel->relids);
+
+	/* build the grouped relation */
+	grouped_rel = build_grouped_rel(root, rel);
+	grouped_rel->reltarget = agg_info->target;
+	grouped_rel->rows = agg_info->grouped_rows;
+	grouped_rel->agg_info = agg_info;
+
+	rel->grouped_rel = grouped_rel;
+
+	return grouped_rel;
+}
+
+/*
+ * build_grouped_rel
+ *	  Build a grouped relation by flat copying the input relation and resetting
+ *	  the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	grouped_rel = makeNode(RelOptInfo);
+	memcpy(grouped_rel, rel, sizeof(RelOptInfo));
+
+	/*
+	 * clear path info
+	 */
+	grouped_rel->pathlist = NIL;
+	grouped_rel->ppilist = NIL;
+	grouped_rel->partial_pathlist = NIL;
+	grouped_rel->cheapest_startup_path = NULL;
+	grouped_rel->cheapest_total_path = NULL;
+	grouped_rel->cheapest_parameterized_paths = NIL;
+
+	/*
+	 * clear partition info
+	 */
+	grouped_rel->part_scheme = NULL;
+	grouped_rel->nparts = -1;
+	grouped_rel->boundinfo = NULL;
+	grouped_rel->partbounds_merged = false;
+	grouped_rel->partition_qual = NIL;
+	grouped_rel->part_rels = NULL;
+	grouped_rel->live_parts = NULL;
+	grouped_rel->all_partrels = NULL;
+	grouped_rel->partexprs = NULL;
+	grouped_rel->nullable_partexprs = NULL;
+	grouped_rel->consider_partitionwise_join = false;
+
+	/*
+	 * clear size estimates
+	 */
+	grouped_rel->rows = 0;
+
+	return grouped_rel;
+}
+
 /*
  * find_base_rel
  *	  Find a base or otherrel relation entry, which must already exist.
@@ -759,6 +872,8 @@ build_join_rel(PlannerInfo *root,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = NULL;
 	joinrel->top_parent = NULL;
 	joinrel->top_parent_relids = NULL;
@@ -945,6 +1060,8 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = parent_joinrel;
 	joinrel->top_parent = parent_joinrel->top_parent ? parent_joinrel->top_parent : parent_joinrel;
 	joinrel->top_parent_relids = joinrel->top_parent->relids;
@@ -2523,3 +2640,514 @@ build_child_join_reltarget(PlannerInfo *root,
 	childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
 	childrel->reltarget->width = parentrel->reltarget->width;
 }
+
+/*
+ * create_rel_agg_info
+ *	  Create the RelAggInfo structure for the given relation if it can produce
+ *	  grouped paths.  The given relation is the non-grouped one which has the
+ *	  reltarget already constructed.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	RelAggInfo *result;
+	PathTarget *agg_input;
+	PathTarget *target;
+	List	   *group_clauses = NIL;
+	List	   *group_exprs = NIL;
+
+	/*
+	 * The lists of aggregate expressions and grouping expressions should have
+	 * been constructed.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/*
+	 * If this is a child rel, the grouped rel for its parent rel must already
+	 * have been created if possible.  So we can just use the parent's
+	 * RelAggInfo if there is one, with appropriate variable substitutions.
+	 */
+	if (IS_OTHER_REL(rel))
+	{
+		RelOptInfo *grouped_rel;
+		RelAggInfo *agg_info;
+
+		grouped_rel = rel->top_parent->grouped_rel;
+		if (grouped_rel == NULL)
+			return NULL;
+
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		/* Must do multi-level transformation */
+		agg_info = (RelAggInfo *)
+			adjust_appendrel_attrs_multilevel(root,
+											  (Node *) grouped_rel->agg_info,
+											  rel,
+											  rel->top_parent);
+
+		agg_info->grouped_rows =
+			estimate_num_groups(root, agg_info->group_exprs,
+								rel->rows, NULL, NULL);
+
+		agg_info->apply_at = NULL;	/* caller will change this later */
+
+		/*
+		 * The grouped paths for the given relation are considered useful iff
+		 * the average group size is no less than min_eager_agg_group_size.
+		 */
+		agg_info->agg_useful =
+			(rel->rows / agg_info->grouped_rows) >= min_eager_agg_group_size;
+
+		return agg_info;
+	}
+
+	/* Check if it's possible to produce grouped paths for this relation. */
+	if (!eager_aggregation_possible_for_relation(root, rel))
+		return NULL;
+
+	/*
+	 * Create targets for the grouped paths and for the input paths of the
+	 * grouped paths.
+	 */
+	target = create_empty_pathtarget();
+	agg_input = create_empty_pathtarget();
+
+	/* ... and initialize these targets */
+	if (!init_grouping_targets(root, rel, target, agg_input,
+							   &group_clauses, &group_exprs))
+		return NULL;
+
+	/*
+	 * Eager aggregation is not applicable if there are no available grouping
+	 * expressions.
+	 */
+	if (list_length(group_clauses) == 0)
+		return NULL;
+
+	/* build the RelAggInfo result */
+	result = makeNode(RelAggInfo);
+
+	result->group_clauses = group_clauses;
+	result->group_exprs = group_exprs;
+
+	/* Calculate pathkeys that represent these grouping requirements */
+	result->group_pathkeys =
+		make_pathkeys_for_sortclauses(root, result->group_clauses,
+									  make_tlist_from_pathtarget(target));
+
+	/* Add aggregates to the grouping target */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		Aggref	   *aggref;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		aggref = (Aggref *) copyObject(ac_info->aggref);
+		mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+		add_column_to_pathtarget(target, (Expr *) aggref, 0);
+	}
+
+	/* Set the estimated eval cost and output width for both targets */
+	set_pathtarget_cost_width(root, target);
+	set_pathtarget_cost_width(root, agg_input);
+
+	result->relids = bms_copy(rel->relids);
+	result->target = target;
+	result->agg_input = agg_input;
+	result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+											   rel->rows, NULL, NULL);
+	result->apply_at = NULL;	/* caller will change this later */
+
+	/*
+	 * The grouped paths for the given relation are considered useful iff the
+	 * average group size is no less than min_eager_agg_group_size.
+	 */
+	result->agg_useful =
+		(rel->rows / result->grouped_rows) >= min_eager_agg_group_size;
+
+	return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ *	  Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	int			cur_relid;
+
+	/*
+	 * Check to see if the given relation is on the nullable side of an outer
+	 * join.  In this case, we cannot push a partial aggregation down to the
+	 * relation, because the NULL-extended rows produced by the outer join
+	 * would not be available when we perform the partial aggregation, while
+	 * with a non-eager-aggregation plan these rows are available for the
+	 * top-level aggregation.  Doing so may result in the rows being grouped
+	 * differently than expected, or produce incorrect values from the
+	 * aggregate functions.
+	 */
+	cur_relid = -1;
+	while ((cur_relid = bms_next_member(rel->relids, cur_relid)) >= 0)
+	{
+		RelOptInfo *baserel = find_base_rel_ignore_join(root, cur_relid);
+
+		if (baserel == NULL)
+			continue;			/* ignore outer joins in rel->relids */
+
+		if (!bms_is_subset(baserel->nulling_relids, rel->relids))
+			return false;
+	}
+
+	/*
+	 * For now we don't try to support PlaceHolderVars.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = lfirst(lc);
+
+		if (IsA(expr, PlaceHolderVar))
+			return false;
+	}
+
+	/* Caller should only pass base relations or joins. */
+	Assert(rel->reloptkind == RELOPT_BASEREL ||
+		   rel->reloptkind == RELOPT_JOINREL);
+
+	/*
+	 * Check if all aggregate expressions can be evaluated on this relation
+	 * level.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		/*
+		 * Give up if any aggregate requires relations other than the current
+		 * one.  If the aggregate requires the current relation plus
+		 * additional relations, grouping the current relation could make some
+		 * input rows unavailable for the higher aggregate and may reduce the
+		 * number of input rows it receives.  If the aggregate does not
+		 * require the current relation at all, it should not be grouped, as
+		 * we do not support joining two grouped relations.
+		 */
+		if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * init_grouping_targets
+ *	  Initialize the target for grouped paths (target) as well as the target
+ *	  for paths that generate input for the grouped paths (agg_input).
+ *
+ * We also construct the list of SortGroupClauses and the list of grouping
+ * expressions for the partial aggregation, and return them in *group_clauses
+ * and *group_exprs.
+ *
+ * Return true if the targets could be initialized, false otherwise.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+					  PathTarget *target, PathTarget *agg_input,
+					  List **group_clauses, List **group_exprs)
+{
+	ListCell   *lc;
+	List	   *possibly_dependent = NIL;
+	Index		maxSortGroupRef;
+
+	/* Identify the max sortgroupref */
+	maxSortGroupRef = 0;
+	foreach(lc, root->processed_tlist)
+	{
+		Index		ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+		if (ref > maxSortGroupRef)
+			maxSortGroupRef = ref;
+	}
+
+	/*
+	 * At this point, all Vars from this relation that are needed by upper
+	 * joins or are required in the final targetlist should already be present
+	 * in its reltarget.  Therefore, we can safely iterate over this
+	 * relation's reltarget->exprs to construct the PathTarget and grouping
+	 * clauses for the grouped paths.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Index		sortgroupref;
+
+		/*
+		 * Given that PlaceHolderVar currently prevents us from doing eager
+		 * aggregation, the source target cannot contain anything more complex
+		 * than a Var.
+		 */
+		Assert(IsA(expr, Var));
+
+		/*
+		 * Get the sortgroupref of the expr if it is found among, or can be
+		 * deduced from, the original grouping expressions.
+		 */
+		sortgroupref = get_expression_sortgroupref(root, expr);
+		if (sortgroupref > 0)
+		{
+			SortGroupClause *sgc;
+
+			/* Find the matching SortGroupClause */
+			sgc = get_sortgroupref_clause(sortgroupref, root->processed_groupClause);
+			Assert(sgc->tleSortGroupRef <= maxSortGroupRef);
+
+			/*
+			 * If the target expression is to be used as a grouping key, it
+			 * should be emitted by the grouped paths that have been pushed
+			 * down to this relation level.
+			 */
+			add_column_to_pathtarget(target, expr, sortgroupref);
+
+			/*
+			 * ... and it also should be emitted by the input paths.
+			 */
+			add_column_to_pathtarget(agg_input, expr, sortgroupref);
+
+			/*
+			 * Record this SortGroupClause and grouping expression.  Note that
+			 * this SortGroupClause might have already been recorded.
+			 */
+			if (!list_member(*group_clauses, sgc))
+			{
+				*group_clauses = lappend(*group_clauses, sgc);
+				*group_exprs = lappend(*group_exprs, expr);
+			}
+		}
+		else if (is_var_needed_by_join(root, (Var *) expr, rel))
+		{
+			/*
+			 * The expression is needed for an upper join but is neither in
+			 * the GROUP BY clause nor derivable from it using EC (otherwise,
+			 * it would have already been included in the targets above).  We
+			 * need to create a special SortGroupClause for this expression.
+			 *
+			 * It is important to include such expressions in the grouping
+			 * keys.  This is essential to ensure that an aggregated row from
+			 * the partial aggregation matches the other side of the join if
+			 * and only if each row in the partial group does.  This ensures
+			 * that all rows within the same partial group share the same
+			 * 'destiny', which is crucial for maintaining correctness.
+			 */
+			SortGroupClause *sgc;
+			TypeCacheEntry *tce;
+			Oid			equalimageproc;
+
+			/*
+			 * But first, check if equality implies image equality for this
+			 * expression.  If not, we cannot use it as a grouping key.  See
+			 * comments in create_grouping_expr_infos().
+			 */
+			tce = lookup_type_cache(exprType((Node *) expr),
+									TYPECACHE_BTREE_OPFAMILY);
+			if (!OidIsValid(tce->btree_opf) ||
+				!OidIsValid(tce->btree_opintype))
+				return false;
+
+			equalimageproc = get_opfamily_proc(tce->btree_opf,
+											   tce->btree_opintype,
+											   tce->btree_opintype,
+											   BTEQUALIMAGE_PROC);
+			if (!OidIsValid(equalimageproc) ||
+				!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+												   tce->typcollation,
+												   ObjectIdGetDatum(tce->btree_opintype))))
+				return false;
+
+			/* Create the SortGroupClause. */
+			sgc = makeNode(SortGroupClause);
+
+			/* Initialize the SortGroupClause. */
+			sgc->tleSortGroupRef = ++maxSortGroupRef;
+			get_sort_group_operators(exprType((Node *) expr),
+									 false, true, false,
+									 &sgc->sortop, &sgc->eqop, NULL,
+									 &sgc->hashable);
+
+			/* This expression should be emitted by the grouped paths */
+			add_column_to_pathtarget(target, expr, sgc->tleSortGroupRef);
+
+			/* ... and it also should be emitted by the input paths. */
+			add_column_to_pathtarget(agg_input, expr, sgc->tleSortGroupRef);
+
+			/* Record this SortGroupClause and grouping expression */
+			*group_clauses = lappend(*group_clauses, sgc);
+			*group_exprs = lappend(*group_exprs, expr);
+		}
+		else if (is_var_in_aggref_only(root, (Var *) expr))
+		{
+			/*
+			 * The expression is referenced by an aggregate function pushed
+			 * down to this relation and does not appear elsewhere in the
+			 * targetlist or havingQual.  Add it to 'agg_input' but not to
+			 * 'target'.
+			 */
+			add_new_column_to_pathtarget(agg_input, expr);
+		}
+		else
+		{
+			/*
+			 * The expression may be functionally dependent on other
+			 * expressions in the target, but we cannot verify this until all
+			 * target expressions have been constructed.
+			 */
+			possibly_dependent = lappend(possibly_dependent, expr);
+		}
+	}
+
+	/*
+	 * Now we can verify whether an expression is functionally dependent on
+	 * others.
+	 */
+	foreach(lc, possibly_dependent)
+	{
+		Var		   *tvar;
+		List	   *deps = NIL;
+		RangeTblEntry *rte;
+
+		tvar = lfirst_node(Var, lc);
+		rte = root->simple_rte_array[tvar->varno];
+
+		if (check_functional_grouping(rte->relid, tvar->varno,
+									  tvar->varlevelsup,
+									  target->exprs, &deps))
+		{
+			/*
+			 * The expression is functionally dependent on other target
+			 * expressions, so it can be included in the targets.  Since it
+			 * will not be used as a grouping key, a sortgroupref is not
+			 * needed for it.
+			 */
+			add_new_column_to_pathtarget(target, (Expr *) tvar);
+			add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+		}
+		else
+		{
+			/*
+			 * We may arrive here with a grouping expression that is proven
+			 * redundant by EquivalenceClass processing, such as 't1.a' in the
+			 * query below.
+			 *
+			 * select max(t1.c) from t t1, t t2 where t1.a = 1 group by t1.a,
+			 * t1.b;
+			 *
+			 * For now we just give up in this case.
+			 */
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ *	  Check whether the given Var appears in aggregate expressions and not
+ *	  elsewhere in the targetlist or havingQual.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+	ListCell   *lc;
+
+	/*
+	 * Search the list of aggregate expressions for the Var.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		List	   *vars;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+			continue;
+
+		vars = pull_var_clause((Node *) ac_info->aggref,
+							   PVC_RECURSE_AGGREGATES |
+							   PVC_RECURSE_WINDOWFUNCS |
+							   PVC_RECURSE_PLACEHOLDERS);
+
+		if (list_member(vars, var))
+		{
+			list_free(vars);
+			break;
+		}
+
+		list_free(vars);
+	}
+
+	return (lc != NULL && !list_member(root->tlist_vars, var));
+}
+
+/*
+ * is_var_needed_by_join
+ *	  Check if the given Var is needed by joins above the current rel.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+	Relids		relids;
+	int			attno;
+	RelOptInfo *baserel;
+
+	/*
+	 * Note that when checking if the Var is needed by joins above, we want to
+	 * exclude cases where the Var is only needed in the final targetlist.  So
+	 * include "relation 0" in the check.
+	 */
+	relids = bms_copy(rel->relids);
+	relids = bms_add_member(relids, 0);
+
+	baserel = find_base_rel(root, var->varno);
+	attno = var->varattno - baserel->min_attr;
+
+	return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ *	  Return the sortgroupref of the given "expr" if it is found among the
+ *	  original grouping expressions, or is known equal to any of the original
+ *	  grouping expressions due to equivalence relationships.  Return 0 if no
+ *	  match is found.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->group_expr_list)
+	{
+		GroupingExprInfo *ge_info = lfirst_node(GroupingExprInfo, lc);
+
+		Assert(IsA(ge_info->expr, Var));
+
+		if (equal(ge_info->expr, expr) ||
+			exprs_known_equal(root, (Node *) expr, (Node *) ge_info->expr,
+							  ge_info->btree_opfamily))
+		{
+			Assert(ge_info->sortgroupref > 0);
+
+			return ge_info->sortgroupref;
+		}
+	}
+
+	/* no match is found */
+	return 0;
+}
diff --git a/src/backend/utils/misc/guc_tables.c b/src/backend/utils/misc/guc_tables.c
index f137129209f..d3bfcaf0784 100644
--- a/src/backend/utils/misc/guc_tables.c
+++ b/src/backend/utils/misc/guc_tables.c
@@ -965,6 +965,16 @@ struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"enable_eager_aggregate", PGC_USERSET, QUERY_TUNING_METHOD,
+			gettext_noop("Enables eager aggregation."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&enable_eager_aggregate,
+		true,
+		NULL, NULL, NULL
+	},
 	{
 		{"enable_parallel_append", PGC_USERSET, QUERY_TUNING_METHOD,
 			gettext_noop("Enables the planner's use of parallel append plans."),
@@ -4050,6 +4060,17 @@ struct config_real ConfigureNamesReal[] =
 		NULL, NULL, NULL
 	},
 
+	{
+		{"min_eager_agg_group_size", PGC_USERSET, QUERY_TUNING_COST,
+			gettext_noop("Sets the minimum average group size required to consider applying eager aggregation."),
+			NULL,
+			GUC_EXPLAIN
+		},
+		&min_eager_agg_group_size,
+		8.0, 0.0, DBL_MAX,
+		NULL, NULL, NULL
+	},
+
 	{
 		{"cursor_tuple_fraction", PGC_USERSET, QUERY_TUNING_OTHER,
 			gettext_noop("Sets the planner's estimate of the fraction of "
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a9d8293474a..e3cdfe11992 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -428,6 +428,7 @@
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
 #enable_self_join_elimination = on
+#enable_eager_aggregate = on
 
 # - Planner Cost Constants -
 
@@ -441,6 +442,7 @@
 #min_parallel_table_scan_size = 8MB
 #min_parallel_index_scan_size = 512kB
 #effective_cache_size = 4GB
+#min_eager_agg_group_size = 8.0
 
 #jit_above_cost = 100000		# perform JIT compilation if available
 					# and query more expensive than this;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 4a903d1ec18..ad211207343 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -397,6 +397,15 @@ struct PlannerInfo
 	/* list of PlaceHolderInfos */
 	List	   *placeholder_list;
 
+	/* list of AggClauseInfos */
+	List	   *agg_clause_list;
+
+	/* list of GroupingExprInfos */
+	List	   *group_expr_list;
+
+	/* list of plain Vars contained in targetlist and havingQual */
+	List	   *tlist_vars;
+
 	/* array of PlaceHolderInfos indexed by phid */
 	struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
 	/* allocated size of array */
@@ -1046,6 +1055,14 @@ typedef struct RelOptInfo
 	/* consider partitionwise join paths? (if partitioned rel) */
 	bool		consider_partitionwise_join;
 
+	/*
+	 * used by eager aggregation:
+	 */
+	/* information needed to create grouped paths */
+	struct RelAggInfo *agg_info;
+	/* the partially-aggregated version of the relation */
+	struct RelOptInfo *grouped_rel;
+
 	/*
 	 * inheritance links, if this is an otherrel (otherwise NULL):
 	 */
@@ -1130,6 +1147,75 @@ typedef struct RelOptInfo
 	((nominal_jointype) == JOIN_INNER && (sjinfo)->jointype == JOIN_SEMI && \
 	 bms_equal((sjinfo)->syn_righthand, (rel)->relids))
 
+/*
+ * Is the given relation a grouped relation?
+ */
+#define IS_GROUPED_REL(rel) \
+	((rel)->agg_info != NULL)
+
+/*
+ * RelAggInfo
+ *		Information needed to create grouped paths for base and join rels.
+ *
+ * "relids" is the set of relation identifiers (RT indexes).
+ *
+ * "target" is the output tlist for the grouped paths.
+ *
+ * "agg_input" is the output tlist for the paths that provide input to the
+ * grouped paths.  One difference from the reltarget of the non-grouped
+ * relation is that agg_input has its sortgrouprefs[] initialized.
+ *
+ * "grouped_rows" is the estimated number of result tuples of the grouped
+ * relation.
+ *
+ * "group_clauses", "group_exprs" and "group_pathkeys" are lists of
+ * SortGroupClauses, the corresponding grouping expressions and PathKeys
+ * respectively.
+ *
+ * "apply_at" tracks the lowest join level at which partial aggregation is
+ * applied.
+ *
+ * "agg_useful" is a flag to indicate whether the grouped paths are considered
+ * useful.  It is set true if the average partial group size is no less than
+ * min_eager_agg_group_size, suggesting a significant row count reduction.
+ */
+typedef struct RelAggInfo
+{
+	pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* set of base + OJ relids (rangetable indexes) */
+	Relids		relids;
+
+	/*
+	 * default result targetlist for Paths scanning this grouped relation;
+	 * list of Vars/Exprs, cost, width
+	 */
+	struct PathTarget *target;
+
+	/*
+	 * the targetlist for Paths that provide input to the grouped paths
+	 */
+	struct PathTarget *agg_input;
+
+	/* estimated number of result tuples */
+	Cardinality grouped_rows;
+
+	/* a list of SortGroupClauses */
+	List	   *group_clauses;
+	/* a list of grouping expressions */
+	List	   *group_exprs;
+	/* a list of PathKeys */
+	List	   *group_pathkeys;
+
+	/* lowest level partial aggregation is applied at */
+	Relids		apply_at;
+
+	/* the grouped paths are considered useful? */
+	bool		agg_useful;
+} RelAggInfo;
+
 /*
  * IndexOptInfo
  *		Per-index information for planning/optimization
@@ -3283,6 +3369,50 @@ typedef struct MinMaxAggInfo
 	Param	   *param;
 } MinMaxAggInfo;
 
+/*
+ * For each distinct Aggref node that appears in the targetlist and HAVING
+ * clauses, we store an AggClauseInfo node in the PlannerInfo node's
+ * agg_clause_list.  Each AggClauseInfo records the set of relations referenced
+ * by the aggregate expression.  This information is used to determine how far
+ * the aggregate can be safely pushed down in the join tree.
+ */
+typedef struct AggClauseInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the Aggref expr */
+	Aggref	   *aggref;
+
+	/* lowest level we can evaluate this aggregate at */
+	Relids		agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * For each grouping expression that appears in grouping clauses, we store a
+ * GroupingExprInfo node in the PlannerInfo node's group_expr_list.  Each
+ * GroupingExprInfo records the expression being grouped on, its sortgroupref,
+ * and the btree opfamily used for equality comparison.  This information is
+ * necessary to reproduce correct grouping semantics at different levels of the
+ * join tree.
+ */
+typedef struct GroupingExprInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the represented expression */
+	Expr	   *expr;
+
+	/* the tleSortGroupRef of the corresponding SortGroupClause */
+	Index		sortgroupref;
+
+	/* btree opfamily defining the ordering */
+	Oid			btree_opfamily;
+} GroupingExprInfo;
+
 /*
  * At runtime, PARAM_EXEC slots are used to pass values around from one plan
  * node to another.  They can be used to pass values down into subqueries (for
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 763cd25bb3c..5b9c1daf14b 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -312,6 +312,10 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
 									RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
+											RelOptInfo *rel_plain);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
+									 RelOptInfo *rel_plain);
 extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
@@ -351,4 +355,5 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										SpecialJoinInfo *sjinfo,
 										int nappinfos, AppendRelInfo **appinfos);
 
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
 #endif							/* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index cbade77b717..8d03d662a04 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,7 +21,9 @@
  * allpaths.c
  */
 extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
 extern PGDLLIMPORT int geqo_threshold;
+extern PGDLLIMPORT double min_eager_agg_group_size;
 extern PGDLLIMPORT int min_parallel_table_scan_size;
 extern PGDLLIMPORT int min_parallel_index_scan_size;
 extern PGDLLIMPORT bool enable_group_by_reordering;
@@ -57,6 +59,10 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 								  bool override_rows);
 extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 										 bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root,
+								   RelOptInfo *rel_grouped,
+								   RelOptInfo *rel_plain,
+								   RelAggInfo *agg_info);
 extern int	compute_parallel_worker(RelOptInfo *rel, double heap_pages,
 									double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 9d3debcab28..09b48b26f8f 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -76,6 +76,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
 									Relids where_needed);
 extern void remove_useless_groupby_columns(PlannerInfo *root);
+extern void setup_eager_aggregation(PlannerInfo *root);
 extern void find_lateral_references(PlannerInfo *root);
 extern void rebuild_lateral_attr_needed(PlannerInfo *root);
 extern void create_lateral_join_info(PlannerInfo *root);
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 69805d4b9ec..ef79d6f1ded 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -2437,11 +2437,11 @@ SELECT c collate "C", count(c) FROM pagg_tab3 GROUP BY c collate "C" ORDER BY 1;
 SET enable_partitionwise_join TO false;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2449,10 +2449,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
@@ -2464,11 +2466,11 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
 SET enable_partitionwise_join TO true;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2476,10 +2478,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 00000000000..f02ff0b30a3
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1334 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b
+                                 Sort Key: t2.b
+                                 ->  Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Hash Join
+                                 Output: t2.c, t2.b, t3.c
+                                 Hash Cond: (t3.a = t2.a)
+                                 ->  Seq Scan on public.eager_agg_t3 t3
+                                       Output: t3.a, t3.b, t3.c
+                                 ->  Hash
+                                       Output: t2.c, t2.b, t2.a
+                                       ->  Seq Scan on public.eager_agg_t2 t2
+                                             Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b, t3.c
+                                 Sort Key: t2.b
+                                 ->  Hash Join
+                                       Output: t2.c, t2.b, t3.c
+                                       Hash Cond: (t3.a = t2.a)
+                                       ->  Seq Scan on public.eager_agg_t3 t3
+                                             Output: t3.a, t3.b, t3.c
+                                       ->  Hash
+                                             Output: t2.c, t2.b, t2.a
+                                             ->  Seq Scan on public.eager_agg_t2 t2
+                                                   Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Right Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Sort
+   Output: t2.b, (avg(t2.c))
+   Sort Key: t2.b
+   ->  HashAggregate
+         Output: t2.b, avg(t2.c)
+         Group Key: t2.b
+         ->  Hash Right Join
+               Output: t2.b, t2.c
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on public.eager_agg_t2 t2
+                     Output: t2.a, t2.b, t2.c
+               ->  Hash
+                     Output: t1.b
+                     ->  Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ b | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   |    
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Gather Merge
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Workers Planned: 2
+         ->  Sort
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Sort Key: t1.a
+               ->  Parallel Hash Join
+                     Output: t1.a, (PARTIAL avg(t2.c))
+                     Hash Cond: (t1.b = t2.b)
+                     ->  Parallel Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.a, t1.b, t1.c
+                     ->  Parallel Hash
+                           Output: t2.b, (PARTIAL avg(t2.c))
+                           ->  Partial HashAggregate
+                                 Output: t2.b, PARTIAL avg(t2.c)
+                                 Group Key: t2.b
+                                 ->  Parallel Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t2.y, (sum(t1.y)), (count(*))
+   Sort Key: t2.y
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t2.y, sum(t1.y), count(*)
+               Group Key: t2.y
+               ->  Hash Join
+                     Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.y, t1.x
+         ->  Finalize HashAggregate
+               Output: t2_1.y, sum(t1_1.y), count(*)
+               Group Key: t2_1.y
+               ->  Hash Join
+                     Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.y, t1_1.x
+         ->  Finalize HashAggregate
+               Output: t2_2.y, sum(t1_2.y), count(*)
+               Group Key: t2_2.y
+               ->  Hash Join
+                     Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+                                                 QUERY PLAN                                                 
+------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t2.x, (sum(t1.x)), (count(*))
+   Sort Key: t2.x
+   ->  Finalize HashAggregate
+         Output: t2.x, sum(t1.x), count(*)
+         Group Key: t2.x
+         Filter: (avg(t1.x) > '5'::numeric)
+         ->  Append
+               ->  Hash Join
+                     Output: t2.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.x, t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.x), PARTIAL count(*), PARTIAL avg(t1.x)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x
+               ->  Hash Join
+                     Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.x, t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x
+               ->  Hash Join
+                     Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.x, t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+ x |  sum  | count 
+---+-------+-------
+ 0 | 33835 |  6667
+ 1 | 39502 |  6667
+ 2 | 46169 |  6667
+ 3 | 52836 |  6667
+ 4 | 59503 |  6667
+ 5 | 33500 |  6667
+ 6 | 39837 |  6667
+ 7 | 46504 |  6667
+ 8 | 53171 |  6667
+ 9 | 59838 |  6667
+(10 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y)))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y))
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y))
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y))
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   
+----+---------
+  0 | 1437480
+  1 | 2082896
+  2 | 2684422
+  3 | 3285948
+  4 | 3887474
+  5 | 1526260
+  6 | 2127786
+  7 | 2729312
+  8 | 3330838
+  9 | 3932364
+ 10 | 1481370
+ 11 | 2012472
+ 12 | 2587464
+ 13 | 3162456
+ 14 | 3737448
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t3.y, sum((t2.y + t3.y))
+   Group Key: t3.y
+   ->  Sort
+         Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+         Sort Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t2.x = t1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y))
+                           Group Key: t2.x, t3.y, t3.x
+                           ->  Incremental Sort
+                                 Output: t2.y, t2.x, t3.y, t3.x
+                                 Sort Key: t2.x, t3.y
+                                 Presorted Key: t2.x
+                                 ->  Merge Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Merge Cond: (t2.x = t3.x)
+                                       ->  Sort
+                                             Output: t2.y, t2.x
+                                             Sort Key: t2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                                   Output: t2.y, t2.x
+                                       ->  Sort
+                                             Output: t3.y, t3.x
+                                             Sort Key: t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+                     ->  Hash
+                           Output: t1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                 Output: t1.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t2_1.x = t1_1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                           Group Key: t2_1.x, t3_1.y, t3_1.x
+                           ->  Incremental Sort
+                                 Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                 Sort Key: t2_1.x, t3_1.y
+                                 Presorted Key: t2_1.x
+                                 ->  Merge Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Merge Cond: (t2_1.x = t3_1.x)
+                                       ->  Sort
+                                             Output: t2_1.y, t2_1.x
+                                             Sort Key: t2_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                                   Output: t2_1.y, t2_1.x
+                                       ->  Sort
+                                             Output: t3_1.y, t3_1.x
+                                             Sort Key: t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+                     ->  Hash
+                           Output: t1_1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                 Output: t1_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t2_2.x = t1_2.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                           Group Key: t2_2.x, t3_2.y, t3_2.x
+                           ->  Incremental Sort
+                                 Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                 Sort Key: t2_2.x, t3_2.y
+                                 Presorted Key: t2_2.x
+                                 ->  Merge Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Merge Cond: (t2_2.x = t3_2.x)
+                                       ->  Sort
+                                             Output: t2_2.y, t2_2.x
+                                             Sort Key: t2_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                                   Output: t2_2.y, t2_2.x
+                                       ->  Sort
+                                             Output: t3_2.y, t3_2.x
+                                             Sort Key: t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+                     ->  Hash
+                           Output: t1_2.x
+                           ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                 Output: t1_2.x
+(88 rows)
+
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y |   sum   
+---+---------
+ 0 | 1111110
+ 1 | 2000132
+ 2 | 2889154
+ 3 | 3778176
+ 4 | 4667198
+ 5 | 3334000
+ 6 | 4223022
+ 7 | 5112044
+ 8 | 6001066
+ 9 | 6890088
+(10 rows)
+
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.y, (sum(t2.y)), (count(*))
+   Sort Key: t1.y
+   ->  Finalize HashAggregate
+         Output: t1.y, sum(t2.y), count(*)
+         Group Key: t1.y
+         ->  Append
+               ->  Hash Join
+                     Output: t1.y, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.y, t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+               ->  Hash Join
+                     Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.y, t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+               ->  Hash Join
+                     Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.y, t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+               ->  Hash Join
+                     Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.y, t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+               ->  Hash Join
+                     Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.y, t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                                QUERY PLAN                                                
+----------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y)), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t3.y
+   ->  Finalize HashAggregate
+         Output: t3.y, sum((t2.y + t3.y)), count(*)
+         Group Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.y, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x, t3.y, t3.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.y, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x, t3_1.y, t3_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.y, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x, t3_2.y, t3_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.y, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x, t3_3.y, t3_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+               ->  Hash Join
+                     Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.y, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.y, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x, t3_4.y, t3_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 04079268b98..d0bb66f43da 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -2837,20 +2837,22 @@ select x.thousand, x.twothousand, count(*)
 from tenk1 x inner join tenk1 y on x.thousand = y.thousand
 group by x.thousand, x.twothousand
 order by x.thousand desc, x.twothousand;
-                                    QUERY PLAN                                    
-----------------------------------------------------------------------------------
- GroupAggregate
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Finalize GroupAggregate
    Group Key: x.thousand, x.twothousand
    ->  Incremental Sort
          Sort Key: x.thousand DESC, x.twothousand
          Presorted Key: x.thousand
          ->  Merge Join
                Merge Cond: (y.thousand = x.thousand)
-               ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
+               ->  Partial GroupAggregate
+                     Group Key: y.thousand
+                     ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
                ->  Sort
                      Sort Key: x.thousand DESC
                      ->  Seq Scan on tenk1 x
-(11 rows)
+(13 rows)
 
 reset enable_hashagg;
 reset enable_nestloop;
diff --git a/src/test/regress/expected/partition_aggregate.out b/src/test/regress/expected/partition_aggregate.out
index 5f2c0cf5786..1f56f55155b 100644
--- a/src/test/regress/expected/partition_aggregate.out
+++ b/src/test/regress/expected/partition_aggregate.out
@@ -13,6 +13,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation.
+SET enable_eager_aggregate TO off;
 --
 -- Tests for list partitioned tables.
 --
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 83228cfca29..3b37fafa65b 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -151,6 +151,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_async_append            | on
  enable_bitmapscan              | on
  enable_distinct_reordering     | on
+ enable_eager_aggregate         | on
  enable_gathermerge             | on
  enable_group_by_reordering     | on
  enable_hashagg                 | on
@@ -172,7 +173,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(24 rows)
+(25 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fbffc67ae60..f9450cdc477 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -123,7 +123,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # The stats test resets stats, so nothing else needing stats access can be in
 # this group.
 # ----------
-test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa eager_aggregate
 
 # event_trigger depends on create_am and cannot run concurrently with
 # any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 00000000000..5da8749a6cb
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,194 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/sql/partition_aggregate.sql b/src/test/regress/sql/partition_aggregate.sql
index ab070fee244..124cc260461 100644
--- a/src/test/regress/sql/partition_aggregate.sql
+++ b/src/test/regress/sql/partition_aggregate.sql
@@ -14,6 +14,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation.
+SET enable_eager_aggregate TO off;
 
 --
 -- Tests for list partitioned tables.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a13e8162890..9a4567db01a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -42,6 +42,7 @@ AfterTriggersTableData
 AfterTriggersTransData
 Agg
 AggClauseCosts
+AggClauseInfo
 AggInfo
 AggPath
 AggSplit
@@ -1110,6 +1111,7 @@ GroupPathExtraData
 GroupResultPath
 GroupState
 GroupVarInfo
+GroupingExprInfo
 GroupingFunc
 GroupingSet
 GroupingSetData
@@ -2473,6 +2475,7 @@ ReindexObjectType
 ReindexParams
 ReindexStmt
 ReindexType
+RelAggInfo
 RelFileLocator
 RelFileLocatorBackend
 RelFileNumber
-- 
2.39.5 (Apple Git-154)




* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 13:44                                   ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
  2025-08-09 01:32                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-01 01:32                                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-09-05 07:35                                         ` Richard Guo <[email protected]>
  2025-09-05 14:37                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Richard Guo @ 2025-09-05 07:35 UTC (permalink / raw)
  To: Matheus Alcantara <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Mon, Sep 1, 2025 at 10:32 AM Richard Guo <[email protected]> wrote:
> This patch needs a rebase; here it is.  No changes were made.

Here is a rebase after the GUC tables change.

- Richard


Attachments:

  [application/octet-stream] v21-0001-Implement-Eager-Aggregation.patch (172.1K, 2-v21-0001-Implement-Eager-Aggregation.patch)
From 3f839b71eb76f9e662f0768ad2aff600d500748f Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Tue, 11 Jun 2024 15:59:19 +0900
Subject: [PATCH v21] Implement Eager Aggregation

Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.

In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations.  To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node.  During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied.  If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
is currently not supported.

To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

The patch was originally proposed by Antonin Houska in 2017.  This
commit reworks various important aspects and rewrites most of the
current code.  However, the original patch and reviews were very
useful.

Author: Richard Guo, Antonin Houska
Reviewed-by: Robert Haas, Jian He, Tender Wang, Paul George, Tom Lane
Reviewed-by: Tomas Vondra, Andy Fan, Ashutosh Bapat
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
---
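To illustrate the transformation described in the commit message, here
is a hand-written SQL sketch for a decomposable aggregate, sum().  It
only imitates what the planner would do internally with partial
aggregate states.  The tables demo_r1 and demo_r2 are hypothetical and
are not part of the patch.

```sql
-- Hypothetical demo tables; they are not part of the patch.
CREATE TEMP TABLE demo_r1 (b int, c int);   -- supplies the aggregated column c
CREATE TEMP TABLE demo_r2 (a int, b int);   -- supplies the grouping key a
INSERT INTO demo_r1 SELECT i % 100, i FROM generate_series(1, 10000) i;
INSERT INTO demo_r2 SELECT i % 10, i % 100 FROM generate_series(1, 1000) i;

-- Regular form: join first, aggregate afterwards.
SELECT r2.a, sum(r1.c)
FROM demo_r1 r1 JOIN demo_r2 r2 ON r1.b = r2.b
GROUP BY r2.a ORDER BY r2.a;

-- Hand-written analogue of eager aggregation: pre-aggregate demo_r1 on its
-- join column b, join the reduced result, then sum the partial sums.
SELECT r2.a, sum(p.partial_sum)
FROM (SELECT b, sum(c) AS partial_sum FROM demo_r1 GROUP BY b) p
     JOIN demo_r2 r2 ON p.b = r2.b
GROUP BY r2.a ORDER BY r2.a;
```

Both queries should return the same groups and totals, although the
second reports the sum as numeric rather than bigint because it sums
bigint partial results.  Grouping the subquery by b matters: b is the
only demo_r1 column used in the join clause, so every row folded into
one partial group matches exactly the same demo_r2 rows.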
 .../postgres_fdw/expected/postgres_fdw.out    |   49 +-
 doc/src/sgml/config.sgml                      |   31 +
 src/backend/optimizer/README                  |   89 ++
 src/backend/optimizer/geqo/geqo_eval.c        |   21 +
 src/backend/optimizer/path/allpaths.c         |  453 ++++++
 src/backend/optimizer/path/joinrels.c         |  193 +++
 src/backend/optimizer/plan/initsplan.c        |  322 ++++
 src/backend/optimizer/plan/planmain.c         |    9 +
 src/backend/optimizer/plan/planner.c          |  124 +-
 src/backend/optimizer/util/appendinfo.c       |   59 +
 src/backend/optimizer/util/relnode.c          |  628 ++++++++
 src/backend/utils/misc/guc_parameters.dat     |   16 +
 src/backend/utils/misc/postgresql.conf.sample |    2 +
 src/include/nodes/pathnodes.h                 |  130 ++
 src/include/optimizer/pathnode.h              |    5 +
 src/include/optimizer/paths.h                 |    6 +
 src/include/optimizer/planmain.h              |    1 +
 .../regress/expected/collate.icu.utf8.out     |   32 +-
 src/test/regress/expected/eager_aggregate.out | 1334 +++++++++++++++++
 src/test/regress/expected/join.out            |   12 +-
 .../regress/expected/partition_aggregate.out  |    2 +
 src/test/regress/expected/sysviews.out        |    3 +-
 src/test/regress/parallel_schedule            |    2 +-
 src/test/regress/sql/eager_aggregate.sql      |  194 +++
 src/test/regress/sql/partition_aggregate.sql  |    2 +
 src/tools/pgindent/typedefs.list              |    3 +
 26 files changed, 3648 insertions(+), 74 deletions(-)
 create mode 100644 src/test/regress/expected/eager_aggregate.out
 create mode 100644 src/test/regress/sql/eager_aggregate.sql
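
With the patch applied, one way to see the effect of the two new GUCs
is to compare EXPLAIN output with enable_eager_aggregate on and off.
This is a minimal sketch: the guc_demo tables are hypothetical, and
whether the planner actually picks a partially aggregated plan depends
on its cost estimates and on min_eager_agg_group_size, so no
particular plan shape is promised.

```sql
-- Hypothetical demo tables; they are not part of the patch.
CREATE TEMP TABLE guc_demo_t1 (a int, b int);
CREATE TEMP TABLE guc_demo_t2 (b int, c int);
INSERT INTO guc_demo_t1 SELECT i % 10, i % 100 FROM generate_series(1, 1000) i;
INSERT INTO guc_demo_t2 SELECT i % 100, i FROM generate_series(1, 10000) i;
ANALYZE guc_demo_t1;
ANALYZE guc_demo_t2;

-- Show the average-group-size threshold currently in effect.
SHOW min_eager_agg_group_size;

-- Plan with eager aggregation allowed.
SET enable_eager_aggregate TO on;
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.a, avg(t2.c) FROM guc_demo_t1 t1 JOIN guc_demo_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;

-- Plan for the same query with eager aggregation disabled.
SET enable_eager_aggregate TO off;
EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.a, avg(t2.c) FROM guc_demo_t1 t1 JOIN guc_demo_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;

RESET enable_eager_aggregate;
```

When eager aggregation is chosen, the plan is expected to contain a
Partial aggregate node below the join and a Finalize aggregate node
above it, as in the updated postgres_fdw expected output in this patch.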

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 78b8367d289..b6c892bdb51 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3701,30 +3701,33 @@ select count(t1.c3) from ft2 t1 left join ft2 t2 on (t1.c1 = random() * t2.c2);
 -- Subquery in FROM clause having aggregate
 explain (verbose, costs off)
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
-                                          QUERY PLAN                                           
------------------------------------------------------------------------------------------------
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
  Sort
-   Output: (count(*)), x.b
-   Sort Key: (count(*)), x.b
-   ->  HashAggregate
-         Output: count(*), x.b
-         Group Key: x.b
-         ->  Hash Join
-               Output: x.b
-               Inner Unique: true
-               Hash Cond: (ft1.c2 = x.a)
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.c2
-                     Remote SQL: SELECT c2 FROM "S 1"."T 1"
-               ->  Hash
-                     Output: x.b, x.a
-                     ->  Subquery Scan on x
-                           Output: x.b, x.a
-                           ->  Foreign Scan
-                                 Output: ft1_1.c2, (sum(ft1_1.c1))
-                                 Relations: Aggregate on (public.ft1 ft1_1)
-                                 Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
-(21 rows)
+   Output: (count(*)), (sum(ft1_1.c1))
+   Sort Key: (count(*)), (sum(ft1_1.c1))
+   ->  Finalize GroupAggregate
+         Output: count(*), (sum(ft1_1.c1))
+         Group Key: (sum(ft1_1.c1))
+         ->  Sort
+               Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+               Sort Key: (sum(ft1_1.c1))
+               ->  Hash Join
+                     Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+                     Hash Cond: (ft1_1.c2 = ft1.c2)
+                     ->  Foreign Scan
+                           Output: ft1_1.c2, (sum(ft1_1.c1))
+                           Relations: Aggregate on (public.ft1 ft1_1)
+                           Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
+                     ->  Hash
+                           Output: ft1.c2, (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: ft1.c2, PARTIAL count(*)
+                                 Group Key: ft1.c2
+                                 ->  Foreign Scan on public.ft1
+                                       Output: ft1.c2
+                                       Remote SQL: SELECT c2 FROM "S 1"."T 1"
+(24 rows)
 
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
  count |   b   
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 0a4b3e55ba5..aab91625daf 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5475,6 +5475,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-eager-aggregate" xreflabel="enable_eager_aggregate">
+      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_eager_aggregate</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's ability to partially push
+        aggregation past a join, and finalize it once all the relations are
+        joined. The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
       <term><varname>enable_gathermerge</varname> (<type>boolean</type>)
       <indexterm>
@@ -6095,6 +6110,22 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-min-eager-agg-group-size" xreflabel="min_eager_agg_group_size">
+      <term><varname>min_eager_agg_group_size</varname> (<type>floating point</type>)
+      <indexterm>
+       <primary><varname>min_eager_agg_group_size</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Sets the minimum average group size required to consider applying
+        eager aggregation. This helps avoid the overhead of eager
+        aggregation when it does not offer significant row count reduction.
+        The default is <literal>8</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-jit-above-cost" xreflabel="jit_above_cost">
       <term><varname>jit_above_cost</varname> (<type>floating point</type>)
       <indexterm>
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 843368096fd..5af3ced5750 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1500,3 +1500,92 @@ breaking down aggregation or grouping over a partitioned relation into
 aggregation or grouping over its partitions is called partitionwise
 aggregation.  Especially when the partition keys match the GROUP BY clause,
 this can be significantly faster than the regular method.
+
+Eager aggregation
+-----------------
+
+Eager aggregation is a query optimization technique that partially
+pushes aggregation past a join, and finalizes it once all the
+relations are joined.  Eager aggregation may reduce the number of
+input rows to the join and thus could result in a better overall plan.
+
+To prove that the transformation is correct, we partition the tables
+in the FROM clause into two groups: those that contain at least one
+aggregation column, and those that do not contain any aggregation
+columns.  Each group can be treated as a single relation formed by the
+Cartesian product of the tables within that group.  Therefore, without
+loss of generality, we can assume that the FROM clause contains
+exactly two relations, R1 and R2, where R1 represents the relation
+containing all aggregation columns, and R2 represents the relation
+without any aggregation columns.
+
+Let the query be of the form:
+
+SELECT G, AGG(A)
+FROM R1 JOIN R2 ON J
+GROUP BY G;
+
+where G is the set of grouping keys that may include columns from R1
+and/or R2; AGG(A) is an aggregate function over columns A from R1; J
+is the join condition between R1 and R2.
+
+The transformation of eager aggregation is:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 ON J)
+
+This equivalence holds under the following conditions:
+
+1) AGG is decomposable, meaning that it can be computed in two stages:
+a partial aggregation followed by a final aggregation;
+2) The set G1 used in the pre-aggregation of R1 includes:
+    * all columns from R1 that are part of the grouping keys G, and
+    * all columns from R1 that appear in the join condition J.
+3) The grouping operator for any column in G1 must be compatible with
+the operator used for that column in the join condition J.
+
+Since G1 includes all columns from R1 that appear in either the
+grouping keys G or the join condition J, all rows within each partial
+group have identical values for both the grouping keys and the
+join-relevant columns from R1, assuming compatible operators are used.
+As a result, the rows within a partial group are indistinguishable in
+terms of their contribution to the aggregation and their behavior in
+the join.  This ensures that all rows in the same partial group share
+the same "destiny": they either all match or all fail to match a given
+row in R2.  Because the aggregate function AGG is decomposable,
+aggregating the partial results after the join yields the same final
+result as aggregating after the full join, thereby preserving query
+semantics.  Q.E.D.
+
+One restriction is that we cannot push partial aggregation down to a
+relation that is in the nullable side of an outer join, because the
+NULL-extended rows produced by the outer join would not be available
+when we perform the partial aggregation, while with a
+non-eager-aggregation plan these rows are available for the top-level
+aggregation.  Pushing partial aggregation in this case may result in
+the rows being grouped differently than expected, or produce incorrect
+values from the aggregate functions.
+
+During the construction of the join tree, we evaluate each base or
+join relation to determine if eager aggregation can be applied.  If
+feasible, we create a separate RelOptInfo called a "grouped relation"
+and generate grouped paths by adding sorted and hashed partial
+aggregation paths on top of the non-grouped paths.  To limit planning
+time, we consider only the cheapest or suitably-sorted non-grouped
+paths in this step.
+
+Another way to generate grouped paths is to join a grouped relation
+with a non-grouped relation.  Joining two grouped relations is
+currently not supported.
+
+To further limit planning time, we currently adopt a strategy where
+partial aggregation is pushed only to the lowest feasible level in the
+join tree where it provides a significant reduction in row count.
+This strategy also helps ensure that all grouped paths for the same
+grouped relation produce the same set of rows, which is important to
+support a fundamental assumption of the planner.
+
+If we have generated a grouped relation for the topmost join relation,
+we need to finalize its paths at the end.  The final paths will
+compete in the usual way with paths built from regular planning.
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index f07d1dc8ac6..4a65f955ca6 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -279,6 +279,27 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
 				/* Find and save the cheapest paths for this joinrel */
 				set_cheapest(joinrel);
 
+				/*
+				 * Except for the topmost scan/join rel, consider generating
+				 * partial aggregation paths for the grouped relation on top
+				 * of the paths of this rel.  After that, we're done creating
+				 * paths for the grouped relation, so run set_cheapest().
+				 */
+				if (!bms_equal(joinrel->relids, root->all_query_rels))
+				{
+					RelOptInfo *grouped_rel;
+
+					grouped_rel = joinrel->grouped_rel;
+					if (grouped_rel)
+					{
+						Assert(IS_GROUPED_REL(grouped_rel));
+
+						generate_grouped_paths(root, grouped_rel, joinrel,
+											   grouped_rel->agg_info);
+						set_cheapest(grouped_rel);
+					}
+				}
+
 				/* Absorb new clump into old */
 				old_clump->joinrel = joinrel;
 				old_clump->size += new_clump->size;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 6cc6966b060..7b349a4570e 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -47,6 +48,7 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -77,7 +79,9 @@ typedef enum pushdown_safe_type
 
 /* These parameters are set by GUC */
 bool		enable_geqo = false;	/* just in case GUC doesn't set it */
+bool		enable_eager_aggregate = true;
 int			geqo_threshold;
+double		min_eager_agg_group_size;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
 
@@ -90,6 +94,7 @@ join_search_hook_type join_search_hook = NULL;
 
 static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_base_grouped_rels(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 						 Index rti, RangeTblEntry *rte);
@@ -114,6 +119,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 								Index rti, RangeTblEntry *rte);
 static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 									Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
 static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 										 List *live_childrels,
 										 List *all_child_pathkeys);
@@ -182,6 +188,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	 */
 	set_base_rel_sizes(root);
 
+	/*
+	 * Build grouped relations for base rels where possible.
+	 */
+	setup_base_grouped_rels(root);
+
 	/*
 	 * We should now have size estimates for every actual table involved in
 	 * the query, and we also know which if any have been deleted from the
@@ -323,6 +334,39 @@ set_base_rel_sizes(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_base_grouped_rels
+ *	  For each base relation, build a grouped base relation if eager
+ *	  aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_base_grouped_rels(PlannerInfo *root)
+{
+	Index		rti;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo *rel = root->simple_rel_array[rti];
+
+		/* there may be empty slots corresponding to non-baserel RTEs */
+		if (rel == NULL)
+			continue;
+
+		Assert(rel->relid == rti);	/* sanity check on array */
+		Assert(IS_SIMPLE_REL(rel)); /* sanity check on rel */
+
+		(void) build_simple_grouped_rel(root, rel);
+	}
+}
+
 /*
  * set_base_rel_pathlists
  *	  Finds all paths available for scanning each base-relation entry.
@@ -559,6 +603,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* Now find the cheapest of the paths for this rel */
 	set_cheapest(rel);
 
+	/*
+	 * If a grouped relation for this rel exists, build partial aggregation
+	 * paths for it.
+	 *
+	 * Note that this can only happen after we've called set_cheapest() for
+	 * this base rel, because we need its cheapest paths.
+	 */
+	set_grouped_rel_pathlist(root, rel);
+
 #ifdef OPTIMIZER_DEBUG
 	pprint(rel);
 #endif
@@ -1305,6 +1358,36 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	add_paths_to_append_rel(root, rel, live_childrels);
 }
 
+/*
+ * set_grouped_rel_pathlist
+ *	  If a grouped relation for the given 'rel' exists, build partial
+ *	  aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Add paths to the grouped base relation if one exists. */
+	grouped_rel = rel->grouped_rel;
+	if (grouped_rel)
+	{
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		generate_grouped_paths(root, grouped_rel, rel,
+							   grouped_rel->agg_info);
+		set_cheapest(grouped_rel);
+	}
+}
+
 
 /*
  * add_paths_to_append_rel
@@ -3335,6 +3418,328 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
 	}
 }
 
+/*
+ * generate_grouped_paths
+ *		Generate paths for a grouped relation by adding sorted and hashed
+ *		partial aggregation paths on top of paths of the ungrouped base or join
+ *		relation.
+ *
+ * The information needed is provided by the RelAggInfo structure.
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *grouped_rel,
+					   RelOptInfo *rel, RelAggInfo *agg_info)
+{
+	AggClauseCosts agg_costs;
+	bool		can_hash;
+	bool		can_sort;
+	Path	   *cheapest_total_path = NULL;
+	Path	   *cheapest_partial_path = NULL;
+	double		dNumGroups = 0;
+	double		dNumPartialGroups = 0;
+
+	if (IS_DUMMY_REL(rel))
+	{
+		mark_dummy_rel(grouped_rel);
+		return;
+	}
+
+	/*
+	 * We push partial aggregation only to the lowest possible level in the
+	 * join tree that is deemed useful.
+	 */
+	if (!bms_equal(agg_info->apply_at, rel->relids) ||
+		!agg_info->agg_useful)
+		return;
+
+	MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+	get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+	/*
+	 * Determine whether it's possible to perform sort-based implementations
+	 * of grouping.
+	 */
+	can_sort = grouping_is_sortable(agg_info->group_clauses);
+
+	/*
+	 * Determine whether we should consider hash-based implementations of
+	 * grouping.
+	 */
+	Assert(root->numOrderedAggs == 0);
+	can_hash = (agg_info->group_clauses != NIL &&
+				grouping_is_hashable(agg_info->group_clauses));
+
+	/*
+	 * Consider whether we should generate partially aggregated non-partial
+	 * paths.  We can only do this if we have a non-partial path.
+	 */
+	if (rel->pathlist != NIL)
+	{
+		cheapest_total_path = rel->cheapest_total_path;
+		Assert(cheapest_total_path != NULL);
+	}
+
+	/*
+	 * If parallelism is possible for grouped_rel, then we should consider
+	 * generating partially-grouped partial paths.  However, if the ungrouped
+	 * rel has no partial paths, then we can't.
+	 */
+	if (grouped_rel->consider_parallel && rel->partial_pathlist != NIL)
+	{
+		cheapest_partial_path = linitial(rel->partial_pathlist);
+		Assert(cheapest_partial_path != NULL);
+	}
+
+	/* Estimate number of partial groups. */
+	if (cheapest_total_path != NULL)
+		dNumGroups = estimate_num_groups(root,
+										 agg_info->group_exprs,
+										 cheapest_total_path->rows,
+										 NULL, NULL);
+	if (cheapest_partial_path != NULL)
+		dNumPartialGroups = estimate_num_groups(root,
+												agg_info->group_exprs,
+												cheapest_partial_path->rows,
+												NULL, NULL);
+
+	if (can_sort && cheapest_total_path != NULL)
+	{
+		ListCell   *lc;
+
+		/*
+		 * Use any available suitably-sorted path as input, and also consider
+		 * sorting the cheapest-total path and incremental sort on any paths
+		 * with presorted keys.
+		 *
+		 * To save planning time, we ignore parameterized input paths unless
+		 * they are the cheapest-total path.
+		 */
+		foreach(lc, rel->pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Ignore parameterized paths that are not the cheapest-total
+			 * path.
+			 */
+			if (input_path->param_info &&
+				input_path != cheapest_total_path)
+				continue;
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest total path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_total_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumGroups);
+
+			add_path(grouped_rel, path);
+		}
+	}
+
+	if (can_sort && cheapest_partial_path != NULL)
+	{
+		ListCell   *lc;
+
+		/* Similar to above logic, but for partial paths. */
+		foreach(lc, rel->partial_pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest partial path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_partial_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumPartialGroups);
+
+			add_partial_path(grouped_rel, path);
+		}
+	}
+
+	/*
+	 * Add a partially-grouped HashAgg Path where possible
+	 */
+	if (can_hash && cheapest_total_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_total_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumGroups);
+
+		add_path(grouped_rel, path);
+	}
+
+	/*
+	 * Now add a partially-grouped HashAgg partial Path where possible
+	 */
+	if (can_hash && cheapest_partial_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_partial_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumPartialGroups);
+
+		add_partial_path(grouped_rel, path);
+	}
+}
+
 /*
  * make_rel_from_joinlist
  *	  Build access paths using a "joinlist" to guide the join path search.
@@ -3494,6 +3899,10 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 		 *
 		 * After that, we're done creating paths for the joinrel, so run
 		 * set_cheapest().
+		 *
+		 * In addition, we also run generate_grouped_paths() for the grouped
+		 * relation of each just-processed joinrel, and run set_cheapest() for
+		 * the grouped relation afterwards.
 		 */
 		foreach(lc, root->join_rel_level[lev])
 		{
@@ -3514,6 +3923,27 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 			/* Find and save the cheapest paths for this rel */
 			set_cheapest(rel);
 
+			/*
+			 * Except for the topmost scan/join rel, consider generating
+			 * partial aggregation paths for the grouped relation on top of
+			 * the paths of this rel.  After that, we're done creating paths
+			 * for the grouped relation, so run set_cheapest().
+			 */
+			if (!bms_equal(rel->relids, root->all_query_rels))
+			{
+				RelOptInfo *grouped_rel;
+
+				grouped_rel = rel->grouped_rel;
+				if (grouped_rel)
+				{
+					Assert(IS_GROUPED_REL(grouped_rel));
+
+					generate_grouped_paths(root, grouped_rel, rel,
+										   grouped_rel->agg_info);
+					set_cheapest(grouped_rel);
+				}
+			}
+
 #ifdef OPTIMIZER_DEBUG
 			pprint(rel);
 #endif
@@ -4383,6 +4813,29 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 		if (IS_DUMMY_REL(child_rel))
 			continue;
 
+		/*
+		 * Except for the topmost scan/join rel, consider generating partial
+		 * aggregation paths for the grouped relation on top of the paths of
+		 * this partitioned child-join.  After that, we're done creating paths
+		 * for the grouped relation, so run set_cheapest().
+		 */
+		if (!bms_equal(IS_OTHER_REL(rel) ?
+					   rel->top_parent_relids : rel->relids,
+					   root->all_query_rels))
+		{
+			RelOptInfo *grouped_rel;
+
+			grouped_rel = child_rel->grouped_rel;
+			if (grouped_rel)
+			{
+				Assert(IS_GROUPED_REL(grouped_rel));
+
+				generate_grouped_paths(root, grouped_rel, child_rel,
+									   grouped_rel->agg_info);
+				set_cheapest(grouped_rel);
+			}
+		}
+
 #ifdef OPTIMIZER_DEBUG
 		pprint(child_rel);
 #endif
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 535248aa525..04cbbcea2a4 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -16,6 +16,7 @@
 
 #include "miscadmin.h"
 #include "optimizer/appendinfo.h"
+#include "optimizer/cost.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -36,6 +37,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
 static bool restriction_is_constant_false(List *restrictlist,
 										  RelOptInfo *joinrel,
 										  bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+								  RelOptInfo *rel2, RelOptInfo *joinrel,
+								  SpecialJoinInfo *sjinfo, List *restrictlist);
 static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 										RelOptInfo *rel2, RelOptInfo *joinrel,
 										SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -762,6 +766,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 		return joinrel;
 	}
 
+	/* Build a grouped join relation for 'joinrel' if possible. */
+	make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+						  restrictlist);
+
 	/* Add paths to the join relation. */
 	populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
 								restrictlist);
@@ -873,6 +881,186 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
 	return input_relids;
 }
 
+/*
+ * make_grouped_join_rel
+ *	  Build a grouped join relation for the given "joinrel" if eager
+ *	  aggregation is applicable and the resulting grouped paths are considered
+ *	  useful.
+ *
+ * There are two strategies for generating grouped paths for a join relation:
+ *
+ * 1. Join a grouped (partially aggregated) input relation with a non-grouped
+ * input (e.g., AGG(B) JOIN A).
+ *
+ * 2. Apply partial aggregation (sorted or hashed) on top of existing
+ * non-grouped join paths (e.g., AGG(A JOIN B)).
+ *
+ * To limit planning effort and avoid an explosion of alternatives, we adopt a
+ * strategy where partial aggregation is only pushed to the lowest possible
+ * level in the join tree that is deemed useful.  That is, if grouped paths can
+ * be built using the first strategy, we skip consideration of the second
+ * strategy for the same join level.
+ *
+ * Additionally, if there are multiple lowest useful levels where partial
+ * aggregation could be applied, such as in a join tree with relations A, B,
+ * and C where both "AGG(A JOIN B) JOIN C" and "A JOIN AGG(B JOIN C)" are valid
+ * placements, we choose only the first one encountered during join search.
+ * This avoids generating multiple versions of the same grouped relation based
+ * on different aggregation placements.
+ *
+ * These heuristics also ensure that all grouped paths for the same grouped
+ * relation produce the same set of rows, which is a basic assumption in the
+ * planner.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+					  RelOptInfo *rel2, RelOptInfo *joinrel,
+					  SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+	RelOptInfo *grouped_rel;
+	RelOptInfo *grouped_rel1;
+	RelOptInfo *grouped_rel2;
+	bool		rel1_empty;
+	bool		rel2_empty;
+	Relids		agg_apply_at;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Retrieve the grouped relations for the two input rels */
+	grouped_rel1 = rel1->grouped_rel;
+	grouped_rel2 = rel2->grouped_rel;
+
+	rel1_empty = (grouped_rel1 == NULL || IS_DUMMY_REL(grouped_rel1));
+	rel2_empty = (grouped_rel2 == NULL || IS_DUMMY_REL(grouped_rel2));
+
+	/* Find or construct a grouped joinrel for this joinrel */
+	grouped_rel = joinrel->grouped_rel;
+	if (grouped_rel == NULL)
+	{
+		RelAggInfo *agg_info = NULL;
+
+		/*
+		 * Prepare the information needed to create grouped paths for this
+		 * join relation.
+		 */
+		agg_info = create_rel_agg_info(root, joinrel);
+		if (agg_info == NULL)
+			return;
+
+		/*
+		 * If grouped paths for the given join relation are not considered
+		 * useful, and no grouped paths can be built by joining grouped input
+		 * relations, skip building the grouped join relation.
+		 */
+		if (!agg_info->agg_useful &&
+			(rel1_empty == rel2_empty))
+			return;
+
+		/* build the grouped relation */
+		grouped_rel = build_grouped_rel(root, joinrel);
+		grouped_rel->reltarget = agg_info->target;
+
+		if (rel1_empty != rel2_empty)
+		{
+			/*
+			 * If there is exactly one grouped input relation, then we can
+			 * build grouped paths by joining the input relations.  Set size
+			 * estimates for the grouped join relation based on the input
+			 * relations, and update the lowest join level where partial
+			 * aggregation is applied to that of the grouped input relation.
+			 */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			agg_info->apply_at = rel1_empty ?
+				grouped_rel2->agg_info->apply_at :
+				grouped_rel1->agg_info->apply_at;
+		}
+		else
+		{
+			/*
+			 * Otherwise, grouped paths can be built by applying partial
+			 * aggregation on top of existing non-grouped join paths.  Set
+			 * size estimates for the grouped join relation based on the
+			 * estimated number of groups, and track the lowest join level
+			 * where partial aggregation is applied.  Note that these values
+			 * may be updated later if it is determined that grouped paths can
+			 * be constructed by joining other input relations.
+			 */
+			grouped_rel->rows = agg_info->grouped_rows;
+			agg_info->apply_at = bms_copy(joinrel->relids);
+		}
+
+		grouped_rel->agg_info = agg_info;
+		joinrel->grouped_rel = grouped_rel;
+	}
+
+	Assert(IS_GROUPED_REL(grouped_rel));
+
+	/* We may have already proven this grouped join relation to be dummy. */
+	if (IS_DUMMY_REL(grouped_rel))
+		return;
+
+	/*
+	 * Nothing to do if there's no grouped input relation.  Also, joining two
+	 * grouped relations is not currently supported.
+	 */
+	if (rel1_empty == rel2_empty)
+		return;
+
+	/*
+	 * Get the lowest join level where partial aggregation is applied among
+	 * the given input relations.
+	 */
+	agg_apply_at = rel1_empty ?
+		grouped_rel2->agg_info->apply_at :
+		grouped_rel1->agg_info->apply_at;
+
+	/*
+	 * If it's not the designated level, skip building grouped paths.
+	 *
+	 * One exception is when it is a subset of the previously recorded level.
+	 * In that case, we need to update the designated level to this one, and
+	 * adjust the size estimates for the grouped join relation accordingly.
+	 * For example, suppose partial aggregation can be applied on top of (B
+	 * JOIN C).  If we first construct the join as ((A JOIN B) JOIN C), we'd
+	 * record the designated level as including all three relations (A B C).
+	 * Later, when we consider (A JOIN (B JOIN C)), we encounter the smaller
+	 * (B C) join level directly.  Since this is a subset of the previous
+	 * level and still valid for partial aggregation, we update the designated
+	 * level to (B C), and adjust the size estimates accordingly.
+	 */
+	if (!bms_equal(agg_apply_at, grouped_rel->agg_info->apply_at))
+	{
+		if (bms_is_subset(agg_apply_at, grouped_rel->agg_info->apply_at))
+		{
+			/* Adjust the size estimates for the grouped join relation. */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			grouped_rel->agg_info->apply_at = agg_apply_at;
+		}
+		else
+			return;
+	}
+
+	/* Make paths for the grouped join relation. */
+	populate_joinrel_with_paths(root,
+								rel1_empty ? rel1 : grouped_rel1,
+								rel2_empty ? rel2 : grouped_rel2,
+								grouped_rel,
+								sjinfo,
+								restrictlist);
+}
+
 /*
  * populate_joinrel_with_paths
  *	  Add paths to the given joinrel for given pair of joining relations. The
@@ -1615,6 +1803,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 						 adjust_child_relids(joinrel->relids,
 											 nappinfos, appinfos)));
 
+		/* Build a grouped join relation for 'child_joinrel' if possible */
+		make_grouped_join_rel(root, child_rel1, child_rel2,
+							  child_joinrel, child_sjinfo,
+							  child_restrictlist);
+
 		/* And make paths for the child join */
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 3e3fec89252..9cc8c558ccf 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/nbtree.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
@@ -31,6 +32,7 @@
 #include "optimizer/restrictinfo.h"
 #include "parser/analyze.h"
 #include "rewrite/rewriteManip.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/typcache.h"
@@ -81,6 +83,9 @@ typedef struct JoinTreeItem
 } JoinTreeItem;
 
 
+static bool is_partial_agg_memory_risky(PlannerInfo *root);
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
 static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
 									   Index rtindex);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -628,6 +633,323 @@ remove_useless_groupby_columns(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_eager_aggregation
+ *	  Check if eager aggregation is applicable, and if so collect suitable
+ *	  aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+	/*
+	 * Don't apply eager aggregation if disabled by user.
+	 */
+	if (!enable_eager_aggregate)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if there are no available GROUP BY
+	 * clauses.
+	 */
+	if (!root->processed_groupClause)
+		return;
+
+	/*
+	 * For now we don't try to support grouping sets.
+	 */
+	if (root->parse->groupingSets)
+		return;
+
+	/*
+	 * For now we don't try to support DISTINCT or ORDER BY aggregates.
+	 */
+	if (root->numOrderedAggs > 0)
+		return;
+
+	/*
+	 * If there are any aggregates that do not support partial mode, or any
+	 * partial aggregates that are non-serializable, do not apply eager
+	 * aggregation.
+	 */
+	if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+		return;
+
+	/*
+	 * We don't try to apply eager aggregation if there are set-returning
+	 * functions in the targetlist.
+	 */
+	if (root->parse->hasTargetSRFs)
+		return;
+
+	/*
+	 * Eager aggregation only makes sense if there are multiple base rels in
+	 * the query.
+	 */
+	if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if any aggregate poses a risk of
+	 * excessive memory usage during partial aggregation.
+	 */
+	if (is_partial_agg_memory_risky(root))
+		return;
+
+	/*
+	 * Collect aggregate expressions and plain Vars that appear in the
+	 * targetlist and havingQual.
+	 */
+	create_agg_clause_infos(root);
+
+	/*
+	 * If there are no suitable aggregate expressions, we cannot apply eager
+	 * aggregation.
+	 */
+	if (root->agg_clause_list == NIL)
+		return;
+
+	/*
+	 * Collect grouping expressions that appear in grouping clauses.
+	 */
+	create_grouping_expr_infos(root);
+}
+
+/*
+ * is_partial_agg_memory_risky
+ *	  Checks if any aggregate poses a risk of excessive memory usage during
+ *	  partial aggregation.
+ *
+ * We check if any aggregate uses INTERNAL transition type.  Although INTERNAL
+ * is marked as pass-by-value, it usually points to a large internal data
+ * structure (like those used by string_agg or array_agg).  These transition
+ * states can grow large and their size is hard to estimate.  Applying eager
+ * aggregation in such cases risks high memory usage since partial aggregation
+ * results might be stored in join hash tables or materialized nodes.
+ *
+ * We explicitly exclude aggregates with F_NUMERIC_AVG_ACCUM transition
+ * function from this check, based on the assumption that avg(numeric) and
+ * sum(numeric) are safe in this context.
+ */
+static bool
+is_partial_agg_memory_risky(PlannerInfo *root)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->aggtransinfos)
+	{
+		AggTransInfo *transinfo = lfirst_node(AggTransInfo, lc);
+
+		if (transinfo->transfn_oid == F_NUMERIC_AVG_ACCUM)
+			continue;
+
+		if (transinfo->aggtranstype == INTERNALOID)
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * create_agg_clause_infos
+ *	  Search the targetlist and havingQual for Aggrefs and plain Vars, and
+ *	  create an AggClauseInfo for each Aggref node.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+	List	   *tlist_exprs;
+	List	   *agg_clause_list = NIL;
+	List	   *tlist_vars = NIL;
+	Relids		aggregate_relids = NULL;
+	bool		eager_agg_applicable = true;
+	ListCell   *lc;
+
+	Assert(root->agg_clause_list == NIL);
+	Assert(root->tlist_vars == NIL);
+
+	tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+								  PVC_INCLUDE_AGGREGATES |
+								  PVC_RECURSE_WINDOWFUNCS |
+								  PVC_RECURSE_PLACEHOLDERS);
+
+	/*
+	 * Aggregates within the HAVING clause need to be processed in the same
+	 * way as those in the targetlist.  Note that HAVING can contain Aggrefs
+	 * but not WindowFuncs.
+	 */
+	if (root->parse->havingQual != NULL)
+	{
+		List	   *having_exprs;
+
+		having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+									   PVC_INCLUDE_AGGREGATES |
+									   PVC_RECURSE_PLACEHOLDERS);
+		if (having_exprs != NIL)
+		{
+			tlist_exprs = list_concat(tlist_exprs, having_exprs);
+			list_free(having_exprs);
+		}
+	}
+
+	foreach(lc, tlist_exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Aggref	   *aggref;
+		Relids		agg_eval_at;
+		AggClauseInfo *ac_info;
+
+		/* For now we don't try to support GROUPING() expressions */
+		if (IsA(expr, GroupingFunc))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* Collect plain Vars for future reference */
+		if (IsA(expr, Var))
+		{
+			tlist_vars = list_append_unique(tlist_vars, expr);
+			continue;
+		}
+
+		aggref = castNode(Aggref, expr);
+
+		Assert(aggref->aggorder == NIL);
+		Assert(aggref->aggdistinct == NIL);
+
+		/*
+		 * If there are any securityQuals, do not try to apply eager
+		 * aggregation if any non-leakproof aggregate functions are present.
+		 * This is overly strict, but for now...
+		 */
+		if (root->qual_security_level > 0 &&
+			!get_func_leakproof(aggref->aggfnoid))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+		/*
+		 * If all base relations in the query are referenced by aggregate
+		 * functions, then eager aggregation is not applicable.
+		 */
+		aggregate_relids = bms_add_members(aggregate_relids, agg_eval_at);
+		if (bms_is_subset(root->all_baserels, aggregate_relids))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* OK, create the AggClauseInfo node */
+		ac_info = makeNode(AggClauseInfo);
+		ac_info->aggref = aggref;
+		ac_info->agg_eval_at = agg_eval_at;
+
+		/* ... and add it to the list */
+		agg_clause_list = list_append_unique(agg_clause_list, ac_info);
+	}
+
+	list_free(tlist_exprs);
+
+	if (eager_agg_applicable)
+	{
+		root->agg_clause_list = agg_clause_list;
+		root->tlist_vars = tlist_vars;
+	}
+	else
+	{
+		list_free_deep(agg_clause_list);
+		list_free(tlist_vars);
+	}
+}
+
+/*
+ * create_grouping_expr_infos
+ *	  Create a GroupingExprInfo for each expression usable as a grouping key.
+ *
+ * If any grouping expression is not suitable, we will just return with
+ * root->group_expr_list being NIL.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+	List	   *exprs = NIL;
+	List	   *sortgrouprefs = NIL;
+	List	   *btree_opfamilies = NIL;
+	ListCell   *lc,
+			   *lc1,
+			   *lc2,
+			   *lc3;
+
+	Assert(root->group_expr_list == NIL);
+
+	foreach(lc, root->processed_groupClause)
+	{
+		SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+		TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+		TypeCacheEntry *tce;
+		Oid			equalimageproc;
+
+		Assert(tle->ressortgroupref > 0);
+
+		/*
+		 * For now we only support plain Vars as grouping expressions.
+		 */
+		if (!IsA(tle->expr, Var))
+			return;
+
+		/*
+		 * Eager aggregation is only possible if equality implies image
+		 * equality for each grouping key.  Otherwise, placing keys with
+		 * different byte images into the same group may result in the loss of
+		 * information that could be necessary to evaluate upper qual clauses.
+		 *
+		 * For instance, the NUMERIC data type is not supported, as values
+		 * that are considered equal by the equality operator (e.g., 0 and
+		 * 0.0) can have different scales.
+		 */
+		tce = lookup_type_cache(exprType((Node *) tle->expr),
+								TYPECACHE_BTREE_OPFAMILY);
+		if (!OidIsValid(tce->btree_opf) ||
+			!OidIsValid(tce->btree_opintype))
+			return;
+
+		equalimageproc = get_opfamily_proc(tce->btree_opf,
+										   tce->btree_opintype,
+										   tce->btree_opintype,
+										   BTEQUALIMAGE_PROC);
+		if (!OidIsValid(equalimageproc) ||
+			!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+											   tce->typcollation,
+											   ObjectIdGetDatum(tce->btree_opintype))))
+			return;
+
+		exprs = lappend(exprs, tle->expr);
+		sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+		btree_opfamilies = lappend_oid(btree_opfamilies, tce->btree_opf);
+	}
+
+	/*
+	 * Construct a GroupingExprInfo for each expression.
+	 */
+	forthree(lc1, exprs, lc2, sortgrouprefs, lc3, btree_opfamilies)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc1);
+		int			sortgroupref = lfirst_int(lc2);
+		Oid			btree_opfamily = lfirst_oid(lc3);
+		GroupingExprInfo *ge_info;
+
+		ge_info = makeNode(GroupingExprInfo);
+		ge_info->expr = (Expr *) copyObject(expr);
+		ge_info->sortgroupref = sortgroupref;
+		ge_info->btree_opfamily = btree_opfamily;
+
+		root->group_expr_list = lappend(root->group_expr_list, ge_info);
+	}
+}
+
 /*****************************************************************************
  *
  *	  LATERAL REFERENCES
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 5467e094ca7..eefc486a566 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -76,6 +76,9 @@ query_planner(PlannerInfo *root,
 	root->placeholder_list = NIL;
 	root->placeholder_array = NULL;
 	root->placeholder_array_size = 0;
+	root->agg_clause_list = NIL;
+	root->group_expr_list = NIL;
+	root->tlist_vars = NIL;
 	root->fkey_list = NIL;
 	root->initial_rels = NIL;
 
@@ -265,6 +268,12 @@ query_planner(PlannerInfo *root,
 	 */
 	extract_restriction_or_clauses(root);
 
+	/*
+	 * Check if eager aggregation is applicable, and if so, set up
+	 * root->agg_clause_list and root->group_expr_list.
+	 */
+	setup_eager_aggregation(root);
+
 	/*
 	 * Now expand appendrels by adding "otherrels" for their children.  We
 	 * delay this to the end so that we have as much information as possible
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 41bd8353430..462c5335589 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -232,7 +232,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 									  RelOptInfo *partially_grouped_rel,
 									  const AggClauseCosts *agg_costs,
 									  grouping_sets_data *gd,
-									  double dNumGroups,
 									  GroupPathExtraData *extra);
 static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
 												 RelOptInfo *grouped_rel,
@@ -4010,9 +4009,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 							   GroupPathExtraData *extra,
 							   RelOptInfo **partially_grouped_rel_p)
 {
-	Path	   *cheapest_path = input_rel->cheapest_total_path;
 	RelOptInfo *partially_grouped_rel = NULL;
-	double		dNumGroups;
 	PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
 
 	/*
@@ -4094,23 +4091,16 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Gather any partially grouped partial paths. */
 	if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
-	{
 		gather_grouping_paths(root, partially_grouped_rel);
-		set_cheapest(partially_grouped_rel);
-	}
 
-	/*
-	 * Estimate number of groups.
-	 */
-	dNumGroups = get_number_of_groups(root,
-									  cheapest_path->rows,
-									  gd,
-									  extra->targetList);
+	/* Now choose the best path(s) for partially_grouped_rel. */
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+		set_cheapest(partially_grouped_rel);
 
 	/* Build final grouping paths */
 	add_paths_to_grouping_rel(root, input_rel, grouped_rel,
 							  partially_grouped_rel, agg_costs, gd,
-							  dNumGroups, extra);
+							  extra);
 
 	/* Give a helpful error if we failed to find any implementation */
 	if (grouped_rel->pathlist == NIL)
@@ -7055,16 +7045,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 						  RelOptInfo *grouped_rel,
 						  RelOptInfo *partially_grouped_rel,
 						  const AggClauseCosts *agg_costs,
-						  grouping_sets_data *gd, double dNumGroups,
+						  grouping_sets_data *gd,
 						  GroupPathExtraData *extra)
 {
 	Query	   *parse = root->parse;
 	Path	   *cheapest_path = input_rel->cheapest_total_path;
+	Path	   *cheapest_partially_grouped_path = NULL;
 	ListCell   *lc;
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 	List	   *havingQual = (List *) extra->havingQual;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+	double		dNumGroups = 0;
+	double		dNumFinalGroups = 0;
+
+	/*
+	 * Estimate number of groups for non-split aggregation.
+	 */
+	dNumGroups = get_number_of_groups(root,
+									  cheapest_path->rows,
+									  gd,
+									  extra->targetList);
+
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+	{
+		cheapest_partially_grouped_path =
+			partially_grouped_rel->cheapest_total_path;
+
+		/*
+		 * Estimate number of groups for final phase of partial aggregation.
+		 */
+		dNumFinalGroups =
+			get_number_of_groups(root,
+								 cheapest_partially_grouped_path->rows,
+								 gd,
+								 extra->targetList);
+	}
 
 	if (can_sort)
 	{
@@ -7177,7 +7193,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 					path = make_ordered_path(root,
 											 grouped_rel,
 											 path,
-											 partially_grouped_rel->cheapest_total_path,
+											 cheapest_partially_grouped_path,
 											 info->pathkeys,
 											 -1.0);
 
@@ -7195,7 +7211,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												 info->clauses,
 												 havingQual,
 												 agg_final_costs,
-												 dNumGroups));
+												 dNumFinalGroups));
 					else
 						add_path(grouped_rel, (Path *)
 								 create_group_path(root,
@@ -7203,7 +7219,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												   path,
 												   info->clauses,
 												   havingQual,
-												   dNumGroups));
+												   dNumFinalGroups));
 
 				}
 			}
@@ -7245,19 +7261,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 		 */
 		if (partially_grouped_rel && partially_grouped_rel->pathlist)
 		{
-			Path	   *path = partially_grouped_rel->cheapest_total_path;
-
 			add_path(grouped_rel, (Path *)
 					 create_agg_path(root,
 									 grouped_rel,
-									 path,
+									 cheapest_partially_grouped_path,
 									 grouped_rel->reltarget,
 									 AGG_HASHED,
 									 AGGSPLIT_FINAL_DESERIAL,
 									 root->processed_groupClause,
 									 havingQual,
 									 agg_final_costs,
-									 dNumGroups));
+									 dNumFinalGroups));
 		}
 	}
 
@@ -7297,6 +7311,7 @@ create_partial_grouping_paths(PlannerInfo *root,
 {
 	Query	   *parse = root->parse;
 	RelOptInfo *partially_grouped_rel;
+	RelOptInfo *eager_agg_rel = NULL;
 	AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
 	Path	   *cheapest_partial_path = NULL;
@@ -7307,6 +7322,15 @@ create_partial_grouping_paths(PlannerInfo *root,
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 
+	/*
+	 * Check whether any partially aggregated paths have been generated
+	 * through eager aggregation.
+	 */
+	if (input_rel->grouped_rel &&
+		!IS_DUMMY_REL(input_rel->grouped_rel) &&
+		input_rel->grouped_rel->pathlist != NIL)
+		eager_agg_rel = input_rel->grouped_rel;
+
 	/*
 	 * Consider whether we should generate partially aggregated non-partial
 	 * paths.  We can only do this if we have a non-partial path, and only if
@@ -7328,11 +7352,13 @@ create_partial_grouping_paths(PlannerInfo *root,
 
 	/*
 	 * If we can't partially aggregate partial paths, and we can't partially
-	 * aggregate non-partial paths, then don't bother creating the new
+	 * aggregate non-partial paths, and no partially aggregated paths were
+	 * generated by eager aggregation, then don't bother creating the new
 	 * RelOptInfo at all, unless the caller specified force_rel_creation.
 	 */
 	if (cheapest_total_path == NULL &&
 		cheapest_partial_path == NULL &&
+		eager_agg_rel == NULL &&
 		!force_rel_creation)
 		return NULL;
 
@@ -7557,6 +7583,51 @@ create_partial_grouping_paths(PlannerInfo *root,
 										 dNumPartialPartialGroups));
 	}
 
+	/*
+	 * Add any partially aggregated paths generated by eager aggregation to
+	 * the new upper relation after applying projection steps as needed.
+	 */
+	if (eager_agg_rel)
+	{
+		/* Add the paths */
+		foreach(lc, eager_agg_rel->pathlist)
+		{
+			Path	   *path = (Path *) lfirst(lc);
+
+			/* Shouldn't have any parameterized paths anymore */
+			Assert(path->param_info == NULL);
+
+			path = (Path *) create_projection_path(root,
+												   partially_grouped_rel,
+												   path,
+												   partially_grouped_rel->reltarget);
+
+			add_path(partially_grouped_rel, path);
+		}
+
+		/*
+		 * Likewise add the partial paths, but only if parallelism is possible
+		 * for partially_grouped_rel.
+		 */
+		if (partially_grouped_rel->consider_parallel)
+		{
+			foreach(lc, eager_agg_rel->partial_pathlist)
+			{
+				Path	   *path = (Path *) lfirst(lc);
+
+				/* Shouldn't have any parameterized paths anymore */
+				Assert(path->param_info == NULL);
+
+				path = (Path *) create_projection_path(root,
+													   partially_grouped_rel,
+													   path,
+													   partially_grouped_rel->reltarget);
+
+				add_partial_path(partially_grouped_rel, path);
+			}
+		}
+	}
+
 	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding partially grouped ForeignPaths.
@@ -8120,13 +8191,6 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
 
 		add_paths_to_append_rel(root, partially_grouped_rel,
 								partially_grouped_live_children);
-
-		/*
-		 * We need call set_cheapest, since the finalization step will use the
-		 * cheapest path from the rel.
-		 */
-		if (partially_grouped_rel->pathlist)
-			set_cheapest(partially_grouped_rel);
 	}
 
 	/* If possible, create append paths for fully grouped children. */
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index 5b3dc0d8653..11c0eb0d180 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -516,6 +516,65 @@ adjust_appendrel_attrs_mutator(Node *node,
 		return (Node *) newinfo;
 	}
 
+	/*
+	 * We have to process RelAggInfo nodes specially.
+	 */
+	if (IsA(node, RelAggInfo))
+	{
+		RelAggInfo *oldinfo = (RelAggInfo *) node;
+		RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newinfo, oldinfo, sizeof(RelAggInfo));
+
+		newinfo->relids = adjust_child_relids(oldinfo->relids,
+											  nappinfos, appinfos);
+
+		newinfo->target = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+										   context);
+
+		newinfo->agg_input = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+										   context);
+
+		newinfo->group_clauses = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_clauses,
+										   context);
+
+		newinfo->group_exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+										   context);
+
+		return (Node *) newinfo;
+	}
+
+	/*
+	 * We have to process PathTarget nodes specially.
+	 */
+	if (IsA(node, PathTarget))
+	{
+		PathTarget *oldtarget = (PathTarget *) node;
+		PathTarget *newtarget = makeNode(PathTarget);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+		newtarget->exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+										   context);
+
+		if (oldtarget->sortgrouprefs)
+		{
+			Size		nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+			newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+			memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+		}
+
+		return (Node *) newtarget;
+	}
+
 	/*
 	 * NOTE: we do not need to recurse into sublinks, because they should
 	 * already have been converted to subplans before we see them.
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 0e523d2eb5b..faa44e46594 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,8 @@
 
 #include <limits.h>
 
+#include "access/nbtree.h"
+#include "catalog/pg_constraint.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
@@ -27,12 +29,16 @@
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
 #include "parser/parse_relation.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/typcache.h"
 
 
 typedef struct JoinHashEntry
@@ -83,6 +89,14 @@ static void build_child_join_reltarget(PlannerInfo *root,
 									   RelOptInfo *childrel,
 									   int nappinfos,
 									   AppendRelInfo **appinfos);
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+													RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+								  PathTarget *target, PathTarget *agg_input,
+								  List **group_clauses, List **group_exprs);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
 
 
 /*
@@ -278,6 +292,8 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	rel->joininfo = NIL;
 	rel->has_eclass_joins = false;
 	rel->consider_partitionwise_join = false;	/* might get changed later */
+	rel->agg_info = NULL;
+	rel->grouped_rel = NULL;
 	rel->part_scheme = NULL;
 	rel->nparts = -1;
 	rel->boundinfo = NULL;
@@ -408,6 +424,103 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	return rel;
 }
 
+/*
+ * build_simple_grouped_rel
+ *	  Construct a new RelOptInfo representing a grouped version of the input
+ *	  base relation.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+	RelAggInfo *agg_info;
+
+	/*
+	 * We should have available aggregate expressions and grouping
+	 * expressions, otherwise we cannot reach here.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/* nothing to do for dummy rel */
+	if (IS_DUMMY_REL(rel))
+		return NULL;
+
+	/*
+	 * Prepare the information needed to create grouped paths for this base
+	 * relation.
+	 */
+	agg_info = create_rel_agg_info(root, rel);
+	if (agg_info == NULL)
+		return NULL;
+
+	/*
+	 * If grouped paths for the given base relation are not considered useful,
+	 * skip building the grouped relation.
+	 */
+	if (!agg_info->agg_useful)
+		return NULL;
+
+	/* Tracks the lowest join level at which partial aggregation is applied */
+	agg_info->apply_at = bms_copy(rel->relids);
+
+	/* build the grouped relation */
+	grouped_rel = build_grouped_rel(root, rel);
+	grouped_rel->reltarget = agg_info->target;
+	grouped_rel->rows = agg_info->grouped_rows;
+	grouped_rel->agg_info = agg_info;
+
+	rel->grouped_rel = grouped_rel;
+
+	return grouped_rel;
+}
+
+/*
+ * build_grouped_rel
+ *	  Build a grouped relation by flat copying the input relation and resetting
+ *	  the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	grouped_rel = makeNode(RelOptInfo);
+	memcpy(grouped_rel, rel, sizeof(RelOptInfo));
+
+	/*
+	 * clear path info
+	 */
+	grouped_rel->pathlist = NIL;
+	grouped_rel->ppilist = NIL;
+	grouped_rel->partial_pathlist = NIL;
+	grouped_rel->cheapest_startup_path = NULL;
+	grouped_rel->cheapest_total_path = NULL;
+	grouped_rel->cheapest_parameterized_paths = NIL;
+
+	/*
+	 * clear partition info
+	 */
+	grouped_rel->part_scheme = NULL;
+	grouped_rel->nparts = -1;
+	grouped_rel->boundinfo = NULL;
+	grouped_rel->partbounds_merged = false;
+	grouped_rel->partition_qual = NIL;
+	grouped_rel->part_rels = NULL;
+	grouped_rel->live_parts = NULL;
+	grouped_rel->all_partrels = NULL;
+	grouped_rel->partexprs = NULL;
+	grouped_rel->nullable_partexprs = NULL;
+	grouped_rel->consider_partitionwise_join = false;
+
+	/*
+	 * clear size estimates
+	 */
+	grouped_rel->rows = 0;
+
+	return grouped_rel;
+}
+
 /*
  * find_base_rel
  *	  Find a base or otherrel relation entry, which must already exist.
@@ -759,6 +872,8 @@ build_join_rel(PlannerInfo *root,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = NULL;
 	joinrel->top_parent = NULL;
 	joinrel->top_parent_relids = NULL;
@@ -945,6 +1060,8 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = parent_joinrel;
 	joinrel->top_parent = parent_joinrel->top_parent ? parent_joinrel->top_parent : parent_joinrel;
 	joinrel->top_parent_relids = joinrel->top_parent->relids;
@@ -2523,3 +2640,514 @@ build_child_join_reltarget(PlannerInfo *root,
 	childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
 	childrel->reltarget->width = parentrel->reltarget->width;
 }
+
+/*
+ * create_rel_agg_info
+ *	  Create the RelAggInfo structure for the given relation if it can produce
+ *	  grouped paths.  The given relation is the non-grouped one which has the
+ *	  reltarget already constructed.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	RelAggInfo *result;
+	PathTarget *agg_input;
+	PathTarget *target;
+	List	   *group_clauses = NIL;
+	List	   *group_exprs = NIL;
+
+	/*
+	 * The lists of aggregate expressions and grouping expressions should have
+	 * been constructed.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/*
+	 * If this is a child rel, the grouped rel for its top parent must have
+	 * been created already, if that was possible.  So we can use the parent's
+	 * RelAggInfo if there is one, with appropriate variable substitutions.
+	 */
+	if (IS_OTHER_REL(rel))
+	{
+		RelOptInfo *grouped_rel;
+		RelAggInfo *agg_info;
+
+		grouped_rel = rel->top_parent->grouped_rel;
+		if (grouped_rel == NULL)
+			return NULL;
+
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		/* Must do multi-level transformation */
+		agg_info = (RelAggInfo *)
+			adjust_appendrel_attrs_multilevel(root,
+											  (Node *) grouped_rel->agg_info,
+											  rel,
+											  rel->top_parent);
+
+		agg_info->grouped_rows =
+			estimate_num_groups(root, agg_info->group_exprs,
+								rel->rows, NULL, NULL);
+
+		agg_info->apply_at = NULL;	/* caller will change this later */
+
+		/*
+		 * The grouped paths for the given relation are considered useful iff
+		 * the average group size is no less than min_eager_agg_group_size.
+		 */
+		agg_info->agg_useful =
+			(rel->rows / agg_info->grouped_rows) >= min_eager_agg_group_size;
+
+		return agg_info;
+	}
+
+	/* Check if it's possible to produce grouped paths for this relation. */
+	if (!eager_aggregation_possible_for_relation(root, rel))
+		return NULL;
+
+	/*
+	 * Create targets for the grouped paths and for the input paths of the
+	 * grouped paths.
+	 */
+	target = create_empty_pathtarget();
+	agg_input = create_empty_pathtarget();
+
+	/* ... and initialize these targets */
+	if (!init_grouping_targets(root, rel, target, agg_input,
+							   &group_clauses, &group_exprs))
+		return NULL;
+
+	/*
+	 * Eager aggregation is not applicable if there are no available grouping
+	 * expressions.
+	 */
+	if (list_length(group_clauses) == 0)
+		return NULL;
+
+	/* build the RelAggInfo result */
+	result = makeNode(RelAggInfo);
+
+	result->group_clauses = group_clauses;
+	result->group_exprs = group_exprs;
+
+	/* Calculate pathkeys that represent these grouping requirements */
+	result->group_pathkeys =
+		make_pathkeys_for_sortclauses(root, result->group_clauses,
+									  make_tlist_from_pathtarget(target));
+
+	/* Add aggregates to the grouping target */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		Aggref	   *aggref;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		aggref = (Aggref *) copyObject(ac_info->aggref);
+		mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+		add_column_to_pathtarget(target, (Expr *) aggref, 0);
+	}
+
+	/* Set the estimated eval cost and output width for both targets */
+	set_pathtarget_cost_width(root, target);
+	set_pathtarget_cost_width(root, agg_input);
+
+	result->relids = bms_copy(rel->relids);
+	result->target = target;
+	result->agg_input = agg_input;
+	result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+											   rel->rows, NULL, NULL);
+	result->apply_at = NULL;	/* caller will change this later */
+
+	/*
+	 * The grouped paths for the given relation are considered useful iff the
+	 * average group size is no less than min_eager_agg_group_size.
+	 */
+	result->agg_useful =
+		(rel->rows / result->grouped_rows) >= min_eager_agg_group_size;
+
+	return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ *	  Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	int			cur_relid;
+
+	/*
+	 * Check to see if the given relation is on the nullable side of an outer
+	 * join.  In this case, we cannot push a partial aggregation down to the
+	 * relation, because the NULL-extended rows produced by the outer join
+	 * would not be available when we perform the partial aggregation, while
+	 * with a non-eager-aggregation plan these rows are available for the
+	 * top-level aggregation.  Doing so may result in the rows being grouped
+	 * differently than expected, or produce incorrect values from the
+	 * aggregate functions.
+	 */
+	cur_relid = -1;
+	while ((cur_relid = bms_next_member(rel->relids, cur_relid)) >= 0)
+	{
+		RelOptInfo *baserel = find_base_rel_ignore_join(root, cur_relid);
+
+		if (baserel == NULL)
+			continue;			/* ignore outer joins in rel->relids */
+
+		if (!bms_is_subset(baserel->nulling_relids, rel->relids))
+			return false;
+	}
+
+	/*
+	 * For now we don't try to support PlaceHolderVars.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = lfirst(lc);
+
+		if (IsA(expr, PlaceHolderVar))
+			return false;
+	}
+
+	/* Caller should only pass base relations or joins. */
+	Assert(rel->reloptkind == RELOPT_BASEREL ||
+		   rel->reloptkind == RELOPT_JOINREL);
+
+	/*
+	 * Check if all aggregate expressions can be evaluated on this relation
+	 * level.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		/*
+		 * Give up if any aggregate requires relations other than the current
+		 * one.  If the aggregate requires the current relation plus
+		 * additional relations, grouping the current relation could make some
+		 * input rows unavailable for the higher aggregate and may reduce the
+		 * number of input rows it receives.  If the aggregate does not
+		 * require the current relation at all, it should not be grouped, as
+		 * we do not support joining two grouped relations.
+		 */
+		if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * init_grouping_targets
+ *	  Initialize the target for grouped paths (target) as well as the target
+ *	  for paths that generate input for the grouped paths (agg_input).
+ *
+ * We also construct the list of SortGroupClauses and the list of grouping
+ * expressions for the partial aggregation, and return them in *group_clauses
+ * and *group_exprs.
+ *
+ * Return true if the targets could be initialized, false otherwise.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+					  PathTarget *target, PathTarget *agg_input,
+					  List **group_clauses, List **group_exprs)
+{
+	ListCell   *lc;
+	List	   *possibly_dependent = NIL;
+	Index		maxSortGroupRef;
+
+	/* Identify the max sortgroupref */
+	maxSortGroupRef = 0;
+	foreach(lc, root->processed_tlist)
+	{
+		Index		ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+		if (ref > maxSortGroupRef)
+			maxSortGroupRef = ref;
+	}
+
+	/*
+	 * At this point, all Vars from this relation that are needed by upper
+	 * joins or are required in the final targetlist should already be present
+	 * in its reltarget.  Therefore, we can safely iterate over this
+	 * relation's reltarget->exprs to construct the PathTarget and grouping
+	 * clauses for the grouped paths.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Index		sortgroupref;
+
+		/*
+		 * Given that PlaceHolderVar currently prevents us from doing eager
+		 * aggregation, the source target cannot contain anything more complex
+		 * than a Var.
+		 */
+		Assert(IsA(expr, Var));
+
+		/*
+		 * Get the sortgroupref of the expr if it is found among, or can be
+		 * deduced from, the original grouping expressions.
+		 */
+		sortgroupref = get_expression_sortgroupref(root, expr);
+		if (sortgroupref > 0)
+		{
+			SortGroupClause *sgc;
+
+			/* Find the matching SortGroupClause */
+			sgc = get_sortgroupref_clause(sortgroupref, root->processed_groupClause);
+			Assert(sgc->tleSortGroupRef <= maxSortGroupRef);
+
+			/*
+			 * If the target expression is to be used as a grouping key, it
+			 * should be emitted by the grouped paths that have been pushed
+			 * down to this relation level.
+			 */
+			add_column_to_pathtarget(target, expr, sortgroupref);
+
+			/*
+			 * ... and it also should be emitted by the input paths.
+			 */
+			add_column_to_pathtarget(agg_input, expr, sortgroupref);
+
+			/*
+			 * Record this SortGroupClause and grouping expression.  Note that
+			 * this SortGroupClause might have already been recorded.
+			 */
+			if (!list_member(*group_clauses, sgc))
+			{
+				*group_clauses = lappend(*group_clauses, sgc);
+				*group_exprs = lappend(*group_exprs, expr);
+			}
+		}
+		else if (is_var_needed_by_join(root, (Var *) expr, rel))
+		{
+			/*
+			 * The expression is needed for an upper join but is neither in
+			 * the GROUP BY clause nor derivable from it using EC (otherwise,
+			 * it would have already been included in the targets above).  We
+			 * need to create a special SortGroupClause for this expression.
+			 *
+			 * It is important to include such expressions in the grouping
+			 * keys.  This is essential to ensure that an aggregated row from
+			 * the partial aggregation matches the other side of the join if
+			 * and only if each row in the partial group does.  This ensures
+			 * that all rows within the same partial group share the same
+			 * 'destiny', which is crucial for maintaining correctness.
+			 */
+			SortGroupClause *sgc;
+			TypeCacheEntry *tce;
+			Oid			equalimageproc;
+
+			/*
+			 * But first, check if equality implies image equality for this
+			 * expression.  If not, we cannot use it as a grouping key.  See
+			 * comments in create_grouping_expr_infos().
+			 */
+			tce = lookup_type_cache(exprType((Node *) expr),
+									TYPECACHE_BTREE_OPFAMILY);
+			if (!OidIsValid(tce->btree_opf) ||
+				!OidIsValid(tce->btree_opintype))
+				return false;
+
+			equalimageproc = get_opfamily_proc(tce->btree_opf,
+											   tce->btree_opintype,
+											   tce->btree_opintype,
+											   BTEQUALIMAGE_PROC);
+			if (!OidIsValid(equalimageproc) ||
+				!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+												   tce->typcollation,
+												   ObjectIdGetDatum(tce->btree_opintype))))
+				return false;
+
+			/* Create the SortGroupClause. */
+			sgc = makeNode(SortGroupClause);
+
+			/* Initialize the SortGroupClause. */
+			sgc->tleSortGroupRef = ++maxSortGroupRef;
+			get_sort_group_operators(exprType((Node *) expr),
+									 false, true, false,
+									 &sgc->sortop, &sgc->eqop, NULL,
+									 &sgc->hashable);
+
+			/* This expression should be emitted by the grouped paths */
+			add_column_to_pathtarget(target, expr, sgc->tleSortGroupRef);
+
+			/* ... and it also should be emitted by the input paths. */
+			add_column_to_pathtarget(agg_input, expr, sgc->tleSortGroupRef);
+
+			/* Record this SortGroupClause and grouping expression */
+			*group_clauses = lappend(*group_clauses, sgc);
+			*group_exprs = lappend(*group_exprs, expr);
+		}
+		else if (is_var_in_aggref_only(root, (Var *) expr))
+		{
+			/*
+			 * The expression is referenced by an aggregate function pushed
+			 * down to this relation and does not appear elsewhere in the
+			 * targetlist or havingQual.  Add it to 'agg_input' but not to
+			 * 'target'.
+			 */
+			add_new_column_to_pathtarget(agg_input, expr);
+		}
+		else
+		{
+			/*
+			 * The expression may be functionally dependent on other
+			 * expressions in the target, but we cannot verify this until all
+			 * target expressions have been constructed.
+			 */
+			possibly_dependent = lappend(possibly_dependent, expr);
+		}
+	}
+
+	/*
+	 * Now we can verify whether an expression is functionally dependent on
+	 * others.
+	 */
+	foreach(lc, possibly_dependent)
+	{
+		Var		   *tvar;
+		List	   *deps = NIL;
+		RangeTblEntry *rte;
+
+		tvar = lfirst_node(Var, lc);
+		rte = root->simple_rte_array[tvar->varno];
+
+		if (check_functional_grouping(rte->relid, tvar->varno,
+									  tvar->varlevelsup,
+									  target->exprs, &deps))
+		{
+			/*
+			 * The expression is functionally dependent on other target
+			 * expressions, so it can be included in the targets.  Since it
+			 * will not be used as a grouping key, a sortgroupref is not
+			 * needed for it.
+			 */
+			add_new_column_to_pathtarget(target, (Expr *) tvar);
+			add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+		}
+		else
+		{
+			/*
+			 * We may arrive here with a grouping expression that is proven
+			 * redundant by EquivalenceClass processing, such as 't1.a' in the
+			 * query below.
+			 *
+			 * select max(t1.c) from t t1, t t2 where t1.a = 1 group by t1.a,
+			 * t1.b;
+			 *
+			 * For now we just give up in this case.
+			 */
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ *	  Check whether the given Var appears in aggregate expressions and not
+ *	  elsewhere in the targetlist or havingQual.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+	ListCell   *lc;
+
+	/*
+	 * Search the list of aggregate expressions for the Var.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		List	   *vars;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+			continue;
+
+		vars = pull_var_clause((Node *) ac_info->aggref,
+							   PVC_RECURSE_AGGREGATES |
+							   PVC_RECURSE_WINDOWFUNCS |
+							   PVC_RECURSE_PLACEHOLDERS);
+
+		if (list_member(vars, var))
+		{
+			list_free(vars);
+			break;
+		}
+
+		list_free(vars);
+	}
+
+	return (lc != NULL && !list_member(root->tlist_vars, var));
+}
+
+/*
+ * is_var_needed_by_join
+ *	  Check if the given Var is needed by joins above the current rel.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+	Relids		relids;
+	int			attno;
+	RelOptInfo *baserel;
+
+	/*
+	 * Note that when checking if the Var is needed by joins above, we want to
+	 * exclude cases where the Var is only needed in the final targetlist.  So
+	 * include "relation 0" in the check.
+	 */
+	relids = bms_copy(rel->relids);
+	relids = bms_add_member(relids, 0);
+
+	baserel = find_base_rel(root, var->varno);
+	attno = var->varattno - baserel->min_attr;
+
+	return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ *	  Return the sortgroupref of the given "expr" if it is found among the
+ *	  original grouping expressions, or is known equal to any of the original
+ *	  grouping expressions due to equivalence relationships.  Return 0 if no
+ *	  match is found.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->group_expr_list)
+	{
+		GroupingExprInfo *ge_info = lfirst_node(GroupingExprInfo, lc);
+
+		Assert(IsA(ge_info->expr, Var));
+
+		if (equal(ge_info->expr, expr) ||
+			exprs_known_equal(root, (Node *) expr, (Node *) ge_info->expr,
+							  ge_info->btree_opfamily))
+		{
+			Assert(ge_info->sortgroupref > 0);
+
+			return ge_info->sortgroupref;
+		}
+	}
+
+	/* no match is found */
+	return 0;
+}
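As a rough illustration of the agg_useful test in create_rel_agg_info() above: the heuristic compares the average number of input rows per partial group against min_eager_agg_group_size. The sketch below is standalone C, not planner code. eager_agg_is_useful() and its parameter names are hypothetical, and the guard against a non-positive group estimate is an assumption added for the sketch (the patch itself divides by the estimate_num_groups() result directly).

```c
#include <stdbool.h>

/*
 * Hypothetical helper, for illustration only: returns true when the average
 * number of input rows per partial group reaches the given threshold.
 */
bool
eager_agg_is_useful(double rows, double grouped_rows, double min_group_size)
{
	/* Assumption for this sketch: treat a non-positive estimate as one group. */
	if (grouped_rows <= 0.0)
		grouped_rows = 1.0;

	return (rows / grouped_rows) >= min_group_size;
}
```

For example, with the default threshold of 8.0 from the GUC definition in this patch, 1000 input rows estimated to fall into 100 groups (average group size 10) would pass, while 1000 rows in 200 groups (average group size 5) would not.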
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index a157cec3c4d..466aabb8cf0 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -145,6 +145,13 @@
   boot_val => 'false',
 },
 
+{ name => 'enable_eager_aggregate', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
+  short_desc => 'Enables eager aggregation.',
+  flags => 'GUC_EXPLAIN',
+  variable => 'enable_eager_aggregate',
+  boot_val => 'true',
+},
+
 { name => 'enable_parallel_append', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
   short_desc => 'Enables the planner\'s use of parallel append plans.',
   flags => 'GUC_EXPLAIN',
@@ -2421,6 +2428,15 @@
   max => 'DBL_MAX',
 },
 
+{ name => 'min_eager_agg_group_size', type => 'real', context => 'PGC_USERSET', group => 'QUERY_TUNING_COST',
+  short_desc => 'Sets the minimum average group size required to consider applying eager aggregation.',
+  flags => 'GUC_EXPLAIN',
+  variable => 'min_eager_agg_group_size',
+  boot_val => '8.0',
+  min => '0.0',
+  max => 'DBL_MAX',
+},
+
 { name => 'cursor_tuple_fraction', type => 'real', context => 'PGC_USERSET', group => 'QUERY_TUNING_OTHER',
   short_desc => 'Sets the planner\'s estimate of the fraction of a cursor\'s rows that will be retrieved.',
   flags => 'GUC_EXPLAIN',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index a9d8293474a..e3cdfe11992 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -428,6 +428,7 @@
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
 #enable_self_join_elimination = on
+#enable_eager_aggregate = on
 
 # - Planner Cost Constants -
 
@@ -441,6 +442,7 @@
 #min_parallel_table_scan_size = 8MB
 #min_parallel_index_scan_size = 512kB
 #effective_cache_size = 4GB
+#min_eager_agg_group_size = 8.0
 
 #jit_above_cost = 100000		# perform JIT compilation if available
 					# and query more expensive than this;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 4a903d1ec18..ad211207343 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -397,6 +397,15 @@ struct PlannerInfo
 	/* list of PlaceHolderInfos */
 	List	   *placeholder_list;
 
+	/* list of AggClauseInfos */
+	List	   *agg_clause_list;
+
+	/* list of GroupingExprInfos */
+	List	   *group_expr_list;
+
+	/* list of plain Vars contained in targetlist and havingQual */
+	List	   *tlist_vars;
+
 	/* array of PlaceHolderInfos indexed by phid */
 	struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
 	/* allocated size of array */
@@ -1046,6 +1055,14 @@ typedef struct RelOptInfo
 	/* consider partitionwise join paths? (if partitioned rel) */
 	bool		consider_partitionwise_join;
 
+	/*
+	 * used by eager aggregation:
+	 */
+	/* information needed to create grouped paths */
+	struct RelAggInfo *agg_info;
+	/* the partially-aggregated version of the relation */
+	struct RelOptInfo *grouped_rel;
+
 	/*
 	 * inheritance links, if this is an otherrel (otherwise NULL):
 	 */
@@ -1130,6 +1147,75 @@ typedef struct RelOptInfo
 	((nominal_jointype) == JOIN_INNER && (sjinfo)->jointype == JOIN_SEMI && \
 	 bms_equal((sjinfo)->syn_righthand, (rel)->relids))
 
+/*
+ * Is the given relation a grouped relation?
+ */
+#define IS_GROUPED_REL(rel) \
+	((rel)->agg_info != NULL)
+
+/*
+ * RelAggInfo
+ *		Information needed to create grouped paths for base and join rels.
+ *
+ * "relids" is the set of relation identifiers (RT indexes).
+ *
+ * "target" is the output tlist for the grouped paths.
+ *
+ * "agg_input" is the output tlist for the paths that provide input to the
+ * grouped paths.  One difference from the reltarget of the non-grouped
+ * relation is that agg_input has its sortgrouprefs[] initialized.
+ *
+ * "grouped_rows" is the estimated number of result tuples of the grouped
+ * relation.
+ *
+ * "group_clauses", "group_exprs" and "group_pathkeys" are lists of
+ * SortGroupClauses, the corresponding grouping expressions and PathKeys
+ * respectively.
+ *
+ * "apply_at" tracks the lowest join level at which partial aggregation is
+ * applied.
+ *
+ * "agg_useful" is a flag to indicate whether the grouped paths are considered
+ * useful.  It is set true if the average partial group size is no less than
+ * min_eager_agg_group_size, suggesting a significant row count reduction.
+ */
+typedef struct RelAggInfo
+{
+	pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* set of base + OJ relids (rangetable indexes) */
+	Relids		relids;
+
+	/*
+	 * default result targetlist for Paths scanning this grouped relation;
+	 * list of Vars/Exprs, cost, width
+	 */
+	struct PathTarget *target;
+
+	/*
+	 * the targetlist for Paths that provide input to the grouped paths
+	 */
+	struct PathTarget *agg_input;
+
+	/* estimated number of result tuples */
+	Cardinality grouped_rows;
+
+	/* a list of SortGroupClauses */
+	List	   *group_clauses;
+	/* a list of grouping expressions */
+	List	   *group_exprs;
+	/* a list of PathKeys */
+	List	   *group_pathkeys;
+
+	/* lowest level partial aggregation is applied at */
+	Relids		apply_at;
+
+	/* the grouped paths are considered useful? */
+	bool		agg_useful;
+} RelAggInfo;
+
 /*
  * IndexOptInfo
  *		Per-index information for planning/optimization
@@ -3283,6 +3369,50 @@ typedef struct MinMaxAggInfo
 	Param	   *param;
 } MinMaxAggInfo;
 
+/*
+ * For each distinct Aggref node that appears in the targetlist and HAVING
+ * clauses, we store an AggClauseInfo node in the PlannerInfo node's
+ * agg_clause_list.  Each AggClauseInfo records the set of relations referenced
+ * by the aggregate expression.  This information is used to determine how far
+ * the aggregate can be safely pushed down in the join tree.
+ */
+typedef struct AggClauseInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the Aggref expr */
+	Aggref	   *aggref;
+
+	/* lowest level we can evaluate this aggregate at */
+	Relids		agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * For each grouping expression that appears in grouping clauses, we store a
+ * GroupingExprInfo node in the PlannerInfo node's group_expr_list.  Each
+ * GroupingExprInfo records the expression being grouped on, its sortgroupref,
+ * and the btree opfamily used for equality comparison.  This information is
+ * necessary to reproduce correct grouping semantics at different levels of the
+ * join tree.
+ */
+typedef struct GroupingExprInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the represented expression */
+	Expr	   *expr;
+
+	/* the tleSortGroupRef of the corresponding SortGroupClause */
+	Index		sortgroupref;
+
+	/* btree opfamily defining the ordering */
+	Oid			btree_opfamily;
+} GroupingExprInfo;
+
 /*
  * At runtime, PARAM_EXEC slots are used to pass values around from one plan
  * node to another.  They can be used to pass values down into subqueries (for
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 763cd25bb3c..5b9c1daf14b 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -312,6 +312,10 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
 									RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
+											RelOptInfo *rel_plain);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
+									 RelOptInfo *rel_plain);
 extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
@@ -351,4 +355,5 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										SpecialJoinInfo *sjinfo,
 										int nappinfos, AppendRelInfo **appinfos);
 
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
 #endif							/* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index cbade77b717..8d03d662a04 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,7 +21,9 @@
  * allpaths.c
  */
 extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
 extern PGDLLIMPORT int geqo_threshold;
+extern PGDLLIMPORT double min_eager_agg_group_size;
 extern PGDLLIMPORT int min_parallel_table_scan_size;
 extern PGDLLIMPORT int min_parallel_index_scan_size;
 extern PGDLLIMPORT bool enable_group_by_reordering;
@@ -57,6 +59,10 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 								  bool override_rows);
 extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 										 bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root,
+								   RelOptInfo *rel_grouped,
+								   RelOptInfo *rel_plain,
+								   RelAggInfo *agg_info);
 extern int	compute_parallel_worker(RelOptInfo *rel, double heap_pages,
 									double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 9d3debcab28..09b48b26f8f 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -76,6 +76,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
 									Relids where_needed);
 extern void remove_useless_groupby_columns(PlannerInfo *root);
+extern void setup_eager_aggregation(PlannerInfo *root);
 extern void find_lateral_references(PlannerInfo *root);
 extern void rebuild_lateral_attr_needed(PlannerInfo *root);
 extern void create_lateral_join_info(PlannerInfo *root);
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 69805d4b9ec..ef79d6f1ded 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -2437,11 +2437,11 @@ SELECT c collate "C", count(c) FROM pagg_tab3 GROUP BY c collate "C" ORDER BY 1;
 SET enable_partitionwise_join TO false;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2449,10 +2449,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
@@ -2464,11 +2466,11 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
 SET enable_partitionwise_join TO true;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2476,10 +2478,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 00000000000..f02ff0b30a3
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1334 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b
+                                 Sort Key: t2.b
+                                 ->  Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Hash Join
+                                 Output: t2.c, t2.b, t3.c
+                                 Hash Cond: (t3.a = t2.a)
+                                 ->  Seq Scan on public.eager_agg_t3 t3
+                                       Output: t3.a, t3.b, t3.c
+                                 ->  Hash
+                                       Output: t2.c, t2.b, t2.a
+                                       ->  Seq Scan on public.eager_agg_t2 t2
+                                             Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b, t3.c
+                                 Sort Key: t2.b
+                                 ->  Hash Join
+                                       Output: t2.c, t2.b, t3.c
+                                       Hash Cond: (t3.a = t2.a)
+                                       ->  Seq Scan on public.eager_agg_t3 t3
+                                             Output: t3.a, t3.b, t3.c
+                                       ->  Hash
+                                             Output: t2.c, t2.b, t2.a
+                                             ->  Seq Scan on public.eager_agg_t2 t2
+                                                   Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Right Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Sort
+   Output: t2.b, (avg(t2.c))
+   Sort Key: t2.b
+   ->  HashAggregate
+         Output: t2.b, avg(t2.c)
+         Group Key: t2.b
+         ->  Hash Right Join
+               Output: t2.b, t2.c
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on public.eager_agg_t2 t2
+                     Output: t2.a, t2.b, t2.c
+               ->  Hash
+                     Output: t1.b
+                     ->  Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ b | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   |    
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Gather Merge
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Workers Planned: 2
+         ->  Sort
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Sort Key: t1.a
+               ->  Parallel Hash Join
+                     Output: t1.a, (PARTIAL avg(t2.c))
+                     Hash Cond: (t1.b = t2.b)
+                     ->  Parallel Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.a, t1.b, t1.c
+                     ->  Parallel Hash
+                           Output: t2.b, (PARTIAL avg(t2.c))
+                           ->  Partial HashAggregate
+                                 Output: t2.b, PARTIAL avg(t2.c)
+                                 Group Key: t2.b
+                                 ->  Parallel Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t2.y, (sum(t1.y)), (count(*))
+   Sort Key: t2.y
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t2.y, sum(t1.y), count(*)
+               Group Key: t2.y
+               ->  Hash Join
+                     Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.y, t1.x
+         ->  Finalize HashAggregate
+               Output: t2_1.y, sum(t1_1.y), count(*)
+               Group Key: t2_1.y
+               ->  Hash Join
+                     Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.y, t1_1.x
+         ->  Finalize HashAggregate
+               Output: t2_2.y, sum(t1_2.y), count(*)
+               Group Key: t2_2.y
+               ->  Hash Join
+                     Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+                                                 QUERY PLAN                                                 
+------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t2.x, (sum(t1.x)), (count(*))
+   Sort Key: t2.x
+   ->  Finalize HashAggregate
+         Output: t2.x, sum(t1.x), count(*)
+         Group Key: t2.x
+         Filter: (avg(t1.x) > '5'::numeric)
+         ->  Append
+               ->  Hash Join
+                     Output: t2.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.x, t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.x), PARTIAL count(*), PARTIAL avg(t1.x)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x
+               ->  Hash Join
+                     Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.x, t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x
+               ->  Hash Join
+                     Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.x, t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+ x |  sum  | count 
+---+-------+-------
+ 0 | 33835 |  6667
+ 1 | 39502 |  6667
+ 2 | 46169 |  6667
+ 3 | 52836 |  6667
+ 4 | 59503 |  6667
+ 5 | 33500 |  6667
+ 6 | 39837 |  6667
+ 7 | 46504 |  6667
+ 8 | 53171 |  6667
+ 9 | 59838 |  6667
+(10 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y)))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y))
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y))
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y))
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   
+----+---------
+  0 | 1437480
+  1 | 2082896
+  2 | 2684422
+  3 | 3285948
+  4 | 3887474
+  5 | 1526260
+  6 | 2127786
+  7 | 2729312
+  8 | 3330838
+  9 | 3932364
+ 10 | 1481370
+ 11 | 2012472
+ 12 | 2587464
+ 13 | 3162456
+ 14 | 3737448
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t3.y, sum((t2.y + t3.y))
+   Group Key: t3.y
+   ->  Sort
+         Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+         Sort Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t2.x = t1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y))
+                           Group Key: t2.x, t3.y, t3.x
+                           ->  Incremental Sort
+                                 Output: t2.y, t2.x, t3.y, t3.x
+                                 Sort Key: t2.x, t3.y
+                                 Presorted Key: t2.x
+                                 ->  Merge Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Merge Cond: (t2.x = t3.x)
+                                       ->  Sort
+                                             Output: t2.y, t2.x
+                                             Sort Key: t2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                                   Output: t2.y, t2.x
+                                       ->  Sort
+                                             Output: t3.y, t3.x
+                                             Sort Key: t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+                     ->  Hash
+                           Output: t1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                 Output: t1.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t2_1.x = t1_1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                           Group Key: t2_1.x, t3_1.y, t3_1.x
+                           ->  Incremental Sort
+                                 Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                 Sort Key: t2_1.x, t3_1.y
+                                 Presorted Key: t2_1.x
+                                 ->  Merge Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Merge Cond: (t2_1.x = t3_1.x)
+                                       ->  Sort
+                                             Output: t2_1.y, t2_1.x
+                                             Sort Key: t2_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                                   Output: t2_1.y, t2_1.x
+                                       ->  Sort
+                                             Output: t3_1.y, t3_1.x
+                                             Sort Key: t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+                     ->  Hash
+                           Output: t1_1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                 Output: t1_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t2_2.x = t1_2.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                           Group Key: t2_2.x, t3_2.y, t3_2.x
+                           ->  Incremental Sort
+                                 Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                 Sort Key: t2_2.x, t3_2.y
+                                 Presorted Key: t2_2.x
+                                 ->  Merge Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Merge Cond: (t2_2.x = t3_2.x)
+                                       ->  Sort
+                                             Output: t2_2.y, t2_2.x
+                                             Sort Key: t2_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                                   Output: t2_2.y, t2_2.x
+                                       ->  Sort
+                                             Output: t3_2.y, t3_2.x
+                                             Sort Key: t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+                     ->  Hash
+                           Output: t1_2.x
+                           ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                 Output: t1_2.x
+(88 rows)
+
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y |   sum   
+---+---------
+ 0 | 1111110
+ 1 | 2000132
+ 2 | 2889154
+ 3 | 3778176
+ 4 | 4667198
+ 5 | 3334000
+ 6 | 4223022
+ 7 | 5112044
+ 8 | 6001066
+ 9 | 6890088
+(10 rows)
+
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.y, (sum(t2.y)), (count(*))
+   Sort Key: t1.y
+   ->  Finalize HashAggregate
+         Output: t1.y, sum(t2.y), count(*)
+         Group Key: t1.y
+         ->  Append
+               ->  Hash Join
+                     Output: t1.y, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.y, t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+               ->  Hash Join
+                     Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.y, t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+               ->  Hash Join
+                     Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.y, t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+               ->  Hash Join
+                     Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.y, t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+               ->  Hash Join
+                     Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.y, t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                                QUERY PLAN                                                
+----------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y)), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t3.y
+   ->  Finalize HashAggregate
+         Output: t3.y, sum((t2.y + t3.y)), count(*)
+         Group Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.y, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x, t3.y, t3.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.y, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x, t3_1.y, t3_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.y, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x, t3_2.y, t3_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.y, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x, t3_3.y, t3_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+               ->  Hash Join
+                     Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.y, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.y, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x, t3_4.y, t3_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 04079268b98..d0bb66f43da 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -2837,20 +2837,22 @@ select x.thousand, x.twothousand, count(*)
 from tenk1 x inner join tenk1 y on x.thousand = y.thousand
 group by x.thousand, x.twothousand
 order by x.thousand desc, x.twothousand;
-                                    QUERY PLAN                                    
-----------------------------------------------------------------------------------
- GroupAggregate
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Finalize GroupAggregate
    Group Key: x.thousand, x.twothousand
    ->  Incremental Sort
          Sort Key: x.thousand DESC, x.twothousand
          Presorted Key: x.thousand
          ->  Merge Join
                Merge Cond: (y.thousand = x.thousand)
-               ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
+               ->  Partial GroupAggregate
+                     Group Key: y.thousand
+                     ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
                ->  Sort
                      Sort Key: x.thousand DESC
                      ->  Seq Scan on tenk1 x
-(11 rows)
+(13 rows)
 
 reset enable_hashagg;
 reset enable_nestloop;
diff --git a/src/test/regress/expected/partition_aggregate.out b/src/test/regress/expected/partition_aggregate.out
index 5f2c0cf5786..1f56f55155b 100644
--- a/src/test/regress/expected/partition_aggregate.out
+++ b/src/test/regress/expected/partition_aggregate.out
@@ -13,6 +13,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation.
+SET enable_eager_aggregate TO off;
 --
 -- Tests for list partitioned tables.
 --
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 83228cfca29..3b37fafa65b 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -151,6 +151,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_async_append            | on
  enable_bitmapscan              | on
  enable_distinct_reordering     | on
+ enable_eager_aggregate         | on
  enable_gathermerge             | on
  enable_group_by_reordering     | on
  enable_hashagg                 | on
@@ -172,7 +173,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(24 rows)
+(25 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fbffc67ae60..f9450cdc477 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -123,7 +123,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # The stats test resets stats, so nothing else needing stats access can be in
 # this group.
 # ----------
-test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa eager_aggregate
 
 # event_trigger depends on create_am and cannot run concurrently with
 # any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 00000000000..5da8749a6cb
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,194 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/sql/partition_aggregate.sql b/src/test/regress/sql/partition_aggregate.sql
index ab070fee244..124cc260461 100644
--- a/src/test/regress/sql/partition_aggregate.sql
+++ b/src/test/regress/sql/partition_aggregate.sql
@@ -14,6 +14,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation.
+SET enable_eager_aggregate TO off;
 
 --
 -- Tests for list partitioned tables.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a13e8162890..9a4567db01a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -42,6 +42,7 @@ AfterTriggersTableData
 AfterTriggersTransData
 Agg
 AggClauseCosts
+AggClauseInfo
 AggInfo
 AggPath
 AggSplit
@@ -1110,6 +1111,7 @@ GroupPathExtraData
 GroupResultPath
 GroupState
 GroupVarInfo
+GroupingExprInfo
 GroupingFunc
 GroupingSet
 GroupingSetData
@@ -2473,6 +2475,7 @@ ReindexObjectType
 ReindexParams
 ReindexStmt
 ReindexType
+RelAggInfo
 RelFileLocator
 RelFileLocatorBackend
 RelFileNumber
-- 
2.39.5 (Apple Git-154)
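
The expected rows for the three-way self-joins on eager_agg_tab_ml above can be cross-checked by hand, which may help when reviewing the .out file. The table is loaded with x = y = i % 30 for i in 1..1000, so each x in 1..10 occurs 34 times and every other x (including 0) occurs 33 times. A three-way equijoin on x therefore yields n^3 rows per group, each contributing t2.y + t3.y = 2 * x to the sum. A minimal sketch of that arithmetic, assuming nothing beyond generate_series() (the CTE name per_x and the column n are made up for illustration):

```sql
-- n = number of rows per x value, as loaded into eager_agg_tab_ml
WITH per_x AS (
    SELECT i % 30 AS x, count(*) AS n
    FROM generate_series(1, 1000) AS i
    GROUP BY i % 30
)
-- A three-way self-join on x produces n^3 rows per x, and each joined
-- row contributes 2 * x to sum(t2.y + t3.y).
SELECT x, 2 * x * n * n * n AS sum, n * n * n AS count
FROM per_x
ORDER BY x;
```

For example, x = 1 gives 34^3 = 39304 rows and a sum of 2 * 39304 = 78608, and x = 11 gives 33^3 = 35937 rows and a sum of 22 * 35937 = 790614, matching the result tables above.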




* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 13:44                                   ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
  2025-08-09 01:32                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-01 01:32                                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 07:35                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-09-05 14:37                                           ` Robert Haas <[email protected]>
  2025-09-09 10:30                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Robert Haas @ 2025-09-05 14:37 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Fri, Sep 5, 2025 at 3:35 AM Richard Guo <[email protected]> wrote:
> Here is a rebase after the GUC tables change.

I spent a bit of time scrolling through this today. Here are a few
observations/review comments.

It looks as though this will create a bunch of RelOptInfo objects that
don't end up getting used for anything once the apply_at test in
generate_grouped_paths() fails. It seems to me that it would be better
to avoid generating the RelOptInfo altogether in that case.

I think it would be worth considering generating the partially grouped
relations in a second pass. Right now, as you progress from the bottom
of the join tree towards the top, you create grouped rels as you go.
But you could equally well finish planning everything up to the
scan/join target first and then go back and add grouped_rels to
relations where it seems worthwhile. I don't know if this would really
make a big difference as you have things today, but I think it might
provide a better structure for the future, because you would then
have a lot more information with which to judge where to do
aggregation. For instance, you could look at the row counts of any
number of those ungrouped rels before deciding where to put the
partial aggregation. That seems like it could be pretty valuable.

I haven't done a detailed comparison of generate_grouped_paths() to
other parts of the code, but I have an uncomfortable feeling that it
might be rather similar to code that probably already exists in
multiple, slightly different versions. Is there any refactoring we
could do here?

Do you need a test of this feature in combination with GEQO? You have
code for it but I don't immediately see a test. I didn't check
carefully, though.
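
Such a test might look roughly like the following sketch. It assumes the test is placed in eager_aggregate.sql before eager_agg_t1, eager_agg_t2 and eager_agg_t3 are dropped, and it simply reuses the existing join-rel query. geqo and geqo_threshold are the existing GUCs; the threshold is lowered only so that a three-way join is handed to GEQO. The expected plan is not shown because it would have to come from an actual run.

```sql
-- Force the genetic query optimizer to plan even small join problems.
SET geqo TO on;
SET geqo_threshold TO 2;

EXPLAIN (VERBOSE, COSTS OFF)
SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;

-- Restore the defaults so later tests are planned as before.
RESET geqo;
RESET geqo_threshold;
```

The SELECT results should be identical to those of the non-GEQO run of the same query, so a mismatch would point at the GEQO code path rather than at eager aggregation in general.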

Overall I like the direction this is heading. I don't feel
well-qualified to evaluate whether all of the things that you're doing
are completely safe. The logic in is_var_in_aggref_only() and
is_var_needed_by_join() scares me a bit because I worry that the
checks are somehow non-exhaustive, but I don't know of a specific
hazard. That said, I think that modulo such issues, this has a good
chance of significantly improving performance for certain query
shapes.

One thing to check might be whether you can construct any cases where
the strategy is applied too boldly. Given the safeguards you've put in
place, that seems a little hard to construct. The most obvious
thing that occurs to me is an aggregate where combining is more
expensive than aggregating, so that the partial aggregation gives the
appearance of saving more work than it really does, but I can't
immediately think of a problem case. Another case could be where the
row counts are off, leading to us mistakenly believing that we're
going to reduce the number of rows that need to be processed when we
really don't. Of course, such a case would arguably be a fault of the
bad row-count estimate rather than this patch, but if the patch has
that problem frequently, it might need to be addressed. Still, I have
a feeling that the testing you've already been doing might have
surfaced such cases if they were common. Have you looked into how many
queries in the regression tests, or in TPC-H/DS, expend significant
planning effort on this strategy before discarding it? That might be a
good way to get a sense of whether the patch is too aggressive, not
aggressive enough, a mix of the two, or just right.
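
One rough way to start on that question is to plan the same query with the feature off and on, then compare the reported planning time and the chosen plan. A minimal sketch, assuming the enable_eager_aggregate GUC from the patch and using the tenk1 query from the join.out hunk above. SUMMARY ON makes a plain EXPLAIN print the planning time; timings from a single run are noisy, so this gives only a first impression.

```sql
-- Baseline: eager aggregation disabled.
SET enable_eager_aggregate TO off;
EXPLAIN (COSTS OFF, SUMMARY ON)
select x.thousand, x.twothousand, count(*)
from tenk1 x inner join tenk1 y on x.thousand = y.thousand
group by x.thousand, x.twothousand
order by x.thousand desc, x.twothousand;

-- Same query with eager aggregation enabled.
SET enable_eager_aggregate TO on;
EXPLAIN (COSTS OFF, SUMMARY ON)
select x.thousand, x.twothousand, count(*)
from tenk1 x inner join tenk1 y on x.thousand = y.thousand
group by x.thousand, x.twothousand
order by x.thousand desc, x.twothousand;

RESET enable_eager_aggregate;
```

If the second plan contains no Partial/Finalize aggregate pair but its planning time is noticeably higher, the extra effort was spent and then discarded. Running the second statement as EXPLAIN (ANALYZE), with costs left on, additionally shows estimated versus actual row counts at the partial aggregation node, which is where a bad row-count estimate would show up.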

-- 
Robert Haas
EDB: http://www.enterprisedb.com






* Re: Eager aggregation, take 3

From: Richard Guo @ 2025-09-09 10:30 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Fri, Sep 5, 2025 at 11:37 PM Robert Haas <[email protected]> wrote:
> I spent a bit of time scrolling through this today. Here are a few
> observations/review comments.

Thanks for all the comments.

> It looks as though this will create a bunch of RelOptInfo objects that
> don't end up getting used for anything once the apply_at test in
> generate_grouped_paths() fails. It seems to me that it would be better
> to altogether avoid generating the RelOptInfo in that case.

Hmm, that's not the case.  make_grouped_join_rel() guarantees that for
a given relation, if its grouped paths are not considered useful, and
no grouped paths can be built by joining grouped input relations, then
its grouped relation will not be created.  IOW, we only create a
grouped RelOptInfo if we've determined that we can generate useful
grouped paths for it.

In the case you mentioned, where the apply_at test in
generate_grouped_paths() fails, it must mean that grouped paths can be
built by joining its outer and inner relations.  Also, note that calls
to generate_grouped_paths() are always followed by calls to
set_cheapest().  If we failed to generate any grouped paths for a
grouped relation, the set_cheapest() call should already have reported
an error.

> I think it would be worth considering generating the partially grouped
> relations in a second pass. Right now, as you progress from the bottom
> of the join tree towards the top, you create grouped rels as you go.
> But you could equally well finish planning everything up to the
> scan/join target first and then go back and add grouped_rels to
> relations where it seems worthwhile.

Hmm, I don't think so.  I think the presence of eager aggregation
could change the best join order.  For example, without eager
aggregation, the optimizer might find that (A JOIN B) JOIN C is the best
join order.  But with eager aggregation on B, the optimizer could
prefer A JOIN (AGG(B) JOIN C).  I'm not sure how we could find the
best join order with eager aggregation applied without building the
join tree from the bottom up.
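
To make the example concrete, here is a toy sketch in Python (not
planner code; the tables, data, and function names are all made up for
illustration).  It evaluates the same grouped sum with both join
orders and shows that partially aggregating B on its join keys feeds
fewer B-side rows into the joins while producing the same result:

```python
from collections import defaultdict

# Hypothetical toy tables: a(id, grp), b(a_id, c_id, val), c(id)
A = [(1, "x"), (2, "x"), (3, "y")]
B = [(1, 10, 5), (1, 10, 7), (2, 10, 1), (2, 20, 2), (3, 20, 4), (3, 20, 6)]
C = [(10,), (20,)]


def plain_plan():
    """(A JOIN B) JOIN C, then aggregate by a.grp.

    Returns the result and the number of rows produced by A JOIN B.
    """
    ab = [(grp, c_id, val) for (a_id, grp) in A
          for (b_a_id, c_id, val) in B if a_id == b_a_id]
    abc = [(grp, val) for (grp, c_id, val) in ab
           for (cid,) in C if c_id == cid]
    result = defaultdict(int)
    for grp, val in abc:
        result[grp] += val
    return dict(result), len(ab)


def eager_plan():
    """A JOIN (AGG(B) JOIN C), then finalize by a.grp.

    The partial aggregate groups B by every column used in upper join
    clauses (a_id and c_id), so all rows of a partial group match the
    same rows on the other side of each join.
    Returns the result and the number of rows produced by AGG(B).
    """
    partial = defaultdict(int)
    for a_id, c_id, val in B:
        partial[(a_id, c_id)] += val
    agg_b = [(a_id, c_id, s) for (a_id, c_id), s in partial.items()]
    agg_bc = [(a_id, s) for (a_id, c_id, s) in agg_b
              for (cid,) in C if c_id == cid]
    joined = [(grp, s) for (a_id, grp) in A
              for (b_a_id, s) in agg_bc if a_id == b_a_id]
    result = defaultdict(int)
    for grp, s in joined:
        result[grp] += s  # combine the partial sums
    return dict(result), len(agg_b)


plain_result, plain_rows = plain_plan()
eager_result, eager_rows = eager_plan()
print(plain_result, plain_rows)
print(eager_result, eager_rows)
```

Whether the second shape actually wins depends on costs the sketch
ignores; it only illustrates why the two shapes have to be considered
while the join tree is being built.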

> I haven't done a detailed comparison of generate_grouped_paths() to
> other parts of the code, but I have an uncomfortable feeling that it
> might be rather similar to some existing code that probably already
> exists in multiple, slightly-different versions. Is there any
> refactoring we could do here?

Yeah, we currently have several functions that do similar, but not
exactly the same, things.  Maybe some refactoring is possible -- maybe
not -- I haven't looked into it closely yet.  However, I'd prefer to
address that in a separate patch if possible, since this issue also
exists on master, and I want to avoid introducing such changes in this
already large patch.

> Do you need a test of this feature in combination with GEQO? You have
> code for it but I don't immediately see a test. I didn't check
> carefully, though.

Good point.  I have manually tested GEQO by setting geqo_threshold
to 2 and running the regression tests to check for any planning
errors, crashes, or incorrect results.  However, I'm not sure where
test cases for GEQO should be added.  I searched the regression tests
and found only one explicit GEQO test, added back in 2009 (commit
a43b190e3).  It's not quite clear to me what the current policy is for
adding GEQO test cases.

Anyway, I will add some test cases in eager_aggregate.sql with
geqo_threshold set to 2.

> Overall I like the direction this is heading. I don't feel
> well-qualified to evaluate whether all of the things that you're doing
> are completely safe. The logic in is_var_in_aggref_only() and
> is_var_needed_by_join() scares me a bit because I worry that the
> checks are somehow non-exhaustive, but I don't know of a specific
> hazard. That said, I think that modulo such issues, this has a good
> chance of significantly improving performance for certain query
> shapes.
>
> One thing to check might be whether you can construct any cases where
> the strategy is applied too boldly. Given the safeguards you've put in
> place that seems a little hard to construct. The most obvious
> thing that occurs to me is an aggregate where combining is more
> expensive than aggregating, so that the partial aggregation gives the
> appearance of saving more work than it really does, but I can't
> immediately think of a problem case. Another case could be where the
> row counts are off, leading to us mistakenly believing that we're
> going to reduce the number of rows that need to be processed when we
> really don't. Of course, such a case would arguably be a fault of the
> bad row-count estimate rather than this patch, but if the patch has
> that problem frequently, it might need to be addressed. Still, I have
> a feeling that the testing you've already been doing might have
> surfaced such cases if they were common. Have you looked into how many
> queries in the regression tests, or in TPC-H/DS, expend significant
> planning effort on this strategy before discarding it? That might be a
> good way to get a sense of whether the patch is too aggressive, not
> aggressive enough, a mix of the two, or just right.

I previously looked into the TPC-DS queries where eager aggregation
was applied and didn't observe any regressions in planning time or
execution time.  I can run TPC-DS again to check the planning time for
the remaining queries.

- Richard






* Re: Eager aggregation, take 3

From: Robert Haas @ 2025-09-09 14:30 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Tue, Sep 9, 2025 at 6:30 AM Richard Guo <[email protected]> wrote:
> > I think it would be worth considering generating the partially grouped
> > relations in a second pass. Right now, as you progress from the bottom
> > of the join tree towards the top, you create grouped rels as you go.
> > But you could equally well finish planning everything up to the
> > scan/join target first and then go back and add grouped_rels to
> > relations where it seems worthwhile.
>
> Hmm, I don't think so.  I think the presence of eager aggregation
> could change the best join order.  For example, without eager
> aggregation, the optimizer might find that (A JOIN B) JOIN C is the best
> join order.  But with eager aggregation on B, the optimizer could
> prefer A JOIN (AGG(B) JOIN C).  I'm not sure how we could find the
> best join order with eager aggregation applied without building the
> join tree from the bottom up.

Oh, that is a problem, yes. :-(

> > I haven't done a detailed comparison of generate_grouped_paths() to
> > other parts of the code, but I have an uncomfortable feeling that it
> > might be rather similar to some existing code that probably already
> > exists in multiple, slightly-different versions. Is there any
> > refactoring we could do here?
>
> Yeah, we currently have several functions that do similar, but not
> exactly the same, things.  Maybe some refactoring is possible -- maybe
> not -- I haven't looked into it closely yet.  However, I'd prefer to
> address that in a separate patch if possible, since this issue also
> exists on master, and I want to avoid introducing such changes in this
> already large patch.

Well, it's not just a matter of "this already exists" -- it gets
harder and harder to unify things the more near-copies you add.

> Good point.  I have manually tested GEQO by setting geqo_threshold
> to 2 and running the regression tests to check for any planning
> errors, crashes, or incorrect results.  However, I'm not sure where
> test cases for GEQO should be added.  I searched the regression tests
> and found only one explicit GEQO test, added back in 2009 (commit
> a43b190e3).  It's not quite clear to me what the current policy is for
> adding GEQO test cases.
>
> Anyway, I will add some test cases in eager_aggregate.sql with
> geqo_threshold set to 2.

Sounds good. I think GEQO is mostly-unmaintained these days, but if
we're updating the code, I think it is good to add tests. Being that
the code is so old, it probably lacks adequate test coverage.

-- 
Robert Haas
EDB: http://www.enterprisedb.com






* Re: Eager aggregation, take 3

From: Robert Haas @ 2025-09-05 13:12 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Wed, Aug 6, 2025 at 3:52 AM Richard Guo <[email protected]> wrote:
> To avoid potential memory blowout risks from large partial aggregation
> values, v18 avoids applying eager aggregation if any aggregate uses an
> INTERNAL transition type, as this typically indicates a large internal
> data structure (as in string_agg or array_agg).  However, this also
> excludes aggregates like avg(numeric) and sum(numeric), which are
> actually safe to use with eager aggregation.
>
> What we really want to exclude are aggregate functions that can
> produce large transition values by accumulating or concatenating input
> rows.  So I'm wondering if we could instead check the transfn_oid
> directly and explicitly exclude only F_ARRAY_AGG_TRANSFN and
> F_STRING_AGG_TRANSFN.  We don't need to worry about json_agg,
> jsonb_agg, or xmlagg, since they don't support partial aggregation
> anyway.

This strategy seems fairly unfriendly towards out-of-core code. Can
you come up with something that allows the author of a SQL-callable
function to include or exclude the function by a choice that is under
their control, rather than hard-coding something in PostgreSQL itself?

-- 
Robert Haas
EDB: http://www.enterprisedb.com






* Re: Eager aggregation, take 3

From: Richard Guo @ 2025-09-09 09:20 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Fri, Sep 5, 2025 at 10:12 PM Robert Haas <[email protected]> wrote:
> On Wed, Aug 6, 2025 at 3:52 AM Richard Guo <[email protected]> wrote:
> > What we really want to exclude are aggregate functions that can
> > produce large transition values by accumulating or concatenating input
> > rows.  So I'm wondering if we could instead check the transfn_oid
> > directly and explicitly exclude only F_ARRAY_AGG_TRANSFN and
> > F_STRING_AGG_TRANSFN.  We don't need to worry about json_agg,
> > jsonb_agg, or xmlagg, since they don't support partial aggregation
> > anyway.

> This strategy seems fairly unfriendly towards out-of-core code. Can
> you come up with something that allows the author of a SQL-callable
> function to include or exclude the function by a choice that is under
> their control, rather than hard-coding something in PostgreSQL itself?

Yeah, ideally we should be able to tell whether an aggregate's transition state
may grow unbounded just by looking at system catalogs.  Unfortunately,
after trying for a while, it seems to me that the current catalog
doesn't provide enough information.

I once considered adding a flag (e.g., aggtransbounded) to catalog
pg_aggregate to indicate whether the transition state size is bounded.
This flag could be specified by users when creating aggregate
functions, and then leveraged by features such as eager aggregation.

However, adding new information to system catalogs involves a lot of
discussions and changes, including updates to DDL commands, dump and
restore processes, and upgrade procedures.  Therefore, to keep the
focus of this patch on the eager aggregation feature itself, I prefer
to treat this enhancement as future work.

- Richard






* Re: Eager aggregation, take 3

From: Robert Haas @ 2025-09-09 14:20 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Tue, Sep 9, 2025 at 5:20 AM Richard Guo <[email protected]> wrote:
> Yeah, ideally we should be able to tell whether an aggregate's transition state
> may grow unbounded just by looking at system catalogs.  Unfortunately,
> after trying for a while, it seems to me that the current catalog
> doesn't provide enough information.
>
> I once considered adding a flag (e.g., aggtransbounded) to catalog
> pg_aggregate to indicate whether the transition state size is bounded.
> This flag could be specified by users when creating aggregate
> functions, and then leveraged by features such as eager aggregation.
>
> However, adding new information to system catalogs involves a lot of
> discussions and changes, including updates to DDL commands, dump and
> restore processes, and upgrade procedures.  Therefore, to keep the
> focus of this patch on the eager aggregation feature itself, I prefer
> to treat this enhancement as future work.

I don't really like that. I think there's a lot of danger of that
future work never getting done, and thus leaving us stuck more-or-less
permanently with a system that's not really extensible. Data type and
function extensibility is one of the strongest areas of PostgreSQL,
and we should try hard to avoid situations where we regress it. I'm
not sure whether the aggtransbounded flag is exactly the right thing
here, but I don't think adding a new catalog column is an unreasonable
amount of work for a feature of this type.

Having said that, I wonder whether there's some way that we could use
the aggtransspace property for this. For instance, for stadistinct, we
use values >0 to mean absolute quantities and values <0 to mean
proportions. The current definition of aggtransspace assigns no meaning
to values <0, and the current coding seems to assume that sizes are
fixed regardless of how many inputs are supplied. Maybe we could
define aggtransspace<0 to mean that the number of bytes used per input
value is the additive inverse of the value, or something like that.

-- 
Robert Haas
EDB: http://www.enterprisedb.com






* Re: Eager aggregation, take 3

From: Richard Guo @ 2025-09-12 09:34 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Tue, Sep 9, 2025 at 11:20 PM Robert Haas <[email protected]> wrote:
> Having said that, I wonder whether there's some way that we could use
> the aggtransspace property for this. For instance, for stadistinct, we
> use values >0 to mean absolute quantities and values <0 to mean
> proportions. The current definition of aggtransspace assigns no meaning
> to values <0, and the current coding seems to assume that sizes are
> fixed regardless of how many inputs are supplied. Maybe we could
> define aggtransspace<0 to mean that the number of bytes used per input
> value is the additive inverse of the value, or something like that.

I really like this idea.  Currently, aggtransspace represents an
estimate of the transition state size provided by the aggregate
definition.  If it's set to zero, a default estimate based on the
state data type is used.  Negative values currently have no defined
meaning.  I think it makes perfect sense to reuse this field so that
a negative value indicates that the transition state data can grow
unboundedly in size.

Attached 0002 implements this idea.  It requires fewer code changes
than I expected.  This is mainly because our current code uses
aggtransspace in such a way that if it's a positive value, that value
is used as it's provided by the aggregate definition; otherwise, some
heuristics are applied to estimate the size.  For the aggregates that
accumulate input rows (e.g., array_agg, string_agg), I don't currently
have a better heuristic for estimating their size, so I've chosen to
keep the current logic.  This won't regress anything in estimating
transition state data size.
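
The intended reading of the field can be sketched roughly as follows.
This is a Python illustration only, not the code in 0002: the function
names are hypothetical, and the single default_width argument stands
in for whatever heuristic the planner applies when no positive size is
declared.

```python
from typing import Iterable


def estimate_trans_space(aggtransspace: int, default_width: int) -> int:
    """Size estimate for a transition state.

    A positive declared value is used as-is; zero or a negative value
    falls back to a default (here a stand-in for the existing
    heuristics).
    """
    if aggtransspace > 0:
        return aggtransspace
    return default_width


def trans_state_may_grow_unbounded(aggtransspace: int) -> bool:
    """Assumed convention: a negative value marks an unbounded state."""
    return aggtransspace < 0


def eager_aggregation_allowed(aggtransspaces: Iterable[int]) -> bool:
    """Skip eager aggregation if any aggregate may grow without bound."""
    return not any(trans_state_may_grow_unbounded(s) for s in aggtransspaces)


print(estimate_trans_space(128, 32))
print(eager_aggregation_allowed([0, 128]))
print(eager_aggregation_allowed([0, -1]))
```

Under this reading an extension author opts an aggregate out by
declaring a negative size, and nothing needs to be hard-coded in core.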

- Richard


Attachments:

  [application/octet-stream] v22-0001-Implement-Eager-Aggregation.patch (184.9K, 2-v22-0001-Implement-Eager-Aggregation.patch)
  download | inline diff:
From 8a780d897ec5205a48867f3dc291edf80707aca3 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Tue, 11 Jun 2024 15:59:19 +0900
Subject: [PATCH v22 1/2] Implement Eager Aggregation

Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.

In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations.  To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node.  During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied.  If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
is currently not supported.

To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

The patch was originally proposed by Antonin Houska in 2017.  This
commit reworks various important aspects and rewrites most of the
current code.  However, the original patch and reviews were very
useful.

Author: Richard Guo <[email protected]>
Author: Antonin Houska <[email protected]> (in an older version)
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jian He <[email protected]>
Reviewed-by: Tender Wang <[email protected]>
Reviewed-by: Matheus Alcantara <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Reviewed-by: Tomas Vondra <[email protected]> (in an older version)
Reviewed-by: Andy Fan <[email protected]> (in an older version)
Reviewed-by: Ashutosh Bapat <[email protected]> (in an older version)
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
---
 .../postgres_fdw/expected/postgres_fdw.out    |   49 +-
 doc/src/sgml/config.sgml                      |   31 +
 src/backend/optimizer/README                  |  110 ++
 src/backend/optimizer/geqo/geqo_eval.c        |   21 +
 src/backend/optimizer/path/allpaths.c         |  453 +++++
 src/backend/optimizer/path/joinrels.c         |  193 ++
 src/backend/optimizer/plan/initsplan.c        |  323 ++++
 src/backend/optimizer/plan/planmain.c         |    9 +
 src/backend/optimizer/plan/planner.c          |  124 +-
 src/backend/optimizer/util/appendinfo.c       |   59 +
 src/backend/optimizer/util/relnode.c          |  628 +++++++
 src/backend/utils/misc/guc_parameters.dat     |   16 +
 src/backend/utils/misc/postgresql.conf.sample |    2 +
 src/include/nodes/pathnodes.h                 |  130 ++
 src/include/optimizer/pathnode.h              |    5 +
 src/include/optimizer/paths.h                 |    6 +
 src/include/optimizer/planmain.h              |    1 +
 .../regress/expected/collate.icu.utf8.out     |   32 +-
 src/test/regress/expected/eager_aggregate.out | 1584 +++++++++++++++++
 src/test/regress/expected/join.out            |   12 +-
 .../regress/expected/partition_aggregate.out  |    2 +
 src/test/regress/expected/sysviews.out        |    3 +-
 src/test/regress/parallel_schedule            |    2 +-
 src/test/regress/sql/eager_aggregate.sql      |  225 +++
 src/test/regress/sql/partition_aggregate.sql  |    2 +
 src/tools/pgindent/typedefs.list              |    3 +
 26 files changed, 3951 insertions(+), 74 deletions(-)
 create mode 100644 src/test/regress/expected/eager_aggregate.out
 create mode 100644 src/test/regress/sql/eager_aggregate.sql

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 18d727d7790..f1b2d684e35 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3701,30 +3701,33 @@ select count(t1.c3) from ft2 t1 left join ft2 t2 on (t1.c1 = random() * t2.c2);
 -- Subquery in FROM clause having aggregate
 explain (verbose, costs off)
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
-                                          QUERY PLAN                                           
------------------------------------------------------------------------------------------------
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
  Sort
-   Output: (count(*)), x.b
-   Sort Key: (count(*)), x.b
-   ->  HashAggregate
-         Output: count(*), x.b
-         Group Key: x.b
-         ->  Hash Join
-               Output: x.b
-               Inner Unique: true
-               Hash Cond: (ft1.c2 = x.a)
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.c2
-                     Remote SQL: SELECT c2 FROM "S 1"."T 1"
-               ->  Hash
-                     Output: x.b, x.a
-                     ->  Subquery Scan on x
-                           Output: x.b, x.a
-                           ->  Foreign Scan
-                                 Output: ft1_1.c2, (sum(ft1_1.c1))
-                                 Relations: Aggregate on (public.ft1 ft1_1)
-                                 Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
-(21 rows)
+   Output: (count(*)), (sum(ft1_1.c1))
+   Sort Key: (count(*)), (sum(ft1_1.c1))
+   ->  Finalize GroupAggregate
+         Output: count(*), (sum(ft1_1.c1))
+         Group Key: (sum(ft1_1.c1))
+         ->  Sort
+               Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+               Sort Key: (sum(ft1_1.c1))
+               ->  Hash Join
+                     Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+                     Hash Cond: (ft1_1.c2 = ft1.c2)
+                     ->  Foreign Scan
+                           Output: ft1_1.c2, (sum(ft1_1.c1))
+                           Relations: Aggregate on (public.ft1 ft1_1)
+                           Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
+                     ->  Hash
+                           Output: ft1.c2, (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: ft1.c2, PARTIAL count(*)
+                                 Group Key: ft1.c2
+                                 ->  Foreign Scan on public.ft1
+                                       Output: ft1.c2
+                                       Remote SQL: SELECT c2 FROM "S 1"."T 1"
+(24 rows)
 
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
  count |   b   
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 2a3685f474a..bac3c3270a0 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5475,6 +5475,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-eager-aggregate" xreflabel="enable_eager_aggregate">
+      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_eager_aggregate</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's ability to partially push
+        aggregation past a join, and finalize it once all the relations are
+        joined. The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
       <term><varname>enable_gathermerge</varname> (<type>boolean</type>)
       <indexterm>
@@ -6095,6 +6110,22 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-min-eager-agg-group-size" xreflabel="min_eager_agg_group_size">
+      <term><varname>min_eager_agg_group_size</varname> (<type>floating point</type>)
+      <indexterm>
+       <primary><varname>min_eager_agg_group_size</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Sets the minimum average group size required to consider applying
+        eager aggregation. This helps avoid the overhead of eager
+        aggregation when it does not offer significant row count reduction.
+        The default is <literal>8</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-jit-above-cost" xreflabel="jit_above_cost">
       <term><varname>jit_above_cost</varname> (<type>floating point</type>)
       <indexterm>
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 843368096fd..6c35baceedb 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1500,3 +1500,113 @@ breaking down aggregation or grouping over a partitioned relation into
 aggregation or grouping over its partitions is called partitionwise
 aggregation.  Especially when the partition keys match the GROUP BY clause,
 this can be significantly faster than the regular method.
+
+Eager aggregation
+-----------------
+
+Eager aggregation is a query optimization technique that partially
+pushes aggregation past a join, and finalizes it once all the
+relations are joined.  Eager aggregation may reduce the number of
+input rows to the join and thus could result in a better overall plan.
+
+To prove that the transformation is correct, let's first consider the
+case where only inner joins are involved.  In this case, we partition
+the tables in the FROM clause into two groups: those that contain at
+least one aggregation column, and those that do not contain any
+aggregation columns.  Each group can be treated as a single relation
+formed by the Cartesian product of the tables within that group.
+Therefore, without loss of generality, we can assume that the FROM
+clause contains exactly two relations, R1 and R2, where R1 represents
+the relation containing all aggregation columns, and R2 represents the
+relation without any aggregation columns.
+
+Let the query be of the form:
+
+SELECT G, AGG(A)
+FROM R1 JOIN R2 ON J
+GROUP BY G;
+
+where G is the set of grouping keys that may include columns from R1
+and/or R2; AGG(A) is an aggregate function over columns A from R1; J
+is the join condition between R1 and R2.
+
+The transformation of eager aggregation is:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 ON J)
+
+This equivalence holds under the following conditions:
+
+1) AGG is decomposable, meaning that it can be computed in two stages:
+a partial aggregation followed by a final aggregation;
+2) The set G1 used in the pre-aggregation of R1 includes:
+    * all columns from R1 that are part of the grouping keys G, and
+    * all columns from R1 that appear in the join condition J.
+3) The grouping operator for any column in G1 must be compatible with
+the operator used for that column in the join condition J.
+
+Since G1 includes all columns from R1 that appear in either the
+grouping keys G or the join condition J, all rows within each partial
+group have identical values for both the grouping keys and the
+join-relevant columns from R1, assuming compatible operators are used.
+As a result, the rows within a partial group are indistinguishable in
+terms of their contribution to the aggregation and their behavior in
+the join.  This ensures that all rows in the same partial group share
+the same "destiny": they either all match or all fail to match a given
+row in R2.  Because the aggregate function AGG is decomposable,
+aggregating the partial results after the join yields the same final
+result as aggregating after the full join, thereby preserving query
+semantics.  Q.E.D.
+
+In the case where there are any outer joins, the situation becomes
+more complex due to join order constraints and the semantics of
+null-extension in outer joins.  If the relations that contain at least
+one aggregation column cannot be treated as a single relation because
+of the join order constraints, partial aggregation paths will not be
+generated, and thus the transformation is not applicable.  Otherwise,
+let R1 be the relation containing all aggregation columns, and R2, R3,
+... be the remaining relations.  From the inner join case, under the
+aforementioned conditions, we have the equivalence:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 JOIN R3 ...)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 JOIN R3 ...)
+
+To preserve correctness when outer joins are involved, we require an
+additional condition:
+
+4) R1 must not be on the nullable side of any outer join.
+
+This condition ensures that partial aggregation over R1 does not
+suppress any null-extended rows that would be introduced by outer
+joins.  If R1 is on the nullable side of an outer join, the
+NULL-extended rows produced by the outer join would not be available
+when we perform the partial aggregation, while with a
+non-eager-aggregation plan these rows are available for the top-level
+aggregation.  Pushing partial aggregation in this case may result in
+the rows being grouped differently than expected, or produce incorrect
+values from the aggregate functions.
+
+During the construction of the join tree, we evaluate each base or
+join relation to determine if eager aggregation can be applied.  If
+feasible, we create a separate RelOptInfo called a "grouped relation"
+and generate grouped paths by adding sorted and hashed partial
+aggregation paths on top of the non-grouped paths.  To limit planning
+time, we consider only the cheapest or suitably-sorted non-grouped
+paths in this step.
+
+Another way to generate grouped paths is to join a grouped relation
+with a non-grouped relation.  Joining two grouped relations is
+currently not supported.
+
+To further limit planning time, we currently adopt a strategy where
+partial aggregation is pushed only to the lowest feasible level in the
+join tree where it provides a significant reduction in row count.
+This strategy also helps ensure that all grouped paths for the same
+grouped relation produce the same set of rows, which is important to
+support a fundamental assumption of the planner.
+
+If we have generated a grouped relation for the topmost join relation,
+we need to finalize its paths at the end.  The final paths will
+compete in the usual way with paths built from regular planning.
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index f07d1dc8ac6..4a65f955ca6 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -279,6 +279,27 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
 				/* Find and save the cheapest paths for this joinrel */
 				set_cheapest(joinrel);
 
+				/*
+				 * Except for the topmost scan/join rel, consider generating
+				 * partial aggregation paths for the grouped relation on top
+				 * of the paths of this rel.  After that, we're done creating
+				 * paths for the grouped relation, so run set_cheapest().
+				 */
+				if (!bms_equal(joinrel->relids, root->all_query_rels))
+				{
+					RelOptInfo *grouped_rel;
+
+					grouped_rel = joinrel->grouped_rel;
+					if (grouped_rel)
+					{
+						Assert(IS_GROUPED_REL(grouped_rel));
+
+						generate_grouped_paths(root, grouped_rel, joinrel,
+											   grouped_rel->agg_info);
+						set_cheapest(grouped_rel);
+					}
+				}
+
 				/* Absorb new clump into old */
 				old_clump->joinrel = joinrel;
 				old_clump->size += new_clump->size;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 6cc6966b060..7b349a4570e 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -47,6 +48,7 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -77,7 +79,9 @@ typedef enum pushdown_safe_type
 
 /* These parameters are set by GUC */
 bool		enable_geqo = false;	/* just in case GUC doesn't set it */
+bool		enable_eager_aggregate = true;
 int			geqo_threshold;
+double		min_eager_agg_group_size;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
 
@@ -90,6 +94,7 @@ join_search_hook_type join_search_hook = NULL;
 
 static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_base_grouped_rels(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 						 Index rti, RangeTblEntry *rte);
@@ -114,6 +119,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 								Index rti, RangeTblEntry *rte);
 static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 									Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
 static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 										 List *live_childrels,
 										 List *all_child_pathkeys);
@@ -182,6 +188,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	 */
 	set_base_rel_sizes(root);
 
+	/*
+	 * Build grouped relations for base rels where possible.
+	 */
+	setup_base_grouped_rels(root);
+
 	/*
 	 * We should now have size estimates for every actual table involved in
 	 * the query, and we also know which if any have been deleted from the
@@ -323,6 +334,39 @@ set_base_rel_sizes(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_base_grouped_rels
+ *	  For each base relation, build a grouped base relation if eager
+ *	  aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_base_grouped_rels(PlannerInfo *root)
+{
+	Index		rti;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo *rel = root->simple_rel_array[rti];
+
+		/* there may be empty slots corresponding to non-baserel RTEs */
+		if (rel == NULL)
+			continue;
+
+		Assert(rel->relid == rti);	/* sanity check on array */
+		Assert(IS_SIMPLE_REL(rel)); /* sanity check on rel */
+
+		(void) build_simple_grouped_rel(root, rel);
+	}
+}
+
 /*
  * set_base_rel_pathlists
  *	  Finds all paths available for scanning each base-relation entry.
@@ -559,6 +603,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* Now find the cheapest of the paths for this rel */
 	set_cheapest(rel);
 
+	/*
+	 * If a grouped relation for this rel exists, build partial aggregation
+	 * paths for it.
+	 *
+	 * Note that this can only happen after we've called set_cheapest() for
+	 * this base rel, because we need its cheapest paths.
+	 */
+	set_grouped_rel_pathlist(root, rel);
+
 #ifdef OPTIMIZER_DEBUG
 	pprint(rel);
 #endif
@@ -1305,6 +1358,36 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	add_paths_to_append_rel(root, rel, live_childrels);
 }
 
+/*
+ * set_grouped_rel_pathlist
+ *	  If a grouped relation for the given 'rel' exists, build partial
+ *	  aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Add paths to the grouped base relation if one exists. */
+	grouped_rel = rel->grouped_rel;
+	if (grouped_rel)
+	{
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		generate_grouped_paths(root, grouped_rel, rel,
+							   grouped_rel->agg_info);
+		set_cheapest(grouped_rel);
+	}
+}
+
 
 /*
  * add_paths_to_append_rel
@@ -3335,6 +3418,328 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
 	}
 }
 
+/*
+ * generate_grouped_paths
+ *		Generate paths for a grouped relation by adding sorted and hashed
+ *		partial aggregation paths on top of paths of the ungrouped base or join
+ *		relation.
+ *
+ * The information needed is provided by the RelAggInfo structure.
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *grouped_rel,
+					   RelOptInfo *rel, RelAggInfo *agg_info)
+{
+	AggClauseCosts agg_costs;
+	bool		can_hash;
+	bool		can_sort;
+	Path	   *cheapest_total_path = NULL;
+	Path	   *cheapest_partial_path = NULL;
+	double		dNumGroups = 0;
+	double		dNumPartialGroups = 0;
+
+	if (IS_DUMMY_REL(rel))
+	{
+		mark_dummy_rel(grouped_rel);
+		return;
+	}
+
+	/*
+	 * We push partial aggregation only to the lowest possible level in the
+	 * join tree that is deemed useful.
+	 */
+	if (!bms_equal(agg_info->apply_at, rel->relids) ||
+		!agg_info->agg_useful)
+		return;
+
+	MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+	get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+	/*
+	 * Determine whether it's possible to perform sort-based implementations
+	 * of grouping.
+	 */
+	can_sort = grouping_is_sortable(agg_info->group_clauses);
+
+	/*
+	 * Determine whether we should consider hash-based implementations of
+	 * grouping.
+	 */
+	Assert(root->numOrderedAggs == 0);
+	can_hash = (agg_info->group_clauses != NIL &&
+				grouping_is_hashable(agg_info->group_clauses));
+
+	/*
+	 * Consider whether we should generate partially aggregated non-partial
+	 * paths.  We can only do this if we have a non-partial path.
+	 */
+	if (rel->pathlist != NIL)
+	{
+		cheapest_total_path = rel->cheapest_total_path;
+		Assert(cheapest_total_path != NULL);
+	}
+
+	/*
+	 * If parallelism is possible for grouped_rel, then we should consider
+	 * generating partially-grouped partial paths.  However, if the ungrouped
+	 * rel has no partial paths, then we can't.
+	 */
+	if (grouped_rel->consider_parallel && rel->partial_pathlist != NIL)
+	{
+		cheapest_partial_path = linitial(rel->partial_pathlist);
+		Assert(cheapest_partial_path != NULL);
+	}
+
+	/* Estimate number of partial groups. */
+	if (cheapest_total_path != NULL)
+		dNumGroups = estimate_num_groups(root,
+										 agg_info->group_exprs,
+										 cheapest_total_path->rows,
+										 NULL, NULL);
+	if (cheapest_partial_path != NULL)
+		dNumPartialGroups = estimate_num_groups(root,
+												agg_info->group_exprs,
+												cheapest_partial_path->rows,
+												NULL, NULL);
+
+	if (can_sort && cheapest_total_path != NULL)
+	{
+		ListCell   *lc;
+
+		/*
+		 * Use any available suitably-sorted path as input, and also consider
+		 * sorting the cheapest-total path and incremental sort on any paths
+		 * with presorted keys.
+		 *
+		 * To save planning time, we ignore parameterized input paths unless
+		 * they are the cheapest-total path.
+		 */
+		foreach(lc, rel->pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Ignore parameterized paths that are not the cheapest-total
+			 * path.
+			 */
+			if (input_path->param_info &&
+				input_path != cheapest_total_path)
+				continue;
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest total path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_total_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumGroups);
+
+			add_path(grouped_rel, path);
+		}
+	}
+
+	if (can_sort && cheapest_partial_path != NULL)
+	{
+		ListCell   *lc;
+
+		/* Similar to above logic, but for partial paths. */
+		foreach(lc, rel->partial_pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			is_sorted = pathkeys_count_contained_in(agg_info->group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest partial path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_partial_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 agg_info->group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 agg_info->group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumPartialGroups);
+
+			add_partial_path(grouped_rel, path);
+		}
+	}
+
+	/*
+	 * Add a partially-grouped HashAgg Path where possible
+	 */
+	if (can_hash && cheapest_total_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_total_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumGroups);
+
+		add_path(grouped_rel, path);
+	}
+
+	/*
+	 * Now add a partially-grouped HashAgg partial Path where possible
+	 */
+	if (can_hash && cheapest_partial_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_partial_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumPartialGroups);
+
+		add_partial_path(grouped_rel, path);
+	}
+}
+
 /*
  * make_rel_from_joinlist
  *	  Build access paths using a "joinlist" to guide the join path search.
@@ -3494,6 +3899,10 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 		 *
 		 * After that, we're done creating paths for the joinrel, so run
 		 * set_cheapest().
+		 *
+		 * In addition, we also run generate_grouped_paths() for the grouped
+		 * relation of each just-processed joinrel, and run set_cheapest() for
+		 * the grouped relation afterwards.
 		 */
 		foreach(lc, root->join_rel_level[lev])
 		{
@@ -3514,6 +3923,27 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 			/* Find and save the cheapest paths for this rel */
 			set_cheapest(rel);
 
+			/*
+			 * Except for the topmost scan/join rel, consider generating
+			 * partial aggregation paths for the grouped relation on top of
+			 * the paths of this rel.  After that, we're done creating paths
+			 * for the grouped relation, so run set_cheapest().
+			 */
+			if (!bms_equal(rel->relids, root->all_query_rels))
+			{
+				RelOptInfo *grouped_rel;
+
+				grouped_rel = rel->grouped_rel;
+				if (grouped_rel)
+				{
+					Assert(IS_GROUPED_REL(grouped_rel));
+
+					generate_grouped_paths(root, grouped_rel, rel,
+										   grouped_rel->agg_info);
+					set_cheapest(grouped_rel);
+				}
+			}
+
 #ifdef OPTIMIZER_DEBUG
 			pprint(rel);
 #endif
@@ -4383,6 +4813,29 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 		if (IS_DUMMY_REL(child_rel))
 			continue;
 
+		/*
+		 * Except for the topmost scan/join rel, consider generating partial
+		 * aggregation paths for the grouped relation on top of the paths of
+		 * this partitioned child-join.  After that, we're done creating paths
+		 * for the grouped relation, so run set_cheapest().
+		 */
+		if (!bms_equal(IS_OTHER_REL(rel) ?
+					   rel->top_parent_relids : rel->relids,
+					   root->all_query_rels))
+		{
+			RelOptInfo *grouped_rel;
+
+			grouped_rel = child_rel->grouped_rel;
+			if (grouped_rel)
+			{
+				Assert(IS_GROUPED_REL(grouped_rel));
+
+				generate_grouped_paths(root, grouped_rel, child_rel,
+									   grouped_rel->agg_info);
+				set_cheapest(grouped_rel);
+			}
+		}
+
 #ifdef OPTIMIZER_DEBUG
 		pprint(child_rel);
 #endif
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 535248aa525..04cbbcea2a4 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -16,6 +16,7 @@
 
 #include "miscadmin.h"
 #include "optimizer/appendinfo.h"
+#include "optimizer/cost.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -36,6 +37,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
 static bool restriction_is_constant_false(List *restrictlist,
 										  RelOptInfo *joinrel,
 										  bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+								  RelOptInfo *rel2, RelOptInfo *joinrel,
+								  SpecialJoinInfo *sjinfo, List *restrictlist);
 static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 										RelOptInfo *rel2, RelOptInfo *joinrel,
 										SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -762,6 +766,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 		return joinrel;
 	}
 
+	/* Build a grouped join relation for 'joinrel' if possible. */
+	make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+						  restrictlist);
+
 	/* Add paths to the join relation. */
 	populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
 								restrictlist);
@@ -873,6 +881,186 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
 	return input_relids;
 }
 
+/*
+ * make_grouped_join_rel
+ *	  Build a grouped join relation for the given "joinrel" if eager
+ *	  aggregation is applicable and the resulting grouped paths are considered
+ *	  useful.
+ *
+ * There are two strategies for generating grouped paths for a join relation:
+ *
+ * 1. Join a grouped (partially aggregated) input relation with a non-grouped
+ * input (e.g., AGG(B) JOIN A).
+ *
+ * 2. Apply partial aggregation (sorted or hashed) on top of existing
+ * non-grouped join paths (e.g., AGG(A JOIN B)).
+ *
+ * To limit planning effort and avoid an explosion of alternatives, we adopt a
+ * strategy where partial aggregation is only pushed to the lowest possible
+ * level in the join tree that is deemed useful.  That is, if grouped paths can
+ * be built using the first strategy, we skip consideration of the second
+ * strategy for the same join level.
+ *
+ * Additionally, if there are multiple lowest useful levels where partial
+ * aggregation could be applied, such as in a join tree with relations A, B,
+ * and C where both "AGG(A JOIN B) JOIN C" and "A JOIN AGG(B JOIN C)" are valid
+ * placements, we choose only the first one encountered during join search.
+ * This avoids generating multiple versions of the same grouped relation based
+ * on different aggregation placements.
+ *
+ * These heuristics also ensure that all grouped paths for the same grouped
+ * relation produce the same set of rows, which is a basic assumption in the
+ * planner.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+					  RelOptInfo *rel2, RelOptInfo *joinrel,
+					  SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+	RelOptInfo *grouped_rel;
+	RelOptInfo *grouped_rel1;
+	RelOptInfo *grouped_rel2;
+	bool		rel1_empty;
+	bool		rel2_empty;
+	Relids		agg_apply_at;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Retrieve the grouped relations for the two input rels */
+	grouped_rel1 = rel1->grouped_rel;
+	grouped_rel2 = rel2->grouped_rel;
+
+	rel1_empty = (grouped_rel1 == NULL || IS_DUMMY_REL(grouped_rel1));
+	rel2_empty = (grouped_rel2 == NULL || IS_DUMMY_REL(grouped_rel2));
+
+	/* Find or construct a grouped joinrel for this joinrel */
+	grouped_rel = joinrel->grouped_rel;
+	if (grouped_rel == NULL)
+	{
+		RelAggInfo *agg_info = NULL;
+
+		/*
+		 * Prepare the information needed to create grouped paths for this
+		 * join relation.
+		 */
+		agg_info = create_rel_agg_info(root, joinrel);
+		if (agg_info == NULL)
+			return;
+
+		/*
+		 * If grouped paths for the given join relation are not considered
+		 * useful, and no grouped paths can be built by joining grouped input
+		 * relations, skip building the grouped join relation.
+		 */
+		if (!agg_info->agg_useful &&
+			(rel1_empty == rel2_empty))
+			return;
+
+		/* build the grouped relation */
+		grouped_rel = build_grouped_rel(root, joinrel);
+		grouped_rel->reltarget = agg_info->target;
+
+		if (rel1_empty != rel2_empty)
+		{
+			/*
+			 * If there is exactly one grouped input relation, then we can
+			 * build grouped paths by joining the input relations.  Set size
+			 * estimates for the grouped join relation based on the input
+			 * relations, and update the lowest join level where partial
+			 * aggregation is applied to that of the grouped input relation.
+			 */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			agg_info->apply_at = rel1_empty ?
+				grouped_rel2->agg_info->apply_at :
+				grouped_rel1->agg_info->apply_at;
+		}
+		else
+		{
+			/*
+			 * Otherwise, grouped paths can be built by applying partial
+			 * aggregation on top of existing non-grouped join paths.  Set
+			 * size estimates for the grouped join relation based on the
+			 * estimated number of groups, and track the lowest join level
+			 * where partial aggregation is applied.  Note that these values
+			 * may be updated later if it is determined that grouped paths can
+			 * be constructed by joining other input relations.
+			 */
+			grouped_rel->rows = agg_info->grouped_rows;
+			agg_info->apply_at = bms_copy(joinrel->relids);
+		}
+
+		grouped_rel->agg_info = agg_info;
+		joinrel->grouped_rel = grouped_rel;
+	}
+
+	Assert(IS_GROUPED_REL(grouped_rel));
+
+	/* We may have already proven this grouped join relation to be dummy. */
+	if (IS_DUMMY_REL(grouped_rel))
+		return;
+
+	/*
+	 * Nothing to do if there's no grouped input relation.  Also, joining two
+	 * grouped relations is not currently supported.
+	 */
+	if (rel1_empty == rel2_empty)
+		return;
+
+	/*
+	 * Get the lowest join level where partial aggregation is applied among
+	 * the given input relations.
+	 */
+	agg_apply_at = rel1_empty ?
+		grouped_rel2->agg_info->apply_at :
+		grouped_rel1->agg_info->apply_at;
+
+	/*
+	 * If it's not the designated level, skip building grouped paths.
+	 *
+	 * One exception is when it is a subset of the previously recorded level.
+	 * In that case, we need to update the designated level to this one, and
+	 * adjust the size estimates for the grouped join relation accordingly.
+	 * For example, suppose partial aggregation can be applied on top of (B
+	 * JOIN C).  If we first construct the join as ((A JOIN B) JOIN C), we'd
+	 * record the designated level as including all three relations (A B C).
+	 * Later, when we consider (A JOIN (B JOIN C)), we encounter the smaller
+	 * (B C) join level directly.  Since this is a subset of the previous
+	 * level and still valid for partial aggregation, we update the designated
+	 * level to (B C), and adjust the size estimates accordingly.
+	 */
+	if (!bms_equal(agg_apply_at, grouped_rel->agg_info->apply_at))
+	{
+		if (bms_is_subset(agg_apply_at, grouped_rel->agg_info->apply_at))
+		{
+			/* Adjust the size estimates for the grouped join relation. */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			grouped_rel->agg_info->apply_at = agg_apply_at;
+		}
+		else
+			return;
+	}
+
+	/* Make paths for the grouped join relation. */
+	populate_joinrel_with_paths(root,
+								rel1_empty ? rel1 : grouped_rel1,
+								rel2_empty ? rel2 : grouped_rel2,
+								grouped_rel,
+								sjinfo,
+								restrictlist);
+}
+
 /*
  * populate_joinrel_with_paths
  *	  Add paths to the given joinrel for given pair of joining relations. The
@@ -1615,6 +1803,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 						 adjust_child_relids(joinrel->relids,
 											 nappinfos, appinfos)));
 
+		/* Build a grouped join relation for 'child_joinrel' if possible */
+		make_grouped_join_rel(root, child_rel1, child_rel2,
+							  child_joinrel, child_sjinfo,
+							  child_restrictlist);
+
 		/* And make paths for the child join */
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
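
A side note on the "designated level" rule in make_grouped_join_rel(),
since it is the least obvious part of the joinrels.c changes.  The
following is only a minimal standalone sketch, not code from the patch:
RelSet, relset_is_subset() and accept_agg_level() are hypothetical
stand-ins for Relids, bms_is_subset() and the bms_equal/bms_is_subset
branch, and base relations are modeled as single bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for Relids: one bit per base relation. */
typedef uint32_t RelSet;

#define REL_A ((RelSet) 1 << 0)
#define REL_B ((RelSet) 1 << 1)
#define REL_C ((RelSet) 1 << 2)

/* True if every member of 'a' is also in 'b' (cf. bms_is_subset). */
bool
relset_is_subset(RelSet a, RelSet b)
{
	return (a & ~b) == 0;
}

/*
 * Decide whether grouped paths should be built for an input pair whose
 * partial aggregation sits at 'input_level', given the level currently
 * recorded in '*designated'.
 *
 * Same level: build paths.  Proper subset: adopt the smaller level and
 * build paths.  Anything else: skip this input pair.
 */
bool
accept_agg_level(RelSet input_level, RelSet *designated)
{
	if (input_level == *designated)
		return true;

	if (relset_is_subset(input_level, *designated))
	{
		*designated = input_level;
		return true;
	}

	return false;
}
```

With the example from the comment, the level is first recorded as
(A B C); seeing (B C) later shrinks it to (B C), and a subsequent
(A B) candidate is rejected because it is neither equal to nor a
subset of (B C).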
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 3e3fec89252..1b778f692d4 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/nbtree.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
@@ -31,6 +32,7 @@
 #include "optimizer/restrictinfo.h"
 #include "parser/analyze.h"
 #include "rewrite/rewriteManip.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/typcache.h"
@@ -81,6 +83,9 @@ typedef struct JoinTreeItem
 } JoinTreeItem;
 
 
+static bool is_partial_agg_memory_risky(PlannerInfo *root);
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
 static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
 									   Index rtindex);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -628,6 +633,324 @@ remove_useless_groupby_columns(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_eager_aggregation
+ *	  Check if eager aggregation is applicable, and if so collect suitable
+ *	  aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+	/*
+	 * Don't apply eager aggregation if disabled by user.
+	 */
+	if (!enable_eager_aggregate)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if there are no available GROUP BY
+	 * clauses.
+	 */
+	if (!root->processed_groupClause)
+		return;
+
+	/*
+	 * For now we don't try to support grouping sets.
+	 */
+	if (root->parse->groupingSets)
+		return;
+
+	/*
+	 * For now we don't try to support DISTINCT or ORDER BY aggregates.
+	 */
+	if (root->numOrderedAggs > 0)
+		return;
+
+	/*
+	 * If there are any aggregates that do not support partial mode, or any
+	 * partial aggregates that are non-serializable, do not apply eager
+	 * aggregation.
+	 */
+	if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+		return;
+
+	/*
+	 * We don't try to apply eager aggregation if there are set-returning
+	 * functions in targetlist.
+	 */
+	if (root->parse->hasTargetSRFs)
+		return;
+
+	/*
+	 * Eager aggregation only makes sense if there are multiple base rels in
+	 * the query.
+	 */
+	if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if any aggregate poses a risk of
+	 * excessive memory usage during partial aggregation.
+	 */
+	if (is_partial_agg_memory_risky(root))
+		return;
+
+	/*
+	 * Collect aggregate expressions and plain Vars that appear in the
+	 * targetlist and havingQual.
+	 */
+	create_agg_clause_infos(root);
+
+	/*
+	 * If there are no suitable aggregate expressions, we cannot apply eager
+	 * aggregation.
+	 */
+	if (root->agg_clause_list == NIL)
+		return;
+
+	/*
+	 * Collect grouping expressions that appear in grouping clauses.
+	 */
+	create_grouping_expr_infos(root);
+}
+
+/*
+ * is_partial_agg_memory_risky
+ *	  Checks if any aggregate poses a risk of excessive memory usage during
+ *	  partial aggregation.
+ *
+ * We check if any aggregate uses INTERNAL transition type.  Although INTERNAL
+ * is marked as pass-by-value, it usually points to a large internal data
+ * structure (like those used by string_agg or array_agg).  These transition
+ * states can grow large and their size is hard to estimate.  Applying eager
+ * aggregation in such cases risks high memory usage since partial aggregation
+ * results might be stored in join hash tables or materialized nodes.
+ *
+ * We explicitly exclude aggregates whose transition function is
+ * numeric_avg_accum or int8_avg_accum from this check, on the assumption
+ * that avg() and sum() are safe in this context.
+ */
+static bool
+is_partial_agg_memory_risky(PlannerInfo *root)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->aggtransinfos)
+	{
+		AggTransInfo *transinfo = lfirst_node(AggTransInfo, lc);
+
+		if (transinfo->transfn_oid == F_NUMERIC_AVG_ACCUM ||
+			transinfo->transfn_oid == F_INT8_AVG_ACCUM)
+			continue;
+
+		if (transinfo->aggtranstype == INTERNALOID)
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * create_agg_clause_infos
+ *	  Search the targetlist and havingQual for Aggrefs and plain Vars, and
+ *	  create an AggClauseInfo for each Aggref node.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+	List	   *tlist_exprs;
+	List	   *agg_clause_list = NIL;
+	List	   *tlist_vars = NIL;
+	Relids		aggregate_relids = NULL;
+	bool		eager_agg_applicable = true;
+	ListCell   *lc;
+
+	Assert(root->agg_clause_list == NIL);
+	Assert(root->tlist_vars == NIL);
+
+	tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+								  PVC_INCLUDE_AGGREGATES |
+								  PVC_RECURSE_WINDOWFUNCS |
+								  PVC_RECURSE_PLACEHOLDERS);
+
+	/*
+	 * Aggregates within the HAVING clause need to be processed in the same
+	 * way as those in the targetlist.  Note that HAVING can contain Aggrefs
+	 * but not WindowFuncs.
+	 */
+	if (root->parse->havingQual != NULL)
+	{
+		List	   *having_exprs;
+
+		having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+									   PVC_INCLUDE_AGGREGATES |
+									   PVC_RECURSE_PLACEHOLDERS);
+		if (having_exprs != NIL)
+		{
+			tlist_exprs = list_concat(tlist_exprs, having_exprs);
+			list_free(having_exprs);
+		}
+	}
+
+	foreach(lc, tlist_exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Aggref	   *aggref;
+		Relids		agg_eval_at;
+		AggClauseInfo *ac_info;
+
+		/* For now we don't try to support GROUPING() expressions */
+		if (IsA(expr, GroupingFunc))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* Collect plain Vars for future reference */
+		if (IsA(expr, Var))
+		{
+			tlist_vars = list_append_unique(tlist_vars, expr);
+			continue;
+		}
+
+		aggref = castNode(Aggref, expr);
+
+		Assert(aggref->aggorder == NIL);
+		Assert(aggref->aggdistinct == NIL);
+
+		/*
+		 * If there are any securityQuals, do not try to apply eager
+		 * aggregation if any non-leakproof aggregate functions are present.
+		 * This is overly strict, but for now...
+		 */
+		if (root->qual_security_level > 0 &&
+			!get_func_leakproof(aggref->aggfnoid))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+		/*
+		 * If all base relations in the query are referenced by aggregate
+		 * functions, then eager aggregation is not applicable.
+		 */
+		aggregate_relids = bms_add_members(aggregate_relids, agg_eval_at);
+		if (bms_is_subset(root->all_baserels, aggregate_relids))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* OK, create the AggClauseInfo node */
+		ac_info = makeNode(AggClauseInfo);
+		ac_info->aggref = aggref;
+		ac_info->agg_eval_at = agg_eval_at;
+
+		/* ... and add it to the list */
+		agg_clause_list = list_append_unique(agg_clause_list, ac_info);
+	}
+
+	list_free(tlist_exprs);
+
+	if (eager_agg_applicable)
+	{
+		root->agg_clause_list = agg_clause_list;
+		root->tlist_vars = tlist_vars;
+	}
+	else
+	{
+		list_free_deep(agg_clause_list);
+		list_free(tlist_vars);
+	}
+}
+
+/*
+ * create_grouping_expr_infos
+ *	  Create a GroupingExprInfo for each expression usable as grouping key.
+ *
+ * If any grouping expression is not suitable, we will just return with
+ * root->group_expr_list being NIL.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+	List	   *exprs = NIL;
+	List	   *sortgrouprefs = NIL;
+	List	   *btree_opfamilies = NIL;
+	ListCell   *lc,
+			   *lc1,
+			   *lc2,
+			   *lc3;
+
+	Assert(root->group_expr_list == NIL);
+
+	foreach(lc, root->processed_groupClause)
+	{
+		SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+		TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+		TypeCacheEntry *tce;
+		Oid			equalimageproc;
+
+		Assert(tle->ressortgroupref > 0);
+
+		/*
+		 * For now we only support plain Vars as grouping expressions.
+		 */
+		if (!IsA(tle->expr, Var))
+			return;
+
+		/*
+		 * Eager aggregation is only possible if equality implies image
+		 * equality for each grouping key.  Otherwise, placing keys with
+		 * different byte images into the same group may result in the loss of
+		 * information that could be necessary to evaluate upper qual clauses.
+		 *
+		 * For instance, the NUMERIC data type is not supported, as values
+		 * that are considered equal by the equality operator (e.g., 0 and
+		 * 0.0) can have different scales.
+		 */
+		tce = lookup_type_cache(exprType((Node *) tle->expr),
+								TYPECACHE_BTREE_OPFAMILY);
+		if (!OidIsValid(tce->btree_opf) ||
+			!OidIsValid(tce->btree_opintype))
+			return;
+
+		equalimageproc = get_opfamily_proc(tce->btree_opf,
+										   tce->btree_opintype,
+										   tce->btree_opintype,
+										   BTEQUALIMAGE_PROC);
+		if (!OidIsValid(equalimageproc) ||
+			!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+											   tce->typcollation,
+											   ObjectIdGetDatum(tce->btree_opintype))))
+			return;
+
+		exprs = lappend(exprs, tle->expr);
+		sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+		btree_opfamilies = lappend_oid(btree_opfamilies, tce->btree_opf);
+	}
+
+	/*
+	 * Construct a GroupingExprInfo for each expression.
+	 */
+	forthree(lc1, exprs, lc2, sortgrouprefs, lc3, btree_opfamilies)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc1);
+		int			sortgroupref = lfirst_int(lc2);
+		Oid			btree_opfamily = lfirst_oid(lc3);
+		GroupingExprInfo *ge_info;
+
+		ge_info = makeNode(GroupingExprInfo);
+		ge_info->expr = (Expr *) copyObject(expr);
+		ge_info->sortgroupref = sortgroupref;
+		ge_info->btree_opfamily = btree_opfamily;
+
+		root->group_expr_list = lappend(root->group_expr_list, ge_info);
+	}
+}
+
 /*****************************************************************************
  *
  *	  LATERAL REFERENCES
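
To illustrate the image-equality requirement checked in
create_grouping_expr_infos(), here is a toy model.  It is only a sketch
under loud assumptions: ToyDecimal is a made-up type and is NOT how
NUMERIC is stored, and toy_equal()/toy_image_equal() are hypothetical
helpers.  It only shows how two values can compare equal while carrying
different representations, which is the situation the BTEQUALIMAGE_PROC
check is meant to rule out.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy decimal: value = digits / 10^scale.  Not PostgreSQL's NUMERIC. */
typedef struct ToyDecimal
{
	long		digits;
	int			scale;
} ToyDecimal;

/* Value equality: bring both to the larger scale, then compare digits. */
bool
toy_equal(ToyDecimal a, ToyDecimal b)
{
	while (a.scale < b.scale)
	{
		a.digits *= 10;
		a.scale++;
	}
	while (b.scale < a.scale)
	{
		b.digits *= 10;
		b.scale++;
	}
	return a.digits == b.digits;
}

/* Image equality: the stored representation matches field by field. */
bool
toy_image_equal(ToyDecimal a, ToyDecimal b)
{
	return a.digits == b.digits && a.scale == b.scale;
}
```

In this model 0 (digits 0, scale 0) and 0.0 (digits 0, scale 1) are
equal but not image-equal.  If they were folded into one group below a
join, only one representation would survive, and an upper qual that is
sensitive to the scale could no longer be evaluated correctly.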
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 5467e094ca7..eefc486a566 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -76,6 +76,9 @@ query_planner(PlannerInfo *root,
 	root->placeholder_list = NIL;
 	root->placeholder_array = NULL;
 	root->placeholder_array_size = 0;
+	root->agg_clause_list = NIL;
+	root->group_expr_list = NIL;
+	root->tlist_vars = NIL;
 	root->fkey_list = NIL;
 	root->initial_rels = NIL;
 
@@ -265,6 +268,12 @@ query_planner(PlannerInfo *root,
 	 */
 	extract_restriction_or_clauses(root);
 
+	/*
+	 * Check if eager aggregation is applicable, and if so, set up
+	 * root->agg_clause_list and root->group_expr_list.
+	 */
+	setup_eager_aggregation(root);
+
 	/*
 	 * Now expand appendrels by adding "otherrels" for their children.  We
 	 * delay this to the end so that we have as much information as possible
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 41bd8353430..462c5335589 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -232,7 +232,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 									  RelOptInfo *partially_grouped_rel,
 									  const AggClauseCosts *agg_costs,
 									  grouping_sets_data *gd,
-									  double dNumGroups,
 									  GroupPathExtraData *extra);
 static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
 												 RelOptInfo *grouped_rel,
@@ -4010,9 +4009,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 							   GroupPathExtraData *extra,
 							   RelOptInfo **partially_grouped_rel_p)
 {
-	Path	   *cheapest_path = input_rel->cheapest_total_path;
 	RelOptInfo *partially_grouped_rel = NULL;
-	double		dNumGroups;
 	PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
 
 	/*
@@ -4094,23 +4091,16 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Gather any partially grouped partial paths. */
 	if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
-	{
 		gather_grouping_paths(root, partially_grouped_rel);
-		set_cheapest(partially_grouped_rel);
-	}
 
-	/*
-	 * Estimate number of groups.
-	 */
-	dNumGroups = get_number_of_groups(root,
-									  cheapest_path->rows,
-									  gd,
-									  extra->targetList);
+	/* Now choose the best path(s) for partially_grouped_rel. */
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+		set_cheapest(partially_grouped_rel);
 
 	/* Build final grouping paths */
 	add_paths_to_grouping_rel(root, input_rel, grouped_rel,
 							  partially_grouped_rel, agg_costs, gd,
-							  dNumGroups, extra);
+							  extra);
 
 	/* Give a helpful error if we failed to find any implementation */
 	if (grouped_rel->pathlist == NIL)
@@ -7055,16 +7045,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 						  RelOptInfo *grouped_rel,
 						  RelOptInfo *partially_grouped_rel,
 						  const AggClauseCosts *agg_costs,
-						  grouping_sets_data *gd, double dNumGroups,
+						  grouping_sets_data *gd,
 						  GroupPathExtraData *extra)
 {
 	Query	   *parse = root->parse;
 	Path	   *cheapest_path = input_rel->cheapest_total_path;
+	Path	   *cheapest_partially_grouped_path = NULL;
 	ListCell   *lc;
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 	List	   *havingQual = (List *) extra->havingQual;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+	double		dNumGroups = 0;
+	double		dNumFinalGroups = 0;
+
+	/*
+	 * Estimate number of groups for non-split aggregation.
+	 */
+	dNumGroups = get_number_of_groups(root,
+									  cheapest_path->rows,
+									  gd,
+									  extra->targetList);
+
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+	{
+		cheapest_partially_grouped_path =
+			partially_grouped_rel->cheapest_total_path;
+
+		/*
+		 * Estimate number of groups for final phase of partial aggregation.
+		 */
+		dNumFinalGroups =
+			get_number_of_groups(root,
+								 cheapest_partially_grouped_path->rows,
+								 gd,
+								 extra->targetList);
+	}
 
 	if (can_sort)
 	{
@@ -7177,7 +7193,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 					path = make_ordered_path(root,
 											 grouped_rel,
 											 path,
-											 partially_grouped_rel->cheapest_total_path,
+											 cheapest_partially_grouped_path,
 											 info->pathkeys,
 											 -1.0);
 
@@ -7195,7 +7211,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												 info->clauses,
 												 havingQual,
 												 agg_final_costs,
-												 dNumGroups));
+												 dNumFinalGroups));
 					else
 						add_path(grouped_rel, (Path *)
 								 create_group_path(root,
@@ -7203,7 +7219,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												   path,
 												   info->clauses,
 												   havingQual,
-												   dNumGroups));
+												   dNumFinalGroups));
 
 				}
 			}
@@ -7245,19 +7261,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 		 */
 		if (partially_grouped_rel && partially_grouped_rel->pathlist)
 		{
-			Path	   *path = partially_grouped_rel->cheapest_total_path;
-
 			add_path(grouped_rel, (Path *)
 					 create_agg_path(root,
 									 grouped_rel,
-									 path,
+									 cheapest_partially_grouped_path,
 									 grouped_rel->reltarget,
 									 AGG_HASHED,
 									 AGGSPLIT_FINAL_DESERIAL,
 									 root->processed_groupClause,
 									 havingQual,
 									 agg_final_costs,
-									 dNumGroups));
+									 dNumFinalGroups));
 		}
 	}
 
@@ -7297,6 +7311,7 @@ create_partial_grouping_paths(PlannerInfo *root,
 {
 	Query	   *parse = root->parse;
 	RelOptInfo *partially_grouped_rel;
+	RelOptInfo *eager_agg_rel = NULL;
 	AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
 	Path	   *cheapest_partial_path = NULL;
@@ -7307,6 +7322,15 @@ create_partial_grouping_paths(PlannerInfo *root,
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 
+	/*
+	 * Check whether any partially aggregated paths have been generated
+	 * through eager aggregation.
+	 */
+	if (input_rel->grouped_rel &&
+		!IS_DUMMY_REL(input_rel->grouped_rel) &&
+		input_rel->grouped_rel->pathlist != NIL)
+		eager_agg_rel = input_rel->grouped_rel;
+
 	/*
 	 * Consider whether we should generate partially aggregated non-partial
 	 * paths.  We can only do this if we have a non-partial path, and only if
@@ -7328,11 +7352,13 @@ create_partial_grouping_paths(PlannerInfo *root,
 
 	/*
 	 * If we can't partially aggregate partial paths, and we can't partially
-	 * aggregate non-partial paths, then don't bother creating the new
+	 * aggregate non-partial paths, and no partially aggregated paths were
+	 * generated by eager aggregation, then don't bother creating the new
 	 * RelOptInfo at all, unless the caller specified force_rel_creation.
 	 */
 	if (cheapest_total_path == NULL &&
 		cheapest_partial_path == NULL &&
+		eager_agg_rel == NULL &&
 		!force_rel_creation)
 		return NULL;
 
@@ -7557,6 +7583,51 @@ create_partial_grouping_paths(PlannerInfo *root,
 										 dNumPartialPartialGroups));
 	}
 
+	/*
+	 * Add any partially aggregated paths generated by eager aggregation to
+	 * the new upper relation after applying projection steps as needed.
+	 */
+	if (eager_agg_rel)
+	{
+		/* Add the paths */
+		foreach(lc, eager_agg_rel->pathlist)
+		{
+			Path	   *path = (Path *) lfirst(lc);
+
+			/* Shouldn't have any parameterized paths anymore */
+			Assert(path->param_info == NULL);
+
+			path = (Path *) create_projection_path(root,
+												   partially_grouped_rel,
+												   path,
+												   partially_grouped_rel->reltarget);
+
+			add_path(partially_grouped_rel, path);
+		}
+
+		/*
+		 * Likewise add the partial paths, but only if parallelism is possible
+		 * for partially_grouped_rel.
+		 */
+		if (partially_grouped_rel->consider_parallel)
+		{
+			foreach(lc, eager_agg_rel->partial_pathlist)
+			{
+				Path	   *path = (Path *) lfirst(lc);
+
+				/* Shouldn't have any parameterized paths anymore */
+				Assert(path->param_info == NULL);
+
+				path = (Path *) create_projection_path(root,
+													   partially_grouped_rel,
+													   path,
+													   partially_grouped_rel->reltarget);
+
+				add_partial_path(partially_grouped_rel, path);
+			}
+		}
+	}
+
 	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding partially grouped ForeignPaths.
@@ -8120,13 +8191,6 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
 
 		add_paths_to_append_rel(root, partially_grouped_rel,
 								partially_grouped_live_children);
-
-		/*
-		 * We need call set_cheapest, since the finalization step will use the
-		 * cheapest path from the rel.
-		 */
-		if (partially_grouped_rel->pathlist)
-			set_cheapest(partially_grouped_rel);
 	}
 
 	/* If possible, create append paths for fully grouped children. */
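
The planner.c changes feed partially aggregated paths from eager
aggregation into the existing final aggregation step.  As a reminder of
why splitting an aggregate this way gives the same answer, here is a
small sketch for avg().  Everything in it is an assumption made for
illustration: AvgState, avg_accum(), avg_combine() and avg_final() are
hypothetical names, not the backend's transition, combine and final
functions.

```c
#include <assert.h>

/* Hypothetical transition state for avg(): running sum and row count. */
typedef struct AvgState
{
	double		sum;
	long		count;
} AvgState;

/* Partial phase: fold one input value into a transition state. */
AvgState
avg_accum(AvgState state, double value)
{
	state.sum += value;
	state.count += 1;
	return state;
}

/* Final phase, step 1: merge two partial states for the same group. */
AvgState
avg_combine(AvgState a, AvgState b)
{
	AvgState	result = {a.sum + b.sum, a.count + b.count};

	return result;
}

/*
 * Final phase, step 2: produce the aggregate result.  This sketch returns
 * 0.0 for an empty state; SQL avg() would return NULL instead.
 */
double
avg_final(AvgState state)
{
	return state.count > 0 ? state.sum / state.count : 0.0;
}
```

Accumulating {1, 2} and {6} separately and then combining the two
states yields the same average as accumulating {1, 2, 6} in one pass.
Note that the number of states reaching the final phase depends on where
the partial aggregation was placed, which is why the final phase gets
its own group estimate (dNumFinalGroups) in the patch.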
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index 5b3dc0d8653..11c0eb0d180 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -516,6 +516,65 @@ adjust_appendrel_attrs_mutator(Node *node,
 		return (Node *) newinfo;
 	}
 
+	/*
+	 * We have to process RelAggInfo nodes specially.
+	 */
+	if (IsA(node, RelAggInfo))
+	{
+		RelAggInfo *oldinfo = (RelAggInfo *) node;
+		RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newinfo, oldinfo, sizeof(RelAggInfo));
+
+		newinfo->relids = adjust_child_relids(oldinfo->relids,
+											  nappinfos, appinfos);
+
+		newinfo->target = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+										   context);
+
+		newinfo->agg_input = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+										   context);
+
+		newinfo->group_clauses = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_clauses,
+										   context);
+
+		newinfo->group_exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+										   context);
+
+		return (Node *) newinfo;
+	}
+
+	/*
+	 * We have to process PathTarget nodes specially.
+	 */
+	if (IsA(node, PathTarget))
+	{
+		PathTarget *oldtarget = (PathTarget *) node;
+		PathTarget *newtarget = makeNode(PathTarget);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+		newtarget->exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+										   context);
+
+		if (oldtarget->sortgrouprefs)
+		{
+			Size		nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+			newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+			memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+		}
+
+		return (Node *) newtarget;
+	}
+
 	/*
 	 * NOTE: we do not need to recurse into sublinks, because they should
 	 * already have been converted to subplans before we see them.
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 0e523d2eb5b..faa44e46594 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,8 @@
 
 #include <limits.h>
 
+#include "access/nbtree.h"
+#include "catalog/pg_constraint.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
@@ -27,12 +29,16 @@
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
 #include "parser/parse_relation.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/typcache.h"
 
 
 typedef struct JoinHashEntry
@@ -83,6 +89,14 @@ static void build_child_join_reltarget(PlannerInfo *root,
 									   RelOptInfo *childrel,
 									   int nappinfos,
 									   AppendRelInfo **appinfos);
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+													RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+								  PathTarget *target, PathTarget *agg_input,
+								  List **group_clauses, List **group_exprs);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
 
 
 /*
@@ -278,6 +292,8 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	rel->joininfo = NIL;
 	rel->has_eclass_joins = false;
 	rel->consider_partitionwise_join = false;	/* might get changed later */
+	rel->agg_info = NULL;
+	rel->grouped_rel = NULL;
 	rel->part_scheme = NULL;
 	rel->nparts = -1;
 	rel->boundinfo = NULL;
@@ -408,6 +424,103 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	return rel;
 }
 
+/*
+ * build_simple_grouped_rel
+ *	  Construct a new RelOptInfo representing a grouped version of the input
+ *	  base relation.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+	RelAggInfo *agg_info;
+
+	/*
+	 * We should have available aggregate expressions and grouping
+	 * expressions, otherwise we cannot reach here.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/* nothing to do for dummy rel */
+	if (IS_DUMMY_REL(rel))
+		return NULL;
+
+	/*
+	 * Prepare the information needed to create grouped paths for this base
+	 * relation.
+	 */
+	agg_info = create_rel_agg_info(root, rel);
+	if (agg_info == NULL)
+		return NULL;
+
+	/*
+	 * If grouped paths for the given base relation are not considered useful,
+	 * skip building the grouped relation.
+	 */
+	if (!agg_info->agg_useful)
+		return NULL;
+
+	/* Tracks the lowest join level at which partial aggregation is applied */
+	agg_info->apply_at = bms_copy(rel->relids);
+
+	/* build the grouped relation */
+	grouped_rel = build_grouped_rel(root, rel);
+	grouped_rel->reltarget = agg_info->target;
+	grouped_rel->rows = agg_info->grouped_rows;
+	grouped_rel->agg_info = agg_info;
+
+	rel->grouped_rel = grouped_rel;
+
+	return grouped_rel;
+}
+
+/*
+ * build_grouped_rel
+ *	  Build a grouped relation by flat copying the input relation and resetting
+ *	  the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	grouped_rel = makeNode(RelOptInfo);
+	memcpy(grouped_rel, rel, sizeof(RelOptInfo));
+
+	/*
+	 * clear path info
+	 */
+	grouped_rel->pathlist = NIL;
+	grouped_rel->ppilist = NIL;
+	grouped_rel->partial_pathlist = NIL;
+	grouped_rel->cheapest_startup_path = NULL;
+	grouped_rel->cheapest_total_path = NULL;
+	grouped_rel->cheapest_parameterized_paths = NIL;
+
+	/*
+	 * clear partition info
+	 */
+	grouped_rel->part_scheme = NULL;
+	grouped_rel->nparts = -1;
+	grouped_rel->boundinfo = NULL;
+	grouped_rel->partbounds_merged = false;
+	grouped_rel->partition_qual = NIL;
+	grouped_rel->part_rels = NULL;
+	grouped_rel->live_parts = NULL;
+	grouped_rel->all_partrels = NULL;
+	grouped_rel->partexprs = NULL;
+	grouped_rel->nullable_partexprs = NULL;
+	grouped_rel->consider_partitionwise_join = false;
+
+	/*
+	 * clear size estimates
+	 */
+	grouped_rel->rows = 0;
+
+	return grouped_rel;
+}
+
 /*
  * find_base_rel
  *	  Find a base or otherrel relation entry, which must already exist.
@@ -759,6 +872,8 @@ build_join_rel(PlannerInfo *root,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = NULL;
 	joinrel->top_parent = NULL;
 	joinrel->top_parent_relids = NULL;
@@ -945,6 +1060,8 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = parent_joinrel;
 	joinrel->top_parent = parent_joinrel->top_parent ? parent_joinrel->top_parent : parent_joinrel;
 	joinrel->top_parent_relids = joinrel->top_parent->relids;
@@ -2523,3 +2640,514 @@ build_child_join_reltarget(PlannerInfo *root,
 	childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
 	childrel->reltarget->width = parentrel->reltarget->width;
 }
+
+/*
+ * create_rel_agg_info
+ *	  Create the RelAggInfo structure for the given relation if it can produce
+ *	  grouped paths.  The given relation is the non-grouped one which has the
+ *	  reltarget already constructed.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	RelAggInfo *result;
+	PathTarget *agg_input;
+	PathTarget *target;
+	List	   *group_clauses = NIL;
+	List	   *group_exprs = NIL;
+
+	/*
+	 * The lists of aggregate expressions and grouping expressions should have
+	 * been constructed.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/*
+	 * If this is a child rel, the grouped rel for its top parent must already
+	 * have been created if that was possible.  So we can just use the parent's
+	 * RelAggInfo if there is one, with appropriate variable substitutions.
+	 */
+	if (IS_OTHER_REL(rel))
+	{
+		RelOptInfo *grouped_rel;
+		RelAggInfo *agg_info;
+
+		grouped_rel = rel->top_parent->grouped_rel;
+		if (grouped_rel == NULL)
+			return NULL;
+
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		/* Must do multi-level transformation */
+		agg_info = (RelAggInfo *)
+			adjust_appendrel_attrs_multilevel(root,
+											  (Node *) grouped_rel->agg_info,
+											  rel,
+											  rel->top_parent);
+
+		agg_info->grouped_rows =
+			estimate_num_groups(root, agg_info->group_exprs,
+								rel->rows, NULL, NULL);
+
+		agg_info->apply_at = NULL;	/* caller will change this later */
+
+		/*
+		 * The grouped paths for the given relation are considered useful iff
+		 * the average group size is no less than min_eager_agg_group_size.
+		 */
+		agg_info->agg_useful =
+			(rel->rows / agg_info->grouped_rows) >= min_eager_agg_group_size;
+
+		return agg_info;
+	}
+
+	/* Check if it's possible to produce grouped paths for this relation. */
+	if (!eager_aggregation_possible_for_relation(root, rel))
+		return NULL;
+
+	/*
+	 * Create targets for the grouped paths and for the input paths of the
+	 * grouped paths.
+	 */
+	target = create_empty_pathtarget();
+	agg_input = create_empty_pathtarget();
+
+	/* ... and initialize these targets */
+	if (!init_grouping_targets(root, rel, target, agg_input,
+							   &group_clauses, &group_exprs))
+		return NULL;
+
+	/*
+	 * Eager aggregation is not applicable if there are no available grouping
+	 * expressions.
+	 */
+	if (list_length(group_clauses) == 0)
+		return NULL;
+
+	/* build the RelAggInfo result */
+	result = makeNode(RelAggInfo);
+
+	result->group_clauses = group_clauses;
+	result->group_exprs = group_exprs;
+
+	/* Calculate pathkeys that represent these grouping requirements */
+	result->group_pathkeys =
+		make_pathkeys_for_sortclauses(root, result->group_clauses,
+									  make_tlist_from_pathtarget(target));
+
+	/* Add aggregates to the grouping target */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		Aggref	   *aggref;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		aggref = (Aggref *) copyObject(ac_info->aggref);
+		mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+		add_column_to_pathtarget(target, (Expr *) aggref, 0);
+	}
+
+	/* Set the estimated eval cost and output width for both targets */
+	set_pathtarget_cost_width(root, target);
+	set_pathtarget_cost_width(root, agg_input);
+
+	result->relids = bms_copy(rel->relids);
+	result->target = target;
+	result->agg_input = agg_input;
+	result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+											   rel->rows, NULL, NULL);
+	result->apply_at = NULL;	/* caller will change this later */
+
+	/*
+	 * The grouped paths for the given relation are considered useful iff the
+	 * average group size is no less than min_eager_agg_group_size.
+	 */
+	result->agg_useful =
+		(rel->rows / result->grouped_rows) >= min_eager_agg_group_size;
+
+	return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ *	  Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	int			cur_relid;
+
+	/*
+	 * Check to see if the given relation is in the nullable side of an outer
+	 * join.  In this case, we cannot push a partial aggregation down to the
+	 * relation, because the NULL-extended rows produced by the outer join
+	 * would not be available when we perform the partial aggregation, while
+	 * with a non-eager-aggregation plan these rows are available for the
+	 * top-level aggregation.  Doing so may result in the rows being grouped
+	 * differently than expected, or produce incorrect values from the
+	 * aggregate functions.
+	 */
+	cur_relid = -1;
+	while ((cur_relid = bms_next_member(rel->relids, cur_relid)) >= 0)
+	{
+		RelOptInfo *baserel = find_base_rel_ignore_join(root, cur_relid);
+
+		if (baserel == NULL)
+			continue;			/* ignore outer joins in rel->relids */
+
+		if (!bms_is_subset(baserel->nulling_relids, rel->relids))
+			return false;
+	}
+
+	/*
+	 * For now we don't try to support PlaceHolderVars.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = lfirst(lc);
+
+		if (IsA(expr, PlaceHolderVar))
+			return false;
+	}
+
+	/* Caller should only pass base relations or joins. */
+	Assert(rel->reloptkind == RELOPT_BASEREL ||
+		   rel->reloptkind == RELOPT_JOINREL);
+
+	/*
+	 * Check if all aggregate expressions can be evaluated on this relation
+	 * level.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		/*
+		 * Give up if any aggregate requires relations other than the current
+		 * one.  If the aggregate requires the current relation plus
+		 * additional relations, grouping the current relation could make some
+		 * input rows unavailable for the higher aggregate and may reduce the
+		 * number of input rows it receives.  If the aggregate does not
+		 * require the current relation at all, it should not be grouped, as
+		 * we do not support joining two grouped relations.
+		 */
+		if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * init_grouping_targets
+ *	  Initialize the target for grouped paths (target) as well as the target
+ *	  for paths that generate input for the grouped paths (agg_input).
+ *
+ * We also construct the list of SortGroupClauses and the list of grouping
+ * expressions for the partial aggregation, and return them in *group_clauses
+ * and *group_exprs.
+ *
+ * Return true if the targets could be initialized, false otherwise.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+					  PathTarget *target, PathTarget *agg_input,
+					  List **group_clauses, List **group_exprs)
+{
+	ListCell   *lc;
+	List	   *possibly_dependent = NIL;
+	Index		maxSortGroupRef;
+
+	/* Identify the max sortgroupref */
+	maxSortGroupRef = 0;
+	foreach(lc, root->processed_tlist)
+	{
+		Index		ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+		if (ref > maxSortGroupRef)
+			maxSortGroupRef = ref;
+	}
+
+	/*
+	 * At this point, all Vars from this relation that are needed by upper
+	 * joins or are required in the final targetlist should already be present
+	 * in its reltarget.  Therefore, we can safely iterate over this
+	 * relation's reltarget->exprs to construct the PathTarget and grouping
+	 * clauses for the grouped paths.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Index		sortgroupref;
+
+		/*
+		 * Given that PlaceHolderVar currently prevents us from doing eager
+		 * aggregation, the source target cannot contain anything more complex
+		 * than a Var.
+		 */
+		Assert(IsA(expr, Var));
+
+		/*
+		 * Get the sortgroupref of the expr if it is found among, or can be
+		 * deduced from, the original grouping expressions.
+		 */
+		sortgroupref = get_expression_sortgroupref(root, expr);
+		if (sortgroupref > 0)
+		{
+			SortGroupClause *sgc;
+
+			/* Find the matching SortGroupClause */
+			sgc = get_sortgroupref_clause(sortgroupref, root->processed_groupClause);
+			Assert(sgc->tleSortGroupRef <= maxSortGroupRef);
+
+			/*
+			 * If the target expression is to be used as a grouping key, it
+			 * should be emitted by the grouped paths that have been pushed
+			 * down to this relation level.
+			 */
+			add_column_to_pathtarget(target, expr, sortgroupref);
+
+			/*
+			 * ... and it also should be emitted by the input paths.
+			 */
+			add_column_to_pathtarget(agg_input, expr, sortgroupref);
+
+			/*
+			 * Record this SortGroupClause and grouping expression.  Note that
+			 * this SortGroupClause might have already been recorded.
+			 */
+			if (!list_member(*group_clauses, sgc))
+			{
+				*group_clauses = lappend(*group_clauses, sgc);
+				*group_exprs = lappend(*group_exprs, expr);
+			}
+		}
+		else if (is_var_needed_by_join(root, (Var *) expr, rel))
+		{
+			/*
+			 * The expression is needed for an upper join but is neither in
+			 * the GROUP BY clause nor derivable from it via EquivalenceClasses
+			 * (otherwise, it would already have been included in the targets
+			 * above).  We need to create a special SortGroupClause for it.
+			 *
+			 * It is important to include such expressions in the grouping
+			 * keys.  This is essential to ensure that an aggregated row from
+			 * the partial aggregation matches the other side of the join if
+			 * and only if each row in the partial group does.  This ensures
+			 * that all rows within the same partial group share the same
+			 * 'destiny', which is crucial for maintaining correctness.
+			 */
+			SortGroupClause *sgc;
+			TypeCacheEntry *tce;
+			Oid			equalimageproc;
+
+			/*
+			 * But first, check if equality implies image equality for this
+			 * expression.  If not, we cannot use it as a grouping key.  See
+			 * comments in create_grouping_expr_infos().
+			 */
+			tce = lookup_type_cache(exprType((Node *) expr),
+									TYPECACHE_BTREE_OPFAMILY);
+			if (!OidIsValid(tce->btree_opf) ||
+				!OidIsValid(tce->btree_opintype))
+				return false;
+
+			equalimageproc = get_opfamily_proc(tce->btree_opf,
+											   tce->btree_opintype,
+											   tce->btree_opintype,
+											   BTEQUALIMAGE_PROC);
+			if (!OidIsValid(equalimageproc) ||
+				!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+												   tce->typcollation,
+												   ObjectIdGetDatum(tce->btree_opintype))))
+				return false;
+
+			/* Create the SortGroupClause. */
+			sgc = makeNode(SortGroupClause);
+
+			/* Initialize the SortGroupClause. */
+			sgc->tleSortGroupRef = ++maxSortGroupRef;
+			get_sort_group_operators(exprType((Node *) expr),
+									 false, true, false,
+									 &sgc->sortop, &sgc->eqop, NULL,
+									 &sgc->hashable);
+
+			/* This expression should be emitted by the grouped paths */
+			add_column_to_pathtarget(target, expr, sgc->tleSortGroupRef);
+
+			/* ... and it also should be emitted by the input paths. */
+			add_column_to_pathtarget(agg_input, expr, sgc->tleSortGroupRef);
+
+			/* Record this SortGroupClause and grouping expression */
+			*group_clauses = lappend(*group_clauses, sgc);
+			*group_exprs = lappend(*group_exprs, expr);
+		}
+		else if (is_var_in_aggref_only(root, (Var *) expr))
+		{
+			/*
+			 * The expression is referenced by an aggregate function pushed
+			 * down to this relation and does not appear elsewhere in the
+			 * targetlist or havingQual.  Add it to 'agg_input' but not to
+			 * 'target'.
+			 */
+			add_new_column_to_pathtarget(agg_input, expr);
+		}
+		else
+		{
+			/*
+			 * The expression may be functionally dependent on other
+			 * expressions in the target, but we cannot verify this until all
+			 * target expressions have been constructed.
+			 */
+			possibly_dependent = lappend(possibly_dependent, expr);
+		}
+	}
+
+	/*
+	 * Now we can verify whether an expression is functionally dependent on
+	 * others.
+	 */
+	foreach(lc, possibly_dependent)
+	{
+		Var		   *tvar;
+		List	   *deps = NIL;
+		RangeTblEntry *rte;
+
+		tvar = lfirst_node(Var, lc);
+		rte = root->simple_rte_array[tvar->varno];
+
+		if (check_functional_grouping(rte->relid, tvar->varno,
+									  tvar->varlevelsup,
+									  target->exprs, &deps))
+		{
+			/*
+			 * The expression is functionally dependent on other target
+			 * expressions, so it can be included in the targets.  Since it
+			 * will not be used as a grouping key, a sortgroupref is not
+			 * needed for it.
+			 */
+			add_new_column_to_pathtarget(target, (Expr *) tvar);
+			add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+		}
+		else
+		{
+			/*
+			 * We may arrive here with a grouping expression that is proven
+			 * redundant by EquivalenceClass processing, such as 't1.a' in the
+			 * query below.
+			 *
+			 * select max(t1.c) from t t1, t t2 where t1.a = 1 group by t1.a,
+			 * t1.b;
+			 *
+			 * For now we just give up in this case.
+			 */
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ *	  Check whether the given Var appears in aggregate expressions and not
+ *	  elsewhere in the targetlist or havingQual.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+	ListCell   *lc;
+
+	/*
+	 * Search the list of aggregate expressions for the Var.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		List	   *vars;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+			continue;
+
+		vars = pull_var_clause((Node *) ac_info->aggref,
+							   PVC_RECURSE_AGGREGATES |
+							   PVC_RECURSE_WINDOWFUNCS |
+							   PVC_RECURSE_PLACEHOLDERS);
+
+		if (list_member(vars, var))
+		{
+			list_free(vars);
+			break;
+		}
+
+		list_free(vars);
+	}
+
+	return (lc != NULL && !list_member(root->tlist_vars, var));
+}
+
+/*
+ * is_var_needed_by_join
+ *	  Check if the given Var is needed by joins above the current rel.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+	Relids		relids;
+	int			attno;
+	RelOptInfo *baserel;
+
+	/*
+	 * Note that when checking if the Var is needed by joins above, we want to
+	 * exclude cases where the Var is only needed in the final targetlist.  So
+	 * include "relation 0" in the check.
+	 */
+	relids = bms_copy(rel->relids);
+	relids = bms_add_member(relids, 0);
+
+	baserel = find_base_rel(root, var->varno);
+	attno = var->varattno - baserel->min_attr;
+
+	return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ *	  Return the sortgroupref of the given "expr" if it is found among the
+ *	  original grouping expressions, or is known equal to any of the original
+ *	  grouping expressions due to equivalence relationships.  Return 0 if no
+ *	  match is found.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->group_expr_list)
+	{
+		GroupingExprInfo *ge_info = lfirst_node(GroupingExprInfo, lc);
+
+		Assert(IsA(ge_info->expr, Var));
+
+		if (equal(ge_info->expr, expr) ||
+			exprs_known_equal(root, (Node *) expr, (Node *) ge_info->expr,
+							  ge_info->btree_opfamily))
+		{
+			Assert(ge_info->sortgroupref > 0);
+
+			return ge_info->sortgroupref;
+		}
+	}
+
+	/* no match is found */
+	return 0;
+}
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 0da01627cfe..f35dd1b23bf 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -145,6 +145,13 @@
   boot_val => 'false',
 },
 
+{ name => 'enable_eager_aggregate', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
+  short_desc => 'Enables eager aggregation.',
+  flags => 'GUC_EXPLAIN',
+  variable => 'enable_eager_aggregate',
+  boot_val => 'true',
+},
+
 { name => 'enable_parallel_append', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
   short_desc => 'Enables the planner\'s use of parallel append plans.',
   flags => 'GUC_EXPLAIN',
@@ -2427,6 +2434,15 @@
   max => 'DBL_MAX',
 },
 
+{ name => 'min_eager_agg_group_size', type => 'real', context => 'PGC_USERSET', group => 'QUERY_TUNING_COST',
+  short_desc => 'Sets the minimum average group size required to consider applying eager aggregation.',
+  flags => 'GUC_EXPLAIN',
+  variable => 'min_eager_agg_group_size',
+  boot_val => '8.0',
+  min => '0.0',
+  max => 'DBL_MAX',
+},
+
 { name => 'cursor_tuple_fraction', type => 'real', context => 'PGC_USERSET', group => 'QUERY_TUNING_OTHER',
   short_desc => 'Sets the planner\'s estimate of the fraction of a cursor\'s rows that will be retrieved.',
   flags => 'GUC_EXPLAIN',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index 26c08693564..7325bcd439d 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -428,6 +428,7 @@
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
 #enable_self_join_elimination = on
+#enable_eager_aggregate = on
 
 # - Planner Cost Constants -
 
@@ -441,6 +442,7 @@
 #min_parallel_table_scan_size = 8MB
 #min_parallel_index_scan_size = 512kB
 #effective_cache_size = 4GB
+#min_eager_agg_group_size = 8.0
 
 #jit_above_cost = 100000		# perform JIT compilation if available
 					# and query more expensive than this;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index 4a903d1ec18..ad211207343 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -397,6 +397,15 @@ struct PlannerInfo
 	/* list of PlaceHolderInfos */
 	List	   *placeholder_list;
 
+	/* list of AggClauseInfos */
+	List	   *agg_clause_list;
+
+	/* list of GroupingExprInfos */
+	List	   *group_expr_list;
+
+	/* list of plain Vars contained in targetlist and havingQual */
+	List	   *tlist_vars;
+
 	/* array of PlaceHolderInfos indexed by phid */
 	struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
 	/* allocated size of array */
@@ -1046,6 +1055,14 @@ typedef struct RelOptInfo
 	/* consider partitionwise join paths? (if partitioned rel) */
 	bool		consider_partitionwise_join;
 
+	/*
+	 * used by eager aggregation:
+	 */
+	/* information needed to create grouped paths */
+	struct RelAggInfo *agg_info;
+	/* the partially-aggregated version of the relation */
+	struct RelOptInfo *grouped_rel;
+
 	/*
 	 * inheritance links, if this is an otherrel (otherwise NULL):
 	 */
@@ -1130,6 +1147,75 @@ typedef struct RelOptInfo
 	((nominal_jointype) == JOIN_INNER && (sjinfo)->jointype == JOIN_SEMI && \
 	 bms_equal((sjinfo)->syn_righthand, (rel)->relids))
 
+/*
+ * Is the given relation a grouped relation?
+ */
+#define IS_GROUPED_REL(rel) \
+	((rel)->agg_info != NULL)
+
+/*
+ * RelAggInfo
+ *		Information needed to create grouped paths for base and join rels.
+ *
+ * "relids" is the set of relation identifiers (RT indexes).
+ *
+ * "target" is the output tlist for the grouped paths.
+ *
+ * "agg_input" is the output tlist for the paths that provide input to the
+ * grouped paths.  One difference from the reltarget of the non-grouped
+ * relation is that agg_input has its sortgrouprefs[] initialized.
+ *
+ * "grouped_rows" is the estimated number of result tuples of the grouped
+ * relation.
+ *
+ * "group_clauses", "group_exprs" and "group_pathkeys" are lists of
+ * SortGroupClauses, the corresponding grouping expressions and PathKeys
+ * respectively.
+ *
+ * "apply_at" tracks the lowest join level at which partial aggregation is
+ * applied.
+ *
+ * "agg_useful" is a flag to indicate whether the grouped paths are considered
+ * useful.  It is set true if the average partial group size is no less than
+ * min_eager_agg_group_size, suggesting a significant row count reduction.
+ */
+typedef struct RelAggInfo
+{
+	pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* set of base + OJ relids (rangetable indexes) */
+	Relids		relids;
+
+	/*
+	 * default result targetlist for Paths scanning this grouped relation;
+	 * list of Vars/Exprs, cost, width
+	 */
+	struct PathTarget *target;
+
+	/*
+	 * the targetlist for Paths that provide input to the grouped paths
+	 */
+	struct PathTarget *agg_input;
+
+	/* estimated number of result tuples */
+	Cardinality grouped_rows;
+
+	/* a list of SortGroupClauses */
+	List	   *group_clauses;
+	/* a list of grouping expressions */
+	List	   *group_exprs;
+	/* a list of PathKeys */
+	List	   *group_pathkeys;
+
+	/* lowest level partial aggregation is applied at */
+	Relids		apply_at;
+
+	/* the grouped paths are considered useful? */
+	bool		agg_useful;
+} RelAggInfo;
+
 /*
  * IndexOptInfo
  *		Per-index information for planning/optimization
@@ -3283,6 +3369,50 @@ typedef struct MinMaxAggInfo
 	Param	   *param;
 } MinMaxAggInfo;
 
+/*
+ * For each distinct Aggref node that appears in the targetlist or the HAVING
+ * clause, we store an AggClauseInfo node in the PlannerInfo node's
+ * agg_clause_list.  Each AggClauseInfo records the set of relations referenced
+ * by the aggregate expression.  This information is used to determine how far
+ * the aggregate can be safely pushed down in the join tree.
+ */
+typedef struct AggClauseInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the Aggref expr */
+	Aggref	   *aggref;
+
+	/* lowest level we can evaluate this aggregate at */
+	Relids		agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * For each grouping expression that appears in grouping clauses, we store a
+ * GroupingExprInfo node in the PlannerInfo node's group_expr_list.  Each
+ * GroupingExprInfo records the expression being grouped on, its sortgroupref,
+ * and the btree opfamily used for equality comparison.  This information is
+ * necessary to reproduce correct grouping semantics at different levels of the
+ * join tree.
+ */
+typedef struct GroupingExprInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the represented expression */
+	Expr	   *expr;
+
+	/* the tleSortGroupRef of the corresponding SortGroupClause */
+	Index		sortgroupref;
+
+	/* btree opfamily defining the ordering */
+	Oid			btree_opfamily;
+} GroupingExprInfo;
+
 /*
  * At runtime, PARAM_EXEC slots are used to pass values around from one plan
  * node to another.  They can be used to pass values down into subqueries (for
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 763cd25bb3c..5b9c1daf14b 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -312,6 +312,10 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
 									RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
+											RelOptInfo *rel_plain);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
+									 RelOptInfo *rel_plain);
 extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
@@ -351,4 +355,5 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										SpecialJoinInfo *sjinfo,
 										int nappinfos, AppendRelInfo **appinfos);
 
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel);
 #endif							/* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index cbade77b717..8d03d662a04 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,7 +21,9 @@
  * allpaths.c
  */
 extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
 extern PGDLLIMPORT int geqo_threshold;
+extern PGDLLIMPORT double min_eager_agg_group_size;
 extern PGDLLIMPORT int min_parallel_table_scan_size;
 extern PGDLLIMPORT int min_parallel_index_scan_size;
 extern PGDLLIMPORT bool enable_group_by_reordering;
@@ -57,6 +59,10 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 								  bool override_rows);
 extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 										 bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root,
+								   RelOptInfo *rel_grouped,
+								   RelOptInfo *rel_plain,
+								   RelAggInfo *agg_info);
 extern int	compute_parallel_worker(RelOptInfo *rel, double heap_pages,
 									double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 9d3debcab28..09b48b26f8f 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -76,6 +76,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
 									Relids where_needed);
 extern void remove_useless_groupby_columns(PlannerInfo *root);
+extern void setup_eager_aggregation(PlannerInfo *root);
 extern void find_lateral_references(PlannerInfo *root);
 extern void rebuild_lateral_attr_needed(PlannerInfo *root);
 extern void create_lateral_join_info(PlannerInfo *root);
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 69805d4b9ec..ef79d6f1ded 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -2437,11 +2437,11 @@ SELECT c collate "C", count(c) FROM pagg_tab3 GROUP BY c collate "C" ORDER BY 1;
 SET enable_partitionwise_join TO false;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2449,10 +2449,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
@@ -2464,11 +2466,11 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
 SET enable_partitionwise_join TO true;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2476,10 +2478,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 00000000000..0dab585e9ce
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1584 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b
+                                 Sort Key: t2.b
+                                 ->  Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Hash Join
+                                 Output: t2.c, t2.b, t3.c
+                                 Hash Cond: (t3.a = t2.a)
+                                 ->  Seq Scan on public.eager_agg_t3 t3
+                                       Output: t3.a, t3.b, t3.c
+                                 ->  Hash
+                                       Output: t2.c, t2.b, t2.a
+                                       ->  Seq Scan on public.eager_agg_t2 t2
+                                             Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b, t3.c
+                                 Sort Key: t2.b
+                                 ->  Hash Join
+                                       Output: t2.c, t2.b, t3.c
+                                       Hash Cond: (t3.a = t2.a)
+                                       ->  Seq Scan on public.eager_agg_t3 t3
+                                             Output: t3.a, t3.b, t3.c
+                                       ->  Hash
+                                             Output: t2.c, t2.b, t2.a
+                                             ->  Seq Scan on public.eager_agg_t2 t2
+                                                   Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Right Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Sort
+   Output: t2.b, (avg(t2.c))
+   Sort Key: t2.b
+   ->  HashAggregate
+         Output: t2.b, avg(t2.c)
+         Group Key: t2.b
+         ->  Hash Right Join
+               Output: t2.b, t2.c
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on public.eager_agg_t2 t2
+                     Output: t2.a, t2.b, t2.c
+               ->  Hash
+                     Output: t1.b
+                     ->  Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ b | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   |    
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Gather Merge
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Workers Planned: 2
+         ->  Sort
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Sort Key: t1.a
+               ->  Parallel Hash Join
+                     Output: t1.a, (PARTIAL avg(t2.c))
+                     Hash Cond: (t1.b = t2.b)
+                     ->  Parallel Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.a, t1.b, t1.c
+                     ->  Parallel Hash
+                           Output: t2.b, (PARTIAL avg(t2.c))
+                           ->  Partial HashAggregate
+                                 Output: t2.b, PARTIAL avg(t2.c)
+                                 Group Key: t2.b
+                                 ->  Parallel Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+--
+-- Test eager aggregation with GEQO
+--
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET geqo;
+RESET geqo_threshold;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t2.y, (sum(t1.y)), (count(*))
+   Sort Key: t2.y
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t2.y, sum(t1.y), count(*)
+               Group Key: t2.y
+               ->  Hash Join
+                     Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.y, t1.x
+         ->  Finalize HashAggregate
+               Output: t2_1.y, sum(t1_1.y), count(*)
+               Group Key: t2_1.y
+               ->  Hash Join
+                     Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.y, t1_1.x
+         ->  Finalize HashAggregate
+               Output: t2_2.y, sum(t1_2.y), count(*)
+               Group Key: t2_2.y
+               ->  Hash Join
+                     Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+                                                 QUERY PLAN                                                 
+------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t2.x, (sum(t1.x)), (count(*))
+   Sort Key: t2.x
+   ->  Finalize HashAggregate
+         Output: t2.x, sum(t1.x), count(*)
+         Group Key: t2.x
+         Filter: (avg(t1.x) > '5'::numeric)
+         ->  Append
+               ->  Hash Join
+                     Output: t2.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.x, t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.x), PARTIAL count(*), PARTIAL avg(t1.x)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x
+               ->  Hash Join
+                     Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.x, t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x
+               ->  Hash Join
+                     Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.x, t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+ x |  sum  | count 
+---+-------+-------
+ 0 | 33835 |  6667
+ 1 | 39502 |  6667
+ 2 | 46169 |  6667
+ 3 | 52836 |  6667
+ 4 | 59503 |  6667
+ 5 | 33500 |  6667
+ 6 | 39837 |  6667
+ 7 | 46504 |  6667
+ 8 | 53171 |  6667
+ 9 | 59838 |  6667
+(10 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y)))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y))
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y))
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y))
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   
+----+---------
+  0 | 1437480
+  1 | 2082896
+  2 | 2684422
+  3 | 3285948
+  4 | 3887474
+  5 | 1526260
+  6 | 2127786
+  7 | 2729312
+  8 | 3330838
+  9 | 3932364
+ 10 | 1481370
+ 11 | 2012472
+ 12 | 2587464
+ 13 | 3162456
+ 14 | 3737448
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t3.y, sum((t2.y + t3.y))
+   Group Key: t3.y
+   ->  Sort
+         Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+         Sort Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t2.x = t1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y))
+                           Group Key: t2.x, t3.y, t3.x
+                           ->  Incremental Sort
+                                 Output: t2.y, t2.x, t3.y, t3.x
+                                 Sort Key: t2.x, t3.y
+                                 Presorted Key: t2.x
+                                 ->  Merge Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Merge Cond: (t2.x = t3.x)
+                                       ->  Sort
+                                             Output: t2.y, t2.x
+                                             Sort Key: t2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                                   Output: t2.y, t2.x
+                                       ->  Sort
+                                             Output: t3.y, t3.x
+                                             Sort Key: t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+                     ->  Hash
+                           Output: t1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                 Output: t1.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t2_1.x = t1_1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                           Group Key: t2_1.x, t3_1.y, t3_1.x
+                           ->  Incremental Sort
+                                 Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                 Sort Key: t2_1.x, t3_1.y
+                                 Presorted Key: t2_1.x
+                                 ->  Merge Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Merge Cond: (t2_1.x = t3_1.x)
+                                       ->  Sort
+                                             Output: t2_1.y, t2_1.x
+                                             Sort Key: t2_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                                   Output: t2_1.y, t2_1.x
+                                       ->  Sort
+                                             Output: t3_1.y, t3_1.x
+                                             Sort Key: t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+                     ->  Hash
+                           Output: t1_1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                 Output: t1_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t2_2.x = t1_2.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                           Group Key: t2_2.x, t3_2.y, t3_2.x
+                           ->  Incremental Sort
+                                 Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                 Sort Key: t2_2.x, t3_2.y
+                                 Presorted Key: t2_2.x
+                                 ->  Merge Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Merge Cond: (t2_2.x = t3_2.x)
+                                       ->  Sort
+                                             Output: t2_2.y, t2_2.x
+                                             Sort Key: t2_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                                   Output: t2_2.y, t2_2.x
+                                       ->  Sort
+                                             Output: t3_2.y, t3_2.x
+                                             Sort Key: t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+                     ->  Hash
+                           Output: t1_2.x
+                           ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                 Output: t1_2.x
+(88 rows)
+
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y |   sum   
+---+---------
+ 0 | 1111110
+ 1 | 2000132
+ 2 | 2889154
+ 3 | 3778176
+ 4 | 4667198
+ 5 | 3334000
+ 6 | 4223022
+ 7 | 5112044
+ 8 | 6001066
+ 9 | 6890088
+(10 rows)
+
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+RESET geqo;
+RESET geqo_threshold;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.y, (sum(t2.y)), (count(*))
+   Sort Key: t1.y
+   ->  Finalize HashAggregate
+         Output: t1.y, sum(t2.y), count(*)
+         Group Key: t1.y
+         ->  Append
+               ->  Hash Join
+                     Output: t1.y, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.y, t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+               ->  Hash Join
+                     Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.y, t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+               ->  Hash Join
+                     Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.y, t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+               ->  Hash Join
+                     Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.y, t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+               ->  Hash Join
+                     Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.y, t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                                QUERY PLAN                                                
+----------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y)), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t3.y
+   ->  Finalize HashAggregate
+         Output: t3.y, sum((t2.y + t3.y)), count(*)
+         Group Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.y, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x, t3.y, t3.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.y, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x, t3_1.y, t3_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.y, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x, t3_2.y, t3_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.y, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x, t3_3.y, t3_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+               ->  Hash Join
+                     Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.y, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.y, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x, t3_4.y, t3_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+RESET geqo;
+RESET geqo_threshold;
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index 04079268b98..d0bb66f43da 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -2837,20 +2837,22 @@ select x.thousand, x.twothousand, count(*)
 from tenk1 x inner join tenk1 y on x.thousand = y.thousand
 group by x.thousand, x.twothousand
 order by x.thousand desc, x.twothousand;
-                                    QUERY PLAN                                    
-----------------------------------------------------------------------------------
- GroupAggregate
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Finalize GroupAggregate
    Group Key: x.thousand, x.twothousand
    ->  Incremental Sort
          Sort Key: x.thousand DESC, x.twothousand
          Presorted Key: x.thousand
          ->  Merge Join
                Merge Cond: (y.thousand = x.thousand)
-               ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
+               ->  Partial GroupAggregate
+                     Group Key: y.thousand
+                     ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
                ->  Sort
                      Sort Key: x.thousand DESC
                      ->  Seq Scan on tenk1 x
-(11 rows)
+(13 rows)
 
 reset enable_hashagg;
 reset enable_nestloop;
diff --git a/src/test/regress/expected/partition_aggregate.out b/src/test/regress/expected/partition_aggregate.out
index 5f2c0cf5786..1f56f55155b 100644
--- a/src/test/regress/expected/partition_aggregate.out
+++ b/src/test/regress/expected/partition_aggregate.out
@@ -13,6 +13,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation.
+SET enable_eager_aggregate TO off;
 --
 -- Tests for list partitioned tables.
 --
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 83228cfca29..3b37fafa65b 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -151,6 +151,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_async_append            | on
  enable_bitmapscan              | on
  enable_distinct_reordering     | on
+ enable_eager_aggregate         | on
  enable_gathermerge             | on
  enable_group_by_reordering     | on
  enable_hashagg                 | on
@@ -172,7 +173,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(24 rows)
+(25 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fbffc67ae60..f9450cdc477 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -123,7 +123,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # The stats test resets stats, so nothing else needing stats access can be in
 # this group.
 # ----------
-test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa eager_aggregate
 
 # event_trigger depends on create_am and cannot run concurrently with
 # any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 00000000000..8b1049ae3f3
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,225 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+--
+-- Test eager aggregation with GEQO
+--
+
+SET geqo = on;
+SET geqo_threshold = 2;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET geqo;
+RESET geqo_threshold;
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+RESET geqo;
+RESET geqo_threshold;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+RESET geqo;
+RESET geqo_threshold;
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/sql/partition_aggregate.sql b/src/test/regress/sql/partition_aggregate.sql
index ab070fee244..124cc260461 100644
--- a/src/test/regress/sql/partition_aggregate.sql
+++ b/src/test/regress/sql/partition_aggregate.sql
@@ -14,6 +14,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation.
+SET enable_eager_aggregate TO off;
 
 --
 -- Tests for list partitioned tables.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index a13e8162890..9a4567db01a 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -42,6 +42,7 @@ AfterTriggersTableData
 AfterTriggersTransData
 Agg
 AggClauseCosts
+AggClauseInfo
 AggInfo
 AggPath
 AggSplit
@@ -1110,6 +1111,7 @@ GroupPathExtraData
 GroupResultPath
 GroupState
 GroupVarInfo
+GroupingExprInfo
 GroupingFunc
 GroupingSet
 GroupingSetData
@@ -2473,6 +2475,7 @@ ReindexObjectType
 ReindexParams
 ReindexStmt
 ReindexType
+RelAggInfo
 RelFileLocator
 RelFileLocatorBackend
 RelFileNumber
-- 
2.39.5 (Apple Git-154)



  [application/octet-stream] v22-0002-Allow-negative-aggtransspace-to-indicate-unbound.patch (8.4K, 3-v22-0002-Allow-negative-aggtransspace-to-indicate-unbound.patch)
From ec282bb7fb963325a30a3e94375289aa5457004b Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 12 Sep 2025 13:11:47 +0900
Subject: [PATCH v22 2/2] Allow negative aggtransspace to indicate unbounded
 state size

This patch reuses the existing aggtransspace in pg_aggregate to
signal that an aggregate's transition state can grow unboundedly.  If
aggtransspace is set to a negative value, it now indicates that the
transition state may consume unpredictable or large amounts of memory,
such as in aggregates like array_agg or string_agg that accumulate
input rows.

This information can be used by the planner to avoid applying
memory-sensitive optimizations (e.g., eager aggregation) when there is
a risk of excessive memory usage during partial aggregation.

Bump catalog version.
---
 doc/src/sgml/catalogs.sgml               |  5 ++++-
 doc/src/sgml/ref/create_aggregate.sgml   | 11 ++++++++---
 src/backend/optimizer/plan/initsplan.c   | 23 +++++++----------------
 src/include/catalog/catversion.h         |  2 +-
 src/include/catalog/pg_aggregate.dat     | 10 ++++++----
 src/test/regress/expected/opr_sanity.out |  2 +-
 src/test/regress/sql/opr_sanity.sql      |  2 +-
 7 files changed, 28 insertions(+), 27 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index e9095bedf21..3acc2222a87 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -596,7 +596,10 @@
       </para>
       <para>
        Approximate average size (in bytes) of the transition state
-       data, or zero to use a default estimate
+       data. A positive value provides an estimate; zero means to
+       use a default estimate. A negative value indicates the state
+       data can grow unboundedly in size, such as when the aggregate
+       accumulates input rows (e.g., array_agg, string_agg).
       </para></entry>
      </row>
 
diff --git a/doc/src/sgml/ref/create_aggregate.sgml b/doc/src/sgml/ref/create_aggregate.sgml
index 222e0aa5c9d..0472ac2e874 100644
--- a/doc/src/sgml/ref/create_aggregate.sgml
+++ b/doc/src/sgml/ref/create_aggregate.sgml
@@ -384,9 +384,13 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
      <para>
       The approximate average size (in bytes) of the aggregate's state value.
       If this parameter is omitted or is zero, a default estimate is used
-      based on the <replaceable>state_data_type</replaceable>.
+      based on the <replaceable>state_data_type</replaceable>. If set to a
+      negative value, it indicates the state data can grow unboundedly in
+      size, such as when the aggregate accumulates input rows (e.g.,
+      array_agg, string_agg).
       The planner uses this value to estimate the memory required for a
-      grouped aggregate query.
+      grouped aggregate query and to avoid optimizations that may cause
+      excessive memory usage.
      </para>
     </listitem>
    </varlistentry>
@@ -568,7 +572,8 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
      <para>
       The approximate average size (in bytes) of the aggregate's state
       value, when using moving-aggregate mode.  This works the same as
-      <replaceable>state_data_size</replaceable>.
+      <replaceable>state_data_size</replaceable>, except that negative
+      values are not used to indicate unbounded state size.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 1b778f692d4..cb29c72c96c 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -716,19 +716,14 @@ setup_eager_aggregation(PlannerInfo *root)
 
 /*
  * is_partial_agg_memory_risky
- *	  Checks if any aggregate poses a risk of excessive memory usage during
+ *	  Check if any aggregate poses a risk of excessive memory usage during
  *	  partial aggregation.
  *
- * We check if any aggregate uses INTERNAL transition type.  Although INTERNAL
- * is marked as pass-by-value, it usually points to a large internal data
- * structure (like those used by string_agg or array_agg).  These transition
- * states can grow large and their size is hard to estimate.  Applying eager
- * aggregation in such cases risks high memory usage since partial aggregation
- * results might be stored in join hash tables or materialized nodes.
- *
- * We explicitly exclude aggregates with AVG_ACCUM transition function from
- * this check, based on the assumption that avg() and sum() are safe in this
- * context.
+ * We check if any aggregate has a negative aggtransspace value, which
+ * indicates that its transition state data can grow unboundedly in size.
+ * Applying eager aggregation in such cases risks high memory usage since
+ * partial aggregation results might be stored in join hash tables or
+ * materialized nodes.
  */
 static bool
 is_partial_agg_memory_risky(PlannerInfo *root)
@@ -739,11 +734,7 @@ is_partial_agg_memory_risky(PlannerInfo *root)
 	{
 		AggTransInfo *transinfo = lfirst_node(AggTransInfo, lc);
 
-		if (transinfo->transfn_oid == F_NUMERIC_AVG_ACCUM ||
-			transinfo->transfn_oid == F_INT8_AVG_ACCUM)
-			continue;
-
-		if (transinfo->aggtranstype == INTERNALOID)
+		if (transinfo->aggtransspace < 0)
 			return true;
 	}
 
diff --git a/src/include/catalog/catversion.h b/src/include/catalog/catversion.h
index ef0d0f92165..62b0af3e0c3 100644
--- a/src/include/catalog/catversion.h
+++ b/src/include/catalog/catversion.h
@@ -57,6 +57,6 @@
  */
 
 /*							yyyymmddN */
-#define CATALOG_VERSION_NO	202509091
+#define CATALOG_VERSION_NO	202509121
 
 #endif
diff --git a/src/include/catalog/pg_aggregate.dat b/src/include/catalog/pg_aggregate.dat
index d6aa1f6ec47..870769e8f14 100644
--- a/src/include/catalog/pg_aggregate.dat
+++ b/src/include/catalog/pg_aggregate.dat
@@ -558,26 +558,28 @@
   aggfinalfn => 'array_agg_finalfn', aggcombinefn => 'array_agg_combine',
   aggserialfn => 'array_agg_serialize',
   aggdeserialfn => 'array_agg_deserialize', aggfinalextra => 't',
-  aggtranstype => 'internal' },
+  aggtranstype => 'internal', aggtransspace => '-1' },
 { aggfnoid => 'array_agg(anyarray)', aggtransfn => 'array_agg_array_transfn',
   aggfinalfn => 'array_agg_array_finalfn',
   aggcombinefn => 'array_agg_array_combine',
   aggserialfn => 'array_agg_array_serialize',
   aggdeserialfn => 'array_agg_array_deserialize', aggfinalextra => 't',
-  aggtranstype => 'internal' },
+  aggtranstype => 'internal', aggtransspace => '-1' },
 
 # text
 { aggfnoid => 'string_agg(text,text)', aggtransfn => 'string_agg_transfn',
   aggfinalfn => 'string_agg_finalfn', aggcombinefn => 'string_agg_combine',
   aggserialfn => 'string_agg_serialize',
-  aggdeserialfn => 'string_agg_deserialize', aggtranstype => 'internal' },
+  aggdeserialfn => 'string_agg_deserialize',
+  aggtranstype => 'internal', aggtransspace => '-1' },
 
 # bytea
 { aggfnoid => 'string_agg(bytea,bytea)',
   aggtransfn => 'bytea_string_agg_transfn',
   aggfinalfn => 'bytea_string_agg_finalfn',
   aggcombinefn => 'string_agg_combine', aggserialfn => 'string_agg_serialize',
-  aggdeserialfn => 'string_agg_deserialize', aggtranstype => 'internal' },
+  aggdeserialfn => 'string_agg_deserialize',
+  aggtranstype => 'internal', aggtransspace => '-1' },
 
 # range
 { aggfnoid => 'range_intersect_agg(anyrange)',
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 20bf9ea9cdf..a357e1d0c0e 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -1470,7 +1470,7 @@ WHERE aggfnoid = 0 OR aggtransfn = 0 OR
     (aggkind = 'n' AND aggnumdirectargs > 0) OR
     aggfinalmodify NOT IN ('r', 's', 'w') OR
     aggmfinalmodify NOT IN ('r', 's', 'w') OR
-    aggtranstype = 0 OR aggtransspace < 0 OR aggmtransspace < 0;
+    aggtranstype = 0 OR aggmtransspace < 0;
  ctid | aggfnoid 
 ------+----------
 (0 rows)
diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql
index 2fb3a852878..cd674d7dbca 100644
--- a/src/test/regress/sql/opr_sanity.sql
+++ b/src/test/regress/sql/opr_sanity.sql
@@ -847,7 +847,7 @@ WHERE aggfnoid = 0 OR aggtransfn = 0 OR
     (aggkind = 'n' AND aggnumdirectargs > 0) OR
     aggfinalmodify NOT IN ('r', 's', 'w') OR
     aggmfinalmodify NOT IN ('r', 's', 'w') OR
-    aggtranstype = 0 OR aggtransspace < 0 OR aggmtransspace < 0;
+    aggtranstype = 0 OR aggmtransspace < 0;
 
 -- Make sure the matching pg_proc entry is sensible, too.
 
-- 
2.39.5 (Apple Git-154)




* Re: Eager aggregation, take 3

From: Robert Haas @ 2025-09-12 18:47 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Fri, Sep 12, 2025 at 5:34 AM Richard Guo <[email protected]> wrote:
> I really like this idea.  Currently, aggtransspace represents an
> estimate of the transition state size provided by the aggregate
> definition.  If it's set to zero, a default estimate based on the
> state data type is used.  Negative values currently have no defined
> meaning.  I think it makes perfect sense to reuse this field so that
> a negative value indicates that the transition state data can grow
> unboundedly in size.
>
> Attached 0002 implements this idea.  It requires fewer code changes
> than I expected.  This is mainly because our current code uses
> aggtransspace in such a way that if it's a positive value, that value
> is used as it's provided by the aggregate definition; otherwise, some
> heuristics are applied to estimate the size.  For the aggregates that
> accumulate input rows (e.g., array_agg, string_agg), I don't currently
> have a better heuristic for estimating their size, so I've chosen to
> keep the current logic.  This won't regress anything in estimating
> transition state data size.

This might be OK, but it's not what I was suggesting: I was suggesting
trying to do a calculation like space_used = -aggtransspace *
rowcount, not just using a <0 value as a sentinel.

-- 
Robert Haas
EDB: http://www.enterprisedb.com






* Re: Eager aggregation, take 3

From: Richard Guo @ 2025-09-13 08:27 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Sat, Sep 13, 2025 at 3:48 AM Robert Haas <[email protected]> wrote:
> On Fri, Sep 12, 2025 at 5:34 AM Richard Guo <[email protected]> wrote:
> > I really like this idea.  Currently, aggtransspace represents an
> > estimate of the transition state size provided by the aggregate
> > definition.  If it's set to zero, a default estimate based on the
> > state data type is used.  Negative values currently have no defined
> > meaning.  I think it makes perfect sense to reuse this field so that
> > a negative value indicates that the transition state data can grow
> > unboundedly in size.
> >
> > Attached 0002 implements this idea.  It requires fewer code changes
> > than I expected.  This is mainly because our current code uses
> > aggtransspace in such a way that if it's a positive value, that value
> > is used as it's provided by the aggregate definition; otherwise, some
> > heuristics are applied to estimate the size.  For the aggregates that
> > accumulate input rows (e.g., array_agg, string_agg), I don't currently
> > have a better heuristic for estimating their size, so I've chosen to
> > keep the current logic.  This won't regress anything in estimating
> > transition state data size.

> This might be OK, but it's not what I was suggesting: I was suggesting
> trying to do a calculation like space_used = -aggtransspace *
> rowcount, not just using a <0 value as a sentinel.

I've considered your suggestion, but I'm not sure I'll adopt it in the
end.  Here's why:

1) At the point where we check whether any aggregates might pose a
risk of excessive memory usage during partial aggregation, row count
information is not yet available.  You could argue that we could
reorganize the logic to perform this check once the row count is
available, but that seems quite tricky.  If I understand correctly, the
"rowcount" in this context actually means the number of rows within
one partial group.  That would require us to first decide on the
grouping expressions for the partial aggregation, then compute the
group row counts, then estimate space usage, and only then decide
whether memory usage is excessive and fall back.  This would come
quite late in planning and adds nontrivial overhead, compared to the
current approach which checks at the very beginning.

2) Even if we were able to estimate space usage based on the number of
rows per partial group and determined that memory usage seems
acceptable, we still couldn't guarantee that the transition state data
won't grow excessively after further joins.  Joins can multiply
partial aggregates, potentially causing a blowup in memory usage even
if the initial estimate seemed safe.

3) I don't think "-aggtransspace * rowcount" reflects the true memory
footprint for aggregates that accumulate input rows.  For example,
what if we have an aggregate like string_agg(somecolumn, 'a very long
delimiter')?

4) AFAICS, the main downside of the current approach compared to yours
is that it avoids pushing down aggregates like string_agg() that
accumulate input rows, whereas your suggestion might allow pushing
them down in some cases where we *think* it wouldn't blow up memory.
You might argue that the current implementation is over-conservative.
But I prefer to start safe.
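
To make the two readings of a negative aggtransspace concrete, here is a
minimal sketch.  It is not code from the patch: SketchTransInfo,
sketch_memory_risky() and sketch_scaled_space() are hypothetical names
used only for illustration, and only the one field discussed here is
modeled.  The first function mirrors the sentinel check in 0002; the
second shows the "-aggtransspace * rowcount" style of estimate, which
needs a per-group row count that, per point 1, is not available when the
check runs.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the planner's AggTransInfo; only the field
 * discussed in this thread is modeled.
 */
typedef struct SketchTransInfo
{
	int32_t		aggtransspace;	/* <0: unbounded, 0: default, >0: bytes */
} SketchTransInfo;

/*
 * Sentinel reading: any negative value marks the aggregate as risky,
 * regardless of its magnitude.
 */
static bool
sketch_memory_risky(const SketchTransInfo *infos, size_t n)
{
	for (size_t i = 0; i < n; i++)
	{
		if (infos[i].aggtransspace < 0)
			return true;
	}
	return false;
}

/*
 * Scaled reading: treat the magnitude of a negative value as bytes per
 * accumulated row.  rows_per_group is an assumed input.  Non-negative
 * values are returned unchanged; the default estimate used for zero is
 * not modeled here.
 */
static double
sketch_scaled_space(int32_t aggtransspace, double rows_per_group)
{
	if (aggtransspace < 0)
		return -(double) aggtransspace * rows_per_group;
	return (double) aggtransspace;
}
```

With the sentinel reading, a value of -1 and a value of -1000 behave
identically; with the scaled reading, the result depends entirely on how
good the rows_per_group estimate is, which is the concern raised in
points 1 to 3.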

That said, I appreciate you proposing the idea of reusing
aggtransspace, although I ended up using it in a different way than
you suggested.

- Richard






* Re: Eager aggregation, take 3

From: Richard Guo @ 2025-09-25 04:23 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

I've run TPC-DS again to compare planning times with and without eager
aggregation.  Out of 99 queries, only one query (query 64) shows a
noticeable increase in planning time.  This query performs inner joins
across 38 tables.  This is a very large search space.  (I'm talking
about the standard join search method, not GEQO.)

If my math doesn't fail me, the maximum number of different join
orders when joining n tables is: Catalan(n − 1) x n!.  For n = 38,
this number is astronomically large.  In practice, query 64 joins 19
tables twice (due to a CTE), which still results in about 3.4E28
different join orders.
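
For anyone who wants to evaluate the formula, here is a small sketch.
It relies on the identity Catalan(n - 1) * n! = (2n - 2)! / (n - 1)!,
which is simply the product of the integers from n to 2n - 2.  The
function name sketch_max_join_orders() is made up for this example, and
a double is used because the values overflow 64-bit integers long before
n = 38.

```c
/*
 * Upper bound on the number of join orders for n relations:
 *   Catalan(n - 1) * n!  ==  (2n - 2)! / (n - 1)!
 * computed as the product n * (n + 1) * ... * (2n - 2).
 * For n = 1 the product is empty and the result is 1.
 */
static double
sketch_max_join_orders(int n)
{
	double		result = 1.0;

	for (int k = n; k <= 2 * n - 2; k++)
		result *= (double) k;

	return result;
}
```

For small inputs the results are exact (2 relations give 2 orders, 3
give 12, 4 give 120).  Printing sketch_max_join_orders(38) with "%e"
shows the order of magnitude for the full 38-table case.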

Of course, join_collapse_limit and other heuristics reduce the
effective search space considerably, but it remains very large even
then.  Given this, I'm not too surprised that query 64 shows an
increase in planning time when eager aggregation is applied --
exploring the best join order in such a space is inherently expensive.

That said, I've identified a few performance hotspots that can be
optimized to help reduce planning time:

1) the exprs_known_equal() call in get_expression_sortgroupref(),
which is used to check if a given expression is known equal to a
grouping expression due to ECs.  We can optimize this by storing the
EC of each grouping expression, and then get_expression_sortgroupref()
would only need to search the relevant EC, rather than scanning all of
them.

2) the estimate_num_groups() call in create_rel_agg_info().  We can
optimize this by avoiding unnecessary calls to estimate_num_groups()
where possible.

Attached is an updated version of the patch with these optimizations
applied.  With this patch, the planning times for query 64, with and
without eager aggregation, are:

-- with eager aggregation
 Planning Time: 9432.042 ms
-- without eager aggregation
 Planning Time: 7196.999 ms

I think the increase in planning time is acceptable given the large
search space involved, though I may be biased.

- Richard


Attachments:

  [application/octet-stream] v23-0001-Implement-Eager-Aggregation.patch (187.4K, 2-v23-0001-Implement-Eager-Aggregation.patch)
  download | inline diff:
From 63d36fe266e5c8ab19079698a3ea5e9abb3218bd Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Tue, 11 Jun 2024 15:59:19 +0900
Subject: [PATCH v23 1/2] Implement Eager Aggregation

Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.

In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations.  To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node.  During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied.  If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
is currently not supported.

To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

The patch was originally proposed by Antonin Houska in 2017.  This
commit reworks various important aspects and rewrites most of the
current code.  However, the original patch and reviews were very
useful.

Author: Richard Guo <[email protected]>
Author: Antonin Houska <[email protected]> (in an older version)
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jian He <[email protected]>
Reviewed-by: Tender Wang <[email protected]>
Reviewed-by: Matheus Alcantara <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Reviewed-by: Tomas Vondra <[email protected]> (in an older version)
Reviewed-by: Andy Fan <[email protected]> (in an older version)
Reviewed-by: Ashutosh Bapat <[email protected]> (in an older version)
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
---
 .../postgres_fdw/expected/postgres_fdw.out    |   49 +-
 doc/src/sgml/config.sgml                      |   31 +
 src/backend/optimizer/README                  |  110 ++
 src/backend/optimizer/geqo/geqo_eval.c        |   21 +
 src/backend/optimizer/path/allpaths.c         |  469 +++++
 src/backend/optimizer/path/joinrels.c         |  193 ++
 src/backend/optimizer/plan/initsplan.c        |  379 ++++
 src/backend/optimizer/plan/planmain.c         |    9 +
 src/backend/optimizer/plan/planner.c          |  124 +-
 src/backend/optimizer/util/appendinfo.c       |   51 +
 src/backend/optimizer/util/relnode.c          |  650 +++++++
 src/backend/utils/misc/guc_parameters.dat     |   16 +
 src/backend/utils/misc/postgresql.conf.sample |    2 +
 src/include/nodes/pathnodes.h                 |  121 ++
 src/include/optimizer/pathnode.h              |    6 +
 src/include/optimizer/paths.h                 |    6 +
 src/include/optimizer/planmain.h              |    1 +
 .../regress/expected/collate.icu.utf8.out     |   32 +-
 src/test/regress/expected/eager_aggregate.out | 1584 +++++++++++++++++
 src/test/regress/expected/join.out            |   12 +-
 .../regress/expected/partition_aggregate.out  |    2 +
 src/test/regress/expected/sysviews.out        |    3 +-
 src/test/regress/parallel_schedule            |    2 +-
 src/test/regress/sql/eager_aggregate.sql      |  225 +++
 src/test/regress/sql/partition_aggregate.sql  |    2 +
 src/tools/pgindent/typedefs.list              |    3 +
 26 files changed, 4029 insertions(+), 74 deletions(-)
 create mode 100644 src/test/regress/expected/eager_aggregate.out
 create mode 100644 src/test/regress/sql/eager_aggregate.sql

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 6dc04e916dc..f5a57b9cbd5 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3701,30 +3701,33 @@ select count(t1.c3) from ft2 t1 left join ft2 t2 on (t1.c1 = random() * t2.c2);
 -- Subquery in FROM clause having aggregate
 explain (verbose, costs off)
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
-                                          QUERY PLAN                                           
------------------------------------------------------------------------------------------------
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
  Sort
-   Output: (count(*)), x.b
-   Sort Key: (count(*)), x.b
-   ->  HashAggregate
-         Output: count(*), x.b
-         Group Key: x.b
-         ->  Hash Join
-               Output: x.b
-               Inner Unique: true
-               Hash Cond: (ft1.c2 = x.a)
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.c2
-                     Remote SQL: SELECT c2 FROM "S 1"."T 1"
-               ->  Hash
-                     Output: x.b, x.a
-                     ->  Subquery Scan on x
-                           Output: x.b, x.a
-                           ->  Foreign Scan
-                                 Output: ft1_1.c2, (sum(ft1_1.c1))
-                                 Relations: Aggregate on (public.ft1 ft1_1)
-                                 Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
-(21 rows)
+   Output: (count(*)), (sum(ft1_1.c1))
+   Sort Key: (count(*)), (sum(ft1_1.c1))
+   ->  Finalize GroupAggregate
+         Output: count(*), (sum(ft1_1.c1))
+         Group Key: (sum(ft1_1.c1))
+         ->  Sort
+               Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+               Sort Key: (sum(ft1_1.c1))
+               ->  Hash Join
+                     Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+                     Hash Cond: (ft1_1.c2 = ft1.c2)
+                     ->  Foreign Scan
+                           Output: ft1_1.c2, (sum(ft1_1.c1))
+                           Relations: Aggregate on (public.ft1 ft1_1)
+                           Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
+                     ->  Hash
+                           Output: ft1.c2, (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: ft1.c2, PARTIAL count(*)
+                                 Group Key: ft1.c2
+                                 ->  Foreign Scan on public.ft1
+                                       Output: ft1.c2
+                                       Remote SQL: SELECT c2 FROM "S 1"."T 1"
+(24 rows)
 
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
  count |   b   
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e9b420f3ddb..39e658b7808 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5475,6 +5475,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-eager-aggregate" xreflabel="enable_eager_aggregate">
+      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_eager_aggregate</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's ability to partially push
+        aggregation past a join, and finalize it once all the relations are
+        joined. The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
       <term><varname>enable_gathermerge</varname> (<type>boolean</type>)
       <indexterm>
@@ -6095,6 +6110,22 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-min-eager-agg-group-size" xreflabel="min_eager_agg_group_size">
+      <term><varname>min_eager_agg_group_size</varname> (<type>floating point</type>)
+      <indexterm>
+       <primary><varname>min_eager_agg_group_size</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Sets the minimum average group size required to consider applying
+        eager aggregation. This helps avoid the overhead of eager
+        aggregation when it does not offer significant row count reduction.
+        The default is <literal>8</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-jit-above-cost" xreflabel="jit_above_cost">
       <term><varname>jit_above_cost</varname> (<type>floating point</type>)
       <indexterm>
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 843368096fd..6c35baceedb 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1500,3 +1500,113 @@ breaking down aggregation or grouping over a partitioned relation into
 aggregation or grouping over its partitions is called partitionwise
 aggregation.  Especially when the partition keys match the GROUP BY clause,
 this can be significantly faster than the regular method.
+
+Eager aggregation
+-----------------
+
+Eager aggregation is a query optimization technique that partially
+pushes aggregation past a join, and finalizes it once all the
+relations are joined.  Eager aggregation may reduce the number of
+input rows to the join and thus could result in a better overall plan.
+
+To prove that the transformation is correct, let's first consider the
+case where only inner joins are involved.  In this case, we partition
+the tables in the FROM clause into two groups: those that contain at
+least one aggregation column, and those that do not contain any
+aggregation columns.  Each group can be treated as a single relation
+formed by the Cartesian product of the tables within that group.
+Therefore, without loss of generality, we can assume that the FROM
+clause contains exactly two relations, R1 and R2, where R1 represents
+the relation containing all aggregation columns, and R2 represents the
+relation without any aggregation columns.
+
+Let the query be of the form:
+
+SELECT G, AGG(A)
+FROM R1 JOIN R2 ON J
+GROUP BY G;
+
+where G is the set of grouping keys that may include columns from R1
+and/or R2; AGG(A) is an aggregate function over columns A from R1; J
+is the join condition between R1 and R2.
+
+The transformation of eager aggregation is:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 ON J)
+
+This equivalence holds under the following conditions:
+
+1) AGG is decomposable, meaning that it can be computed in two stages:
+a partial aggregation followed by a final aggregation;
+2) The set G1 used in the pre-aggregation of R1 includes:
+    * all columns from R1 that are part of the grouping keys G, and
+    * all columns from R1 that appear in the join condition J.
+3) The grouping operator for any column in G1 must be compatible with
+the operator used for that column in the join condition J.
+
+Since G1 includes all columns from R1 that appear in either the
+grouping keys G or the join condition J, all rows within each partial
+group have identical values for both the grouping keys and the
+join-relevant columns from R1, assuming compatible operators are used.
+As a result, the rows within a partial group are indistinguishable in
+terms of their contribution to the aggregation and their behavior in
+the join.  This ensures that all rows in the same partial group share
+the same "destiny": they either all match or all fail to match a given
+row in R2.  Because the aggregate function AGG is decomposable,
+aggregating the partial results after the join yields the same final
+result as aggregating after the full join, thereby preserving query
+semantics.  Q.E.D.
+
+In the case where there are any outer joins, the situation becomes
+more complex due to join order constraints and the semantics of
+null-extension in outer joins.  If the relations that contain at least
+one aggregation column cannot be treated as a single relation because
+of the join order constraints, partial aggregation paths will not be
+generated, and thus the transformation is not applicable.  Otherwise,
+let R1 be the relation containing all aggregation columns, and R2, R3,
+... be the remaining relations.  From the inner join case, under the
+aforementioned conditions, we have the equivalence:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 JOIN R3 ...)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 JOIN R3 ...)
+
+To preserve correctness when outer joins are involved, we require an
+additional condition:
+
+4) R1 must not be on the nullable side of any outer join.
+
+This condition ensures that partial aggregation over R1 does not
+suppress any null-extended rows that would be introduced by outer
+joins.  If R1 is on the nullable side of an outer join, the
+null-extended rows produced by the outer join would not be available
+when we perform the partial aggregation, while with a
+non-eager-aggregation plan these rows are available for the top-level
+aggregation.  Pushing partial aggregation in this case may cause the
+rows to be grouped differently than expected, or may produce incorrect
+values from the aggregate functions.
+
+During the construction of the join tree, we evaluate each base or
+join relation to determine if eager aggregation can be applied.  If
+feasible, we create a separate RelOptInfo called a "grouped relation"
+and generate grouped paths by adding sorted and hashed partial
+aggregation paths on top of the non-grouped paths.  To limit planning
+time, we consider only the cheapest or suitably-sorted non-grouped
+paths in this step.
+
+Another way to generate grouped paths is to join a grouped relation
+with a non-grouped relation.  Joining two grouped relations is
+currently not supported.
+
+To further limit planning time, we currently adopt a strategy where
+partial aggregation is pushed only to the lowest feasible level in the
+join tree where it provides a significant reduction in row count.
+This strategy also helps ensure that all grouped paths for the same
+grouped relation produce the same set of rows, which is important to
+support a fundamental assumption of the planner.
+
+If we have generated a grouped relation for the topmost join relation,
+we need to finalize its paths at the end.  The final paths will
+compete in the usual way with paths built from regular planning.
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index f07d1dc8ac6..4a65f955ca6 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -279,6 +279,27 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
 				/* Find and save the cheapest paths for this joinrel */
 				set_cheapest(joinrel);
 
+				/*
+				 * Except for the topmost scan/join rel, consider generating
+				 * partial aggregation paths for the grouped relation on top
+				 * of the paths of this rel.  After that, we're done creating
+				 * paths for the grouped relation, so run set_cheapest().
+				 */
+				if (!bms_equal(joinrel->relids, root->all_query_rels))
+				{
+					RelOptInfo *grouped_rel;
+
+					grouped_rel = joinrel->grouped_rel;
+					if (grouped_rel)
+					{
+						Assert(IS_GROUPED_REL(grouped_rel));
+
+						generate_grouped_paths(root, grouped_rel, joinrel,
+											   grouped_rel->agg_info);
+						set_cheapest(grouped_rel);
+					}
+				}
+
 				/* Absorb new clump into old */
 				old_clump->joinrel = joinrel;
 				old_clump->size += new_clump->size;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 6cc6966b060..ee298970427 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -47,6 +48,7 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -77,7 +79,9 @@ typedef enum pushdown_safe_type
 
 /* These parameters are set by GUC */
 bool		enable_geqo = false;	/* just in case GUC doesn't set it */
+bool		enable_eager_aggregate = true;
 int			geqo_threshold;
+double		min_eager_agg_group_size;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
 
@@ -90,6 +94,7 @@ join_search_hook_type join_search_hook = NULL;
 
 static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_base_grouped_rels(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 						 Index rti, RangeTblEntry *rte);
@@ -114,6 +119,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 								Index rti, RangeTblEntry *rte);
 static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 									Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
 static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 										 List *live_childrels,
 										 List *all_child_pathkeys);
@@ -182,6 +188,11 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	 */
 	set_base_rel_sizes(root);
 
+	/*
+	 * Build grouped relations for base rels where possible.
+	 */
+	setup_base_grouped_rels(root);
+
 	/*
 	 * We should now have size estimates for every actual table involved in
 	 * the query, and we also know which if any have been deleted from the
@@ -323,6 +334,39 @@ set_base_rel_sizes(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_base_grouped_rels
+ *	  For each base relation, build a grouped base relation if eager
+ *	  aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_base_grouped_rels(PlannerInfo *root)
+{
+	Index		rti;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo *rel = root->simple_rel_array[rti];
+
+		/* there may be empty slots corresponding to non-baserel RTEs */
+		if (rel == NULL)
+			continue;
+
+		Assert(rel->relid == rti);	/* sanity check on array */
+		Assert(IS_SIMPLE_REL(rel)); /* sanity check on rel */
+
+		(void) build_simple_grouped_rel(root, rel);
+	}
+}
+
 /*
  * set_base_rel_pathlists
  *	  Finds all paths available for scanning each base-relation entry.
@@ -559,6 +603,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* Now find the cheapest of the paths for this rel */
 	set_cheapest(rel);
 
+	/*
+	 * If a grouped relation for this rel exists, build partial aggregation
+	 * paths for it.
+	 *
+	 * Note that this can only happen after we've called set_cheapest() for
+	 * this base rel, because we need its cheapest paths.
+	 */
+	set_grouped_rel_pathlist(root, rel);
+
 #ifdef OPTIMIZER_DEBUG
 	pprint(rel);
 #endif
@@ -1305,6 +1358,36 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	add_paths_to_append_rel(root, rel, live_childrels);
 }
 
+/*
+ * set_grouped_rel_pathlist
+ *	  If a grouped relation for the given 'rel' exists, build partial
+ *	  aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Add paths to the grouped base relation if one exists. */
+	grouped_rel = rel->grouped_rel;
+	if (grouped_rel)
+	{
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		generate_grouped_paths(root, grouped_rel, rel,
+							   grouped_rel->agg_info);
+		set_cheapest(grouped_rel);
+	}
+}
+
 
 /*
  * add_paths_to_append_rel
@@ -3335,6 +3418,344 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
 	}
 }
 
+/*
+ * generate_grouped_paths
+ *		Generate paths for a grouped relation by adding sorted and hashed
+ *		partial aggregation paths on top of paths of the ungrouped base or join
+ *		relation.
+ *
+ * The information needed is provided by the RelAggInfo structure.
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *grouped_rel,
+					   RelOptInfo *rel, RelAggInfo *agg_info)
+{
+	AggClauseCosts agg_costs;
+	bool		can_hash;
+	bool		can_sort;
+	Path	   *cheapest_total_path = NULL;
+	Path	   *cheapest_partial_path = NULL;
+	double		dNumGroups = 0;
+	double		dNumPartialGroups = 0;
+	List	   *group_pathkeys = NIL;
+
+	if (IS_DUMMY_REL(rel))
+	{
+		mark_dummy_rel(grouped_rel);
+		return;
+	}
+
+	/*
+	 * We push partial aggregation only to the lowest possible level in the
+	 * join tree that is deemed useful.
+	 */
+	if (!bms_equal(agg_info->apply_at, rel->relids) ||
+		!agg_info->agg_useful)
+		return;
+
+	MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+	get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+	/*
+	 * Determine whether it's possible to perform sort-based implementations
+	 * of grouping, and generate the pathkeys that represent the grouping
+	 * requirements in that case.
+	 */
+	can_sort = grouping_is_sortable(agg_info->group_clauses);
+	if (can_sort)
+	{
+		RelOptInfo *top_grouped_rel;
+		List	   *top_group_tlist;
+
+		top_grouped_rel = IS_OTHER_REL(rel) ?
+			rel->top_parent->grouped_rel : grouped_rel;
+		top_group_tlist =
+			make_tlist_from_pathtarget(top_grouped_rel->agg_info->target);
+
+		group_pathkeys =
+			make_pathkeys_for_sortclauses(root, agg_info->group_clauses,
+										  top_group_tlist);
+	}
+
+	/*
+	 * Determine whether we should consider hash-based implementations of
+	 * grouping.
+	 */
+	Assert(root->numOrderedAggs == 0);
+	can_hash = (agg_info->group_clauses != NIL &&
+				grouping_is_hashable(agg_info->group_clauses));
+
+	/*
+	 * Consider whether we should generate partially aggregated non-partial
+	 * paths.  We can only do this if we have a non-partial path.
+	 */
+	if (rel->pathlist != NIL)
+	{
+		cheapest_total_path = rel->cheapest_total_path;
+		Assert(cheapest_total_path != NULL);
+	}
+
+	/*
+	 * If parallelism is possible for grouped_rel, then we should consider
+	 * generating partially-grouped partial paths.  However, if the ungrouped
+	 * rel has no partial paths, then we can't.
+	 */
+	if (grouped_rel->consider_parallel && rel->partial_pathlist != NIL)
+	{
+		cheapest_partial_path = linitial(rel->partial_pathlist);
+		Assert(cheapest_partial_path != NULL);
+	}
+
+	/* Estimate number of partial groups. */
+	if (cheapest_total_path != NULL)
+		dNumGroups = estimate_num_groups(root,
+										 agg_info->group_exprs,
+										 cheapest_total_path->rows,
+										 NULL, NULL);
+	if (cheapest_partial_path != NULL)
+		dNumPartialGroups = estimate_num_groups(root,
+												agg_info->group_exprs,
+												cheapest_partial_path->rows,
+												NULL, NULL);
+
+	if (can_sort && cheapest_total_path != NULL)
+	{
+		ListCell   *lc;
+
+		/*
+		 * Use any available suitably-sorted path as input, and also consider
+		 * sorting the cheapest-total path and incremental sort on any paths
+		 * with presorted keys.
+		 *
+		 * To save planning time, we ignore parameterized input paths unless
+		 * they are the cheapest-total path.
+		 */
+		foreach(lc, rel->pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Ignore parameterized paths that are not the cheapest-total
+			 * path.
+			 */
+			if (input_path->param_info &&
+				input_path != cheapest_total_path)
+				continue;
+
+			is_sorted = pathkeys_count_contained_in(group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest total path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_total_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumGroups);
+
+			add_path(grouped_rel, path);
+		}
+	}
+
+	if (can_sort && cheapest_partial_path != NULL)
+	{
+		ListCell   *lc;
+
+		/* Similar to above logic, but for partial paths. */
+		foreach(lc, rel->partial_pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			is_sorted = pathkeys_count_contained_in(group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest partial path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_partial_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumPartialGroups);
+
+			add_partial_path(grouped_rel, path);
+		}
+	}
+
+	/*
+	 * Add a partially-grouped HashAgg Path where possible
+	 */
+	if (can_hash && cheapest_total_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_total_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumGroups);
+
+		add_path(grouped_rel, path);
+	}
+
+	/*
+	 * Now add a partially-grouped HashAgg partial Path where possible
+	 */
+	if (can_hash && cheapest_partial_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_partial_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumPartialGroups);
+
+		add_partial_path(grouped_rel, path);
+	}
+}
+
 /*
  * make_rel_from_joinlist
  *	  Build access paths using a "joinlist" to guide the join path search.
@@ -3494,6 +3915,10 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 		 *
 		 * After that, we're done creating paths for the joinrel, so run
 		 * set_cheapest().
+		 *
+		 * In addition, we also run generate_grouped_paths() for the grouped
+		 * relation of each just-processed joinrel, and run set_cheapest() for
+		 * the grouped relation afterwards.
 		 */
 		foreach(lc, root->join_rel_level[lev])
 		{
@@ -3514,6 +3939,27 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 			/* Find and save the cheapest paths for this rel */
 			set_cheapest(rel);
 
+			/*
+			 * Except for the topmost scan/join rel, consider generating
+			 * partial aggregation paths for the grouped relation on top of
+			 * the paths of this rel.  After that, we're done creating paths
+			 * for the grouped relation, so run set_cheapest().
+			 */
+			if (!bms_equal(rel->relids, root->all_query_rels))
+			{
+				RelOptInfo *grouped_rel;
+
+				grouped_rel = rel->grouped_rel;
+				if (grouped_rel)
+				{
+					Assert(IS_GROUPED_REL(grouped_rel));
+
+					generate_grouped_paths(root, grouped_rel, rel,
+										   grouped_rel->agg_info);
+					set_cheapest(grouped_rel);
+				}
+			}
+
 #ifdef OPTIMIZER_DEBUG
 			pprint(rel);
 #endif
@@ -4383,6 +4829,29 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 		if (IS_DUMMY_REL(child_rel))
 			continue;
 
+		/*
+		 * Except for the topmost scan/join rel, consider generating partial
+		 * aggregation paths for the grouped relation on top of the paths of
+		 * this partitioned child-join.  After that, we're done creating paths
+		 * for the grouped relation, so run set_cheapest().
+		 */
+		if (!bms_equal(IS_OTHER_REL(rel) ?
+					   rel->top_parent_relids : rel->relids,
+					   root->all_query_rels))
+		{
+			RelOptInfo *grouped_rel;
+
+			grouped_rel = child_rel->grouped_rel;
+			if (grouped_rel)
+			{
+				Assert(IS_GROUPED_REL(grouped_rel));
+
+				generate_grouped_paths(root, grouped_rel, child_rel,
+									   grouped_rel->agg_info);
+				set_cheapest(grouped_rel);
+			}
+		}
+
 #ifdef OPTIMIZER_DEBUG
 		pprint(child_rel);
 #endif
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 535248aa525..240eda53696 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -16,6 +16,7 @@
 
 #include "miscadmin.h"
 #include "optimizer/appendinfo.h"
+#include "optimizer/cost.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -36,6 +37,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
 static bool restriction_is_constant_false(List *restrictlist,
 										  RelOptInfo *joinrel,
 										  bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+								  RelOptInfo *rel2, RelOptInfo *joinrel,
+								  SpecialJoinInfo *sjinfo, List *restrictlist);
 static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 										RelOptInfo *rel2, RelOptInfo *joinrel,
 										SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -762,6 +766,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 		return joinrel;
 	}
 
+	/* Build a grouped join relation for 'joinrel' if possible. */
+	make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+						  restrictlist);
+
 	/* Add paths to the join relation. */
 	populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
 								restrictlist);
@@ -873,6 +881,186 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
 	return input_relids;
 }
 
+/*
+ * make_grouped_join_rel
+ *	  Build a grouped join relation for the given "joinrel" if eager
+ *	  aggregation is applicable and the resulting grouped paths are considered
+ *	  useful.
+ *
+ * There are two strategies for generating grouped paths for a join relation:
+ *
+ * 1. Join a grouped (partially aggregated) input relation with a non-grouped
+ * input (e.g., AGG(B) JOIN A).
+ *
+ * 2. Apply partial aggregation (sorted or hashed) on top of existing
+ * non-grouped join paths (e.g., AGG(A JOIN B)).
+ *
+ * To limit planning effort and avoid an explosion of alternatives, we adopt a
+ * strategy where partial aggregation is only pushed to the lowest possible
+ * level in the join tree that is deemed useful.  That is, if grouped paths can
+ * be built using the first strategy, we skip consideration of the second
+ * strategy for the same join level.
+ *
+ * Additionally, if there are multiple lowest useful levels where partial
+ * aggregation could be applied, such as in a join tree with relations A, B,
+ * and C where both "AGG(A JOIN B) JOIN C" and "A JOIN AGG(B JOIN C)" are valid
+ * placements, we choose only the first one encountered during join search.
+ * This avoids generating multiple versions of the same grouped relation based
+ * on different aggregation placements.
+ *
+ * These heuristics also ensure that all grouped paths for the same grouped
+ * relation produce the same set of rows, which is a basic assumption in the
+ * planner.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+					  RelOptInfo *rel2, RelOptInfo *joinrel,
+					  SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+	RelOptInfo *grouped_rel;
+	RelOptInfo *grouped_rel1;
+	RelOptInfo *grouped_rel2;
+	bool		rel1_empty;
+	bool		rel2_empty;
+	Relids		agg_apply_at;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Retrieve the grouped relations for the two input rels */
+	grouped_rel1 = rel1->grouped_rel;
+	grouped_rel2 = rel2->grouped_rel;
+
+	rel1_empty = (grouped_rel1 == NULL || IS_DUMMY_REL(grouped_rel1));
+	rel2_empty = (grouped_rel2 == NULL || IS_DUMMY_REL(grouped_rel2));
+
+	/* Find or construct a grouped joinrel for this joinrel */
+	grouped_rel = joinrel->grouped_rel;
+	if (grouped_rel == NULL)
+	{
+		RelAggInfo *agg_info = NULL;
+
+		/*
+		 * Prepare the information needed to create grouped paths for this
+		 * join relation.
+		 */
+		agg_info = create_rel_agg_info(root, joinrel, rel1_empty == rel2_empty);
+		if (agg_info == NULL)
+			return;
+
+		/*
+		 * If grouped paths for the given join relation are not considered
+		 * useful, and no grouped paths can be built by joining grouped input
+		 * relations, skip building the grouped join relation.
+		 */
+		if (!agg_info->agg_useful &&
+			(rel1_empty == rel2_empty))
+			return;
+
+		/* build the grouped relation */
+		grouped_rel = build_grouped_rel(root, joinrel);
+		grouped_rel->reltarget = agg_info->target;
+
+		if (rel1_empty != rel2_empty)
+		{
+			/*
+			 * If there is exactly one grouped input relation, then we can
+			 * build grouped paths by joining the input relations.  Set size
+			 * estimates for the grouped join relation based on the input
+			 * relations, and update the lowest join level where partial
+			 * aggregation is applied to that of the grouped input relation.
+			 */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			agg_info->apply_at = rel1_empty ?
+				grouped_rel2->agg_info->apply_at :
+				grouped_rel1->agg_info->apply_at;
+		}
+		else
+		{
+			/*
+			 * Otherwise, grouped paths can be built by applying partial
+			 * aggregation on top of existing non-grouped join paths.  Set
+			 * size estimates for the grouped join relation based on the
+			 * estimated number of groups, and track the lowest join level
+			 * where partial aggregation is applied.  Note that these values
+			 * may be updated later if it is determined that grouped paths can
+			 * be constructed by joining other input relations.
+			 */
+			grouped_rel->rows = agg_info->grouped_rows;
+			agg_info->apply_at = bms_copy(joinrel->relids);
+		}
+
+		grouped_rel->agg_info = agg_info;
+		joinrel->grouped_rel = grouped_rel;
+	}
+
+	Assert(IS_GROUPED_REL(grouped_rel));
+
+	/* We may have already proven this grouped join relation to be dummy. */
+	if (IS_DUMMY_REL(grouped_rel))
+		return;
+
+	/*
+	 * Nothing to do if there's no grouped input relation.  Also, joining two
+	 * grouped relations is not currently supported.
+	 */
+	if (rel1_empty == rel2_empty)
+		return;
+
+	/*
+	 * Get the lowest join level where partial aggregation is applied among
+	 * the given input relations.
+	 */
+	agg_apply_at = rel1_empty ?
+		grouped_rel2->agg_info->apply_at :
+		grouped_rel1->agg_info->apply_at;
+
+	/*
+	 * If it's not the designated level, skip building grouped paths.
+	 *
+	 * One exception is when it is a subset of the previously recorded level.
+	 * In that case, we need to update the designated level to this one, and
+	 * adjust the size estimates for the grouped join relation accordingly.
+	 * For example, suppose partial aggregation can be applied on top of (B
+	 * JOIN C).  If we first construct the join as ((A JOIN B) JOIN C), we'd
+	 * record the designated level as including all three relations (A B C).
+	 * Later, when we consider (A JOIN (B JOIN C)), we encounter the smaller
+	 * (B C) join level directly.  Since this is a subset of the previous
+	 * level and still valid for partial aggregation, we update the designated
+	 * level to (B C), and adjust the size estimates accordingly.
+	 */
+	if (!bms_equal(agg_apply_at, grouped_rel->agg_info->apply_at))
+	{
+		if (bms_is_subset(agg_apply_at, grouped_rel->agg_info->apply_at))
+		{
+			/* Adjust the size estimates for the grouped join relation. */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			grouped_rel->agg_info->apply_at = agg_apply_at;
+		}
+		else
+			return;
+	}
+
+	/* Make paths for the grouped join relation. */
+	populate_joinrel_with_paths(root,
+								rel1_empty ? rel1 : grouped_rel1,
+								rel2_empty ? rel2 : grouped_rel2,
+								grouped_rel,
+								sjinfo,
+								restrictlist);
+}
+
 /*
  * populate_joinrel_with_paths
  *	  Add paths to the given joinrel for given pair of joining relations. The
@@ -1615,6 +1803,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 						 adjust_child_relids(joinrel->relids,
 											 nappinfos, appinfos)));
 
+		/* Build a grouped join relation for 'child_joinrel' if possible */
+		make_grouped_join_rel(root, child_rel1, child_rel2,
+							  child_joinrel, child_sjinfo,
+							  child_restrictlist);
+
 		/* And make paths for the child join */
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 3e3fec89252..1af43bb60d2 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/nbtree.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
@@ -31,6 +32,7 @@
 #include "optimizer/restrictinfo.h"
 #include "parser/analyze.h"
 #include "rewrite/rewriteManip.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/typcache.h"
@@ -81,6 +83,12 @@ typedef struct JoinTreeItem
 } JoinTreeItem;
 
 
+static bool is_partial_agg_memory_risky(PlannerInfo *root);
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
+static EquivalenceClass *get_eclass_for_sortgroupclause(PlannerInfo *root,
+														SortGroupClause *sgc,
+														Expr *expr);
 static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
 									   Index rtindex);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -628,6 +636,377 @@ remove_useless_groupby_columns(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_eager_aggregation
+ *	  Check if eager aggregation is applicable, and if so collect suitable
+ *	  aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+	/*
+	 * Don't apply eager aggregation if disabled by user.
+	 */
+	if (!enable_eager_aggregate)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if there are no available GROUP BY
+	 * clauses.
+	 */
+	if (!root->processed_groupClause)
+		return;
+
+	/*
+	 * For now we don't try to support grouping sets.
+	 */
+	if (root->parse->groupingSets)
+		return;
+
+	/*
+	 * For now we don't try to support DISTINCT or ORDER BY aggregates.
+	 */
+	if (root->numOrderedAggs > 0)
+		return;
+
+	/*
+	 * If there are any aggregates that do not support partial mode, or any
+	 * partial aggregates that are non-serializable, do not apply eager
+	 * aggregation.
+	 */
+	if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+		return;
+
+	/*
+	 * We don't try to apply eager aggregation if there are set-returning
+	 * functions in targetlist.
+	 */
+	if (root->parse->hasTargetSRFs)
+		return;
+
+	/*
+	 * Eager aggregation only makes sense if there are multiple base rels in
+	 * the query.
+	 */
+	if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if any aggregate poses a risk of
+	 * excessive memory usage during partial aggregation.
+	 */
+	if (is_partial_agg_memory_risky(root))
+		return;
+
+	/*
+	 * Collect aggregate expressions and plain Vars that appear in the
+	 * targetlist and havingQual.
+	 */
+	create_agg_clause_infos(root);
+
+	/*
+	 * If there are no suitable aggregate expressions, we cannot apply eager
+	 * aggregation.
+	 */
+	if (root->agg_clause_list == NIL)
+		return;
+
+	/*
+	 * Collect grouping expressions that appear in grouping clauses.
+	 */
+	create_grouping_expr_infos(root);
+}
+
+/*
+ * is_partial_agg_memory_risky
+ *	  Checks if any aggregate poses a risk of excessive memory usage during
+ *	  partial aggregation.
+ *
+ * We check if any aggregate uses INTERNAL transition type.  Although INTERNAL
+ * is marked as pass-by-value, it usually points to a large internal data
+ * structure (like those used by string_agg or array_agg).  These transition
+ * states can grow large and their size is hard to estimate.  Applying eager
+ * aggregation in such cases risks high memory usage since partial aggregation
+ * results might be stored in join hash tables or materialized nodes.
+ *
+ * We explicitly exclude aggregates whose transition function is
+ * numeric_avg_accum or int8_avg_accum from this check, on the assumption
+ * that avg() and sum() are safe in this context.
+ */
+static bool
+is_partial_agg_memory_risky(PlannerInfo *root)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->aggtransinfos)
+	{
+		AggTransInfo *transinfo = lfirst_node(AggTransInfo, lc);
+
+		if (transinfo->transfn_oid == F_NUMERIC_AVG_ACCUM ||
+			transinfo->transfn_oid == F_INT8_AVG_ACCUM)
+			continue;
+
+		if (transinfo->aggtranstype == INTERNALOID)
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * create_agg_clause_infos
+ *	  Search the targetlist and havingQual for Aggrefs and plain Vars, and
+ *	  create an AggClauseInfo for each Aggref node.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+	List	   *tlist_exprs;
+	List	   *agg_clause_list = NIL;
+	List	   *tlist_vars = NIL;
+	Relids		aggregate_relids = NULL;
+	bool		eager_agg_applicable = true;
+	ListCell   *lc;
+
+	Assert(root->agg_clause_list == NIL);
+	Assert(root->tlist_vars == NIL);
+
+	tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+								  PVC_INCLUDE_AGGREGATES |
+								  PVC_RECURSE_WINDOWFUNCS |
+								  PVC_RECURSE_PLACEHOLDERS);
+
+	/*
+	 * Aggregates within the HAVING clause need to be processed in the same
+	 * way as those in the targetlist.  Note that HAVING can contain Aggrefs
+	 * but not WindowFuncs.
+	 */
+	if (root->parse->havingQual != NULL)
+	{
+		List	   *having_exprs;
+
+		having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+									   PVC_INCLUDE_AGGREGATES |
+									   PVC_RECURSE_PLACEHOLDERS);
+		if (having_exprs != NIL)
+		{
+			tlist_exprs = list_concat(tlist_exprs, having_exprs);
+			list_free(having_exprs);
+		}
+	}
+
+	foreach(lc, tlist_exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Aggref	   *aggref;
+		Relids		agg_eval_at;
+		AggClauseInfo *ac_info;
+
+		/* For now we don't try to support GROUPING() expressions */
+		if (IsA(expr, GroupingFunc))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* Collect plain Vars for future reference */
+		if (IsA(expr, Var))
+		{
+			tlist_vars = list_append_unique(tlist_vars, expr);
+			continue;
+		}
+
+		aggref = castNode(Aggref, expr);
+
+		Assert(aggref->aggorder == NIL);
+		Assert(aggref->aggdistinct == NIL);
+
+		/*
+		 * If there are any securityQuals, do not try to apply eager
+		 * aggregation if any non-leakproof aggregate functions are present.
+		 * This is overly strict, but for now...
+		 */
+		if (root->qual_security_level > 0 &&
+			!get_func_leakproof(aggref->aggfnoid))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+		/*
+		 * If all base relations in the query are referenced by aggregate
+		 * functions, then eager aggregation is not applicable.
+		 */
+		aggregate_relids = bms_add_members(aggregate_relids, agg_eval_at);
+		if (bms_is_subset(root->all_baserels, aggregate_relids))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* OK, create the AggClauseInfo node */
+		ac_info = makeNode(AggClauseInfo);
+		ac_info->aggref = aggref;
+		ac_info->agg_eval_at = agg_eval_at;
+
+		/* ... and add it to the list */
+		agg_clause_list = list_append_unique(agg_clause_list, ac_info);
+	}
+
+	list_free(tlist_exprs);
+
+	if (eager_agg_applicable)
+	{
+		root->agg_clause_list = agg_clause_list;
+		root->tlist_vars = tlist_vars;
+	}
+	else
+	{
+		list_free_deep(agg_clause_list);
+		list_free(tlist_vars);
+	}
+}
+
+/*
+ * create_grouping_expr_infos
+ *	  Create a GroupingExprInfo for each expression usable as grouping key.
+ *
+ * If any grouping expression is not suitable, we will just return with
+ * root->group_expr_list being NIL.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+	List	   *exprs = NIL;
+	List	   *sortgrouprefs = NIL;
+	List	   *ecs = NIL;
+	ListCell   *lc,
+			   *lc1,
+			   *lc2,
+			   *lc3;
+
+	Assert(root->group_expr_list == NIL);
+
+	foreach(lc, root->processed_groupClause)
+	{
+		SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+		TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+		TypeCacheEntry *tce;
+		Oid			equalimageproc;
+
+		Assert(tle->ressortgroupref > 0);
+
+		/*
+		 * For now we only support plain Vars as grouping expressions.
+		 */
+		if (!IsA(tle->expr, Var))
+			return;
+
+		/*
+		 * Eager aggregation is only possible if equality implies image
+		 * equality for each grouping key.  Otherwise, placing keys with
+		 * different byte images into the same group may result in the loss of
+		 * information that could be necessary to evaluate upper qual clauses.
+		 *
+		 * For instance, the NUMERIC data type is not supported, as values
+		 * that are considered equal by the equality operator (e.g., 0 and
+		 * 0.0) can have different scales.
+		 */
+		tce = lookup_type_cache(exprType((Node *) tle->expr),
+								TYPECACHE_BTREE_OPFAMILY);
+		if (!OidIsValid(tce->btree_opf) ||
+			!OidIsValid(tce->btree_opintype))
+			return;
+
+		equalimageproc = get_opfamily_proc(tce->btree_opf,
+										   tce->btree_opintype,
+										   tce->btree_opintype,
+										   BTEQUALIMAGE_PROC);
+		if (!OidIsValid(equalimageproc) ||
+			!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+											   tce->typcollation,
+											   ObjectIdGetDatum(tce->btree_opintype))))
+			return;
+
+		exprs = lappend(exprs, tle->expr);
+		sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+		ecs = lappend(ecs, get_eclass_for_sortgroupclause(root, sgc, tle->expr));
+	}
+
+	/*
+	 * Construct a GroupingExprInfo for each expression.
+	 */
+	forthree(lc1, exprs, lc2, sortgrouprefs, lc3, ecs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc1);
+		int			sortgroupref = lfirst_int(lc2);
+		EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc3);
+		GroupingExprInfo *ge_info;
+
+		ge_info = makeNode(GroupingExprInfo);
+		ge_info->expr = (Expr *) copyObject(expr);
+		ge_info->sortgroupref = sortgroupref;
+		ge_info->ec = ec;
+
+		root->group_expr_list = lappend(root->group_expr_list, ge_info);
+	}
+}
+
+/*
+ * get_eclass_for_sortgroupclause
+ *	  Given a group clause and an expression, find an existing equivalence
+ *	  class that the expression is a member of; return NULL if none.
+ */
+static EquivalenceClass *
+get_eclass_for_sortgroupclause(PlannerInfo *root, SortGroupClause *sgc,
+							   Expr *expr)
+{
+	Oid			opfamily,
+				opcintype,
+				collation;
+	CompareType cmptype;
+	Oid			equality_op;
+	List	   *opfamilies;
+
+	/* Punt if the group clause is not sortable */
+	if (!OidIsValid(sgc->sortop))
+		return NULL;
+
+	/* Find the operator in pg_amop --- failure shouldn't happen */
+	if (!get_ordering_op_properties(sgc->sortop,
+									&opfamily, &opcintype, &cmptype))
+		elog(ERROR, "operator %u is not a valid ordering operator",
+			 sgc->sortop);
+
+	/* Because SortGroupClause doesn't carry collation, consult the expr */
+	collation = exprCollation((Node *) expr);
+
+	/*
+	 * EquivalenceClasses need to contain opfamily lists based on the family
+	 * membership of mergejoinable equality operators, which could belong to
+	 * more than one opfamily.  So we have to look up the opfamily's equality
+	 * operator and get its membership.
+	 */
+	equality_op = get_opfamily_member_for_cmptype(opfamily,
+												  opcintype,
+												  opcintype,
+												  COMPARE_EQ);
+	if (!OidIsValid(equality_op))	/* shouldn't happen */
+		elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
+			 COMPARE_EQ, opcintype, opcintype, opfamily);
+	opfamilies = get_mergejoin_opfamilies(equality_op);
+	if (!opfamilies)			/* certainly should find some */
+		elog(ERROR, "could not find opfamilies for equality operator %u",
+			 equality_op);
+
+	/* Now find a matching EquivalenceClass */
+	return get_eclass_for_sort_expr(root, expr, opfamilies, opcintype,
+									collation, sgc->tleSortGroupRef,
+									NULL, false);
+}
+
 /*****************************************************************************
  *
  *	  LATERAL REFERENCES
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 5467e094ca7..eefc486a566 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -76,6 +76,9 @@ query_planner(PlannerInfo *root,
 	root->placeholder_list = NIL;
 	root->placeholder_array = NULL;
 	root->placeholder_array_size = 0;
+	root->agg_clause_list = NIL;
+	root->group_expr_list = NIL;
+	root->tlist_vars = NIL;
 	root->fkey_list = NIL;
 	root->initial_rels = NIL;
 
@@ -265,6 +268,12 @@ query_planner(PlannerInfo *root,
 	 */
 	extract_restriction_or_clauses(root);
 
+	/*
+	 * Check if eager aggregation is applicable, and if so, set up
+	 * root->agg_clause_list, root->tlist_vars, and root->group_expr_list.
+	 */
+	setup_eager_aggregation(root);
+
 	/*
 	 * Now expand appendrels by adding "otherrels" for their children.  We
 	 * delay this to the end so that we have as much information as possible
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 41bd8353430..462c5335589 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -232,7 +232,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 									  RelOptInfo *partially_grouped_rel,
 									  const AggClauseCosts *agg_costs,
 									  grouping_sets_data *gd,
-									  double dNumGroups,
 									  GroupPathExtraData *extra);
 static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
 												 RelOptInfo *grouped_rel,
@@ -4010,9 +4009,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 							   GroupPathExtraData *extra,
 							   RelOptInfo **partially_grouped_rel_p)
 {
-	Path	   *cheapest_path = input_rel->cheapest_total_path;
 	RelOptInfo *partially_grouped_rel = NULL;
-	double		dNumGroups;
 	PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
 
 	/*
@@ -4094,23 +4091,16 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Gather any partially grouped partial paths. */
 	if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
-	{
 		gather_grouping_paths(root, partially_grouped_rel);
-		set_cheapest(partially_grouped_rel);
-	}
 
-	/*
-	 * Estimate number of groups.
-	 */
-	dNumGroups = get_number_of_groups(root,
-									  cheapest_path->rows,
-									  gd,
-									  extra->targetList);
+	/* Now choose the best path(s) for partially_grouped_rel. */
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+		set_cheapest(partially_grouped_rel);
 
 	/* Build final grouping paths */
 	add_paths_to_grouping_rel(root, input_rel, grouped_rel,
 							  partially_grouped_rel, agg_costs, gd,
-							  dNumGroups, extra);
+							  extra);
 
 	/* Give a helpful error if we failed to find any implementation */
 	if (grouped_rel->pathlist == NIL)
@@ -7055,16 +7045,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 						  RelOptInfo *grouped_rel,
 						  RelOptInfo *partially_grouped_rel,
 						  const AggClauseCosts *agg_costs,
-						  grouping_sets_data *gd, double dNumGroups,
+						  grouping_sets_data *gd,
 						  GroupPathExtraData *extra)
 {
 	Query	   *parse = root->parse;
 	Path	   *cheapest_path = input_rel->cheapest_total_path;
+	Path	   *cheapest_partially_grouped_path = NULL;
 	ListCell   *lc;
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 	List	   *havingQual = (List *) extra->havingQual;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+	double		dNumGroups = 0;
+	double		dNumFinalGroups = 0;
+
+	/*
+	 * Estimate number of groups for non-split aggregation.
+	 */
+	dNumGroups = get_number_of_groups(root,
+									  cheapest_path->rows,
+									  gd,
+									  extra->targetList);
+
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+	{
+		cheapest_partially_grouped_path =
+			partially_grouped_rel->cheapest_total_path;
+
+		/*
+		 * Estimate number of groups for final phase of partial aggregation.
+		 */
+		dNumFinalGroups =
+			get_number_of_groups(root,
+								 cheapest_partially_grouped_path->rows,
+								 gd,
+								 extra->targetList);
+	}
 
 	if (can_sort)
 	{
@@ -7177,7 +7193,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 					path = make_ordered_path(root,
 											 grouped_rel,
 											 path,
-											 partially_grouped_rel->cheapest_total_path,
+											 cheapest_partially_grouped_path,
 											 info->pathkeys,
 											 -1.0);
 
@@ -7195,7 +7211,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												 info->clauses,
 												 havingQual,
 												 agg_final_costs,
-												 dNumGroups));
+												 dNumFinalGroups));
 					else
 						add_path(grouped_rel, (Path *)
 								 create_group_path(root,
@@ -7203,7 +7219,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												   path,
 												   info->clauses,
 												   havingQual,
-												   dNumGroups));
+												   dNumFinalGroups));
 
 				}
 			}
@@ -7245,19 +7261,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 		 */
 		if (partially_grouped_rel && partially_grouped_rel->pathlist)
 		{
-			Path	   *path = partially_grouped_rel->cheapest_total_path;
-
 			add_path(grouped_rel, (Path *)
 					 create_agg_path(root,
 									 grouped_rel,
-									 path,
+									 cheapest_partially_grouped_path,
 									 grouped_rel->reltarget,
 									 AGG_HASHED,
 									 AGGSPLIT_FINAL_DESERIAL,
 									 root->processed_groupClause,
 									 havingQual,
 									 agg_final_costs,
-									 dNumGroups));
+									 dNumFinalGroups));
 		}
 	}
 
@@ -7297,6 +7311,7 @@ create_partial_grouping_paths(PlannerInfo *root,
 {
 	Query	   *parse = root->parse;
 	RelOptInfo *partially_grouped_rel;
+	RelOptInfo *eager_agg_rel = NULL;
 	AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
 	Path	   *cheapest_partial_path = NULL;
@@ -7307,6 +7322,15 @@ create_partial_grouping_paths(PlannerInfo *root,
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 
+	/*
+	 * Check whether any partially aggregated paths have been generated
+	 * through eager aggregation.
+	 */
+	if (input_rel->grouped_rel &&
+		!IS_DUMMY_REL(input_rel->grouped_rel) &&
+		input_rel->grouped_rel->pathlist != NIL)
+		eager_agg_rel = input_rel->grouped_rel;
+
 	/*
 	 * Consider whether we should generate partially aggregated non-partial
 	 * paths.  We can only do this if we have a non-partial path, and only if
@@ -7328,11 +7352,13 @@ create_partial_grouping_paths(PlannerInfo *root,
 
 	/*
 	 * If we can't partially aggregate partial paths, and we can't partially
-	 * aggregate non-partial paths, then don't bother creating the new
+	 * aggregate non-partial paths, and no partially aggregated paths were
+	 * generated by eager aggregation, then don't bother creating the new
 	 * RelOptInfo at all, unless the caller specified force_rel_creation.
 	 */
 	if (cheapest_total_path == NULL &&
 		cheapest_partial_path == NULL &&
+		eager_agg_rel == NULL &&
 		!force_rel_creation)
 		return NULL;
 
@@ -7557,6 +7583,51 @@ create_partial_grouping_paths(PlannerInfo *root,
 										 dNumPartialPartialGroups));
 	}
 
+	/*
+	 * Add any partially aggregated paths generated by eager aggregation to
+	 * the new upper relation after applying projection steps as needed.
+	 */
+	if (eager_agg_rel)
+	{
+		/* Add the paths */
+		foreach(lc, eager_agg_rel->pathlist)
+		{
+			Path	   *path = (Path *) lfirst(lc);
+
+			/* Shouldn't have any parameterized paths anymore */
+			Assert(path->param_info == NULL);
+
+			path = (Path *) create_projection_path(root,
+												   partially_grouped_rel,
+												   path,
+												   partially_grouped_rel->reltarget);
+
+			add_path(partially_grouped_rel, path);
+		}
+
+		/*
+		 * Likewise add the partial paths, but only if parallelism is possible
+		 * for partially_grouped_rel.
+		 */
+		if (partially_grouped_rel->consider_parallel)
+		{
+			foreach(lc, eager_agg_rel->partial_pathlist)
+			{
+				Path	   *path = (Path *) lfirst(lc);
+
+				/* Shouldn't have any parameterized paths anymore */
+				Assert(path->param_info == NULL);
+
+				path = (Path *) create_projection_path(root,
+													   partially_grouped_rel,
+													   path,
+													   partially_grouped_rel->reltarget);
+
+				add_partial_path(partially_grouped_rel, path);
+			}
+		}
+	}
+
 	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding partially grouped ForeignPaths.
@@ -8120,13 +8191,6 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
 
 		add_paths_to_append_rel(root, partially_grouped_rel,
 								partially_grouped_live_children);
-
-		/*
-		 * We need call set_cheapest, since the finalization step will use the
-		 * cheapest path from the rel.
-		 */
-		if (partially_grouped_rel->pathlist)
-			set_cheapest(partially_grouped_rel);
 	}
 
 	/* If possible, create append paths for fully grouped children. */
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index 5b3dc0d8653..69b8b0c2ae0 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -516,6 +516,57 @@ adjust_appendrel_attrs_mutator(Node *node,
 		return (Node *) newinfo;
 	}
 
+	/*
+	 * We have to process RelAggInfo nodes specially.
+	 */
+	if (IsA(node, RelAggInfo))
+	{
+		RelAggInfo *oldinfo = (RelAggInfo *) node;
+		RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+		newinfo->target = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+										   context);
+
+		newinfo->agg_input = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+										   context);
+
+		newinfo->group_clauses = oldinfo->group_clauses;
+
+		newinfo->group_exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+										   context);
+
+		return (Node *) newinfo;
+	}
+
+	/*
+	 * We have to process PathTarget nodes specially.
+	 */
+	if (IsA(node, PathTarget))
+	{
+		PathTarget *oldtarget = (PathTarget *) node;
+		PathTarget *newtarget = makeNode(PathTarget);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+		newtarget->exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+										   context);
+
+		if (oldtarget->sortgrouprefs)
+		{
+			Size		nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+			newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+			memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+		}
+
+		return (Node *) newtarget;
+	}
+
 	/*
 	 * NOTE: we do not need to recurse into sublinks, because they should
 	 * already have been converted to subplans before we see them.
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 0e523d2eb5b..e5bab59fbbe 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,8 @@
 
 #include <limits.h>
 
+#include "access/nbtree.h"
+#include "catalog/pg_constraint.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
@@ -27,12 +29,16 @@
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
 #include "parser/parse_relation.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/typcache.h"
 
 
 typedef struct JoinHashEntry
@@ -83,6 +89,14 @@ static void build_child_join_reltarget(PlannerInfo *root,
 									   RelOptInfo *childrel,
 									   int nappinfos,
 									   AppendRelInfo **appinfos);
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+													RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+								  PathTarget *target, PathTarget *agg_input,
+								  List **group_clauses, List **group_exprs);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
 
 
 /*
@@ -278,6 +292,8 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	rel->joininfo = NIL;
 	rel->has_eclass_joins = false;
 	rel->consider_partitionwise_join = false;	/* might get changed later */
+	rel->agg_info = NULL;
+	rel->grouped_rel = NULL;
 	rel->part_scheme = NULL;
 	rel->nparts = -1;
 	rel->boundinfo = NULL;
@@ -408,6 +424,103 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	return rel;
 }
 
+/*
+ * build_simple_grouped_rel
+ *	  Construct a new RelOptInfo representing a grouped version of the input
+ *	  base relation.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+	RelAggInfo *agg_info;
+
+	/*
+	 * Aggregate expressions and grouping expressions must be available by
+	 * now; otherwise we would not have reached this point.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/* nothing to do for dummy rel */
+	if (IS_DUMMY_REL(rel))
+		return NULL;
+
+	/*
+	 * Prepare the information needed to create grouped paths for this base
+	 * relation.
+	 */
+	agg_info = create_rel_agg_info(root, rel, true);
+	if (agg_info == NULL)
+		return NULL;
+
+	/*
+	 * If grouped paths for the given base relation are not considered useful,
+	 * skip building the grouped relation.
+	 */
+	if (!agg_info->agg_useful)
+		return NULL;
+
+	/* Tracks the lowest join level at which partial aggregation is applied */
+	agg_info->apply_at = bms_copy(rel->relids);
+
+	/* build the grouped relation */
+	grouped_rel = build_grouped_rel(root, rel);
+	grouped_rel->reltarget = agg_info->target;
+	grouped_rel->rows = agg_info->grouped_rows;
+	grouped_rel->agg_info = agg_info;
+
+	rel->grouped_rel = grouped_rel;
+
+	return grouped_rel;
+}
+
+/*
+ * build_grouped_rel
+ *	  Build a grouped relation by flat copying the input relation and resetting
+ *	  the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	grouped_rel = makeNode(RelOptInfo);
+	memcpy(grouped_rel, rel, sizeof(RelOptInfo));
+
+	/*
+	 * clear path info
+	 */
+	grouped_rel->pathlist = NIL;
+	grouped_rel->ppilist = NIL;
+	grouped_rel->partial_pathlist = NIL;
+	grouped_rel->cheapest_startup_path = NULL;
+	grouped_rel->cheapest_total_path = NULL;
+	grouped_rel->cheapest_parameterized_paths = NIL;
+
+	/*
+	 * clear partition info
+	 */
+	grouped_rel->part_scheme = NULL;
+	grouped_rel->nparts = -1;
+	grouped_rel->boundinfo = NULL;
+	grouped_rel->partbounds_merged = false;
+	grouped_rel->partition_qual = NIL;
+	grouped_rel->part_rels = NULL;
+	grouped_rel->live_parts = NULL;
+	grouped_rel->all_partrels = NULL;
+	grouped_rel->partexprs = NULL;
+	grouped_rel->nullable_partexprs = NULL;
+	grouped_rel->consider_partitionwise_join = false;
+
+	/*
+	 * clear size estimates
+	 */
+	grouped_rel->rows = 0;
+
+	return grouped_rel;
+}
+
 /*
  * find_base_rel
  *	  Find a base or otherrel relation entry, which must already exist.
@@ -759,6 +872,8 @@ build_join_rel(PlannerInfo *root,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = NULL;
 	joinrel->top_parent = NULL;
 	joinrel->top_parent_relids = NULL;
@@ -945,6 +1060,8 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = parent_joinrel;
 	joinrel->top_parent = parent_joinrel->top_parent ? parent_joinrel->top_parent : parent_joinrel;
 	joinrel->top_parent_relids = joinrel->top_parent->relids;
@@ -2523,3 +2640,536 @@ build_child_join_reltarget(PlannerInfo *root,
 	childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
 	childrel->reltarget->width = parentrel->reltarget->width;
 }
+
+/*
+ * create_rel_agg_info
+ *	  Create the RelAggInfo structure for the given relation if it can produce
+ *	  grouped paths.  The given relation is the non-grouped one which has the
+ *	  reltarget already constructed.
+ *
+ * calculate_grouped_rows: if true, calculate the estimated number of grouped
+ * rows for the relation.  If false, skip the estimation to avoid unnecessary
+ * planning overhead.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel,
+					bool calculate_grouped_rows)
+{
+	ListCell   *lc;
+	RelAggInfo *result;
+	PathTarget *agg_input;
+	PathTarget *target;
+	List	   *group_clauses = NIL;
+	List	   *group_exprs = NIL;
+
+	/*
+	 * The lists of aggregate expressions and grouping expressions should have
+	 * been constructed.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/*
+	 * If this is a child rel, the grouped rel for its top parent has already
+	 * been created, if that was possible.  So we can just use that parent's
+	 * RelAggInfo if there is one, with appropriate variable substitutions.
+	 */
+	if (IS_OTHER_REL(rel))
+	{
+		RelOptInfo *grouped_rel;
+		RelAggInfo *agg_info;
+
+		grouped_rel = rel->top_parent->grouped_rel;
+		if (grouped_rel == NULL)
+			return NULL;
+
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		/* Must do multi-level transformation */
+		agg_info = (RelAggInfo *)
+			adjust_appendrel_attrs_multilevel(root,
+											  (Node *) grouped_rel->agg_info,
+											  rel,
+											  rel->top_parent);
+
+		agg_info->apply_at = NULL;	/* caller will change this later */
+
+		if (calculate_grouped_rows)
+		{
+			agg_info->grouped_rows =
+				estimate_num_groups(root, agg_info->group_exprs,
+									rel->rows, NULL, NULL);
+
+			/*
+			 * The grouped paths for the given relation are considered useful
+			 * iff the average group size is no less than
+			 * min_eager_agg_group_size.
+			 */
+			agg_info->agg_useful =
+				(rel->rows / agg_info->grouped_rows) >= min_eager_agg_group_size;
+		}
+
+		return agg_info;
+	}
+
+	/* Check if it's possible to produce grouped paths for this relation. */
+	if (!eager_aggregation_possible_for_relation(root, rel))
+		return NULL;
+
+	/*
+	 * Create targets for the grouped paths and for the input paths of the
+	 * grouped paths.
+	 */
+	target = create_empty_pathtarget();
+	agg_input = create_empty_pathtarget();
+
+	/* ... and initialize these targets */
+	if (!init_grouping_targets(root, rel, target, agg_input,
+							   &group_clauses, &group_exprs))
+		return NULL;
+
+	/*
+	 * Eager aggregation is not applicable if there are no available grouping
+	 * expressions.
+	 */
+	if (list_length(group_clauses) == 0)
+		return NULL;
+
+	/* Add aggregates to the grouping target */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		Aggref	   *aggref;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		aggref = (Aggref *) copyObject(ac_info->aggref);
+		mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+		add_column_to_pathtarget(target, (Expr *) aggref, 0);
+	}
+
+	/* Set the estimated eval cost and output width for both targets */
+	set_pathtarget_cost_width(root, target);
+	set_pathtarget_cost_width(root, agg_input);
+
+	/* build the RelAggInfo result */
+	result = makeNode(RelAggInfo);
+	result->target = target;
+	result->agg_input = agg_input;
+	result->group_clauses = group_clauses;
+	result->group_exprs = group_exprs;
+	result->apply_at = NULL;	/* caller will change this later */
+
+	if (calculate_grouped_rows)
+	{
+		result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+												   rel->rows, NULL, NULL);
+
+		/*
+		 * The grouped paths for the given relation are considered useful iff
+		 * the average group size is no less than min_eager_agg_group_size.
+		 */
+		result->agg_useful =
+			(rel->rows / result->grouped_rows) >= min_eager_agg_group_size;
+	}
+
+	return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ *	  Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	int			cur_relid;
+
+	/*
+	 * Check to see if the given relation is on the nullable side of an outer
+	 * join.  In this case, we cannot push a partial aggregation down to the
+	 * relation, because the NULL-extended rows produced by the outer join
+	 * would not be available when we perform the partial aggregation, while
+	 * with a non-eager-aggregation plan these rows are available for the
+	 * top-level aggregation.  Doing so may result in the rows being grouped
+	 * differently than expected, or produce incorrect values from the
+	 * aggregate functions.
+	 */
+	cur_relid = -1;
+	while ((cur_relid = bms_next_member(rel->relids, cur_relid)) >= 0)
+	{
+		RelOptInfo *baserel = find_base_rel_ignore_join(root, cur_relid);
+
+		if (baserel == NULL)
+			continue;			/* ignore outer joins in rel->relids */
+
+		if (!bms_is_subset(baserel->nulling_relids, rel->relids))
+			return false;
+	}
+
+	/*
+	 * For now we don't try to support PlaceHolderVars.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = lfirst(lc);
+
+		if (IsA(expr, PlaceHolderVar))
+			return false;
+	}
+
+	/* Caller should only pass base relations or joins. */
+	Assert(rel->reloptkind == RELOPT_BASEREL ||
+		   rel->reloptkind == RELOPT_JOINREL);
+
+	/*
+	 * Check if all aggregate expressions can be evaluated on this relation
+	 * level.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		/*
+		 * Give up if any aggregate requires relations other than the current
+		 * one.  If the aggregate requires the current relation plus
+		 * additional relations, grouping the current relation could make some
+		 * input rows unavailable for the higher aggregate and may reduce the
+		 * number of input rows it receives.  If the aggregate does not
+		 * require the current relation at all, it should not be grouped, as
+		 * we do not support joining two grouped relations.
+		 */
+		if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * init_grouping_targets
+ *	  Initialize the target for grouped paths (target) as well as the target
+ *	  for paths that generate input for the grouped paths (agg_input).
+ *
+ * We also construct the list of SortGroupClauses and the list of grouping
+ * expressions for the partial aggregation, and return them in *group_clauses
+ * and *group_exprs.
+ *
+ * Return true if the targets could be initialized, false otherwise.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+					  PathTarget *target, PathTarget *agg_input,
+					  List **group_clauses, List **group_exprs)
+{
+	ListCell   *lc;
+	List	   *possibly_dependent = NIL;
+	Index		maxSortGroupRef;
+
+	/* Identify the max sortgroupref */
+	maxSortGroupRef = 0;
+	foreach(lc, root->processed_tlist)
+	{
+		Index		ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+		if (ref > maxSortGroupRef)
+			maxSortGroupRef = ref;
+	}
+
+	/*
+	 * At this point, all Vars from this relation that are needed by upper
+	 * joins or are required in the final targetlist should already be present
+	 * in its reltarget.  Therefore, we can safely iterate over this
+	 * relation's reltarget->exprs to construct the PathTarget and grouping
+	 * clauses for the grouped paths.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Index		sortgroupref;
+
+		/*
+		 * Given that PlaceHolderVar currently prevents us from doing eager
+		 * aggregation, the source target cannot contain anything more complex
+		 * than a Var.
+		 */
+		Assert(IsA(expr, Var));
+
+		/*
+		 * Get the sortgroupref of the expr if it is found among, or can be
+		 * deduced from, the original grouping expressions.
+		 */
+		sortgroupref = get_expression_sortgroupref(root, expr);
+		if (sortgroupref > 0)
+		{
+			SortGroupClause *sgc;
+
+			/* Find the matching SortGroupClause */
+			sgc = get_sortgroupref_clause(sortgroupref, root->processed_groupClause);
+			Assert(sgc->tleSortGroupRef <= maxSortGroupRef);
+
+			/*
+			 * If the target expression is to be used as a grouping key, it
+			 * should be emitted by the grouped paths that have been pushed
+			 * down to this relation level.
+			 */
+			add_column_to_pathtarget(target, expr, sortgroupref);
+
+			/*
+			 * ... and it also should be emitted by the input paths.
+			 */
+			add_column_to_pathtarget(agg_input, expr, sortgroupref);
+
+			/*
+			 * Record this SortGroupClause and grouping expression.  Note that
+			 * this SortGroupClause might have already been recorded.
+			 */
+			if (!list_member(*group_clauses, sgc))
+			{
+				*group_clauses = lappend(*group_clauses, sgc);
+				*group_exprs = lappend(*group_exprs, expr);
+			}
+		}
+		else if (is_var_needed_by_join(root, (Var *) expr, rel))
+		{
+			/*
+			 * The expression is needed for an upper join but is neither in
+			 * the GROUP BY clause nor derivable from it using EC (otherwise,
+			 * it would have already been included in the targets above).  We
+			 * need to create a special SortGroupClause for this expression.
+			 *
+			 * It is important to include such expressions in the grouping
+			 * keys.  This is essential to ensure that an aggregated row from
+			 * the partial aggregation matches the other side of the join if
+			 * and only if each row in the partial group does.  This ensures
+			 * that all rows within the same partial group share the same
+			 * 'destiny', which is crucial for maintaining correctness.
+			 */
+			SortGroupClause *sgc;
+			TypeCacheEntry *tce;
+			Oid			equalimageproc;
+
+			/*
+			 * But first, check if equality implies image equality for this
+			 * expression.  If not, we cannot use it as a grouping key.  See
+			 * comments in create_grouping_expr_infos().
+			 */
+			tce = lookup_type_cache(exprType((Node *) expr),
+									TYPECACHE_BTREE_OPFAMILY);
+			if (!OidIsValid(tce->btree_opf) ||
+				!OidIsValid(tce->btree_opintype))
+				return false;
+
+			equalimageproc = get_opfamily_proc(tce->btree_opf,
+											   tce->btree_opintype,
+											   tce->btree_opintype,
+											   BTEQUALIMAGE_PROC);
+			if (!OidIsValid(equalimageproc) ||
+				!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+												   tce->typcollation,
+												   ObjectIdGetDatum(tce->btree_opintype))))
+				return false;
+
+			/* Create the SortGroupClause. */
+			sgc = makeNode(SortGroupClause);
+
+			/* Initialize the SortGroupClause. */
+			sgc->tleSortGroupRef = ++maxSortGroupRef;
+			get_sort_group_operators(exprType((Node *) expr),
+									 false, true, false,
+									 &sgc->sortop, &sgc->eqop, NULL,
+									 &sgc->hashable);
+
+			/* This expression should be emitted by the grouped paths */
+			add_column_to_pathtarget(target, expr, sgc->tleSortGroupRef);
+
+			/* ... and it also should be emitted by the input paths. */
+			add_column_to_pathtarget(agg_input, expr, sgc->tleSortGroupRef);
+
+			/* Record this SortGroupClause and grouping expression */
+			*group_clauses = lappend(*group_clauses, sgc);
+			*group_exprs = lappend(*group_exprs, expr);
+		}
+		else if (is_var_in_aggref_only(root, (Var *) expr))
+		{
+			/*
+			 * The expression is referenced by an aggregate function pushed
+			 * down to this relation and does not appear elsewhere in the
+			 * targetlist or havingQual.  Add it to 'agg_input' but not to
+			 * 'target'.
+			 */
+			add_new_column_to_pathtarget(agg_input, expr);
+		}
+		else
+		{
+			/*
+			 * The expression may be functionally dependent on other
+			 * expressions in the target, but we cannot verify this until all
+			 * target expressions have been constructed.
+			 */
+			possibly_dependent = lappend(possibly_dependent, expr);
+		}
+	}
+
+	/*
+	 * Now we can verify whether an expression is functionally dependent on
+	 * others.
+	 */
+	foreach(lc, possibly_dependent)
+	{
+		Var		   *tvar;
+		List	   *deps = NIL;
+		RangeTblEntry *rte;
+
+		tvar = lfirst_node(Var, lc);
+		rte = root->simple_rte_array[tvar->varno];
+
+		if (check_functional_grouping(rte->relid, tvar->varno,
+									  tvar->varlevelsup,
+									  target->exprs, &deps))
+		{
+			/*
+			 * The expression is functionally dependent on other target
+			 * expressions, so it can be included in the targets.  Since it
+			 * will not be used as a grouping key, a sortgroupref is not
+			 * needed for it.
+			 */
+			add_new_column_to_pathtarget(target, (Expr *) tvar);
+			add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+		}
+		else
+		{
+			/*
+			 * We may arrive here with a grouping expression that is proven
+			 * redundant by EquivalenceClass processing, such as 't1.a' in the
+			 * query below.
+			 *
+			 * select max(t1.c) from t t1, t t2 where t1.a = 1 group by t1.a,
+			 * t1.b;
+			 *
+			 * For now we just give up in this case.
+			 */
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ *	  Check whether the given Var appears in aggregate expressions and not
+ *	  elsewhere in the targetlist or havingQual.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+	ListCell   *lc;
+
+	/*
+	 * Search the list of aggregate expressions for the Var.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		List	   *vars;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+			continue;
+
+		vars = pull_var_clause((Node *) ac_info->aggref,
+							   PVC_RECURSE_AGGREGATES |
+							   PVC_RECURSE_WINDOWFUNCS |
+							   PVC_RECURSE_PLACEHOLDERS);
+
+		if (list_member(vars, var))
+		{
+			list_free(vars);
+			break;
+		}
+
+		list_free(vars);
+	}
+
+	return (lc != NULL && !list_member(root->tlist_vars, var));
+}
+
+/*
+ * is_var_needed_by_join
+ *	  Check if the given Var is needed by joins above the current rel.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+	Relids		relids;
+	int			attno;
+	RelOptInfo *baserel;
+
+	/*
+	 * Note that when checking if the Var is needed by joins above, we want to
+	 * exclude cases where the Var is only needed in the final targetlist.  So
+	 * include "relation 0" in the check.
+	 */
+	relids = bms_copy(rel->relids);
+	relids = bms_add_member(relids, 0);
+
+	baserel = find_base_rel(root, var->varno);
+	attno = var->varattno - baserel->min_attr;
+
+	return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ *	  Return the sortgroupref of the given "expr" if it is found among the
+ *	  original grouping expressions, or is known equal to any of the original
+ *	  grouping expressions due to equivalence relationships.  Return 0 if no
+ *	  match is found.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+	ListCell   *lc;
+
+	Assert(IsA(expr, Var));
+
+	foreach(lc, root->group_expr_list)
+	{
+		GroupingExprInfo *ge_info = lfirst_node(GroupingExprInfo, lc);
+		ListCell   *lc1;
+
+		Assert(IsA(ge_info->expr, Var));
+		Assert(ge_info->sortgroupref > 0);
+
+		if (equal(expr, ge_info->expr))
+			return ge_info->sortgroupref;
+
+		if (ge_info->ec == NULL ||
+			!bms_is_member(((Var *) expr)->varno, ge_info->ec->ec_relids))
+			continue;
+
+		/*
+		 * Scan the EquivalenceClass, looking for a match to the given
+		 * expression.  We ignore child members here.
+		 */
+		foreach(lc1, ge_info->ec->ec_members)
+		{
+			EquivalenceMember *em = (EquivalenceMember *) lfirst(lc1);
+
+			/* Child members should not exist in ec_members */
+			Assert(!em->em_is_child);
+
+			if (equal(expr, em->em_expr))
+				return ge_info->sortgroupref;
+		}
+	}
+
+	/* no match is found */
+	return 0;
+}
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 6bc6be13d2a..b176d5130e4 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -145,6 +145,13 @@
   boot_val => 'false',
 },
 
+{ name => 'enable_eager_aggregate', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
+  short_desc => 'Enables eager aggregation.',
+  flags => 'GUC_EXPLAIN',
+  variable => 'enable_eager_aggregate',
+  boot_val => 'true',
+},
+
 { name => 'enable_parallel_append', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
   short_desc => 'Enables the planner\'s use of parallel append plans.',
   flags => 'GUC_EXPLAIN',
@@ -2427,6 +2434,15 @@
   max => 'DBL_MAX',
 },
 
+{ name => 'min_eager_agg_group_size', type => 'real', context => 'PGC_USERSET', group => 'QUERY_TUNING_COST',
+  short_desc => 'Sets the minimum average group size required to consider applying eager aggregation.',
+  flags => 'GUC_EXPLAIN',
+  variable => 'min_eager_agg_group_size',
+  boot_val => '8.0',
+  min => '0.0',
+  max => 'DBL_MAX',
+},
+
 { name => 'cursor_tuple_fraction', type => 'real', context => 'PGC_USERSET', group => 'QUERY_TUNING_OTHER',
   short_desc => 'Sets the planner\'s estimate of the fraction of a cursor\'s rows that will be retrieved.',
   flags => 'GUC_EXPLAIN',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c36fcb9ab61..c5d612ab552 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -428,6 +428,7 @@
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
 #enable_self_join_elimination = on
+#enable_eager_aggregate = on
 
 # - Planner Cost Constants -
 
@@ -441,6 +442,7 @@
 #min_parallel_table_scan_size = 8MB
 #min_parallel_index_scan_size = 512kB
 #effective_cache_size = 4GB
+#min_eager_agg_group_size = 8.0
 
 #jit_above_cost = 100000		# perform JIT compilation if available
 					# and query more expensive than this;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index b12a2508d8c..2786f8f0c4d 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -391,6 +391,15 @@ struct PlannerInfo
 	/* list of PlaceHolderInfos */
 	List	   *placeholder_list;
 
+	/* list of AggClauseInfos */
+	List	   *agg_clause_list;
+
+	/* list of GroupingExprInfos */
+	List	   *group_expr_list;
+
+	/* list of plain Vars contained in targetlist and havingQual */
+	List	   *tlist_vars;
+
 	/* array of PlaceHolderInfos indexed by phid */
 	struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
 	/* allocated size of array */
@@ -1040,6 +1049,14 @@ typedef struct RelOptInfo
 	/* consider partitionwise join paths? (if partitioned rel) */
 	bool		consider_partitionwise_join;
 
+	/*
+	 * used by eager aggregation:
+	 */
+	/* information needed to create grouped paths */
+	struct RelAggInfo *agg_info;
+	/* the partially-aggregated version of the relation */
+	struct RelOptInfo *grouped_rel;
+
 	/*
 	 * inheritance links, if this is an otherrel (otherwise NULL):
 	 */
@@ -1124,6 +1141,67 @@ typedef struct RelOptInfo
 	((nominal_jointype) == JOIN_INNER && (sjinfo)->jointype == JOIN_SEMI && \
 	 bms_equal((sjinfo)->syn_righthand, (rel)->relids))
 
+/*
+ * Is the given relation a grouped relation?
+ */
+#define IS_GROUPED_REL(rel) \
+	((rel)->agg_info != NULL)
+
+/*
+ * RelAggInfo
+ *		Information needed to create grouped paths for base and join rels.
+ *
+ * "target" is the output tlist for the grouped paths.
+ *
+ * "agg_input" is the output tlist for the paths that provide input to the
+ * grouped paths.  One difference from the reltarget of the non-grouped
+ * relation is that agg_input has its sortgrouprefs[] initialized.
+ *
+ * "group_clauses" and "group_exprs" are lists of SortGroupClauses and the
+ * corresponding grouping expressions.
+ *
+ * "apply_at" tracks the lowest join level at which partial aggregation is
+ * applied.
+ *
+ * "grouped_rows" is the estimated number of result tuples of the grouped
+ * relation.
+ *
+ * "agg_useful" is a flag to indicate whether the grouped paths are considered
+ * useful.  It is set true if the average partial group size is no less than
+ * min_eager_agg_group_size, suggesting a significant row count reduction.
+ */
+typedef struct RelAggInfo
+{
+	pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/*
+	 * default result targetlist for Paths scanning this grouped relation;
+	 * list of Vars/Exprs, cost, width
+	 */
+	struct PathTarget *target;
+
+	/*
+	 * the targetlist for Paths that provide input to the grouped paths
+	 */
+	struct PathTarget *agg_input;
+
+	/* a list of SortGroupClauses */
+	List	   *group_clauses;
+	/* a list of grouping expressions */
+	List	   *group_exprs;
+
+	/* lowest level partial aggregation is applied at */
+	Relids		apply_at;
+
+	/* estimated number of result tuples */
+	Cardinality grouped_rows;
+
+	/* the grouped paths are considered useful? */
+	bool		agg_useful;
+} RelAggInfo;
+
 /*
  * IndexOptInfo
  *		Per-index information for planning/optimization
@@ -3268,6 +3346,49 @@ typedef struct MinMaxAggInfo
 	Param	   *param;
 } MinMaxAggInfo;
 
+/*
+ * For each distinct Aggref node that appears in the targetlist and HAVING
+ * clauses, we store an AggClauseInfo node in the PlannerInfo node's
+ * agg_clause_list.  Each AggClauseInfo records the set of relations referenced
+ * by the aggregate expression.  This information is used to determine how far
+ * the aggregate can be safely pushed down in the join tree.
+ */
+typedef struct AggClauseInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the Aggref expr */
+	Aggref	   *aggref;
+
+	/* lowest level we can evaluate this aggregate at */
+	Relids		agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * For each grouping expression that appears in grouping clauses, we store a
+ * GroupingExprInfo node in the PlannerInfo node's group_expr_list.  Each
+ * GroupingExprInfo records the expression being grouped on, its sortgroupref,
+ * and the EquivalenceClass it belongs to.  This information is necessary to
+ * reproduce correct grouping semantics at different levels of the join tree.
+ */
+typedef struct GroupingExprInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the represented expression */
+	Expr	   *expr;
+
+	/* the tleSortGroupRef of the corresponding SortGroupClause */
+	Index		sortgroupref;
+
+	/* the equivalence class the expression belongs to */
+	EquivalenceClass *ec pg_node_attr(copy_as_scalar, equal_as_scalar);
+} GroupingExprInfo;
+
 /*
  * At runtime, PARAM_EXEC slots are used to pass values around from one plan
  * node to another.  They can be used to pass values down into subqueries (for
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 763cd25bb3c..e509b8144ce 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -312,6 +312,10 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
 									RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
+											RelOptInfo *rel_plain);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
+									 RelOptInfo *rel_plain);
 extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
@@ -351,4 +355,6 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										SpecialJoinInfo *sjinfo,
 										int nappinfos, AppendRelInfo **appinfos);
 
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel,
+									   bool calculate_grouped_rows);
 #endif							/* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index cbade77b717..8d03d662a04 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,7 +21,9 @@
  * allpaths.c
  */
 extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
 extern PGDLLIMPORT int geqo_threshold;
+extern PGDLLIMPORT double min_eager_agg_group_size;
 extern PGDLLIMPORT int min_parallel_table_scan_size;
 extern PGDLLIMPORT int min_parallel_index_scan_size;
 extern PGDLLIMPORT bool enable_group_by_reordering;
@@ -57,6 +59,10 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 								  bool override_rows);
 extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 										 bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root,
+								   RelOptInfo *rel_grouped,
+								   RelOptInfo *rel_plain,
+								   RelAggInfo *agg_info);
 extern int	compute_parallel_worker(RelOptInfo *rel, double heap_pages,
 									double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 9d3debcab28..09b48b26f8f 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -76,6 +76,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
 									Relids where_needed);
 extern void remove_useless_groupby_columns(PlannerInfo *root);
+extern void setup_eager_aggregation(PlannerInfo *root);
 extern void find_lateral_references(PlannerInfo *root);
 extern void rebuild_lateral_attr_needed(PlannerInfo *root);
 extern void create_lateral_join_info(PlannerInfo *root);
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 69805d4b9ec..ef79d6f1ded 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -2437,11 +2437,11 @@ SELECT c collate "C", count(c) FROM pagg_tab3 GROUP BY c collate "C" ORDER BY 1;
 SET enable_partitionwise_join TO false;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2449,10 +2449,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
@@ -2464,11 +2466,11 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
 SET enable_partitionwise_join TO true;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2476,10 +2478,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 00000000000..0dab585e9ce
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1584 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b
+                                 Sort Key: t2.b
+                                 ->  Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Hash Join
+                                 Output: t2.c, t2.b, t3.c
+                                 Hash Cond: (t3.a = t2.a)
+                                 ->  Seq Scan on public.eager_agg_t3 t3
+                                       Output: t3.a, t3.b, t3.c
+                                 ->  Hash
+                                       Output: t2.c, t2.b, t2.a
+                                       ->  Seq Scan on public.eager_agg_t2 t2
+                                             Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b, t3.c
+                                 Sort Key: t2.b
+                                 ->  Hash Join
+                                       Output: t2.c, t2.b, t3.c
+                                       Hash Cond: (t3.a = t2.a)
+                                       ->  Seq Scan on public.eager_agg_t3 t3
+                                             Output: t3.a, t3.b, t3.c
+                                       ->  Hash
+                                             Output: t2.c, t2.b, t2.a
+                                             ->  Seq Scan on public.eager_agg_t2 t2
+                                                   Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Right Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Sort
+   Output: t2.b, (avg(t2.c))
+   Sort Key: t2.b
+   ->  HashAggregate
+         Output: t2.b, avg(t2.c)
+         Group Key: t2.b
+         ->  Hash Right Join
+               Output: t2.b, t2.c
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on public.eager_agg_t2 t2
+                     Output: t2.a, t2.b, t2.c
+               ->  Hash
+                     Output: t1.b
+                     ->  Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+ b | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   |    
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Gather Merge
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Workers Planned: 2
+         ->  Sort
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Sort Key: t1.a
+               ->  Parallel Hash Join
+                     Output: t1.a, (PARTIAL avg(t2.c))
+                     Hash Cond: (t1.b = t2.b)
+                     ->  Parallel Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.a, t1.b, t1.c
+                     ->  Parallel Hash
+                           Output: t2.b, (PARTIAL avg(t2.c))
+                           ->  Partial HashAggregate
+                                 Output: t2.b, PARTIAL avg(t2.c)
+                                 Group Key: t2.b
+                                 ->  Parallel Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+--
+-- Test eager aggregation with GEQO
+--
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET geqo;
+RESET geqo_threshold;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When the GROUP BY clause matches the partition key, full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- GROUP BY on the other table's matching partition key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t2.y, (sum(t1.y)), (count(*))
+   Sort Key: t2.y
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t2.y, sum(t1.y), count(*)
+               Group Key: t2.y
+               ->  Hash Join
+                     Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.y, t1.x
+         ->  Finalize HashAggregate
+               Output: t2_1.y, sum(t1_1.y), count(*)
+               Group Key: t2_1.y
+               ->  Hash Join
+                     Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.y, t1_1.x
+         ->  Finalize HashAggregate
+               Output: t2_2.y, sum(t1_2.y), count(*)
+               Group Key: t2_2.y
+               ->  Hash Join
+                     Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- When the GROUP BY clause does not match the partition key, partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+                                                 QUERY PLAN                                                 
+------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t2.x, (sum(t1.x)), (count(*))
+   Sort Key: t2.x
+   ->  Finalize HashAggregate
+         Output: t2.x, sum(t1.x), count(*)
+         Group Key: t2.x
+         Filter: (avg(t1.x) > '5'::numeric)
+         ->  Append
+               ->  Hash Join
+                     Output: t2.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.x, t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.x), PARTIAL count(*), PARTIAL avg(t1.x)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x
+               ->  Hash Join
+                     Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.x, t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x
+               ->  Hash Join
+                     Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.x, t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+ x |  sum  | count 
+---+-------+-------
+ 0 | 33835 |  6667
+ 1 | 39502 |  6667
+ 2 | 46169 |  6667
+ 3 | 52836 |  6667
+ 4 | 59503 |  6667
+ 5 | 33500 |  6667
+ 6 | 39837 |  6667
+ 7 | 46504 |  6667
+ 8 | 53171 |  6667
+ 9 | 59838 |  6667
+(10 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y)))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y))
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y))
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y))
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   
+----+---------
+  0 | 1437480
+  1 | 2082896
+  2 | 2684422
+  3 | 3285948
+  4 | 3887474
+  5 | 1526260
+  6 | 2127786
+  7 | 2729312
+  8 | 3330838
+  9 | 3932364
+ 10 | 1481370
+ 11 | 2012472
+ 12 | 2587464
+ 13 | 3162456
+ 14 | 3737448
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t3.y, sum((t2.y + t3.y))
+   Group Key: t3.y
+   ->  Sort
+         Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+         Sort Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t2.x = t1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y))
+                           Group Key: t2.x, t3.y, t3.x
+                           ->  Incremental Sort
+                                 Output: t2.y, t2.x, t3.y, t3.x
+                                 Sort Key: t2.x, t3.y
+                                 Presorted Key: t2.x
+                                 ->  Merge Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Merge Cond: (t2.x = t3.x)
+                                       ->  Sort
+                                             Output: t2.y, t2.x
+                                             Sort Key: t2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                                   Output: t2.y, t2.x
+                                       ->  Sort
+                                             Output: t3.y, t3.x
+                                             Sort Key: t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+                     ->  Hash
+                           Output: t1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                 Output: t1.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t2_1.x = t1_1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                           Group Key: t2_1.x, t3_1.y, t3_1.x
+                           ->  Incremental Sort
+                                 Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                 Sort Key: t2_1.x, t3_1.y
+                                 Presorted Key: t2_1.x
+                                 ->  Merge Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Merge Cond: (t2_1.x = t3_1.x)
+                                       ->  Sort
+                                             Output: t2_1.y, t2_1.x
+                                             Sort Key: t2_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                                   Output: t2_1.y, t2_1.x
+                                       ->  Sort
+                                             Output: t3_1.y, t3_1.x
+                                             Sort Key: t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+                     ->  Hash
+                           Output: t1_1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                 Output: t1_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t2_2.x = t1_2.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                           Group Key: t2_2.x, t3_2.y, t3_2.x
+                           ->  Incremental Sort
+                                 Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                 Sort Key: t2_2.x, t3_2.y
+                                 Presorted Key: t2_2.x
+                                 ->  Merge Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Merge Cond: (t2_2.x = t3_2.x)
+                                       ->  Sort
+                                             Output: t2_2.y, t2_2.x
+                                             Sort Key: t2_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                                   Output: t2_2.y, t2_2.x
+                                       ->  Sort
+                                             Output: t3_2.y, t3_2.x
+                                             Sort Key: t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+                     ->  Hash
+                           Output: t1_2.x
+                           ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                 Output: t1_2.x
+(88 rows)
+
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y |   sum   
+---+---------
+ 0 | 1111110
+ 1 | 2000132
+ 2 | 2889154
+ 3 | 3778176
+ 4 | 4667198
+ 5 | 3334000
+ 6 | 4223022
+ 7 | 5112044
+ 8 | 6001066
+ 9 | 6890088
+(10 rows)
+
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+RESET geqo;
+RESET geqo_threshold;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.y, (sum(t2.y)), (count(*))
+   Sort Key: t1.y
+   ->  Finalize HashAggregate
+         Output: t1.y, sum(t2.y), count(*)
+         Group Key: t1.y
+         ->  Append
+               ->  Hash Join
+                     Output: t1.y, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.y, t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+               ->  Hash Join
+                     Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.y, t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+               ->  Hash Join
+                     Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.y, t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+               ->  Hash Join
+                     Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.y, t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+               ->  Hash Join
+                     Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.y, t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+                                                QUERY PLAN                                                
+----------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y)), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t3.y
+   ->  Finalize HashAggregate
+         Output: t3.y, sum((t2.y + t3.y)), count(*)
+         Group Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.y, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x, t3.y, t3.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.y, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x, t3_1.y, t3_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.y, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x, t3_2.y, t3_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.y, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x, t3_3.y, t3_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+               ->  Hash Join
+                     Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.y, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.y, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x, t3_4.y, t3_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+ y  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+RESET geqo;
+RESET geqo_threshold;
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index cd37f549b5a..bdbf21a874d 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -2840,20 +2840,22 @@ select x.thousand, x.twothousand, count(*)
 from tenk1 x inner join tenk1 y on x.thousand = y.thousand
 group by x.thousand, x.twothousand
 order by x.thousand desc, x.twothousand;
-                                    QUERY PLAN                                    
-----------------------------------------------------------------------------------
- GroupAggregate
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Finalize GroupAggregate
    Group Key: x.thousand, x.twothousand
    ->  Incremental Sort
          Sort Key: x.thousand DESC, x.twothousand
          Presorted Key: x.thousand
          ->  Merge Join
                Merge Cond: (y.thousand = x.thousand)
-               ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
+               ->  Partial GroupAggregate
+                     Group Key: y.thousand
+                     ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
                ->  Sort
                      Sort Key: x.thousand DESC
                      ->  Seq Scan on tenk1 x
-(11 rows)
+(13 rows)
 
 reset enable_hashagg;
 reset enable_nestloop;
diff --git a/src/test/regress/expected/partition_aggregate.out b/src/test/regress/expected/partition_aggregate.out
index cb12bf53719..fc84929a002 100644
--- a/src/test/regress/expected/partition_aggregate.out
+++ b/src/test/regress/expected/partition_aggregate.out
@@ -13,6 +13,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation.
+SET enable_eager_aggregate TO off;
 --
 -- Tests for list partitioned tables.
 --
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 83228cfca29..3b37fafa65b 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -151,6 +151,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_async_append            | on
  enable_bitmapscan              | on
  enable_distinct_reordering     | on
+ enable_eager_aggregate         | on
  enable_gathermerge             | on
  enable_group_by_reordering     | on
  enable_hashagg                 | on
@@ -172,7 +173,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(24 rows)
+(25 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fbffc67ae60..f9450cdc477 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -123,7 +123,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # The stats test resets stats, so nothing else needing stats access can be in
 # this group.
 # ----------
-test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa eager_aggregate
 
 # event_trigger depends on create_am and cannot run concurrently with
 # any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 00000000000..8b1049ae3f3
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,225 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000)i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000)i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c + t3.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b JOIN eager_agg_t3 t3 ON t2.a = t3.a GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+SELECT t2.b, avg(t2.c) FROM eager_agg_t1 t1 LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+--
+-- Test eager aggregation with GEQO
+--
+
+SET geqo = on;
+SET geqo_threshold = 2;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b = t2.b GROUP BY t1.a ORDER BY t1.a;
+
+RESET geqo;
+RESET geqo_threshold;
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+SELECT t2.y, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+SELECT t2.x, sum(t1.x), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y) FROM eager_agg_tab1 t1 JOIN eager_agg_tab1 t2 ON t1.x = t2.x JOIN eager_agg_tab1 t3 ON t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t1.y), count(*) FROM eager_agg_tab1 t1, eager_agg_tab2 t2 WHERE t1.x = t2.y GROUP BY t1.x ORDER BY t1.x;
+RESET geqo;
+RESET geqo_threshold;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+SELECT t1.y, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+SELECT t3.y, sum(t2.y + t3.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x JOIN eager_agg_tab_ml t3 on t2.x = t3.x GROUP BY t3.y ORDER BY t3.y;
+
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+SELECT t1.x, sum(t2.y), count(*) FROM eager_agg_tab_ml t1 JOIN eager_agg_tab_ml t2 ON t1.x = t2.x GROUP BY t1.x ORDER BY t1.x;
+RESET geqo;
+RESET geqo_threshold;
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/sql/partition_aggregate.sql b/src/test/regress/sql/partition_aggregate.sql
index ab070fee244..124cc260461 100644
--- a/src/test/regress/sql/partition_aggregate.sql
+++ b/src/test/regress/sql/partition_aggregate.sql
@@ -14,6 +14,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation.
+SET enable_eager_aggregate TO off;
 
 --
 -- Tests for list partitioned tables.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 3c80d49b67e..09752d57da4 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -42,6 +42,7 @@ AfterTriggersTableData
 AfterTriggersTransData
 Agg
 AggClauseCosts
+AggClauseInfo
 AggInfo
 AggPath
 AggSplit
@@ -1110,6 +1111,7 @@ GroupPathExtraData
 GroupResultPath
 GroupState
 GroupVarInfo
+GroupingExprInfo
 GroupingFunc
 GroupingSet
 GroupingSetData
@@ -2473,6 +2475,7 @@ ReindexObjectType
 ReindexParams
 ReindexStmt
 ReindexType
+RelAggInfo
 RelFileLocator
 RelFileLocatorBackend
 RelFileNumber
-- 
2.39.5 (Apple Git-154)
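
As a reading aid for the regression tests in the patch above, here is a minimal
standalone sketch of the transformation they exercise: partially aggregating one
side of a join by the join key, joining, and then finalizing by the GROUP BY key.
It is not code from the patch. The row types, MAX_KEYS, and both function names
are hypothetical, and the sketch assumes all key values fall in [0, MAX_KEYS)
and that the caller zero-initializes the output array.

```c
#include <stddef.h>

/* Hypothetical row types, loosely modelled on eager_agg_t1 / eager_agg_t2. */
typedef struct { int a; int b; } T1Row;
typedef struct { int b; double c; } T2Row;

/* Transition state for avg(): a running sum and a row count. */
typedef struct { double sum; long count; } AvgState;

#define MAX_KEYS 16		/* assumed bound on key values in this sketch */

/* Baseline: join first, then aggregate each joined row into its t1.a group. */
static void
agg_after_join(const T1Row *t1, size_t n1, const T2Row *t2, size_t n2,
			   AvgState out[MAX_KEYS])
{
	for (size_t i = 0; i < n1; i++)
		for (size_t j = 0; j < n2; j++)
			if (t1[i].b == t2[j].b)
			{
				out[t1[i].a].sum += t2[j].c;
				out[t1[i].a].count += 1;
			}
}

/* Eager: partially aggregate t2 by join key b, then join and combine. */
static void
agg_before_join(const T1Row *t1, size_t n1, const T2Row *t2, size_t n2,
				AvgState out[MAX_KEYS])
{
	AvgState	partial[MAX_KEYS] = {0};

	/* Partial aggregation below the join: one state per t2.b value. */
	for (size_t j = 0; j < n2; j++)
	{
		partial[t2[j].b].sum += t2[j].c;
		partial[t2[j].b].count += 1;
	}

	/* Join t1 to the partial states and combine them per t1.a group. */
	for (size_t i = 0; i < n1; i++)
	{
		const AvgState *p = &partial[t1[i].b];

		if (p->count > 0)
		{
			out[t1[i].a].sum += p->sum;
			out[t1[i].a].count += p->count;
		}
	}
}
```

Both functions fill the same per-group states, so avg() can be finalized as
sum / count in either case. The eager variant feeds at most one row per distinct
t2.b value into the join, which is the saving the new plans aim for.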



  [application/octet-stream] v23-0002-Allow-negative-aggtransspace-to-indicate-unbound.patch (8.0K, 3-v23-0002-Allow-negative-aggtransspace-to-indicate-unbound.patch)
From 48b807a93c29c534c0151b950563b28021acd8c1 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 12 Sep 2025 13:11:47 +0900
Subject: [PATCH v23 2/2] Allow negative aggtransspace to indicate unbounded
 state size

This patch reuses the existing aggtransspace in pg_aggregate to
signal that an aggregate's transition state can grow unboundedly.  If
aggtransspace is set to a negative value, it now indicates that the
transition state may consume unpredictable or large amounts of memory,
such as in aggregates like array_agg or string_agg that accumulate
input rows.

This information can be used by the planner to avoid applying
memory-sensitive optimizations (e.g., eager aggregation) when there is
a risk of excessive memory usage during partial aggregation.

Bump catalog version.
---
 doc/src/sgml/catalogs.sgml               |  5 ++++-
 doc/src/sgml/ref/create_aggregate.sgml   | 11 ++++++++---
 src/backend/optimizer/plan/initsplan.c   | 23 +++++++----------------
 src/include/catalog/pg_aggregate.dat     | 10 ++++++----
 src/test/regress/expected/opr_sanity.out |  2 +-
 src/test/regress/sql/opr_sanity.sql      |  2 +-
 6 files changed, 27 insertions(+), 26 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index e9095bedf21..3acc2222a87 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -596,7 +596,10 @@
       </para>
       <para>
        Approximate average size (in bytes) of the transition state
-       data, or zero to use a default estimate
+       data. A positive value provides an estimate; zero means to
+       use a default estimate. A negative value indicates the state
+       data can grow unboundedly in size, such as when the aggregate
+       accumulates input rows (e.g., array_agg, string_agg).
       </para></entry>
      </row>
 
diff --git a/doc/src/sgml/ref/create_aggregate.sgml b/doc/src/sgml/ref/create_aggregate.sgml
index 222e0aa5c9d..0472ac2e874 100644
--- a/doc/src/sgml/ref/create_aggregate.sgml
+++ b/doc/src/sgml/ref/create_aggregate.sgml
@@ -384,9 +384,13 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
      <para>
       The approximate average size (in bytes) of the aggregate's state value.
       If this parameter is omitted or is zero, a default estimate is used
-      based on the <replaceable>state_data_type</replaceable>.
+      based on the <replaceable>state_data_type</replaceable>. If set to a
+      negative value, it indicates the state data can grow unboundedly in
+      size, such as when the aggregate accumulates input rows (e.g.,
+      array_agg, string_agg).
       The planner uses this value to estimate the memory required for a
-      grouped aggregate query.
+      grouped aggregate query and to avoid optimizations that may cause
+      excessive memory usage.
      </para>
     </listitem>
    </varlistentry>
@@ -568,7 +572,8 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
      <para>
       The approximate average size (in bytes) of the aggregate's state
       value, when using moving-aggregate mode.  This works the same as
-      <replaceable>state_data_size</replaceable>.
+      <replaceable>state_data_size</replaceable>, except that negative
+      values are not used to indicate unbounded state size.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 1af43bb60d2..b8d1c7e88a3 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -719,19 +719,14 @@ setup_eager_aggregation(PlannerInfo *root)
 
 /*
  * is_partial_agg_memory_risky
- *	  Checks if any aggregate poses a risk of excessive memory usage during
+ *	  Check if any aggregate poses a risk of excessive memory usage during
  *	  partial aggregation.
  *
- * We check if any aggregate uses INTERNAL transition type.  Although INTERNAL
- * is marked as pass-by-value, it usually points to a large internal data
- * structure (like those used by string_agg or array_agg).  These transition
- * states can grow large and their size is hard to estimate.  Applying eager
- * aggregation in such cases risks high memory usage since partial aggregation
- * results might be stored in join hash tables or materialized nodes.
- *
- * We explicitly exclude aggregates with AVG_ACCUM transition function from
- * this check, based on the assumption that avg() and sum() are safe in this
- * context.
+ * We check if any aggregate has a negative aggtransspace value, which
+ * indicates that its transition state data can grow unboundedly in size.
+ * Applying eager aggregation in such cases risks high memory usage since
+ * partial aggregation results might be stored in join hash tables or
+ * materialized nodes.
  */
 static bool
 is_partial_agg_memory_risky(PlannerInfo *root)
@@ -742,11 +737,7 @@ is_partial_agg_memory_risky(PlannerInfo *root)
 	{
 		AggTransInfo *transinfo = lfirst_node(AggTransInfo, lc);
 
-		if (transinfo->transfn_oid == F_NUMERIC_AVG_ACCUM ||
-			transinfo->transfn_oid == F_INT8_AVG_ACCUM)
-			continue;
-
-		if (transinfo->aggtranstype == INTERNALOID)
+		if (transinfo->aggtransspace < 0)
 			return true;
 	}
 
diff --git a/src/include/catalog/pg_aggregate.dat b/src/include/catalog/pg_aggregate.dat
index d6aa1f6ec47..870769e8f14 100644
--- a/src/include/catalog/pg_aggregate.dat
+++ b/src/include/catalog/pg_aggregate.dat
@@ -558,26 +558,28 @@
   aggfinalfn => 'array_agg_finalfn', aggcombinefn => 'array_agg_combine',
   aggserialfn => 'array_agg_serialize',
   aggdeserialfn => 'array_agg_deserialize', aggfinalextra => 't',
-  aggtranstype => 'internal' },
+  aggtranstype => 'internal', aggtransspace => '-1' },
 { aggfnoid => 'array_agg(anyarray)', aggtransfn => 'array_agg_array_transfn',
   aggfinalfn => 'array_agg_array_finalfn',
   aggcombinefn => 'array_agg_array_combine',
   aggserialfn => 'array_agg_array_serialize',
   aggdeserialfn => 'array_agg_array_deserialize', aggfinalextra => 't',
-  aggtranstype => 'internal' },
+  aggtranstype => 'internal', aggtransspace => '-1' },
 
 # text
 { aggfnoid => 'string_agg(text,text)', aggtransfn => 'string_agg_transfn',
   aggfinalfn => 'string_agg_finalfn', aggcombinefn => 'string_agg_combine',
   aggserialfn => 'string_agg_serialize',
-  aggdeserialfn => 'string_agg_deserialize', aggtranstype => 'internal' },
+  aggdeserialfn => 'string_agg_deserialize',
+  aggtranstype => 'internal', aggtransspace => '-1' },
 
 # bytea
 { aggfnoid => 'string_agg(bytea,bytea)',
   aggtransfn => 'bytea_string_agg_transfn',
   aggfinalfn => 'bytea_string_agg_finalfn',
   aggcombinefn => 'string_agg_combine', aggserialfn => 'string_agg_serialize',
-  aggdeserialfn => 'string_agg_deserialize', aggtranstype => 'internal' },
+  aggdeserialfn => 'string_agg_deserialize',
+  aggtranstype => 'internal', aggtransspace => '-1' },
 
 # range
 { aggfnoid => 'range_intersect_agg(anyrange)',
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 20bf9ea9cdf..a357e1d0c0e 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -1470,7 +1470,7 @@ WHERE aggfnoid = 0 OR aggtransfn = 0 OR
     (aggkind = 'n' AND aggnumdirectargs > 0) OR
     aggfinalmodify NOT IN ('r', 's', 'w') OR
     aggmfinalmodify NOT IN ('r', 's', 'w') OR
-    aggtranstype = 0 OR aggtransspace < 0 OR aggmtransspace < 0;
+    aggtranstype = 0 OR aggmtransspace < 0;
  ctid | aggfnoid 
 ------+----------
 (0 rows)
diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql
index 2fb3a852878..cd674d7dbca 100644
--- a/src/test/regress/sql/opr_sanity.sql
+++ b/src/test/regress/sql/opr_sanity.sql
@@ -847,7 +847,7 @@ WHERE aggfnoid = 0 OR aggtransfn = 0 OR
     (aggkind = 'n' AND aggnumdirectargs > 0) OR
     aggfinalmodify NOT IN ('r', 's', 'w') OR
     aggmfinalmodify NOT IN ('r', 's', 'w') OR
-    aggtranstype = 0 OR aggtransspace < 0 OR aggmtransspace < 0;
+    aggtranstype = 0 OR aggmtransspace < 0;
 
 -- Make sure the matching pg_proc entry is sensible, too.
 
-- 
2.39.5 (Apple Git-154)
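
For readers skimming the initsplan.c hunk above, the new rule reduces to
"any aggregate with a negative aggtransspace makes partial aggregation risky".
The following is a simplified standalone model of that rule, not the planner
code itself: MockAggTransInfo and partial_agg_memory_risky are hypothetical
names, and the struct keeps only the one field the check reads.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for AggTransInfo; only the field the check needs. */
typedef struct
{
	int32_t		aggtransspace;	/* > 0: estimate, 0: default, < 0: unbounded */
} MockAggTransInfo;

/*
 * Return true if any aggregate declares an unbounded transition state,
 * mirroring the "aggtransspace < 0" test in the patch.
 */
static bool
partial_agg_memory_risky(const MockAggTransInfo *infos, size_t n)
{
	for (size_t i = 0; i < n; i++)
	{
		if (infos[i].aggtransspace < 0)
			return true;
	}

	return false;
}
```

Under this model, a query using only aggregates with zero or positive
aggtransspace stays eligible for eager aggregation, while a single
array_agg or string_agg (marked -1 in pg_aggregate.dat by this patch)
disables it for the whole query.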




* Re: Eager aggregation, take 3

From: Richard Guo @ 2025-09-29 02:09 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Thu, Sep 25, 2025 at 1:23 PM Richard Guo <[email protected]> wrote:
> Attached is an updated version of the patch with these optimizations
> applied.

FWIW, I plan to do another self-review of this patch soon, with the
goal of assessing whether it's ready to be pushed.  If anyone has any
concerns about any part of the patch or would like to review it, I
would greatly appreciate hearing from you.

- Richard






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-10-01 23:54                                                   ` Matheus Alcantara <[email protected]>
  2025-10-02 01:13                                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Matheus Alcantara @ 2025-10-01 23:54 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

[ getting back to testing this patch ...]

In reply to my last email, you wrote:
>> Debugging this query shows that all if conditions on
>> setup_eager_aggregation() returns false and create_agg_clause_infos()
>> and create_grouping_expr_infos() are called. The RelAggInfo->agg_useful
>> is also being set to true so I would expect to see Finalize and Partial
>> agg nodes, is this correct or am I missing something here?
>
> Well, just because eager aggregation *can* be applied does not mean
> that it *will* be; it depends on whether it produces a lower-cost
> execution plan.  This transformation is cost-based, so it's not the
> right mindset to assume that it will always be applied when possible.
>
Sorry for the noise here. I didn't consider the costs.

On Sun Sep 28, 2025 at 11:09 PM -03, Richard Guo wrote:
> On Thu, Sep 25, 2025 at 1:23 PM Richard Guo <[email protected]> wrote:
>> Attached is an updated version of the patch with these optimizations
>> applied.
>
> FWIW, I plan to do another self-review of this patch soon, with the
> goal of assessing whether it's ready to be pushed.  If anyone has any
> concerns about any part of the patch or would like to review it, I
> would greatly appreciate hearing from you.
>
I spent some time testing patch v23 using the TPC-DS benchmark and am
seeing worse execution times when using eager aggregation.
The most interesting cases are:

Query    | planning time | execution time
---------+---------------+---------------
query 31 |     -2.03%    |    -99.56%
query 71 |    -15.51%    |    -68.88%
query 20 |    -10.77%    |    -32.40%
query 26 |    -28.01%    |    -32.35%
query 85 |    -10.57%    |    -31.91%
query 77 |    -30.07%    |    -31.38%
query 69 |    -32.79%    |    -29.21%
query 32 |    -68.48%    |    -27.89%
query 57 |     -7.99%    |    -27.32%
query 91 |    -24.81%    |    -26.20%
query 23 |    -11.72%    |    -18.24%

Query 31 looks particularly bad. I don't know if I'm doing something
completely wrong, but I just set up a TPC-DS database, ran the query
on master and with the v23 patch, and got these results:

Master:
    Planning Time: 3.191 ms
    Execution Time: 16950.619 ms

Patch:
    Planning Time: 3.257 ms
    Execution Time: 3848355.646 ms
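
For reference, the percentages in the table above appear to be computed
as (master - patched) / patched. That is an inference from the query 31
timings, not something stated in this message. A minimal Python sketch
that checks this reading against the numbers above (the function name
rel_change is made up for illustration):

```python
def rel_change(master_ms: float, patched_ms: float) -> float:
    """Percent difference relative to the patched timing (assumed formula)."""
    return (master_ms - patched_ms) / patched_ms * 100.0


# Query 31 timings quoted above, in milliseconds.
planning = rel_change(3.191, 3.257)
execution = rel_change(16950.619, 3848355.646)

# How many times slower the patched run was than master.
slowdown = 3848355.646 / 16950.619

print(f"planning:  {planning:.2f}%")
print(f"execution: {execution:.2f}%")
print(f"slowdown:  {slowdown:.0f}x")
```

Under that reading, a negative value means the patched build was slower,
and query 31 ran roughly 227 times slower with the patch.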

Note that I ran ANALYZE before running the queries in both scenarios
(master and patched).

I'm attaching the EXPLAIN (ANALYZE) output for query 31 from master
and with the patch applied.

Please let me know if there is any other test that I can run to
benchmark this patch.

--
Matheus Alcantara

Sort  (cost=656889.77..656889.77 rows=1 width=210) (actual time=17164.506..17164.519 rows=43.00 loops=1)
  Sort Key: ((ss3.store_sales / ss2.store_sales))
  Sort Method: quicksort  Memory: 28kB
  Buffers: shared hit=6533 read=69203, temp read=4343 written=12055
  CTE ss
    ->  HashAggregate  (cost=323021.86..377372.99 rows=1476800 width=54) (actual time=3389.564..3677.220 rows=35136.00 loops=1)
          Group Key: customer_address.ca_county, date_dim.d_qoy, date_dim.d_year
          Planned Partitions: 64  Batches: 65  Memory Usage: 8209kB  Disk Usage: 56840kB
          Buffers: shared hit=3408 read=50944, temp read=3962 written=10947
          ->  Hash Join  (cost=5328.60..100701.93 rows=2625180 width=28) (actual time=46.394..2034.907 rows=2685273.00 loops=1)
                Hash Cond: (store_sales.ss_sold_date_sk = date_dim.d_date_sk)
                Buffers: shared hit=3408 read=50944
                ->  Hash Join  (cost=2261.00..90416.35 rows=2749551 width=24) (actual time=18.753..1396.048 rows=2750429.00 loops=1)
                      Hash Cond: (store_sales.ss_addr_sk = customer_address.ca_address_sk)
                      Buffers: shared hit=1984 read=50944
                      ->  Seq Scan on store_sales  (cost=0.00..80594.17 rows=2880217 width=14) (actual time=0.063..228.063 rows=2880404.00 loops=1)
                            Buffers: shared hit=848 read=50944
                      ->  Hash  (cost=1636.00..1636.00 rows=50000 width=18) (actual time=18.651..18.651 rows=50000.00 loops=1)
                            Buckets: 65536  Batches: 1  Memory Usage: 3052kB
                            Buffers: shared hit=1136
                            ->  Seq Scan on customer_address  (cost=0.00..1636.00 rows=50000 width=18) (actual time=0.005..9.555 rows=50000.00 loops=1)
                                  Buffers: shared hit=1136
                ->  Hash  (cost=2154.49..2154.49 rows=73049 width=12) (actual time=27.627..27.629 rows=73049.00 loops=1)
                      Buckets: 131072  Batches: 1  Memory Usage: 4163kB
                      Buffers: shared hit=1424
                      ->  Seq Scan on date_dim  (cost=0.00..2154.49 rows=73049 width=12) (actual time=0.009..15.154 rows=73049.00 loops=1)
                            Buffers: shared hit=1424
  CTE ws
    ->  HashAggregate  (cost=96009.03..114825.35 rows=718952 width=54) (actual time=977.215..1014.889 rows=23320.00 loops=1)
          Group Key: customer_address_1.ca_county, date_dim_1.d_qoy, date_dim_1.d_year
          Planned Partitions: 32  Batches: 33  Memory Usage: 8209kB  Disk Usage: 6032kB
          Buffers: shared hit=3125 read=18259, temp read=381 written=1108
          ->  Hash Join  (cost=5328.60..35122.78 rows=718952 width=28) (actual time=46.623..611.054 rows=719118.00 loops=1)
                Hash Cond: (web_sales.ws_bill_addr_sk = customer_address_1.ca_address_sk)
                Buffers: shared hit=3125 read=18259
                ->  Hash Join  (cost=3067.60..30973.94 rows=719120 width=18) (actual time=27.691..424.273 rows=719195.00 loops=1)
                      Hash Cond: (web_sales.ws_sold_date_sk = date_dim_1.d_date_sk)
                      Buffers: shared hit=1989 read=18259
                      ->  Seq Scan on web_sales  (cost=0.00..26017.84 rows=719384 width=14) (actual time=0.082..63.389 rows=719384.00 loops=1)
                            Buffers: shared hit=565 read=18259
                      ->  Hash  (cost=2154.49..2154.49 rows=73049 width=12) (actual time=27.538..27.538 rows=73049.00 loops=1)
                            Buckets: 131072  Batches: 1  Memory Usage: 4163kB
                            Buffers: shared hit=1424
                            ->  Seq Scan on date_dim date_dim_1  (cost=0.00..2154.49 rows=73049 width=12) (actual time=0.006..14.914 rows=73049.00 loops=1)
                                  Buffers: shared hit=1424
                ->  Hash  (cost=1636.00..1636.00 rows=50000 width=18) (actual time=18.902..18.902 rows=50000.00 loops=1)
                      Buckets: 65536  Batches: 1  Memory Usage: 3052kB
                      Buffers: shared hit=1136
                      ->  Seq Scan on customer_address customer_address_1  (cost=0.00..1636.00 rows=50000 width=18) (actual time=0.008..9.727 rows=50000.00 loops=1)
                            Buffers: shared hit=1136
  ->  Nested Loop  (cost=0.00..164691.41 rows=1 width=210) (actual time=4817.695..17164.430 rows=43.00 loops=1)
        Join Filter: (((ss1.ca_county)::text = (ws2.ca_county)::text) AND (CASE WHEN (ws1.web_sales > '0'::numeric) THEN (ws2.web_sales / ws1.web_sales) ELSE NULL::numeric END > CASE WHEN (ss1.store_sales > '0'::numeric) THEN (ss2.store_sales / ss1.store_sales) ELSE NULL::numeric END) AND (CASE WHEN (ws2.web_sales > '0'::numeric) THEN (ws3.web_sales / ws2.web_sales) ELSE NULL::numeric END > CASE WHEN (ss2.store_sales > '0'::numeric) THEN (ss3.store_sales / ss2.store_sales) ELSE NULL::numeric END))
        Rows Removed by Join Filter: 527207
        Buffers: shared hit=6533 read=69203, temp read=4343 written=12055
        ->  Nested Loop  (cost=0.00..146716.93 rows=1 width=554) (actual time=4671.968..15501.760 rows=570.00 loops=1)
              Join Filter: ((ss1.ca_county)::text = (ss3.ca_county)::text)
│               Rows Removed by Join Filter: 1038674                                                                                                                                                                                                                                                                                                                                                                                                                                                                     │
│               Buffers: shared hit=6533 read=69203, temp read=4343 written=12055                                                                                                                                                                                                                                                                                                                                                                                                                                        │
│               ->  Nested Loop  (cost=0.00..109796.47 rows=1 width=444) (actual time=4669.164..12922.095 rows=578.00 loops=1)
│                     Join Filter: ((ss1.ca_county)::text = (ss2.ca_county)::text)                                                                                                                                                                                                                                                                                                                                                                                                                                       │
│                     Rows Removed by Join Filter: 1008217                                                                                                                                                                                                                                                                                                                                                                                                                                                               │
│                     Buffers: shared hit=6533 read=69203, temp read=3559 written=12055                                                                                                                                                                                                                                                                                                                                                                                                                                  │
│                     ->  Nested Loop  (cost=0.00..72876.00 rows=1 width=334) (actual time=4666.835..10231.481 rows=617.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                      │
│                           Join Filter: ((ss1.ca_county)::text = (ws1.ca_county)::text)                                                                                                                                                                                                                                                                                                                                                                                                                                 │
│                           Rows Removed by Join Filter: 1089697                                                                                                                                                                                                                                                                                                                                                                                                                                                         │
│                           Buffers: shared hit=6533 read=69203, temp read=3559 written=12055                                                                                                                                                                                                                                                                                                                                                                                                                            │
│                           ->  Nested Loop  (cost=0.00..35954.71 rows=2 width=220) (actual time=1031.594..3687.112 rows=662.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                 │
│                                 Join Filter: ((ws1.ca_county)::text = (ws3.ca_county)::text)                                                                                                                                                                                                                                                                                                                                                                                                                           │
│                                 Rows Removed by Join Filter: 1148109                                                                                                                                                                                                                                                                                                                                                                                                                                                   │
│                                 Buffers: shared hit=3125 read=18259, temp read=381 written=1108                                                                                                                                                                                                                                                                                                                                                                                                                        │
│                                 ->  CTE Scan on ws ws1  (cost=0.00..17973.80 rows=18 width=110) (actual time=977.224..980.082 rows=911.00 loops=1)                                                                                                  
│                                       Filter: ((d_qoy = 1) AND (d_year = 1999))                                                                                                                                                                                                                                                                                                                                                                                                                                        │
│                                       Rows Removed by Filter: 22409                                                                                                                                                                                                                                                                                                                                                                                                                                                    │
│                                       Storage: Memory  Maximum Storage: 1700kB                                                                                                                                                                                                                                                                                                                                                                                                                                         │
│                                       Buffers: shared hit=3125 read=18259, temp written=1107                                                                                                                                                                                                                                                                                                                                                                                                                           │
│                                 ->  CTE Scan on ws ws3  (cost=0.00..17973.80 rows=18 width=110) (actual time=0.005..2.857 rows=1261.00 loops=911)                                                                                                                                                                                                                                                                                                                                                                      │
│                                       Filter: ((d_year = 1999) AND (d_qoy = 3))                                                                                                                                                                                                                                                                                                                                                                                                                                        │
│                                       Rows Removed by Filter: 22059                                                                                                                                                                                                                                                                                                                                                                                                                                                    │
│                                       Storage: Memory  Maximum Storage: 1700kB                                                                                                                                                                                                                                                                                                                                                                                                                                         │
│                                       Buffers: temp read=381 written=1                                                                                                                                                                                                                                                                                                                                                                                                                                                 │
│                           ->  CTE Scan on ss ss1  (cost=0.00..36920.00 rows=37 width=114) (actual time=5.121..9.740 rows=1647.00 loops=662)                                                                                                                                                                                                                                                                                                                                                                            │
│                                 Filter: ((d_qoy = 1) AND (d_year = 1999))                                                                                                                                                                                                                                                                                                                                                                                                                                              │
│                                 Rows Removed by Filter: 33489                                                                                                                                                                                                                                                                                                                                                                                                                                                          │
│                                 Storage: Memory  Maximum Storage: 2636kB                                                                                                                                                                                                                                                                                                                                                                                                                                               │
│                                 Buffers: shared hit=3408 read=50944, temp read=3178 written=10947                                                                                                                                                                                                                                                                                                                                                                                                                      │
│                     ->  CTE Scan on ss ss2  (cost=0.00..36920.00 rows=37 width=110) (actual time=0.001..4.216 rows=1635.00 loops=617)                                                                                                                                                                                                                                                                                                                                                                                  │
│                           Filter: ((d_year = 1999) AND (d_qoy = 2))                                                                                                                                                                                                                                                                                                                                                                                                                                                    │
│                           Rows Removed by Filter: 33501                                                                                                                                                                                                                                                                                                                                                                                                                                                                │
│                           Storage: Memory  Maximum Storage: 2636kB                                                                                                                                                                                                                                                                                                                                                                                                                                                     │
│               ->  CTE Scan on ss ss3  (cost=0.00..36920.00 rows=37 width=110) (actual time=0.006..4.305 rows=1798.00 loops=578)                                                                                                                                                                                                                                                                                                                                                                                        │
│                     Filter: ((d_year = 1999) AND (d_qoy = 3))                                                                                                                                                                                                                                                                                                                                                                                                                                                          │
│                     Rows Removed by Filter: 33338                                                                                                                                                                                                                                                                                                                                                                                                                                                                      │
│                     Storage: Memory  Maximum Storage: 2636kB                                                                                                                                                                                                                                                                                                                                                                                                                                                           │
│                     Buffers: temp read=784                                                                                                                                                                                                                                                                                                                                                                                                                                                                             │
│         ->  CTE Scan on ws ws2  (cost=0.00..17973.80 rows=18 width=110) (actual time=0.001..2.810 rows=925.00 loops=570)                                                                                                                                                                                                                                                                                                                                                                                               │
│               Filter: ((d_year = 1999) AND (d_qoy = 2))
│               Rows Removed by Filter: 22395                                                                                                                                                                                                                                                                                                                                                                                                                                                                            │
│               Storage: Memory  Maximum Storage: 1700kB                                                                                                                                                                                                                                                                                                                                                                                                                                                                 │
│ Planning:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              │
│   Buffers: shared hit=12                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               │
│ Planning Time: 2.180 ms                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                │
│ Execution Time: 17166.558 ms                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘


│ Sort  (cost=302668.66..302668.66 rows=1 width=210) (actual time=3825537.172..3825541.540 rows=43.00 loops=1)                                                                                                                                                                                         │
│   Sort Key: ((ss3.store_sales / ss2.store_sales))                                                                                                                                                                                                                                                    │
│   Sort Method: quicksort  Memory: 28kB                                                                                                                                                                                                                                                               │
│   Buffers: shared hit=21757 read=69012, temp read=14486 written=25552                                                                                                                                                                                                                                │
│   CTE ss                                                                                                                                                                                                                                                                                             │
│     ->  Finalize GroupAggregate  (cost=178135.51..215272.86 rows=262517 width=54) (actual time=1471.638..1733.635 rows=35117.00 loops=1)                                                                                                                                                             │
│           Group Key: customer_address.ca_county, date_dim.d_qoy, date_dim.d_year                                                                                                                                                                                                                     │
│           Buffers: shared hit=3533 read=50849, temp read=14486 written=25552                                                                                                                                                                                                                         │
│           ->  Gather Merge  (cost=178135.51..208709.94 rows=262517 width=54) (actual time=1471.627..1586.417 rows=234867.00 loops=1)                                                                                                                                                                 │
│                 Workers Planned: 2                                                                                                                                                                                                                                                                   │
│                 Workers Launched: 2                                                                                                                                                                                                                                                                  │
│                 Buffers: shared hit=3533 read=50849, temp read=14486 written=25552                                                                                                                                                                                                                   │
│                 ->  Sort  (cost=177135.48..177408.94 rows=109382 width=54) (actual time=1463.292..1497.110 rows=78658.67 loops=3)                                                                                                                                                                    │
│                       Sort Key: customer_address.ca_county, date_dim.d_qoy, date_dim.d_year                                                                                                                                                                                                          │
│                       Sort Method: external merge  Disk: 7944kB                                                                                                                                                                                                                                      │
│                       Buffers: shared hit=3533 read=50849, temp read=14486 written=25552                                                                                                                                                                                                             │
│                       Worker 0:  Sort Method: external merge  Disk: 8000kB                                                                                                                                                                                                                           │
│                       Worker 1:  Sort Method: external merge  Disk: 7928kB                                                                                                                                                                                                                           │
│                       ->  Parallel Hash Join  (cost=147862.49..164239.25 rows=109382 width=54) (actual time=839.965..1235.101 rows=80523.33 loops=3)                                                                                                                                                 │
│                             Hash Cond: (store_sales.ss_sold_date_sk = date_dim.d_date_sk)                                                                                                                                                                                                            │
│                             Buffers: shared hit=3503 read=50849, temp read=11502 written=22562                                                                                                                                                                                                       │
│                             ->  Parallel Hash Join  (cost=145471.66..161547.68 rows=114565 width=50) (actual time=820.740..1192.922 rows=96392.00 loops=3)                                                                                                                                           │
│                                   Hash Cond: (store_sales.ss_addr_sk = customer_address.ca_address_sk)                                                                                                                                                                                               │
│                                   Buffers: shared hit=2079 read=50849, temp read=11502 written=22562                                                                                                                                                                                                 │
│                                   ->  Partial HashAggregate  (cost=143673.89..158993.80 rows=288022 width=40) (actual time=810.581..1155.245 rows=98213.67 loops=3)                                                                                                                                  │
│                                         Group Key: store_sales.ss_sold_date_sk, store_sales.ss_addr_sk                                                                                                                                                                                               │
│                                         Planned Partitions: 16  Batches: 17  Memory Usage: 8337kB  Disk Usage: 31640kB                                                                                                                                                                               │
│                                         Buffers: shared hit=943 read=50849, temp read=11502 written=22562                                                                                                                                                                                            │
│                                         Worker 0:  Batches: 17  Memory Usage: 8337kB  Disk Usage: 31760kB                                                                                                                                                                                            │
│                                         Worker 1:  Batches: 17  Memory Usage: 8337kB  Disk Usage: 31640kB                                                                                                                                                                                            │
│                                         ->  Parallel Seq Scan on store_sales  (cost=0.00..63792.90 rows=1200090 width=14) (actual time=0.126..79.442 rows=960134.67 loops=3)                                                                                                                         │
│                                               Buffers: shared hit=943 read=50849                                                                                                                                                                                                                     │
│                                   ->  Parallel Hash  (cost=1430.12..1430.12 rows=29412 width=18) (actual time=10.036..10.038 rows=16666.67 loops=3)                                                                                                                                                  │
│                                         Buckets: 65536  Batches: 1  Memory Usage: 3264kB                                                                                                                                                                                                             │
│                                         Buffers: shared hit=1136                                                                                                                                                                                                                                     │
│                                         ->  Parallel Seq Scan on customer_address  (cost=0.00..1430.12 rows=29412 width=18) (actual time=0.007..5.102 rows=16666.67 loops=3)                                                                                                                         │
│                                               Buffers: shared hit=1136                                                                                                                                                                                                                               │
│                             ->  Parallel Hash  (cost=1853.70..1853.70 rows=42970 width=12) (actual time=19.092..19.094 rows=24349.67 loops=3)                                                                                                                                                        │
│                                   Buckets: 131072  Batches: 1  Memory Usage: 4512kB                                                                                                                                                                                                                  │
│                                   Buffers: shared hit=1424                                                                                                                                                                                                                                           │
│                                   ->  Parallel Seq Scan on date_dim  (cost=0.00..1853.70 rows=42970 width=12) (actual time=0.012..10.264 rows=24349.67 loops=3)                                                                                                                                      │
│                                         Buffers: shared hit=1424                                                                                                                                                                                                                                     │
│   CTE ws                                                                                                                                                                                                                                                                                             │
│     ->  Finalize GroupAggregate  (cost=52144.19..62314.79 rows=71894 width=54) (actual time=275.121..340.107 rows=23312.00 loops=1)                                                                                                                                                                  │
│           Group Key: customer_address_1.ca_county, date_dim_1.d_qoy, date_dim_1.d_year                                                                                                                                                                                                               │
│           Buffers: shared hit=18224 read=18163                                                                                                                                                                                                                                                       │
│           ->  Gather Merge  (cost=52144.19..60517.44 rows=71894 width=54) (actual time=275.107..297.072 rows=60190.00 loops=1)                                                                                                                                                                       │
│                 Workers Planned: 2                                                                                                                                                                                                                                                                   │
│                 Workers Launched: 2                                                                                                                                                                                                                                                                  │
│                 Buffers: shared hit=18224 read=18163                                                                                                                                                                                                                                                 │
│                 ->  Sort  (cost=51144.17..51219.06 rows=29956 width=54) (actual time=271.870..272.906 rows=20293.33 loops=3)                                                                                                                                                                         │
│                       Sort Key: customer_address_1.ca_county, date_dim_1.d_qoy, date_dim_1.d_year                                                                                                                                                                                                    │
│                       Sort Method: quicksort  Memory: 2931kB                                                                                                                                                                                                                                         │
│                       Buffers: shared hit=18224 read=18163                                                                                                                                                                                                                                           │
│                       Worker 0:  Sort Method: quicksort  Memory: 2938kB                                                                                                                                                                                                                              │
│                       Worker 1:  Sort Method: quicksort  Memory: 2955kB                                                                                                                                                                                                                              │
│                       ->  Nested Loop  (cost=43571.15..48916.86 rows=29956 width=54) (actual time=184.657..215.740 rows=20419.67 loops=3)                                                                                                                                                            │
│                             Buffers: shared hit=18194 read=18163                                                                                                                                                                                                                                     │
│                             ->  Parallel Hash Join  (cost=43570.84..47586.10 rows=29967 width=50) (actual time=184.630..201.358 rows=20451.00 loops=3)                                                                                                                                               │
│                                   Hash Cond: (web_sales.ws_bill_addr_sk = customer_address_1.ca_address_sk)                                                                                                                                                                                          │
│                                   Buffers: shared hit=1797 read=18163                                                                                                                                                                                                                                │
│                                   ->  Partial HashAggregate  (cost=41773.08..45599.48 rows=71938 width=40) (actual time=177.706..188.464 rows=20477.33 loops=3)                                                                                                                                      │
│                                         Group Key: web_sales.ws_sold_date_sk, web_sales.ws_bill_addr_sk                                                                                                                                                                                              │
│                                         Planned Partitions: 4  Batches: 1  Memory Usage: 7953kB                                                                                                                                                                                                      │
│                                         Buffers: shared hit=661 read=18163                                                                                                                                                                                                                           │
│                                         Worker 0:  Batches: 1  Memory Usage: 7953kB                                                                                                                                                                                                                  │
│                                         Worker 1:  Batches: 1  Memory Usage: 7953kB                                                                                                                                                                                                                  │
│                                         ->  Parallel Seq Scan on web_sales  (cost=0.00..21821.43 rows=299743 width=14) (actual time=0.106..23.122 rows=239794.67 loops=3)                                                                                                                            │
│                                               Buffers: shared hit=661 read=18163                                                                                                                                                                                                                     │
│                                   ->  Parallel Hash  (cost=1430.12..1430.12 rows=29412 width=18) (actual time=6.846..6.847 rows=16666.67 loops=3)                                                                                                                                                    │
│                                         Buckets: 65536  Batches: 1  Memory Usage: 3264kB                                                                                                                                                                                                             │
│                                         Buffers: shared hit=1136                                                                                                                                                                                                                                     │
│                                         ->  Parallel Seq Scan on customer_address customer_address_1  (cost=0.00..1430.12 rows=29412 width=18) (actual time=0.008..3.586 rows=16666.67 loops=3)                                                                                                      │
│                                               Buffers: shared hit=1136                                                                                                                                                                                                                               │
│                             ->  Memoize  (cost=0.30..0.33 rows=1 width=12) (actual time=0.000..0.000 rows=1.00 loops=61353)                                                                                                                                                                          │
│                                   Cache Key: web_sales.ws_sold_date_sk                                                                                                                                                                                                                               │
│                                   Cache Mode: logical                                                                                                                                                                                                                                                │
│                                   Estimates: capacity=1822 distinct keys=1822 lookups=29967 hit percent=93.92%                                                                                                                                                                                       │
│                                   Hits: 18542  Misses: 1824  Evictions: 0  Overflows: 0  Memory Usage: 200kB                                                                                                                                                                                         │
│                                   Buffers: shared hit=16397                                                                                                                                                                                                                                          │
│                                   Worker 0:  Hits: 18589  Misses: 1821  Evictions: 0  Overflows: 0  Memory Usage: 200kB                                                                                                                                                                              │
│                                   Worker 1:  Hits: 18754  Misses: 1823  Evictions: 0  Overflows: 0  Memory Usage: 200kB                                                                                                                                                                              │
│                                   ->  Index Scan using date_dim_pkey on date_dim date_dim_1  (cost=0.29..0.32 rows=1 width=12) (actual time=0.002..0.002 rows=1.00 loops=5468)                                                                                                                       │
│                                         Index Cond: (d_date_sk = web_sales.ws_sold_date_sk)                                                                                                                                                                                                          │
│                                         Index Searches: 5465                                                                                                                                                                                                                                         │
│                                         Buffers: shared hit=16397                                                                                                                                                                                                                                    │
│   ->  Nested Loop  (cost=0.00..25081.00 rows=1 width=210) (actual time=43808.287..3825536.966 rows=43.00 loops=1)                                                                                                                                                                                    │
│         Join Filter: (((ss1.ca_county)::text = (ss2.ca_county)::text) AND (CASE WHEN (ws1.web_sales > '0'::numeric) THEN (ws2.web_sales / ws1.web_sales) ELSE NULL::numeric END > CASE WHEN (ss1.store_sales > '0'::numeric) THEN (ss2.store_sales / ss1.store_sales) ELSE NULL::numeric END))       │
│         Rows Removed by Join Filter: 226832                                                                                                                                                                                                                                                          │
│         Buffers: shared hit=7500 read=22936, temp read=4819 written=8505                                                                                                                                                                                                                             │
│         ->  Merge Join  (cost=0.00..8360.31 rows=1 width=224) (actual time=1747.759..1760.887 rows=825.00 loops=1)                                                                                                                                                                                   │
│               Merge Cond: ((ss1.ca_county)::text = (ws1.ca_county)::text)                                                                                                                                                                                                                            │
│               Buffers: shared hit=7500 read=22936, temp read=4321 written=8505                                                                                                                                                                                                                       │
│               ->  CTE Scan on ss ss1  (cost=0.00..6562.93 rows=7 width=114) (actual time=1471.648..1477.297 rows=1647.00 loops=1)                                                                                                                                                                    │
│                     Filter: ((d_qoy = 1) AND (d_year = 1999))                                                                                                                                                                                                                                        │
│                     Rows Removed by Filter: 33470                                                                                                                                                                                                                                                    │
│                     Storage: Memory  Maximum Storage: 2635kB                                                                                                                                                                                                                                         │
│                     Buffers: shared hit=1278 read=16903, temp read=4321 written=8505                                                                                                                                                                                                                 │
│               ->  Materialize  (cost=0.00..1797.36 rows=2 width=110) (actual time=275.335..280.952 rows=911.00 loops=1)                                                                                                                                                                              │
│                     Storage: Memory  Maximum Storage: 17kB                                                                                                                                                                                                                                           │
│                     Buffers: shared hit=6222 read=6033                                                                                                                                                                                                                                               │
│                     ->  CTE Scan on ws ws1  (cost=0.00..1797.35 rows=2 width=110) (actual time=275.333..279.774 rows=911.00 loops=1)                                                                                                                                                                 │
│                           Filter: ((d_qoy = 1) AND (d_year = 1999))                                                                                                                                                                                                                                  │
│                           Rows Removed by Filter: 22390                                                                                                                                                                                                                                              │
│                           Storage: Memory  Maximum Storage: 1700kB                                                                                                                                                                                                                                   │
│                           Buffers: shared hit=6222 read=6033                                                                                                                                                                                                                                         │
│         ->  Nested Loop  (cost=0.00..16720.65 rows=1 width=440) (actual time=5.913..4634.838 rows=275.00 loops=825)                                                                                                                                                                                  │
│               Join Filter: (((ss2.ca_county)::text = (ss3.ca_county)::text) AND (CASE WHEN (ws2.web_sales > '0'::numeric) THEN (ws3.web_sales / ws2.web_sales) ELSE NULL::numeric END > CASE WHEN (ss2.store_sales > '0'::numeric) THEN (ss3.store_sales / ss2.store_sales) ELSE NULL::numeric END)) │
│               Rows Removed by Join Filter: 1037001                                                                                                                                                                                                                                                   │
│               Buffers: temp read=498                                                                                                                                                                                                                                                                 │
│               ->  Merge Join  (cost=0.00..8360.31 rows=1 width=220) (actual time=0.001..5.266 rows=844.00 loops=825)                                                                                                                                                                                 │
│                     Merge Cond: ((ss2.ca_county)::text = (ws2.ca_county)::text)                                                                                                                                                                                                                      │
│                     ->  CTE Scan on ss ss2  (cost=0.00..6562.93 rows=7 width=110) (actual time=0.001..4.131 rows=1634.00 loops=825)                                                                                                                                                                  │
│                           Filter: ((d_year = 1999) AND (d_qoy = 2))                                                                                                                                                                                                                                  │
│                           Rows Removed by Filter: 33468                                                                                                                                                                                                                                              │
│                           Storage: Memory  Maximum Storage: 2635kB                                                                                                                                                                                                                                   │
│                     ->  Materialize  (cost=0.00..1797.36 rows=2 width=110) (actual time=0.000..0.053 rows=925.00 loops=825)                                                                                                                                                                          │
│                           Storage: Memory  Maximum Storage: 74kB                                                                                                                                                                                                                                     │
│                           ->  CTE Scan on ws ws2  (cost=0.00..1797.35 rows=2 width=110) (actual time=0.001..2.784 rows=925.00 loops=1)                                                                                                                                                               │
│                                 Filter: ((d_year = 1999) AND (d_qoy = 2))                                                                                                                                                                                                                            │
│                                 Rows Removed by Filter: 22382                                                                                                                                                                                                                                        │
│                                 Storage: Memory  Maximum Storage: 1700kB                                                                                                                                                                                                                             │
│               ->  Merge Join  (cost=0.00..8360.31 rows=1 width=220) (actual time=0.002..5.383 rows=1229.00 loops=696300)                                                                                                                                                                             │
│                     Merge Cond: ((ss3.ca_county)::text = (ws3.ca_county)::text)                                                                                                                                                                                                                      │
│                     Buffers: temp read=498                                                                                                                                                                                                                                                           │
│                     ->  CTE Scan on ss ss3  (cost=0.00..6562.93 rows=7 width=110) (actual time=0.001..4.051 rows=1796.00 loops=696300)                                                                                                                                                               │
│                           Filter: ((d_year = 1999) AND (d_qoy = 3))                                                                                                                                                                                                                                  │
│                           Rows Removed by Filter: 33292                                                                                                                                                                                                                                              │
│                           Storage: Memory  Maximum Storage: 2635kB                                                                                                                                                                                                                                   │
│                           Buffers: temp read=498                                                                                                                                                                                                                                                     │
│                     ->  Materialize  (cost=0.00..1797.36 rows=2 width=110) (actual time=0.000..0.047 rows=1261.00 loops=696300)                                                                                                                                                                      │
│                           Storage: Memory  Maximum Storage: 95kB                                                                                                                                                                                                                                     │
│                           ->  CTE Scan on ws ws3  (cost=0.00..1797.35 rows=2 width=110) (actual time=0.001..74.725 rows=1261.00 loops=1)                                                                                                                                                             │
│                                 Filter: ((d_year = 1999) AND (d_qoy = 3))                                                                                                                                                                                                                            │
│                                 Rows Removed by Filter: 22051                                                                                                                                                                                                                                        │
│                                 Storage: Memory  Maximum Storage: 1700kB                                                                                                                                                                                                                             │
│ Planning:                                                                                                                                                                                                                                                                                            │
│   Buffers: shared hit=12                                                                                                                                                                                                                                                                             │
│ Planning Time: 4.951 ms                                                                                                                                                                                                                                                                              │
│ Execution Time: 3825542.556 ms                                                                                                                                                                                                                                                                       │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘

Attachments:

  [text/plain] query-31.master.explain (50.6K, 2-query-31.master.explain)
│ Sort  (cost=656889.77..656889.77 rows=1 width=210) (actual time=17164.506..17164.519 rows=43.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                                               │
│   Sort Key: ((ss3.store_sales / ss2.store_sales))                                                                                                                                                                                                                                                                                                                                                                                                                                                                      │
│   Sort Method: quicksort  Memory: 28kB                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 │
│   Buffers: shared hit=6533 read=69203, temp read=4343 written=12055                                                                                                                                                                                                                                                                                                                                                                                                                                                    │
│   CTE ss                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               │
│     ->  HashAggregate  (cost=323021.86..377372.99 rows=1476800 width=54) (actual time=3389.564..3677.220 rows=35136.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                        │
│           Group Key: customer_address.ca_county, date_dim.d_qoy, date_dim.d_year                                                                                                                                                                                                                                                                                                                                                                                                                                       │
│           Planned Partitions: 64  Batches: 65  Memory Usage: 8209kB  Disk Usage: 56840kB                                                                                                                                                                                                                                                                                                                                                                                                                               │
│           Buffers: shared hit=3408 read=50944, temp read=3962 written=10947                                                                                                                                                                                                                                                                                                                                                                                                                                            │
│           ->  Hash Join  (cost=5328.60..100701.93 rows=2625180 width=28) (actual time=46.394..2034.907 rows=2685273.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                        │
│                 Hash Cond: (store_sales.ss_sold_date_sk = date_dim.d_date_sk)                                                                                                                                                                                                                                                                                                                                                                                                                                          │
│                 Buffers: shared hit=3408 read=50944                                                                                                                                                                                                                                                                                                                                                                                                                                                                    │
│                 ->  Hash Join  (cost=2261.00..90416.35 rows=2749551 width=24) (actual time=18.753..1396.048 rows=2750429.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                   │
│                       Hash Cond: (store_sales.ss_addr_sk = customer_address.ca_address_sk)                                                                                                                                                                                                                                                                                                                                                                                                                             │
│                       Buffers: shared hit=1984 read=50944                                                                                                                                                                                                                                                                                                                                                                                                                                                              │
│                       ->  Seq Scan on store_sales  (cost=0.00..80594.17 rows=2880217 width=14) (actual time=0.063..228.063 rows=2880404.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                    │
│                             Buffers: shared hit=848 read=50944                                                                                                                                                                                                                                                                                                                                                                                                                                                         │
│                       ->  Hash  (cost=1636.00..1636.00 rows=50000 width=18) (actual time=18.651..18.651 rows=50000.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                         │
│                             Buckets: 65536  Batches: 1  Memory Usage: 3052kB                                                                                                                                                                                                                                                                                                                                                                                                                                           │
│                             Buffers: shared hit=1136                                                                                                                                                                                                                                                                                                                                                                                                                                                                   │
│                             ->  Seq Scan on customer_address  (cost=0.00..1636.00 rows=50000 width=18) (actual time=0.005..9.555 rows=50000.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                │
│                                   Buffers: shared hit=1136
│                 ->  Hash  (cost=2154.49..2154.49 rows=73049 width=12) (actual time=27.627..27.629 rows=73049.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                               │
│                       Buckets: 131072  Batches: 1  Memory Usage: 4163kB                                                                                                                                                                                                                                                                                                                                                                                                                                                │
│                       Buffers: shared hit=1424                                                                                                                                                                                                                                                                                                                                                                                                                                                                         │
│                       ->  Seq Scan on date_dim  (cost=0.00..2154.49 rows=73049 width=12) (actual time=0.009..15.154 rows=73049.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                             │
│                             Buffers: shared hit=1424                                                                                                                                                                                                                                                                                                                                                                                                                                                                   │
│   CTE ws                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               │
│     ->  HashAggregate  (cost=96009.03..114825.35 rows=718952 width=54) (actual time=977.215..1014.889 rows=23320.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                           │
│           Group Key: customer_address_1.ca_county, date_dim_1.d_qoy, date_dim_1.d_year                                                                                                                                                                                                                                                                                                                                                                                                                                 │
│           Planned Partitions: 32  Batches: 33  Memory Usage: 8209kB  Disk Usage: 6032kB                                                                                                                                                                                                                                                                                                                                                                                                                                │
│           Buffers: shared hit=3125 read=18259, temp read=381 written=1108                                                                                                                                                                                                                                                                                                                                                                                                                                              │
│           ->  Hash Join  (cost=5328.60..35122.78 rows=718952 width=28) (actual time=46.623..611.054 rows=719118.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                            │
│                 Hash Cond: (web_sales.ws_bill_addr_sk = customer_address_1.ca_address_sk)                                                                                                                                                           
│                 Buffers: shared hit=3125 read=18259                                                                                                                                                                                                                                                                                                                                                                                                                                                                    │
│                 ->  Hash Join  (cost=3067.60..30973.94 rows=719120 width=18) (actual time=27.691..424.273 rows=719195.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                      │
│                       Hash Cond: (web_sales.ws_sold_date_sk = date_dim_1.d_date_sk)                                                                                                                                                                                                                                                                                                                                                                                                                                    │
│                       Buffers: shared hit=1989 read=18259                                                                                                                                                                                                                                                                                                                                                                                                                                                              │
│                       ->  Seq Scan on web_sales  (cost=0.00..26017.84 rows=719384 width=14) (actual time=0.082..63.389 rows=719384.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                         │
│                             Buffers: shared hit=565 read=18259                                                                                                                                                                                                                                                                                                                                                                                                                                                         │
│                       ->  Hash  (cost=2154.49..2154.49 rows=73049 width=12) (actual time=27.538..27.538 rows=73049.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                         │
│                             Buckets: 131072  Batches: 1  Memory Usage: 4163kB                                                                                                                                                                                                                                                                                                                                                                                                                                          │
│                             Buffers: shared hit=1424                                                                                                                                                                                                                                                                                                                                                                                                                                                                   │
│                             ->  Seq Scan on date_dim date_dim_1  (cost=0.00..2154.49 rows=73049 width=12) (actual time=0.006..14.914 rows=73049.00 loops=1)                                                                                                                                                                                                                                                                                                                                                            │
│                                   Buffers: shared hit=1424                                                                                                                                                                                                                                                                                                                                                                                                                                                             │
│                 ->  Hash  (cost=1636.00..1636.00 rows=50000 width=18) (actual time=18.902..18.902 rows=50000.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                               │
│                       Buckets: 65536  Batches: 1  Memory Usage: 3052kB                                                                                                                                                                                                                                                                                                                                                                                                                                                 │
│                       Buffers: shared hit=1136                                                                                                                                                                                                                                                                                                                                                                                                                                                                         │
│                       ->  Seq Scan on customer_address customer_address_1  (cost=0.00..1636.00 rows=50000 width=18) (actual time=0.008..9.727 rows=50000.00 loops=1)                                                                                                                                                                                                                                                                                                                                                   │
│                             Buffers: shared hit=1136                                                                                                                                                                                                                                                                                                                                                                                                                                                                   │
│   ->  Nested Loop  (cost=0.00..164691.41 rows=1 width=210) (actual time=4817.695..17164.430 rows=43.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                                        │
│         Join Filter: (((ss1.ca_county)::text = (ws2.ca_county)::text) AND (CASE WHEN (ws1.web_sales > '0'::numeric) THEN (ws2.web_sales / ws1.web_sales) ELSE NULL::numeric END > CASE WHEN (ss1.store_sales > '0'::numeric) THEN (ss2.store_sales / ss1.store_sales) ELSE NULL::numeric END) AND (CASE WHEN (ws2.web_sales > '0'::numeric) THEN (ws3.web_sales / ws2.web_sales) ELSE NULL::numeric END > CASE WHEN (ss2.store_sales > '0'::numeric) THEN (ss3.store_sales / ss2.store_sales) ELSE NULL::numeric END)) │
│         Rows Removed by Join Filter: 527207                                                                                                                                                                                                                                                                                                                                                                                                                                                                            │
│         Buffers: shared hit=6533 read=69203, temp read=4343 written=12055                                                                                                                                                                                                                                                                                                                                                                                                                                              │
│         ->  Nested Loop  (cost=0.00..146716.93 rows=1 width=554) (actual time=4671.968..15501.760 rows=570.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                                 │
│               Join Filter: ((ss1.ca_county)::text = (ss3.ca_county)::text)                                                                                                                                                                                                                                                                                                                                                                                                                                             │
│               Rows Removed by Join Filter: 1038674                                                                                                                                                                                                                                                                                                                                                                                                                                                                     │
│               Buffers: shared hit=6533 read=69203, temp read=4343 written=12055                                                                                                                                                                                                                                                                                                                                                                                                                                        │
│               ->  Nested Loop  (cost=0.00..109796.47 rows=1 width=444) (actual time=4669.164..12922.095 rows=578.00 loops=1)
│                     Join Filter: ((ss1.ca_county)::text = (ss2.ca_county)::text)                                                                                                                                                                                                                                                                                                                                                                                                                                       │
│                     Rows Removed by Join Filter: 1008217                                                                                                                                                                                                                                                                                                                                                                                                                                                               │
│                     Buffers: shared hit=6533 read=69203, temp read=3559 written=12055                                                                                                                                                                                                                                                                                                                                                                                                                                  │
│                     ->  Nested Loop  (cost=0.00..72876.00 rows=1 width=334) (actual time=4666.835..10231.481 rows=617.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                      │
│                           Join Filter: ((ss1.ca_county)::text = (ws1.ca_county)::text)                                                                                                                                                                                                                                                                                                                                                                                                                                 │
│                           Rows Removed by Join Filter: 1089697                                                                                                                                                                                                                                                                                                                                                                                                                                                         │
│                           Buffers: shared hit=6533 read=69203, temp read=3559 written=12055                                                                                                                                                                                                                                                                                                                                                                                                                            │
│                           ->  Nested Loop  (cost=0.00..35954.71 rows=2 width=220) (actual time=1031.594..3687.112 rows=662.00 loops=1)                                                                                                                                                                                                                                                                                                                                                                                 │
│                                 Join Filter: ((ws1.ca_county)::text = (ws3.ca_county)::text)                                                                                                                                                                                                                                                                                                                                                                                                                           │
│                                 Rows Removed by Join Filter: 1148109                                                                                                                                                                                                                                                                                                                                                                                                                                                   │
│                                 Buffers: shared hit=3125 read=18259, temp read=381 written=1108                                                                                                                                                                                                                                                                                                                                                                                                                        │
│                                 ->  CTE Scan on ws ws1  (cost=0.00..17973.80 rows=18 width=110) (actual time=977.224..980.082 rows=911.00 loops=1)                                                                                                  
│                                       Filter: ((d_qoy = 1) AND (d_year = 1999))                                                                                                                                                                                                                                                                                                                                                                                                                                        │
│                                       Rows Removed by Filter: 22409                                                                                                                                                                                                                                                                                                                                                                                                                                                    │
│                                       Storage: Memory  Maximum Storage: 1700kB                                                                                                                                                                                                                                                                                                                                                                                                                                         │
│                                       Buffers: shared hit=3125 read=18259, temp written=1107                                                                                                                                                                                                                                                                                                                                                                                                                           │
│                                 ->  CTE Scan on ws ws3  (cost=0.00..17973.80 rows=18 width=110) (actual time=0.005..2.857 rows=1261.00 loops=911)                                                                                                                                                                                                                                                                                                                                                                      │
│                                       Filter: ((d_year = 1999) AND (d_qoy = 3))                                                                                                                                                                                                                                                                                                                                                                                                                                        │
│                                       Rows Removed by Filter: 22059                                                                                                                                                                                                                                                                                                                                                                                                                                                    │
│                                       Storage: Memory  Maximum Storage: 1700kB                                                                                                                                                                                                                                                                                                                                                                                                                                         │
│                                       Buffers: temp read=381 written=1                                                                                                                                                                                                                                                                                                                                                                                                                                                 │
│                           ->  CTE Scan on ss ss1  (cost=0.00..36920.00 rows=37 width=114) (actual time=5.121..9.740 rows=1647.00 loops=662)                                                                                                                                                                                                                                                                                                                                                                            │
│                                 Filter: ((d_qoy = 1) AND (d_year = 1999))                                                                                                                                                                                                                                                                                                                                                                                                                                              │
│                                 Rows Removed by Filter: 33489                                                                                                                                                                                                                                                                                                                                                                                                                                                          │
│                                 Storage: Memory  Maximum Storage: 2636kB                                                                                                                                                                                                                                                                                                                                                                                                                                               │
│                                 Buffers: shared hit=3408 read=50944, temp read=3178 written=10947                                                                                                                                                                                                                                                                                                                                                                                                                      │
│                     ->  CTE Scan on ss ss2  (cost=0.00..36920.00 rows=37 width=110) (actual time=0.001..4.216 rows=1635.00 loops=617)                                                                                                                                                                                                                                                                                                                                                                                  │
│                           Filter: ((d_year = 1999) AND (d_qoy = 2))                                                                                                                                                                                                                                                                                                                                                                                                                                                    │
│                           Rows Removed by Filter: 33501                                                                                                                                                                                                                                                                                                                                                                                                                                                                │
│                           Storage: Memory  Maximum Storage: 2636kB                                                                                                                                                                                                                                                                                                                                                                                                                                                     │
│               ->  CTE Scan on ss ss3  (cost=0.00..36920.00 rows=37 width=110) (actual time=0.006..4.305 rows=1798.00 loops=578)                                                                                                                                                                                                                                                                                                                                                                                        │
│                     Filter: ((d_year = 1999) AND (d_qoy = 3))                                                                                                                                                                                                                                                                                                                                                                                                                                                          │
│                     Rows Removed by Filter: 33338                                                                                                                                                                                                                                                                                                                                                                                                                                                                      │
│                     Storage: Memory  Maximum Storage: 2636kB                                                                                                                                                                                                                                                                                                                                                                                                                                                           │
│                     Buffers: temp read=784                                                                                                                                                                                                                                                                                                                                                                                                                                                                             │
│         ->  CTE Scan on ws ws2  (cost=0.00..17973.80 rows=18 width=110) (actual time=0.001..2.810 rows=925.00 loops=570)                                                                                                                                                                                                                                                                                                                                                                                               │
│               Filter: ((d_year = 1999) AND (d_qoy = 2))
│               Rows Removed by Filter: 22395                                                                                                                                                                                                                                                                                                                                                                                                                                                                            │
│               Storage: Memory  Maximum Storage: 1700kB                                                                                                                                                                                                                                                                                                                                                                                                                                                                 │
│ Planning:                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              │
│   Buffers: shared hit=12                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               │
│ Planning Time: 2.180 ms                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                │
│ Execution Time: 17166.558 ms                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘


  [text/plain] query-31.patch.explain (42.0K, 3-query-31.patch.explain)
│ Sort  (cost=302668.66..302668.66 rows=1 width=210) (actual time=3825537.172..3825541.540 rows=43.00 loops=1)                                                                                                                                                                                         │
│   Sort Key: ((ss3.store_sales / ss2.store_sales))                                                                                                                                                                                                                                                    │
│   Sort Method: quicksort  Memory: 28kB                                                                                                                                                                                                                                                               │
│   Buffers: shared hit=21757 read=69012, temp read=14486 written=25552                                                                                                                                                                                                                                │
│   CTE ss                                                                                                                                                                                                                                                                                             │
│     ->  Finalize GroupAggregate  (cost=178135.51..215272.86 rows=262517 width=54) (actual time=1471.638..1733.635 rows=35117.00 loops=1)                                                                                                                                                             │
│           Group Key: customer_address.ca_county, date_dim.d_qoy, date_dim.d_year                                                                                                                                                                                                                     │
│           Buffers: shared hit=3533 read=50849, temp read=14486 written=25552                                                                                                                                                                                                                         │
│           ->  Gather Merge  (cost=178135.51..208709.94 rows=262517 width=54) (actual time=1471.627..1586.417 rows=234867.00 loops=1)                                                                                                                                                                 │
│                 Workers Planned: 2                                                                                                                                                                                                                                                                   │
│                 Workers Launched: 2                                                                                                                                                                                                                                                                  │
│                 Buffers: shared hit=3533 read=50849, temp read=14486 written=25552                                                                                                                                                                                                                   │
│                 ->  Sort  (cost=177135.48..177408.94 rows=109382 width=54) (actual time=1463.292..1497.110 rows=78658.67 loops=3)                                                                                                                                                                    │
│                       Sort Key: customer_address.ca_county, date_dim.d_qoy, date_dim.d_year                                                                                                                                                                                                          │
│                       Sort Method: external merge  Disk: 7944kB                                                                                                                                                                                                                                      │
│                       Buffers: shared hit=3533 read=50849, temp read=14486 written=25552                                                                                                                                                                                                             │
│                       Worker 0:  Sort Method: external merge  Disk: 8000kB                                                                                                                                                                                                                           │
│                       Worker 1:  Sort Method: external merge  Disk: 7928kB                                                                                                                                                                                                                           │
│                       ->  Parallel Hash Join  (cost=147862.49..164239.25 rows=109382 width=54) (actual time=839.965..1235.101 rows=80523.33 loops=3)                                                                                                                                                 │
│                             Hash Cond: (store_sales.ss_sold_date_sk = date_dim.d_date_sk)                                                                                                                                                                                                            │
│                             Buffers: shared hit=3503 read=50849, temp read=11502 written=22562                                                                                                                                                                                                       │
│                             ->  Parallel Hash Join  (cost=145471.66..161547.68 rows=114565 width=50) (actual time=820.740..1192.922 rows=96392.00 loops=3)                                                                                                                                           │
│                                   Hash Cond: (store_sales.ss_addr_sk = customer_address.ca_address_sk)                                                                                                                                                                                               │
│                                   Buffers: shared hit=2079 read=50849, temp read=11502 written=22562                                                                                                                                                                                                 │
│                                   ->  Partial HashAggregate  (cost=143673.89..158993.80 rows=288022 width=40) (actual time=810.581..1155.245 rows=98213.67 loops=3)                                                                                                                                  │
│                                         Group Key: store_sales.ss_sold_date_sk, store_sales.ss_addr_sk                                                                                                                                                                                               │
│                                         Planned Partitions: 16  Batches: 17  Memory Usage: 8337kB  Disk Usage: 31640kB                                                                                                                                                                               │
│                                         Buffers: shared hit=943 read=50849, temp read=11502 written=22562                                                                                                                                                                                            │
│                                         Worker 0:  Batches: 17  Memory Usage: 8337kB  Disk Usage: 31760kB                                                                                                                                                                                            │
│                                         Worker 1:  Batches: 17  Memory Usage: 8337kB  Disk Usage: 31640kB                                                                                                                                                                                            │
│                                         ->  Parallel Seq Scan on store_sales  (cost=0.00..63792.90 rows=1200090 width=14) (actual time=0.126..79.442 rows=960134.67 loops=3)                                                                                                                         │
│                                               Buffers: shared hit=943 read=50849                                                                                                                                                                                                                     │
│                                   ->  Parallel Hash  (cost=1430.12..1430.12 rows=29412 width=18) (actual time=10.036..10.038 rows=16666.67 loops=3)                                                                                                                                                  │
│                                         Buckets: 65536  Batches: 1  Memory Usage: 3264kB                                                                                                                                                                                                             │
│                                         Buffers: shared hit=1136                                                                                                                                                                                                                                     │
│                                         ->  Parallel Seq Scan on customer_address  (cost=0.00..1430.12 rows=29412 width=18) (actual time=0.007..5.102 rows=16666.67 loops=3)                                                                                                                         │
│                                               Buffers: shared hit=1136                                                                                                                                                                                                                               │
│                             ->  Parallel Hash  (cost=1853.70..1853.70 rows=42970 width=12) (actual time=19.092..19.094 rows=24349.67 loops=3)                                                                                                                                                        │
│                                   Buckets: 131072  Batches: 1  Memory Usage: 4512kB                                                                                                                                                                                                                  │
│                                   Buffers: shared hit=1424                                                                                                                                                                                                                                           │
│                                   ->  Parallel Seq Scan on date_dim  (cost=0.00..1853.70 rows=42970 width=12) (actual time=0.012..10.264 rows=24349.67 loops=3)                                                                                                                                      │
│                                         Buffers: shared hit=1424                                                                                                                                                                                                                                     │
│   CTE ws                                                                                                                                                                                                                                                                                             │
│     ->  Finalize GroupAggregate  (cost=52144.19..62314.79 rows=71894 width=54) (actual time=275.121..340.107 rows=23312.00 loops=1)                                                                                                                                                                  │
│           Group Key: customer_address_1.ca_county, date_dim_1.d_qoy, date_dim_1.d_year                                                                                                                                                                                                               │
│           Buffers: shared hit=18224 read=18163                                                                                                                                                                                                                                                       │
│           ->  Gather Merge  (cost=52144.19..60517.44 rows=71894 width=54) (actual time=275.107..297.072 rows=60190.00 loops=1)                                                                                                                                                                       │
│                 Workers Planned: 2                                                                                                                                                                                                                                                                   │
│                 Workers Launched: 2                                                                                                                                                                                                                                                                  │
│                 Buffers: shared hit=18224 read=18163                                                                                                                                                                                                                                                 │
│                 ->  Sort  (cost=51144.17..51219.06 rows=29956 width=54) (actual time=271.870..272.906 rows=20293.33 loops=3)                                                                                                                                                                         │
│                       Sort Key: customer_address_1.ca_county, date_dim_1.d_qoy, date_dim_1.d_year                                                                                                                                                                                                    │
│                       Sort Method: quicksort  Memory: 2931kB                                                                                                                                                                                                                                         │
│                       Buffers: shared hit=18224 read=18163                                                                                                                                                                                                                                           │
│                       Worker 0:  Sort Method: quicksort  Memory: 2938kB                                                                                                                                                                                                                              │
│                       Worker 1:  Sort Method: quicksort  Memory: 2955kB                                                                                                                                                                                                                              │
│                       ->  Nested Loop  (cost=43571.15..48916.86 rows=29956 width=54) (actual time=184.657..215.740 rows=20419.67 loops=3)                                                                                                                                                            │
│                             Buffers: shared hit=18194 read=18163                                                                                                                                                                                                                                     │
│                             ->  Parallel Hash Join  (cost=43570.84..47586.10 rows=29967 width=50) (actual time=184.630..201.358 rows=20451.00 loops=3)                                                                                                                                               │
│                                   Hash Cond: (web_sales.ws_bill_addr_sk = customer_address_1.ca_address_sk)                                                                                                                                                                                          │
│                                   Buffers: shared hit=1797 read=18163                                                                                                                                                                                                                                │
│                                   ->  Partial HashAggregate  (cost=41773.08..45599.48 rows=71938 width=40) (actual time=177.706..188.464 rows=20477.33 loops=3)                                                                                                                                      │
│                                         Group Key: web_sales.ws_sold_date_sk, web_sales.ws_bill_addr_sk                                                                                                                                                                                              │
│                                         Planned Partitions: 4  Batches: 1  Memory Usage: 7953kB                                                                                                                                                                                                      │
│                                         Buffers: shared hit=661 read=18163                                                                                                                                                                                                                           │
│                                         Worker 0:  Batches: 1  Memory Usage: 7953kB                                                                                                                                                                                                                  │
│                                         Worker 1:  Batches: 1  Memory Usage: 7953kB                                                                                                                                                                                                                  │
│                                         ->  Parallel Seq Scan on web_sales  (cost=0.00..21821.43 rows=299743 width=14) (actual time=0.106..23.122 rows=239794.67 loops=3)                                                                                                                            │
│                                               Buffers: shared hit=661 read=18163                                                                                                                                                                                                                     │
│                                   ->  Parallel Hash  (cost=1430.12..1430.12 rows=29412 width=18) (actual time=6.846..6.847 rows=16666.67 loops=3)                                                                                                                                                    │
│                                         Buckets: 65536  Batches: 1  Memory Usage: 3264kB                                                                                                                                                                                                             │
│                                         Buffers: shared hit=1136                                                                                                                                                                                                                                     │
│                                         ->  Parallel Seq Scan on customer_address customer_address_1  (cost=0.00..1430.12 rows=29412 width=18) (actual time=0.008..3.586 rows=16666.67 loops=3)                                                                                                      │
│                                               Buffers: shared hit=1136                                                                                                                                                                                                                               │
│                             ->  Memoize  (cost=0.30..0.33 rows=1 width=12) (actual time=0.000..0.000 rows=1.00 loops=61353)                                                                                                                                                                          │
│                                   Cache Key: web_sales.ws_sold_date_sk                                                                                                                                                                                                                               │
│                                   Cache Mode: logical                                                                                                                                                                                                                                                │
│                                   Estimates: capacity=1822 distinct keys=1822 lookups=29967 hit percent=93.92%                                                                                                                                                                                       │
│                                   Hits: 18542  Misses: 1824  Evictions: 0  Overflows: 0  Memory Usage: 200kB                                                                                                                                                                                         │
│                                   Buffers: shared hit=16397                                                                                                                                                                                                                                          │
│                                   Worker 0:  Hits: 18589  Misses: 1821  Evictions: 0  Overflows: 0  Memory Usage: 200kB                                                                                                                                                                              │
│                                   Worker 1:  Hits: 18754  Misses: 1823  Evictions: 0  Overflows: 0  Memory Usage: 200kB                                                                                                                                                                              │
│                                   ->  Index Scan using date_dim_pkey on date_dim date_dim_1  (cost=0.29..0.32 rows=1 width=12) (actual time=0.002..0.002 rows=1.00 loops=5468)                                                                                                                       │
│                                         Index Cond: (d_date_sk = web_sales.ws_sold_date_sk)                                                                                                                                                                                                          │
│                                         Index Searches: 5465                                                                                                                                                                                                                                         │
│                                         Buffers: shared hit=16397                                                                                                                                                                                                                                    │
│   ->  Nested Loop  (cost=0.00..25081.00 rows=1 width=210) (actual time=43808.287..3825536.966 rows=43.00 loops=1)                                                                                                                                                                                    │
│         Join Filter: (((ss1.ca_county)::text = (ss2.ca_county)::text) AND (CASE WHEN (ws1.web_sales > '0'::numeric) THEN (ws2.web_sales / ws1.web_sales) ELSE NULL::numeric END > CASE WHEN (ss1.store_sales > '0'::numeric) THEN (ss2.store_sales / ss1.store_sales) ELSE NULL::numeric END))       │
│         Rows Removed by Join Filter: 226832                                                                                                                                                                                                                                                          │
│         Buffers: shared hit=7500 read=22936, temp read=4819 written=8505                                                                                                                                                                                                                             │
│         ->  Merge Join  (cost=0.00..8360.31 rows=1 width=224) (actual time=1747.759..1760.887 rows=825.00 loops=1)                                                                                                                                                                                   │
│               Merge Cond: ((ss1.ca_county)::text = (ws1.ca_county)::text)                                                                                                                                                                                                                            │
│               Buffers: shared hit=7500 read=22936, temp read=4321 written=8505                                                                                                                                                                                                                       │
│               ->  CTE Scan on ss ss1  (cost=0.00..6562.93 rows=7 width=114) (actual time=1471.648..1477.297 rows=1647.00 loops=1)                                                                                                                                                                    │
│                     Filter: ((d_qoy = 1) AND (d_year = 1999))                                                                                                                                                                                                                                        │
│                     Rows Removed by Filter: 33470                                                                                                                                                                                                                                                    │
│                     Storage: Memory  Maximum Storage: 2635kB                                                                                                                                                                                                                                         │
│                     Buffers: shared hit=1278 read=16903, temp read=4321 written=8505                                                                                                                                                                                                                 │
│               ->  Materialize  (cost=0.00..1797.36 rows=2 width=110) (actual time=275.335..280.952 rows=911.00 loops=1)                                                                                                                                                                              │
│                     Storage: Memory  Maximum Storage: 17kB                                                                                                                                                                                                                                           │
│                     Buffers: shared hit=6222 read=6033                                                                                                                                                                                                                                               │
│                     ->  CTE Scan on ws ws1  (cost=0.00..1797.35 rows=2 width=110) (actual time=275.333..279.774 rows=911.00 loops=1)                                                                                                                                                                 │
│                           Filter: ((d_qoy = 1) AND (d_year = 1999))                                                                                                                                                                                                                                  │
│                           Rows Removed by Filter: 22390                                                                                                                                                                                                                                              │
│                           Storage: Memory  Maximum Storage: 1700kB                                                                                                                                                                                                                                   │
│                           Buffers: shared hit=6222 read=6033                                                                                                                                                                                                                                         │
│         ->  Nested Loop  (cost=0.00..16720.65 rows=1 width=440) (actual time=5.913..4634.838 rows=275.00 loops=825)                                                                                                                                                                                  │
│               Join Filter: (((ss2.ca_county)::text = (ss3.ca_county)::text) AND (CASE WHEN (ws2.web_sales > '0'::numeric) THEN (ws3.web_sales / ws2.web_sales) ELSE NULL::numeric END > CASE WHEN (ss2.store_sales > '0'::numeric) THEN (ss3.store_sales / ss2.store_sales) ELSE NULL::numeric END)) │
│               Rows Removed by Join Filter: 1037001                                                                                                                                                                                                                                                   │
│               Buffers: temp read=498                                                                                                                                                                                                                                                                 │
│               ->  Merge Join  (cost=0.00..8360.31 rows=1 width=220) (actual time=0.001..5.266 rows=844.00 loops=825)                                                                                                                                                                                 │
│                     Merge Cond: ((ss2.ca_county)::text = (ws2.ca_county)::text)                                                                                                                                                                                                                      │
│                     ->  CTE Scan on ss ss2  (cost=0.00..6562.93 rows=7 width=110) (actual time=0.001..4.131 rows=1634.00 loops=825)                                                                                                                                                                  │
│                           Filter: ((d_year = 1999) AND (d_qoy = 2))                                                                                                                                                                                                                                  │
│                           Rows Removed by Filter: 33468                                                                                                                                                                                                                                              │
│                           Storage: Memory  Maximum Storage: 2635kB                                                                                                                                                                                                                                   │
│                     ->  Materialize  (cost=0.00..1797.36 rows=2 width=110) (actual time=0.000..0.053 rows=925.00 loops=825)                                                                                                                                                                          │
│                           Storage: Memory  Maximum Storage: 74kB                                                                                                                                                                                                                                     │
│                           ->  CTE Scan on ws ws2  (cost=0.00..1797.35 rows=2 width=110) (actual time=0.001..2.784 rows=925.00 loops=1)                                                                                                                                                               │
│                                 Filter: ((d_year = 1999) AND (d_qoy = 2))                                                                                                                                                                                                                            │
│                                 Rows Removed by Filter: 22382                                                                                                                                                                                                                                        │
│                                 Storage: Memory  Maximum Storage: 1700kB                                                                                                                                                                                                                             │
│               ->  Merge Join  (cost=0.00..8360.31 rows=1 width=220) (actual time=0.002..5.383 rows=1229.00 loops=696300)                                                                                                                                                                             │
│                     Merge Cond: ((ss3.ca_county)::text = (ws3.ca_county)::text)                                                                                                                                                                                                                      │
│                     Buffers: temp read=498                                                                                                                                                                                                                                                           │
│                     ->  CTE Scan on ss ss3  (cost=0.00..6562.93 rows=7 width=110) (actual time=0.001..4.051 rows=1796.00 loops=696300)                                                                                                                                                               │
│                           Filter: ((d_year = 1999) AND (d_qoy = 3))                                                                                                                                                                                                                                  │
│                           Rows Removed by Filter: 33292                                                                                                                                                                                                                                              │
│                           Storage: Memory  Maximum Storage: 2635kB                                                                                                                                                                                                                                   │
│                           Buffers: temp read=498                                                                                                                                                                                                                                                     │
│                     ->  Materialize  (cost=0.00..1797.36 rows=2 width=110) (actual time=0.000..0.047 rows=1261.00 loops=696300)                                                                                                                                                                      │
│                           Storage: Memory  Maximum Storage: 95kB                                                                                                                                                                                                                                     │
│                           ->  CTE Scan on ws ws3  (cost=0.00..1797.35 rows=2 width=110) (actual time=0.001..74.725 rows=1261.00 loops=1)                                                                                                                                                             │
│                                 Filter: ((d_year = 1999) AND (d_qoy = 3))                                                                                                                                                                                                                            │
│                                 Rows Removed by Filter: 22051                                                                                                                                                                                                                                        │
│                                 Storage: Memory  Maximum Storage: 1700kB                                                                                                                                                                                                                             │
│ Planning:                                                                                                                                                                                                                                                                                            │
│   Buffers: shared hit=12                                                                                                                                                                                                                                                                             │
│ Planning Time: 4.951 ms                                                                                                                                                                                                                                                                              │
│ Execution Time: 3825542.556 ms                                                                                                                                                                                                                                                                       │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘



* Re: Eager aggregation, take 3

From: Richard Guo @ 2025-10-02 01:13 UTC (permalink / raw)
  To: Matheus Alcantara <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Thu, Oct 2, 2025 at 8:55 AM Matheus Alcantara
<[email protected]> wrote:
> The query 31 seems bad, I don't know if I'm doing something completely
> wrong but I've just setup a TPC-DS database and then executed the query
> on master and with the v23 patch and I got these results:
>
> Master:
>     Planning Time: 3.191 ms
>     Execution Time: 16950.619 ms
>
> Patch:
>     Planning Time: 3.257 ms
>     Execution Time: 3848355.646 ms

Thanks for reporting this.  It does seem odd.  I checked the TPC-DS
benchmarking on v13 and found that the execution time for query 31,
with and without eager aggregation, is as follows:

       EAGER-AGG-OFF           EAGER-AGG-ON
q31     10463.536 ms            10244.175 ms

There appears to be a regression between v13 and v23.  Looking into
it...

- Richard

* Re: Eager aggregation, take 3

From: Richard Guo @ 2025-10-02 01:39 UTC (permalink / raw)
  To: Matheus Alcantara <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Thu, Oct 2, 2025 at 10:13 AM Richard Guo <[email protected]> wrote:
> On Thu, Oct 2, 2025 at 8:55 AM Matheus Alcantara
> <[email protected]> wrote:
> > The query 31 seems bad, I don't know if I'm doing something completely
> > wrong but I've just setup a TPC-DS database and then executed the query
> > on master and with the v23 patch and I got these results:
> >
> > Master:
> >     Planning Time: 3.191 ms
> >     Execution Time: 16950.619 ms
> >
> > Patch:
> >     Planning Time: 3.257 ms
> >     Execution Time: 3848355.646 ms

> Thanks for reporting this.  It does seem odd.  I checked the TPC-DS
> benchmarking on v13 and found that the execution time for query 31,
> with and without eager aggregation, is as follows:
>
>        EAGER-AGG-OFF           EAGER-AGG-ON
> q31     10463.536 ms            10244.175 ms
>
> There appears to be a regression between v13 and v23.  Looking into
> it...

I noticed something interesting while comparing the two EXPLAIN
(ANALYZE) outputs: the patched version uses parallel plans, whereas
master does not.  To rule that out as a factor, I ran "SET
max_parallel_workers_per_gather TO 0;" and re-ran query 31 on both
master and the patched version.  This time the patched version came
out ahead.

-- on master
 Planning Time: 5.281 ms
 Execution Time: 7222.665 ms

-- on patched
 Planning Time: 4.855 ms
 Execution Time: 5977.287 ms

It seems eager aggregation doesn't cope well with parallel plans for
this query.  Looking into it.

- Richard

* Re: Eager aggregation, take 3

From: Richard Guo @ 2025-10-02 08:49 UTC (permalink / raw)
  To: Matheus Alcantara <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Thu, Oct 2, 2025 at 10:39 AM Richard Guo <[email protected]> wrote:
> It seems eager aggregation doesn't cope well with parallel plans for
> this query.  Looking into it.

It turns out that this is not related to parallel plans but rather to
poor size estimates.

Looking at query 31, it involves joining 6 base relations, all of
which are CTE references (i.e., RTE_CTE relations) to two different
CTEs.  Each CTE involves aggregations and GROUP BY clauses.
Unfortunately, our size estimates for CTE relations are quite poor,
especially when the CTE uses GROUP BY.  In these cases, we don't have
any ANALYZE statistics available (cf. examine_simple_variable).  As a
result, when computing the selectivity of the CTE relation's qual
clauses, we have to fall back on default values.  For example, for
quals like "CTE.var = const", which are used a lot in query 31, the
selectivity is computed as "1.0 / DEFAULT_NUM_DISTINCT(200)", with the
assumption that there are DEFAULT_NUM_DISTINCT distinct values in the
relation, and that these values are equally common (cf. var_eq_const).
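
To make the fallback concrete, here is a minimal Python sketch (not
planner code) of how several such equality quals combine when each one
falls back to the default.  The constant mirrors DEFAULT_NUM_DISTINCT;
the helper name default_eq_selectivity is made up for illustration,
and multiplying the per-clause values assumes the clauses are treated
as independent.

```python
# Illustrative sketch only; this is not PostgreSQL source code.
DEFAULT_NUM_DISTINCT = 200  # default used when no statistics are available


def default_eq_selectivity(num_clauses: int) -> float:
    """Selectivity of num_clauses ANDed "var = const" quals.

    Hypothetical helper: each qual falls back to 1 / DEFAULT_NUM_DISTINCT
    and the quals are assumed to be independent, so the values multiply.
    """
    return (1.0 / DEFAULT_NUM_DISTINCT) ** num_clauses


for n in (1, 2, 3):
    print(f"{n} clause(s): {default_eq_selectivity(n):.9f}")
```

With two quals this gives 0.000025, and with three it gives
0.000000125.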

The consequence is that the size estimates are significantly different
from the actual values.  For example, from the EXPLAIN(ANALYZE) output
provided by Matheus:

->  CTE Scan on ws ws3  (cost=0.00..1797.35 rows=2 width=110)
                 (actual time=0.001..74.725 rows=1261.00 loops=1)
      Filter: ((d_year = 1999) AND (d_qoy = 3))

Interestingly, with eager aggregation applied, the row count estimates
for the two CTE plans actually become closer to the actual values.

-- without eager aggregation
CTE ws
  ->  HashAggregate  (cost=96009.03..114825.35 rows=718952 width=54)
                (actual time=977.215..1014.889 rows=23320.00 loops=1)

-- with eager aggregation
CTE ws
  ->  Finalize GroupAggregate  (cost=52144.19..62314.79 rows=71894 width=54)
                          (actual time=275.121..340.107 rows=23312.00 loops=1)

However, due to the highly underestimated selectivity for the qual
clauses, the row count estimates for CTE Scan nodes become worse.
This is because:

-- without eager aggregation
718952 * (1.0/200) * (1.0/200) ~= 18

-- with eager aggregation
71894 * (1.0/200) * (1.0/200) ~= 2

... while the actual row count is 1261.00 as shown above.
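
The same arithmetic as a small Python sketch, which also shows how far
each estimate is from the actual row count.  The helper
cte_scan_estimate is hypothetical; rounding and clamping to at least
one row is an assumption about how the final estimate is reported, and
the input numbers are taken from the EXPLAIN output above.

```python
# Illustrative sketch only; the inputs come from the EXPLAIN output above.
DEFAULT_EQ_SEL = 1.0 / 200  # fallback selectivity for "var = const"
ACTUAL_ROWS = 1261          # actual rows from the CTE Scan on ws3


def cte_scan_estimate(cte_rows: float, num_quals: int = 2) -> int:
    """Estimated rows after num_quals default-selectivity quals.

    Hypothetical helper: rounds to the nearest row and never returns
    fewer than one row.
    """
    return max(1, round(cte_rows * DEFAULT_EQ_SEL ** num_quals))


for label, cte_rows in (("without eager aggregation", 718952),
                        ("with eager aggregation", 71894)):
    est = cte_scan_estimate(cte_rows)
    print(f"{label}: estimate {est}, actual {ACTUAL_ROWS}, "
          f"off by {ACTUAL_ROWS / est:.1f}x")
```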

That is to say, on master, the CTE plan rows are overestimated while
the qual selectivities are severely underestimated, so the two errors
partially cancel.  With eager aggregation, the CTE plan rows become
closer to the actual values, but the selectivities remain just as
underestimated.  As a result, the row count estimates for the CTE Scan
nodes worsen with eager aggregation.  This changes the join order in
the final plan when eager aggregation is applied, leading to longer
execution times in this case.


Another point to note is that, because the selectivity estimates are
so low (0.000025, sometimes 0.000000125), the size estimates for the
CTE relations are very small, which makes the planner tend to choose
nestloops.  I tried manually disabling nestloop, and here is what I
got for query 31.

-- on master, set enable_nestloop to on;
 Planning Time: 4.613 ms
 Execution Time: 7142.090 ms

-- on master, set enable_nestloop to off;
 Planning Time: 4.315 ms
 Execution Time: 2262.330 ms

-- on patched, set enable_nestloop to off;
 Planning Time: 4.321 ms
 Execution Time: 1214.376 ms

That is, on master, simply disabling nestloop makes query 31 run more
than 3 times faster.  Enabling eager aggregation on top of that
improves performance further, making it run 1.86 times faster relative
to the nested-loop-disabled baseline.
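
For reference, the two ratios can be reproduced from the timings above
with a few lines of Python.  This is only a sketch of the arithmetic;
the helper name speedup and the dictionary keys are made up for
illustration.

```python
# Execution times (ms) for query 31, copied from the runs above.
timings_ms = {
    "master, nestloop on": 7142.090,
    "master, nestloop off": 2262.330,
    "patched, nestloop off": 1214.376,
}


def speedup(before_ms: float, after_ms: float) -> float:
    """How many times faster the 'after' run is than the 'before' run."""
    return before_ms / after_ms


nestloop_gain = speedup(timings_ms["master, nestloop on"],
                        timings_ms["master, nestloop off"])
eager_agg_gain = speedup(timings_ms["master, nestloop off"],
                         timings_ms["patched, nestloop off"])
print(f"disabling nestloop on master: {nestloop_gain:.2f}x")
print(f"eager aggregation on top:     {eager_agg_gain:.2f}x")
```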

I manually disabled nested loops for other TPC-DS queries on master
and found some more interesting results.

For query 4, on master:

-- set enable_nestloop to on
 Planning Time: 3.054 ms
 Execution Time: 3231356.258 ms

-- set enable_nestloop to off
 Planning Time: 4.291 ms
 Execution Time: 12751.170 ms

That is, on master, simply disabling nestloop makes query 4 run more
than 253 times faster.

For query 11, on master:

-- set enable_nestloop to on
 Planning Time: 1.435 ms
 Execution Time: 1824860.937 ms

-- set enable_nestloop to off
 Planning Time: 2.479 ms
 Execution Time: 7984.360 ms

Disabling nestloop makes query 11 run more than 228 times faster.

I believe you can find more such queries in TPC-DS if you keep
looking.  Given this, I don't think it makes much sense to debug a
performance regression on TPC-DS with nestloop enabled.

Matheus, I wonder if you could help run TPC-DS again with this patch,
this time with nested loops disabled for all queries.

- Richard

* Re: Eager aggregation, take 3

From: Matheus Alcantara @ 2025-10-02 18:40 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; Matheus Alcantara <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Thu Oct 2, 2025 at 5:49 AM -03, Richard Guo wrote:
> On Thu, Oct 2, 2025 at 10:39 AM Richard Guo <[email protected]> wrote:
>> It seems eager aggregation doesn't cope well with parallel plans for
>> this query.  Looking into it.
>
> It turns out that this is not related to parallel plans but rather to
> poor size estimates.
>
> [ ... ]

> Matheus, I wonder if you could help run TPC-DS again with this patch,
> this time with nested loops disabled for all queries.
>
Thanks for all the details. I've disabled the nested loops and executed
the benchmark again and the results look much better! I see a 55%
improvement on query_31 on my machine now (MacOS M3 Max).

The only query where I see a considerable regression is query 23, for
which I get a 23% worse execution time. I'm attaching the
EXPLAIN(ANALYZE) output from master and from the patched version in
case it's interesting.

I'm also attaching a csv with the planning time and execution time from
master and the patched version for all queries. It contains the
percentage difference between the executions. Negative numbers mean
that the patched version using eager aggregation is faster. (I loaded
this csv into a postgres table and played with some queries to analyze
the results).
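
As a rough illustration, the diff columns in the csv appear to be the
relative change of the patched time against master.  That formula is
an assumption inferred from the numbers, not something stated with the
attachment, and the helper name pct_diff is made up for this sketch.

```python
def pct_diff(patched_ms: float, master_ms: float) -> float:
    """Relative change of patched vs. master in percent.

    Assumed formula; a negative result means the patched run is faster.
    """
    return (patched_ms - master_ms) / master_ms * 100.0


# Execution times (ms) as (patched, master), copied from the attached csv.
rows = {
    "query_23.sql": (6733.136, 5465.139),
    "query_31.sql": (2156.624, 4813.289),
}
for name, (patched, master) in rows.items():
    print(f"{name}: {pct_diff(patched, master):+.2f}%")
```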

I'm just wondering if there is anything that can be done in the planner
to prevent this type of situation?

--
Matheus Alcantara


Attachments:

  [application/octet-stream] query-23.master.explain (39.5K, 2-query-23.master.explain)

  [application/octet-stream] query-23.patch.explain (38.4K, 3-query-23.patch.explain)

  [text/csv] tpcds-eager-aggregate-times.csv (7.8K, 4-tpcds-eager-aggregate-times.csv)
  download | inline:
Query,Patched Planning (ms),Patched Execution (ms),Master Planning (ms),Master Execution (ms),Planning Diff (%),Execution Diff (%)
query_1.sql,5.197,343109.883,5.718,342439.125,-9.111577474641482,0.19587656638241105
query_10.sql,9.652,1907.724,7.42,1711.916,30.080862533692716,11.4379443851217
query_11.sql,2.097,3679.389,1.902,12420.909,10.252365930599373,-70.37745788170577
query_12.sql,3.706,134.387,6.555,120.692,-43.46300533943555,11.34706525701787
query_13.sql,4.024,1470.213,3.821,1465.751,5.312745354619206,0.3044173259987535
query_14.sql,6.325,3035.944,5.998,3058.458,5.45181727242414,-0.7361225820331724
query_15.sql,1.706,223.125,1.782,221.967,-4.264870931537602,0.5216991714984601
query_16.sql,4.3,335.252,3.871,332.84,11.08240764660294,0.724672515322688
query_17.sql,17.019,586.035,14.329,584.251,18.77311745411402,0.30534821506509907
query_18.sql,4.558,831.184,4.739,819.676,-3.819371175353451,1.4039693732645488
query_19.sql,4.043,348.386,3.426,345.351,18.009340338587272,0.8788160451251118
query_2.sql,1.084,1009.305,1.137,1009.213,-4.66138962181178,0.009116014161528293
query_20.sql,1.411,197.526,1.331,196.463,6.010518407212627,0.5410688017591183
query_21.sql,3.656,759.008,3.374,759.377,8.358032009484292,-0.048592464612427624
query_22.sql,1.062,9664.424,1.155,9720.983,-8.051948051948049,-0.5818238752191963
query_23.sql,6.317,6733.136,2.386,5465.139,164.75272422464374,23.20155077482934
query_24.sql,4.863,71.777,6.99,69.682,-30.42918454935622,3.0065153124192743
query_25.sql,32.706,565.499,29.09,567.284,12.430388449639063,-0.314657208734949
query_26.sql,4.732,500.593,3.597,494.797,31.554072838476515,1.1713894789176151
query_27.sql,1.946,800.924,1.834,795.803,6.106870229007626,0.6435009669478478
query_28.sql,1.403,2115.748,1.177,2109.951,19.201359388275275,0.2747457168436625
query_29.sql,20.743,680.826,18.767,697.571,10.529120264293702,-2.400472496706429
query_3.sql,1.048,306.902,1.037,338.807,1.0607521697203588,-9.416865649174907
query_30.sql,2.163,23196.843,2.62,23227.62,-17.442748091603065,-0.1325017371560161
query_31.sql,3.805,2156.624,3.99,4813.289,-4.636591478696743,-55.19437956042116
query_32.sql,1.376,369.863,1.426,379.844,-3.5063113604488114,-2.627657669990837
query_33.sql,5.592,683.848,4.386,671.533,27.49658002735977,1.8338637118354484
query_34.sql,2.706,293.647,2.868,293.764,-5.648535564853554,-0.03982788905380463
query_35.sql,2.297,1714.709,2.327,1709.587,-1.2892135797163646,0.29960452436758533
query_36.sql,1.341,959.408,1.406,958.635,-4.623044096728304,0.08063548691629499
query_37.sql,3.266,701.037,3.338,692.29,-2.156980227681248,1.2634878446893747
query_38.sql,1.938,2983.255,1.867,2970.44,3.802892340653452,0.43141756776773993
query_39.sql,2.434,4296.654,2.185,4297.245,11.395881006864993,-0.013752997560051609
query_4.sql,4.104,6885.96,3.93,20300.931,4.427480916030532,-66.08057039354502
query_40.sql,4.232,227.916,3.992,226.594,6.0120240480961975,0.5834223324536407
query_41.sql,0.85,1895.989,0.825,1917.606,3.030303030303033,-1.127291007641818
query_42.sql,1.134,216.127,1.088,215.79,4.227941176470571,0.15617035080403055
query_43.sql,1.13,724.987,1.068,724.42,5.805243445692868,0.07826951216145431
query_44.sql,1.007,1009.076,0.973,1015.087,3.4943473792394575,-0.5921659916834682
query_45.sql,2.491,146.108,2.888,148.276,-13.746537396121877,-1.4621381747551905
query_46.sql,2.585,663.085,2.231,679.81,15.867324069923805,-2.460246245274402
query_47.sql,2.107,3566.484,2.349,4028.359,-10.302256279267773,-11.465586855590578
query_48.sql,2.327,1417.187,2.468,1429.552,-5.713128038897894,-0.8649562939997992
query_49.sql,5.191,1332.436,5.117,1300.731,1.446159859292551,2.437475542598733
query_5.sql,3.996,1254.475,3.78,1239.619,5.71428571428572,1.1984327442544842
query_50.sql,3.58,1306.014,2.578,1280.202,38.86733902249807,2.0162443114445923
query_51.sql,1.138,1937.95,1.043,1927.959,9.108341323106423,0.518216414353209
query_52.sql,1.057,216.683,1.026,217.304,3.0214424951266974,-0.2857747671464903
query_53.sql,1.689,299.636,1.477,299.117,14.353419092755582,0.17351069982649112
query_54.sql,3.396,690.892,2.901,687.181,17.063081695966915,0.5400323932122705
query_55.sql,1.041,215.656,0.958,216.543,8.663883089770351,-0.409618412971096
query_56.sql,6.743,696.477,5.359,682.625,25.825713752565782,2.0292254165903643
query_57.sql,2.859,1935.809,2.396,1971.9,19.323873121869788,-1.8302652264313668
query_58.sql,2.893,761.302,2.47,743.917,17.125506072874476,2.3369542569937227
query_59.sql,1.818,1294.186,1.722,1292.091,5.5749128919860675,0.1621402826890697
query_6.sql,2.387,132211.841,1.918,144414.127,24.452554744525553,-8.44950992917751
query_60.sql,4.764,709.541,8.15,770.35,-41.54601226993865,-7.8936846887778245
query_61.sql,4.542,6.09,4.613,6.447,-1.5391285497507177,-5.5374592833876255
query_62.sql,2.194,277.489,2.129,279.699,3.0530765617660847,-0.7901351095284703
query_63.sql,1.609,274.35,1.544,308.721,4.2098445595854885,-11.133353416191312
query_64.sql,231.018,993.314,110.067,993.579,109.88852244541962,-0.026671256135645617
query_65.sql,1.547,1432.056,1.402,1459.18,10.342368045649074,-1.858852232075551
query_66.sql,4.873,459.169,4.288,456.478,13.642723880597012,0.58951362387672
query_67.sql,1.332,6262.641,1.321,6268.535,0.8327024981075034,-0.09402515898850741
query_68.sql,2.459,434.767,2.04,434.896,20.539215686274513,-0.029662264081531928
query_69.sql,3.622,545.235,2.971,559.032,21.91181420397172,-2.468016142188645
query_7.sql,2.59,740.428,1.911,756.807,35.53113553113552,-2.1642241681168404
query_70.sql,1.346,1085.83,1.276,1093.831,5.4858934169279046,-0.7314658297305504
query_71.sql,1.764,690.918,1.636,695.244,7.823960880195606,-0.6222275920396324
query_72.sql,16.468,2433.574,15.637,2422.972,5.314318603312652,0.4375618042635186
query_73.sql,1.561,242.764,1.373,246.741,13.692643845593585,-1.6118115757008376
query_74.sql,2.275,2600.782,1.636,2613.011,39.05867970660147,-0.4680041530632598
query_75.sql,3.936,2060.653,3.872,2021.916,1.6528925619834725,1.9158560494105519
query_76.sql,1.839,262.956,1.808,256.183,1.7146017699114997,2.6438132116494946
query_77.sql,6.134,506.031,4.12,503.471,48.88349514563107,0.5084701998724857
query_78.sql,3.479,3376.111,2.942,3346.175,18.252889191026508,0.8946334247312138
query_79.sql,1.943,494.783,1.66,500.474,17.04819277108435,-1.1371220083360922
query_8.sql,2.108,118.778,1.603,117.003,31.503431066749854,1.5170551182448362
query_80.sql,9.398,810.869,7.552,767.436,24.44385593220339,5.659494733111294
query_81.sql,1.601,102358.136,1.673,101992.064,-4.303646144650332,0.3589220431895565
query_82.sql,2.106,910.711,1.992,888.395,5.722891566265054,2.511945699829471
query_83.sql,2.732,151.69,2.419,147.383,12.939231087226133,2.9223180421079684
query_84.sql,3.255,164.327,3.084,159.529,5.544747081712057,3.0076036331952194
query_85.sql,11.396,609.845,9.978,598.002,14.211264782521557,1.9804281591031596
query_86.sql,0.963,417.937,0.924,409.646,4.220779220779212,2.023942623631134
query_87.sql,1.908,2794.814,1.868,2739.314,2.1413276231263283,2.0260546983660874
query_88.sql,4.025,1909.274,3.887,1872.028,3.550295857988175,1.989606993057789
query_89.sql,1.589,448.853,1.409,437.45,12.775017743080195,2.6066979083323853
query_9.sql,1.044,2384.919,1.005,2353.257,3.8805970149253883,1.345454406382295
query_90.sql,1.568,239.619,1.424,234.375,10.112359550561807,2.23744
query_91.sql,3.797,207.382,2.786,202.386,36.28858578607323,2.468550196159818
query_92.sql,1.132,76.153,1.149,76.136,-1.4795474325500544,0.022328464852382737
query_93.sql,1.293,3.116,1.183,2.986,9.298393913778519,4.3536503683857966
query_94.sql,2.145,257.063,2.005,254.546,6.982543640897762,0.9888193096729062
query_95.sql,2.029,9785.071,2.102,9640.791,-3.4728829686013296,1.496557699466783
query_96.sql,1.056,233.41,1.06,229.286,-0.37735849056603804,1.7986270422092911
query_97.sql,1.142,1025.226,1.2,1015.871,-4.833333333333338,0.9208846398804702
query_98.sql,1.297,356.808,1.209,355.641,7.278742762613718,0.32813989388174397
query_99.sql,1.59,583.963,1.472,571.363,8.016304347826095,2.2052530527877914


* Re: Eager aggregation, take 3

From: Richard Guo @ 2025-10-03 03:14 UTC (permalink / raw)
  To: Matheus Alcantara <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Fri, Oct 3, 2025 at 3:41 AM Matheus Alcantara
<[email protected]> wrote:
> Thanks for all the details. I've disabled nested loops and run the
> benchmark again, and the results look much better! I now see a 55%
> improvement on query_31 on my machine (macOS, M3 Max).

Great!  That is 2.23 times faster.
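
As a rough check of that arithmetic, assuming "55% improvement" means a
55% reduction in execution time: the patched run then takes 45% of the
original time, so the speedup is about 1 / 0.45, or roughly 2.22; the
2.23 figure presumably reflects the unrounded timings.  A minimal
sketch, where speedup_from_reduction is a made-up helper name used only
for illustration:

```c
#include <assert.h>

/*
 * Convert a percentage reduction in execution time into a speedup factor.
 * If the new run takes (1 - r) of the old time, the speedup is 1 / (1 - r).
 * Hypothetical helper, not part of PostgreSQL or of the benchmark scripts.
 */
static double
speedup_from_reduction(double reduction_pct)
{
    /* A reduction of 100% or more would mean zero or negative run time. */
    assert(reduction_pct >= 0.0 && reduction_pct < 100.0);

    return 1.0 / (1.0 - reduction_pct / 100.0);
}
```

For example, speedup_from_reduction(50.0) gives a factor of 2, and a
55% reduction gives a factor a little above 2.2.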

> The only query where I see a considerable regression is query 23, for
> which I get a 23% worse execution time. I'm attaching the
> EXPLAIN (ANALYZE) output from master and from the patched version in
> case it's of interest.

I tested query 23 in my local environment but didn't observe the
regression.

-- on master
 Planning Time: 1.950 ms
 Execution Time: 3260.924 ms

-- on patched
 Planning Time: 2.197 ms
 Execution Time: 3237.287 ms

I ran the benchmark at scale factor 1 and executed ANALYZE beforehand.
For the build configuration, I disabled cassert.

Comparing the plans, I noticed one key difference: in the plan you
provided (query-23.patch.explain), the frequent_ss_items CTE uses
parallel aggregation, whereas in my local environment it does not.
This leads to a different final join order between the two plans.

However, given the highly inaccurate size and cost estimates for the
CTE Scan nodes, I'm not sure it's worth investigating further.  I'm
starting to feel that trying to tune performance here, with such
inaccurate underlying estimates for CTEs, is like building on sand.

> I'm also attaching a CSV with the planning time and execution time from
> master and the patched version for all queries. It contains the
> percentage difference between the executions. Negative numbers mean that
> the patched version using eager aggregation is faster. (I loaded this CSV
> into a Postgres table and ran some queries against it to analyze the
> results.)

I really appreciate this; it's very helpful.

> I'm just wondering if there is anything that can be done on the planner
> to prevent this type of situation?

I think the ideal solution is to improve our estimates for CTE
relations to make the plans for TPC-DS queries more reasonable.  Of
course, for queries from other benchmarks, the issues may stem from
other plan nodes.  IMHO, we really need some improvements in our cost
estimation.

- Richard






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-01 23:54                                                   ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
  2025-10-02 01:13                                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-02 01:39                                                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-02 08:49                                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-02 18:40                                                           ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
  2025-10-03 03:14                                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-10-03 20:03                                                               ` Matheus Alcantara <[email protected]>
  2025-10-06 00:56                                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Matheus Alcantara @ 2025-10-03 20:03 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; Matheus Alcantara <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Fri Oct 3, 2025 at 12:14 AM -03, Richard Guo wrote:
>> The only query where I see a considerable regression is query 23, for
>> which I get a 23% worse execution time. I'm attaching the
>> EXPLAIN (ANALYZE) output from master and from the patched version in
>> case it's of interest.
>
> I tested query 23 in my local environment but didn't observe the
> regression.
>
> -- on master
>  Planning Time: 1.950 ms
>  Execution Time: 3260.924 ms
>
> -- on patched
>  Planning Time: 2.197 ms
>  Execution Time: 3237.287 ms
>
> I ran the benchmark at scale factor 1 and executed ANALYZE beforehand.
> For the build configuration, I disabled cassert.
>
I've disabled cassert and run ANALYZE again before benchmarking, and
now I get similar results, with an improvement for the eager
aggregation version:

-- master
Planning Time: 2.734 ms
Execution Time: 5238.128 ms

-- patched
Planning Time: 2.578 ms
Execution Time: 4732.584 ms

> Comparing the plans, I noticed one key difference: in the plan you
> provided (query-23.patch.explain), the frequent_ss_items CTE uses
> parallel aggregation, whereas in my local environment it does not.
> This leads to a different final join order between the two plans.
>
> However, given the highly inaccurate size and cost estimates for the
> CTE Scan nodes, I'm not sure it's worth investigating further.  I'm
> starting to feel that trying to tune performance here, with such
> inaccurate underlying estimates for CTEs, is like building on sand.
>
> [ ...]
>
>> I'm just wondering if there is anything that can be done on the planner
>> to prevent this type of situation?
>
> I think the ideal solution is to improve our estimates for CTE
> relations to make the plans for TPC-DS queries more reasonable.  Of
> course, for queries from other benchmarks, the issues may stem from
> other plan nodes.  IMHO, we really need some improvements in our cost
> estimation.
>
Fair points, agree.

The performance results look good to me. I don't have many comments
about the code; although I'm still learning about the planner internals,
this patch seems to be in good shape to me.

I'm attaching a new CSV with the latest results, obtained after running
with cassert disabled and after executing ANALYZE. It looks good to me.

Thanks for working on this!

--
Matheus Alcantara


Attachments:

  [text/csv] tpcds-eager-aggregate-times.csv (7.8K, 2-tpcds-eager-aggregate-times.csv)
  download | inline:
Query,Patched Planning (ms),Patched Execution (ms),Master Planning (ms),Master Execution (ms),Planning Diff (%),Execution Diff (%)
query_1.sql,1.772,348111.128,1.448,347986.657,22.37569060773481,0.035768900185164154
query_10.sql,3.916,1708.264,3.628,1735.879,7.93825799338478,-1.5908366885019065
query_11.sql,1.874,3938.543,1.732,12631.077,8.198614318706705,-68.81862884693048
query_12.sql,2.423,118.035,1.938,120.104,25.025799793601657,-1.7226736828082352
query_13.sql,3.725,1449.302,3.875,1472.918,-3.8709677419354813,-1.603347912103728
query_14.sql,5.585,3142.689,4.926,3153.894,13.377994315874945,-0.3552750980216814
query_15.sql,3.787,229.036,16.127,226.61,-76.51764122279407,1.070561758086575
query_16.sql,3.933,340.588,3.744,330.124,5.048076923076913,3.169718045340538
query_17.sql,20.183,582.729,16.598,581.64,21.598987829859027,0.18722921394678074
query_18.sql,6.141,832.748,5.543,895.849,10.788381742738586,-7.043709375129067
query_19.sql,4.363,345.011,3.951,341.711,10.427739812705653,0.9657283493946672
query_2.sql,1.777,1029.284,2.096,1013.598,-15.219465648854968,1.5475563290377594
query_20.sql,1.605,199.484,1.696,198.66,-5.365566037735848,0.41477901943018836
query_21.sql,2.993,763.408,2.926,760.504,2.289815447710175,0.38185203496628506
query_22.sql,1.081,9782.704,1.017,9812.876,6.293018682399219,-0.3074735684013584
query_23.sql,8.857,4937.117,7.745,5440.361,14.357650096836657,-9.25019497787003
query_24.sql,6.435,73.181,5.783,68.873,11.274425038907127,6.254991070521093
query_25.sql,33.859,563.156,29.377,562.256,15.256833577288365,0.1600694345636111
query_26.sql,4.605,496.635,3.725,461.931,23.624161073825512,7.512810354793251
query_27.sql,2.108,802.41,2.013,791.632,4.719324391455549,1.3614911979303541
query_28.sql,1.203,2129.017,1.157,2114.691,3.975799481417462,0.6774512210058123
query_29.sql,20.799,699.452,17.871,695.774,16.384085949303344,0.5286199254355577
query_3.sql,1.076,316.886,1.465,314.489,-26.552901023890783,0.7621888205946944
query_30.sql,2.068,23992.385,2.01,23732.395,2.8855721393034965,1.0955067956689495
query_31.sql,3.443,2170.827,3.71,4956.94,-7.196765498652288,-56.20630873078956
query_32.sql,1.46,381.865,1.558,384.881,-6.290115532734281,-0.7836188328340352
query_33.sql,6.144,665.558,5.328,683.21,15.315315315315312,-2.583685835980159
query_34.sql,3.072,294.182,2.48,294.62,23.870967741935488,-0.14866607833819434
query_35.sql,3.058,1696.758,3.066,1741.545,-0.26092628832354886,-2.571682040946403
query_36.sql,1.472,951.782,1.543,957.131,-4.601425793907969,-0.5588576694308232
query_37.sql,2.862,692.185,3.06,695.129,-6.470588235294115,-0.42351851239123584
query_38.sql,1.757,2934.366,2.039,2945.52,-13.830308974987751,-0.3786767701458485
query_39.sql,2.106,4287.367,2.459,4355.186,-14.355429036193582,-1.5572010012890267
query_4.sql,4.78,7367.944,4.783,20359.41,-0.06272214091574563,-63.81062123116534
query_40.sql,4.527,229.529,17.79,235.618,-74.55311973018549,-2.584267755434644
query_41.sql,1.076,1911.485,1.364,1898.805,-21.114369501466275,0.6677884248250787
query_42.sql,2.185,214.184,1.876,216.248,16.471215351812376,-0.9544596944249164
query_43.sql,1.618,717.386,1.678,719.886,-3.5756853396900974,-0.3472772077801208
query_44.sql,1.095,999.478,1.022,1020.661,7.142857142857138,-2.0754197524937266
query_45.sql,4.503,147.938,3.965,150.025,13.568726355611608,-1.3911014830861639
query_46.sql,3.389,670.253,2.905,669.081,16.66092943201377,0.17516563764327867
query_47.sql,2.107,3993.831,2.096,3928.992,0.5248091603053493,1.650270603757909
query_48.sql,3.193,1433.704,3.028,1413.382,5.449141347424043,1.4378278483806846
query_49.sql,6.013,1328.264,5.649,1305.393,6.443618339529118,1.7520394241427575
query_5.sql,12.624,1248.898,4.026,1263.176,213.56184798807752,-1.1303254653349986
query_50.sql,3.642,1363.389,2.689,1318.568,35.44068426924507,3.3992179394615913
query_51.sql,1.567,1934.372,1.078,1905.328,45.36178107606678,1.524356961111163
query_52.sql,1.323,215.693,1.063,215.418,24.459078080903108,0.12765878431700492
query_53.sql,1.74,296.488,1.558,297.292,11.681643132220792,-0.27044118240651405
query_54.sql,3.749,689.41,3.331,684.939,12.548784148904238,0.6527588588180852
query_55.sql,1.077,213.579,0.969,214.217,11.145510835913312,-0.29782883711377023
query_56.sql,7.254,693.427,5.672,691.625,27.891396332863188,0.26054581601301585
query_57.sql,2.649,1828.543,2.581,1956.034,2.634637737311122,-6.517831489636694
query_58.sql,2.929,769.344,2.601,752.896,12.610534409842364,2.184631077864684
query_59.sql,1.899,1275.897,1.797,1287.483,5.676126878130222,-0.8998953772593512
query_6.sql,2.646,136423.314,1.94,148868.237,36.391752577319586,-8.359689918273151
query_60.sql,4.999,723.689,11.688,772.551,-57.22963723477071,-6.324760436527825
query_61.sql,5.127,6.56,6.288,6.482,-18.463740458015273,1.2033323048441746
query_62.sql,2.522,279.014,2.42,276.941,4.2148760330578465,0.7485348864920818
query_63.sql,1.722,271.157,1.439,299.55,19.66643502432244,-9.478551160073453
query_64.sql,246.953,961.432,108.404,1220.635,127.80801446441092,-21.235094848173286
query_65.sql,2.185,1451.433,1.613,1459.66,35.46187228766274,-0.5636244056835213
query_66.sql,6.342,469.304,4.331,461.772,46.432694527822655,1.6311079926890288
query_67.sql,1.381,6239.233,1.333,6275.725,3.600900225056267,-0.581478633942695
query_68.sql,2.712,444.651,2.462,454.96,10.154346060113728,-2.2659134868999407
query_69.sql,3.659,549.403,3.068,559.8,19.26336375488917,-1.8572704537334648
query_7.sql,2.512,748.722,1.877,763.937,33.830580713905164,-1.991656380041814
query_70.sql,1.378,1082.349,1.282,1099.89,7.488299531981268,-1.5947958432207008
query_71.sql,1.616,664.626,1.879,690.085,-13.996806812134107,-3.6892556714028064
query_72.sql,17.423,2425.441,16.905,2431.505,3.06418219461696,-0.24939286573543157
query_73.sql,1.532,240.721,1.491,248.724,2.749832327297111,-3.2176227464981206
query_74.sql,2.461,2606.065,1.679,2695.639,46.57534246575341,-3.32292269105767
query_75.sql,5.763,2152.559,5.39,2252.586,6.920222634508353,-4.440540782904608
query_76.sql,1.77,260.015,1.832,273.168,-3.384279475982536,-4.814985649856506
query_77.sql,7.189,502.823,4.539,504.798,58.382903723287086,-0.3912456071537571
query_78.sql,4.667,3404.293,3.075,3526.288,51.77235772357722,-3.4595869651032443
query_79.sql,2.518,441.678,1.757,497.753,43.31246442800227,-11.265627731023216
query_8.sql,2.616,113.6,1.731,118.141,51.12651646447141,-3.8437121744356415
query_80.sql,9.607,777.569,7.965,794.619,20.615191462649083,-2.145682396217567
query_81.sql,1.744,104737.831,1.539,103999.861,13.320337881741395,0.7095874868525076
query_82.sql,1.98,904.683,1.956,903.053,1.226993865030676,0.18049881900619294
query_83.sql,2.78,159.719,2.572,155.356,8.087091757387237,2.808388475501429
query_84.sql,3.311,164.835,3.243,162.348,2.096823928461303,1.5318944489614867
query_85.sql,11.635,607.475,9.547,603.498,21.87074473656645,0.6589914133932465
query_86.sql,1.038,444.518,0.948,435.156,9.493670886075959,2.151412367059162
query_87.sql,2.509,3033.169,1.84,3022.709,36.3586956521739,0.34604720467633626
query_88.sql,4.439,1887.336,4.055,1882.765,9.469790382244152,0.2427812286716564
query_89.sql,1.578,447.047,1.429,447.93,10.426871938418476,-0.19712901569441235
query_9.sql,0.999,2377.692,1.06,2359.84,-5.754716981132081,0.7564919655569811
query_90.sql,1.484,243.019,1.521,240.875,-2.432610124917812,0.8900882200311387
query_91.sql,4.539,211.898,3.484,203.032,30.281285878300796,4.366799322274314
query_92.sql,1.185,76.483,1.149,77.097,3.1331592689295062,-0.7963993410897832
query_93.sql,1.427,3.337,1.236,3.07,15.453074433656964,8.697068403908807
query_94.sql,2.112,265.029,2.338,259.955,-9.666381522668946,1.9518762862803227
query_95.sql,2.02,9955.019,1.959,9880.937,3.1138335885655914,0.7497467092442786
query_96.sql,1.295,231.706,1.092,226.021,18.589743589743573,2.5152530074639095
query_97.sql,1.204,1032.419,1.11,1033.895,8.468468468468455,-0.14276111210518336
query_98.sql,1.472,374.712,1.341,366.122,9.768829231916481,2.346212464697553
query_99.sql,1.509,585.547,1.578,587.693,-4.372623574144498,-0.365156637904477
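
The two diff columns above appear to be the relative difference of the
patched value against master, i.e. (patched - master) / master * 100,
which is consistent with the rows shown (for example, query_11's
execution diff of about -68.82 and query_5's planning diff of about
213.56).  A minimal sketch of that calculation, with pct_diff as a
hypothetical helper name that does not come from the benchmark scripts:

```c
#include <assert.h>

/*
 * Relative difference of the patched measurement against master, in
 * percent.  Negative values mean the patched run was faster (smaller).
 * Assumed formula, inferred from the CSV rows rather than stated in the
 * thread.
 */
static double
pct_diff(double patched, double master)
{
    /* The master timing is the baseline, so it must be positive. */
    assert(master > 0.0);

    return (patched - master) / master * 100.0;
}
```

Calling pct_diff(3938.543, 12631.077) with query_11's execution times
yields a value close to the -68.82 recorded in the file.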


* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-01 23:54                                                   ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
  2025-10-02 01:13                                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-02 01:39                                                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-02 08:49                                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-02 18:40                                                           ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
  2025-10-03 03:14                                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-03 20:03                                                               ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
@ 2025-10-06 00:56                                                                 ` Richard Guo <[email protected]>
  0 siblings, 0 replies; 70+ messages in thread

From: Richard Guo @ 2025-10-06 00:56 UTC (permalink / raw)
  To: Matheus Alcantara <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Sat, Oct 4, 2025 at 5:03 AM Matheus Alcantara
<[email protected]> wrote:
> I've disabled cassert and run ANALYZE again before benchmarking, and
> now I get similar results, with an improvement for the eager
> aggregation version:
>
> -- master
> Planning Time: 2.734 ms
> Execution Time: 5238.128 ms
>
> -- patched
> Planning Time: 2.578 ms
> Execution Time: 4732.584 ms

Great!

> The performance results look good to me. I don't have many comments
> about the code; although I'm still learning about the planner internals,
> this patch seems to be in good shape to me.

Thanks for running the benchmark and reviewing the patch.

> I'm attaching a new CSV with the latest results, obtained after running
> with cassert disabled and after executing ANALYZE. It looks good to me.

Yeah, the results look good this time.  There are no performance
regressions; on the contrary, several queries show really nice
improvements.

- Richard






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-10-06 00:59                                                   ` Richard Guo <[email protected]>
  2025-10-06 13:59                                                     ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
  2025-10-09 01:48                                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 2 replies; 70+ messages in thread

From: Richard Guo @ 2025-10-06 00:59 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Mon, Sep 29, 2025 at 11:09 AM Richard Guo <[email protected]> wrote:
> FWIW, I plan to do another self-review of this patch soon, with the
> goal of assessing whether it's ready to be pushed.  If anyone has any
> concerns about any part of the patch or would like to review it, I
> would greatly appreciate hearing from you.

Barring any objections, I'll plan to push v23 in a couple of days.

- Richard






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 00:59                                                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-10-06 13:59                                                     ` David Rowley <[email protected]>
  2025-10-07 10:56                                                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: David Rowley @ 2025-10-06 13:59 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Mon, 6 Oct 2025 at 13:59, Richard Guo <[email protected]> wrote:
> Barring any objections, I'll plan to push v23 in a couple of days.

Not a complete review, but a cursory look:

1. The name and header comment of setup_base_grouped_rels() claim that
it operates on base relations, but the code seems to be written to
handle OTHER_MEMBER rels too.

Note that set_base_rel_pathlists() explicitly skips anything that's
not RELOPT_BASEREL, so if you're not doing that, then you shouldn't
use "base" in the function name. It's confusing.

2. All the calls to generate_grouped_paths() pass the grouped_rel
RelOptInfo and also grouped_rel->agg_info. Is there a reason to keep
it that way rather than access the agg_info from the given grouped_rel
from within the function?

3. " * The information needed are provided by the RelAggInfo
structure." This should use "is" rather than "are"

4. standard_join_search(). I think it's worth getting rid of the
duplicate "if (!bms_equal(rel->relids, root->all_query_rels))" check.
How about storing the result in a local variable rather than calling
bms_equal() again? I don't believe the compiler will optimise the extra
call away, as it can't know that set_cheapest() doesn't change the
relids. Also, wouldn't it be better to check rel->grouped_rel != NULL
first? Won't that be NULL in most cases, whereas
!bms_equal(rel->relids, root->all_query_rels) will be true in most
cases? Likewise in generate_partitionwise_join_paths().

5. Wouldn't it be better to do 0002 first and get that into core so
you don't have to do the hacky stuff in is_partial_agg_memory_risky()?

6. Shouldn't this be using lappend()?

 agg_clause_list = list_append_unique(agg_clause_list, ac_info);

I don't understand why ac_info could already be in the list. You've
just done: ac_info = makeNode(AggClauseInfo);

7. The following comment talks about "base" relations. I don't think
it should be as the RelOptInfo can be an OTHER_MEMBER rel.

 * build_simple_grouped_rel
 *   Construct a new RelOptInfo representing a grouped version of the input
 *   base relation.
 */

8. Normally we check the List is NIL instead of:

if (list_length(group_clauses) == 0)

9. In get_expression_sortgroupref(), a comment claims "We ignore child
members here.". I think that's outdated since ec_members no longer has
child members.

10. I don't think this comment quite makes sense:

 * "apply_at" tracks the lowest join level at which partial aggregation is
 * applied.

maybe "minimum set of rels to join before partial aggregation can be applied"?

or at least swap "is" for "can be".

My confusion comes from the fact you're stating "lowest join level",
which seems to indicate that it could be applied after further
relations have been joined, but then you're saying "is applied" to
indicate that it can only be applied at that level.

11. The way you've written the header comments for typedef struct
RelAggInfo seems weird.  I've only ever seen extra details in the
header comment when the inline comments have been kept to a single
line. You're spanning multiple lines, so why have the out of line
comments in the header at all?

12. This just doesn't feel like the right name for this field:

/* lowest level partial aggregation is applied at */
Relids apply_at;

I can't help but think that it should be something like "agg_relids" or
"required_relids".  I understand you're currently only applying the
partial grouping when you get exactly the minimum set of relids in the
join search, but if this can be made fast enough, I expect that could
be changed in the future. If you do change it, then "apply_at" is a
pretty confusing name.  Perhaps I've misunderstood here and if you did
that, you'd need to create another RelAggInfo to represent that?

13. Parameter names mismatch between definition and declaration in:

extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
RelOptInfo *rel_plain);
extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
RelOptInfo *rel_plain);

extern void generate_grouped_paths(PlannerInfo *root,
   RelOptInfo *rel_grouped,
   RelOptInfo *rel_plain,
   RelAggInfo *agg_info);

14. Do all the regression tests need VERBOSE in EXPLAIN? It's making
the output kinda huge. It might also be nice to wrap the long queries
onto multiple lines to make them easier to read.

David


* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 00:59                                                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 13:59                                                     ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
@ 2025-10-07 10:56                                                       ` Richard Guo <[email protected]>
  2025-10-08 11:14                                                         ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
  2025-10-08 14:45                                                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  0 siblings, 2 replies; 70+ messages in thread

From: Richard Guo @ 2025-10-07 10:56 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Mon, Oct 6, 2025 at 10:59 PM David Rowley <[email protected]> wrote:
> Not a complete review, but a cursory look:

Thanks for all the comments!  They've been very helpful.

> 1. The name and header comment of setup_base_grouped_rels() claim that
> it operates on base relations, but the code seems to be written to
> handle OTHER_MEMBER rels too.

Indeed.  I renamed it to setup_simple_grouped_rels() and updated the
related comments in v24.

> 2. All the calls to generate_grouped_paths() pass the grouped_rel
> RelOptInfo and also grouped_rel->agg_info. Is there a reason to keep
> it that way rather than access the agg_info from the given grouped_rel
> from within the function?

Thanks.  Fixed by removing the agg_info parameter.

> 3. " * The information needed are provided by the RelAggInfo
> structure." This should use "is" rather than "are"

Yes.

> 4. standard_join_search(). I think it's worth getting rid of the
> duplicate "if (!bms_equal(rel->relids, root->all_query_rels))" check.
> How about storing the result in a local variable rather than calling
> bms_equal() again? I don't believe the compiler will optimise the extra
> call away, as it can't know that set_cheapest() doesn't change the
> relids. Also, wouldn't it be better to check rel->grouped_rel != NULL
> first? Won't that be NULL in most cases, whereas
> !bms_equal(rel->relids, root->all_query_rels) will be true in most
> cases? Likewise in generate_partitionwise_join_paths().

Good point.  Done that way in v24.
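
For illustration only, here is a standalone sketch of that pattern.  It
is not the v24 code: Rel, relids_equal() and count_grouped_steps() are
hypothetical stand-ins for RelOptInfo, bms_equal() and the per-level
loop in standard_join_search(), and the bitmask is a toy replacement
for a Relids set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for RelOptInfo; not PostgreSQL's definition. */
typedef struct Rel
{
    unsigned long relids;       /* toy bitmask standing in for a Relids set */
    struct Rel *grouped_rel;    /* NULL when the rel has no grouped version */
} Rel;

/* Counts comparisons so the effect of caching the result is observable. */
static int relids_compare_calls = 0;

/* Hypothetical stand-in for bms_equal(). */
static bool
relids_equal(unsigned long a, unsigned long b)
{
    relids_compare_calls++;
    return a == b;
}

/*
 * Walk one join level.  Returns the number of rels whose grouped version
 * would be processed; *gather_steps receives the number of non-top rels.
 */
static int
count_grouped_steps(Rel *rels, int nrels, unsigned long all_query_rels,
                    int *gather_steps)
{
    int grouped_steps = 0;

    *gather_steps = 0;
    for (int i = 0; i < nrels; i++)
    {
        Rel *rel = &rels[i];

        /* Compare once per rel and reuse the result below. */
        bool is_top_rel = relids_equal(rel->relids, all_query_rels);

        if (!is_top_rel)
            (*gather_steps)++;

        /* Cheap pointer test first; it fails for most rels. */
        if (rel->grouped_rel != NULL && !is_top_rel)
            grouped_steps++;
    }
    return grouped_steps;
}
```

With the result cached in is_top_rel, the set comparison runs once per
rel rather than twice, and the pointer test short-circuits the second
condition for rels that have no grouped version.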

> 5. Wouldn't it be better to do 0002 first and get that into core so
> you don't have to do the hacky stuff in is_partial_agg_memory_risky()?

Agreed.  Done in v24.
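
As a rough sketch of the convention that the reordered patch relies on
(a negative aggtransspace means the transition state can grow without
bound, zero means a default estimate is used, and a positive value is
an estimate in bytes), something along these lines could be used.  All
names below are hypothetical and are not taken from the patch.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical classification of an aggtransspace value. */
typedef enum TransSpaceKind
{
    TRANSSPACE_DEFAULT,         /* zero: use a default estimate */
    TRANSSPACE_ESTIMATE,        /* positive: estimated size in bytes */
    TRANSSPACE_UNBOUNDED        /* negative: state may grow without bound */
} TransSpaceKind;

static TransSpaceKind
classify_transspace(int aggtransspace)
{
    if (aggtransspace < 0)
        return TRANSSPACE_UNBOUNDED;
    if (aggtransspace == 0)
        return TRANSSPACE_DEFAULT;
    return TRANSSPACE_ESTIMATE;
}

/*
 * Returns true if any aggregate in the array is marked as having an
 * unbounded transition state, in which case a memory-sensitive
 * optimization would be skipped.
 */
static bool
any_unbounded_state(const int *aggtransspaces, int naggs)
{
    for (int i = 0; i < naggs; i++)
    {
        if (classify_transspace(aggtransspaces[i]) == TRANSSPACE_UNBOUNDED)
            return true;
    }
    return false;
}
```

A single aggregate marked with -1, as array_agg and string_agg are in
the catalog changes, is enough for the check to report a risk.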

> 6. Shouldn't this be using lappend()?
>
>  agg_clause_list = list_append_unique(agg_clause_list, ac_info);
>
> I don't understand why ac_info could already be in the list. You've
> just done: ac_info = makeNode(AggClauseInfo);

A query can specify the same Aggref expressions multiple times in the
target list.  Using lappend here can lead to duplicate partial Aggref
nodes in the targetlist of a grouped path, which is what I want to
avoid.
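
The distinction here is between pointer identity and structural
equality: a freshly allocated node is never the same pointer as an
existing list member, but it can still compare equal to one.  As far as
I understand, list_append_unique() tests membership with equal() rather
than by pointer.  A toy sketch of the difference follows; AggItem,
ItemList, append() and append_unique() are hypothetical stand-ins and
not PostgreSQL's List API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_ITEMS 8

/* Hypothetical stand-in for a node describing one aggregate call. */
typedef struct AggItem
{
    int aggfnoid;               /* which aggregate function */
    int argno;                  /* which column it is applied to */
} AggItem;

/* Minimal fixed-capacity list of item pointers. */
typedef struct ItemList
{
    const AggItem *items[MAX_ITEMS];
    int length;
} ItemList;

/* Structural equality: compares contents, not addresses. */
static bool
item_equal(const AggItem *a, const AggItem *b)
{
    return a->aggfnoid == b->aggfnoid && a->argno == b->argno;
}

/* Always appends, like a plain append would. */
static void
append(ItemList *list, const AggItem *item)
{
    assert(list->length < MAX_ITEMS);
    list->items[list->length++] = item;
}

/* Appends only if no structurally equal member is already present. */
static void
append_unique(ItemList *list, const AggItem *item)
{
    for (int i = 0; i < list->length; i++)
    {
        if (item_equal(list->items[i], item))
            return;
    }
    append(list, item);
}
```

Two separately allocated items with identical contents end up as two
entries with append() but as a single entry with append_unique(), which
is the behavior wanted for repeated Aggref expressions.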

> 7. The following comment talks about "base" relations. I don't think
> it should be as the RelOptInfo can be an OTHER_MEMBER rel.
>
>  * build_simple_grouped_rel
>  *   Construct a new RelOptInfo representing a grouped version of the input
>  *   base relation.
>  */

Fixed in v24.


> 8. Normally we check the List is NIL instead of:
>
> if (list_length(group_clauses) == 0)

Right.  Updated in v24.

> 9. In get_expression_sortgroupref(), a comment claims "We ignore child
> members here.". I think that's outdated since ec_members no longer has
> child members.

I think that comment is used to explain why we only scan ec_members
here.  Similar comments can be found in many other places, such as in
equivclass.c:

  /*
   * Found our match.  Scan the other EC members and attempt to generate
   * joinclauses.  Ignore children here.
   */
  foreach(lc2, cur_ec->ec_members)
  {


> 10. I don't think this comment quite makes sense:
>
>  * "apply_at" tracks the lowest join level at which partial aggregation is
>  * applied.
>
> maybe "minimum set of rels to join before partial aggregation can be applied"?
>
> or at least swap "is" for "can be".
>
> My confusion comes from the fact you're stating "lowest join level",
> which seems to indicate that it could be applied after further
> relations have been joined, but then you're saying "is applied" to
> indicate that it can only be applied at that level.
>
> 11. The way you've written the header comments for typedef struct
> RelAggInfo seems weird.  I've only ever seen extra details in the
> header comment when the inline comments have been kept to a single
> line. You're spanning multiple lines, so why have the out of line
> comments in the header at all?
>
> 12. This just doesn't feel like the right name for this field:
>
> /* lowest level partial aggregation is applied at */
> Relids apply_at;
>
> I can't help but think that it should be something like "agg_relids" or
> "required_relids".  I understand you're currently only applying the
> partial grouping when you get exactly the minimum set of relids in the
> join search, but if this can be made fast enough, I expect that could
> be changed in the future. If you do change it, then "apply_at" is a
> pretty confusing name.  Perhaps I've misunderstood here and if you did
> that, you'd need to create another RelAggInfo to represent that?

Hmm, RelAggInfo is a per-relation structure; each grouped relation has
a valid RelAggInfo.  The apply_at field represents the set of relids
where partial aggregation is applied within the paths of this grouped
relation.  If we ever change this approach and allow the planner to
explore all join levels for placing partial aggregation, the apply_at
field will become obsolete (cf. the patch versions prior to v17).

I've updated the comment for apply_at to clarify that it refers to the
relids at which partial aggregation is applied.

I've also updated the comments within RelAggInfo to use one-line
style.

I retained the name of this field though.

> 13. Parameter names mismatch between definition and declaration in:
>
> extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
> RelOptInfo *rel_plain);
> extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
> RelOptInfo *rel_plain);
>
> extern void generate_grouped_paths(PlannerInfo *root,
>    RelOptInfo *rel_grouped,
>    RelOptInfo *rel_plain,
>    RelAggInfo *agg_info);

Nice catch!  Fixed in v24.

> 14. Do all the regression tests need VERBOSE in EXPLAIN? It's making
> the output kinda huge. It might also be nice to wrap the long queries
> onto multiple lines to make them easier to read.

One of the challenges in this patch is generating the correct target
list for each grouped relation.  So I'm kind of inclined to retain
VERBOSE in EXPLAIN.  As I recall, the output target list in the test
cases saved me several times during development when I introduced
problematic code changes.

I wrapped the long queries in v24.

- Richard


Attachments:

  [application/octet-stream] v24-0001-Allow-negative-aggtransspace-to-indicate-unbound.patch (6.3K, 2-v24-0001-Allow-negative-aggtransspace-to-indicate-unbound.patch)
From dc5d4fb9bae1412c3230329d22616e13f3cc9662 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Tue, 7 Oct 2025 10:16:37 +0900
Subject: [PATCH v24 1/2] Allow negative aggtransspace to indicate unbounded
 state size

This patch reuses the existing aggtransspace in pg_aggregate to
signal that an aggregate's transition state can grow unboundedly.  If
aggtransspace is set to a negative value, it now indicates that the
transition state may consume unpredictable or large amounts of memory,
such as in aggregates like array_agg or string_agg that accumulate
input rows.

This information can be used by the planner to avoid applying
memory-sensitive optimizations (e.g., eager aggregation) when there is
a risk of excessive memory usage during partial aggregation.

Bump catalog version.

Per idea from Robert Haas, though applied differently than originally
suggested.

Discussion: https://postgr.es/m/CA+TgmoYbkvYwLa+1vOP7RDY7kO2=A7rppoPusoRXe44VDOGBPg@mail.gmail.com
---
 doc/src/sgml/catalogs.sgml               |  5 ++++-
 doc/src/sgml/ref/create_aggregate.sgml   | 11 ++++++++---
 src/include/catalog/pg_aggregate.dat     | 10 ++++++----
 src/test/regress/expected/opr_sanity.out |  2 +-
 src/test/regress/sql/opr_sanity.sql      |  2 +-
 5 files changed, 20 insertions(+), 10 deletions(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index e9095bedf21..3acc2222a87 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -596,7 +596,10 @@
       </para>
       <para>
        Approximate average size (in bytes) of the transition state
-       data, or zero to use a default estimate
+       data. A positive value provides an estimate; zero means to
+       use a default estimate. A negative value indicates the state
+       data can grow unboundedly in size, such as when the aggregate
+       accumulates input rows (e.g., array_agg, string_agg).
       </para></entry>
      </row>
 
diff --git a/doc/src/sgml/ref/create_aggregate.sgml b/doc/src/sgml/ref/create_aggregate.sgml
index 222e0aa5c9d..0472ac2e874 100644
--- a/doc/src/sgml/ref/create_aggregate.sgml
+++ b/doc/src/sgml/ref/create_aggregate.sgml
@@ -384,9 +384,13 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
      <para>
       The approximate average size (in bytes) of the aggregate's state value.
       If this parameter is omitted or is zero, a default estimate is used
-      based on the <replaceable>state_data_type</replaceable>.
+      based on the <replaceable>state_data_type</replaceable>. If set to a
+      negative value, it indicates the state data can grow unboundedly in
+      size, such as when the aggregate accumulates input rows (e.g.,
+      array_agg, string_agg).
       The planner uses this value to estimate the memory required for a
-      grouped aggregate query.
+      grouped aggregate query and to avoid optimizations that may cause
+      excessive memory usage.
      </para>
     </listitem>
    </varlistentry>
@@ -568,7 +572,8 @@ SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
      <para>
       The approximate average size (in bytes) of the aggregate's state
       value, when using moving-aggregate mode.  This works the same as
-      <replaceable>state_data_size</replaceable>.
+      <replaceable>state_data_size</replaceable>, except that negative
+      values are not used to indicate unbounded state size.
      </para>
     </listitem>
    </varlistentry>
diff --git a/src/include/catalog/pg_aggregate.dat b/src/include/catalog/pg_aggregate.dat
index d6aa1f6ec47..870769e8f14 100644
--- a/src/include/catalog/pg_aggregate.dat
+++ b/src/include/catalog/pg_aggregate.dat
@@ -558,26 +558,28 @@
   aggfinalfn => 'array_agg_finalfn', aggcombinefn => 'array_agg_combine',
   aggserialfn => 'array_agg_serialize',
   aggdeserialfn => 'array_agg_deserialize', aggfinalextra => 't',
-  aggtranstype => 'internal' },
+  aggtranstype => 'internal', aggtransspace => '-1' },
 { aggfnoid => 'array_agg(anyarray)', aggtransfn => 'array_agg_array_transfn',
   aggfinalfn => 'array_agg_array_finalfn',
   aggcombinefn => 'array_agg_array_combine',
   aggserialfn => 'array_agg_array_serialize',
   aggdeserialfn => 'array_agg_array_deserialize', aggfinalextra => 't',
-  aggtranstype => 'internal' },
+  aggtranstype => 'internal', aggtransspace => '-1' },
 
 # text
 { aggfnoid => 'string_agg(text,text)', aggtransfn => 'string_agg_transfn',
   aggfinalfn => 'string_agg_finalfn', aggcombinefn => 'string_agg_combine',
   aggserialfn => 'string_agg_serialize',
-  aggdeserialfn => 'string_agg_deserialize', aggtranstype => 'internal' },
+  aggdeserialfn => 'string_agg_deserialize',
+  aggtranstype => 'internal', aggtransspace => '-1' },
 
 # bytea
 { aggfnoid => 'string_agg(bytea,bytea)',
   aggtransfn => 'bytea_string_agg_transfn',
   aggfinalfn => 'bytea_string_agg_finalfn',
   aggcombinefn => 'string_agg_combine', aggserialfn => 'string_agg_serialize',
-  aggdeserialfn => 'string_agg_deserialize', aggtranstype => 'internal' },
+  aggdeserialfn => 'string_agg_deserialize',
+  aggtranstype => 'internal', aggtransspace => '-1' },
 
 # range
 { aggfnoid => 'range_intersect_agg(anyrange)',
diff --git a/src/test/regress/expected/opr_sanity.out b/src/test/regress/expected/opr_sanity.out
index 20bf9ea9cdf..a357e1d0c0e 100644
--- a/src/test/regress/expected/opr_sanity.out
+++ b/src/test/regress/expected/opr_sanity.out
@@ -1470,7 +1470,7 @@ WHERE aggfnoid = 0 OR aggtransfn = 0 OR
     (aggkind = 'n' AND aggnumdirectargs > 0) OR
     aggfinalmodify NOT IN ('r', 's', 'w') OR
     aggmfinalmodify NOT IN ('r', 's', 'w') OR
-    aggtranstype = 0 OR aggtransspace < 0 OR aggmtransspace < 0;
+    aggtranstype = 0 OR aggmtransspace < 0;
  ctid | aggfnoid 
 ------+----------
 (0 rows)
diff --git a/src/test/regress/sql/opr_sanity.sql b/src/test/regress/sql/opr_sanity.sql
index 2fb3a852878..cd674d7dbca 100644
--- a/src/test/regress/sql/opr_sanity.sql
+++ b/src/test/regress/sql/opr_sanity.sql
@@ -847,7 +847,7 @@ WHERE aggfnoid = 0 OR aggtransfn = 0 OR
     (aggkind = 'n' AND aggnumdirectargs > 0) OR
     aggfinalmodify NOT IN ('r', 's', 'w') OR
     aggmfinalmodify NOT IN ('r', 's', 'w') OR
-    aggtranstype = 0 OR aggtransspace < 0 OR aggmtransspace < 0;
+    aggtranstype = 0 OR aggmtransspace < 0;
 
 -- Make sure the matching pg_proc entry is sensible, too.
 
-- 
2.39.5 (Apple Git-154)



  [application/octet-stream] v24-0002-Implement-Eager-Aggregation.patch (188.5K, 3-v24-0002-Implement-Eager-Aggregation.patch)
From d03a39b1a88bee1280fbdd61529eac428902b39e Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Tue, 11 Jun 2024 15:59:19 +0900
Subject: [PATCH v24 2/2] Implement Eager Aggregation

Eager aggregation is a query optimization technique that partially
pushes aggregation past a join, and finalizes it once all the
relations are joined.  Eager aggregation may reduce the number of
input rows to the join and thus could result in a better overall plan.

In the current planner architecture, the separation between the
scan/join planning phase and the post-scan/join phase means that
aggregation steps are not visible when constructing the join tree,
limiting the planner's ability to exploit aggregation-aware
optimizations.  To implement eager aggregation, we collect information
about aggregate functions in the targetlist and HAVING clause, along
with grouping expressions from the GROUP BY clause, and store it in
the PlannerInfo node.  During the scan/join planning phase, this
information is used to evaluate each base or join relation to
determine whether eager aggregation can be applied.  If applicable, we
create a separate RelOptInfo, referred to as a grouped relation, to
represent the partially-aggregated version of the relation and
generate grouped paths for it.

Grouped relation paths can be generated in two ways.  The first method
involves adding sorted and hashed partial aggregation paths on top of
the non-grouped paths.  To limit planning time, we only consider the
cheapest or suitably-sorted non-grouped paths in this step.
Alternatively, grouped paths can be generated by joining a grouped
relation with a non-grouped relation.  Joining two grouped relations
is currently not supported.

To further limit planning time, we currently adopt a strategy where
partial aggregation is pushed only to the lowest feasible level in the
join tree where it provides a significant reduction in row count.
This strategy also helps ensure that all grouped paths for the same
grouped relation produce the same set of rows, which is important to
support a fundamental assumption of the planner.

For the partial aggregation that is pushed down to a non-aggregated
relation, we need to consider all expressions from this relation that
are involved in upper join clauses and include them in the grouping
keys, using compatible operators.  This is essential to ensure that an
aggregated row from the partial aggregation matches the other side of
the join if and only if each row in the partial group does.  This
ensures that all rows within the same partial group share the same
"destiny", which is crucial for maintaining correctness.

One restriction is that we cannot push partial aggregation down to a
relation that is in the nullable side of an outer join, because the
NULL-extended rows produced by the outer join would not be available
when we perform the partial aggregation, while with a
non-eager-aggregation plan these rows are available for the top-level
aggregation.  Pushing partial aggregation in this case may result in
the rows being grouped differently than expected, or produce incorrect
values from the aggregate functions.

If we have generated a grouped relation for the topmost join relation,
we finalize its paths at the end.  The final paths will compete in the
usual way with paths built from regular planning.

The patch was originally proposed by Antonin Houska in 2017.  This
commit reworks various important aspects and rewrites most of the
current code.  However, the original patch and reviews were very
useful.

Author: Richard Guo <[email protected]>
Author: Antonin Houska <[email protected]> (in an older version)
Reviewed-by: Robert Haas <[email protected]>
Reviewed-by: Jian He <[email protected]>
Reviewed-by: Tender Wang <[email protected]>
Reviewed-by: Matheus Alcantara <[email protected]>
Reviewed-by: Tom Lane <[email protected]>
Reviewed-by: David Rowley <[email protected]>
Reviewed-by: Tomas Vondra <[email protected]> (in an older version)
Reviewed-by: Andy Fan <[email protected]> (in an older version)
Reviewed-by: Ashutosh Bapat <[email protected]> (in an older version)
Discussion: https://postgr.es/m/CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com
---
 .../postgres_fdw/expected/postgres_fdw.out    |   49 +-
 doc/src/sgml/config.sgml                      |   31 +
 src/backend/optimizer/README                  |  110 ++
 src/backend/optimizer/geqo/geqo_eval.c        |   21 +-
 src/backend/optimizer/path/allpaths.c         |  467 ++++-
 src/backend/optimizer/path/joinrels.c         |  193 ++
 src/backend/optimizer/plan/initsplan.c        |  370 ++++
 src/backend/optimizer/plan/planmain.c         |    9 +
 src/backend/optimizer/plan/planner.c          |  124 +-
 src/backend/optimizer/util/appendinfo.c       |   51 +
 src/backend/optimizer/util/relnode.c          |  650 +++++++
 src/backend/utils/misc/guc_parameters.dat     |   16 +
 src/backend/utils/misc/postgresql.conf.sample |    2 +
 src/include/nodes/pathnodes.h                 |  117 ++
 src/include/optimizer/pathnode.h              |    4 +
 src/include/optimizer/paths.h                 |    4 +
 src/include/optimizer/planmain.h              |    1 +
 .../regress/expected/collate.icu.utf8.out     |   32 +-
 src/test/regress/expected/eager_aggregate.out | 1714 +++++++++++++++++
 src/test/regress/expected/join.out            |   12 +-
 .../regress/expected/partition_aggregate.out  |    2 +
 src/test/regress/expected/sysviews.out        |    3 +-
 src/test/regress/parallel_schedule            |    2 +-
 src/test/regress/sql/eager_aggregate.sql      |  380 ++++
 src/test/regress/sql/partition_aggregate.sql  |    2 +
 src/tools/pgindent/typedefs.list              |    3 +
 26 files changed, 4293 insertions(+), 76 deletions(-)
 create mode 100644 src/test/regress/expected/eager_aggregate.out
 create mode 100644 src/test/regress/sql/eager_aggregate.sql

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index 6dc04e916dc..f5a57b9cbd5 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -3701,30 +3701,33 @@ select count(t1.c3) from ft2 t1 left join ft2 t2 on (t1.c1 = random() * t2.c2);
 -- Subquery in FROM clause having aggregate
 explain (verbose, costs off)
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
-                                          QUERY PLAN                                           
------------------------------------------------------------------------------------------------
+                                       QUERY PLAN                                        
+-----------------------------------------------------------------------------------------
  Sort
-   Output: (count(*)), x.b
-   Sort Key: (count(*)), x.b
-   ->  HashAggregate
-         Output: count(*), x.b
-         Group Key: x.b
-         ->  Hash Join
-               Output: x.b
-               Inner Unique: true
-               Hash Cond: (ft1.c2 = x.a)
-               ->  Foreign Scan on public.ft1
-                     Output: ft1.c2
-                     Remote SQL: SELECT c2 FROM "S 1"."T 1"
-               ->  Hash
-                     Output: x.b, x.a
-                     ->  Subquery Scan on x
-                           Output: x.b, x.a
-                           ->  Foreign Scan
-                                 Output: ft1_1.c2, (sum(ft1_1.c1))
-                                 Relations: Aggregate on (public.ft1 ft1_1)
-                                 Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
-(21 rows)
+   Output: (count(*)), (sum(ft1_1.c1))
+   Sort Key: (count(*)), (sum(ft1_1.c1))
+   ->  Finalize GroupAggregate
+         Output: count(*), (sum(ft1_1.c1))
+         Group Key: (sum(ft1_1.c1))
+         ->  Sort
+               Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+               Sort Key: (sum(ft1_1.c1))
+               ->  Hash Join
+                     Output: (sum(ft1_1.c1)), (PARTIAL count(*))
+                     Hash Cond: (ft1_1.c2 = ft1.c2)
+                     ->  Foreign Scan
+                           Output: ft1_1.c2, (sum(ft1_1.c1))
+                           Relations: Aggregate on (public.ft1 ft1_1)
+                           Remote SQL: SELECT c2, sum("C 1") FROM "S 1"."T 1" GROUP BY 1
+                     ->  Hash
+                           Output: ft1.c2, (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: ft1.c2, PARTIAL count(*)
+                                 Group Key: ft1.c2
+                                 ->  Foreign Scan on public.ft1
+                                       Output: ft1.c2
+                                       Remote SQL: SELECT c2 FROM "S 1"."T 1"
+(24 rows)
 
 select count(*), x.b from ft1, (select c2 a, sum(c1) b from ft1 group by c2) x where ft1.c2 = x.a group by x.b order by 1, 2;
  count |   b   
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index e9b420f3ddb..39e658b7808 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -5475,6 +5475,21 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-enable-eager-aggregate" xreflabel="enable_eager_aggregate">
+      <term><varname>enable_eager_aggregate</varname> (<type>boolean</type>)
+      <indexterm>
+       <primary><varname>enable_eager_aggregate</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Enables or disables the query planner's ability to partially push
+        aggregation past a join, and finalize it once all the relations are
+        joined. The default is <literal>on</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-enable-gathermerge" xreflabel="enable_gathermerge">
       <term><varname>enable_gathermerge</varname> (<type>boolean</type>)
       <indexterm>
@@ -6095,6 +6110,22 @@ ANY <replaceable class="parameter">num_sync</replaceable> ( <replaceable class="
       </listitem>
      </varlistentry>
 
+     <varlistentry id="guc-min-eager-agg-group-size" xreflabel="min_eager_agg_group_size">
+      <term><varname>min_eager_agg_group_size</varname> (<type>floating point</type>)
+      <indexterm>
+       <primary><varname>min_eager_agg_group_size</varname> configuration parameter</primary>
+      </indexterm>
+      </term>
+      <listitem>
+       <para>
+        Sets the minimum average group size required to consider applying
+        eager aggregation. This helps avoid the overhead of eager
+        aggregation when it does not offer significant row count reduction.
+        The default is <literal>8</literal>.
+       </para>
+      </listitem>
+     </varlistentry>
+
      <varlistentry id="guc-jit-above-cost" xreflabel="jit_above_cost">
       <term><varname>jit_above_cost</varname> (<type>floating point</type>)
       <indexterm>
diff --git a/src/backend/optimizer/README b/src/backend/optimizer/README
index 843368096fd..6c35baceedb 100644
--- a/src/backend/optimizer/README
+++ b/src/backend/optimizer/README
@@ -1500,3 +1500,113 @@ breaking down aggregation or grouping over a partitioned relation into
 aggregation or grouping over its partitions is called partitionwise
 aggregation.  Especially when the partition keys match the GROUP BY clause,
 this can be significantly faster than the regular method.
+
+Eager aggregation
+-----------------
+
+Eager aggregation is a query optimization technique that partially
+pushes aggregation past a join, and finalizes it once all the
+relations are joined.  Eager aggregation may reduce the number of
+input rows to the join and thus could result in a better overall plan.
+
+To prove that the transformation is correct, let's first consider the
+case where only inner joins are involved.  In this case, we partition
+the tables in the FROM clause into two groups: those that contain at
+least one aggregation column, and those that do not contain any
+aggregation columns.  Each group can be treated as a single relation
+formed by the Cartesian product of the tables within that group.
+Therefore, without loss of generality, we can assume that the FROM
+clause contains exactly two relations, R1 and R2, where R1 represents
+the relation containing all aggregation columns, and R2 represents the
+relation without any aggregation columns.
+
+Let the query be of the form:
+
+SELECT G, AGG(A)
+FROM R1 JOIN R2 ON J
+GROUP BY G;
+
+where G is the set of grouping keys that may include columns from R1
+and/or R2; AGG(A) is an aggregate function over columns A from R1; J
+is the join condition between R1 and R2.
+
+The transformation of eager aggregation is:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 ON J)
+
+This equivalence holds under the following conditions:
+
+1) AGG is decomposable, meaning that it can be computed in two stages:
+a partial aggregation followed by a final aggregation;
+2) The set G1 used in the pre-aggregation of R1 includes:
+    * all columns from R1 that are part of the grouping keys G, and
+    * all columns from R1 that appear in the join condition J.
+3) The grouping operator for any column in G1 must be compatible with
+the operator used for that column in the join condition J.
+
+Since G1 includes all columns from R1 that appear in either the
+grouping keys G or the join condition J, all rows within each partial
+group have identical values for both the grouping keys and the
+join-relevant columns from R1, assuming compatible operators are used.
+As a result, the rows within a partial group are indistinguishable in
+terms of their contribution to the aggregation and their behavior in
+the join.  This ensures that all rows in the same partial group share
+the same "destiny": they either all match or all fail to match a given
+row in R2.  Because the aggregate function AGG is decomposable,
+aggregating the partial results after the join yields the same final
+result as aggregating after the full join, thereby preserving query
+semantics.  Q.E.D.
+
+In the case where there are any outer joins, the situation becomes
+more complex due to join order constraints and the semantics of
+NULL-extension in outer joins.  If the relations that contain at least
+one aggregation column cannot be treated as a single relation because
+of the join order constraints, partial aggregation paths will not be
+generated, and thus the transformation is not applicable.  Otherwise,
+let R1 be the relation containing all aggregation columns, and R2, R3,
+... be the remaining relations.  From the inner join case, under the
+aforementioned conditions, we have the equivalence:
+
+    GROUP BY G, AGG(A) on (R1 JOIN R2 JOIN R3 ...)
+    =
+    GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1) JOIN R2 JOIN R3 ...)
+
+To preserve correctness when outer joins are involved, we require an
+additional condition:
+
+4) R1 must not be on the nullable side of any outer join.
+
+This condition ensures that partial aggregation over R1 does not
+suppress any NULL-extended rows that would be introduced by outer
+joins.  If R1 is on the nullable side of an outer join, the
+NULL-extended rows produced by the outer join would not be available
+when we perform the partial aggregation, while with a
+non-eager-aggregation plan these rows are available for the top-level
+aggregation.  Pushing partial aggregation in this case may result in
+the rows being grouped differently than expected, or produce incorrect
+values from the aggregate functions.
+
+During the construction of the join tree, we evaluate each base or
+join relation to determine if eager aggregation can be applied.  If
+feasible, we create a separate RelOptInfo called a "grouped relation"
+and generate grouped paths by adding sorted and hashed partial
+aggregation paths on top of the non-grouped paths.  To limit planning
+time, we consider only the cheapest or suitably-sorted non-grouped
+paths in this step.
+
+Another way to generate grouped paths is to join a grouped relation
+with a non-grouped relation.  Joining two grouped relations is
+currently not supported.
+
+To further limit planning time, we currently adopt a strategy where
+partial aggregation is pushed only to the lowest feasible level in the
+join tree where it provides a significant reduction in row count.
+This strategy also helps ensure that all grouped paths for the same
+grouped relation produce the same set of rows, which is important to
+support a fundamental assumption of the planner.
+
+If we have generated a grouped relation for the topmost join relation,
+we need to finalize its paths at the end.  The final paths will
+compete in the usual way with paths built from regular planning.
diff --git a/src/backend/optimizer/geqo/geqo_eval.c b/src/backend/optimizer/geqo/geqo_eval.c
index f07d1dc8ac6..e39c5da63eb 100644
--- a/src/backend/optimizer/geqo/geqo_eval.c
+++ b/src/backend/optimizer/geqo/geqo_eval.c
@@ -264,6 +264,9 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
 			/* Keep searching if join order is not valid */
 			if (joinrel)
 			{
+				bool		is_top_rel = bms_equal(joinrel->relids,
+												   root->all_query_rels);
+
 				/* Create paths for partitionwise joins. */
 				generate_partitionwise_join_paths(root, joinrel);
 
@@ -273,12 +276,28 @@ merge_clump(PlannerInfo *root, List *clumps, Clump *new_clump, int num_gene,
 				 * rel once we know the final targetlist (see
 				 * grouping_planner).
 				 */
-				if (!bms_equal(joinrel->relids, root->all_query_rels))
+				if (!is_top_rel)
 					generate_useful_gather_paths(root, joinrel, false);
 
 				/* Find and save the cheapest paths for this joinrel */
 				set_cheapest(joinrel);
 
+				/*
+				 * Except for the topmost scan/join rel, consider generating
+				 * partial aggregation paths for the grouped relation on top
+				 * of the paths of this rel.  After that, we're done creating
+				 * paths for the grouped relation, so run set_cheapest().
+				 */
+				if (joinrel->grouped_rel != NULL && !is_top_rel)
+				{
+					RelOptInfo *grouped_rel = joinrel->grouped_rel;
+
+					Assert(IS_GROUPED_REL(grouped_rel));
+
+					generate_grouped_paths(root, grouped_rel, joinrel);
+					set_cheapest(grouped_rel);
+				}
+
 				/* Absorb new clump into old */
 				old_clump->joinrel = joinrel;
 				old_clump->size += new_clump->size;
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index d7ff36d89be..cc562518b04 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -40,6 +40,7 @@
 #include "optimizer/paths.h"
 #include "optimizer/plancat.h"
 #include "optimizer/planner.h"
+#include "optimizer/prep.h"
 #include "optimizer/tlist.h"
 #include "parser/parse_clause.h"
 #include "parser/parsetree.h"
@@ -47,6 +48,7 @@
 #include "port/pg_bitutils.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
 
 
 /* Bitmask flags for pushdown_safety_info.unsafeFlags */
@@ -77,7 +79,9 @@ typedef enum pushdown_safe_type
 
 /* These parameters are set by GUC */
 bool		enable_geqo = false;	/* just in case GUC doesn't set it */
+bool		enable_eager_aggregate = true;
 int			geqo_threshold;
+double		min_eager_agg_group_size;
 int			min_parallel_table_scan_size;
 int			min_parallel_index_scan_size;
 
@@ -90,6 +94,7 @@ join_search_hook_type join_search_hook = NULL;
 
 static void set_base_rel_consider_startup(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
+static void setup_simple_grouped_rels(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
 						 Index rti, RangeTblEntry *rte);
@@ -114,6 +119,7 @@ static void set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 								Index rti, RangeTblEntry *rte);
 static void set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 									Index rti, RangeTblEntry *rte);
+static void set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel);
 static void generate_orderedappend_paths(PlannerInfo *root, RelOptInfo *rel,
 										 List *live_childrels,
 										 List *all_child_pathkeys);
@@ -182,6 +188,12 @@ make_one_rel(PlannerInfo *root, List *joinlist)
 	 */
 	set_base_rel_sizes(root);
 
+	/*
+	 * Build grouped relations for simple rels (i.e., base or "other" member
+	 * relations) where possible.
+	 */
+	setup_simple_grouped_rels(root);
+
 	/*
 	 * We should now have size estimates for every actual table involved in
 	 * the query, and we also know which if any have been deleted from the
@@ -323,6 +335,39 @@ set_base_rel_sizes(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_simple_grouped_rels
+ *	  For each simple relation, build a grouped simple relation if eager
+ *	  aggregation is possible and if this relation can produce grouped paths.
+ */
+static void
+setup_simple_grouped_rels(PlannerInfo *root)
+{
+	Index		rti;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	for (rti = 1; rti < root->simple_rel_array_size; rti++)
+	{
+		RelOptInfo *rel = root->simple_rel_array[rti];
+
+		/* there may be empty slots corresponding to non-baserel RTEs */
+		if (rel == NULL)
+			continue;
+
+		Assert(rel->relid == rti);	/* sanity check on array */
+		Assert(IS_SIMPLE_REL(rel)); /* sanity check on rel */
+
+		(void) build_simple_grouped_rel(root, rel);
+	}
+}
+
 /*
  * set_base_rel_pathlists
  *	  Finds all paths available for scanning each base-relation entry.
@@ -559,6 +604,15 @@ set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	/* Now find the cheapest of the paths for this rel */
 	set_cheapest(rel);
 
+	/*
+	 * If a grouped relation for this rel exists, build partial aggregation
+	 * paths for it.
+	 *
+	 * Note that this can only happen after we've called set_cheapest() for
+	 * this base rel, because we need its cheapest paths.
+	 */
+	set_grouped_rel_pathlist(root, rel);
+
 #ifdef OPTIMIZER_DEBUG
 	pprint(rel);
 #endif
@@ -1305,6 +1359,35 @@ set_append_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	add_paths_to_append_rel(root, rel, live_childrels);
 }
 
+/*
+ * set_grouped_rel_pathlist
+ *	  If a grouped relation for the given 'rel' exists, build partial
+ *	  aggregation paths for it.
+ */
+static void
+set_grouped_rel_pathlist(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Add paths to the grouped base relation if one exists. */
+	grouped_rel = rel->grouped_rel;
+	if (grouped_rel)
+	{
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		generate_grouped_paths(root, grouped_rel, rel);
+		set_cheapest(grouped_rel);
+	}
+}
+
 
 /*
  * add_paths_to_append_rel
@@ -3332,6 +3415,345 @@ generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel, bool override_r
 	}
 }
 
+/*
+ * generate_grouped_paths
+ *		Generate paths for a grouped relation by adding sorted and hashed
+ *		partial aggregation paths on top of paths of the ungrouped relation.
+ *
+ * The information needed is provided by the RelAggInfo structure stored in
+ * "grouped_rel".
+ */
+void
+generate_grouped_paths(PlannerInfo *root, RelOptInfo *grouped_rel,
+					   RelOptInfo *rel)
+{
+	RelAggInfo *agg_info = grouped_rel->agg_info;
+	AggClauseCosts agg_costs;
+	bool		can_hash;
+	bool		can_sort;
+	Path	   *cheapest_total_path = NULL;
+	Path	   *cheapest_partial_path = NULL;
+	double		dNumGroups = 0;
+	double		dNumPartialGroups = 0;
+	List	   *group_pathkeys = NIL;
+
+	if (IS_DUMMY_REL(rel))
+	{
+		mark_dummy_rel(grouped_rel);
+		return;
+	}
+
+	/*
+	 * We push partial aggregation only to the lowest possible level in the
+	 * join tree that is deemed useful.
+	 */
+	if (!bms_equal(agg_info->apply_at, rel->relids) ||
+		!agg_info->agg_useful)
+		return;
+
+	MemSet(&agg_costs, 0, sizeof(AggClauseCosts));
+	get_agg_clause_costs(root, AGGSPLIT_INITIAL_SERIAL, &agg_costs);
+
+	/*
+	 * Determine whether it's possible to perform sort-based implementations
+	 * of grouping, and generate the pathkeys that represent the grouping
+	 * requirements in that case.
+	 */
+	can_sort = grouping_is_sortable(agg_info->group_clauses);
+	if (can_sort)
+	{
+		RelOptInfo *top_grouped_rel;
+		List	   *top_group_tlist;
+
+		top_grouped_rel = IS_OTHER_REL(rel) ?
+			rel->top_parent->grouped_rel : grouped_rel;
+		top_group_tlist =
+			make_tlist_from_pathtarget(top_grouped_rel->agg_info->target);
+
+		group_pathkeys =
+			make_pathkeys_for_sortclauses(root, agg_info->group_clauses,
+										  top_group_tlist);
+	}
+
+	/*
+	 * Determine whether we should consider hash-based implementations of
+	 * grouping.
+	 */
+	Assert(root->numOrderedAggs == 0);
+	can_hash = (agg_info->group_clauses != NIL &&
+				grouping_is_hashable(agg_info->group_clauses));
+
+	/*
+	 * Consider whether we should generate partially aggregated non-partial
+	 * paths.  We can only do this if we have a non-partial path.
+	 */
+	if (rel->pathlist != NIL)
+	{
+		cheapest_total_path = rel->cheapest_total_path;
+		Assert(cheapest_total_path != NULL);
+	}
+
+	/*
+	 * If parallelism is possible for grouped_rel, then we should consider
+	 * generating partially-grouped partial paths.  However, if the ungrouped
+	 * rel has no partial paths, then we can't.
+	 */
+	if (grouped_rel->consider_parallel && rel->partial_pathlist != NIL)
+	{
+		cheapest_partial_path = linitial(rel->partial_pathlist);
+		Assert(cheapest_partial_path != NULL);
+	}
+
+	/* Estimate number of partial groups. */
+	if (cheapest_total_path != NULL)
+		dNumGroups = estimate_num_groups(root,
+										 agg_info->group_exprs,
+										 cheapest_total_path->rows,
+										 NULL, NULL);
+	if (cheapest_partial_path != NULL)
+		dNumPartialGroups = estimate_num_groups(root,
+												agg_info->group_exprs,
+												cheapest_partial_path->rows,
+												NULL, NULL);
+
+	if (can_sort && cheapest_total_path != NULL)
+	{
+		ListCell   *lc;
+
+		/*
+		 * Use any available suitably-sorted path as input, and also consider
+		 * sorting the cheapest-total path and incremental sort on any paths
+		 * with presorted keys.
+		 *
+		 * To save planning time, we ignore parameterized input paths unless
+		 * they are the cheapest-total path.
+		 */
+		foreach(lc, rel->pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			/*
+			 * Ignore parameterized paths that are not the cheapest-total
+			 * path.
+			 */
+			if (input_path->param_info &&
+				input_path != cheapest_total_path)
+				continue;
+
+			is_sorted = pathkeys_count_contained_in(group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest total path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_total_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumGroups);
+
+			add_path(grouped_rel, path);
+		}
+	}
+
+	if (can_sort && cheapest_partial_path != NULL)
+	{
+		ListCell   *lc;
+
+		/* Similar to above logic, but for partial paths. */
+		foreach(lc, rel->partial_pathlist)
+		{
+			Path	   *input_path = (Path *) lfirst(lc);
+			Path	   *path;
+			bool		is_sorted;
+			int			presorted_keys;
+
+			is_sorted = pathkeys_count_contained_in(group_pathkeys,
+													input_path->pathkeys,
+													&presorted_keys);
+
+			/*
+			 * Ignore paths that are not suitably or partially sorted, unless
+			 * they are the cheapest partial path (no need to deal with paths
+			 * which have presorted keys when incremental sort is disabled).
+			 */
+			if (!is_sorted && input_path != cheapest_partial_path &&
+				(presorted_keys == 0 || !enable_incremental_sort))
+				continue;
+
+			/*
+			 * Since the path originates from a non-grouped relation that is
+			 * not aware of eager aggregation, we must ensure that it provides
+			 * the correct input for partial aggregation.
+			 */
+			path = (Path *) create_projection_path(root,
+												   grouped_rel,
+												   input_path,
+												   agg_info->agg_input);
+
+			if (!is_sorted)
+			{
+				/*
+				 * We've no need to consider both a sort and incremental sort.
+				 * We'll just do a sort if there are no presorted keys and an
+				 * incremental sort when there are presorted keys.
+				 */
+				if (presorted_keys == 0 || !enable_incremental_sort)
+					path = (Path *) create_sort_path(root,
+													 grouped_rel,
+													 path,
+													 group_pathkeys,
+													 -1.0);
+				else
+					path = (Path *) create_incremental_sort_path(root,
+																 grouped_rel,
+																 path,
+																 group_pathkeys,
+																 presorted_keys,
+																 -1.0);
+			}
+
+			/*
+			 * qual is NIL because the HAVING clause cannot be evaluated until
+			 * the final value of the aggregate is known.
+			 */
+			path = (Path *) create_agg_path(root,
+											grouped_rel,
+											path,
+											agg_info->target,
+											AGG_SORTED,
+											AGGSPLIT_INITIAL_SERIAL,
+											agg_info->group_clauses,
+											NIL,
+											&agg_costs,
+											dNumPartialGroups);
+
+			add_partial_path(grouped_rel, path);
+		}
+	}
+
+	/*
+	 * Add a partially-grouped HashAgg Path where possible
+	 */
+	if (can_hash && cheapest_total_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_total_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumGroups);
+
+		add_path(grouped_rel, path);
+	}
+
+	/*
+	 * Now add a partially-grouped HashAgg partial Path where possible
+	 */
+	if (can_hash && cheapest_partial_path != NULL)
+	{
+		Path	   *path;
+
+		/*
+		 * Since the path originates from a non-grouped relation that is not
+		 * aware of eager aggregation, we must ensure that it provides the
+		 * correct input for partial aggregation.
+		 */
+		path = (Path *) create_projection_path(root,
+											   grouped_rel,
+											   cheapest_partial_path,
+											   agg_info->agg_input);
+
+		/*
+		 * qual is NIL because the HAVING clause cannot be evaluated until the
+		 * final value of the aggregate is known.
+		 */
+		path = (Path *) create_agg_path(root,
+										grouped_rel,
+										path,
+										agg_info->target,
+										AGG_HASHED,
+										AGGSPLIT_INITIAL_SERIAL,
+										agg_info->group_clauses,
+										NIL,
+										&agg_costs,
+										dNumPartialGroups);
+
+		add_partial_path(grouped_rel, path);
+	}
+}
+
 /*
  * make_rel_from_joinlist
  *	  Build access paths using a "joinlist" to guide the join path search.
@@ -3491,11 +3913,19 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 		 *
 		 * After that, we're done creating paths for the joinrel, so run
 		 * set_cheapest().
+		 *
+		 * We also run generate_grouped_paths() for the grouped
+		 * relation of each just-processed joinrel, and run set_cheapest() for
+		 * the grouped relation afterwards.
 		 */
 		foreach(lc, root->join_rel_level[lev])
 		{
+			bool		is_top_rel;
+
 			rel = (RelOptInfo *) lfirst(lc);
 
+			is_top_rel = bms_equal(rel->relids, root->all_query_rels);
+
 			/* Create paths for partitionwise joins. */
 			generate_partitionwise_join_paths(root, rel);
 
@@ -3505,12 +3935,28 @@ standard_join_search(PlannerInfo *root, int levels_needed, List *initial_rels)
 			 * once we know the final targetlist (see grouping_planner's and
 			 * its call to apply_scanjoin_target_to_paths).
 			 */
-			if (!bms_equal(rel->relids, root->all_query_rels))
+			if (!is_top_rel)
 				generate_useful_gather_paths(root, rel, false);
 
 			/* Find and save the cheapest paths for this rel */
 			set_cheapest(rel);
 
+			/*
+			 * Except for the topmost scan/join rel, consider generating
+			 * partial aggregation paths for the grouped relation on top of
+			 * the paths of this rel.  After that, we're done creating paths
+			 * for the grouped relation, so run set_cheapest().
+			 */
+			if (rel->grouped_rel != NULL && !is_top_rel)
+			{
+				RelOptInfo *grouped_rel = rel->grouped_rel;
+
+				Assert(IS_GROUPED_REL(grouped_rel));
+
+				generate_grouped_paths(root, grouped_rel, rel);
+				set_cheapest(grouped_rel);
+			}
+
 #ifdef OPTIMIZER_DEBUG
 			pprint(rel);
 #endif
@@ -4380,6 +4826,25 @@ generate_partitionwise_join_paths(PlannerInfo *root, RelOptInfo *rel)
 		if (IS_DUMMY_REL(child_rel))
 			continue;
 
+		/*
+		 * Except for the topmost scan/join rel, consider generating partial
+		 * aggregation paths for the grouped relation on top of the paths of
+		 * this partitioned child-join.  After that, we're done creating paths
+		 * for the grouped relation, so run set_cheapest().
+		 */
+		if (child_rel->grouped_rel != NULL &&
+			!bms_equal(IS_OTHER_REL(rel) ?
+					   rel->top_parent_relids : rel->relids,
+					   root->all_query_rels))
+		{
+			RelOptInfo *grouped_rel = child_rel->grouped_rel;
+
+			Assert(IS_GROUPED_REL(grouped_rel));
+
+			generate_grouped_paths(root, grouped_rel, child_rel);
+			set_cheapest(grouped_rel);
+		}
+
 #ifdef OPTIMIZER_DEBUG
 		pprint(child_rel);
 #endif
diff --git a/src/backend/optimizer/path/joinrels.c b/src/backend/optimizer/path/joinrels.c
index 535248aa525..43b84d239ed 100644
--- a/src/backend/optimizer/path/joinrels.c
+++ b/src/backend/optimizer/path/joinrels.c
@@ -16,6 +16,7 @@
 
 #include "miscadmin.h"
 #include "optimizer/appendinfo.h"
+#include "optimizer/cost.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
@@ -36,6 +37,9 @@ static bool has_legal_joinclause(PlannerInfo *root, RelOptInfo *rel);
 static bool restriction_is_constant_false(List *restrictlist,
 										  RelOptInfo *joinrel,
 										  bool only_pushed_down);
+static void make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+								  RelOptInfo *rel2, RelOptInfo *joinrel,
+								  SpecialJoinInfo *sjinfo, List *restrictlist);
 static void populate_joinrel_with_paths(PlannerInfo *root, RelOptInfo *rel1,
 										RelOptInfo *rel2, RelOptInfo *joinrel,
 										SpecialJoinInfo *sjinfo, List *restrictlist);
@@ -762,6 +766,10 @@ make_join_rel(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2)
 		return joinrel;
 	}
 
+	/* Build a grouped join relation for 'joinrel' if possible. */
+	make_grouped_join_rel(root, rel1, rel2, joinrel, sjinfo,
+						  restrictlist);
+
 	/* Add paths to the join relation. */
 	populate_joinrel_with_paths(root, rel1, rel2, joinrel, sjinfo,
 								restrictlist);
@@ -873,6 +881,186 @@ add_outer_joins_to_relids(PlannerInfo *root, Relids input_relids,
 	return input_relids;
 }
 
+/*
+ * make_grouped_join_rel
+ *	  Build a grouped join relation for the given "joinrel" if eager
+ *	  aggregation is applicable and the resulting grouped paths are considered
+ *	  useful.
+ *
+ * There are two strategies for generating grouped paths for a join relation:
+ *
+ * 1. Join a grouped (partially aggregated) input relation with a non-grouped
+ * input (e.g., AGG(B) JOIN A).
+ *
+ * 2. Apply partial aggregation (sorted or hashed) on top of existing
+ * non-grouped join paths (e.g., AGG(A JOIN B)).
+ *
+ * To limit planning effort and avoid an explosion of alternatives, we adopt a
+ * strategy where partial aggregation is only pushed to the lowest possible
+ * level in the join tree that is deemed useful.  That is, if grouped paths can
+ * be built using the first strategy, we skip consideration of the second
+ * strategy for the same join level.
+ *
+ * Additionally, if there are multiple lowest useful levels where partial
+ * aggregation could be applied, such as in a join tree with relations A, B,
+ * and C where both "AGG(A JOIN B) JOIN C" and "A JOIN AGG(B JOIN C)" are valid
+ * placements, we choose only the first one encountered during join search.
+ * This avoids generating multiple versions of the same grouped relation based
+ * on different aggregation placements.
+ *
+ * These heuristics also ensure that all grouped paths for the same grouped
+ * relation produce the same set of rows, which is a basic assumption in the
+ * planner.
+ */
+static void
+make_grouped_join_rel(PlannerInfo *root, RelOptInfo *rel1,
+					  RelOptInfo *rel2, RelOptInfo *joinrel,
+					  SpecialJoinInfo *sjinfo, List *restrictlist)
+{
+	RelOptInfo *grouped_rel;
+	RelOptInfo *grouped_rel1;
+	RelOptInfo *grouped_rel2;
+	bool		rel1_empty;
+	bool		rel2_empty;
+	Relids		agg_apply_at;
+
+	/*
+	 * If there are no aggregate expressions or grouping expressions, eager
+	 * aggregation is not possible.
+	 */
+	if (root->agg_clause_list == NIL ||
+		root->group_expr_list == NIL)
+		return;
+
+	/* Retrieve the grouped relations for the two input rels */
+	grouped_rel1 = rel1->grouped_rel;
+	grouped_rel2 = rel2->grouped_rel;
+
+	rel1_empty = (grouped_rel1 == NULL || IS_DUMMY_REL(grouped_rel1));
+	rel2_empty = (grouped_rel2 == NULL || IS_DUMMY_REL(grouped_rel2));
+
+	/* Find or construct a grouped joinrel for this joinrel */
+	grouped_rel = joinrel->grouped_rel;
+	if (grouped_rel == NULL)
+	{
+		RelAggInfo *agg_info = NULL;
+
+		/*
+		 * Prepare the information needed to create grouped paths for this
+		 * join relation.
+		 */
+		agg_info = create_rel_agg_info(root, joinrel, rel1_empty == rel2_empty);
+		if (agg_info == NULL)
+			return;
+
+		/*
+		 * If grouped paths for the given join relation are not considered
+		 * useful, and no grouped paths can be built by joining grouped input
+		 * relations, skip building the grouped join relation.
+		 */
+		if (!agg_info->agg_useful &&
+			(rel1_empty == rel2_empty))
+			return;
+
+		/* build the grouped relation */
+		grouped_rel = build_grouped_rel(root, joinrel);
+		grouped_rel->reltarget = agg_info->target;
+
+		if (rel1_empty != rel2_empty)
+		{
+			/*
+			 * If there is exactly one grouped input relation, then we can
+			 * build grouped paths by joining the input relations.  Set size
+			 * estimates for the grouped join relation based on the input
+			 * relations, and update the set of relids where partial
+			 * aggregation is applied to that of the grouped input relation.
+			 */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			agg_info->apply_at = rel1_empty ?
+				grouped_rel2->agg_info->apply_at :
+				grouped_rel1->agg_info->apply_at;
+		}
+		else
+		{
+			/*
+			 * Otherwise, grouped paths can be built by applying partial
+			 * aggregation on top of existing non-grouped join paths.  Set
+			 * size estimates for the grouped join relation based on the
+			 * estimated number of groups, and track the set of relids where
+			 * partial aggregation is applied.  Note that these values may be
+			 * updated later if it is determined that grouped paths can be
+			 * constructed by joining other input relations.
+			 */
+			grouped_rel->rows = agg_info->grouped_rows;
+			agg_info->apply_at = bms_copy(joinrel->relids);
+		}
+
+		grouped_rel->agg_info = agg_info;
+		joinrel->grouped_rel = grouped_rel;
+	}
+
+	Assert(IS_GROUPED_REL(grouped_rel));
+
+	/* We may have already proven this grouped join relation to be dummy. */
+	if (IS_DUMMY_REL(grouped_rel))
+		return;
+
+	/*
+	 * Nothing to do if there's no grouped input relation.  Also, joining two
+	 * grouped relations is not currently supported.
+	 */
+	if (rel1_empty == rel2_empty)
+		return;
+
+	/*
+	 * Get the set of relids where partial aggregation is applied among the
+	 * given input relations.
+	 */
+	agg_apply_at = rel1_empty ?
+		grouped_rel2->agg_info->apply_at :
+		grouped_rel1->agg_info->apply_at;
+
+	/*
+	 * If it's not the designated level, skip building grouped paths.
+	 *
+	 * One exception is when it is a subset of the previously recorded level.
+	 * In that case, we need to update the designated level to this one, and
+	 * adjust the size estimates for the grouped join relation accordingly.
+	 * For example, suppose partial aggregation can be applied on top of (B
+	 * JOIN C).  If we first construct the join as ((A JOIN B) JOIN C), we'd
+	 * record the designated level as including all three relations (A B C).
+	 * Later, when we consider (A JOIN (B JOIN C)), we encounter the smaller
+	 * (B C) join level directly.  Since this is a subset of the previous
+	 * level and still valid for partial aggregation, we update the designated
+	 * level to (B C), and adjust the size estimates accordingly.
+	 */
+	if (!bms_equal(agg_apply_at, grouped_rel->agg_info->apply_at))
+	{
+		if (bms_is_subset(agg_apply_at, grouped_rel->agg_info->apply_at))
+		{
+			/* Adjust the size estimates for the grouped join relation. */
+			set_joinrel_size_estimates(root, grouped_rel,
+									   rel1_empty ? rel1 : grouped_rel1,
+									   rel2_empty ? rel2 : grouped_rel2,
+									   sjinfo, restrictlist);
+			grouped_rel->agg_info->apply_at = agg_apply_at;
+		}
+		else
+			return;
+	}
+
+	/* Make paths for the grouped join relation. */
+	populate_joinrel_with_paths(root,
+								rel1_empty ? rel1 : grouped_rel1,
+								rel2_empty ? rel2 : grouped_rel2,
+								grouped_rel,
+								sjinfo,
+								restrictlist);
+}
+
 /*
  * populate_joinrel_with_paths
  *	  Add paths to the given joinrel for given pair of joining relations. The
@@ -1615,6 +1803,11 @@ try_partitionwise_join(PlannerInfo *root, RelOptInfo *rel1, RelOptInfo *rel2,
 						 adjust_child_relids(joinrel->relids,
 											 nappinfos, appinfos)));
 
+		/* Build a grouped join relation for 'child_joinrel' if possible */
+		make_grouped_join_rel(root, child_rel1, child_rel2,
+							  child_joinrel, child_sjinfo,
+							  child_restrictlist);
+
 		/* And make paths for the child join */
 		populate_joinrel_with_paths(root, child_rel1, child_rel2,
 									child_joinrel, child_sjinfo,
diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index 3e3fec89252..b8d1c7e88a3 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -14,6 +14,7 @@
  */
 #include "postgres.h"
 
+#include "access/nbtree.h"
 #include "catalog/pg_constraint.h"
 #include "catalog/pg_type.h"
 #include "nodes/makefuncs.h"
@@ -31,6 +32,7 @@
 #include "optimizer/restrictinfo.h"
 #include "parser/analyze.h"
 #include "rewrite/rewriteManip.h"
+#include "utils/fmgroids.h"
 #include "utils/lsyscache.h"
 #include "utils/rel.h"
 #include "utils/typcache.h"
@@ -81,6 +83,12 @@ typedef struct JoinTreeItem
 } JoinTreeItem;
 
 
+static bool is_partial_agg_memory_risky(PlannerInfo *root);
+static void create_agg_clause_infos(PlannerInfo *root);
+static void create_grouping_expr_infos(PlannerInfo *root);
+static EquivalenceClass *get_eclass_for_sortgroupclause(PlannerInfo *root,
+														SortGroupClause *sgc,
+														Expr *expr);
 static void extract_lateral_references(PlannerInfo *root, RelOptInfo *brel,
 									   Index rtindex);
 static List *deconstruct_recurse(PlannerInfo *root, Node *jtnode,
@@ -628,6 +636,368 @@ remove_useless_groupby_columns(PlannerInfo *root)
 	}
 }
 
+/*
+ * setup_eager_aggregation
+ *	  Check if eager aggregation is applicable, and if so collect suitable
+ *	  aggregate expressions and grouping expressions in the query.
+ */
+void
+setup_eager_aggregation(PlannerInfo *root)
+{
+	/*
+	 * Don't apply eager aggregation if disabled by user.
+	 */
+	if (!enable_eager_aggregate)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if there are no available GROUP BY
+	 * clauses.
+	 */
+	if (!root->processed_groupClause)
+		return;
+
+	/*
+	 * For now we don't try to support grouping sets.
+	 */
+	if (root->parse->groupingSets)
+		return;
+
+	/*
+	 * For now we don't try to support DISTINCT or ORDER BY aggregates.
+	 */
+	if (root->numOrderedAggs > 0)
+		return;
+
+	/*
+	 * If there are any aggregates that do not support partial mode, or any
+	 * partial aggregates that are non-serializable, do not apply eager
+	 * aggregation.
+	 */
+	if (root->hasNonPartialAggs || root->hasNonSerialAggs)
+		return;
+
+	/*
+	 * We don't try to apply eager aggregation if there are set-returning
+	 * functions in targetlist.
+	 */
+	if (root->parse->hasTargetSRFs)
+		return;
+
+	/*
+	 * Eager aggregation only makes sense if there are multiple base rels in
+	 * the query.
+	 */
+	if (bms_membership(root->all_baserels) != BMS_MULTIPLE)
+		return;
+
+	/*
+	 * Don't apply eager aggregation if any aggregate poses a risk of
+	 * excessive memory usage during partial aggregation.
+	 */
+	if (is_partial_agg_memory_risky(root))
+		return;
+
+	/*
+	 * Collect aggregate expressions and plain Vars that appear in the
+	 * targetlist and havingQual.
+	 */
+	create_agg_clause_infos(root);
+
+	/*
+	 * If there are no suitable aggregate expressions, we cannot apply eager
+	 * aggregation.
+	 */
+	if (root->agg_clause_list == NIL)
+		return;
+
+	/*
+	 * Collect grouping expressions that appear in grouping clauses.
+	 */
+	create_grouping_expr_infos(root);
+}
+
+/*
+ * is_partial_agg_memory_risky
+ *	  Check if any aggregate poses a risk of excessive memory usage during
+ *	  partial aggregation.
+ *
+ * We check if any aggregate has a negative aggtransspace value, which
+ * indicates that its transition state data can grow unboundedly in size.
+ * Applying eager aggregation in such cases risks high memory usage since
+ * partial aggregation results might be stored in join hash tables or
+ * materialized nodes.
+ */
+static bool
+is_partial_agg_memory_risky(PlannerInfo *root)
+{
+	ListCell   *lc;
+
+	foreach(lc, root->aggtransinfos)
+	{
+		AggTransInfo *transinfo = lfirst_node(AggTransInfo, lc);
+
+		if (transinfo->aggtransspace < 0)
+			return true;
+	}
+
+	return false;
+}
+
+/*
+ * create_agg_clause_infos
+ *	  Search the targetlist and havingQual for Aggrefs and plain Vars, and
+ *	  create an AggClauseInfo for each Aggref node.
+ */
+static void
+create_agg_clause_infos(PlannerInfo *root)
+{
+	List	   *tlist_exprs;
+	List	   *agg_clause_list = NIL;
+	List	   *tlist_vars = NIL;
+	Relids		aggregate_relids = NULL;
+	bool		eager_agg_applicable = true;
+	ListCell   *lc;
+
+	Assert(root->agg_clause_list == NIL);
+	Assert(root->tlist_vars == NIL);
+
+	tlist_exprs = pull_var_clause((Node *) root->processed_tlist,
+								  PVC_INCLUDE_AGGREGATES |
+								  PVC_RECURSE_WINDOWFUNCS |
+								  PVC_RECURSE_PLACEHOLDERS);
+
+	/*
+	 * Aggregates within the HAVING clause need to be processed in the same
+	 * way as those in the targetlist.  Note that HAVING can contain Aggrefs
+	 * but not WindowFuncs.
+	 */
+	if (root->parse->havingQual != NULL)
+	{
+		List	   *having_exprs;
+
+		having_exprs = pull_var_clause((Node *) root->parse->havingQual,
+									   PVC_INCLUDE_AGGREGATES |
+									   PVC_RECURSE_PLACEHOLDERS);
+		if (having_exprs != NIL)
+		{
+			tlist_exprs = list_concat(tlist_exprs, having_exprs);
+			list_free(having_exprs);
+		}
+	}
+
+	foreach(lc, tlist_exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Aggref	   *aggref;
+		Relids		agg_eval_at;
+		AggClauseInfo *ac_info;
+
+		/* For now we don't try to support GROUPING() expressions */
+		if (IsA(expr, GroupingFunc))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* Collect plain Vars for future reference */
+		if (IsA(expr, Var))
+		{
+			tlist_vars = list_append_unique(tlist_vars, expr);
+			continue;
+		}
+
+		aggref = castNode(Aggref, expr);
+
+		Assert(aggref->aggorder == NIL);
+		Assert(aggref->aggdistinct == NIL);
+
+		/*
+		 * If there are any securityQuals, do not try to apply eager
+		 * aggregation if any non-leakproof aggregate functions are present.
+		 * This is overly strict, but for now...
+		 */
+		if (root->qual_security_level > 0 &&
+			!get_func_leakproof(aggref->aggfnoid))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		agg_eval_at = pull_varnos(root, (Node *) aggref);
+
+		/*
+		 * If all base relations in the query are referenced by aggregate
+		 * functions, then eager aggregation is not applicable.
+		 */
+		aggregate_relids = bms_add_members(aggregate_relids, agg_eval_at);
+		if (bms_is_subset(root->all_baserels, aggregate_relids))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
+		/* OK, create the AggClauseInfo node */
+		ac_info = makeNode(AggClauseInfo);
+		ac_info->aggref = aggref;
+		ac_info->agg_eval_at = agg_eval_at;
+
+		/* ... and add it to the list */
+		agg_clause_list = list_append_unique(agg_clause_list, ac_info);
+	}
+
+	list_free(tlist_exprs);
+
+	if (eager_agg_applicable)
+	{
+		root->agg_clause_list = agg_clause_list;
+		root->tlist_vars = tlist_vars;
+	}
+	else
+	{
+		list_free_deep(agg_clause_list);
+		list_free(tlist_vars);
+	}
+}
+
+/*
+ * create_grouping_expr_infos
+ *	  Create a GroupingExprInfo for each expression usable as grouping key.
+ *
+ * If any grouping expression is not suitable, we will just return with
+ * root->group_expr_list being NIL.
+ */
+static void
+create_grouping_expr_infos(PlannerInfo *root)
+{
+	List	   *exprs = NIL;
+	List	   *sortgrouprefs = NIL;
+	List	   *ecs = NIL;
+	ListCell   *lc,
+			   *lc1,
+			   *lc2,
+			   *lc3;
+
+	Assert(root->group_expr_list == NIL);
+
+	foreach(lc, root->processed_groupClause)
+	{
+		SortGroupClause *sgc = lfirst_node(SortGroupClause, lc);
+		TargetEntry *tle = get_sortgroupclause_tle(sgc, root->processed_tlist);
+		TypeCacheEntry *tce;
+		Oid			equalimageproc;
+
+		Assert(tle->ressortgroupref > 0);
+
+		/*
+		 * For now we only support plain Vars as grouping expressions.
+		 */
+		if (!IsA(tle->expr, Var))
+			return;
+
+		/*
+		 * Eager aggregation is only possible if equality implies image
+		 * equality for each grouping key.  Otherwise, placing keys with
+		 * different byte images into the same group may result in the loss of
+		 * information that could be necessary to evaluate upper qual clauses.
+		 *
+		 * For instance, the NUMERIC data type is not supported, as values
+		 * that are considered equal by the equality operator (e.g., 0 and
+		 * 0.0) can have different scales.
+		 */
+		tce = lookup_type_cache(exprType((Node *) tle->expr),
+								TYPECACHE_BTREE_OPFAMILY);
+		if (!OidIsValid(tce->btree_opf) ||
+			!OidIsValid(tce->btree_opintype))
+			return;
+
+		equalimageproc = get_opfamily_proc(tce->btree_opf,
+										   tce->btree_opintype,
+										   tce->btree_opintype,
+										   BTEQUALIMAGE_PROC);
+		if (!OidIsValid(equalimageproc) ||
+			!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+											   tce->typcollation,
+											   ObjectIdGetDatum(tce->btree_opintype))))
+			return;
+
+		exprs = lappend(exprs, tle->expr);
+		sortgrouprefs = lappend_int(sortgrouprefs, tle->ressortgroupref);
+		ecs = lappend(ecs, get_eclass_for_sortgroupclause(root, sgc, tle->expr));
+	}
+
+	/*
+	 * Construct a GroupingExprInfo for each expression.
+	 */
+	forthree(lc1, exprs, lc2, sortgrouprefs, lc3, ecs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc1);
+		int			sortgroupref = lfirst_int(lc2);
+		EquivalenceClass *ec = (EquivalenceClass *) lfirst(lc3);
+		GroupingExprInfo *ge_info;
+
+		ge_info = makeNode(GroupingExprInfo);
+		ge_info->expr = (Expr *) copyObject(expr);
+		ge_info->sortgroupref = sortgroupref;
+		ge_info->ec = ec;
+
+		root->group_expr_list = lappend(root->group_expr_list, ge_info);
+	}
+}
+
+/*
+ * get_eclass_for_sortgroupclause
+ *	  Given a group clause and an expression, find an existing equivalence
+ *	  class that the expression is a member of; return NULL if none.
+ */
+static EquivalenceClass *
+get_eclass_for_sortgroupclause(PlannerInfo *root, SortGroupClause *sgc,
+							   Expr *expr)
+{
+	Oid			opfamily,
+				opcintype,
+				collation;
+	CompareType cmptype;
+	Oid			equality_op;
+	List	   *opfamilies;
+
+	/* Punt if the group clause is not sortable */
+	if (!OidIsValid(sgc->sortop))
+		return NULL;
+
+	/* Find the operator in pg_amop --- failure shouldn't happen */
+	if (!get_ordering_op_properties(sgc->sortop,
+									&opfamily, &opcintype, &cmptype))
+		elog(ERROR, "operator %u is not a valid ordering operator",
+			 sgc->sortop);
+
+	/* Because SortGroupClause doesn't carry collation, consult the expr */
+	collation = exprCollation((Node *) expr);
+
+	/*
+	 * EquivalenceClasses need to contain opfamily lists based on the family
+	 * membership of mergejoinable equality operators, which could belong to
+	 * more than one opfamily.  So we have to look up the opfamily's equality
+	 * operator and get its membership.
+	 */
+	equality_op = get_opfamily_member_for_cmptype(opfamily,
+												  opcintype,
+												  opcintype,
+												  COMPARE_EQ);
+	if (!OidIsValid(equality_op))	/* shouldn't happen */
+		elog(ERROR, "missing operator %d(%u,%u) in opfamily %u",
+			 COMPARE_EQ, opcintype, opcintype, opfamily);
+	opfamilies = get_mergejoin_opfamilies(equality_op);
+	if (!opfamilies)			/* certainly should find some */
+		elog(ERROR, "could not find opfamilies for equality operator %u",
+			 equality_op);
+
+	/* Now find a matching EquivalenceClass */
+	return get_eclass_for_sort_expr(root, expr, opfamilies, opcintype,
+									collation, sgc->tleSortGroupRef,
+									NULL, false);
+}
+
 /*****************************************************************************
  *
  *	  LATERAL REFERENCES
diff --git a/src/backend/optimizer/plan/planmain.c b/src/backend/optimizer/plan/planmain.c
index 5467e094ca7..eefc486a566 100644
--- a/src/backend/optimizer/plan/planmain.c
+++ b/src/backend/optimizer/plan/planmain.c
@@ -76,6 +76,9 @@ query_planner(PlannerInfo *root,
 	root->placeholder_list = NIL;
 	root->placeholder_array = NULL;
 	root->placeholder_array_size = 0;
+	root->agg_clause_list = NIL;
+	root->group_expr_list = NIL;
+	root->tlist_vars = NIL;
 	root->fkey_list = NIL;
 	root->initial_rels = NIL;
 
@@ -265,6 +268,12 @@ query_planner(PlannerInfo *root,
 	 */
 	extract_restriction_or_clauses(root);
 
+	/*
+	 * Check if eager aggregation is applicable, and if so, set up
+	 * root->agg_clause_list and root->group_expr_list.
+	 */
+	setup_eager_aggregation(root);
+
 	/*
 	 * Now expand appendrels by adding "otherrels" for their children.  We
 	 * delay this to the end so that we have as much information as possible
diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 41bd8353430..462c5335589 100644
--- a/src/backend/optimizer/plan/planner.c
+++ b/src/backend/optimizer/plan/planner.c
@@ -232,7 +232,6 @@ static void add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 									  RelOptInfo *partially_grouped_rel,
 									  const AggClauseCosts *agg_costs,
 									  grouping_sets_data *gd,
-									  double dNumGroups,
 									  GroupPathExtraData *extra);
 static RelOptInfo *create_partial_grouping_paths(PlannerInfo *root,
 												 RelOptInfo *grouped_rel,
@@ -4010,9 +4009,7 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 							   GroupPathExtraData *extra,
 							   RelOptInfo **partially_grouped_rel_p)
 {
-	Path	   *cheapest_path = input_rel->cheapest_total_path;
 	RelOptInfo *partially_grouped_rel = NULL;
-	double		dNumGroups;
 	PartitionwiseAggregateType patype = PARTITIONWISE_AGGREGATE_NONE;
 
 	/*
@@ -4094,23 +4091,16 @@ create_ordinary_grouping_paths(PlannerInfo *root, RelOptInfo *input_rel,
 
 	/* Gather any partially grouped partial paths. */
 	if (partially_grouped_rel && partially_grouped_rel->partial_pathlist)
-	{
 		gather_grouping_paths(root, partially_grouped_rel);
-		set_cheapest(partially_grouped_rel);
-	}
 
-	/*
-	 * Estimate number of groups.
-	 */
-	dNumGroups = get_number_of_groups(root,
-									  cheapest_path->rows,
-									  gd,
-									  extra->targetList);
+	/* Now choose the best path(s) for partially_grouped_rel. */
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+		set_cheapest(partially_grouped_rel);
 
 	/* Build final grouping paths */
 	add_paths_to_grouping_rel(root, input_rel, grouped_rel,
 							  partially_grouped_rel, agg_costs, gd,
-							  dNumGroups, extra);
+							  extra);
 
 	/* Give a helpful error if we failed to find any implementation */
 	if (grouped_rel->pathlist == NIL)
@@ -7055,16 +7045,42 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 						  RelOptInfo *grouped_rel,
 						  RelOptInfo *partially_grouped_rel,
 						  const AggClauseCosts *agg_costs,
-						  grouping_sets_data *gd, double dNumGroups,
+						  grouping_sets_data *gd,
 						  GroupPathExtraData *extra)
 {
 	Query	   *parse = root->parse;
 	Path	   *cheapest_path = input_rel->cheapest_total_path;
+	Path	   *cheapest_partially_grouped_path = NULL;
 	ListCell   *lc;
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 	List	   *havingQual = (List *) extra->havingQual;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
+	double		dNumGroups = 0;
+	double		dNumFinalGroups = 0;
+
+	/*
+	 * Estimate number of groups for non-split aggregation.
+	 */
+	dNumGroups = get_number_of_groups(root,
+									  cheapest_path->rows,
+									  gd,
+									  extra->targetList);
+
+	if (partially_grouped_rel && partially_grouped_rel->pathlist)
+	{
+		cheapest_partially_grouped_path =
+			partially_grouped_rel->cheapest_total_path;
+
+		/*
+		 * Estimate number of groups for final phase of partial aggregation.
+		 */
+		dNumFinalGroups =
+			get_number_of_groups(root,
+								 cheapest_partially_grouped_path->rows,
+								 gd,
+								 extra->targetList);
+	}
 
 	if (can_sort)
 	{
@@ -7177,7 +7193,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 					path = make_ordered_path(root,
 											 grouped_rel,
 											 path,
-											 partially_grouped_rel->cheapest_total_path,
+											 cheapest_partially_grouped_path,
 											 info->pathkeys,
 											 -1.0);
 
@@ -7195,7 +7211,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												 info->clauses,
 												 havingQual,
 												 agg_final_costs,
-												 dNumGroups));
+												 dNumFinalGroups));
 					else
 						add_path(grouped_rel, (Path *)
 								 create_group_path(root,
@@ -7203,7 +7219,7 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 												   path,
 												   info->clauses,
 												   havingQual,
-												   dNumGroups));
+												   dNumFinalGroups));
 
 				}
 			}
@@ -7245,19 +7261,17 @@ add_paths_to_grouping_rel(PlannerInfo *root, RelOptInfo *input_rel,
 		 */
 		if (partially_grouped_rel && partially_grouped_rel->pathlist)
 		{
-			Path	   *path = partially_grouped_rel->cheapest_total_path;
-
 			add_path(grouped_rel, (Path *)
 					 create_agg_path(root,
 									 grouped_rel,
-									 path,
+									 cheapest_partially_grouped_path,
 									 grouped_rel->reltarget,
 									 AGG_HASHED,
 									 AGGSPLIT_FINAL_DESERIAL,
 									 root->processed_groupClause,
 									 havingQual,
 									 agg_final_costs,
-									 dNumGroups));
+									 dNumFinalGroups));
 		}
 	}
 
@@ -7297,6 +7311,7 @@ create_partial_grouping_paths(PlannerInfo *root,
 {
 	Query	   *parse = root->parse;
 	RelOptInfo *partially_grouped_rel;
+	RelOptInfo *eager_agg_rel = NULL;
 	AggClauseCosts *agg_partial_costs = &extra->agg_partial_costs;
 	AggClauseCosts *agg_final_costs = &extra->agg_final_costs;
 	Path	   *cheapest_partial_path = NULL;
@@ -7307,6 +7322,15 @@ create_partial_grouping_paths(PlannerInfo *root,
 	bool		can_hash = (extra->flags & GROUPING_CAN_USE_HASH) != 0;
 	bool		can_sort = (extra->flags & GROUPING_CAN_USE_SORT) != 0;
 
+	/*
+	 * Check whether any partially aggregated paths have been generated
+	 * through eager aggregation.
+	 */
+	if (input_rel->grouped_rel &&
+		!IS_DUMMY_REL(input_rel->grouped_rel) &&
+		input_rel->grouped_rel->pathlist != NIL)
+		eager_agg_rel = input_rel->grouped_rel;
+
 	/*
 	 * Consider whether we should generate partially aggregated non-partial
 	 * paths.  We can only do this if we have a non-partial path, and only if
@@ -7328,11 +7352,13 @@ create_partial_grouping_paths(PlannerInfo *root,
 
 	/*
 	 * If we can't partially aggregate partial paths, and we can't partially
-	 * aggregate non-partial paths, then don't bother creating the new
+	 * aggregate non-partial paths, and no partially aggregated paths were
+	 * generated by eager aggregation, then don't bother creating the new
 	 * RelOptInfo at all, unless the caller specified force_rel_creation.
 	 */
 	if (cheapest_total_path == NULL &&
 		cheapest_partial_path == NULL &&
+		eager_agg_rel == NULL &&
 		!force_rel_creation)
 		return NULL;
 
@@ -7557,6 +7583,51 @@ create_partial_grouping_paths(PlannerInfo *root,
 										 dNumPartialPartialGroups));
 	}
 
+	/*
+	 * Add any partially aggregated paths generated by eager aggregation to
+	 * the new upper relation after applying projection steps as needed.
+	 */
+	if (eager_agg_rel)
+	{
+		/* Add the paths */
+		foreach(lc, eager_agg_rel->pathlist)
+		{
+			Path	   *path = (Path *) lfirst(lc);
+
+			/* Shouldn't have any parameterized paths anymore */
+			Assert(path->param_info == NULL);
+
+			path = (Path *) create_projection_path(root,
+												   partially_grouped_rel,
+												   path,
+												   partially_grouped_rel->reltarget);
+
+			add_path(partially_grouped_rel, path);
+		}
+
+		/*
+		 * Likewise add the partial paths, but only if parallelism is possible
+		 * for partially_grouped_rel.
+		 */
+		if (partially_grouped_rel->consider_parallel)
+		{
+			foreach(lc, eager_agg_rel->partial_pathlist)
+			{
+				Path	   *path = (Path *) lfirst(lc);
+
+				/* Shouldn't have any parameterized paths anymore */
+				Assert(path->param_info == NULL);
+
+				path = (Path *) create_projection_path(root,
+													   partially_grouped_rel,
+													   path,
+													   partially_grouped_rel->reltarget);
+
+				add_partial_path(partially_grouped_rel, path);
+			}
+		}
+	}
+
 	/*
 	 * If there is an FDW that's responsible for all baserels of the query,
 	 * let it consider adding partially grouped ForeignPaths.
@@ -8120,13 +8191,6 @@ create_partitionwise_grouping_paths(PlannerInfo *root,
 
 		add_paths_to_append_rel(root, partially_grouped_rel,
 								partially_grouped_live_children);
-
-		/*
-		 * We need call set_cheapest, since the finalization step will use the
-		 * cheapest path from the rel.
-		 */
-		if (partially_grouped_rel->pathlist)
-			set_cheapest(partially_grouped_rel);
 	}
 
 	/* If possible, create append paths for fully grouped children. */
diff --git a/src/backend/optimizer/util/appendinfo.c b/src/backend/optimizer/util/appendinfo.c
index 5b3dc0d8653..69b8b0c2ae0 100644
--- a/src/backend/optimizer/util/appendinfo.c
+++ b/src/backend/optimizer/util/appendinfo.c
@@ -516,6 +516,57 @@ adjust_appendrel_attrs_mutator(Node *node,
 		return (Node *) newinfo;
 	}
 
+	/*
+	 * We have to process RelAggInfo nodes specially.
+	 */
+	if (IsA(node, RelAggInfo))
+	{
+		RelAggInfo *oldinfo = (RelAggInfo *) node;
+		RelAggInfo *newinfo = makeNode(RelAggInfo);
+
+		newinfo->target = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->target,
+										   context);
+
+		newinfo->agg_input = (PathTarget *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->agg_input,
+										   context);
+
+		newinfo->group_clauses = oldinfo->group_clauses;
+
+		newinfo->group_exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldinfo->group_exprs,
+										   context);
+
+		return (Node *) newinfo;
+	}
+
+	/*
+	 * We have to process PathTarget nodes specially.
+	 */
+	if (IsA(node, PathTarget))
+	{
+		PathTarget *oldtarget = (PathTarget *) node;
+		PathTarget *newtarget = makeNode(PathTarget);
+
+		/* Copy all flat-copiable fields */
+		memcpy(newtarget, oldtarget, sizeof(PathTarget));
+
+		newtarget->exprs = (List *)
+			adjust_appendrel_attrs_mutator((Node *) oldtarget->exprs,
+										   context);
+
+		if (oldtarget->sortgrouprefs)
+		{
+			Size		nbytes = list_length(oldtarget->exprs) * sizeof(Index);
+
+			newtarget->sortgrouprefs = (Index *) palloc(nbytes);
+			memcpy(newtarget->sortgrouprefs, oldtarget->sortgrouprefs, nbytes);
+		}
+
+		return (Node *) newtarget;
+	}
+
 	/*
 	 * NOTE: we do not need to recurse into sublinks, because they should
 	 * already have been converted to subplans before we see them.
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 0e523d2eb5b..cf1bc672137 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -16,6 +16,8 @@
 
 #include <limits.h>
 
+#include "access/nbtree.h"
+#include "catalog/pg_constraint.h"
 #include "miscadmin.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/appendinfo.h"
@@ -27,12 +29,16 @@
 #include "optimizer/paths.h"
 #include "optimizer/placeholder.h"
 #include "optimizer/plancat.h"
+#include "optimizer/planner.h"
 #include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
+#include "parser/parse_oper.h"
 #include "parser/parse_relation.h"
 #include "rewrite/rewriteManip.h"
 #include "utils/hsearch.h"
 #include "utils/lsyscache.h"
+#include "utils/selfuncs.h"
+#include "utils/typcache.h"
 
 
 typedef struct JoinHashEntry
@@ -83,6 +89,14 @@ static void build_child_join_reltarget(PlannerInfo *root,
 									   RelOptInfo *childrel,
 									   int nappinfos,
 									   AppendRelInfo **appinfos);
+static bool eager_aggregation_possible_for_relation(PlannerInfo *root,
+													RelOptInfo *rel);
+static bool init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+								  PathTarget *target, PathTarget *agg_input,
+								  List **group_clauses, List **group_exprs);
+static bool is_var_in_aggref_only(PlannerInfo *root, Var *var);
+static bool is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel);
+static Index get_expression_sortgroupref(PlannerInfo *root, Expr *expr);
 
 
 /*
@@ -278,6 +292,8 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	rel->joininfo = NIL;
 	rel->has_eclass_joins = false;
 	rel->consider_partitionwise_join = false;	/* might get changed later */
+	rel->agg_info = NULL;
+	rel->grouped_rel = NULL;
 	rel->part_scheme = NULL;
 	rel->nparts = -1;
 	rel->boundinfo = NULL;
@@ -408,6 +424,103 @@ build_simple_rel(PlannerInfo *root, int relid, RelOptInfo *parent)
 	return rel;
 }
 
+/*
+ * build_simple_grouped_rel
+ *	  Construct a new RelOptInfo representing a grouped version of the input
+ *	  simple relation.
+ */
+RelOptInfo *
+build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+	RelAggInfo *agg_info;
+
+	/*
+	 * We should have available aggregate expressions and grouping
+	 * expressions, otherwise we cannot reach here.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/* nothing to do for dummy rel */
+	if (IS_DUMMY_REL(rel))
+		return NULL;
+
+	/*
+	 * Prepare the information needed to create grouped paths for this simple
+	 * relation.
+	 */
+	agg_info = create_rel_agg_info(root, rel, true);
+	if (agg_info == NULL)
+		return NULL;
+
+	/*
+	 * If grouped paths for the given simple relation are not considered
+	 * useful, skip building the grouped relation.
+	 */
+	if (!agg_info->agg_useful)
+		return NULL;
+
+	/* Track the set of relids at which partial aggregation is applied */
+	agg_info->apply_at = bms_copy(rel->relids);
+
+	/* build the grouped relation */
+	grouped_rel = build_grouped_rel(root, rel);
+	grouped_rel->reltarget = agg_info->target;
+	grouped_rel->rows = agg_info->grouped_rows;
+	grouped_rel->agg_info = agg_info;
+
+	rel->grouped_rel = grouped_rel;
+
+	return grouped_rel;
+}
+
+/*
+ * build_grouped_rel
+ *	  Build a grouped relation by flat copying the input relation and resetting
+ *	  the necessary fields.
+ */
+RelOptInfo *
+build_grouped_rel(PlannerInfo *root, RelOptInfo *rel)
+{
+	RelOptInfo *grouped_rel;
+
+	grouped_rel = makeNode(RelOptInfo);
+	memcpy(grouped_rel, rel, sizeof(RelOptInfo));
+
+	/*
+	 * clear path info
+	 */
+	grouped_rel->pathlist = NIL;
+	grouped_rel->ppilist = NIL;
+	grouped_rel->partial_pathlist = NIL;
+	grouped_rel->cheapest_startup_path = NULL;
+	grouped_rel->cheapest_total_path = NULL;
+	grouped_rel->cheapest_parameterized_paths = NIL;
+
+	/*
+	 * clear partition info
+	 */
+	grouped_rel->part_scheme = NULL;
+	grouped_rel->nparts = -1;
+	grouped_rel->boundinfo = NULL;
+	grouped_rel->partbounds_merged = false;
+	grouped_rel->partition_qual = NIL;
+	grouped_rel->part_rels = NULL;
+	grouped_rel->live_parts = NULL;
+	grouped_rel->all_partrels = NULL;
+	grouped_rel->partexprs = NULL;
+	grouped_rel->nullable_partexprs = NULL;
+	grouped_rel->consider_partitionwise_join = false;
+
+	/*
+	 * clear size estimates
+	 */
+	grouped_rel->rows = 0;
+
+	return grouped_rel;
+}
+
 /*
  * find_base_rel
  *	  Find a base or otherrel relation entry, which must already exist.
@@ -759,6 +872,8 @@ build_join_rel(PlannerInfo *root,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = NULL;
 	joinrel->top_parent = NULL;
 	joinrel->top_parent_relids = NULL;
@@ -945,6 +1060,8 @@ build_child_join_rel(PlannerInfo *root, RelOptInfo *outer_rel,
 	joinrel->joininfo = NIL;
 	joinrel->has_eclass_joins = false;
 	joinrel->consider_partitionwise_join = false;	/* might get changed later */
+	joinrel->agg_info = NULL;
+	joinrel->grouped_rel = NULL;
 	joinrel->parent = parent_joinrel;
 	joinrel->top_parent = parent_joinrel->top_parent ? parent_joinrel->top_parent : parent_joinrel;
 	joinrel->top_parent_relids = joinrel->top_parent->relids;
@@ -2523,3 +2640,536 @@ build_child_join_reltarget(PlannerInfo *root,
 	childrel->reltarget->cost.per_tuple = parentrel->reltarget->cost.per_tuple;
 	childrel->reltarget->width = parentrel->reltarget->width;
 }
+
+/*
+ * create_rel_agg_info
+ *	  Create the RelAggInfo structure for the given relation if it can produce
+ *	  grouped paths.  The given relation is the non-grouped one which has the
+ *	  reltarget already constructed.
+ *
+ * calculate_grouped_rows: if true, calculate the estimated number of grouped
+ * rows for the relation.  If false, skip the estimation to avoid unnecessary
+ * planning overhead.
+ */
+RelAggInfo *
+create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel,
+					bool calculate_grouped_rows)
+{
+	ListCell   *lc;
+	RelAggInfo *result;
+	PathTarget *agg_input;
+	PathTarget *target;
+	List	   *group_clauses = NIL;
+	List	   *group_exprs = NIL;
+
+	/*
+	 * The lists of aggregate expressions and grouping expressions should have
+	 * been constructed.
+	 */
+	Assert(root->agg_clause_list != NIL);
+	Assert(root->group_expr_list != NIL);
+
+	/*
+	 * If this is a child rel, the grouped rel for its parent must already
+	 * have been created if possible.  So we can just use the parent's
+	 * RelAggInfo if there is one, with appropriate variable substitutions.
+	 */
+	if (IS_OTHER_REL(rel))
+	{
+		RelOptInfo *grouped_rel;
+		RelAggInfo *agg_info;
+
+		grouped_rel = rel->top_parent->grouped_rel;
+		if (grouped_rel == NULL)
+			return NULL;
+
+		Assert(IS_GROUPED_REL(grouped_rel));
+
+		/* Must do multi-level transformation */
+		agg_info = (RelAggInfo *)
+			adjust_appendrel_attrs_multilevel(root,
+											  (Node *) grouped_rel->agg_info,
+											  rel,
+											  rel->top_parent);
+
+		agg_info->apply_at = NULL;	/* caller will change this later */
+
+		if (calculate_grouped_rows)
+		{
+			agg_info->grouped_rows =
+				estimate_num_groups(root, agg_info->group_exprs,
+									rel->rows, NULL, NULL);
+
+			/*
+			 * The grouped paths for the given relation are considered useful
+			 * iff the average group size is no less than
+			 * min_eager_agg_group_size.
+			 */
+			agg_info->agg_useful =
+				(rel->rows / agg_info->grouped_rows) >= min_eager_agg_group_size;
+		}
+
+		return agg_info;
+	}
+
+	/* Check if it's possible to produce grouped paths for this relation. */
+	if (!eager_aggregation_possible_for_relation(root, rel))
+		return NULL;
+
+	/*
+	 * Create targets for the grouped paths and for the input paths of the
+	 * grouped paths.
+	 */
+	target = create_empty_pathtarget();
+	agg_input = create_empty_pathtarget();
+
+	/* ... and initialize these targets */
+	if (!init_grouping_targets(root, rel, target, agg_input,
+							   &group_clauses, &group_exprs))
+		return NULL;
+
+	/*
+	 * Eager aggregation is not applicable if there are no available grouping
+	 * expressions.
+	 */
+	if (group_clauses == NIL)
+		return NULL;
+
+	/* Add aggregates to the grouping target */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		Aggref	   *aggref;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		aggref = (Aggref *) copyObject(ac_info->aggref);
+		mark_partial_aggref(aggref, AGGSPLIT_INITIAL_SERIAL);
+
+		add_column_to_pathtarget(target, (Expr *) aggref, 0);
+	}
+
+	/* Set the estimated eval cost and output width for both targets */
+	set_pathtarget_cost_width(root, target);
+	set_pathtarget_cost_width(root, agg_input);
+
+	/* build the RelAggInfo result */
+	result = makeNode(RelAggInfo);
+	result->target = target;
+	result->agg_input = agg_input;
+	result->group_clauses = group_clauses;
+	result->group_exprs = group_exprs;
+	result->apply_at = NULL;	/* caller will change this later */
+
+	if (calculate_grouped_rows)
+	{
+		result->grouped_rows = estimate_num_groups(root, result->group_exprs,
+												   rel->rows, NULL, NULL);
+
+		/*
+		 * The grouped paths for the given relation are considered useful iff
+		 * the average group size is no less than min_eager_agg_group_size.
+		 */
+		result->agg_useful =
+			(rel->rows / result->grouped_rows) >= min_eager_agg_group_size;
+	}
+
+	return result;
+}
+
+/*
+ * eager_aggregation_possible_for_relation
+ *	  Check if it's possible to produce grouped paths for the given relation.
+ */
+static bool
+eager_aggregation_possible_for_relation(PlannerInfo *root, RelOptInfo *rel)
+{
+	ListCell   *lc;
+	int			cur_relid;
+
+	/*
+	 * Check to see if the given relation is in the nullable side of an outer
+	 * join.  In this case, we cannot push a partial aggregation down to the
+	 * relation, because the NULL-extended rows produced by the outer join
+	 * would not be available when we perform the partial aggregation, while
+	 * with a non-eager-aggregation plan these rows are available for the
+	 * top-level aggregation.  Doing so may result in the rows being grouped
+	 * differently than expected, or produce incorrect values from the
+	 * aggregate functions.
+	 */
+	cur_relid = -1;
+	while ((cur_relid = bms_next_member(rel->relids, cur_relid)) >= 0)
+	{
+		RelOptInfo *baserel = find_base_rel_ignore_join(root, cur_relid);
+
+		if (baserel == NULL)
+			continue;			/* ignore outer joins in rel->relids */
+
+		if (!bms_is_subset(baserel->nulling_relids, rel->relids))
+			return false;
+	}
+
+	/*
+	 * For now we don't try to support PlaceHolderVars.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = lfirst(lc);
+
+		if (IsA(expr, PlaceHolderVar))
+			return false;
+	}
+
+	/* Caller should only pass base relations or joins. */
+	Assert(rel->reloptkind == RELOPT_BASEREL ||
+		   rel->reloptkind == RELOPT_JOINREL);
+
+	/*
+	 * Check if all aggregate expressions can be evaluated on this relation
+	 * level.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		/*
+		 * Give up if any aggregate requires relations other than the current
+		 * one.  If the aggregate requires the current relation plus
+		 * additional relations, grouping the current relation could make some
+		 * input rows unavailable for the higher aggregate and may reduce the
+		 * number of input rows it receives.  If the aggregate does not
+		 * require the current relation at all, it should not be grouped, as
+		 * we do not support joining two grouped relations.
+		 */
+		if (!bms_is_subset(ac_info->agg_eval_at, rel->relids))
+			return false;
+	}
+
+	return true;
+}
+
+/*
+ * init_grouping_targets
+ *	  Initialize the target for grouped paths (target) as well as the target
+ *	  for paths that generate input for the grouped paths (agg_input).
+ *
+ * We also construct the list of SortGroupClauses and the list of grouping
+ * expressions for the partial aggregation, and return them in *group_clauses
+ * and *group_exprs.
+ *
+ * Return true if the targets could be initialized, false otherwise.
+ */
+static bool
+init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
+					  PathTarget *target, PathTarget *agg_input,
+					  List **group_clauses, List **group_exprs)
+{
+	ListCell   *lc;
+	List	   *possibly_dependent = NIL;
+	Index		maxSortGroupRef;
+
+	/* Identify the max sortgroupref */
+	maxSortGroupRef = 0;
+	foreach(lc, root->processed_tlist)
+	{
+		Index		ref = ((TargetEntry *) lfirst(lc))->ressortgroupref;
+
+		if (ref > maxSortGroupRef)
+			maxSortGroupRef = ref;
+	}
+
+	/*
+	 * At this point, all Vars from this relation that are needed by upper
+	 * joins or are required in the final targetlist should already be present
+	 * in its reltarget.  Therefore, we can safely iterate over this
+	 * relation's reltarget->exprs to construct the PathTarget and grouping
+	 * clauses for the grouped paths.
+	 */
+	foreach(lc, rel->reltarget->exprs)
+	{
+		Expr	   *expr = (Expr *) lfirst(lc);
+		Index		sortgroupref;
+
+		/*
+		 * Given that PlaceHolderVar currently prevents us from doing eager
+		 * aggregation, the source target cannot contain anything more complex
+		 * than a Var.
+		 */
+		Assert(IsA(expr, Var));
+
+		/*
+		 * Get the sortgroupref of the expr if it is found among, or can be
+		 * deduced from, the original grouping expressions.
+		 */
+		sortgroupref = get_expression_sortgroupref(root, expr);
+		if (sortgroupref > 0)
+		{
+			SortGroupClause *sgc;
+
+			/* Find the matching SortGroupClause */
+			sgc = get_sortgroupref_clause(sortgroupref, root->processed_groupClause);
+			Assert(sgc->tleSortGroupRef <= maxSortGroupRef);
+
+			/*
+			 * If the target expression is to be used as a grouping key, it
+			 * should be emitted by the grouped paths that have been pushed
+			 * down to this relation level.
+			 */
+			add_column_to_pathtarget(target, expr, sortgroupref);
+
+			/*
+			 * ... and it also should be emitted by the input paths.
+			 */
+			add_column_to_pathtarget(agg_input, expr, sortgroupref);
+
+			/*
+			 * Record this SortGroupClause and grouping expression.  Note that
+			 * this SortGroupClause might have already been recorded.
+			 */
+			if (!list_member(*group_clauses, sgc))
+			{
+				*group_clauses = lappend(*group_clauses, sgc);
+				*group_exprs = lappend(*group_exprs, expr);
+			}
+		}
+		else if (is_var_needed_by_join(root, (Var *) expr, rel))
+		{
+			/*
+			 * The expression is needed for an upper join but is neither in
+			 * the GROUP BY clause nor derivable from it using EC (otherwise,
+			 * it would have already been included in the targets above).  We
+			 * need to create a special SortGroupClause for this expression.
+			 *
+			 * It is important to include such expressions in the grouping
+			 * keys.  This is essential to ensure that an aggregated row from
+			 * the partial aggregation matches the other side of the join if
+			 * and only if each row in the partial group does.  This ensures
+			 * that all rows within the same partial group share the same
+			 * 'destiny', which is crucial for maintaining correctness.
+			 */
+			SortGroupClause *sgc;
+			TypeCacheEntry *tce;
+			Oid			equalimageproc;
+
+			/*
+			 * But first, check if equality implies image equality for this
+			 * expression.  If not, we cannot use it as a grouping key.  See
+			 * comments in create_grouping_expr_infos().
+			 */
+			tce = lookup_type_cache(exprType((Node *) expr),
+									TYPECACHE_BTREE_OPFAMILY);
+			if (!OidIsValid(tce->btree_opf) ||
+				!OidIsValid(tce->btree_opintype))
+				return false;
+
+			equalimageproc = get_opfamily_proc(tce->btree_opf,
+											   tce->btree_opintype,
+											   tce->btree_opintype,
+											   BTEQUALIMAGE_PROC);
+			if (!OidIsValid(equalimageproc) ||
+				!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
+												   tce->typcollation,
+												   ObjectIdGetDatum(tce->btree_opintype))))
+				return false;
+
+			/* Create the SortGroupClause. */
+			sgc = makeNode(SortGroupClause);
+
+			/* Initialize the SortGroupClause. */
+			sgc->tleSortGroupRef = ++maxSortGroupRef;
+			get_sort_group_operators(exprType((Node *) expr),
+									 false, true, false,
+									 &sgc->sortop, &sgc->eqop, NULL,
+									 &sgc->hashable);
+
+			/* This expression should be emitted by the grouped paths */
+			add_column_to_pathtarget(target, expr, sgc->tleSortGroupRef);
+
+			/* ... and it also should be emitted by the input paths. */
+			add_column_to_pathtarget(agg_input, expr, sgc->tleSortGroupRef);
+
+			/* Record this SortGroupClause and grouping expression */
+			*group_clauses = lappend(*group_clauses, sgc);
+			*group_exprs = lappend(*group_exprs, expr);
+		}
+		else if (is_var_in_aggref_only(root, (Var *) expr))
+		{
+			/*
+			 * The expression is referenced by an aggregate function pushed
+			 * down to this relation and does not appear elsewhere in the
+			 * targetlist or havingQual.  Add it to 'agg_input' but not to
+			 * 'target'.
+			 */
+			add_new_column_to_pathtarget(agg_input, expr);
+		}
+		else
+		{
+			/*
+			 * The expression may be functionally dependent on other
+			 * expressions in the target, but we cannot verify this until all
+			 * target expressions have been constructed.
+			 */
+			possibly_dependent = lappend(possibly_dependent, expr);
+		}
+	}
+
+	/*
+	 * Now we can verify whether an expression is functionally dependent on
+	 * others.
+	 */
+	foreach(lc, possibly_dependent)
+	{
+		Var		   *tvar;
+		List	   *deps = NIL;
+		RangeTblEntry *rte;
+
+		tvar = lfirst_node(Var, lc);
+		rte = root->simple_rte_array[tvar->varno];
+
+		if (check_functional_grouping(rte->relid, tvar->varno,
+									  tvar->varlevelsup,
+									  target->exprs, &deps))
+		{
+			/*
+			 * The expression is functionally dependent on other target
+			 * expressions, so it can be included in the targets.  Since it
+			 * will not be used as a grouping key, a sortgroupref is not
+			 * needed for it.
+			 */
+			add_new_column_to_pathtarget(target, (Expr *) tvar);
+			add_new_column_to_pathtarget(agg_input, (Expr *) tvar);
+		}
+		else
+		{
+			/*
+			 * We may arrive here with a grouping expression that is proven
+			 * redundant by EquivalenceClass processing, such as 't1.a' in the
+			 * query below.
+			 *
+			 * select max(t1.c) from t t1, t t2 where t1.a = 1 group by t1.a,
+			 * t1.b;
+			 *
+			 * For now we just give up in this case.
+			 */
+			return false;
+		}
+	}
+
+	return true;
+}
+
+/*
+ * is_var_in_aggref_only
+ *	  Check whether the given Var appears in aggregate expressions and not
+ *	  elsewhere in the targetlist or havingQual.
+ */
+static bool
+is_var_in_aggref_only(PlannerInfo *root, Var *var)
+{
+	ListCell   *lc;
+
+	/*
+	 * Search the list of aggregate expressions for the Var.
+	 */
+	foreach(lc, root->agg_clause_list)
+	{
+		AggClauseInfo *ac_info = lfirst_node(AggClauseInfo, lc);
+		List	   *vars;
+
+		Assert(IsA(ac_info->aggref, Aggref));
+
+		if (!bms_is_member(var->varno, ac_info->agg_eval_at))
+			continue;
+
+		vars = pull_var_clause((Node *) ac_info->aggref,
+							   PVC_RECURSE_AGGREGATES |
+							   PVC_RECURSE_WINDOWFUNCS |
+							   PVC_RECURSE_PLACEHOLDERS);
+
+		if (list_member(vars, var))
+		{
+			list_free(vars);
+			break;
+		}
+
+		list_free(vars);
+	}
+
+	return (lc != NULL && !list_member(root->tlist_vars, var));
+}
+
+/*
+ * is_var_needed_by_join
+ *	  Check if the given Var is needed by joins above the current rel.
+ */
+static bool
+is_var_needed_by_join(PlannerInfo *root, Var *var, RelOptInfo *rel)
+{
+	Relids		relids;
+	int			attno;
+	RelOptInfo *baserel;
+
+	/*
+	 * Note that when checking if the Var is needed by joins above, we want to
+	 * exclude cases where the Var is only needed in the final targetlist.  So
+	 * include "relation 0" in the check.
+	 */
+	relids = bms_copy(rel->relids);
+	relids = bms_add_member(relids, 0);
+
+	baserel = find_base_rel(root, var->varno);
+	attno = var->varattno - baserel->min_attr;
+
+	return bms_nonempty_difference(baserel->attr_needed[attno], relids);
+}
+
+/*
+ * get_expression_sortgroupref
+ *	  Return the sortgroupref of the given "expr" if it is found among the
+ *	  original grouping expressions, or is known equal to any of the original
+ *	  grouping expressions due to equivalence relationships.  Return 0 if no
+ *	  match is found.
+ */
+static Index
+get_expression_sortgroupref(PlannerInfo *root, Expr *expr)
+{
+	ListCell   *lc;
+
+	Assert(IsA(expr, Var));
+
+	foreach(lc, root->group_expr_list)
+	{
+		GroupingExprInfo *ge_info = lfirst_node(GroupingExprInfo, lc);
+		ListCell   *lc1;
+
+		Assert(IsA(ge_info->expr, Var));
+		Assert(ge_info->sortgroupref > 0);
+
+		if (equal(expr, ge_info->expr))
+			return ge_info->sortgroupref;
+
+		if (ge_info->ec == NULL ||
+			!bms_is_member(((Var *) expr)->varno, ge_info->ec->ec_relids))
+			continue;
+
+		/*
+		 * Scan the EquivalenceClass, looking for a match to the given
+		 * expression.  We ignore child members here.
+		 */
+		foreach(lc1, ge_info->ec->ec_members)
+		{
+			EquivalenceMember *em = (EquivalenceMember *) lfirst(lc1);
+
+			/* Child members should not exist in ec_members */
+			Assert(!em->em_is_child);
+
+			if (equal(expr, em->em_expr))
+				return ge_info->sortgroupref;
+		}
+	}
+
+	/* no match is found */
+	return 0;
+}
diff --git a/src/backend/utils/misc/guc_parameters.dat b/src/backend/utils/misc/guc_parameters.dat
index 6bc6be13d2a..b176d5130e4 100644
--- a/src/backend/utils/misc/guc_parameters.dat
+++ b/src/backend/utils/misc/guc_parameters.dat
@@ -145,6 +145,13 @@
   boot_val => 'false',
 },
 
+{ name => 'enable_eager_aggregate', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
+  short_desc => 'Enables eager aggregation.',
+  flags => 'GUC_EXPLAIN',
+  variable => 'enable_eager_aggregate',
+  boot_val => 'true',
+},
+
 { name => 'enable_parallel_append', type => 'bool', context => 'PGC_USERSET', group => 'QUERY_TUNING_METHOD',
   short_desc => 'Enables the planner\'s use of parallel append plans.',
   flags => 'GUC_EXPLAIN',
@@ -2427,6 +2434,15 @@
   max => 'DBL_MAX',
 },
 
+{ name => 'min_eager_agg_group_size', type => 'real', context => 'PGC_USERSET', group => 'QUERY_TUNING_COST',
+  short_desc => 'Sets the minimum average group size required to consider applying eager aggregation.',
+  flags => 'GUC_EXPLAIN',
+  variable => 'min_eager_agg_group_size',
+  boot_val => '8.0',
+  min => '0.0',
+  max => 'DBL_MAX',
+},
+
 { name => 'cursor_tuple_fraction', type => 'real', context => 'PGC_USERSET', group => 'QUERY_TUNING_OTHER',
   short_desc => 'Sets the planner\'s estimate of the fraction of a cursor\'s rows that will be retrieved.',
   flags => 'GUC_EXPLAIN',
diff --git a/src/backend/utils/misc/postgresql.conf.sample b/src/backend/utils/misc/postgresql.conf.sample
index c36fcb9ab61..c5d612ab552 100644
--- a/src/backend/utils/misc/postgresql.conf.sample
+++ b/src/backend/utils/misc/postgresql.conf.sample
@@ -428,6 +428,7 @@
 #enable_group_by_reordering = on
 #enable_distinct_reordering = on
 #enable_self_join_elimination = on
+#enable_eager_aggregate = on
 
 # - Planner Cost Constants -
 
@@ -441,6 +442,7 @@
 #min_parallel_table_scan_size = 8MB
 #min_parallel_index_scan_size = 512kB
 #effective_cache_size = 4GB
+#min_eager_agg_group_size = 8.0
 
 #jit_above_cost = 100000		# perform JIT compilation if available
 					# and query more expensive than this;
diff --git a/src/include/nodes/pathnodes.h b/src/include/nodes/pathnodes.h
index b12a2508d8c..798b431c5aa 100644
--- a/src/include/nodes/pathnodes.h
+++ b/src/include/nodes/pathnodes.h
@@ -391,6 +391,15 @@ struct PlannerInfo
 	/* list of PlaceHolderInfos */
 	List	   *placeholder_list;
 
+	/* list of AggClauseInfos */
+	List	   *agg_clause_list;
+
+	/* list of GroupingExprInfos */
+	List	   *group_expr_list;
+
+	/* list of plain Vars contained in targetlist and havingQual */
+	List	   *tlist_vars;
+
 	/* array of PlaceHolderInfos indexed by phid */
 	struct PlaceHolderInfo **placeholder_array pg_node_attr(read_write_ignore, array_size(placeholder_array_size));
 	/* allocated size of array */
@@ -1040,6 +1049,14 @@ typedef struct RelOptInfo
 	/* consider partitionwise join paths? (if partitioned rel) */
 	bool		consider_partitionwise_join;
 
+	/*
+	 * used by eager aggregation:
+	 */
+	/* information needed to create grouped paths */
+	struct RelAggInfo *agg_info;
+	/* the partially-aggregated version of the relation */
+	struct RelOptInfo *grouped_rel;
+
 	/*
 	 * inheritance links, if this is an otherrel (otherwise NULL):
 	 */
@@ -1124,6 +1141,63 @@ typedef struct RelOptInfo
 	((nominal_jointype) == JOIN_INNER && (sjinfo)->jointype == JOIN_SEMI && \
 	 bms_equal((sjinfo)->syn_righthand, (rel)->relids))
 
+/*
+ * Is the given relation a grouped relation?
+ */
+#define IS_GROUPED_REL(rel) \
+	((rel)->agg_info != NULL)
+
+/*
+ * RelAggInfo
+ *		Information needed to create paths for a grouped relation.
+ *
+ * "target" is the default result targetlist for Paths scanning this grouped
+ * relation; list of Vars/Exprs, cost, width.
+ *
+ * "agg_input" is the output tlist for the paths that provide input to the
+ * grouped paths.  One difference from the reltarget of the non-grouped
+ * relation is that agg_input has its sortgrouprefs[] initialized.
+ *
+ * "group_clauses" and "group_exprs" are lists of SortGroupClauses and the
+ * corresponding grouping expressions.
+ *
+ * "apply_at" tracks the set of relids at which partial aggregation is applied
+ * in the paths of this grouped relation.
+ *
+ * "grouped_rows" is the estimated number of result tuples of the grouped
+ * relation.
+ *
+ * "agg_useful" is a flag to indicate whether the grouped paths are considered
+ * useful.  It is set true if the average partial group size is no less than
+ * min_eager_agg_group_size, suggesting a significant row count reduction.
+ */
+typedef struct RelAggInfo
+{
+	pg_node_attr(no_copy_equal, no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the output tlist for the grouped paths */
+	struct PathTarget *target;
+
+	/* the output tlist for the input paths */
+	struct PathTarget *agg_input;
+
+	/* a list of SortGroupClauses */
+	List	   *group_clauses;
+	/* a list of grouping expressions */
+	List	   *group_exprs;
+
+	/* the set of relids partial aggregation is applied at */
+	Relids		apply_at;
+
+	/* estimated number of result tuples */
+	Cardinality grouped_rows;
+
+	/* the grouped paths are considered useful? */
+	bool		agg_useful;
+} RelAggInfo;
+
 /*
  * IndexOptInfo
  *		Per-index information for planning/optimization
@@ -3268,6 +3342,49 @@ typedef struct MinMaxAggInfo
 	Param	   *param;
 } MinMaxAggInfo;
 
+/*
+ * For each distinct Aggref node that appears in the targetlist and HAVING
+ * clauses, we store an AggClauseInfo node in the PlannerInfo node's
+ * agg_clause_list.  Each AggClauseInfo records the set of relations referenced
+ * by the aggregate expression.  This information is used to determine how far
+ * the aggregate can be safely pushed down in the join tree.
+ */
+typedef struct AggClauseInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the Aggref expr */
+	Aggref	   *aggref;
+
+	/* lowest level we can evaluate this aggregate at */
+	Relids		agg_eval_at;
+} AggClauseInfo;
+
+/*
+ * For each grouping expression that appears in grouping clauses, we store a
+ * GroupingExprInfo node in the PlannerInfo node's group_expr_list.  Each
+ * GroupingExprInfo records the expression being grouped on, its sortgroupref,
+ * and the EquivalenceClass it belongs to.  This information is necessary to
+ * reproduce correct grouping semantics at different levels of the join tree.
+ */
+typedef struct GroupingExprInfo
+{
+	pg_node_attr(no_read, no_query_jumble)
+
+	NodeTag		type;
+
+	/* the represented expression */
+	Expr	   *expr;
+
+	/* the tleSortGroupRef of the corresponding SortGroupClause */
+	Index		sortgroupref;
+
+	/* the equivalence class the expression belongs to */
+	EquivalenceClass *ec pg_node_attr(copy_as_scalar, equal_as_scalar);
+} GroupingExprInfo;
+
 /*
  * At runtime, PARAM_EXEC slots are used to pass values around from one plan
  * node to another.  They can be used to pass values down into subqueries (for
diff --git a/src/include/optimizer/pathnode.h b/src/include/optimizer/pathnode.h
index 763cd25bb3c..da60383c2aa 100644
--- a/src/include/optimizer/pathnode.h
+++ b/src/include/optimizer/pathnode.h
@@ -312,6 +312,8 @@ extern void setup_simple_rel_arrays(PlannerInfo *root);
 extern void expand_planner_arrays(PlannerInfo *root, int add_size);
 extern RelOptInfo *build_simple_rel(PlannerInfo *root, int relid,
 									RelOptInfo *parent);
+extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel);
+extern RelOptInfo *build_grouped_rel(PlannerInfo *root, RelOptInfo *rel);
 extern RelOptInfo *find_base_rel(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_noerr(PlannerInfo *root, int relid);
 extern RelOptInfo *find_base_rel_ignore_join(PlannerInfo *root, int relid);
@@ -351,4 +353,6 @@ extern RelOptInfo *build_child_join_rel(PlannerInfo *root,
 										SpecialJoinInfo *sjinfo,
 										int nappinfos, AppendRelInfo **appinfos);
 
+extern RelAggInfo *create_rel_agg_info(PlannerInfo *root, RelOptInfo *rel,
+									   bool calculate_grouped_rows);
 #endif							/* PATHNODE_H */
diff --git a/src/include/optimizer/paths.h b/src/include/optimizer/paths.h
index cbade77b717..f6a62df0b43 100644
--- a/src/include/optimizer/paths.h
+++ b/src/include/optimizer/paths.h
@@ -21,7 +21,9 @@
  * allpaths.c
  */
 extern PGDLLIMPORT bool enable_geqo;
+extern PGDLLIMPORT bool enable_eager_aggregate;
 extern PGDLLIMPORT int geqo_threshold;
+extern PGDLLIMPORT double min_eager_agg_group_size;
 extern PGDLLIMPORT int min_parallel_table_scan_size;
 extern PGDLLIMPORT int min_parallel_index_scan_size;
 extern PGDLLIMPORT bool enable_group_by_reordering;
@@ -57,6 +59,8 @@ extern void generate_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 								  bool override_rows);
 extern void generate_useful_gather_paths(PlannerInfo *root, RelOptInfo *rel,
 										 bool override_rows);
+extern void generate_grouped_paths(PlannerInfo *root, RelOptInfo *grouped_rel,
+								   RelOptInfo *rel);
 extern int	compute_parallel_worker(RelOptInfo *rel, double heap_pages,
 									double index_pages, int max_workers);
 extern void create_partial_bitmap_paths(PlannerInfo *root, RelOptInfo *rel,
diff --git a/src/include/optimizer/planmain.h b/src/include/optimizer/planmain.h
index 9d3debcab28..09b48b26f8f 100644
--- a/src/include/optimizer/planmain.h
+++ b/src/include/optimizer/planmain.h
@@ -76,6 +76,7 @@ extern void add_vars_to_targetlist(PlannerInfo *root, List *vars,
 extern void add_vars_to_attr_needed(PlannerInfo *root, List *vars,
 									Relids where_needed);
 extern void remove_useless_groupby_columns(PlannerInfo *root);
+extern void setup_eager_aggregation(PlannerInfo *root);
 extern void find_lateral_references(PlannerInfo *root);
 extern void rebuild_lateral_attr_needed(PlannerInfo *root);
 extern void create_lateral_join_info(PlannerInfo *root);
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index 69805d4b9ec..ef79d6f1ded 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -2437,11 +2437,11 @@ SELECT c collate "C", count(c) FROM pagg_tab3 GROUP BY c collate "C" ORDER BY 1;
 SET enable_partitionwise_join TO false;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2449,10 +2449,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
@@ -2464,11 +2466,11 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
 SET enable_partitionwise_join TO true;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                         QUERY PLAN                          
--------------------------------------------------------------
+                            QUERY PLAN                             
+-------------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  HashAggregate
+   ->  Finalize HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2476,10 +2478,12 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Append
-                           ->  Seq Scan on pagg_tab3_p2 t2_1
-                           ->  Seq Scan on pagg_tab3_p1 t2_2
-(13 rows)
+                     ->  Partial HashAggregate
+                           Group Key: t2.c
+                           ->  Append
+                                 ->  Seq Scan on pagg_tab3_p2 t2_1
+                                 ->  Seq Scan on pagg_tab3_p1 t2_2
+(15 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
new file mode 100644
index 00000000000..fc0f8c14ec9
--- /dev/null
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -0,0 +1,1714 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+--
+-- Test eager aggregation over base rel
+--
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                               QUERY PLAN                               
+------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b
+                                 Sort Key: t2.b
+                                 ->  Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.c, t2.b
+(21 rows)
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test eager aggregation over join rel
+--
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+                                  QUERY PLAN                                  
+------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Hash Join
+                                 Output: t2.c, t2.b, t3.c
+                                 Hash Cond: (t3.a = t2.a)
+                                 ->  Seq Scan on public.eager_agg_t3 t3
+                                       Output: t3.a, t3.b, t3.c
+                                 ->  Hash
+                                       Output: t2.c, t2.b, t2.a
+                                       ->  Seq Scan on public.eager_agg_t2 t2
+                                             Output: t2.c, t2.b, t2.a
+(25 rows)
+
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg((t2.c + t3.c))
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg((t2.c + t3.c)))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg((t2.c + t3.c)))
+                     ->  Partial GroupAggregate
+                           Output: t2.b, PARTIAL avg((t2.c + t3.c))
+                           Group Key: t2.b
+                           ->  Sort
+                                 Output: t2.c, t2.b, t3.c
+                                 Sort Key: t2.b
+                                 ->  Hash Join
+                                       Output: t2.c, t2.b, t3.c
+                                       Hash Cond: (t3.a = t2.a)
+                                       ->  Seq Scan on public.eager_agg_t3 t3
+                                             Output: t3.a, t3.b, t3.c
+                                       ->  Hash
+                                             Output: t2.c, t2.b, t2.a
+                                             ->  Seq Scan on public.eager_agg_t2 t2
+                                                   Output: t2.c, t2.b, t2.a
+(28 rows)
+
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 497
+ 2 | 499
+ 3 | 501
+ 4 | 503
+ 5 | 505
+ 6 | 507
+ 7 | 509
+ 8 | 511
+ 9 | 513
+(9 rows)
+
+RESET enable_hashagg;
+--
+-- Test that eager aggregation works for outer join
+--
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Right Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   | 505
+(10 rows)
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c)
+  FROM eager_agg_t1 t1
+  LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t2.b ORDER BY t2.b;
+                         QUERY PLAN                         
+------------------------------------------------------------
+ Sort
+   Output: t2.b, (avg(t2.c))
+   Sort Key: t2.b
+   ->  HashAggregate
+         Output: t2.b, avg(t2.c)
+         Group Key: t2.b
+         ->  Hash Right Join
+               Output: t2.b, t2.c
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on public.eager_agg_t2 t2
+                     Output: t2.a, t2.b, t2.c
+               ->  Hash
+                     Output: t1.b
+                     ->  Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.b
+(15 rows)
+
+SELECT t2.b, avg(t2.c)
+  FROM eager_agg_t1 t1
+  LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t2.b ORDER BY t2.b;
+ b | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+   |    
+(10 rows)
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                                   QUERY PLAN                                    
+---------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Gather Merge
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Workers Planned: 2
+         ->  Sort
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Sort Key: t1.a
+               ->  Parallel Hash Join
+                     Output: t1.a, (PARTIAL avg(t2.c))
+                     Hash Cond: (t1.b = t2.b)
+                     ->  Parallel Seq Scan on public.eager_agg_t1 t1
+                           Output: t1.a, t1.b, t1.c
+                     ->  Parallel Hash
+                           Output: t2.b, (PARTIAL avg(t2.c))
+                           ->  Partial HashAggregate
+                                 Output: t2.b, PARTIAL avg(t2.c)
+                                 Group Key: t2.b
+                                 ->  Parallel Seq Scan on public.eager_agg_t2 t2
+                                       Output: t2.a, t2.b, t2.c
+(21 rows)
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+--
+-- Test eager aggregation with GEQO
+--
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t1.a, avg(t2.c)
+   Group Key: t1.a
+   ->  Sort
+         Output: t1.a, (PARTIAL avg(t2.c))
+         Sort Key: t1.a
+         ->  Hash Join
+               Output: t1.a, (PARTIAL avg(t2.c))
+               Hash Cond: (t1.b = t2.b)
+               ->  Seq Scan on public.eager_agg_t1 t1
+                     Output: t1.a, t1.b, t1.c
+               ->  Hash
+                     Output: t2.b, (PARTIAL avg(t2.c))
+                     ->  Partial HashAggregate
+                           Output: t2.b, PARTIAL avg(t2.c)
+                           Group Key: t2.b
+                           ->  Seq Scan on public.eager_agg_t2 t2
+                                 Output: t2.a, t2.b, t2.c
+(18 rows)
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+ a | avg 
+---+-----
+ 1 | 496
+ 2 | 497
+ 3 | 498
+ 4 | 499
+ 5 | 500
+ 6 | 501
+ 7 | 502
+ 8 | 503
+ 9 | 504
+(9 rows)
+
+RESET geqo;
+RESET geqo_threshold;
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+--
+-- Test eager aggregation for partitionwise join
+--
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+-- When GROUP BY clause matches; full aggregation is performed for each
+-- partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.y ORDER BY t2.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t2.y, (sum(t1.y)), (count(*))
+   Sort Key: t2.y
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t2.y, sum(t1.y), count(*)
+               Group Key: t2.y
+               ->  Hash Join
+                     Output: t2.y, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.y, t1.x
+         ->  Finalize HashAggregate
+               Output: t2_1.y, sum(t1_1.y), count(*)
+               Group Key: t2_1.y
+               ->  Hash Join
+                     Output: t2_1.y, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.y, t1_1.x
+         ->  Finalize HashAggregate
+               Output: t2_2.y, sum(t1_2.y), count(*)
+               Group Key: t2_2.y
+               ->  Hash Join
+                     Output: t2_2.y, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.y, t1_2.x
+(49 rows)
+
+SELECT t2.y, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.y ORDER BY t2.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for
+-- each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+                                                 QUERY PLAN                                                 
+------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t2.x, (sum(t1.x)), (count(*))
+   Sort Key: t2.x
+   ->  Finalize HashAggregate
+         Output: t2.x, sum(t1.x), count(*)
+         Group Key: t2.x
+         Filter: (avg(t1.x) > '5'::numeric)
+         ->  Append
+               ->  Hash Join
+                     Output: t2.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.x, t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.x)), (PARTIAL count(*)), (PARTIAL avg(t1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.x), PARTIAL count(*), PARTIAL avg(t1.x)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x
+               ->  Hash Join
+                     Output: t2_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.x, t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.x)), (PARTIAL count(*)), (PARTIAL avg(t1_1.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.x), PARTIAL count(*), PARTIAL avg(t1_1.x)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x
+               ->  Hash Join
+                     Output: t2_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.x, t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.x)), (PARTIAL count(*)), (PARTIAL avg(t1_2.x))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.x), PARTIAL count(*), PARTIAL avg(t1_2.x)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x
+(44 rows)
+
+SELECT t2.x, sum(t1.x), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+ x |  sum  | count 
+---+-------+-------
+ 0 | 33835 |  6667
+ 1 | 39502 |  6667
+ 2 | 46169 |  6667
+ 3 | 52836 |  6667
+ 4 | 59503 |  6667
+ 5 | 33500 |  6667
+ 6 | 39837 |  6667
+ 7 | 46504 |  6667
+ 8 | 53171 |  6667
+ 9 | 59838 |  6667
+(10 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y)))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y))
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y))
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y))
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y))
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y)))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+(70 rows)
+
+SELECT t1.x, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   
+----+---------
+  0 | 1437480
+  1 | 2082896
+  2 | 2684422
+  3 | 3285948
+  4 | 3887474
+  5 | 1526260
+  6 | 2127786
+  7 | 2729312
+  8 | 3330838
+  9 | 3932364
+ 10 | 1481370
+ 11 | 2012472
+ 12 | 2587464
+ 13 | 3162456
+ 14 | 3737448
+(15 rows)
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+                                        QUERY PLAN                                         
+-------------------------------------------------------------------------------------------
+ Finalize GroupAggregate
+   Output: t3.y, sum((t2.y + t3.y))
+   Group Key: t3.y
+   ->  Sort
+         Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+         Sort Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y)))
+                     Hash Cond: (t2.x = t1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y))
+                           Group Key: t2.x, t3.y, t3.x
+                           ->  Incremental Sort
+                                 Output: t2.y, t2.x, t3.y, t3.x
+                                 Sort Key: t2.x, t3.y
+                                 Presorted Key: t2.x
+                                 ->  Merge Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Merge Cond: (t2.x = t3.x)
+                                       ->  Sort
+                                             Output: t2.y, t2.x
+                                             Sort Key: t2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t2
+                                                   Output: t2.y, t2.x
+                                       ->  Sort
+                                             Output: t3.y, t3.x
+                                             Sort Key: t3.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p1 t3
+                                                   Output: t3.y, t3.x
+                     ->  Hash
+                           Output: t1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                 Output: t1.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y)))
+                     Hash Cond: (t2_1.x = t1_1.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y))
+                           Group Key: t2_1.x, t3_1.y, t3_1.x
+                           ->  Incremental Sort
+                                 Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                 Sort Key: t2_1.x, t3_1.y
+                                 Presorted Key: t2_1.x
+                                 ->  Merge Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Merge Cond: (t2_1.x = t3_1.x)
+                                       ->  Sort
+                                             Output: t2_1.y, t2_1.x
+                                             Sort Key: t2_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t2_1
+                                                   Output: t2_1.y, t2_1.x
+                                       ->  Sort
+                                             Output: t3_1.y, t3_1.x
+                                             Sort Key: t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p2 t3_1
+                                                   Output: t3_1.y, t3_1.x
+                     ->  Hash
+                           Output: t1_1.x
+                           ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                 Output: t1_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y)))
+                     Hash Cond: (t2_2.x = t1_2.x)
+                     ->  Partial GroupAggregate
+                           Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y))
+                           Group Key: t2_2.x, t3_2.y, t3_2.x
+                           ->  Incremental Sort
+                                 Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                 Sort Key: t2_2.x, t3_2.y
+                                 Presorted Key: t2_2.x
+                                 ->  Merge Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Merge Cond: (t2_2.x = t3_2.x)
+                                       ->  Sort
+                                             Output: t2_2.y, t2_2.x
+                                             Sort Key: t2_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t2_2
+                                                   Output: t2_2.y, t2_2.x
+                                       ->  Sort
+                                             Output: t3_2.y, t3_2.x
+                                             Sort Key: t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab1_p3 t3_2
+                                                   Output: t3_2.y, t3_2.x
+                     ->  Hash
+                           Output: t1_2.x
+                           ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                 Output: t1_2.x
+(88 rows)
+
+SELECT t3.y, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+ y |   sum   
+---+---------
+ 0 | 1111110
+ 1 | 2000132
+ 2 | 2889154
+ 3 | 3778176
+ 4 | 4667198
+ 5 | 3334000
+ 6 | 4223022
+ 7 | 5112044
+ 8 | 6001066
+ 9 | 6890088
+(10 rows)
+
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t1.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t1.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2.y = t1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p1 t2
+                           Output: t2.y
+                     ->  Hash
+                           Output: t1.x, (PARTIAL sum(t1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1.x, PARTIAL sum(t1.y), PARTIAL count(*)
+                                 Group Key: t1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p1 t1
+                                       Output: t1.x, t1.y
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t1_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_1.y = t1_1.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p2 t2_1
+                           Output: t2_1.y
+                     ->  Hash
+                           Output: t1_1.x, (PARTIAL sum(t1_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_1.x, PARTIAL sum(t1_1.y), PARTIAL count(*)
+                                 Group Key: t1_1.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p2 t1_1
+                                       Output: t1_1.x, t1_1.y
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t1_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t2_2.y = t1_2.x)
+                     ->  Seq Scan on public.eager_agg_tab2_p3 t2_2
+                           Output: t2_2.y
+                     ->  Hash
+                           Output: t1_2.x, (PARTIAL sum(t1_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t1_2.x, PARTIAL sum(t1_2.y), PARTIAL count(*)
+                                 Group Key: t1_2.x
+                                 ->  Seq Scan on public.eager_agg_tab1_p3 t1_2
+                                       Output: t1_2.x, t1_2.y
+(49 rows)
+
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 | 10890 |  4356
+  1 | 15544 |  4489
+  2 | 20033 |  4489
+  3 | 24522 |  4489
+  4 | 29011 |  4489
+  5 | 11390 |  4489
+  6 | 15879 |  4489
+  7 | 20368 |  4489
+  8 | 24857 |  4489
+  9 | 29346 |  4489
+ 10 | 11055 |  4489
+ 11 | 15246 |  4356
+ 12 | 19602 |  4356
+ 13 | 23958 |  4356
+ 14 | 28314 |  4356
+(15 rows)
+
+RESET geqo;
+RESET geqo_threshold;
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+ANALYZE eager_agg_tab_ml;
+-- When GROUP BY clause matches; full aggregation is performed for each
+-- partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- When GROUP BY clause does not match; partial aggregation is performed for
+-- each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.y ORDER BY t1.y;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.y, (sum(t2.y)), (count(*))
+   Sort Key: t1.y
+   ->  Finalize HashAggregate
+         Output: t1.y, sum(t2.y), count(*)
+         Group Key: t1.y
+         ->  Append
+               ->  Hash Join
+                     Output: t1.y, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.y, t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+               ->  Hash Join
+                     Output: t1_1.y, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.y, t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+               ->  Hash Join
+                     Output: t1_2.y, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.y, t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+               ->  Hash Join
+                     Output: t1_3.y, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.y, t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+               ->  Hash Join
+                     Output: t1_4.y, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.y, t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(67 rows)
+
+SELECT t1.y, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.y ORDER BY t1.y;
+ y  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+                                                QUERY PLAN                                                
+----------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum((t2.y + t3.y)), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum((t2_1.y + t3_1.y)), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum((t2_2.y + t3_2.y)), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum((t2_3.y + t3_3.y)), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum((t2_4.y + t3_4.y)), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(114 rows)
+
+SELECT t1.x, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+ x  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+                                                    QUERY PLAN                                                    
+------------------------------------------------------------------------------------------------------------------
+ Sort
+   Output: t3.y, (sum((t2.y + t3.y))), (count(*))
+   Sort Key: t3.y
+   ->  Finalize HashAggregate
+         Output: t3.y, sum((t2.y + t3.y)), count(*)
+         Group Key: t3.y
+         ->  Append
+               ->  Hash Join
+                     Output: t3.y, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, t3.y, t3.x, (PARTIAL sum((t2.y + t3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, t3.y, t3.x, PARTIAL sum((t2.y + t3.y)), PARTIAL count(*)
+                                 Group Key: t2.x, t3.y, t3.x
+                                 ->  Hash Join
+                                       Output: t2.y, t2.x, t3.y, t3.x
+                                       Hash Cond: (t2.x = t3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                             Output: t2.y, t2.x
+                                       ->  Hash
+                                             Output: t3.y, t3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p1 t3
+                                                   Output: t3.y, t3.x
+               ->  Hash Join
+                     Output: t3_1.y, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, t3_1.y, t3_1.x, (PARTIAL sum((t2_1.y + t3_1.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, t3_1.y, t3_1.x, PARTIAL sum((t2_1.y + t3_1.y)), PARTIAL count(*)
+                                 Group Key: t2_1.x, t3_1.y, t3_1.x
+                                 ->  Hash Join
+                                       Output: t2_1.y, t2_1.x, t3_1.y, t3_1.x
+                                       Hash Cond: (t2_1.x = t3_1.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                             Output: t2_1.y, t2_1.x
+                                       ->  Hash
+                                             Output: t3_1.y, t3_1.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t3_1
+                                                   Output: t3_1.y, t3_1.x
+               ->  Hash Join
+                     Output: t3_2.y, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, t3_2.y, t3_2.x, (PARTIAL sum((t2_2.y + t3_2.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, t3_2.y, t3_2.x, PARTIAL sum((t2_2.y + t3_2.y)), PARTIAL count(*)
+                                 Group Key: t2_2.x, t3_2.y, t3_2.x
+                                 ->  Hash Join
+                                       Output: t2_2.y, t2_2.x, t3_2.y, t3_2.x
+                                       Hash Cond: (t2_2.x = t3_2.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                             Output: t2_2.y, t2_2.x
+                                       ->  Hash
+                                             Output: t3_2.y, t3_2.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t3_2
+                                                   Output: t3_2.y, t3_2.x
+               ->  Hash Join
+                     Output: t3_3.y, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, t3_3.y, t3_3.x, (PARTIAL sum((t2_3.y + t3_3.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, t3_3.y, t3_3.x, PARTIAL sum((t2_3.y + t3_3.y)), PARTIAL count(*)
+                                 Group Key: t2_3.x, t3_3.y, t3_3.x
+                                 ->  Hash Join
+                                       Output: t2_3.y, t2_3.x, t3_3.y, t3_3.x
+                                       Hash Cond: (t2_3.x = t3_3.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                             Output: t2_3.y, t2_3.x
+                                       ->  Hash
+                                             Output: t3_3.y, t3_3.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t3_3
+                                                   Output: t3_3.y, t3_3.x
+               ->  Hash Join
+                     Output: t3_4.y, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, t3_4.y, t3_4.x, (PARTIAL sum((t2_4.y + t3_4.y))), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, t3_4.y, t3_4.x, PARTIAL sum((t2_4.y + t3_4.y)), PARTIAL count(*)
+                                 Group Key: t2_4.x, t3_4.y, t3_4.x
+                                 ->  Hash Join
+                                       Output: t2_4.y, t2_4.x, t3_4.y, t3_4.x
+                                       Hash Cond: (t2_4.x = t3_4.x)
+                                       ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                             Output: t2_4.y, t2_4.x
+                                       ->  Hash
+                                             Output: t3_4.y, t3_4.x
+                                             ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t3_4
+                                                   Output: t3_4.y, t3_4.x
+(102 rows)
+
+SELECT t3.y, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+ y  |   sum   | count 
+----+---------+-------
+  0 |       0 | 35937
+  1 |   78608 | 39304
+  2 |  157216 | 39304
+  3 |  235824 | 39304
+  4 |  314432 | 39304
+  5 |  393040 | 39304
+  6 |  471648 | 39304
+  7 |  550256 | 39304
+  8 |  628864 | 39304
+  9 |  707472 | 39304
+ 10 |  786080 | 39304
+ 11 |  790614 | 35937
+ 12 |  862488 | 35937
+ 13 |  934362 | 35937
+ 14 | 1006236 | 35937
+ 15 | 1078110 | 35937
+ 16 | 1149984 | 35937
+ 17 | 1221858 | 35937
+ 18 | 1293732 | 35937
+ 19 | 1365606 | 35937
+ 20 | 1437480 | 35937
+ 21 | 1509354 | 35937
+ 22 | 1581228 | 35937
+ 23 | 1653102 | 35937
+ 24 | 1724976 | 35937
+ 25 | 1796850 | 35937
+ 26 | 1868724 | 35937
+ 27 | 1940598 | 35937
+ 28 | 2012472 | 35937
+ 29 | 2084346 | 35937
+(30 rows)
+
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+                                      QUERY PLAN                                       
+---------------------------------------------------------------------------------------
+ Sort
+   Output: t1.x, (sum(t2.y)), (count(*))
+   Sort Key: t1.x
+   ->  Append
+         ->  Finalize HashAggregate
+               Output: t1.x, sum(t2.y), count(*)
+               Group Key: t1.x
+               ->  Hash Join
+                     Output: t1.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1.x = t2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p1 t1
+                           Output: t1.x
+                     ->  Hash
+                           Output: t2.x, (PARTIAL sum(t2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2.x, PARTIAL sum(t2.y), PARTIAL count(*)
+                                 Group Key: t2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p1 t2
+                                       Output: t2.y, t2.x
+         ->  Finalize HashAggregate
+               Output: t1_1.x, sum(t2_1.y), count(*)
+               Group Key: t1_1.x
+               ->  Hash Join
+                     Output: t1_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_1.x = t2_1.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t1_1
+                           Output: t1_1.x
+                     ->  Hash
+                           Output: t2_1.x, (PARTIAL sum(t2_1.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_1.x, PARTIAL sum(t2_1.y), PARTIAL count(*)
+                                 Group Key: t2_1.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s1 t2_1
+                                       Output: t2_1.y, t2_1.x
+         ->  Finalize HashAggregate
+               Output: t1_2.x, sum(t2_2.y), count(*)
+               Group Key: t1_2.x
+               ->  Hash Join
+                     Output: t1_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_2.x = t2_2.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t1_2
+                           Output: t1_2.x
+                     ->  Hash
+                           Output: t2_2.x, (PARTIAL sum(t2_2.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_2.x, PARTIAL sum(t2_2.y), PARTIAL count(*)
+                                 Group Key: t2_2.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p2_s2 t2_2
+                                       Output: t2_2.y, t2_2.x
+         ->  Finalize HashAggregate
+               Output: t1_3.x, sum(t2_3.y), count(*)
+               Group Key: t1_3.x
+               ->  Hash Join
+                     Output: t1_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_3.x = t2_3.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t1_3
+                           Output: t1_3.x
+                     ->  Hash
+                           Output: t2_3.x, (PARTIAL sum(t2_3.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_3.x, PARTIAL sum(t2_3.y), PARTIAL count(*)
+                                 Group Key: t2_3.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s1 t2_3
+                                       Output: t2_3.y, t2_3.x
+         ->  Finalize HashAggregate
+               Output: t1_4.x, sum(t2_4.y), count(*)
+               Group Key: t1_4.x
+               ->  Hash Join
+                     Output: t1_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                     Hash Cond: (t1_4.x = t2_4.x)
+                     ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t1_4
+                           Output: t1_4.x
+                     ->  Hash
+                           Output: t2_4.x, (PARTIAL sum(t2_4.y)), (PARTIAL count(*))
+                           ->  Partial HashAggregate
+                                 Output: t2_4.x, PARTIAL sum(t2_4.y), PARTIAL count(*)
+                                 Group Key: t2_4.x
+                                 ->  Seq Scan on public.eager_agg_tab_ml_p3_s2 t2_4
+                                       Output: t2_4.y, t2_4.x
+(79 rows)
+
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+ x  |  sum  | count 
+----+-------+-------
+  0 |     0 |  1089
+  1 |  1156 |  1156
+  2 |  2312 |  1156
+  3 |  3468 |  1156
+  4 |  4624 |  1156
+  5 |  5780 |  1156
+  6 |  6936 |  1156
+  7 |  8092 |  1156
+  8 |  9248 |  1156
+  9 | 10404 |  1156
+ 10 | 11560 |  1156
+ 11 | 11979 |  1089
+ 12 | 13068 |  1089
+ 13 | 14157 |  1089
+ 14 | 15246 |  1089
+ 15 | 16335 |  1089
+ 16 | 17424 |  1089
+ 17 | 18513 |  1089
+ 18 | 19602 |  1089
+ 19 | 20691 |  1089
+ 20 | 21780 |  1089
+ 21 | 22869 |  1089
+ 22 | 23958 |  1089
+ 23 | 25047 |  1089
+ 24 | 26136 |  1089
+ 25 | 27225 |  1089
+ 26 | 28314 |  1089
+ 27 | 29403 |  1089
+ 28 | 30492 |  1089
+ 29 | 31581 |  1089
+(30 rows)
+
+RESET geqo;
+RESET geqo_threshold;
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/expected/join.out b/src/test/regress/expected/join.out
index cd37f549b5a..bdbf21a874d 100644
--- a/src/test/regress/expected/join.out
+++ b/src/test/regress/expected/join.out
@@ -2840,20 +2840,22 @@ select x.thousand, x.twothousand, count(*)
 from tenk1 x inner join tenk1 y on x.thousand = y.thousand
 group by x.thousand, x.twothousand
 order by x.thousand desc, x.twothousand;
-                                    QUERY PLAN                                    
-----------------------------------------------------------------------------------
- GroupAggregate
+                                       QUERY PLAN                                       
+----------------------------------------------------------------------------------------
+ Finalize GroupAggregate
    Group Key: x.thousand, x.twothousand
    ->  Incremental Sort
          Sort Key: x.thousand DESC, x.twothousand
          Presorted Key: x.thousand
          ->  Merge Join
                Merge Cond: (y.thousand = x.thousand)
-               ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
+               ->  Partial GroupAggregate
+                     Group Key: y.thousand
+                     ->  Index Only Scan Backward using tenk1_thous_tenthous on tenk1 y
                ->  Sort
                      Sort Key: x.thousand DESC
                      ->  Seq Scan on tenk1 x
-(11 rows)
+(13 rows)
 
 reset enable_hashagg;
 reset enable_nestloop;
diff --git a/src/test/regress/expected/partition_aggregate.out b/src/test/regress/expected/partition_aggregate.out
index cb12bf53719..fc84929a002 100644
--- a/src/test/regress/expected/partition_aggregate.out
+++ b/src/test/regress/expected/partition_aggregate.out
@@ -13,6 +13,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation.
+SET enable_eager_aggregate TO off;
 --
 -- Tests for list partitioned tables.
 --
diff --git a/src/test/regress/expected/sysviews.out b/src/test/regress/expected/sysviews.out
index 83228cfca29..3b37fafa65b 100644
--- a/src/test/regress/expected/sysviews.out
+++ b/src/test/regress/expected/sysviews.out
@@ -151,6 +151,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_async_append            | on
  enable_bitmapscan              | on
  enable_distinct_reordering     | on
+ enable_eager_aggregate         | on
  enable_gathermerge             | on
  enable_group_by_reordering     | on
  enable_hashagg                 | on
@@ -172,7 +173,7 @@ select name, setting from pg_settings where name like 'enable%';
  enable_seqscan                 | on
  enable_sort                    | on
  enable_tidscan                 | on
-(24 rows)
+(25 rows)
 
 -- There are always wait event descriptions for various types.  InjectionPoint
 -- may be present or absent, depending on history since last postmaster start.
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index fbffc67ae60..f9450cdc477 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -123,7 +123,7 @@ test: plancache limit plpgsql copy2 temp domain rangefuncs prepare conversion tr
 # The stats test resets stats, so nothing else needing stats access can be in
 # this group.
 # ----------
-test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa
+test: partition_join partition_prune reloptions hash_part indexing partition_aggregate partition_info tuplesort explain compression compression_lz4 memoize stats predicate numa eager_aggregate
 
 # event_trigger depends on create_am and cannot run concurrently with
 # any test that runs DDL
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
new file mode 100644
index 00000000000..e328a83b4c7
--- /dev/null
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -0,0 +1,380 @@
+--
+-- EAGER AGGREGATION
+-- Test we can push aggregation down below join
+--
+
+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;
+
+CREATE TABLE eager_agg_t1 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t2 (a int, b int, c double precision);
+CREATE TABLE eager_agg_t3 (a int, b int, c double precision);
+
+INSERT INTO eager_agg_t1 SELECT i, i, i FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_t2 SELECT i, i%10, i FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_t3 SELECT i%10, i%10, i FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+ANALYZE eager_agg_t3;
+
+
+--
+-- Test eager aggregation over base rel
+--
+
+-- Perform scan of a table, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test eager aggregation over join rel
+--
+
+-- Perform join of tables, aggregate the result, join it to the other table
+-- and finalize the aggregation.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+
+-- Produce results with sorting aggregation
+SET enable_hashagg TO off;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c + t3.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+  JOIN eager_agg_t3 t3 ON t2.a = t3.a
+GROUP BY t1.a ORDER BY t1.a;
+
+RESET enable_hashagg;
+
+
+--
+-- Test that eager aggregation works for outer join
+--
+
+-- Ensure aggregation can be pushed down to the non-nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  RIGHT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+-- Ensure aggregation cannot be pushed down to the nullable side
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.b, avg(t2.c)
+  FROM eager_agg_t1 t1
+  LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t2.b ORDER BY t2.b;
+
+SELECT t2.b, avg(t2.c)
+  FROM eager_agg_t1 t1
+  LEFT JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t2.b ORDER BY t2.b;
+
+
+--
+-- Test that eager aggregation works for parallel plans
+--
+
+SET parallel_setup_cost=0;
+SET parallel_tuple_cost=0;
+SET min_parallel_table_scan_size=0;
+SET max_parallel_workers_per_gather=4;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+RESET parallel_setup_cost;
+RESET parallel_tuple_cost;
+RESET min_parallel_table_scan_size;
+RESET max_parallel_workers_per_gather;
+
+--
+-- Test eager aggregation with GEQO
+--
+
+SET geqo = on;
+SET geqo_threshold = 2;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+SELECT t1.a, avg(t2.c)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
+RESET geqo;
+RESET geqo_threshold;
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+DROP TABLE eager_agg_t3;
+
+
+--
+-- Test eager aggregation for partitionwise join
+--
+
+-- Enable partitionwise aggregate, which by default is disabled.
+SET enable_partitionwise_aggregate TO true;
+-- Enable partitionwise join, which by default is disabled.
+SET enable_partitionwise_join TO true;
+
+CREATE TABLE eager_agg_tab1(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab1_p1 PARTITION OF eager_agg_tab1 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab1_p2 PARTITION OF eager_agg_tab1 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab1_p3 PARTITION OF eager_agg_tab1 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab2(x int, y int) PARTITION BY RANGE(y);
+CREATE TABLE eager_agg_tab2_p1 PARTITION OF eager_agg_tab2 FOR VALUES FROM (0) TO (5);
+CREATE TABLE eager_agg_tab2_p2 PARTITION OF eager_agg_tab2 FOR VALUES FROM (5) TO (10);
+CREATE TABLE eager_agg_tab2_p3 PARTITION OF eager_agg_tab2 FOR VALUES FROM (10) TO (15);
+INSERT INTO eager_agg_tab1 SELECT i % 15, i % 10 FROM generate_series(1, 1000) i;
+INSERT INTO eager_agg_tab2 SELECT i % 10, i % 15 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab1;
+ANALYZE eager_agg_tab2;
+
+-- When GROUP BY clause matches; full aggregation is performed for each
+-- partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+
+-- GROUP BY having other matching key
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.y, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.y ORDER BY t2.y;
+
+SELECT t2.y, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.y ORDER BY t2.y;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for
+-- each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t2.x, sum(t1.x), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+
+SELECT t2.x, sum(t1.x), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t2.x HAVING avg(t1.x) > 5 ORDER BY t2.x;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+
+SELECT t1.x, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+SET enable_hashagg TO off;
+SET max_parallel_workers_per_gather TO 0;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+
+SELECT t3.y, sum(t2.y + t3.y)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab1 t2 ON t1.x = t2.x
+  JOIN eager_agg_tab1 t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+
+RESET enable_hashagg;
+RESET max_parallel_workers_per_gather;
+
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+
+SELECT t1.x, sum(t1.y), count(*)
+  FROM eager_agg_tab1 t1
+  JOIN eager_agg_tab2 t2 ON t1.x = t2.y
+GROUP BY t1.x ORDER BY t1.x;
+
+RESET geqo;
+RESET geqo_threshold;
+
+DROP TABLE eager_agg_tab1;
+DROP TABLE eager_agg_tab2;
+
+
+--
+-- Test with multi-level partitioning scheme
+--
+CREATE TABLE eager_agg_tab_ml(x int, y int) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p1 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (0) TO (10);
+CREATE TABLE eager_agg_tab_ml_p2 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (10) TO (20) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p2_s1 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (10) TO (15);
+CREATE TABLE eager_agg_tab_ml_p2_s2 PARTITION OF eager_agg_tab_ml_p2 FOR VALUES FROM (15) TO (20);
+CREATE TABLE eager_agg_tab_ml_p3 PARTITION OF eager_agg_tab_ml FOR VALUES FROM (20) TO (30) PARTITION BY RANGE(x);
+CREATE TABLE eager_agg_tab_ml_p3_s1 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (20) TO (25);
+CREATE TABLE eager_agg_tab_ml_p3_s2 PARTITION OF eager_agg_tab_ml_p3 FOR VALUES FROM (25) TO (30);
+INSERT INTO eager_agg_tab_ml SELECT i % 30, i % 30 FROM generate_series(1, 1000) i;
+
+ANALYZE eager_agg_tab_ml;
+
+-- When GROUP BY clause matches; full aggregation is performed for each
+-- partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+
+-- When GROUP BY clause does not match; partial aggregation is performed for
+-- each partition.
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.y, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.y ORDER BY t1.y;
+
+SELECT t1.y, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.y ORDER BY t1.y;
+
+-- Check with eager aggregation over join rel
+-- full aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+
+SELECT t1.x, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t1.x ORDER BY t1.x;
+
+-- partial aggregation
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t3.y, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+
+SELECT t3.y, sum(t2.y + t3.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+  JOIN eager_agg_tab_ml t3 ON t2.x = t3.x
+GROUP BY t3.y ORDER BY t3.y;
+
+-- try that with GEQO too
+SET geqo = on;
+SET geqo_threshold = 2;
+
+EXPLAIN (VERBOSE, COSTS OFF)
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+
+SELECT t1.x, sum(t2.y), count(*)
+  FROM eager_agg_tab_ml t1
+  JOIN eager_agg_tab_ml t2 ON t1.x = t2.x
+GROUP BY t1.x ORDER BY t1.x;
+
+RESET geqo;
+RESET geqo_threshold;
+
+DROP TABLE eager_agg_tab_ml;
diff --git a/src/test/regress/sql/partition_aggregate.sql b/src/test/regress/sql/partition_aggregate.sql
index ab070fee244..124cc260461 100644
--- a/src/test/regress/sql/partition_aggregate.sql
+++ b/src/test/regress/sql/partition_aggregate.sql
@@ -14,6 +14,8 @@ SET enable_partitionwise_join TO true;
 SET max_parallel_workers_per_gather TO 0;
 -- Disable incremental sort, which can influence selected plans due to fuzz factor.
 SET enable_incremental_sort TO off;
+-- Disable eager aggregation, which can interfere with the generation of partitionwise aggregation.
+SET enable_eager_aggregate TO off;
 
 --
 -- Tests for list partitioned tables.
diff --git a/src/tools/pgindent/typedefs.list b/src/tools/pgindent/typedefs.list
index 37f26f6c6b7..02b5b041c45 100644
--- a/src/tools/pgindent/typedefs.list
+++ b/src/tools/pgindent/typedefs.list
@@ -42,6 +42,7 @@ AfterTriggersTableData
 AfterTriggersTransData
 Agg
 AggClauseCosts
+AggClauseInfo
 AggInfo
 AggPath
 AggSplit
@@ -1110,6 +1111,7 @@ GroupPathExtraData
 GroupResultPath
 GroupState
 GroupVarInfo
+GroupingExprInfo
 GroupingFunc
 GroupingSet
 GroupingSetData
@@ -2473,6 +2475,7 @@ ReindexObjectType
 ReindexParams
 ReindexStmt
 ReindexType
+RelAggInfo
 RelFileLocator
 RelFileLocatorBackend
 RelFileNumber
-- 
2.39.5 (Apple Git-154)

* Re: Eager aggregation, take 3
@ 2025-10-08 11:14 ` David Rowley <[email protected]>
  2025-10-09 01:49   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: David Rowley @ 2025-10-08 11:14 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Tue, 7 Oct 2025 at 23:57, Richard Guo <[email protected]> wrote:
>
> On Mon, Oct 6, 2025 at 10:59 PM David Rowley <[email protected]> wrote:
> > 6. Shouldn't this be using lappend()?
> >
> >  agg_clause_list = list_append_unique(agg_clause_list, ac_info);
> >
> > I don't understand why ac_info could already be in the list. You've
> > just done: ac_info = makeNode(AggClauseInfo);
>
> A query can specify the same Aggref expressions multiple times in the
> target list.  Using lappend here can lead to duplicate partial Aggref
> nodes in the targetlist of a grouped path, which is what I want to
> avoid.

I was getting that mixed up with list_append_unique_ptr().
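
For readers following the distinction, here is a minimal, self-contained
C sketch of the two deduplication behaviors.  It does not use
PostgreSQL's List API: ToyAgg, ToyList, toy_equal(), toy_append_unique()
and toy_append_unique_ptr() are hypothetical stand-ins, with toy_equal()
playing the role of equal().  A freshly allocated node is never
pointer-identical to an existing list member, but it can still be
structurally equal to one.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for an aggregate clause node (not AggClauseInfo). */
typedef struct ToyAgg
{
    const char *expr;           /* textual form, e.g. "avg(t2.c)" */
} ToyAgg;

#define TOY_MAX 8

/* Hypothetical fixed-size list; PostgreSQL's List is not used here. */
typedef struct ToyList
{
    int         length;         /* number of valid entries in items[] */
    const ToyAgg *items[TOY_MAX];
} ToyList;

/* Structural comparison, playing the role equal() plays for nodes. */
static bool
toy_equal(const ToyAgg *a, const ToyAgg *b)
{
    return strcmp(a->expr, b->expr) == 0;
}

/* Append unless a structurally equal member is already present. */
static void
toy_append_unique(ToyList *list, const ToyAgg *item)
{
    for (int i = 0; i < list->length; i++)
    {
        if (toy_equal(list->items[i], item))
            return;
    }
    if (list->length < TOY_MAX)
        list->items[list->length++] = item;
}

/* Append unless the very same pointer is already present. */
static void
toy_append_unique_ptr(ToyList *list, const ToyAgg *item)
{
    for (int i = 0; i < list->length; i++)
    {
        if (list->items[i] == item)
            return;
    }
    if (list->length < TOY_MAX)
        list->items[list->length++] = item;
}
```

With two separately allocated ToyAgg values that both hold "avg(t2.c)",
appending both through toy_append_unique() leaves one entry, while
toy_append_unique_ptr() keeps two.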

> > 9. In get_expression_sortgroupref(), a comment claims "We ignore child
> > members here.". I think that's outdated since ec_members no longer has
> > child members.
>
> I think that comment is used to explain why we only scan ec_members
> here.  Similar comments can be found in many other places, such as in
> equivclass.c:
>
>   /*
>    * Found our match.  Scan the other EC members and attempt to generate
>    * joinclauses.  Ignore children here.
>    */
>   foreach(lc2, cur_ec->ec_members)
>   {

I'd say that's also wrong. "Ignore" means not to pay attention to
something that's there. The child members are not there.

> > 11. The way you've written the header comments for typedef struct
> > RelAggInfo seems weird.  I've only ever seen extra details in the
> > header comment when the inline comments have been kept to a single
> > line. You're spanning multiple lines, so why have the out of line
> > comments in the header at all?

> I've also updated the comments within RelAggInfo to use one-line
> style.

The style I'd thought of had the comments on the same line as the
field. Something like struct EquivalenceClass.
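
As an illustration of that layout only, a minimal sketch follows.
ToyRelAggInfo and its fields are hypothetical and are not the actual
RelAggInfo definition from the patch; the point is that a short header
comment names the struct and each field carries a one-line comment on
the same line.

```c
#include <stdbool.h>

/*
 * ToyRelAggInfo
 *    Hypothetical example of per-relation aggregation info, shown only
 *    to illustrate same-line field comments.
 */
typedef struct ToyRelAggInfo
{
    int         ngroup_exprs;   /* number of grouping expressions */
    double      est_groups;     /* estimated number of result groups */
    bool        useful;         /* is partial aggregation worthwhile? */
} ToyRelAggInfo;
```

Anything that needs more than one line would then go in the header
comment rather than next to the field.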

> I wrapped the long queries in v24.

+-- Enable eager aggregation, which by default is disabled.
+SET enable_eager_aggregate TO on;

The above comment doesn't match the command, as far as I can tell from
postgresql.conf.sample and guc_parameters.dat: enable_eager_aggregate is
already on by default.

David

* Re: Eager aggregation, take 3
@ 2025-10-09 01:49 ` Richard Guo <[email protected]>
  2025-10-09 08:07   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Richard Guo @ 2025-10-09 01:49 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Wed, Oct 8, 2025 at 8:14 PM David Rowley <[email protected]> wrote:
> +-- Enable eager aggregation, which by default is disabled.
> +SET enable_eager_aggregate TO on;

> The above comment and command mismatch to my understanding from
> looking at postgresql.conf.sample and guc_parameters.dat.

Right.  This GUC was disabled by default prior to v17, and this is a
leftover from that.  Will push a fix.  Thanks for pointing it out!

- Richard






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 00:59                                                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 13:59                                                     ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
  2025-10-07 10:56                                                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-08 11:14                                                         ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
  2025-10-09 01:49                                                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-10-09 08:07                                                             ` Richard Guo <[email protected]>
  2026-03-30 03:17                                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Richard Guo @ 2025-10-09 08:07 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Thu, Oct 9, 2025 at 10:49 AM Richard Guo <[email protected]> wrote:
> On Wed, Oct 8, 2025 at 8:14 PM David Rowley <[email protected]> wrote:
> > +-- Enable eager aggregation, which by default is disabled.
> > +SET enable_eager_aggregate TO on;
>
> > The above comment and command mismatch to my understanding from
> > looking at postgresql.conf.sample and guc_parameters.dat.

> Right.  This GUC was disabled by default prior to v17, and this is a
> leftover from that.  Will push a fix.  Thanks for pointing it out!

I noticed an unnecessary header include in initsplan.c.  Will fix that
as well.

- Richard






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 00:59                                                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 13:59                                                     ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
  2025-10-07 10:56                                                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-08 11:14                                                         ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
  2025-10-09 01:49                                                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-09 08:07                                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2026-03-30 03:17                                                               ` Richard Guo <[email protected]>
  2026-04-02 12:18                                                                 ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Richard Guo @ 2026-03-30 03:17 UTC (permalink / raw)
  To: David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Thu, Oct 9, 2025 at 5:07 PM Richard Guo <[email protected]> wrote:
> I noticed an unnecessary header include in initsplan.c.  Will fix that
> as well.

I noticed a couple of issues that can lead to unexpected results.
I've attached two patches to fix them.

1. Eager aggregation was incorrectly checking the data type's default
collation rather than the expression's actual collation.  This allowed
columns with non-deterministic collations to be pushed down, resulting
in incorrect grouping.  Fixed by 0001.

2. Pushing aggregates containing volatile functions below a join
alters their execution count.  Fixed by 0002.
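
To make the second issue concrete, here is a toy sketch in plain C.  It
is not PostgreSQL code: toy_volatile() is a hypothetical stand-in for a
volatile function such as random(), and the two helpers only count how
often it would run when the aggregate sits above the join versus when a
partial aggregate is pushed below it.

```c
#include <assert.h>
#include <stddef.h>

/* Number of times the stand-in volatile function has been called. */
static int volatile_calls = 0;

/* Hypothetical stand-in for a volatile function such as random(). */
static double
toy_volatile(void)
{
	volatile_calls++;
	return 0.0;				/* the value is irrelevant; only calls are counted */
}

/*
 * Aggregate above the join: the aggregate argument is evaluated once per
 * joined row, i.e. once per (t1, t2) pair with matching keys.
 */
static int
calls_agg_above_join(const int *t1_keys, size_t n1,
					 const int *t2_keys, size_t n2)
{
	assert(t1_keys != NULL && t2_keys != NULL);

	volatile_calls = 0;
	for (size_t i = 0; i < n1; i++)
		for (size_t j = 0; j < n2; j++)
			if (t1_keys[i] == t2_keys[j])
				(void) toy_volatile();
	return volatile_calls;
}

/*
 * Partial aggregate pushed below the join: the argument is evaluated once
 * per t2 row, whether or not that row later finds a join partner.
 */
static int
calls_agg_below_join(const int *t2_keys, size_t n2)
{
	assert(t2_keys != NULL);

	volatile_calls = 0;
	for (size_t j = 0; j < n2; j++)
		(void) toy_volatile();
	return volatile_calls;
}
```

With t1 keys {1, 1, 1} and t2 keys {1, 2}, the first helper reports 3
calls and the second reports 2, so the pushdown changes how often the
function runs.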

(As briefly discussed on Discord, this non-deterministic collation
issue also exists in our long-existing logic for pushing HAVING down
to WHERE.  But let's fix it for the eager aggregation first.)
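
For the collation issue, a similar toy sketch in plain C may help.
Again this is not PostgreSQL code, and all names are hypothetical:
eq_ci() stands in for equality under a non-deterministic
(case-insensitive) collation, eq_bytes() for equality under COLLATE "C",
and the grouping step is assumed to keep the first value seen as the
group's representative.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

#define MAX_GROUPS 16

/* Stand-in for equality under a case-insensitive collation. */
static int
eq_ci(const char *a, const char *b)
{
	for (; *a && *b; a++, b++)
		if (tolower((unsigned char) *a) != tolower((unsigned char) *b))
			return 0;
	return *a == *b;		/* equal only if both strings ended together */
}

/* Stand-in for equality under COLLATE "C" (byte-wise comparison). */
static int
eq_bytes(const char *a, const char *b)
{
	return strcmp(a, b) == 0;
}

/* Join first (byte-wise), then count: the expected result. */
static int
count_join_then_agg(const char *const *t1, size_t n1, const char *t2_val)
{
	int			count = 0;

	for (size_t i = 0; i < n1; i++)
		if (eq_bytes(t1[i], t2_val))
			count++;
	return count;
}

/*
 * Group t1 first under case-insensitive equality, keeping the first value
 * seen as each group's representative, then join the representatives
 * byte-wise.  This models the premature grouping.
 */
static int
count_agg_then_join(const char *const *t1, size_t n1, const char *t2_val)
{
	const char *rep[MAX_GROUPS];
	int			cnt[MAX_GROUPS];
	size_t		ngroups = 0;
	int			total = 0;

	for (size_t i = 0; i < n1; i++)
	{
		size_t		g;

		for (g = 0; g < ngroups; g++)
			if (eq_ci(rep[g], t1[i]))
				break;
		if (g == ngroups)
		{
			assert(ngroups < MAX_GROUPS);
			rep[ngroups] = t1[i];
			cnt[ngroups] = 0;
			ngroups++;
		}
		cnt[g]++;
	}

	for (size_t g = 0; g < ngroups; g++)
		if (eq_bytes(rep[g], t2_val))
			total += cnt[g];
	return total;
}
```

With t1 holding 'a', 'a', 'A', 'A' in that order and t2 holding 'A', the
first function returns 2 while the second returns 0: the merged group is
represented by 'a', which the byte-wise join condition then rejects.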

- Richard


Attachments:

  [application/octet-stream] v1-0001-Fix-collation-handling-for-grouping-keys-in-eager.patch (9.3K, 2-v1-0001-Fix-collation-handling-for-grouping-keys-in-eager.patch)
  download | inline diff:
From 3e8997d52dae13b571745355e07678f35d878c0b Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Fri, 27 Mar 2026 17:51:17 +0900
Subject: [PATCH v1 1/2] Fix collation handling for grouping keys in eager
 aggregation

When determining if it is safe to use an expression as a grouping key
for partial aggregation, eager aggregation relies on the B-tree
equalimage support function to ensure that equality implies image
equality.

Previously, the code incorrectly passed the default collation of the
expression's data type to the equalimage procedure, rather than the
expression's actual collation.  As a result, if a column used a
non-deterministic collation but the base type's default collation was
deterministic, eager aggregation would incorrectly assume that the
column was safe for byte-level grouping.  This could cause rows to be
prematurely grouped and subsequently discarded by strict join
conditions, resulting in incorrect query results.

This patch fixes the issue by passing the expression's actual
collation to the equalimage procedure.
---
 src/backend/optimizer/plan/initsplan.c        | 10 ++-
 src/backend/optimizer/util/relnode.c          | 10 ++-
 .../regress/expected/collate.icu.utf8.out     | 71 ++++++++++++++-----
 src/test/regress/sql/collate.icu.utf8.sql     | 31 ++++++++
 4 files changed, 102 insertions(+), 20 deletions(-)

diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index c20e7e49780..b207b8d913b 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -913,9 +913,17 @@ create_grouping_expr_infos(PlannerInfo *root)
 										   tce->btree_opintype,
 										   tce->btree_opintype,
 										   BTEQUALIMAGE_PROC);
+
+		/*
+		 * If there is no BTEQUALIMAGE_PROC, eager aggregation is assumed to
+		 * be unsafe.  Otherwise, we call the procedure to check.  We must be
+		 * careful to pass the expression's actual collation, rather than the
+		 * data type's default collation, to ensure that non-deterministic
+		 * collations are correctly handled.
+		 */
 		if (!OidIsValid(equalimageproc) ||
 			!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
-											   tce->typcollation,
+											   exprCollation((Node *) tle->expr),
 											   ObjectIdGetDatum(tce->btree_opintype))))
 			return;
 
diff --git a/src/backend/optimizer/util/relnode.c b/src/backend/optimizer/util/relnode.c
index 91bcda34a37..3fc2c2f71d0 100644
--- a/src/backend/optimizer/util/relnode.c
+++ b/src/backend/optimizer/util/relnode.c
@@ -3004,9 +3004,17 @@ init_grouping_targets(PlannerInfo *root, RelOptInfo *rel,
 											   tce->btree_opintype,
 											   tce->btree_opintype,
 											   BTEQUALIMAGE_PROC);
+
+			/*
+			 * If there is no BTEQUALIMAGE_PROC, eager aggregation is assumed
+			 * to be unsafe.  Otherwise, we call the procedure to check.  We
+			 * must be careful to pass the expression's actual collation,
+			 * rather than the data type's default collation, to ensure that
+			 * non-deterministic collations are correctly handled.
+			 */
 			if (!OidIsValid(equalimageproc) ||
 				!DatumGetBool(OidFunctionCall1Coll(equalimageproc,
-												   tce->typcollation,
+												   exprCollation((Node *) expr),
 												   ObjectIdGetDatum(tce->btree_opintype))))
 				return false;
 
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index d170e7da066..fbcdb7eb58c 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -2454,11 +2454,11 @@ SELECT c collate "C", count(c) FROM pagg_tab3 GROUP BY c collate "C" ORDER BY 1;
 SET enable_partitionwise_join TO false;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                            QUERY PLAN                             
--------------------------------------------------------------------
+                         QUERY PLAN                          
+-------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  Finalize HashAggregate
+   ->  HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2466,12 +2466,10 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Partial HashAggregate
-                           Group Key: t2.c
-                           ->  Append
-                                 ->  Seq Scan on pagg_tab3_p2 t2_1
-                                 ->  Seq Scan on pagg_tab3_p1 t2_2
-(15 rows)
+                     ->  Append
+                           ->  Seq Scan on pagg_tab3_p2 t2_1
+                           ->  Seq Scan on pagg_tab3_p1 t2_2
+(13 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
@@ -2483,11 +2481,11 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
 SET enable_partitionwise_join TO true;
 EXPLAIN (COSTS OFF)
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
-                            QUERY PLAN                             
--------------------------------------------------------------------
+                         QUERY PLAN                          
+-------------------------------------------------------------
  Sort
    Sort Key: t1.c COLLATE "C"
-   ->  Finalize HashAggregate
+   ->  HashAggregate
          Group Key: t1.c
          ->  Hash Join
                Hash Cond: (t1.c = t2.c)
@@ -2495,12 +2493,10 @@ SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROU
                      ->  Seq Scan on pagg_tab3_p2 t1_1
                      ->  Seq Scan on pagg_tab3_p1 t1_2
                ->  Hash
-                     ->  Partial HashAggregate
-                           Group Key: t2.c
-                           ->  Append
-                                 ->  Seq Scan on pagg_tab3_p2 t2_1
-                                 ->  Seq Scan on pagg_tab3_p1 t2_2
-(15 rows)
+                     ->  Append
+                           ->  Seq Scan on pagg_tab3_p2 t2_1
+                           ->  Seq Scan on pagg_tab3_p1 t2_2
+(13 rows)
 
 SELECT t1.c, count(t2.c) FROM pagg_tab3 t1 JOIN pagg_tab3 t2 ON t1.c = t2.c GROUP BY 1 ORDER BY t1.c COLLATE "C";
  c | count 
@@ -2691,6 +2687,45 @@ DROP TABLE pagg_tab6;
 RESET enable_partitionwise_aggregate;
 RESET max_parallel_workers_per_gather;
 RESET enable_incremental_sort;
+--
+-- Test for eager aggregation non-deterministic collation bug
+--
+CREATE TABLE eager_agg_t1 (id int, val text COLLATE case_insensitive);
+CREATE TABLE eager_agg_t2 (val text COLLATE case_insensitive);
+INSERT INTO eager_agg_t1 SELECT 1, 'a' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t1 SELECT 1, 'A' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t2 VALUES ('A');
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+-- Ensure that eager aggregation is not used for t1.val due to the
+-- non-deterministic collation.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, count(t1.val)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.val = t2.val COLLATE "C"
+GROUP BY t1.id;
+                       QUERY PLAN                       
+--------------------------------------------------------
+ HashAggregate
+   Group Key: t1.id
+   ->  Nested Loop
+         Join Filter: ((t1.val)::text = (t2.val)::text)
+         ->  Seq Scan on eager_agg_t2 t2
+         ->  Seq Scan on eager_agg_t1 t1
+(6 rows)
+
+-- Ensure it returns 1 row with count = 50
+SELECT t1.id, count(t1.val)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.val = t2.val COLLATE "C"
+GROUP BY t1.id;
+ id | count 
+----+-------
+  1 |    50
+(1 row)
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
 -- virtual generated columns
 CREATE TABLE t5 (
     a int,
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 8f0f973f5fa..0e6b76b11b8 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -990,6 +990,37 @@ RESET enable_partitionwise_aggregate;
 RESET max_parallel_workers_per_gather;
 RESET enable_incremental_sort;
 
+--
+-- Test for eager aggregation non-deterministic collation bug
+--
+
+CREATE TABLE eager_agg_t1 (id int, val text COLLATE case_insensitive);
+CREATE TABLE eager_agg_t2 (val text COLLATE case_insensitive);
+
+INSERT INTO eager_agg_t1 SELECT 1, 'a' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t1 SELECT 1, 'A' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t2 VALUES ('A');
+
+ANALYZE eager_agg_t1;
+ANALYZE eager_agg_t2;
+
+-- Ensure that eager aggregation is not used for t1.val due to the
+-- non-deterministic collation.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, count(t1.val)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.val = t2.val COLLATE "C"
+GROUP BY t1.id;
+
+-- Ensure it returns 1 row with count = 50
+SELECT t1.id, count(t1.val)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.val = t2.val COLLATE "C"
+GROUP BY t1.id;
+
+DROP TABLE eager_agg_t1;
+DROP TABLE eager_agg_t2;
+
 -- virtual generated columns
 CREATE TABLE t5 (
     a int,
-- 
2.39.5 (Apple Git-154)



  [application/octet-stream] v1-0002-Fix-volatile-function-evaluation-in-eager-aggrega.patch (3.2K, 3-v1-0002-Fix-volatile-function-evaluation-in-eager-aggrega.patch)
  download | inline diff:
From 0476ff98a83317642a16bca9a5b1eef97925dbd8 Mon Sep 17 00:00:00 2001
From: Richard Guo <[email protected]>
Date: Sat, 28 Mar 2026 16:52:37 +0900
Subject: [PATCH v1 2/2] Fix volatile function evaluation in eager aggregation

Pushing aggregates containing volatile functions below a join can
violate volatility semantics by changing the number of times the
function is executed.

Here we check the Aggref nodes in the targetlist and havingQual for
volatile functions and disable eager aggregation when such functions
are present.
---
 src/backend/optimizer/plan/initsplan.c        | 11 ++++++++++
 src/test/regress/expected/eager_aggregate.out | 20 +++++++++++++++++++
 src/test/regress/sql/eager_aggregate.sql      |  8 ++++++++
 3 files changed, 39 insertions(+)

diff --git a/src/backend/optimizer/plan/initsplan.c b/src/backend/optimizer/plan/initsplan.c
index b207b8d913b..96ee312ebdf 100644
--- a/src/backend/optimizer/plan/initsplan.c
+++ b/src/backend/optimizer/plan/initsplan.c
@@ -810,6 +810,17 @@ create_agg_clause_infos(PlannerInfo *root)
 		Assert(aggref->aggorder == NIL);
 		Assert(aggref->aggdistinct == NIL);
 
+		/*
+		 * We cannot push down aggregates that contain volatile functions.
+		 * Doing so would change the number of times the function is
+		 * evaluated.
+		 */
+		if (contain_volatile_functions((Node *) aggref))
+		{
+			eager_agg_applicable = false;
+			break;
+		}
+
 		/*
 		 * If there are any securityQuals, do not try to apply eager
 		 * aggregation if any non-leakproof aggregate functions are present.
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
index 5ac966186f7..d1b86be3a62 100644
--- a/src/test/regress/expected/eager_aggregate.out
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -428,6 +428,26 @@ GROUP BY t1.a ORDER BY t1.a;
 
 RESET geqo;
 RESET geqo_threshold;
+-- Ensure eager aggregation is not applied because random() is a volatile
+-- function
+EXPLAIN (COSTS OFF)
+SELECT t1.a, avg(t2.c + random())
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ GroupAggregate
+   Group Key: t1.a
+   ->  Sort
+         Sort Key: t1.a
+         ->  Hash Join
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on eager_agg_t2 t2
+               ->  Hash
+                     ->  Seq Scan on eager_agg_t1 t1
+(9 rows)
+
 DROP TABLE eager_agg_t1;
 DROP TABLE eager_agg_t2;
 DROP TABLE eager_agg_t3;
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
index abe6d6ae09f..97e10dd7cf4 100644
--- a/src/test/regress/sql/eager_aggregate.sql
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -163,6 +163,14 @@ GROUP BY t1.a ORDER BY t1.a;
 RESET geqo;
 RESET geqo_threshold;
 
+-- Ensure eager aggregation is not applied because random() is a volatile
+-- function
+EXPLAIN (COSTS OFF)
+SELECT t1.a, avg(t2.c + random())
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
 DROP TABLE eager_agg_t1;
 DROP TABLE eager_agg_t2;
 DROP TABLE eager_agg_t3;
-- 
2.39.5 (Apple Git-154)




* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 00:59                                                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 13:59                                                     ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
  2025-10-07 10:56                                                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-08 11:14                                                         ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
  2025-10-09 01:49                                                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-09 08:07                                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2026-03-30 03:17                                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2026-04-02 12:18                                                                 ` Matheus Alcantara <[email protected]>
  2026-04-06 03:06                                                                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Matheus Alcantara @ 2026-04-02 12:18 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; David Rowley <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Mon Mar 30, 2026 at 12:17 AM -03, Richard Guo wrote:
> On Thu, Oct 9, 2025 at 5:07 PM Richard Guo <[email protected]> wrote:
>> I noticed an unnecessary header include in initsplan.c.  Will fix that
>> as well.
>
> I noticed a couple of issues that can lead to unexpected results.
> I've attached two patches to fix them.
>
> 1. Eager aggregation was incorrectly checking the data type's default
> collation rather than the expression's actual collation.  This allowed
> columns with non-deterministic collations to be pushed down, resulting
> in incorrect grouping.  Fixed by 0001.
>
> 2. Pushing aggregates containing volatile functions below a join
> alters their execution count.  Fixed by 0002.
>
> (As briefly discussed on Discord, this non-deterministic collation
> issue also exists in our long-existing logic for pushing HAVING down
> to WHERE.  But let's fix it for the eager aggregation first.)
>

Hi Richard,

The patches look good to me and work as expected.  They seem very
straightforward, so I don't have any major comments.

I'm attaching some new tests that I added to the collate.icu.utf8 and
eager_aggregate regression tests during my review; feel free to include
any of them if they would be helpful, or none.

--
Matheus Alcantara
EDB: https://www.enterprisedb.com

diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index fbcdb7eb58c..a2dd8a34da4 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -2726,6 +2726,95 @@ GROUP BY t1.id;
 
 DROP TABLE eager_agg_t1;
 DROP TABLE eager_agg_t2;
+--
+-- Test for eager aggregation with multiple columns having different collations
+--
+CREATE TABLE eager_agg_t3 (
+    id int,
+    val1 text COLLATE case_insensitive,
+    val2 text COLLATE "C"
+);
+CREATE TABLE eager_agg_t4 (
+    val1 text COLLATE case_insensitive,
+    val2 text COLLATE "C"
+);
+INSERT INTO eager_agg_t3 SELECT 1, 'a', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t3 SELECT 1, 'A', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t4 VALUES ('A', 'x');
+ANALYZE eager_agg_t3;
+ANALYZE eager_agg_t4;
+-- Ensure that eager aggregation is not used when grouping by a column with
+-- non-deterministic collation, even when other grouping columns have
+-- deterministic collations.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val1, count(*)
+  FROM eager_agg_t3 t1
+  JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ HashAggregate
+   Group Key: t1.id, t1.val1
+   ->  Nested Loop
+         Join Filter: (((t1.val1)::text = (t2.val1)::text) AND (t1.val2 = t2.val2))
+         ->  Seq Scan on eager_agg_t4 t2
+         ->  Seq Scan on eager_agg_t3 t1
+(6 rows)
+
+-- Verify correct results (should return 1 row with count = 50)
+SELECT t1.id, t1.val1, count(*)
+  FROM eager_agg_t3 t1
+  JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+ id | val1 | count 
+----+------+-------
+  1 | A    |    50
+(1 row)
+
+DROP TABLE eager_agg_t3;
+DROP TABLE eager_agg_t4;
+--
+-- Test for eager aggregation with explicit COLLATE on grouping expression
+--
+CREATE TABLE eager_agg_t5 (id int, val text COLLATE "C");
+CREATE TABLE eager_agg_t6 (val text COLLATE "C");
+INSERT INTO eager_agg_t5 SELECT 1, 'a' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t5 SELECT 1, 'A' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t6 VALUES ('A');
+ANALYZE eager_agg_t5;
+ANALYZE eager_agg_t6;
+-- When grouping by an expression with explicit non-deterministic COLLATE,
+-- eager aggregation should not be used even if the column's native collation
+-- is deterministic.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+  FROM eager_agg_t5 t1
+  JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+                  QUERY PLAN                   
+-----------------------------------------------
+ HashAggregate
+   Group Key: t1.id, (t1.val)::text
+   ->  Hash Join
+         Hash Cond: (t1.val = t2.val)
+         ->  Seq Scan on eager_agg_t5 t1
+         ->  Hash
+               ->  Seq Scan on eager_agg_t6 t2
+(7 rows)
+
+-- Verify correct results (should return 1 row with count = 50, since the join
+-- on the "C"-collated val column matches only the 'A' rows)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+  FROM eager_agg_t5 t1
+  JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+ id | val | count 
+----+-----+-------
+  1 | A   |    50
+(1 row)
+
+DROP TABLE eager_agg_t5;
+DROP TABLE eager_agg_t6;
 -- virtual generated columns
 CREATE TABLE t5 (
     a int,
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
index d1b86be3a62..2bf983d12cb 100644
--- a/src/test/regress/expected/eager_aggregate.out
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -448,6 +448,26 @@ GROUP BY t1.a ORDER BY t1.a;
                      ->  Seq Scan on eager_agg_t1 t1
 (9 rows)
 
+-- Ensure eager aggregation is not applied when FILTER clause contains
+-- volatile function
+EXPLAIN (COSTS OFF)
+SELECT t1.a, avg(t2.c) FILTER (WHERE random() > 0.5)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ GroupAggregate
+   Group Key: t1.a
+   ->  Sort
+         Sort Key: t1.a
+         ->  Hash Join
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on eager_agg_t2 t2
+               ->  Hash
+                     ->  Seq Scan on eager_agg_t1 t1
+(9 rows)
+
 DROP TABLE eager_agg_t1;
 DROP TABLE eager_agg_t2;
 DROP TABLE eager_agg_t3;
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 0e6b76b11b8..93c22b37727 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -1021,6 +1021,76 @@ GROUP BY t1.id;
 DROP TABLE eager_agg_t1;
 DROP TABLE eager_agg_t2;
 
+--
+-- Test for eager aggregation with multiple columns having different collations
+--
+CREATE TABLE eager_agg_t3 (
+    id int,
+    val1 text COLLATE case_insensitive,
+    val2 text COLLATE "C"
+);
+CREATE TABLE eager_agg_t4 (
+    val1 text COLLATE case_insensitive,
+    val2 text COLLATE "C"
+);
+
+INSERT INTO eager_agg_t3 SELECT 1, 'a', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t3 SELECT 1, 'A', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t4 VALUES ('A', 'x');
+
+ANALYZE eager_agg_t3;
+ANALYZE eager_agg_t4;
+
+-- Ensure that eager aggregation is not used when grouping by a column with
+-- non-deterministic collation, even when other grouping columns have
+-- deterministic collations.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val1, count(*)
+  FROM eager_agg_t3 t1
+  JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+
+-- Verify correct results (should return 1 row with count = 50)
+SELECT t1.id, t1.val1, count(*)
+  FROM eager_agg_t3 t1
+  JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+
+DROP TABLE eager_agg_t3;
+DROP TABLE eager_agg_t4;
+
+--
+-- Test for eager aggregation with explicit COLLATE on grouping expression
+--
+CREATE TABLE eager_agg_t5 (id int, val text COLLATE "C");
+CREATE TABLE eager_agg_t6 (val text COLLATE "C");
+
+INSERT INTO eager_agg_t5 SELECT 1, 'a' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t5 SELECT 1, 'A' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t6 VALUES ('A');
+
+ANALYZE eager_agg_t5;
+ANALYZE eager_agg_t6;
+
+-- When grouping by an expression with explicit non-deterministic COLLATE,
+-- eager aggregation should not be used even if the column's native collation
+-- is deterministic.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+  FROM eager_agg_t5 t1
+  JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+
+-- Verify correct results (should return 1 row with count = 50, since the join
+-- on the "C"-collated val column matches only the 'A' rows)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+  FROM eager_agg_t5 t1
+  JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+
+DROP TABLE eager_agg_t5;
+DROP TABLE eager_agg_t6;
+
 -- virtual generated columns
 CREATE TABLE t5 (
     a int,
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
index 97e10dd7cf4..9c935ef0633 100644
--- a/src/test/regress/sql/eager_aggregate.sql
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -171,6 +171,14 @@ SELECT t1.a, avg(t2.c + random())
   JOIN eager_agg_t2 t2 ON t1.b = t2.b
 GROUP BY t1.a ORDER BY t1.a;
 
+-- Ensure eager aggregation is not applied when FILTER clause contains
+-- volatile function
+EXPLAIN (COSTS OFF)
+SELECT t1.a, avg(t2.c) FILTER (WHERE random() > 0.5)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
 DROP TABLE eager_agg_t1;
 DROP TABLE eager_agg_t2;
 DROP TABLE eager_agg_t3;


Attachments:

  [text/plain] more-tests.diff.nocfbot (8.0K, 2-more-tests.diff.nocfbot)
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index fbcdb7eb58c..a2dd8a34da4 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -2726,6 +2726,95 @@ GROUP BY t1.id;
 
 DROP TABLE eager_agg_t1;
 DROP TABLE eager_agg_t2;
+--
+-- Test for eager aggregation with multiple columns having different collations
+--
+CREATE TABLE eager_agg_t3 (
+    id int,
+    val1 text COLLATE case_insensitive,
+    val2 text COLLATE "C"
+);
+CREATE TABLE eager_agg_t4 (
+    val1 text COLLATE case_insensitive,
+    val2 text COLLATE "C"
+);
+INSERT INTO eager_agg_t3 SELECT 1, 'a', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t3 SELECT 1, 'A', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t4 VALUES ('A', 'x');
+ANALYZE eager_agg_t3;
+ANALYZE eager_agg_t4;
+-- Ensure that eager aggregation is not used when grouping by a column with
+-- non-deterministic collation, even when other grouping columns have
+-- deterministic collations.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val1, count(*)
+  FROM eager_agg_t3 t1
+  JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+                                     QUERY PLAN                                     
+------------------------------------------------------------------------------------
+ HashAggregate
+   Group Key: t1.id, t1.val1
+   ->  Nested Loop
+         Join Filter: (((t1.val1)::text = (t2.val1)::text) AND (t1.val2 = t2.val2))
+         ->  Seq Scan on eager_agg_t4 t2
+         ->  Seq Scan on eager_agg_t3 t1
+(6 rows)
+
+-- Verify correct results (should return 1 row with count = 50)
+SELECT t1.id, t1.val1, count(*)
+  FROM eager_agg_t3 t1
+  JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+ id | val1 | count 
+----+------+-------
+  1 | A    |    50
+(1 row)
+
+DROP TABLE eager_agg_t3;
+DROP TABLE eager_agg_t4;
+--
+-- Test for eager aggregation with explicit COLLATE on grouping expression
+--
+CREATE TABLE eager_agg_t5 (id int, val text COLLATE "C");
+CREATE TABLE eager_agg_t6 (val text COLLATE "C");
+INSERT INTO eager_agg_t5 SELECT 1, 'a' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t5 SELECT 1, 'A' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t6 VALUES ('A');
+ANALYZE eager_agg_t5;
+ANALYZE eager_agg_t6;
+-- When grouping by an expression with explicit non-deterministic COLLATE,
+-- eager aggregation should not be used even if the column's native collation
+-- is deterministic.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+  FROM eager_agg_t5 t1
+  JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+                  QUERY PLAN                   
+-----------------------------------------------
+ HashAggregate
+   Group Key: t1.id, (t1.val)::text
+   ->  Hash Join
+         Hash Cond: (t1.val = t2.val)
+         ->  Seq Scan on eager_agg_t5 t1
+         ->  Hash
+               ->  Seq Scan on eager_agg_t6 t2
+(7 rows)
+
+-- Verify correct results (should return 1 row with count = 50, since the join
+-- on the "C"-collated val column matches only the 'A' rows)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+  FROM eager_agg_t5 t1
+  JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+ id | val | count 
+----+-----+-------
+  1 | A   |    50
+(1 row)
+
+DROP TABLE eager_agg_t5;
+DROP TABLE eager_agg_t6;
 -- virtual generated columns
 CREATE TABLE t5 (
     a int,
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
index d1b86be3a62..2bf983d12cb 100644
--- a/src/test/regress/expected/eager_aggregate.out
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -448,6 +448,26 @@ GROUP BY t1.a ORDER BY t1.a;
                      ->  Seq Scan on eager_agg_t1 t1
 (9 rows)
 
+-- Ensure eager aggregation is not applied when FILTER clause contains
+-- volatile function
+EXPLAIN (COSTS OFF)
+SELECT t1.a, avg(t2.c) FILTER (WHERE random() > 0.5)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+                     QUERY PLAN                      
+-----------------------------------------------------
+ GroupAggregate
+   Group Key: t1.a
+   ->  Sort
+         Sort Key: t1.a
+         ->  Hash Join
+               Hash Cond: (t2.b = t1.b)
+               ->  Seq Scan on eager_agg_t2 t2
+               ->  Hash
+                     ->  Seq Scan on eager_agg_t1 t1
+(9 rows)
+
 DROP TABLE eager_agg_t1;
 DROP TABLE eager_agg_t2;
 DROP TABLE eager_agg_t3;
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 0e6b76b11b8..93c22b37727 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -1021,6 +1021,76 @@ GROUP BY t1.id;
 DROP TABLE eager_agg_t1;
 DROP TABLE eager_agg_t2;
 
+--
+-- Test for eager aggregation with multiple columns having different collations
+--
+CREATE TABLE eager_agg_t3 (
+    id int,
+    val1 text COLLATE case_insensitive,
+    val2 text COLLATE "C"
+);
+CREATE TABLE eager_agg_t4 (
+    val1 text COLLATE case_insensitive,
+    val2 text COLLATE "C"
+);
+
+INSERT INTO eager_agg_t3 SELECT 1, 'a', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t3 SELECT 1, 'A', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t4 VALUES ('A', 'x');
+
+ANALYZE eager_agg_t3;
+ANALYZE eager_agg_t4;
+
+-- Ensure that eager aggregation is not used when grouping by a column with
+-- non-deterministic collation, even when other grouping columns have
+-- deterministic collations.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val1, count(*)
+  FROM eager_agg_t3 t1
+  JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+
+-- Verify correct results (should return 1 row with count = 50)
+SELECT t1.id, t1.val1, count(*)
+  FROM eager_agg_t3 t1
+  JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+
+DROP TABLE eager_agg_t3;
+DROP TABLE eager_agg_t4;
+
+--
+-- Test for eager aggregation with explicit COLLATE on grouping expression
+--
+CREATE TABLE eager_agg_t5 (id int, val text COLLATE "C");
+CREATE TABLE eager_agg_t6 (val text COLLATE "C");
+
+INSERT INTO eager_agg_t5 SELECT 1, 'a' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t5 SELECT 1, 'A' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t6 VALUES ('A');
+
+ANALYZE eager_agg_t5;
+ANALYZE eager_agg_t6;
+
+-- When grouping by an expression with explicit non-deterministic COLLATE,
+-- eager aggregation should not be used even if the column's native collation
+-- is deterministic.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+  FROM eager_agg_t5 t1
+  JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+
+-- Verify correct results (should return 1 row with count = 50, since the join
+-- on the "C"-collated val column matches only the 'A' rows)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+  FROM eager_agg_t5 t1
+  JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+
+DROP TABLE eager_agg_t5;
+DROP TABLE eager_agg_t6;
+
 -- virtual generated columns
 CREATE TABLE t5 (
     a int,
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
index 97e10dd7cf4..9c935ef0633 100644
--- a/src/test/regress/sql/eager_aggregate.sql
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -171,6 +171,14 @@ SELECT t1.a, avg(t2.c + random())
   JOIN eager_agg_t2 t2 ON t1.b = t2.b
 GROUP BY t1.a ORDER BY t1.a;
 
+-- Ensure eager aggregation is not applied when FILTER clause contains
+-- volatile function
+EXPLAIN (COSTS OFF)
+SELECT t1.a, avg(t2.c) FILTER (WHERE random() > 0.5)
+  FROM eager_agg_t1 t1
+  JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
 DROP TABLE eager_agg_t1;
 DROP TABLE eager_agg_t2;
 DROP TABLE eager_agg_t3;



* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 00:59                                                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 13:59                                                     ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
  2025-10-07 10:56                                                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-08 11:14                                                         ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
  2025-10-09 01:49                                                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-09 08:07                                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2026-03-30 03:17                                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2026-04-02 12:18                                                                 ` Re: Eager aggregation, take 3 Matheus Alcantara <[email protected]>
@ 2026-04-06 03:06                                                                   ` Richard Guo <[email protected]>
  0 siblings, 0 replies; 70+ messages in thread

From: Richard Guo @ 2026-04-06 03:06 UTC (permalink / raw)
  To: Matheus Alcantara <[email protected]>; +Cc: David Rowley <[email protected]>; Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Thu, Apr 2, 2026 at 9:18 PM Matheus Alcantara
<[email protected]> wrote:
> The patches look good to me and are working as expected. It seems very
> straightforward, so I don't have any major comments.
>
> I'm attaching some new tests that I've added to collate.icu.utf8 and
> eager_aggregate regression tests during my review, feel free to include
> any of them if they could be helpful, or none.

Thanks for the review.  I have added two of your test cases and
committed the patches.

- Richard






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 00:59                                                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 13:59                                                     ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
  2025-10-07 10:56                                                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-10-08 14:45                                                         ` Robert Haas <[email protected]>
  2025-10-09 01:51                                                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Robert Haas @ 2025-10-08 14:45 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: David Rowley <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Tue, Oct 7, 2025 at 6:57 AM Richard Guo <[email protected]> wrote:
> > 10. I don't think this comment quite makes sense:
> >
> >  * "apply_at" tracks the lowest join level at which partial aggregation is
> >  * applied.
> >
> > maybe "minimum set of rels to join before partial aggregation can be applied"?
>
> I've updated the comment for apply_at to clarify that it refers to the
> relids at which partial aggregation is applied.
>
> I've also updated the comments within RelAggInfo to use one-line
> style.
>
> I retained the name of this field though.

For what it's worth, I also don't like that field name. I'm not sure
what to propose instead, but I don't think apply_at is very clear.

-- 
Robert Haas
EDB: http://www.enterprisedb.com






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 00:59                                                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 13:59                                                     ` Re: Eager aggregation, take 3 David Rowley <[email protected]>
  2025-10-07 10:56                                                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-08 14:45                                                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
@ 2025-10-09 01:51                                                           ` Richard Guo <[email protected]>
  0 siblings, 0 replies; 70+ messages in thread

From: Richard Guo @ 2025-10-09 01:51 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: David Rowley <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Wed, Oct 8, 2025 at 11:45 PM Robert Haas <[email protected]> wrote:
> On Tue, Oct 7, 2025 at 6:57 AM Richard Guo <[email protected]> wrote:
> > I retained the name of this field though.

> For what it's worth, I also don't like that field name. I'm not sure
> what to propose instead, but I don't think apply_at is very clear.

This field represents the set of relids at which partial aggregation
is applied.  So how about naming it partial_agg_designated_relids?
That feels a bit verbose, though.  How about partial_agg_relids or,
for brevity, agg_relids instead?

- Richard






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 00:59                                                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-10-09 01:48                                                     ` Richard Guo <[email protected]>
  2025-10-09 05:09                                                       ` Re: Eager aggregation, take 3 Antonin Houska <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Richard Guo @ 2025-10-09 01:48 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Mon, Oct 6, 2025 at 9:59 AM Richard Guo <[email protected]> wrote:
> On Mon, Sep 29, 2025 at 11:09 AM Richard Guo <[email protected]> wrote:
> > FWIW, I plan to do another self-review of this patch soon, with the
> > goal of assessing whether it's ready to be pushed.  If anyone has any
> > concerns about any part of the patch or would like to review it, I
> > would greatly appreciate hearing from you.

> Barring any objections, I'll plan to push v23 in a couple of days.

I've pushed v24 -- thanks for all the reviews!  Now bracing for the
upcoming bug reports.

- Richard






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 00:59                                                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-09 01:48                                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-10-09 05:09                                                       ` Antonin Houska <[email protected]>
  2025-10-09 07:01                                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  0 siblings, 1 reply; 70+ messages in thread

From: Antonin Houska @ 2025-10-09 05:09 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

Richard Guo <[email protected]> wrote:

> On Mon, Oct 6, 2025 at 9:59 AM Richard Guo <[email protected]> wrote:
> > On Mon, Sep 29, 2025 at 11:09 AM Richard Guo <[email protected]> wrote:
> > > FWIW, I plan to do another self-review of this patch soon, with the
> > > goal of assessing whether it's ready to be pushed.  If anyone has any
> > > concerns about any part of the patch or would like to review it, I
> > > would greatly appreciate hearing from you.
> 
> > Barring any objections, I'll plan to push v23 in a couple of days.
> 
> I've pushed v24 -- thanks for all the reviews!  Now bracing for the
> upcoming bug reports.

Thanks for finishing this! The lack of feedback I encountered earlier made me
so frustrated that I could not find motivation to collaborate with you. I'm
happy now that my effort did not get wasted.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:12                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-09 09:20                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-09 14:20                                       ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-12 09:34                                         ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-12 18:47                                           ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-09-13 08:27                                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-25 04:23                                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-29 02:09                                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-06 00:59                                                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-09 01:48                                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-09 05:09                                                       ` Re: Eager aggregation, take 3 Antonin Houska <[email protected]>
@ 2025-10-09 07:01                                                         ` Richard Guo <[email protected]>
  0 siblings, 0 replies; 70+ messages in thread

From: Richard Guo @ 2025-10-09 07:01 UTC (permalink / raw)
  To: Antonin Houska <[email protected]>; +Cc: Robert Haas <[email protected]>; Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Thu, Oct 9, 2025 at 2:09 PM Antonin Houska <[email protected]> wrote:
> Richard Guo <[email protected]> wrote:
> > I've pushed v24 -- thanks for all the reviews!  Now bracing for the
> > upcoming bug reports.

> Thanks for finishing this! The lack of feedback I encountered earlier made me
> so frustrated that I could not find motivation to collaborate with you. I'm
> happy now that my effort did not get wasted.

Your efforts in the earlier versions were very important for getting
this feature done.  Thank you for your work.

- Richard

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-09-05 14:50                                   ` Robert Haas <[email protected]>
  2025-09-09 11:18                                     ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2 siblings, 1 reply; 70+ messages in thread

From: Robert Haas @ 2025-09-05 14:50 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Wed, Aug 6, 2025 at 3:52 AM Richard Guo <[email protected]> wrote:
> Looking at TPC-DS queries 4 and 11, a threshold of 10 is the minimum
> needed to consider eager aggregation for them.  The resulting plans
> show nice performance improvements without any measurable increase in
> planning time.  So, I'm inclined to lower the threshold to 10 for now.
> (Wondering whether we should make this threshold a GUC, so users can
> adjust it based on their needs.)

Like Matheus, I think a GUC is reasonable. A significant danger here
appears to be the possibility of a performance cliff, where queries
are optimized very differently when the ratio is 9.99 vs. 10.01, say. It
would be nice if there were some way to mitigate that danger, but at
least a GUC avoids chaining the performance of the whole system to a
hard-coded value.

It might be worth considering whether there are heuristics other than
the group size that could help here. Possibly that's just making
things more complicated to no benefit. It seems to me, for example,
that reducing 100 rows to 10 is quite different from reducing a
million rows to 100,000. On the whole, the latter seems more likely to
work out well, but it's tricky, because the effort expended per group
can be arbitrarily high. I think we do want to let the cost model make
most of the decisions, and just use this threshold to prune ideas that
are obviously bad at an early stage. That said, it's worth thinking
about how this interacts with the just-considered-one-eager-agg
strategy. Does this threshold apply before or after that rule?

For instance, consider AGG(FACT_TABLE JOIN DIMENSION_TABLE), like a
count of orders grouped by customer name. Aggregating on the dimension
table (in this case, the list of customers) is probably useless, but
aggregating on the join column of the fact table has a good chance of
being useful. If we consider only one of those strategies, we want it
to be the right one. This threshold could be the thing that helps us
to get it right.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-06-26 02:01                             ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-07-24 03:21                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-08-06 07:52                                 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 14:50                                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
@ 2025-09-09 11:18                                     ` Richard Guo <[email protected]>
  0 siblings, 0 replies; 70+ messages in thread

From: Richard Guo @ 2025-09-09 11:18 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Fri, Sep 5, 2025 at 11:50 PM Robert Haas <[email protected]> wrote:
> Like Matheus, I think a GUC is reasonable. A significant danger here
> appears to be the possibility of a performance cliff, where queries
> are optimized very differently when the ratio is 9.99 vs. 10.01, say. It
> would be nice if there were some way to mitigate that danger, but at
> least a GUC avoids chaining the performance of the whole system to a
> hard-coded value.

Yeah, I think the performance cliff issue does exist.  It might be
mitigated by carefully selecting the threshold value to ensure that
small differences in the average group size near the boundary don't
cause big performance swings with and without eager aggregation, but
this doesn't seem like an easy task.
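
To make the cliff concrete, here is a minimal sketch of a threshold test
on average group size.  The function name, the parameter names, and the
use of >= at the boundary are assumptions for illustration only, not the
patch's actual code.

```python
def passes_group_size_threshold(input_rows: float,
                                estimated_groups: float,
                                min_avg_group_size: float = 10.0) -> bool:
    """Hypothetical check: is the average group size large enough?

    Average group size is taken as input_rows / estimated_groups.
    The default of 10 mirrors the value discussed in this thread.
    """
    if estimated_groups <= 0:
        return False
    return input_rows / estimated_groups >= min_avg_group_size


# Two nearly identical estimates land on opposite sides of the cutoff.
below = passes_group_size_threshold(999, 100)    # ratio 9.99
above = passes_group_size_threshold(1001, 100)   # ratio 10.01

# The ratio alone cannot tell these two cases apart.
small = passes_group_size_threshold(100, 10)
large = passes_group_size_threshold(1_000_000, 100_000)
```

A test of this shape treats "100 rows to 10" and "a million rows to
100,000" identically, so any distinction between them would have to come
from the cost model or from an additional heuristic.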

How is this issue avoided in other thresholds?  For example, with
min_parallel_table_scan_size, is there a performance cliff when the
table size is 7.99MB vs. 8.01MB, where a parallel scan is considered
in the latter case but not the former?

> It might be worth considering whether there are heuristics other than
> the group size that could help here. Possibly that's just making
> things more complicated to no benefit. It seems to me, for example,
> that reducing 100 rows to 10 is quite different from reducing a
> million rows to 100,000. On the whole, the latter seems more likely to
> work out well, but it's tricky, because the effort expended per group
> can be arbitrarily high. I think we do want to let the cost model make
> most of the decisions, and just use this threshold to prune ideas that
> are obviously bad at an early stage. That said, it's worth thinking
> about how this interacts with the just-considered-one-eager-agg
> strategy. Does this threshold apply before or after that rule?

If I understand correctly, this means that we need to explore each
join level to find the optimal position for applying partial
aggregation.  For example, if Agg(B) reduces 100 rows to 10 and
Agg(A JOIN B) reduces a million rows to 100,000, it might be better to
apply partial aggregation at the (A JOIN B) level rather than just
over B.  However, that's not always the case: the Agg(B) option can
reduce the number of input rows to the join earlier, potentially
outperforming the Agg(A JOIN B) approach.  Therefore, we need to
consider both options and compare their costs.

This is actually what the patch used to do before I introduced the
always-push-to-lowest heuristic.

> For instance, consider AGG(FACT_TABLE JOIN DIMENSION_TABLE), like a
> count of orders grouped by customer name. Aggregating on the dimension
> table (in this case, the list of customers) is probably useless, but
> aggregating on the join column of the fact table has a good chance of
> being useful. If we consider only one of those strategies, we want it
> to be the right one. This threshold could be the thing that helps us
> to get it right.

Now I see what you meant.  However, in the current implementation, we
only push partial aggregation down to relations that contain all the
aggregation columns.  So, in the case you mentioned, if the
aggregation columns come from the dimension table, unfortunately, we
don't have the option to partially aggregate the fact table.

The paper does discuss several other transformations, such as "Eager
Count", "Double Eager", and "Eager Split", that can perform partial
aggregation on relations that don't contain aggregation columns, or
even on both sides of the join.  However, those are beyond the scope
of this patch.

- Richard

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-09-05 13:09                             ` Robert Haas <[email protected]>
  2025-09-09 09:07                               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Robert Haas @ 2025-09-05 13:09 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

Sorry for the slow response.

On Fri, Jun 13, 2025 at 3:42 AM Richard Guo <[email protected]> wrote:
> The transformation of eager aggregation is:
>
>     GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
>     =
>     GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1)
> JOIN R2 ON J)
>
> This equivalence holds under the following conditions:
>
> 1) AGG is decomposable, meaning that it can be computed in two stages:
> a partial aggregation followed by a final aggregation;
> 2) The set G1 used in the pre-aggregation of R1 includes:
>     * all columns from R1 that are part of the grouping keys G, and
>     * all columns from R1 that appear in the join condition J.
> 3) The grouping operator for any column in G1 must be compatible with
> the operator used for that column in the join condition J.

This proof seems to ignore join-order constraints. I'm not sure to
what degree that influences the ultimate outcome here, but given A
LEFT JOIN (B INNER JOIN C), we cannot simply decide that A and C
comprise R1 and B comprises R2, because it is not actually possible to
do the A-C join first and treat the result as a relation to be joined
to B. That said, I do very much like the explicit enumeration of
criteria that must be met for the optimization to be valid. That makes
it a lot easier to evaluate whether the theory of the patch is
correct.

> To address these concerns, I'm thinking that maybe we can adopt a
> strategy where partial aggregation is only pushed to the lowest
> possible level in the join tree that is deemed useful.  In other
> words, if we can build a grouped path like "AGG(B) JOIN A" -- and
> AGG(B) yields a significant reduction in row count -- we skip
> exploring alternatives like "AGG(A JOIN B)".

I really like this idea. I believe we need some heuristic here and
this seems like a reasonable one. I think there could be a better one,
potentially. For instance, it would be reasonable (in my opinion) to
do some kind of evaluation of AGG(A JOIN B) vs. AGG(B) JOIN A that
does not involve performing full path generation for both cases; e.g.
one could try to decide considering only row counts.
However, I'm not saying that would work better than your proposal
here, or that it should be a requirement for this to be committed;
it's just an idea. IMHO, the requirement to have something committable
is that there is SOME heuristic limiting the search space and at the
same time the patch can still be demonstrated to give SOME benefit. I
think what you propose here meets those criteria. I also like the fact
that it's simple and easy to understand. If it does go wrong, it will
not be too difficult for someone to understand why it has gone wrong,
which is very desirable.
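
As a rough illustration of the push-to-lowest idea, the sketch below
walks candidate aggregation points from the lowest join level upward and
stops at the first one judged useful.  The flat candidate list, the
function name, and the usefulness predicate are all hypothetical; the
real planner works on RelOptInfos and join-order constraints, which this
toy ignores.

```python
from typing import Callable, Optional, Sequence, Tuple

# (label, estimated input rows, estimated number of groups)
Candidate = Tuple[str, float, float]


def lowest_useful_level(candidates: Sequence[Candidate],
                        is_useful: Callable[[float, float], bool]
                        ) -> Optional[str]:
    """Return the first (lowest) level where partial aggregation is useful.

    `candidates` must be ordered from the lowest join level to the highest.
    Higher levels are not examined once a useful one is found.
    """
    for label, rows, groups in candidates:
        if is_useful(rows, groups):
            return label
    return None


def reduces_rows_tenfold(rows: float, groups: float) -> bool:
    # Assumed stand-in for "yields a significant reduction in row count".
    return groups > 0 and rows / groups >= 10.0


picked_low = lowest_useful_level(
    [("B", 1000.0, 10.0), ("A JOIN B", 1_000_000.0, 100_000.0)],
    reduces_rows_tenfold)

picked_high = lowest_useful_level(
    [("B", 100.0, 50.0), ("A JOIN B", 1_000_000.0, 100_000.0)],
    reduces_rows_tenfold)
```

In the first call AGG(B) already qualifies, so AGG(A JOIN B) is never
considered; in the second, B does not qualify and the search moves up.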

> I think this heuristic serves as a good starting point, and we can
> look into extending it with more advanced strategies as the feature
> evolves.

So IOW, +1 to what you say here.

-- 
Robert Haas
EDB: http://www.enterprisedb.com

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-24 20:53                         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-06-13 07:41                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-09-05 13:09                             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
@ 2025-09-09 09:07                               ` Richard Guo <[email protected]>
  0 siblings, 0 replies; 70+ messages in thread

From: Richard Guo @ 2025-09-09 09:07 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Tom Lane <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]

On Fri, Sep 5, 2025 at 10:10 PM Robert Haas <[email protected]> wrote:
> On Fri, Jun 13, 2025 at 3:42 AM Richard Guo <[email protected]> wrote:
> > The transformation of eager aggregation is:
> >
> >     GROUP BY G, AGG(A) on (R1 JOIN R2 ON J)
> >     =
> >     GROUP BY G, AGG(agg_A) on ((GROUP BY G1, AGG(A) AS agg_A on R1)
> > JOIN R2 ON J)
> >
> > This equivalence holds under the following conditions:
> >
> > 1) AGG is decomposable, meaning that it can be computed in two stages:
> > a partial aggregation followed by a final aggregation;
> > 2) The set G1 used in the pre-aggregation of R1 includes:
> >     * all columns from R1 that are part of the grouping keys G, and
> >     * all columns from R1 that appear in the join condition J.
> > 3) The grouping operator for any column in G1 must be compatible with
> > the operator used for that column in the join condition J.

> This proof seems to ignore join-order constraints. I'm not sure to
> what degree that influences the ultimate outcome here, but given A
> LEFT JOIN (B INNER JOIN C), we cannot simply decide that A and C
> comprise R1 and B comprises R2, because it is not actually possible to
> do the A-C join first and treat the result as a relation to be joined
> to B. That said, I do very much like the explicit enumeration of
> criteria that must be met for the optimization to be valid. That makes
> it a lot easier to evaluate whether the theory of the patch is
> correct.

Thanks for pointing this out.  I should have clarified that the proof
is intended for the inner join case.  My plan was to first establish
the correctness for inner joins, and then extend the proof to cover
outer joins, but I failed to make that clear.
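
For the inner join case, the equivalence can be sanity-checked on toy
data.  The sketch below is only an illustration with made-up relations
R1(j, g, a) and R2(j, h), uses SUM as the decomposable aggregate, and
takes G1 = (j, g), i.e. R1's join column plus R1's grouping column.  It
is not a proof and not the patch's code.

```python
from collections import defaultdict

# Hypothetical toy relations: R1(j, g, a) and R2(j, h).
R1 = [(1, "x", 10), (1, "x", 5), (1, "y", 7), (2, "x", 3), (2, "x", 4)]
R2 = [(1, "p"), (1, "q"), (2, "p")]


def agg_after_join(r1, r2):
    """GROUP BY (g, h), SUM(a) on (R1 JOIN R2 ON R1.j = R2.j)."""
    out = defaultdict(int)
    for j1, g, a in r1:
        for j2, h in r2:
            if j1 == j2:
                out[(g, h)] += a
    return dict(out)


def eager_agg(r1, r2):
    """Partially aggregate R1 on G1 = (j, g), join to R2, then finalize."""
    partial = defaultdict(int)
    for j, g, a in r1:
        partial[(j, g)] += a          # partial aggregation over R1

    out = defaultdict(int)
    for (j1, g), agg_a in partial.items():
        for j2, h in r2:
            if j1 == j2:
                out[(g, h)] += agg_a  # final aggregation over the join
    return dict(out)


late = agg_after_join(R1, R2)
eager = eager_agg(R1, R2)
```

Both strategies return the same groups and sums here, while the eager
variant feeds three partially aggregated rows into the join instead of
five raw rows.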

In the case where there are any outer joins, the situation becomes
more complex due to join order constraints and the semantics of
null-extension in outer joins.  If the relations that contain at least
one aggregation column cannot be treated as a single relation because
of the join order constraints, partial aggregation paths will not be
generated, and thus the transformation is not applicable.

Otherwise, to preserve correctness, we need to add an additional
condition: R1 must not be on the nullable side of any outer join.
This ensures that partial aggregation over R1 does not suppress any
null-extended rows that would be introduced by outer joins.

I'll update the proof in README to cover the outer join case.

> > To address these concerns, I'm thinking that maybe we can adopt a
> > strategy where partial aggregation is only pushed to the lowest
> > possible level in the join tree that is deemed useful.  In other
> > words, if we can build a grouped path like "AGG(B) JOIN A" -- and
> > AGG(B) yields a significant reduction in row count -- we skip
> > exploring alternatives like "AGG(A JOIN B)".

> I really like this idea. I believe we need some heuristic here and
> this seems like a reasonable one. I think there could be a better one,
> potentially. For instance, it would be reasonable (in my opinion) to
> do some kind of evaluation of AGG(A JOIN B) vs. AGG(B) JOIN A that
> does not involve performing full path generation for both cases; e.g.
> one could try to decide considering only row counts.
> However, I'm not saying that would work better than your proposal
> here, or that it should be a requirement for this to be committed;
> it's just an idea. IMHO, the requirement to have something committable
> is that there is SOME heuristic limiting the search space and at the
> same time the patch can still be demonstrated to give SOME benefit. I
> think what you propose here meets those criteria. I also like the fact
> that it's simple and easy to understand. If it does go wrong, it will
> not be too difficult for someone to understand why it has gone wrong,
> which is very desirable.

> > I think this heuristic serves as a good starting point, and we can
> > look into extending it with more advanced strategies as the feature
> > evolves.

> So IOW, +1 to what you say here.

Thanks for liking this idea.  Another way this heuristic makes life
easier is that it ensures all grouped paths for the same grouped
relation produce the same set of rows.  This means we don't need all
the hacks for comparing costs between grouped paths, nor do we have to
resolve disputes about how many RelOptInfos to create for a single
grouped relation.  I'd prefer to keep this property for now and
explore more complex heuristics in the future.

- Richard

* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-10-09 02:13                         ` Tom Lane <[email protected]>
  2025-10-09 03:10                           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 1 reply; 70+ messages in thread

From: Tom Lane @ 2025-10-09 02:13 UTC (permalink / raw)
  To: Richard Guo <[email protected]>; +Cc: Robert Haas <[email protected]>; David Rowley <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

Richard Guo <[email protected]> writes:
> On Wed, Oct 8, 2025 at 11:45 PM Robert Haas <[email protected]> wrote:
>> For what it's worth, I also don't like that field name. I'm not sure
>> what to propose instead, but I don't think apply_at is very clear.

> This field represents the set of relids at which partial aggregation
> is applied.  So how about naming it partial_agg_designated_relids?
> That feels a bit verbose, though.  How about partial_agg_relids or,
> for brevity, agg_relids instead?

I might be missing a subtlety here, but how about
"apply_aggregation_at" or "apply_partial_agg_at"?

I don't think including "relids" in the field name adds anything,
given the field's declared type and comments.

			regards, tom lane






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 16:28                 ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 08:33                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-21 16:36                     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-22 06:48                       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-10-09 02:13                         ` Re: Eager aggregation, take 3 Tom Lane <[email protected]>
@ 2025-10-09 03:10                           ` Richard Guo <[email protected]>
  0 siblings, 0 replies; 70+ messages in thread

From: Richard Guo @ 2025-10-09 03:10 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Robert Haas <[email protected]>; David Rowley <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; pgsql-hackers; [email protected]; Matheus Alcantara <[email protected]>

On Thu, Oct 9, 2025 at 11:13 AM Tom Lane <[email protected]> wrote:
> Richard Guo <[email protected]> writes:
> > On Wed, Oct 8, 2025 at 11:45 PM Robert Haas <[email protected]> wrote:
> >> For what it's worth, I also don't like that field name. I'm not sure
> >> what to propose instead, but I don't think apply_at is very clear.

> > This field represents the set of relids at which partial aggregation
> > is applied.  So how about naming it partial_agg_designated_relids?
> > That feels a bit verbose, though.  How about partial_agg_relids or,
> > for brevity, agg_relids instead?

> I might be missing a subtlety here, but how about
> "apply_aggregation_at" or "apply_partial_agg_at"?
>
> I don't think including "relids" in the field name adds anything,
> given the field's declared type and comments.

Fair point.

'agg' seems better to me than 'aggregation' when used in a name: it's
shorter, and it's unlikely anyone would interpret it as anything other
than "aggregation".

I kind of wonder whether we need to include 'partial' in the name.
Given the context, it seems very clear that we're referring to
partial aggregation rather than final aggregation.

So I'm weighing between "apply_partial_agg_at" and "apply_agg_at".

- Richard






* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
@ 2025-01-20 17:57                 ` Tom Lane <[email protected]>
  2025-01-20 18:39                   ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-21 14:13                   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  1 sibling, 2 replies; 70+ messages in thread

From: Tom Lane @ 2025-01-20 17:57 UTC (permalink / raw)
  To: Robert Haas <[email protected]>; +Cc: Richard Guo <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; [email protected]

Robert Haas <[email protected]> writes:
> So I don't quite know which way to jump here. It now seems to me that
> we have three similar features with three different designs.
> Parameterization added non-comparable paths to the same path list;
> parallel query added them to a different path list in the same
> RelOptInfo; and this patch currently adds them to a separate RelOptInfo.

Yeah, this.  I don't think that either of those first two decisions
was wrong, but it does seem annoying that this patch wants to do it
yet a third way.  Still, it may be the right thing.  Bear with me a
moment:

We dealt with parameterized paths being in the same list as
non-parameterized paths by treating the set of parameter rels as a
figure-of-merit that add_path can compare.  This works because if,
say, a nonparameterized path dominates a parameterized one on every
other figure of merit then there's no point in keeping the
parameterized one.  It is squirrely that the parameterized paths
typically don't yield the same number of rows as others for the same
RelOptInfo, but at least so far that hasn't broken anything.  I think
it's important that the parameterized paths do yield the same column
set as other paths for the rel; and the rows they do yield will be a
subset of the rows that nonparameterized paths yield.
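
The dominance rule described above can be sketched in a few lines. This
is a toy model, not PostgreSQL's add_path() (which is C and also weighs
startup cost, pathkeys, row counts and parallel safety). ToyPath, its
fields, dominates() and toy_add_path() are hypothetical names invented
for illustration; the only thing modeled is that a path which is no more
expensive and requires a subset of the other's outer rels makes the
other one pointless to keep.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPath:
    # Hypothetical stand-in for a planner Path; only two figures of merit.
    name: str
    total_cost: float
    required_outer: frozenset = frozenset()  # rels the path is parameterized by


def dominates(a, b):
    """True if there is no point keeping b once a is available."""
    return a.total_cost <= b.total_cost and a.required_outer <= b.required_outer


def toy_add_path(pathlist, new_path):
    # Reject the new path if an existing one dominates it.
    if any(dominates(old, new_path) for old in pathlist):
        return pathlist
    # Otherwise evict every existing path that the new one dominates.
    return [p for p in pathlist if not dominates(new_path, p)] + [new_path]


paths = []
paths = toy_add_path(paths, ToyPath("param_idxscan", 5.0, frozenset({"o"})))
paths = toy_add_path(paths, ToyPath("seqscan", 100.0))
survivors_before = [p.name for p in paths]

# A cheap unparameterized path beats both on every modeled figure of merit.
paths = toy_add_path(paths, ToyPath("cheap_unparam", 4.0))
survivors_after = [p.name for p in paths]
```

In this sketch the cheap parameterized path and the expensive
unparameterized one coexist, because neither wins on both criteria; once
an unparameterized path that is also cheaper shows up, both are evicted.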

On the other hand, it's not sensible for partial paths to compete
in an add_path tournament with non-partial ones.  If they did, neither
group could be allowed to dominate the other group, so add_path would
just be wasting its time making those path comparisons.  So I do think
it was right to put them in a separate path list.  Importantly, they
generate the same column set and some subset of the same rows that
the non-partial ones do, which I think is what justifies putting
them into the same RelOptInfo.

However, a partial-aggregation path does not generate the same data
as an unaggregated path, no matter how fuzzy you are willing to be
about the concept.  So I'm having a very hard time accepting that
it ought to be part of the same RelOptInfo, and thus I don't really
buy that annotating paths with a GroupPathInfo is the way forward.

What this line of analysis doesn't tell us though is whether paths
that did their partial aggregations at different join levels can be
considered as enough alike that they should compete on cost terms.
If they are, we need to put them into the same RelOptInfo.  So
while I want to have separate RelOptInfos for partially aggregated
paths, I'm unclear on how many of those we need or what their
identifying property is.

Also: we avoid generating parameterized partial paths, because
combining those things would be too much of a mess.  There's some
handwaving in the comments for add_partial_path to the effect that
it wouldn't be a win anyway, but I think the real reason is that
it'd be far too complicated for the potential value.  Can we make
a similar argument for partial aggregation?  I sure hope so.

> I agree that creating an exponential number of RelOptInfos is not
> going to work out well.

FWIW, I'm way more concerned about the number of Paths considered
than I am about the number of RelOptInfos.  This relates to your
question about whether we want to use some heuristics to limit
the planner's search space.

			regards, tom lane







* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 17:57                 ` Re: Eager aggregation, take 3 Tom Lane <[email protected]>
@ 2025-01-20 18:39                   ` Robert Haas <[email protected]>
  1 sibling, 0 replies; 70+ messages in thread

From: Robert Haas @ 2025-01-20 18:39 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Richard Guo <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; [email protected]

On Mon, Jan 20, 2025 at 12:57 PM Tom Lane <[email protected]> wrote:
> However, a partial-aggregation path does not generate the same data
> as an unaggregated path, no matter how fuzzy you are willing to be
> about the concept.  So I'm having a very hard time accepting that
> it ought to be part of the same RelOptInfo, and thus I don't really
> buy that annotating paths with a GroupPathInfo is the way forward.

Seems like a fair argument. I'm not sure it's dispositive if practical
considerations merited the opposite treatment, but that doesn't seem
to be the case.

> What this line of analysis doesn't tell us though is whether paths
> that did their partial aggregations at different join levels can be
> considered as enough alike that they should compete on cost terms.
> If they are, we need to put them into the same RelOptInfo.  So
> while I want to have separate RelOptInfos for partially aggregated
> paths, I'm unclear on how many of those we need or what their
> identifying property is.
>
> Also: we avoid generating parameterized partial paths, because
> combining those things would be too much of a mess.  There's some
> handwaving in the comments for add_partial_path to the effect that
> it wouldn't be a win anyway, but I think the real reason is that
> it'd be far too complicated for the potential value.  Can we make
> a similar argument for partial aggregation?  I sure hope so.

I think your hopes will be dashed in this instance. Suppose we have:

SELECT m.mapped_value, SUM(g.summable_quantity)
FROM mapping_table m JOIN gigantic_dataset g
ON m.raw_value = g.raw_value GROUP BY 1;

If the mapping_table is small, we can do something like this:

FinalizeAggregate
-> Gather
  -> PartialAggregate
    -> Hash Join

But if mapping_table is big but relatively few of the keys appear as
raw values in gigantic_dataset, being able to do the PartialAggregate
before the join would be rather nice; and you wouldn't want to give up
on parallel query in such a case.
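
As a rough illustration of why aggregating before the join is legitimate
for the query above, here is a toy Python sketch. It is not planner
code: the table contents, the function names, and the assumption that
raw_value is unique in mapping_table are all made up for the example. It
only shows that summing per raw_value first and finishing the sum per
mapped_value after the join gives the same answer, while fewer rows
reach the join.

```python
from collections import defaultdict

# Hypothetical data: raw_value -> mapped_value, assumed unique per raw_value.
mapping_table = {"a": "X", "b": "X", "c": "Y", "d": "Z"}
# Hypothetical (raw_value, summable_quantity) rows.
gigantic_dataset = [("a", 1), ("a", 2), ("b", 10), ("c", 5), ("c", 5), ("a", 3)]


def aggregate_after_join(mapping, rows):
    """Join every row first, then SUM per mapped_value."""
    totals = defaultdict(int)
    for raw_value, qty in rows:
        if raw_value in mapping:  # inner join on raw_value
            totals[mapping[raw_value]] += qty
    return dict(totals)


def aggregate_before_join(mapping, rows):
    """Partial SUM per raw_value, then join, then finalize per mapped_value."""
    partial = defaultdict(int)
    for raw_value, qty in rows:
        partial[raw_value] += qty
    totals = defaultdict(int)
    for raw_value, partial_sum in partial.items():
        if raw_value in mapping:  # the join now sees one row per raw_value
            totals[mapping[raw_value]] += partial_sum
    return dict(totals), len(partial)


late = aggregate_after_join(mapping_table, gigantic_dataset)
eager, rows_into_join = aggregate_before_join(mapping_table, gigantic_dataset)
```

Here six rows would enter the join in the first variant and three in the
second; whether that saving is worth it on real data is a costing
question that this sketch does not attempt to answer.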

P.S. I'm not so sure you're right about the reason why this isn't
supported. We can create a partial path for a joinrel by picking a
partial path on one side and a non-partial path on the other side, so
we can get NestLoops below Gather just fine using the parameterized
paths that we're generating anyway to support non-parallel cases. But
what would the plan look like if we were using a partial,
parameterized path? That path would have to be on the inner side of a
nested loop, and as far as I can see it would need to have a Gather
node on top of it and below the Nested Loop, so you're talking about
something like this:

Nested Loop
-> Seq Scan on something
-> Gather
  -> Nested Loop
    -> Index Scan on otherthing
       Index Cond: otherthing.x = something.x
    -> Whatever Scan on whatever

But putting Gather on the inner side of a nested loop like that would
mean repeatedly starting up workers and shutting them down again which
seems no fun at all. If there's some way of using a partial,
parameterized path that doesn't involve sticking a Gather on the inner
side of a Nested Loop, then the technique might have some promise and
the existing comment (which I probably wrote) is likely bunk.

> > I agree that creating an exponential number of RelOptInfos is not
> > going to work out well.
>
> FWIW, I'm way more concerned about the number of Paths considered
> than I am about the number of RelOptInfos.  This relates to your
> question about whether we want to use some heuristics to limit
> the planner's search space.

I had that instinct, too, but I'm not 100% sure whether it was a
correct instinct. If we create too many Paths, it's possible that most
of them will be thrown away before we really do anything with them, in
which case they cost CPU cycles but there's no memory accumulation.
Mixing paths that perform the partial aggregation at different levels
into the same RelOptInfo also increases the chances that you're going
to throw away a lot of stuff early. On the other hand, if we create
too many RelOptInfos, that memory can't be freed until the end of the
planning cycle. If you wouldn't have minded waiting a long time for
the planner, but you do mind running out of memory, the second one is
worse. But of course, the best option is to consider neither too many
Paths nor too many RelOptInfos.

I have heard rumors that in some other systems, they decide on the
best serial plan first and then insert parallel operators afterward.
Something like that could potentially be done here, too: only explore
eager aggregation for join orders that are sub-parts of the best
non-eagerly-aggregated join order. But I am sort of hesitant to
propose it as a development direction because we've never done
anything like that before and I don't think it would be at all easy to
implement. Nonetheless, I can't help feeling like we're kidding
ourselves a little bit, not just with this patch but in general. We
talk about "pushing down" aggregates or sorts or operations that can
be done on foreign nodes, but that implies that we start with them at
the top and then try to move them downward. In fact, we consider them
everywhere and expect the pushed-down versions to win out on cost.
While that feels sensible to some degree, it means every major new
query planning technique tends to multiply the amount of planner work
we're doing rather than adding to it. I'm fairly sure that the best
parallel plan need not be a parallelized version of the best
non-parallel plan but it often is, and the more things parallelism
supports, the more likely it is that it will be (I think). With eager
aggregation, it feels like we're multiplying the number of times that
we replan the same join tree by a number that is potentially MUCH
larger than 2, yet it seems to me that much of that extra work is
unlikely to find anything. Even if we find a way to make it work here
without too much pain, I wonder what happens when the next interesting
optimization comes along. Multiplication by a constant greater than or
equal to two isn't an operation one can do too many times, generally
speaking, without ending up with a big number.
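
The arithmetic behind that last point can be made concrete with a tiny
sketch. The factors below are purely illustrative, not measurements of
any planner, and planner_work() is a hypothetical helper: it just
multiplies a baseline amount of work by one factor per technique.

```python
def planner_work(base_units, multipliers):
    """Total work if each technique multiplies the work done so far."""
    work = base_units
    for factor in multipliers:
        work *= factor
    return work


# Each hypothetical technique replans the join tree twice as often.
two_techniques = planner_work(1, [2, 2])
five_techniques = planner_work(1, [2] * 5)
ten_techniques = planner_work(1, [2] * 10)

# For contrast: ten techniques that each merely add one unit of work.
additive_ten = 1 + 10 * 1
```

With a factor of exactly two, ten such techniques cost 1024 units
against 11 for the additive case, and the gap widens quickly if any
factor is larger than two.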

-- 
Robert Haas
EDB: http://www.enterprisedb.com







* Re: Eager aggregation, take 3
  2024-12-17 03:42 Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2024-12-21 01:05 ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-13 02:04   ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-14 15:07     ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-15 06:58       ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-15 14:40         ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-16 08:18           ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-17 21:16             ` Re: Eager aggregation, take 3 Robert Haas <[email protected]>
  2025-01-19 12:53               ` Re: Eager aggregation, take 3 Richard Guo <[email protected]>
  2025-01-20 17:57                 ` Re: Eager aggregation, take 3 Tom Lane <[email protected]>
@ 2025-01-21 14:13                   ` Richard Guo <[email protected]>
  1 sibling, 0 replies; 70+ messages in thread

From: Richard Guo @ 2025-01-21 14:13 UTC (permalink / raw)
  To: Tom Lane <[email protected]>; +Cc: Robert Haas <[email protected]>; Tender Wang <[email protected]>; Paul George <[email protected]>; Andy Fan <[email protected]>; [email protected]

On Tue, Jan 21, 2025 at 2:57 AM Tom Lane <[email protected]> wrote:
> However, a partial-aggregation path does not generate the same data
> as an unaggregated path, no matter how fuzzy you are willing to be
> about the concept.  So I'm having a very hard time accepting that
> it ought to be part of the same RelOptInfo, and thus I don't really
> buy that annotating paths with a GroupPathInfo is the way forward.

Agreed.  I think one point I failed to make myself clear on is that
I've never intended to put a partial-aggregation path and an
unaggregated path into the same RelOptInfo.  One of the basic designs
of this patch is that partial-aggregation paths are placed in a
separate category of RelOptInfos, which I call "grouped relations"
(though I admit that's not the best name).  This ensures that we never
compare a partial-aggregation path with an unaggregated path during
scan/join planning, because I am certain that the two categories of
paths are not comparable.

Regarding the GroupPathInfo proposal, my intention is to add a valid
GroupPathInfo only for the partial-aggregation paths.  The goal is to
ensure that partial-aggregation paths within this category are
compared only if their partial aggregations are at the same location.

To be honest, I still doubt that this is necessary.  I have two main
reasons for this.

1.
For a partial-aggregation path, the location where we place the
partial aggregation does not impose any restrictions on further
planning.  This is different from the parameterized path case.  If two
parameterized paths are equal on every other figure of merit, we will
choose the one with fewer required outer rels, as it means fewer join
restrictions on upper planning.  However, for partial-aggregation
paths, we do not have a preference regarding the location of the
partial aggregation.  For instance, for path "A JOIN PartialAgg(B)
JOIN C" and path "PartialAgg(A JOIN B) JOIN C", if one path dominates
the other on every figure of merit, it seems to me that there's no
point in keeping the less favorable one, although they have their
partial aggregations at different join levels.

2.
A partial-aggregation path of a rel essentially yields an aggregated
form of that rel's row set.  The difference between the row sets
yielded by paths with different locations of partial aggregation is
primarily about the different degrees to which the rows are
aggregated.  These sets are fundamentally homogeneous.

In summary, I think the partial-aggregation paths
of the same "grouped relation" are comparable, regardless of the
position of the partial aggregation within the path tree.  So I think
we should put them into the same RelOptInfo.

Of course, I could be very wrong about this.  I would greatly
appreciate hearing others' thoughts on this.

Thanks
Richard







