Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wSzYN-000FFl-28 for pgsql-hackers@arkaria.postgresql.org; Fri, 29 May 2026 15:55:20 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wSzYM-003AQ3-0X for pgsql-hackers@arkaria.postgresql.org; Fri, 29 May 2026 15:55:18 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wSzYL-003APm-1r for pgsql-hackers@lists.postgresql.org; Fri, 29 May 2026 15:55:18 +0000 Received: from mail-lf1-x12a.google.com ([2a00:1450:4864:20::12a]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wSzYI-000000008pZ-0KDf for pgsql-hackers@postgresql.org; Fri, 29 May 2026 15:55:17 +0000 Received: by mail-lf1-x12a.google.com with SMTP id 2adb3069b0e04-5aa4b50e054so3787269e87.1 for ; Fri, 29 May 2026 08:55:14 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1780070113; cv=none; d=google.com; s=arc-20240605; b=BwcX3uSU21KaeMOJTxDhJJXs7AtbILfFEqhlfBnnpkkX907px0mZCmKlNLZnS1pUtP du+opd9FtpvRnw25TovCPUHxBUuKeaAimWVakv4oKZuvI10RWTH8ox7cjfyDbYqJ99E+ V5TJgZskR6ccQ7hZi1/19oekRGMxwT4OFuTqwlA+UW086zpbSJiHAvzxgr61epLbS3RY gkP1RlnsSJGR5y+TUNTBply/nNibNnrP58R5fMtSM9SN0vpuzRebqNMRdWPKqGUHpqb3 U6aPPe674Dy7xNA8rYGvEglWwfIPaLv8wB9WQQIfe+0aDyHAWAtSMsjSFvHaXpoVeBzj UvVw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=8/n2lgWF56qo5QEuaa6LB5GAbPwkTaHRW26/I0WzGmg=; fh=wtnkEZeRsC20NBHoS1LC6AzRMBE5EH/9JAtXW2XK6GA=; b=WqO9PTDEDiqnkdGSl3bhSVxZpSqanhw0LFtrLfgxA6a5N6VWIia2sET+NDWvoJ8Lu4 s7sum6PWMxJYJ53QGmRup2tSmIKTS3XAyxnuM77LUnPYSocAMYYap6LRK7kZf8/mJ308 Z8TM/+l4kjerXRsEGfP6R3lJniMphYUVNYHazvjkN5SdzxWBiRS7FnaWNCdX81LhmECG hhZGKYlI09UQjlbBk0X+Z25bAqeFyjLlqQU0a7P5bRSgFfWWkzLGTNJdp1NCcnyT0Kq+ EyOYZKanKjvGRlafrz9NPgnAAOUCAuUWk/OjoXaTus9vdyTXvueLKJbLcNbg7PLnW7OK WTTg==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=boringsql.com; s=google; t=1780070113; x=1780674913; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=8/n2lgWF56qo5QEuaa6LB5GAbPwkTaHRW26/I0WzGmg=; b=apez0N7Qkv6Paq6NfUK6kBpT57xjJC88baXTBe/ryZZdPua29MrviovE17IWcq1Nq8 eOpsp4Kmf1Tl2ZImoSEZtx7fJnc/7rMjQHYzI9B+O/eO94IIEiBJfsZqYR2zHTm/7XMO uK1aIQNTj4a0da8SbOWnw0S3Xew8uWFdlxDor6thkFiixdAytwv0e+4eIW/kkWFlBxpB +iyXLze78J+ZjNVuKEaoELOpxl1Jbf/iIPsa3xtVEL8JweHF375UZDzdApGqZHU8UrXR LE3aLhYlWn6gp7I5zeoZwmktXIJCXDbKFWdiJlpd+Cw5XGI6aVwXkQtPf//Vno/xlYiI F5kg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780070113; x=1780674913; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=8/n2lgWF56qo5QEuaa6LB5GAbPwkTaHRW26/I0WzGmg=; b=ihWeTd5JlXrGktZ4rgCgU08XEO9ivEPIC0l44sngtkFzpdNFSU09KS0OcJxx+Q0GWj HVK2ccyORDNGqOFsfLsoHcyAQxQGCZnww7M5n38KJEJlFvCfXQMST77MhtmI/7U4xebj Zu16ELc/AylPA0cUIZt/OC+Qg6Jb2tZ239uUDdu99stb8gc4uhg3ht5GcZiHy93dbT4x l0pAHDW6DICeeVHfjFkEJXVwfxaXVhfk29Ti2kvbyKsA0EyyHp6m0TcydaT9H6iWes4X AG5U3Cy+Qe2Uv4hXradE+cY2xs6dasxy2Ues8AsjikHvlpjgis/CGdMla4PjbHw3x7RI aFkQ== X-Gm-Message-State: AOJu0YzMgbFgM4mxT7/tQQEK8WiinKXt3D+3rlCrc46jrgcEsXWpliih aPfAOQ2YxBQ2Er3z85uoItnOVKX33XeGVXguHC3zjPONFFjxDltJdOmTo75Vnn/MEX3r1ZQWFYb 3zXzOHWXnggCjT8SgnisHDIYJv6hoUeG6hwU4nj0Nt6/i/QL8zLOKwCN4Cg== X-Gm-Gg: Acq92OG80Wf+7SC/q128jk9CsEphi4wiol0TJMO+6ooCBeI8Btl6mHlHNKuzsNqj6Wr DpU5Nd8Qm5Qa5O2j4p2eKtdt7n7bEB/lVr3Mb2N8sj7vSznPc3FnEZKYWi04ifKOGjbQ4Fk8o5t U9Q5KCbkGllE7g0u6DvAyPTp/OAaNC0SwYnHdTn0dWQFCjkntOT4yZC4ox09QPDXHMZTKcvqH9U RD8GIpRnHGCz7yfFgINzkErzdEAzi8AbjaIeq1wdan8bUL7Bjc8ZS2oy4+qUm2oR5gzmm9hGdTZ xGvzlQ7N1IhFFNiTdTYOz8OZm+EsYeMXfBlILQOkMktDkzBdb3b2MRC0+HT5mE3TjzHj9PbrAxf XN8zIRzBicsjxjMJd26SFyA11jzBjPBfEDt85pANJLjmbgoU= X-Received: by 2002:a05:6512:800d:10b0:5a8:fca6:a537 with SMTP id 2adb3069b0e04-5aa608ff3d8mr87107e87.22.1780070112566; Fri, 29 May 2026 08:55:12 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Radim Marek Date: Fri, 29 May 2026 17:55:01 +0200 X-Gm-Features: AVHnY4Jcu2cWtc3woyWyY719ZqqZhDypulu2SwL9LzdQCCyYxqfX6Z-WIptugPU Message-ID: Subject: Re: Eager aggregation, take 3 To: Richard Guo Cc: PostgreSQL-development Content-Type: multipart/alternative; boundary="000000000000bc2c190652f6de50" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000bc2c190652f6de50 Content-Type: text/plain; charset="UTF-8" Hey Richard, I might be out of my depth here - but while testing RegreSQL as correctness/performance harness on PostgreSQL it picked up a problem with the wrong-results case during eager aggregation. It reproduces on current HEAD (commit 2670cc298f42cd7b1c426bf7ccfb0652d8e0b347 now) with enable_eager_aggregate enabled. My testing environment - Linux aarch64, gcc 12 (Debian) - macOS arm64, Apple clang 21 (PostgreSQL 19devel on aarch64-apple-darwin25.5.0) == How to reproduce CREATE TEMP TABLE c(id int, country text); CREATE TEMP TABLE o(customer_id int); INSERT INTO c VALUES (1,'US'),(2,'US'),(3,'DE'),(4,'DE'),(5,'DE'); INSERT INTO o VALUES (1),(3); -- only customers 1 and 3 have a row in o SELECT c.country, count(*) AS n FROM c WHERE NOT EXISTS (SELECT 1 FROM o WHERE o.customer_id = c.id) GROUP BY c.country ORDER BY c.country; Expected results (everywhere except master) country | n ---------+--- DE | 2 US | 1 (2 rows) The actual result with enable_eager_aggregate = on (default) country | n ---------+--- DE | 0 US | 0 (2 rows) With SET enable_eager_aggregate = off, the result is correct (DE=2, US=1), as it is on PG18. Query Plan QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------- Sort (cost=108.19..108.69 rows=200 width=40) (actual time=0.195..0.197 rows=2.00 loops=1) Sort Key: c.country Sort Method: quicksort Memory: 25kB Buffers: local hit=2 -> Finalize HashAggregate (cost=98.55..100.55 rows=200 width=40) (actual time=0.183..0.186 rows=2.00 loops=1) Group Key: c.country Batches: 1 Memory Usage: 32kB Buffers: local hit=2 -> Hash Anti Join (cost=52.75..95.37 rows=635 width=40) (actual time=0.177..0.179 rows=3.00 loops=1) Hash Cond: (c.id = o.customer_id) Buffers: local hit=2 -> Seq Scan on c (cost=0.00..22.70 rows=1270 width=36) (actual time=0.024..0.025 rows=5.00 loops=1) Buffers: local hit=1 -> Hash (cost=50.25..50.25 rows=200 width=12) (actual time=0.145..0.146 rows=2.00 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 9kB Buffers: local hit=1 -> Partial HashAggregate (cost=48.25..50.25 rows=200 width=12) (actual time=0.122..0.123 rows=2.00 loops=1) Group Key: o.customer_id Batches: 1 Memory Usage: 32kB Buffers: local hit=1 -> Seq Scan on o (cost=0.00..35.50 rows=2550 width=4) (actual time=0.002..0.003 rows=2.00 loops=1) Buffers: local hit=1 Planning Time: 0.294 ms Execution Time: 0.255 ms (24 rows) If this is already known or in progress, apologies for the noise. --- Radim On Fri, 29 May 2026 at 17:25, Richard Guo wrote: > Hi All, > > Eager aggregation is a query optimization technique that partially > pushes a group-by past a join, and finalizes it once all the relations > are joined. Eager aggregation reduces the number of input rows to the > join and thus may result in a better overall plan. This technique is > thoroughly described in the 'Eager Aggregation and Lazy Aggregation' > paper [1]. > > Back in 2017, a patch set has been proposed by Antonin Houska to > implement eager aggregation in thread [2]. However, it was at last > withdrawn after entering the pattern of "please rebase thx" followed by > rebasing and getting no feedback until "please rebase again thx". A > second attempt in 2022 unfortunately fell into the same pattern about > one year ago and was eventually closed again [3]. > > That patch set has included most of the necessary concepts to implement > eager aggregation. However, as far as I can see, it has several weak > points that we need to address. It introduces invasive changes to some > core planner functions, such as make_join_rel(). And with such changes > join_is_legal() would be performed three times for the same proposed > join, which is not great. Another weak point is that the complexity of > join searching dramatically increases with the growing number of > relations to be joined. This occurs because when we generate partially > aggregated paths, each path of the input relation is considered as an > input path for the grouped paths. As a result, the number of grouped > paths we generate increases exponentially, leading to a significant > explosion in computational complexity. Other weak points include the > lack of support for outer joins and partitionwise joins. And during my > review of the code, I came across several bugs (planning error or crash) > that need to be addressed. > > I'd like to give it another take to implement eager aggregation, while > borrowing lots of concepts and many chunks of codes from the previous > patch set. Please see attached. I have chosen to use the term 'Eager > Aggregation' from the paper [1] instead of 'Aggregation push-down', to > differentiate the aggregation push-down technique in FDW. > > The patch has been split into small pieces to make the review easier. > > 0001 introduces the RelInfoList structure, which encapsulates both a > list and a hash table, so that we can leverage the hash table for faster > lookups not only for join relations but also for upper relations. With > eager aggregation, it is possible that we generate so many upper rels of > type UPPERREL_PARTIAL_GROUP_AGG that a hash table can help a lot with > lookups. > > 0002 introduces the RelAggInfo structure to store information needed to > create grouped paths for base and join rels. It also revises the > RelInfoList related structures and functions so that they can be used > with RelAggInfos. > > 0003 checks if eager aggregation is applicable, and if so, collects > suitable aggregate expressions and grouping expressions in the query, > and records them in root->agg_clause_list and root->group_expr_list > respectively. > > 0004 implements the functions that check if eager aggregation is > applicable for a given relation, and if so, create RelAggInfo structure > for the relation, using the infos about aggregate expressions and > grouping expressions we collected earlier. In this patch, when we check > if a target expression can act as grouping expression, we need to check > if this expression can be known equal to other expressions due to ECs > that can act as grouping expressions. This patch leverages function > exprs_known_equal() to achieve that, after enhancing this function to > consider opfamily if provided. > > 0005 implements the functions that generate paths for grouped relations > by adding sorted and hashed partial aggregation paths on top of paths of > the plain base or join relations. For sorted partial aggregation paths, > we only consider any suitably-sorted input paths as well as sorting the > cheapest-total path. For hashed partial aggregation paths, we only > consider the cheapest-total path as input. By not considering other > paths we can reduce the number of grouping paths as much as possible > while still achieving reasonable results. > > 0006 builds grouped relations for each base relation if possible, and > generates aggregation paths for the grouped base relations. > > 0007 builds grouped relations for each just-processed join relation if > possible, and generates aggregation paths for the grouped join > relations. The changes made to make_join_rel() are relatively minor, > with the addition of a new function make_grouped_join_rel(), which finds > or creates a grouped relation for the just-processed joinrel, and > generates grouped paths by joining a grouped input relation with a > non-grouped input relation. > > The other way to generate grouped paths is by adding sorted and hashed > partial aggregation paths on top of paths of the joinrel. This occurs > in standard_join_search(), after we've run set_cheapest() for the > joinrel. The reason for performing this step after set_cheapest() is > that we need to know the joinrel's cheapest paths (see 0005). > > This patch also makes the grouped relation for the topmost join rel act > as the upper rel representing the result of partial aggregation, so that > we can add the final aggregation on top of that. Additionally, this > patch extends the functionality of eager aggregation to work with > partitionwise join and geqo. > > This patch also makes eager aggregation work with outer joins. With > outer join, the aggregate cannot be pushed down if any column referenced > by grouping expressions or aggregate functions is nullable by an outer > join above the relation to which we want to apply the partiall > aggregation. Thanks to Tom's outer-join-aware-Var infrastructure, we > can easily identify such situations and subsequently refrain from > pushing down the aggregates. > > Starting from this patch, you should be able to see plans with eager > aggregation. > > 0008 adds test cases for eager aggregation. > > 0009 adds a section in README that describes this feature (copied from > previous patch set, with minor tweaks). > > Thoughts and comments are welcome. > > [1] https://www.vldb.org/conf/1995/P345.PDF > [2] https://www.postgresql.org/message-id/flat/9666.1491295317%40localhost > [3] > https://www.postgresql.org/message-id/flat/OS3PR01MB66609589B896FBDE190209F495EE9%40OS3PR01MB6660.jpnprd01.prod.outlook.com > > Thanks > Richard > --000000000000bc2c190652f6de50 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hey Richard,

I might be out = of my depth here - but while testing RegreSQL as correctness/performance ha= rness on PostgreSQL it picked up a problem with the=C2=A0wrong-results case= during eager aggregation.

It reproduces on curren= t HEAD (commit=C2=A02670cc298f42cd7b1c426bf7ccfb0652d8e0b347 now) with=C2= =A0enable_eager_aggregate enabled.

My testing envi= ronment
=C2=A0 - Linux aarch64, gcc 12 (Debian)
=C2=A0 - macOS= arm64, Apple clang 21
=C2=A0 =C2=A0 (PostgreSQL 19devel on aarch64-appl= e-darwin25.5.0)

=3D=3D How to reproduce
=
=C2=A0 CREATE TEMP TABLE c(id int, country text);
=C2=A0 = CREATE TEMP TABLE o(customer_id int);
=C2=A0 INSERT INTO c VALUES (1,= 9;US'),(2,'US'),(3,'DE'),(4,'DE'),(5,'DE= 9;);
=C2=A0 INSERT INTO o VALUES (1),(3); =C2=A0 -- only customers 1 and= 3 have a row in o

=C2=A0 SELECT c.country, count(*) AS n
=C2=A0 = FROM c
=C2=A0 WHERE NOT EXISTS (SELECT 1 FROM o WHERE o.customer_id =3D = c.id)
=C2=A0 GROUP BY c.country
=C2=A0 OR= DER BY c.country;

Expected results (everywhere exc= ept master)

=C2=A0country | n
---------+---
= =C2=A0DE =C2=A0 =C2=A0 =C2=A0| 2
=C2=A0US =C2=A0 =C2=A0 =C2=A0| 1
(2 = rows)

The actual result=C2=A0with enable_eager_agg= regate =3D on (default)

=C2=A0country | n
-----= ----+---
=C2=A0DE =C2=A0 =C2=A0 =C2=A0| 0
=C2=A0US =C2=A0 =C2=A0 =C2= =A0| 0
(2 rows)

With SET enable_eager_aggregate= =3D off, the result is correct (DE=3D2, US=3D1), as it is on PG18.

Query Plan

=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 QUERY PLAN
------------------------------------= ---------------------------------------------------------------------------= --------------------
=C2=A0Sort =C2=A0(cost=3D108.19..108.69 rows=3D200 = width=3D40) (actual time=3D0.195..0.197 rows=3D2.00 loops=3D1)
=C2=A0 = =C2=A0Sort Key: c.country
=C2=A0 =C2=A0Sort Method: quicksort =C2=A0Memo= ry: 25kB
=C2=A0 =C2=A0Buffers: local hit=3D2
=C2=A0 =C2=A0-> =C2= =A0Finalize HashAggregate =C2=A0(cost=3D98.55..100.55 rows=3D200 width=3D40= ) (actual time=3D0.183..0.186 rows=3D2.00 loops=3D1)
=C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0Group Key: c.country
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= Batches: 1 =C2=A0Memory Usage: 32kB
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Bu= ffers: local hit=3D2
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-> =C2=A0Hash = Anti Join =C2=A0(cost=3D52.75..95.37 rows=3D635 width=3D40) (actual time=3D= 0.177..0.179 rows=3D3.00 loops=3D1)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0Hash Cond: (c.id =3D o.cust= omer_id)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Buffers:= local hit=3D2
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-&= gt; =C2=A0Seq Scan on c =C2=A0(cost=3D0.00..22.70 rows=3D1270 width=3D36) (= actual time=3D0.024..0.025 rows=3D5.00 loops=3D1)
=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Buffers: local hit= =3D1
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-> =C2=A0= Hash =C2=A0(cost=3D50.25..50.25 rows=3D200 width=3D12) (actual time=3D0.145= ..0.146 rows=3D2.00 loops=3D1)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Buckets: 1024 =C2=A0Batches: 1 =C2=A0Mem= ory Usage: 9kB
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0Buffers: local hit=3D1
=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-> =C2=A0Partial HashAgg= regate =C2=A0(cost=3D48.25..50.25 rows=3D200 width=3D12) (actual time=3D0.1= 22..0.123 rows=3D2.00 loops=3D1)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Group Key: o.cus= tomer_id
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Batches: 1 =C2=A0Memory Usage: 32kB
= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0Buffers: local hit=3D1
=C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-&= gt; =C2=A0Seq Scan on o =C2=A0(cost=3D0.00..35.50 rows=3D2550 width=3D4) (a= ctual time=3D0.002..0.003 rows=3D2.00 loops=3D1)
=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0Buffers: local hit=3D1
=C2=A0Planning Time: 0.29= 4 ms
=C2=A0Execution Time: 0.255 ms
(24 rows)

If this is alrea= dy known or in progress, apologies for the noise.

= ---

Radim

On Fri, 29 Ma= y 2026 at 17:25, Richard Guo <= guofenglinux@gmail.com> wrote:
Hi All,

Eager aggregation is = a query optimization technique that partially
pushes a group-by past a j= oin, and finalizes it once all the relations
are joined.=C2=A0 Eager agg= regation reduces the number of input rows to the
join and thus may resul= t in a better overall plan.=C2=A0 This technique is
thoroughly described= in the 'Eager Aggregation and Lazy Aggregation'
paper [1].
<= br>Back in 2017, a patch set has been proposed by Antonin Houska to
impl= ement eager aggregation in thread [2].=C2=A0 However, it was at last
wit= hdrawn after entering the pattern of "please rebase thx" followed= by
rebasing and getting no feedback until "please rebase again thx= ". =C2=A0A
second attempt in 2022 unfortunately fell into the same = pattern about
one year ago and was eventually closed again [3].

T= hat patch set has included most of the necessary concepts to implement
e= ager aggregation.=C2=A0 However, as far as I can see, it has several weakpoints that we need to address.=C2=A0 It introduces invasive changes to s= ome
core planner functions, such as make_join_rel().=C2=A0 And with such= changes
join_is_legal() would be performed three times for the same pro= posed
join, which is not great.=C2=A0 Another weak point is that the com= plexity of
join searching dramatically increases with the growing number= of
relations to be joined.=C2=A0 This occurs because when we generate p= artially
aggregated paths, each path of the input relation is considered= as an
input path for the grouped paths.=C2=A0 As a result, the number o= f grouped
paths we generate increases exponentially, leading to a signif= icant
explosion in computational complexity.=C2=A0 Other weak points inc= lude the
lack of support for outer joins and partitionwise joins.=C2=A0 = And during my
review of the code, I came across several bugs (planning e= rror or crash)
that need to be addressed.

I'd like to give it= another take to implement eager aggregation, while
borrowing lots of co= ncepts and many chunks of codes from the previous
patch set.=C2=A0 Pleas= e see attached.=C2=A0 I have chosen to use the term 'Eager
Aggregati= on' from the paper [1] instead of 'Aggregation push-down', todifferentiate the aggregation push-down technique in FDW.

The patc= h has been split into small pieces to make the review easier.

<= /div>
0001 introduces the RelInfoList structure, which encapsulates bot= h a
list and a hash table, so that we can leverage the hash table for fa= ster
lookups not only for join relations but also for upper relations.= =C2=A0 With
eager aggregation, it is possible that we generate so many u= pper rels of
type UPPERREL_PARTIAL_GROUP_AGG that a hash table can help = a lot with
lookups.

0002 introduces the RelAggInfo structure to s= tore information needed to
create grouped paths for base and join rels.= =C2=A0 It also revises the
RelInfoList related structures and functions = so that they can be used
with RelAggInfos.

0003 checks if eager a= ggregation is applicable, and if so, collects
suitable aggregate express= ions and grouping expressions in the query,
and records them in root->= ;agg_clause_list and root->group_expr_list
respectively.

0004 = implements the functions that check if eager aggregation is
applicable f= or a given relation, and if so, create RelAggInfo structure
for the rela= tion, using the infos about aggregate expressions and
grouping expressio= ns we collected earlier.=C2=A0 In this patch, when we check
if a target = expression can act as grouping expression, we need to check
if this expr= ession can be known equal to other expressions due to ECs
that can act a= s grouping expressions.=C2=A0 This patch leverages function
exprs_known_= equal() to achieve that, after enhancing this function to
consider opfam= ily if provided.

0005 implements the functions that generate paths f= or grouped relations
by adding sorted and hashed partial aggregation pat= hs on top of paths of
the plain base or join relations.=C2=A0 For sorted= partial aggregation paths,
we only consider any suitably-sorted input p= aths as well as sorting the
cheapest-total path.=C2=A0 For hashed partia= l aggregation paths, we only
consider the cheapest-total path as input.= =C2=A0 By not considering other
paths we can reduce the number of groupi= ng paths as much as possible
while still achieving reasonable results.
0006 builds grouped relations for each base relation if possible, and=
generates aggregation paths for the grouped base relations.

0007 builds grouped relations for each just-processed joi= n relation if
possible, and generates aggregation paths for the grouped = join
relations.=C2=A0 The changes made to make_join_rel() are relatively= minor,
with the addition of a new function make_grouped_join_rel(), whi= ch finds
or creates a grouped relation for the just-processed joinrel, a= nd
generates grouped paths by joining a grouped input relation with anon-grouped input relation.

The other way to generate grouped paths= is by adding sorted and hashed
partial aggregation paths on top of path= s of the joinrel.=C2=A0 This occurs
in standard_join_search(), after we&= #39;ve run set_cheapest() for the
joinrel.=C2=A0 The reason for performi= ng this step after set_cheapest() is
that we need to know the joinrel= 9;s cheapest paths (see 0005).

This patch also makes the grouped rel= ation for the topmost join rel act
as the upper rel representing the res= ult of partial aggregation, so that
we can add the final aggregation on = top of that.=C2=A0 Additionally, this
patch extends the functionality of= eager aggregation to work with
partitionwise join and geqo.

This= patch also makes eager aggregation work with outer joins.=C2=A0 With
ou= ter join, the aggregate cannot be pushed down if any column referenced
b= y grouping expressions or aggregate functions is nullable by an outer
jo= in above the relation to which we want to apply the partiall
aggregation= .=C2=A0 Thanks to Tom's outer-join-aware-Var infrastructure, we
can = easily identify such situations and subsequently refrain from
pushing do= wn the aggregates.

Starting from this patch, you should be able to s= ee plans with eager
aggregation.

0008 adds test cases for eager a= ggregation.

0009 adds a section in README that describes this featur= e (copied from
previous patch set, with minor tweaks).

Thoughts a= nd comments are welcome.

--000000000000bc2c190652f6de50--