public inbox for [email protected]
help / color / mirror / Atom feedFrom: Matheus Alcantara <[email protected]>
To: Richard Guo <[email protected]>
To: David Rowley <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: Tender Wang <[email protected]>
Cc: Paul George <[email protected]>
Cc: Andy Fan <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Cc: [email protected]
Subject: Re: Eager aggregation, take 3
Date: Thu, 02 Apr 2026 09:18:50 -0300
Message-ID: <[email protected]> (raw)
In-Reply-To: <CAMbWs48A53PY1Y4zoj7YhxPww9fO1hfnbdntKfA855zpXfVFRA@mail.gmail.com>
References: <CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com>
<[email protected]>
<CAMbWs49=eAd2W9jCtGhaZPPp+SOC_2rg16RTG74xAht=hkr5JQ@mail.gmail.com>
<CAMbWs49Nc4M3H+eCf1+8w8piDyEECjRb-gK_JMF4VvcyWwGEVQ@mail.gmail.com>
<CAMbWs49E_dR0nobsExsyetpnBpHObLTsQLsEbWKQLkh0omPxNg@mail.gmail.com>
<CAMbWs49B_qUiHvu2EqLHZRpLr3p_+QPBs50n2=L5ibYzniwTzA@mail.gmail.com>
<CAMbWs48KCQtDymnYi4M=Vz+WMzo3fkBxffJsyk6VX6hOXXv+VA@mail.gmail.com>
<CAMbWs49sv_MuOYqqrtmBN_oYf8VSQ2BXDwXaTpJTn_YfwyYdWQ@mail.gmail.com>
<CAMbWs49U8Sddx_fGszPdvA3jp_nheynxaqm5Y4NqMV21VBYAuQ@mail.gmail.com>
<CAMbWs4-LwyOg9ga+NVF7yQbMi0ZsZdN1G_sO2v=YJHV18=19+A@mail.gmail.com>
<CALA8mJquG_zCJXfVwash5LKqHGtZXQmq7RfTSaRDUzGYeW=7Rw@mail.gmail.com>
<CAMbWs4_EjgcBib5+y1LYcGB3EK3Y6R+OOxGKfJo42fDovadk1g@mail.gmail.com>
<CALA8mJqe0anNM8_V6cOeOQnCHUTQggn7iOQNyQr1VaN_xMjz+w@mail.gmail.com>
<CAMbWs48eE-s-jCicC8pSVfXk8Ws-ZvUKnsw8qH-DkVBdYv0eJQ@mail.gmail.com>
<CAMbWs483a7-8M0pDttG44r-+8Gevn9VG0xNceE3WpkEQxJXPZw@mail.gmail.com>
<CAHewXNmYM6DvR_kaxDL0w0fz9BwKbac+TSU3QS10aA3cXHyMmA@mail.gmail.com>
<CA+TgmoaxH=P63hLYgyJJcEbMRnw3xi16d=HxFi1j-m7MhH6W_w@mail.gmail.com>
<CAMbWs4_cOnpGsywj9Jt1WAgzJLW9Rxt5X13cfGz4iN2qvZQ68g@mail.gmail.com>
<CA+Tgmob0q7bRbsFTVDMjxHE6zA4uDQLQa-s0CtwUw49V53UL_A@mail.gmail.com>
<CAMbWs4-Xru_eKBeRHFduigSGihdixFWVTR8A+dtMw7Mao+RkJA@mail.gmail.com>
<CAMbWs49dLjSSQRWeud+KSN0G531ciZdYoLBd5qktXA+3JQm_UQ@mail.gmail.com>
<CAMbWs48LXGC-Y63YtzEeM-3f0NUXWCUEMs7XwGzywXTjUNMcxQ@mail.gmail.com>
<CAMbWs48XdzvnwfTHWxQ7qK-yjvdrbwsPpqhJBuKDnO+hcbsVwA@mail.gmail.com>
<CA+TgmoaO-7RHdyJuizWChXZm7EJGvDcfoePDDEyUA-y8vTB1tg@mail.gmail.com>
<CAMbWs4-+jXRpKuFMZa08bS34-TBka3qqjVMAUjF=-1RA9BKvgg@mail.gmail.com>
<CA+TgmoZapU1y59-s3o8oPt7Hv+cxRh_34FMu6MXumomLe+U1Cw@mail.gmail.com>
<CAMbWs4_sEeeBmucBzbamBMfA9uLxVmOc_MV=ZpSyDbTcrUO_XQ@mail.gmail.com>
<CA+Tgmob4fnv57PQB0Oox86mHSJQ0vVL249eT=gqPvrMkG7h1zw@mail.gmail.com>
<CAMbWs489NYyTcCTbrUi7hPXKtNY5vHrrFcHyMRAv=CA5WsszVw@mail.gmail.com>
<CA+TgmoazmDdcc7NeTo3WM5HW3DASNP4rfZw6X+2nnQKHampOng@mail.gmail.com>
<CAMbWs49bYr-ULhA+-At0iQ+NaFKy72AWB6jzughk8MPTiY+gMQ@mail.gmail.com>
<CA+TgmoYa-zexdbc5nO_D6oxPMZYs06hkYwZK5Dufq+4Hhe6uNQ@mail.gmail.com>
<CAMbWs4_aji0kME490phz6nTXnPToddUn19OF3rLm1g4TbNkuzQ@mail.gmail.com>
<CA+Tgmoa3+G_=8XuQWN+0ugv6r-WV6ruFESpOxpXAAKrne3oVDQ@mail.gmail.com>
<CAMbWs49qiox13EKb7bqgLu7Gu9oar+xe6KMwBjgFwod3JzPfUw@mail.gmail.com>
<CAMbWs48F8WGA-Lzj1Dk76mFqRFxPEwG2_9Zb7+pFs8oi6ew2pw@mail.gmail.com>
<CAMbWs484ms=WRZamOyWnVditREKFqipLsdaQjcv2uKur8SZuqw@mail.gmail.com>
<CAMbWs49bL2ZMSc0W4G8=R7bjaa-vO6grucEOFYLZFUZE7+nzrQ@mail.gmail.com>
<CA+TgmobqbeJ9iRQO4ym6OiHt71sSv2eai=01kOZjxhdof9K4Mw@mail.gmail.com>
<CAMbWs4_2BzuAX+BSO1p7rtUwmQjORrG-b906Cw-RkfRjFP0oSQ@mail.gmail.com>
<CA+TgmoYbkvYwLa+1vOP7RDY7kO2=A7rppoPusoRXe44VDOGBPg@mail.gmail.com>
<CAMbWs4_aezTYOZSj7v+aypLo0dnjAierJtdx2gf6se28p88WHg@mail.gmail.com>
<CA+TgmoaY6E5-UFTWp5BtAjBO=tDQd=UVAgeJ3dRbFFzhnP5NHg@mail.gmail.com>
<CAMbWs484dnecwXT2WzWFvzEmWPzC3U9F8SRDXg-SEegTYUFyXQ@mail.gmail.com>
<CAMbWs4-2cVfBk1HNGtqV1QFo2yKnzdxLy0BAqQaJHBt+8+kspw@mail.gmail.com>
<CAMbWs48W80HFm9b+yZPKER=MA5M_bveYvBx1AwOrxdPYbLmYmQ@mail.gmail.com>
<CAMbWs4_W6PmVXxgqM8LMerX2iwqOwR5EC5RGWSAMuvkm8o+-jw@mail.gmail.com>
<CAApHDvrxyGSLv3Sbg9fpmz6yYri_7M6SaKYnqQQv59uZfQdduA@mail.gmail.com>
<CAMbWs48qOOfBV6svQgbdzZ4HbTV18jsjyFD_b6ofJQfAt5n11w@mail.gmail.com>
<CAApHDvr4YWpiMR3RsgYwJWv-u8xoRqTAKRiYy9zUszjZOqG4Ug@mail.gmail.com>
<CAMbWs4_nErgVAcuw3igo6q3Th=pFheu5MZ056ORdsU46YmuNyQ@mail.gmail.com>
<CAMbWs4_-sag-cAKrLJ+X+5njL1=oudk=+KfLmsLZ5a2jckn=kg@mail.gmail.com>
<CAMbWs48A53PY1Y4zoj7YhxPww9fO1hfnbdntKfA855zpXfVFRA@mail.gmail.com>
On Mon Mar 30, 2026 at 12:17 AM -03, Richard Guo wrote:
> On Thu, Oct 9, 2025 at 5:07 PM Richard Guo <[email protected]> wrote:
>> I noticed an unnecessary header include in initsplan.c. Will fix that
>> as well.
>
> I noticed a couple of issues that can lead to unexpected results.
> I've attached two patches to fix them.
>
> 1. Eager aggregation was incorrectly checking the data type's default
> collation rather than the expression's actual collation. This allowed
> columns with non-deterministic collations to be pushed down, resulting
> in incorrect grouping. Fixed by 0001.
>
> 2. Pushing aggregates containing volatile functions below a join
> alters their execution count. Fixed by 0002.
>
> (As briefly discussed on Discord, this non-deterministic collation
> issue also exists in our long-existing logic for pushing HAVING down
> to WHERE. But let's fix it for the eager aggregation first.)
>
Hi Richard,
The patches looks good to me and are working as expected. It seems very
straightforward, so I don't have any major comments.
I'm attaching some new tests that I've added to collate.icu.utf8 and
eager_aggregate regression tests during my review, fell free to include
any of them if it could be helpful or none.
--
Matheus Alcantara
EDB: https://www.enterprisedb.com
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index fbcdb7eb58c..a2dd8a34da4 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -2726,6 +2726,95 @@ GROUP BY t1.id;
DROP TABLE eager_agg_t1;
DROP TABLE eager_agg_t2;
+--
+-- Test for eager aggregation with multiple columns having different collations
+--
+CREATE TABLE eager_agg_t3 (
+ id int,
+ val1 text COLLATE case_insensitive,
+ val2 text COLLATE "C"
+);
+CREATE TABLE eager_agg_t4 (
+ val1 text COLLATE case_insensitive,
+ val2 text COLLATE "C"
+);
+INSERT INTO eager_agg_t3 SELECT 1, 'a', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t3 SELECT 1, 'A', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t4 VALUES ('A', 'x');
+ANALYZE eager_agg_t3;
+ANALYZE eager_agg_t4;
+-- Ensure that eager aggregation is not used when grouping by a column with
+-- non-deterministic collation, even when other grouping columns have
+-- deterministic collations.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val1, count(*)
+ FROM eager_agg_t3 t1
+ JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+ QUERY PLAN
+------------------------------------------------------------------------------------
+ HashAggregate
+ Group Key: t1.id, t1.val1
+ -> Nested Loop
+ Join Filter: (((t1.val1)::text = (t2.val1)::text) AND (t1.val2 = t2.val2))
+ -> Seq Scan on eager_agg_t4 t2
+ -> Seq Scan on eager_agg_t3 t1
+(6 rows)
+
+-- Verify correct results (should return 1 row with count = 50)
+SELECT t1.id, t1.val1, count(*)
+ FROM eager_agg_t3 t1
+ JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+ id | val1 | count
+----+------+-------
+ 1 | A | 50
+(1 row)
+
+DROP TABLE eager_agg_t3;
+DROP TABLE eager_agg_t4;
+--
+-- Test for eager aggregation with explicit COLLATE on grouping expression
+--
+CREATE TABLE eager_agg_t5 (id int, val text COLLATE "C");
+CREATE TABLE eager_agg_t6 (val text COLLATE "C");
+INSERT INTO eager_agg_t5 SELECT 1, 'a' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t5 SELECT 1, 'A' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t6 VALUES ('A');
+ANALYZE eager_agg_t5;
+ANALYZE eager_agg_t6;
+-- When grouping by an expression with explicit non-deterministic COLLATE,
+-- eager aggregation should not be used even if the column's native collation
+-- is deterministic.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+ FROM eager_agg_t5 t1
+ JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+ QUERY PLAN
+-----------------------------------------------
+ HashAggregate
+ Group Key: t1.id, (t1.val)::text
+ -> Hash Join
+ Hash Cond: (t1.val = t2.val)
+ -> Seq Scan on eager_agg_t5 t1
+ -> Hash
+ -> Seq Scan on eager_agg_t6 t2
+(7 rows)
+
+-- Verify correct results (should return 1 row with count = 100, since 'a' and
+-- 'A' are equal under case_insensitive collation)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+ FROM eager_agg_t5 t1
+ JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+ id | val | count
+----+-----+-------
+ 1 | A | 50
+(1 row)
+
+DROP TABLE eager_agg_t5;
+DROP TABLE eager_agg_t6;
-- virtual generated columns
CREATE TABLE t5 (
a int,
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
index d1b86be3a62..2bf983d12cb 100644
--- a/src/test/regress/expected/eager_aggregate.out
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -448,6 +448,26 @@ GROUP BY t1.a ORDER BY t1.a;
-> Seq Scan on eager_agg_t1 t1
(9 rows)
+-- Ensure eager aggregation is not applied when FILTER clause contains
+-- volatile function
+EXPLAIN (COSTS OFF)
+SELECT t1.a, avg(t2.c) FILTER (WHERE random() > 0.5)
+ FROM eager_agg_t1 t1
+ JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+-----------------------------------------------------
+ GroupAggregate
+ Group Key: t1.a
+ -> Sort
+ Sort Key: t1.a
+ -> Hash Join
+ Hash Cond: (t2.b = t1.b)
+ -> Seq Scan on eager_agg_t2 t2
+ -> Hash
+ -> Seq Scan on eager_agg_t1 t1
+(9 rows)
+
DROP TABLE eager_agg_t1;
DROP TABLE eager_agg_t2;
DROP TABLE eager_agg_t3;
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 0e6b76b11b8..93c22b37727 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -1021,6 +1021,76 @@ GROUP BY t1.id;
DROP TABLE eager_agg_t1;
DROP TABLE eager_agg_t2;
+--
+-- Test for eager aggregation with multiple columns having different collations
+--
+CREATE TABLE eager_agg_t3 (
+ id int,
+ val1 text COLLATE case_insensitive,
+ val2 text COLLATE "C"
+);
+CREATE TABLE eager_agg_t4 (
+ val1 text COLLATE case_insensitive,
+ val2 text COLLATE "C"
+);
+
+INSERT INTO eager_agg_t3 SELECT 1, 'a', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t3 SELECT 1, 'A', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t4 VALUES ('A', 'x');
+
+ANALYZE eager_agg_t3;
+ANALYZE eager_agg_t4;
+
+-- Ensure that eager aggregation is not used when grouping by a column with
+-- non-deterministic collation, even when other grouping columns have
+-- deterministic collations.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val1, count(*)
+ FROM eager_agg_t3 t1
+ JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+
+-- Verify correct results (should return 1 row with count = 50)
+SELECT t1.id, t1.val1, count(*)
+ FROM eager_agg_t3 t1
+ JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+
+DROP TABLE eager_agg_t3;
+DROP TABLE eager_agg_t4;
+
+--
+-- Test for eager aggregation with explicit COLLATE on grouping expression
+--
+CREATE TABLE eager_agg_t5 (id int, val text COLLATE "C");
+CREATE TABLE eager_agg_t6 (val text COLLATE "C");
+
+INSERT INTO eager_agg_t5 SELECT 1, 'a' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t5 SELECT 1, 'A' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t6 VALUES ('A');
+
+ANALYZE eager_agg_t5;
+ANALYZE eager_agg_t6;
+
+-- When grouping by an expression with explicit non-deterministic COLLATE,
+-- eager aggregation should not be used even if the column's native collation
+-- is deterministic.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+ FROM eager_agg_t5 t1
+ JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+
+-- Verify correct results (should return 1 row with count = 100, since 'a' and
+-- 'A' are equal under case_insensitive collation)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+ FROM eager_agg_t5 t1
+ JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+
+DROP TABLE eager_agg_t5;
+DROP TABLE eager_agg_t6;
+
-- virtual generated columns
CREATE TABLE t5 (
a int,
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
index 97e10dd7cf4..9c935ef0633 100644
--- a/src/test/regress/sql/eager_aggregate.sql
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -171,6 +171,14 @@ SELECT t1.a, avg(t2.c + random())
JOIN eager_agg_t2 t2 ON t1.b = t2.b
GROUP BY t1.a ORDER BY t1.a;
+-- Ensure eager aggregation is not applied when FILTER clause contains
+-- volatile function
+EXPLAIN (COSTS OFF)
+SELECT t1.a, avg(t2.c) FILTER (WHERE random() > 0.5)
+ FROM eager_agg_t1 t1
+ JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
DROP TABLE eager_agg_t1;
DROP TABLE eager_agg_t2;
DROP TABLE eager_agg_t3;
Attachments:
[text/plain] more-tests.diff.nocfbot (8.0K, 2-more-tests.diff.nocfbot)
download | inline diff:
diff --git a/src/test/regress/expected/collate.icu.utf8.out b/src/test/regress/expected/collate.icu.utf8.out
index fbcdb7eb58c..a2dd8a34da4 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -2726,6 +2726,95 @@ GROUP BY t1.id;
DROP TABLE eager_agg_t1;
DROP TABLE eager_agg_t2;
+--
+-- Test for eager aggregation with multiple columns having different collations
+--
+CREATE TABLE eager_agg_t3 (
+ id int,
+ val1 text COLLATE case_insensitive,
+ val2 text COLLATE "C"
+);
+CREATE TABLE eager_agg_t4 (
+ val1 text COLLATE case_insensitive,
+ val2 text COLLATE "C"
+);
+INSERT INTO eager_agg_t3 SELECT 1, 'a', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t3 SELECT 1, 'A', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t4 VALUES ('A', 'x');
+ANALYZE eager_agg_t3;
+ANALYZE eager_agg_t4;
+-- Ensure that eager aggregation is not used when grouping by a column with
+-- non-deterministic collation, even when other grouping columns have
+-- deterministic collations.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val1, count(*)
+ FROM eager_agg_t3 t1
+ JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+ QUERY PLAN
+------------------------------------------------------------------------------------
+ HashAggregate
+ Group Key: t1.id, t1.val1
+ -> Nested Loop
+ Join Filter: (((t1.val1)::text = (t2.val1)::text) AND (t1.val2 = t2.val2))
+ -> Seq Scan on eager_agg_t4 t2
+ -> Seq Scan on eager_agg_t3 t1
+(6 rows)
+
+-- Verify correct results (should return 1 row with count = 50)
+SELECT t1.id, t1.val1, count(*)
+ FROM eager_agg_t3 t1
+ JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+ id | val1 | count
+----+------+-------
+ 1 | A | 50
+(1 row)
+
+DROP TABLE eager_agg_t3;
+DROP TABLE eager_agg_t4;
+--
+-- Test for eager aggregation with explicit COLLATE on grouping expression
+--
+CREATE TABLE eager_agg_t5 (id int, val text COLLATE "C");
+CREATE TABLE eager_agg_t6 (val text COLLATE "C");
+INSERT INTO eager_agg_t5 SELECT 1, 'a' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t5 SELECT 1, 'A' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t6 VALUES ('A');
+ANALYZE eager_agg_t5;
+ANALYZE eager_agg_t6;
+-- When grouping by an expression with explicit non-deterministic COLLATE,
+-- eager aggregation should not be used even if the column's native collation
+-- is deterministic.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+ FROM eager_agg_t5 t1
+ JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+ QUERY PLAN
+-----------------------------------------------
+ HashAggregate
+ Group Key: t1.id, (t1.val)::text
+ -> Hash Join
+ Hash Cond: (t1.val = t2.val)
+ -> Seq Scan on eager_agg_t5 t1
+ -> Hash
+ -> Seq Scan on eager_agg_t6 t2
+(7 rows)
+
+-- Verify correct results (should return 1 row with count = 100, since 'a' and
+-- 'A' are equal under case_insensitive collation)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+ FROM eager_agg_t5 t1
+ JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+ id | val | count
+----+-----+-------
+ 1 | A | 50
+(1 row)
+
+DROP TABLE eager_agg_t5;
+DROP TABLE eager_agg_t6;
-- virtual generated columns
CREATE TABLE t5 (
a int,
diff --git a/src/test/regress/expected/eager_aggregate.out b/src/test/regress/expected/eager_aggregate.out
index d1b86be3a62..2bf983d12cb 100644
--- a/src/test/regress/expected/eager_aggregate.out
+++ b/src/test/regress/expected/eager_aggregate.out
@@ -448,6 +448,26 @@ GROUP BY t1.a ORDER BY t1.a;
-> Seq Scan on eager_agg_t1 t1
(9 rows)
+-- Ensure eager aggregation is not applied when FILTER clause contains
+-- volatile function
+EXPLAIN (COSTS OFF)
+SELECT t1.a, avg(t2.c) FILTER (WHERE random() > 0.5)
+ FROM eager_agg_t1 t1
+ JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+ QUERY PLAN
+-----------------------------------------------------
+ GroupAggregate
+ Group Key: t1.a
+ -> Sort
+ Sort Key: t1.a
+ -> Hash Join
+ Hash Cond: (t2.b = t1.b)
+ -> Seq Scan on eager_agg_t2 t2
+ -> Hash
+ -> Seq Scan on eager_agg_t1 t1
+(9 rows)
+
DROP TABLE eager_agg_t1;
DROP TABLE eager_agg_t2;
DROP TABLE eager_agg_t3;
diff --git a/src/test/regress/sql/collate.icu.utf8.sql b/src/test/regress/sql/collate.icu.utf8.sql
index 0e6b76b11b8..93c22b37727 100644
--- a/src/test/regress/sql/collate.icu.utf8.sql
+++ b/src/test/regress/sql/collate.icu.utf8.sql
@@ -1021,6 +1021,76 @@ GROUP BY t1.id;
DROP TABLE eager_agg_t1;
DROP TABLE eager_agg_t2;
+--
+-- Test for eager aggregation with multiple columns having different collations
+--
+CREATE TABLE eager_agg_t3 (
+ id int,
+ val1 text COLLATE case_insensitive,
+ val2 text COLLATE "C"
+);
+CREATE TABLE eager_agg_t4 (
+ val1 text COLLATE case_insensitive,
+ val2 text COLLATE "C"
+);
+
+INSERT INTO eager_agg_t3 SELECT 1, 'a', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t3 SELECT 1, 'A', 'x' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t4 VALUES ('A', 'x');
+
+ANALYZE eager_agg_t3;
+ANALYZE eager_agg_t4;
+
+-- Ensure that eager aggregation is not used when grouping by a column with
+-- non-deterministic collation, even when other grouping columns have
+-- deterministic collations.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val1, count(*)
+ FROM eager_agg_t3 t1
+ JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+
+-- Verify correct results (should return 1 row with count = 50)
+SELECT t1.id, t1.val1, count(*)
+ FROM eager_agg_t3 t1
+ JOIN eager_agg_t4 t2 ON t1.val1 = t2.val1 COLLATE "C" AND t1.val2 = t2.val2
+GROUP BY t1.id, t1.val1;
+
+DROP TABLE eager_agg_t3;
+DROP TABLE eager_agg_t4;
+
+--
+-- Test for eager aggregation with explicit COLLATE on grouping expression
+--
+CREATE TABLE eager_agg_t5 (id int, val text COLLATE "C");
+CREATE TABLE eager_agg_t6 (val text COLLATE "C");
+
+INSERT INTO eager_agg_t5 SELECT 1, 'a' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t5 SELECT 1, 'A' FROM generate_series(1, 50);
+INSERT INTO eager_agg_t6 VALUES ('A');
+
+ANALYZE eager_agg_t5;
+ANALYZE eager_agg_t6;
+
+-- When grouping by an expression with explicit non-deterministic COLLATE,
+-- eager aggregation should not be used even if the column's native collation
+-- is deterministic.
+EXPLAIN (COSTS OFF)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+ FROM eager_agg_t5 t1
+ JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+
+-- Verify correct results (should return 1 row with count = 100, since 'a' and
+-- 'A' are equal under case_insensitive collation)
+SELECT t1.id, t1.val COLLATE case_insensitive, count(*)
+ FROM eager_agg_t5 t1
+ JOIN eager_agg_t6 t2 ON t1.val = t2.val
+GROUP BY t1.id, t1.val COLLATE case_insensitive;
+
+DROP TABLE eager_agg_t5;
+DROP TABLE eager_agg_t6;
+
-- virtual generated columns
CREATE TABLE t5 (
a int,
diff --git a/src/test/regress/sql/eager_aggregate.sql b/src/test/regress/sql/eager_aggregate.sql
index 97e10dd7cf4..9c935ef0633 100644
--- a/src/test/regress/sql/eager_aggregate.sql
+++ b/src/test/regress/sql/eager_aggregate.sql
@@ -171,6 +171,14 @@ SELECT t1.a, avg(t2.c + random())
JOIN eager_agg_t2 t2 ON t1.b = t2.b
GROUP BY t1.a ORDER BY t1.a;
+-- Ensure eager aggregation is not applied when FILTER clause contains
+-- volatile function
+EXPLAIN (COSTS OFF)
+SELECT t1.a, avg(t2.c) FILTER (WHERE random() > 0.5)
+ FROM eager_agg_t1 t1
+ JOIN eager_agg_t2 t2 ON t1.b = t2.b
+GROUP BY t1.a ORDER BY t1.a;
+
DROP TABLE eager_agg_t1;
DROP TABLE eager_agg_t2;
DROP TABLE eager_agg_t3;
view thread (75+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Eager aggregation, take 3
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox