Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1umdXO-001Ec0-D8 for pgsql-hackers@arkaria.postgresql.org; Thu, 14 Aug 2025 19:22:58 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1umdXM-0095mp-T2 for pgsql-hackers@arkaria.postgresql.org; Thu, 14 Aug 2025 19:22:57 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1umdXM-0095mh-Dm for pgsql-hackers@lists.postgresql.org; Thu, 14 Aug 2025 19:22:56 +0000 Received: from mail-pj1-x1035.google.com ([2607:f8b0:4864:20::1035]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1umdXK-000b1Y-0Z for pgsql-hackers@postgresql.org; Thu, 14 Aug 2025 19:22:55 +0000 Received: by mail-pj1-x1035.google.com with SMTP id 98e67ed59e1d1-32326e69f1dso1466552a91.3 for ; Thu, 14 Aug 2025 12:22:54 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1755199374; x=1755804174; darn=postgresql.org; h=in-reply-to:references:mime-version:to:from:subject:cc:message-id :date:content-transfer-encoding:from:to:cc:subject:date:message-id :reply-to; bh=gGjj4/KmvJO4O2Kc/TiI2QkTuaun0DwrmdwWt5FAlWQ=; b=QGkUrR5DbH9b4XXNT7+rDAit+ZCXU26AjklQCo/jGjGg6S7J2M+OrD4L/APyG4VGIi yhsCGZfBv0cInArvSLbMMPj3RoN+FWuCGxhd0dKRH7hhkeOW71Td3DMzZT07X6ZnyIPA 1XwcRsLrQ97aV9VBDUpvY8J5bqgdFpAgG/nX9mPyIoR54M83PdYi2PK0pBmJal0VdT7U Sxm7fxCJTOKkpYkCcpz3pPTFqZYqlOuiIWxSKXXl8PzLNIJqHgTjjBgu8/MnTVBTPuwg 9bcoK/qO/0embLeVqosz3/fCRb7Jd+DRRIU86dSstfSrIp+HHUalkaTdDsaUHNXCKh+2 0Dcg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1755199374; x=1755804174; h=in-reply-to:references:mime-version:to:from:subject:cc:message-id :date:content-transfer-encoding:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=gGjj4/KmvJO4O2Kc/TiI2QkTuaun0DwrmdwWt5FAlWQ=; b=S7RyILYyc6kUS+iVFh2P1zpk9U+aXlnSoaLQn2Fd6UQbM/upayBJ7ccmcXktNUlmPJ CMaBC9TJ6t4dhsjpNSmPP/jpN8029vdIGGCMOKjLCn5GP9silyvqmffteHh3FoV5fbfR c11Pxvy7NVzwpPkj043oTak38fOMz0OVRsPc/rOchzPw914+y4MrcSoL1yr/u6iyx5n5 hheg1NvhGyBKbbauxQyi8GEnW9L6AG8XgpUyoStP/fSKYvHHK13LW+smDa3ze8MQihFO BrPbOjq3by0totlj04PrRa5Bscy0QoNaCTiqY+BwceMqYnj2znFztOralv+gRz1kbCxW iLcA== X-Forwarded-Encrypted: i=1; AJvYcCWCgA3UWxGbzMEZpE7mx/P0xbKTuxFtU1LBQNBf9umVX20lQoe+aZgYxpGFk4M1iHim0DEOgzF+LI0xSo2/@postgresql.org X-Gm-Message-State: AOJu0YxyZXqRqTMI7bUcy1LBtm/ZTui6mPhkH3OWhITLG80l8K4NrID5 Ayqs5/6UCKLK1KHiHfgmVULa9NQu5sK0PkBq8fFiVuhpxfwSNJYTZVoO X-Gm-Gg: ASbGncu8KFoJ0kVgtitMSgwyKsxhbbUPoMdu9+VT8JDju9nslgVMO5EPtFVRk55TAp3 Ovnqa3T7PdRXTjcjo0oEWzm6mVClTEHasZM5O55lSJDQyq9j3Wkem/N9O1A0ssdoHrjNlRZQ+4k m9c9V6uCTsbvejHLBJKPQp6cbWOfvIa/q+UXyLviPpjHnuC8Bk15t8w+BuR/HSwg/MqZrLsnY3s KB1XAPJfCWVXPXBjwZ32Wmh6UjFwbtPtHKAXmvKGAqIte14nZ08CJ2pEsAgnE0rIaeMjGrEccxC fXotyB/ztglsPn/ThBw4/R6H4yF3lvBkX3KwKeXEY6dV3KkZxnHVVAnTSZicReh3OfKDH9W6VEQ aAxITtEVmblaaHZoJhIWVn/6fRc3RMcXwtoEBlIuxIcfKaQ== X-Google-Smtp-Source: AGHT+IGH3hBrnnMJKFyIJyjyJ1XfKMS6z6wuuwFMCyAUmyv+rz4Uq/mQwkaqbziK0uTCq/R2HxnA3A== X-Received: by 2002:a17:90b:578b:b0:31f:11d6:cea0 with SMTP id 98e67ed59e1d1-32327ca13bemr6564494a91.27.1755199373558; Thu, 14 Aug 2025 12:22:53 -0700 (PDT) Received: from localhost ([2804:14d:328a:a59c:4432:dc0c:5c1d:552a]) by smtp.gmail.com with ESMTPSA id 41be03b00d2f7-b4371980668sm12258854a12.30.2025.08.14.12.22.50 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Thu, 14 Aug 2025 12:22:53 -0700 (PDT) Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=UTF-8 Date: Thu, 14 Aug 2025 16:22:49 -0300 Message-Id: Cc: "Robert Haas" , "Tom Lane" , "Tender Wang" , "Paul George" , "Andy Fan" , "PostgreSQL-development" , Subject: Re: Eager aggregation, take 3 From: "Matheus Alcantara" To: "Richard Guo" MIME-Version: 1.0 References: <87il22cj51.fsf@163.com> In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On 08/08/25 22:32, Richard Guo wrote: >> It sounds like a good way to go for me, looking forward to the next >> patch version to perform some other tests. > > OK. Here it is. > Thanks! I can confirm now that I can see the eager aggregate in action in some of these queries that I've tested on the TPC-DS benchmark. I few questions regarding the new version: I've noticed that when a query has a WHERE clause filtering columns from the same relation being aggregated using "=3D" operator the Partial and Finalize aggregation nodes are not present on explain results even if setup_eager_aggregation() returns true on all if statements and also RelAggInfo->agg_useful is true. For example, consider this query that is used on eager aggregation paper that use some tables from TPC-H benchmark: tpch=3D# show enable_eager_aggregate ; enable_eager_aggregate ------------------------ on (1 row) tpch=3D# set max_parallel_workers_per_gather to 0; SET tpch=3D# EXPLAIN(COSTS OFF) SELECT O_CLERK, SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)) AS LOSS FROM LINEITEM JOIN ORDERS ON L_ORDERKEY =3D O_ORDERKEY WHERE L_RETURNFLAG =3D 'R' GROUP BY O_CLERK; QUERY PLAN -------------------------------------------------------------- HashAggregate Group Key: orders.o_clerk -> Hash Join Hash Cond: (lineitem.l_orderkey =3D orders.o_orderkey) -> Seq Scan on lineitem Filter: (l_returnflag =3D 'R'::bpchar) -> Hash -> Seq Scan on orders (8 rows) Debugging this query shows that all if conditions on setup_eager_aggregation() returns false and create_agg_clause_infos() and create_grouping_expr_infos() are called. The RelAggInfo->agg_useful is also being set to true so I would expect to see Finalize and Partial agg nodes, is this correct or am I missing something here? Removing the WHERE clause I can see the Finalize and Partial agg nodes: tpch=3D# EXPLAIN(COSTS OFF) SELECT O_CLERK, SUM(L_EXTENDEDPRICE * (1 - L_DISCOUNT)) AS LOSS FROM LINEITEM JOIN ORDERS ON L_ORDERKEY =3D O_ORDERKEY GROUP BY O_CLERK; QUERY PLAN ---------------------------------------------------------------------- Finalize HashAggregate Group Key: orders.o_clerk -> Merge Join Merge Cond: (lineitem.l_orderkey =3D orders.o_orderkey) -> Partial GroupAggregate Group Key: lineitem.l_orderkey -> Index Scan using idx_lineitem_orderkey on lineitem -> Index Scan using orders_pkey on orders (8 rows) This can also be reproduced with an addition of a WHERE clause on some tests on eager_aggregate.sql: postgres=3D# EXPLAIN (VERBOSE, COSTS OFF) SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b =3D t2.b WHERE t2.c =3D 5 GROUP BY t1.a ORDER BY t1.a; QUERY PLAN ------------------------------------------------------------------ GroupAggregate Output: t1.a, avg(t2.c) Group Key: t1.a -> Sort Output: t1.a, t2.c Sort Key: t1.a -> Hash Join Output: t1.a, t2.c Hash Cond: (t1.b =3D t2.b) -> Seq Scan on public.eager_agg_t1 t1 Output: t1.a, t1.b, t1.c -> Hash Output: t2.c, t2.b -> Seq Scan on public.eager_agg_t2 t2 Output: t2.c, t2.b Filter: (t2.c =3D '5'::double precision) (16 rows) Note that if I use ">" operator for example, this doesn't happen: SELECT t1.a, avg(t2.c) FROM eager_agg_t1 t1 JOIN eager_agg_t2 t2 ON t1.b =3D t2.b WHERE t2.c > 5 GROUP BY t1.a ORDER BY t1.a; QUERY PLAN ------------------------------------------------------------------------ Finalize GroupAggregate Output: t1.a, avg(t2.c) Group Key: t1.a -> Sort Output: t1.a, (PARTIAL avg(t2.c)) Sort Key: t1.a -> Hash Join Output: t1.a, (PARTIAL avg(t2.c)) Hash Cond: (t1.b =3D t2.b) -> Seq Scan on public.eager_agg_t1 t1 Output: t1.a, t1.b, t1.c -> Hash Output: t2.b, (PARTIAL avg(t2.c)) -> Partial HashAggregate Output: t2.b, PARTIAL avg(t2.c) Group Key: t2.b -> Seq Scan on public.eager_agg_t2 t2 Output: t2.a, t2.b, t2.c Filter: (t2.c > '5'::double precision) (19 rows) Is this behavior correct? If it's correct, would be possible to check this limitation on setup_eager_aggregation() and maybe skip all the other work? -- Matheus Alcantara