Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sS1gj-003CJE-SR for pgsql-hackers@arkaria.postgresql.org; Thu, 11 Jul 2024 21:50:54 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sS1gi-0017Ed-H3 for pgsql-hackers@arkaria.postgresql.org; Thu, 11 Jul 2024 21:50:52 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sS1gi-0017EU-4m for pgsql-hackers@lists.postgresql.org; Thu, 11 Jul 2024 21:50:52 +0000 Received: from mail-ed1-x535.google.com ([2a00:1450:4864:20::535]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sS1gf-001chQ-7r for pgsql-hackers@postgresql.org; Thu, 11 Jul 2024 21:50:51 +0000 Received: by mail-ed1-x535.google.com with SMTP id 4fb4d7f45d1cf-58b0dddab8cso2464183a12.0 for ; Thu, 11 Jul 2024 14:50:49 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1720734648; x=1721339448; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=ss2XwriYmznrmU98/lg6FxeN1sd3t4Wfa2pNJo68GBI=; b=ZO0d4qJjs1v+SfBxvN7OoX9NiWu3KSY9/EomBSnZKdr/PCDBnC7pVsQUqq6DloEzfm T/uQ9zJrajTX8yb5yHSTmyBidJ5ulNMA7cXJ0fyLvlAwLVuws4Wj0sHp5rE0q3HrPEL4 nBEMA+0+v67e4V/rLMbIiZjXz0bWTCwRsPJFl106w6dhcBaWcA+9UI2BssZr/aULo6Un goehTB2MeWrErJTbUnaBCwsUdn1RKNj4CaiW+p/bogXFHwKg4iB6c1Lho3+c8mpFOvJ+ tEBkBpFX0nq7Pd856+pxbp0gHvvf3ZMxyL/wfSlEKsxuYgb5p7jFHl+CwHuwzD5pso9d ocdw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1720734648; x=1721339448; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=ss2XwriYmznrmU98/lg6FxeN1sd3t4Wfa2pNJo68GBI=; b=lPKLc+1e6PW9VsSmi1Mpmq3jv6U2EjgO0ecBoi02iBX0uOt9jPeqje01PRW7HtGY6h e87xZcL4JequNlXVorNvijZmewupSSOSIS00pWB7s6k2faKfkSvAz4hOcr/InIQ4Snnf HBzitsI7nr/J6Q4VqOIzk5tyxUTBsnqqupp/V9vTpWuTWnW4JdeHgVTqih2NqVwKRypd vKUXqfoRsnRc9WooIOPg74B/ToXbxDfqrHflSD+/baRC1fufrq38CyXhAuTwqokFpYcy cIWMecoVUpvSvTR3s+1DLeu/d9T4xkOyNqy5621ffU2eJGwRhLcrHassJcnU8mZHKWpB trjA== X-Forwarded-Encrypted: i=1; AJvYcCUmlQxScZHpVkKbSpuX0BCV5oLhGnZGjS38X5+fMB+OixLTnFKe36TMFSRYHKHT0w5WyfEjPDPNTsII9AEePo/PJbMLNnb3CyGy+U7m X-Gm-Message-State: AOJu0YyZfe37Upqu3dg56VgEjnLBooJFu9QoDJL8VB4BvdTprCn1onAI TEpBZ/E86idRn7khiDDDLdKxFjev6sHMgR1Z9DGO7KwaQsEouy4v3TY78tNPP9zzyZeYbbYx7K1 YwVnEZ8vlHoc4krKwTwlYhKGtCb0= X-Google-Smtp-Source: AGHT+IFX22Yzru+P+n5wdI5wLlaIgz+kqpNION5pCVYBQcFv4L0NejWsRcYJbIgQdUeiudPlTo8bQOhAeaCMDlDcXwU= X-Received: by 2002:aa7:d88a:0:b0:57c:8027:534d with SMTP id 4fb4d7f45d1cf-594bc7ca26amr5445916a12.27.1720734648256; Thu, 11 Jul 2024 14:50:48 -0700 (PDT) MIME-Version: 1.0 References: <87il22cj51.fsf@163.com> In-Reply-To: From: Paul George Date: Thu, 11 Jul 2024 14:50:35 -0700 Message-ID: Subject: Re: Eager aggregation, take 3 To: Richard Guo Cc: Andy Fan , PostgreSQL-development , pgsql-hackers@lists.postgresql.org Content-Type: multipart/alternative; boundary="000000000000764e7b061cffc121" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000764e7b061cffc121 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hey Richard, Looking more closely at this example >select t2.a, count(*) from t t1 left join t t2 on t2.b > 1 group by t2.a having t2.a is null; I wonder if the inability to exploit eager aggregation is more based on the fact that COUNT(*) cannot be decomposed into an aggregation of PARTIAL COUNT(*)s (apologies if my terminology is off/made up...I'm new to the codebase). In other words, is it the case that a given aggregate function already has built-in protection against the error case you correctly pointed out? To highlight this, in the simple example below we don't see aggregate pushdown even with an INNER JOIN when the agg function is COUNT(*) but we do when it's COUNT(t2.*): -- same setup drop table if exists t; create table t(a int, b int, c int); insert into t select i % 100, i % 10, i from generate_series(1, 1000) i; analyze t; -- query 1: COUNT(*) --> no pushdown set enable_eager_aggregate=3Don; explain (verbose, costs off) select t1.a, count(*) from t t1 join t t2 on t1.a=3Dt2.a group by t1.a; QUERY PLAN ------------------------------------------- HashAggregate Output: t1.a, count(*) Group Key: t1.a -> Hash Join Output: t1.a Hash Cond: (t1.a =3D t2.a) -> Seq Scan on public.t t1 Output: t1.a, t1.b, t1.c -> Hash Output: t2.a -> Seq Scan on public.t t2 Output: t2.a (12 rows) -- query 2: COUNT(t2.*) --> agg pushdown set enable_eager_aggregate=3Don; explain (verbose, costs off) select t1.a, count(t2.*) from t t1 join t t2 on t1.a=3Dt2.a group by t1.a; QUERY PLAN ------------------------------------------------------- Finalize HashAggregate Output: t1.a, count(t2.*) Group Key: t1.a -> Hash Join Output: t1.a, (PARTIAL count(t2.*)) Hash Cond: (t1.a =3D t2.a) -> Seq Scan on public.t t1 Output: t1.a, t1.b, t1.c -> Hash Output: t2.a, (PARTIAL count(t2.*)) -> Partial HashAggregate Output: t2.a, PARTIAL count(t2.*) Group Key: t2.a -> Seq Scan on public.t t2 Output: t2.*, t2.a (15 rows) ...while it might be true that COUNT(*) ... INNER JOIN should allow eager agg pushdown (I haven't thought deeply about it, TBH), I did find this result pretty interesting. -Paul On Wed, Jul 10, 2024 at 1:27=E2=80=AFAM Richard Guo wrote: > On Sun, Jul 7, 2024 at 10:45=E2=80=AFAM Paul George > wrote: > > Thanks for reviving this patch and for all of your work on it! Eager > aggregation pushdown will be beneficial for my work and I'm hoping to see > it land. > > Thanks for looking at this patch! > > > The output of both the original query and this one match (and the plans > with eager aggregation and the subquery are nearly identical if you resto= re > the LEFT JOIN to a JOIN). I admittedly may be missing a subtlety, but doe= s > this mean that there are conditions under which eager aggregation can be > pushed down to the nullable side? > > I think it's a very risky thing to push a partial aggregation down to > the nullable side of an outer join, because the NULL-extended rows > produced by the outer join would not be available when we perform the > partial aggregation, while with a non-eager-aggregation plan these > rows are available for the top-level aggregation. This may put the > rows into groups in a different way than expected, or get wrong values > from the aggregate functions. I've managed to compose an example: > > create table t (a int, b int); > insert into t select 1, 1; > > select t2.a, count(*) from t t1 left join t t2 on t2.b > 1 group by > t2.a having t2.a is null; > a | count > ---+------- > | 1 > (1 row) > > This is the expected result, because after the outer join we have got > a NULL-extended row. > > But if we somehow push down the partial aggregation to the nullable > side of this outer join, we would get a wrong result. > > explain (costs off) > select t2.a, count(*) from t t1 left join t t2 on t2.b > 1 group by > t2.a having t2.a is null; > QUERY PLAN > ------------------------------------------- > Finalize HashAggregate > Group Key: t2.a > -> Nested Loop Left Join > Filter: (t2.a IS NULL) > -> Seq Scan on t t1 > -> Materialize > -> Partial HashAggregate > Group Key: t2.a > -> Seq Scan on t t2 > Filter: (b > 1) > (10 rows) > > select t2.a, count(*) from t t1 left join t t2 on t2.b > 1 group by > t2.a having t2.a is null; > a | count > ---+------- > | 0 > (1 row) > > I believe there are cases where pushing a partial aggregation down to > the nullable side of an outer join can be safe, but I doubt that there > is an easy way to identify these cases and do the push-down for them. > So for now I think we'd better refrain from doing that. > > Thanks > Richard > --000000000000764e7b061cffc121 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hey Richard,

Looking more closely at this example

>select t2.a, count(*) from t t1 left jo= in t t2 on t2.b > 1 group by=C2=A0t2.a having=C2=A0t2.a is null;

=
I wonder if the in= ability to exploit eager aggregation is more based on the fact that COUNT(*= ) cannot be decomposed into an aggregation of PARTIAL COUNT(*)s (apologies = if my terminology is off/made up...I'm new to the codebase). In other w= ords, is it the case that a given aggregate function already=C2=A0has built= -in protection against the error case you correctly pointed out?

To highlight this, in the simpl= e example below we don't see aggregate pushdown even with an INNER JOIN= when the agg function is COUNT(*) but we do when=C2=A0it's COUNT(t2.*)= :

<= div class=3D"gmail_default" style=3D"font-size:x-small">-- same setup
=
drop table if exis= ts t;
create table t(a int, b int, c int);
insert into t select i % 1= 00, i % 10, i from generate_series(1, 1000) i;
analyze t;

-- query 1: COUNT(*) --> no push= down

set enable_eager= _aggregate=3Don;
explain (verbose, costs off) select t1.a, count(*) from= t t1 join t t2 on t1.a=3Dt2.a group by t1.a;

=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 QUERY PLAN =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0
-------------------------------------------
=C2=A0HashAggregate<= br>=C2=A0 =C2=A0Output: t1.a, count(*)
=C2=A0 =C2=A0Group Key: t1.a
= =C2=A0 =C2=A0-> =C2=A0Hash Join
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Out= put: t1.a
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Hash Cond: (t1.a =3D t2.a)=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-> =C2=A0Seq Scan on public.t t1
= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Output: t1.a, t1.b, = t1.c
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-> =C2=A0Hash
=C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Output: t2.a
=C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-> =C2=A0Seq Scan on public.t t2=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0Output: t2.a
(12 rows)


-- query 2: COUNT(t2.*) --> agg pushdown

set enable_eager_aggregate=3Don;
explai= n (verbose, costs off) select t1.a, count(t2.*) from t t1 join t t2 on t1.a= =3Dt2.a group by t1.a;

=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 QUERY PLAN =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0
--------------------------------------------------= -----
=C2=A0Finalize HashAggregate
=C2=A0 =C2=A0Output: t1.a, count(t= 2.*)
=C2=A0 =C2=A0Group Key: t1.a
=C2=A0 =C2=A0-> =C2=A0Hash Join<= br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Output: t1.a, (PARTIAL count(t2.*))=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Hash Cond: (t1.a =3D t2.a)
=C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0-> =C2=A0Seq Scan on public.t t1
=C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Output: t1.a, t1.b, t1.c
= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-> =C2=A0Hash
=C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Output: t2.a, (PARTIAL count(t2.*))
= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0-> =C2=A0Partial = HashAggregate
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0Output: t2.a, PARTIAL count(t2.*)
=C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Group Key: t2.a<= br>=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0-> =C2=A0Seq Scan on public.t t2
=C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Output= : t2.*, t2.a
(15 rows)

...while it might be true that COUNT(*) ... INNER JOIN should allow ea= ger agg pushdown (I haven't thought deeply about it, TBH), I did find t= his result pretty interesting.


-Paul

On Wed, Jul 10, 2024 at 1:27=E2=80=AFAM Richard Guo <guofenglinux@gmail.com> wrote:
On Sun, Jul 7, 2024 at = 10:45=E2=80=AFAM Paul George <p.a.george19@gmail.com> wrote:
> Thanks for reviving this patch and for all of your work on it! Eager a= ggregation pushdown will be beneficial for my work and I'm hoping to se= e it land.

Thanks for looking at this patch!

> The output of both the original query and this one match (and the plan= s with eager aggregation and the subquery are nearly identical if you resto= re the LEFT JOIN to a JOIN). I admittedly may be missing a subtlety, but do= es this mean that there are conditions under which eager aggregation can be= pushed down to the nullable side?

I think it's a very risky thing to push a partial aggregation down to the nullable side of an outer join, because the NULL-extended rows
produced by the outer join would not be available when we perform the
partial aggregation, while with a non-eager-aggregation plan these
rows are available for the top-level aggregation.=C2=A0 This may put the rows into groups in a different way than expected, or get wrong values
from the aggregate functions.=C2=A0 I've managed to compose an example:=

create tab= le t (a int, b int);
insert into t select 1, 1;

select t2.a, count(*) from t t1 left join t t2 on t2.b > 1 group by
t2.a having t2.a is null;
=C2=A0a | count
---+-------
=C2=A0 =C2=A0|=C2=A0 =C2=A0 =C2=A01
(1 row)

This is the expected result, because after the outer join we have got
a NULL-extended row.

But if we somehow push down the partial aggregation to the nullable
side of this outer join, we would get a wrong result.

explain (costs off)
select t2.a, count(*) from t t1 left join t t2 on t2.b > 1 group by
t2.a having t2.a is null;
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 QUERY PLAN
-------------------------------------------
=C2=A0Finalize HashAggregate
=C2=A0 =C2=A0Group Key: t2.a
=C2=A0 =C2=A0->=C2=A0 Nested Loop Left Join
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0Filter: (t2.a IS NULL)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0->=C2=A0 Seq Scan on t t1
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0->=C2=A0 Materialize
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0->=C2=A0 Partial = HashAggregate
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0Group Key: t2.a
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0->=C2=A0 Seq Scan on t t2
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0Filter: (b > 1)
(10 rows)

select t2.a, count(*) from t t1 left join t t2 on t2.b > 1 group by
t2.a having t2.a is null;
=C2=A0a | count
---+-------
=C2=A0 =C2=A0|=C2=A0 =C2=A0 =C2=A00
(1 row)

I believe there are cases where pushing a partial aggregation down to
the nullable side of an outer join can be safe, but I doubt that there
is an easy way to identify these cases and do the push-down for them.
So for now I think we'd better refrain from doing that.

Thanks
Richard
--000000000000764e7b061cffc121--