Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tYuM6-00GLb2-JV for pgsql-hackers@arkaria.postgresql.org; Fri, 17 Jan 2025 21:58:19 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tYuM5-007Egf-9R for pgsql-hackers@arkaria.postgresql.org; Fri, 17 Jan 2025 21:58:17 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tYuIn-0076XC-B3 for pgsql-hackers@lists.postgresql.org; Fri, 17 Jan 2025 21:54:53 +0000 Received: from mail-ej1-x62e.google.com ([2a00:1450:4864:20::62e]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1tYtiS-00057u-25 for pgsql-hackers@postgresql.org; Fri, 17 Jan 2025 21:17:44 +0000 Received: by mail-ej1-x62e.google.com with SMTP id a640c23a62f3a-aaf57c2e0beso516164266b.3 for ; Fri, 17 Jan 2025 13:17:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1737148591; x=1737753391; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=NbiNQ/RQ2Tpnj70D/D0YnWsY+DSkvLODFpGkcpDCJ8w=; b=AZ0mtyfjNwx4cU9FrrUEE1rFHdnChkhEj5wvQW+4Cq5OXfRbCP+jQbwP029ZlY0uub 1jZL01gn1IxmdZGwI6ZNhFZm/+oRoAWpaEAdI09L8x9D7AJB1TP71u9Prd0DEvWOHpUB BaAAVEmMleQNb3ejZ1Fz3pi52WyRFMXGSwTgAvJmhwukroNcwKYs1fla+T11yuLN6gwK 0X1k1Yr5zXZCMINyyxZrqXV1asEfrHYDeYGk4t8RFc5zfzDsRemXROBpj7l7L0y9NX6R 6YtUcEqO0PmI9Ir8uPI4BX2xA1k8eQYQFnpzJ9mePhoh5ASaOgNQVwlqez7auSOhQrxn DZLQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1737148591; x=1737753391; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=NbiNQ/RQ2Tpnj70D/D0YnWsY+DSkvLODFpGkcpDCJ8w=; b=JZ6/R1rk+m/nx81oXJEQCn3767Mnlukjf6EFQDX2aLp42xW7adp/V7yjNGhuRwwWCS U31kHOyGq3aBB+qFfLH4fo4zefxrKbuU2xxrhpoOTG1l9ZRt4Lppz60yZ9bRRHAAGWJR Mi+d+wTD6iYHaw0lcWJUflMCd3+QIRy5ocUMJ7m8sxdKq3+G50GtrF2sal+t48JTtlOI jYeguzhP0qW0Tvs/8mCEAmp5QUyFr4ELkxCpZuv5zJASXoNa9MM8HfPONFmr8BcIa+43 tYPP9gFijH055TYMInvl+CI/ClAc1EPRmczHf+g22wOmX8jB3mhjROyQuP3R8UAFA4GS durQ== X-Forwarded-Encrypted: i=1; AJvYcCXNB6NSwMtzmqKUYTHVEdBWlccaytY57GaoT6lSlJdjDXH6C2Ax6gVEt2622oCnQh8JoAZlojcw2ThKcON+@postgresql.org X-Gm-Message-State: AOJu0YwLnzbqzR57NZP3tPXTyR2rNXadcit7KfzlDTK+bIloAzKvDP42 ix4R/EvvkQ0SHGHi71afN9C9dEeewz6aosUJ+RnoQwrwjM7RPSbYJh2r93pQZHKau73AhdI7d1G adqFVO9SrXz9mMynY2NL+B1I4JpI= X-Gm-Gg: ASbGnct0VtCtNFQePcZEV9KnBoQsevczqgTG5kiQiUOiJBKuqEdWf8+Dqig7dRJgrUg GdAF8+gTwjvTc/QYhQsFEH2Ps/zI1z171FUq71Q== X-Google-Smtp-Source: AGHT+IFclQJP7Z4otJ/FmRYKquKbywCpE8QsHhWMGyfh2YrFX8bvxahykuHAv47HBHvgaGFcXIfz3pTyBAGdP1OMEKk= X-Received: by 2002:a17:907:971a:b0:aa6:66eb:9c06 with SMTP id a640c23a62f3a-ab38b1e207bmr379588466b.5.1737148590457; Fri, 17 Jan 2025 13:16:30 -0800 (PST) MIME-Version: 1.0 References: <87il22cj51.fsf@163.com> In-Reply-To: From: Robert Haas Date: Fri, 17 Jan 2025 16:16:19 -0500 X-Gm-Features: AbW1kvYunt0H5torQdfCKRQafOUYHFbFlsZu58bMyxLKtsIA7biJU2UySjfTwx8 Message-ID: Subject: Re: Eager aggregation, take 3 To: Richard Guo Cc: Tom Lane , Tender Wang , Paul George , Andy Fan , PostgreSQL-development , pgsql-hackers@lists.postgresql.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Thu, Jan 16, 2025 at 3:18=E2=80=AFAM Richard Guo wrote: > If this t1/t2 join is part of a larger SELECT query, I think the cost > estimates for the upper join nodes would likely be quite inaccurate. That's definitely true. However, the question is not whether the planner has problems today (it definitely does) but whether it's OK to make this change without improving our ability to estimate the effects of aggregation operations. I understand that you (quite rightly) don't want to get sucked into fixing unrelated planner problems, and I'm also not sure to what extent these problems are actually fixable. However, major projects sometimes require such work. For instance, commit 5edc63bda68a77c4d38f0cbeae1c4271f9ef4100 was motivated by the discovery that it was too easy to get a Parallel Bitmap Heap Scan plan even when it wasn't best. The fact that the costing wasn't right wasn't the fault of parallel query, but parallel query still needed to do something about it to get good results. > Yeah, I know 7 is not a large number, but this is the result I got > from running the TPC-DS benchmark. For the remaining 92 queries in > the benchmark, either the logic in this patch determines that eager > aggregation is not applicable, or the path with eager aggregation is > not the optimal one. I'd be more than happy if a benchmark query > showed significant performance regression, so it would provide an > opportunity to investigate how the cost estimates are negatively > impacting the final plan and explore ways to avoid or improve that. > If anyone can provide such a benchmark query, I'd be very grateful. Yes, having more people test this and look for regressions would be quite valuable. > Well, from the perspective of planning effort, what really matters is > whether the RelOptInfo for the grouped relation is added to the > PlannerInfo, as it is only then available for further joining in the > join search routine, not whether the RelOptInfo is built or not. > Building the RelOptInfo for a grouped relation is simply a makeNode > call followed by a flat copy; it doesn't require going through the > full process of determining its target list, or constructing its > restrict and join clauses, or calculating size estimates, etc. That's probably mostly true, but the overhead of memory allocations in planner routines is not trivial. There are previous cases of changing things or declining to change this purely on the number of palloc cycles involved. > > It's possible you're right, but it does make me nervous. I do agree > > that making the number of RelOptInfos explode would be really bad. > > Based on my explanation in [1], I think it's acceptable to compare > grouped paths for the same grouped rel, regardless of where the > partial aggregation is placed. > > I fully understand that I could be wrong about this, but I don't think > it would break anything in regular planning (i.e., planning without > eager aggregation). I think you might be taking too narrow a view of the problem. As Tom says, the issue is that this breaks a bunch of assumptions that hold elsewhere. One place that shows up in the patch is in the special-case logic you've added to set_cheapest(), but I fear that won't be the end of it. It seems a bit surprising to me that you didn't also need to adjust add_path(), for example. Even if you don't, there's lots of places that rely on the assumption that all paths for a RelOptInfo are returning the same set of rows. If it turns out that a bunch of those places need to be adjusted to work with this, then the code could potentially end up quite messy, and that might also have performance consequences, even when this feature is disabled. Many of the code paths that deal with paths in the planner are quite hot. To say that another way, I'm not so much worried about the possibility that the patch contains a bug. Patches contain bugs all the time and we can just fix them. It's not wonderful, but that's how software development goes. What I am worried about is whether the architecture is right. If we go with one RelOptInfo when the "right answer" is many, or for that matter if we go with many when the right answer is one, those are things that cannot be easily and reasonably patched post-commit, and especially not post-release. --=20 Robert Haas EDB: http://www.enterprisedb.com