Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uvwMt-003uWz-6v for pgsql-hackers@arkaria.postgresql.org; Tue, 09 Sep 2025 11:18:36 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uvwMs-00BjqP-4r for pgsql-hackers@arkaria.postgresql.org; Tue, 09 Sep 2025 11:18:34 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uvwMr-00Bjq3-NQ for pgsql-hackers@lists.postgresql.org; Tue, 09 Sep 2025 11:18:34 +0000 Received: from mail-yw1-x1134.google.com ([2607:f8b0:4864:20::1134]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uvwMq-001SoF-10 for pgsql-hackers@postgresql.org; Tue, 09 Sep 2025 11:18:33 +0000 Received: by mail-yw1-x1134.google.com with SMTP id 00721157ae682-71d60504bf8so53089557b3.2 for ; Tue, 09 Sep 2025 04:18:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1757416711; x=1758021511; darn=postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=rPB4c90qulvpLARnMrFaw8Hq257hQM59dC2R8uDbjuQ=; b=HmtTJTC1dhxF4fjNW1kd7BJwSVX2HgzOUVdgLE35O7P1xWFNLDw9jZHLZ0OPfEf1cX cjuIIlv3hnIMAAsw1asRr8PoHwZNtE5wC1Bm/cYa8pCjdJn0a02YikG7R1N8dfARZala A14SJM4QHAYs6lcX2JL1+c5GPyWYdnnzCefT6NZd5hLcdQNtwcOyGhId5znhySK8A3Df Ito4UDNbi4jtOx0yKWawsel5gCIOYZqzJPxDS4Gb1qP7SH4KLQf3LBTFsK+wtSk5kTxJ KBzFdYPFMM8B4ouXsdck63JRdb7pjdyXgoSHIdE34TqCFp35yIeMTiCUjYdI48uV1luh XM6A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757416711; x=1758021511; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=rPB4c90qulvpLARnMrFaw8Hq257hQM59dC2R8uDbjuQ=; b=i/rHrEzmI3BLCIj3Ylec6wlJEuQKLhfi4UqF+yCCwlOR4bMWEpLUs32XmKiWf5xgWU /7bpX+hcb/Gv0jRJsQTVHHFj0yRd9nLHmBZBELr4CLvvgU/G5Y172lBC/ccQsOTnGajs 0tiX95iI2TDRkPJX1QpP4lzXdyTWsXKRoNPr89VbiNGVIN01FeahCVqx5zXtArwmlOBB xfR71lUYNl1nGcDfxSULHZ7emtXVP5CX4Ug+ae7+C8+A02DYxDC7uJkCEplIbpCQEQ1E YDD5e38lLi6jUYto//8/ned4vZ/eVwfhbH2WZex27o9bmsFaxYn1b5sY2X0XYb63H/UZ BjEw== X-Forwarded-Encrypted: i=1; AJvYcCUO4wrFGZHDQTdLCnUwGcb6aNXQNaCiJXqkGqH4xn5vt+8kkeMOY3jVl48aW8TITbCRp47wHFOqioMJjeZW@postgresql.org X-Gm-Message-State: AOJu0Yy6G8pvEA7Q6GB++zBw4tWUG61T1ygJEiPhlbKJebA+A8qMxErb vIzpFMg3iQ/MDunjZCBeHUVNcXuraJ/jvWCKgPHNWfrGwd5G5EFCwmoDW/17H7uPyC7ErnPSTx3 MQ6XHQobgX1FVHshFxG0JF2/DvYN9A+w= X-Gm-Gg: ASbGncturMASS9Xq6gc6yEn9XxMYyL0Mmcmq/uUV+Lq+M5R3EVccl4cYs+lE+L6nQVg orHiz9wD/2Pw+xqTgSu0x/uaOAjmqLpeBEwJCGhlrRwUommRK+QLWoMUQGS2ikdA3uG/Nts+Hb3 KOf9OLrnzDMjuqIkHYcq18F9ARhsivhsrJ7BcJtbNXe3c2drJewVblTaGPkho+3ml4RNAbzEJ51 +dVxSxsiA== X-Google-Smtp-Source: AGHT+IFqpZdwPFOfkISfuUJsMxKGX00OmhEhlWJ3KcQwe0Vq5ivs6E85VrDazgIAvO0UNv2zX+U4RitEtF2NIGl4r+c= X-Received: by 2002:a05:690c:6889:b0:721:c0d:9dd9 with SMTP id 00721157ae682-727f681b748mr98691617b3.52.1757416711241; Tue, 09 Sep 2025 04:18:31 -0700 (PDT) MIME-Version: 1.0 References: <87il22cj51.fsf@163.com> In-Reply-To: From: Richard Guo Date: Tue, 9 Sep 2025 20:18:20 +0900 X-Gm-Features: Ac12FXyP2U1sh_IiZhSKKOqDpYnfxvoeZVIdG5w-bO_HIzqs-Llp4DEauFskVnQ Message-ID: Subject: Re: Eager aggregation, take 3 To: Robert Haas Cc: Tom Lane , Tender Wang , Paul George , Andy Fan , PostgreSQL-development , pgsql-hackers@lists.postgresql.org, Matheus Alcantara Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, Sep 5, 2025 at 11:50=E2=80=AFPM Robert Haas = wrote: > Like Matheus, I think a GUC is reasonable. A significant danger here > appears to be the possibility of a performance cliff, where queries > are optimized very different when the ratio is 9.99 vs. 10.01, say. It > would be nice if there were some way to mitigate that danger, but at > least a GUC avoids chaining the performance of the whole system to a > hard-coded value. Yeah, I think the performance cliff issue does exist. It might be mitigated by carefully selecting the threshold value to ensure that small differences in the average group size near the boundary don't cause big performance swings with and without eager aggregation, but this doesn't seem like an easy task. How is this issue avoided in other thresholds? For example, with min_parallel_table_scan_size, is there a performance cliff when the table size is 7.99MB vs. 8.01MB, where a parallel scan is considered in the latter case but not the former? > It might be worth considering whether there are heuristics other than > the group size that could help here. Possibly that's just making > things more complicated to no benefit. It seems to me, for example, > that reducing 100 rows to 10 is quite different from reducing a > million rows to 100,000. On the whole, the latter seems more likely to > work out well, but it's tricky, because the effort expended per group > can be arbitrarily high. I think we do want to let the cost model make > most of the decisions, and just use this threshold to prune ideas that > are obviously bad at an early stage. That said, it's worth thinking > about how this interacts with the just-considered-one-eager-agg > strategy. Does this threshold apply before or after that rule? If I understand correctly, this means that we need to explore each join level to find out the most optimal position for applying partial aggregation. For example, suppose Agg(B) reduces 100 rows to 10, and Agg(A JOIN B) reduces a million rows to 100,000, it might be better to apply partial aggregation at the (A JOIN B) level rather than just over B. However, that's not always the case: the Agg(B) option can reduce the number of input rows to the join earlier, potentially outperforming the Agg(A JOIN B) approach. Therefore, we need to consider both options and compare their costs. This is actually what the patch used to do before I introduced the always-push-to-lowest heuristic. > For instance, consider AGG(FACT_TABLE JOIN DIMENSION_TABLE), like a > count of orders grouped by customer name. Aggregating on the dimension > table (in this case, the list of customers) is probably useless, but > aggregating on the join column of the fact table has a good chance of > being useful. If we consider only one of those strategies, we want it > to be the right one. This threshold could be the thing that helps us > to get it right. Now I see what you meant. However, in the current implementation, we only push partial aggregation down to relations that contain all the aggregation columns. So, in the case you mentioned, if the aggregation columns come from the dimension table, unfortunately, we don't have the option to partially aggregate the fact table. The paper does discuss several other transformations, such as "Eager Count", "Double Eager", and "Eager Split", that can perform partial aggregation on relations that don't contain aggregation columns, or even on both sides of the join. However, those are beyond the scope of this patch. - Richard