From: Richard Guo
Date: Tue, 21 Jan 2025 23:13:14 +0900
Subject: Re: Eager aggregation, take 3
To: Tom Lane
Cc: Robert Haas, Tender Wang, Paul George, Andy Fan, pgsql-hackers@lists.postgresql.org
References: <87il22cj51.fsf@163.com> <3016309.1737395840@sss.pgh.pa.us>
In-Reply-To: <3016309.1737395840@sss.pgh.pa.us>

On Tue, Jan 21, 2025 at 2:57 AM Tom Lane
wrote:
> However, a partial-aggregation path does not generate the same data
> as an unaggregated path, no matter how fuzzy you are willing to be
> about the concept. So I'm having a very hard time accepting that
> it ought to be part of the same RelOptInfo, and thus I don't really
> buy that annotating paths with a GroupPathInfo is the way forward.

Agreed. I think one point I failed to make clear is that I've never
intended to put a partial-aggregation path and an unaggregated path
into the same RelOptInfo. One of the basic designs of this patch is
that partial-aggregation paths are placed in a separate category of
RelOptInfos, which I call "grouped relations" (though I admit that's
not the best name). This ensures that we never compare a
partial-aggregation path with an unaggregated path during scan/join
planning, because I am certain that the two categories of paths are
not comparable.

Regarding the GroupPathInfo proposal, my intention is to add a valid
GroupPathInfo only for the partial-aggregation paths. The goal is to
ensure that partial-aggregation paths within this category are
compared only if their partial aggregations are at the same location.

To be honest, I still doubt that this is necessary. I have two main
reasons for this.

1. For a partial-aggregation path, the location where we place the
partial aggregation does not impose any restrictions on further
planning. This is different from the parameterized path case. If two
parameterized paths are equal on every other figure of merit, we will
choose the one with fewer required outer rels, as it means fewer join
restrictions on upper planning. However, for partial-aggregation
paths, we do not have a preference regarding the location of the
partial aggregation.
For instance, for path "A JOIN PartialAgg(B) JOIN C" and path
"PartialAgg(A JOIN B) JOIN C", if one path dominates the other on
every figure of merit, it seems to me that there's no point in keeping
the less favorable one, even though they have their partial
aggregations at different join levels.

2. A partial-aggregation path of a rel essentially yields an
aggregated form of that rel's row set. The difference between the row
sets yielded by paths with different locations of partial aggregation
is primarily about the different degrees to which the rows are
aggregated. These sets are fundamentally homogeneous.

In summary, I think the partial-aggregation paths of the same "grouped
relation" are comparable, regardless of the position of the partial
aggregation within the path tree. So I think we should put them into
the same RelOptInfo.

Of course, I could be very wrong about this. I would greatly
appreciate hearing others' thoughts on this.

Thanks
Richard
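
To make the comparison argument in point 1 concrete, here is a rough
sketch of dominance-based pruning within one "grouped relation". It is
not code from the patch. The Path fields, dominates() and this
add_path() are simplified, hypothetical stand-ins: the planner's real
add_path() also uses fuzzy cost comparison, parameterization and
parallel safety, all of which are left out here. The point it tries to
show is only that the location of the partial aggregation is not among
the figures of merit being compared.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    # Hypothetical, simplified path descriptor (not a PostgreSQL struct).
    desc: str              # where the partial aggregation sits, for display only
    startup_cost: float
    total_cost: float
    rows: float
    pathkeys: tuple = ()   # sort order the path provides


def pathkeys_contained_in(needed, available):
    # True if `needed` is a leading prefix of `available`.
    return available[:len(needed)] == needed


def dominates(a, b):
    # `a` is at least as good as `b` on every figure of merit considered here.
    # Note that `desc` (the aggregation location) is deliberately not compared.
    return (a.startup_cost <= b.startup_cost
            and a.total_cost <= b.total_cost
            and a.rows <= b.rows
            and pathkeys_contained_in(b.pathkeys, a.pathkeys))


def add_path(pathlist, new_path):
    # Reject the new path if an existing one dominates it; otherwise
    # drop every existing path the new one dominates and keep the new one.
    if any(dominates(old, new_path) for old in pathlist):
        return pathlist
    kept = [old for old in pathlist if not dominates(new_path, old)]
    kept.append(new_path)
    return kept


# Two paths for the same grouped relation, aggregated at different join levels.
# The cost numbers are made up for illustration.
p1 = Path("A JOIN PartialAgg(B) JOIN C", startup_cost=10.0, total_cost=100.0, rows=50.0)
p2 = Path("PartialAgg(A JOIN B) JOIN C", startup_cost=5.0, total_cost=80.0, rows=50.0)
paths = add_path(add_path([], p1), p2)
```

With these made-up numbers p2 is at least as good as p1 on every
compared dimension, so only p2 survives in `paths`, even though the two
paths aggregate at different join levels.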