Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v5lku-00BdbP-IQ for pgsql-hackers@arkaria.postgresql.org; Mon, 06 Oct 2025 14:00:00 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1v5lks-008JPe-Dw for pgsql-hackers@arkaria.postgresql.org; Mon, 06 Oct 2025 13:59:59 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1v5lks-008JPV-32 for pgsql-hackers@lists.postgresql.org; Mon, 06 Oct 2025 13:59:59 +0000 Received: from mail-lf1-x129.google.com ([2a00:1450:4864:20::129]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1v5lkq-000gjQ-2p for pgsql-hackers@postgresql.org; Mon, 06 Oct 2025 13:59:58 +0000 Received: by mail-lf1-x129.google.com with SMTP id 2adb3069b0e04-57a59124323so3481453e87.2 for ; Mon, 06 Oct 2025 06:59:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1759759196; x=1760363996; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=mtxepKxWDM5kXjMFV8/0tv/7qyM7Sed6V27CZ00+37o=; b=bHo+lNv8mWxUE5H8m3TRRHnFq6uyrcyOSY0KIMRSW6/vxz27nkSXRcuPqSyZCl/efw RV8+MaWhx7ewRCQvs3z07RKUPfKyhyEsp6LeJ6wjSBj+2/C9DbPMmZMUg+bTvLf96ZDL sdcInpV16fQ3w7YZgh7GwFoz6vrU96CxP9MDjvGvjlnb09w9dqjy2FzGL7ftLjRn2A7g WZUYXkRilK0A26osRtwW6mHQBRMuqp6rLn7H/oGz2fM1I+CeJPWlsgqELSK/LrA86fZ3 C+qlkE/NAFaWTvFM6i1fQJtTSaazYdVrjDYyqcEmG7WdzGWI8NWpRsj2Dz4K/jvP8daW BQZQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1759759196; x=1760363996; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=mtxepKxWDM5kXjMFV8/0tv/7qyM7Sed6V27CZ00+37o=; b=qncZpmLqM0qfMcn47v20Jvz5KIDWVdL+OGu6b0+yKQvS055uiVPx5F87IyMV2tnpOe gXHQsqIX0u/TSxFhjXQ/b+AnOtoc25QVo1CQmKd/lP1987SEDBD++FarWKGqZQag1G/R oB4XIo3EdhKzpwd0xvfA5wMVt5CvKEnYRIEiLQL7VDGisP6eFkUvtgXjDy3wmOSv2TAs 3Dt4+5ASAvrs/KgHuasp3uM8tUcurTkcsNdxAgQt88eSmRWonNNJJuSaeJyHrEyquK2T i/0JQkJnkyBYWfUumGY+jMcuyA1RQfqB19ZEP3yahMeTfTxpZeutGpuWHilBWE27FVjH E3Vw== X-Forwarded-Encrypted: i=1; AJvYcCXlofsWL8P8bf18+vP0U1y4nQb2y5Q+qJ/X8/l4PxYK7/R5LWAeTnQFE/fBekTkRAAgSp3Bg/rx51VmjwRN@postgresql.org X-Gm-Message-State: AOJu0YwX5kXYlmEPoehWkQr1gELzqYvwBJO+ljNLvt+1932e9cw/yv44 v+8dDtQ49uu4fHoWVolFOKOYoew/GZAcL2lberuj6s/idKndqDLtWrwb5AM/vsUzPFyNEMMAG2L 6Y6tgOQQFemNn0uv7YkFjh4RkZsejBVg= X-Gm-Gg: ASbGncuoZZZJfaNIVqiQ5BhVFCSGnXS9lNdoJ28XdxQJV9z8aastPQw37IiiAMNJTJw GZDINPf/CbrLtqYnca/BNxIrCgcLP1mcYlCfTkeb9eJzvjWuo/WkbdMCC2lnmM+gkXYzaynq/Py /k5Joc9Or/4i/b3wcRASmStBjM+L2WVsJwv+Rdwimv+/4EOvwDxIf/WMMynyGVqkjX9bTdgaKCd yVwP7pb3t3OPr6y2Q+tmwakgQeIBYRBefu2lsA3MUHU0y8OiJRd7S48vM7mKmfWYzg64Pu8BmBm BIJ2a8/1rUlTrr1WucojiyYa0gcmkzlR X-Google-Smtp-Source: AGHT+IEIEYuIP8xd307+SK9zaLYtPZ2QZ3VZ9bvTAq+Uxo80rtyd2HBc4qhccb7rc27f8wCgBbxaNhz1AvNpxz5k29Y= X-Received: by 2002:a05:6512:3b88:b0:57e:4998:95ce with SMTP id 2adb3069b0e04-58cbb06ad50mr4078717e87.35.1759759195653; Mon, 06 Oct 2025 06:59:55 -0700 (PDT) MIME-Version: 1.0 References: <87il22cj51.fsf@163.com> In-Reply-To: From: David Rowley Date: Tue, 7 Oct 2025 02:59:44 +1300 X-Gm-Features: AS18NWDVJJtwMsfyBcNd6gVxhpmX9vY1ImBkqrx6UztVzMNRvg8M-TTI8quiZSg Message-ID: Subject: Re: Eager aggregation, take 3 To: Richard Guo Cc: Robert Haas , Tom Lane , Tender Wang , Paul George , Andy Fan , PostgreSQL-development , pgsql-hackers@lists.postgresql.org, Matheus Alcantara Content-Type: text/plain; charset="UTF-8" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Mon, 6 Oct 2025 at 13:59, Richard Guo wrote: > Barring any objections, I'll plan to push v23 in a couple of days. Not a complete review, but a customary look: 1. setup_base_grouped_rels() by name and the header comment claim to operate on base relations, but the code seems to be coded to handle OTHER_MEMBER rels too. Note that set_base_rel_pathlists() explicitly skips anything that's not RELOPT_BASEREL, so if you're not doing that, then you shouldn't use "base" in the function name. It's confusing. 2. All the calls to generate_grouped_paths() pass the grouped_rel RelOptInfo and also grouped_rel->agg_info. Is there a reason to keep it that way rather than access the agg_info from the given grouped_rel from within the function? 3. " * The information needed are provided by the RelAggInfo structure." This should use "is" rather than "are" 4. standard_join_search(). I think it's worth getting rid of the duplicate "if (!bms_equal(rel->relids, root->all_query_rels))" check. How about setting that in a local variable rather than recalling bms_equal(). I don't believe the compiler will optimise the extra one away as it can't know set_cheapest() doesn't change the relids. Also, wouldn't it be better to check rel->grouped_rel != NULL first? Won't that be NULL in most cases, where as !bms_equal(rel->relids, root->all_query_rels) will be true in most cases? Likewise in generate_partitionwise_join_paths(). 5. Wouldn't it be better to do 0002 first and get that into core so you don't have to do the hacky stuff in is_partial_agg_memory_risky()? 6. Shouldn't this be using lappend()? agg_clause_list = list_append_unique(agg_clause_list, ac_info); I don't understand why ac_info could already be in the list. You've just done: ac_info = makeNode(AggClauseInfo); 7. The following comment talks about "base" relations. I don't think it should be as the RelOptInfo can be an OTHER_MEMBER rel. * build_simple_grouped_rel * Construct a new RelOptInfo representing a grouped version of the input * base relation. */ 8. Normally we check the List is NIL instead of: if (list_length(group_clauses) == 0) 9. In get_expression_sortgroupref(), a comment claims "We ignore child members here.". I think that's outdated since ec_members no longer has child members. 10. I don't think this comment quite makes sense: * "apply_at" tracks the lowest join level at which partial aggregation is * applied. maybe "minimum set of rels to join before partial aggregation can be applied"? or at least swap "is" for "can be". My confusion comes from the fact you're stating "lowest join level", which seems to indicate that it could be applied after further relations have been joined, but then you're saying "is applied" to indicate that it can only be applied at that level. 11. The way you've written the header comments for typedef struct RelAggInfo seems weird. I've only ever seen extra details in the header comment when the inline comments have been kept to a single line. You're spanning multiple lines, so why have the out of line comments in the header at all? 12. This just doesn't feel like the right name for this field: /* lowest level partial aggregation is applied at */ Relids apply_at; I can't help think that it should be something like "agg_relids" or "required_relids". I understand you're currently only applying the partial grouping when you get exactly the minimum set of relids in the join search, but if this can be made fast enough, I expect that could be changed in the future. If you do change it, then "apply_at" is a pretty confusing name. Perhaps I've misunderstood here and if you did that, you'd need to create another RelAggInfo to represent that? 13. Parameter names mismatch between definition and declaration in: extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root, RelOptInfo *rel_plain); extern RelOptInfo *build_grouped_rel(PlannerInfo *root, RelOptInfo *rel_plain); extern void generate_grouped_paths(PlannerInfo *root, RelOptInfo *rel_grouped, RelOptInfo *rel_plain, RelAggInfo *agg_info); 14. Do all the regression tests need VERBOSE in EXPLAIN? It's making the output kinda huge. It might also be nice to wrap the long queries onto multiple lines to make them easier to read. David