public inbox for [email protected]
help / color / mirror / Atom feedFrom: David Rowley <[email protected]>
To: Richard Guo <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: Tender Wang <[email protected]>
Cc: Paul George <[email protected]>
Cc: Andy Fan <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Cc: [email protected]
Cc: Matheus Alcantara <[email protected]>
Subject: Re: Eager aggregation, take 3
Date: Tue, 7 Oct 2025 02:59:44 +1300
Message-ID: <CAApHDvrxyGSLv3Sbg9fpmz6yYri_7M6SaKYnqQQv59uZfQdduA@mail.gmail.com> (raw)
In-Reply-To: <CAMbWs4_W6PmVXxgqM8LMerX2iwqOwR5EC5RGWSAMuvkm8o+-jw@mail.gmail.com>
References: <CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com>
<[email protected]>
<CAMbWs49=eAd2W9jCtGhaZPPp+SOC_2rg16RTG74xAht=hkr5JQ@mail.gmail.com>
<CAMbWs49Nc4M3H+eCf1+8w8piDyEECjRb-gK_JMF4VvcyWwGEVQ@mail.gmail.com>
<CAMbWs49E_dR0nobsExsyetpnBpHObLTsQLsEbWKQLkh0omPxNg@mail.gmail.com>
<CAMbWs49B_qUiHvu2EqLHZRpLr3p_+QPBs50n2=L5ibYzniwTzA@mail.gmail.com>
<CAMbWs48KCQtDymnYi4M=Vz+WMzo3fkBxffJsyk6VX6hOXXv+VA@mail.gmail.com>
<CAMbWs49sv_MuOYqqrtmBN_oYf8VSQ2BXDwXaTpJTn_YfwyYdWQ@mail.gmail.com>
<CAMbWs49U8Sddx_fGszPdvA3jp_nheynxaqm5Y4NqMV21VBYAuQ@mail.gmail.com>
<CAMbWs4-LwyOg9ga+NVF7yQbMi0ZsZdN1G_sO2v=YJHV18=19+A@mail.gmail.com>
<CALA8mJquG_zCJXfVwash5LKqHGtZXQmq7RfTSaRDUzGYeW=7Rw@mail.gmail.com>
<CAMbWs4_EjgcBib5+y1LYcGB3EK3Y6R+OOxGKfJo42fDovadk1g@mail.gmail.com>
<CALA8mJqe0anNM8_V6cOeOQnCHUTQggn7iOQNyQr1VaN_xMjz+w@mail.gmail.com>
<CAMbWs48eE-s-jCicC8pSVfXk8Ws-ZvUKnsw8qH-DkVBdYv0eJQ@mail.gmail.com>
<CAMbWs483a7-8M0pDttG44r-+8Gevn9VG0xNceE3WpkEQxJXPZw@mail.gmail.com>
<CAHewXNmYM6DvR_kaxDL0w0fz9BwKbac+TSU3QS10aA3cXHyMmA@mail.gmail.com>
<CA+TgmoaxH=P63hLYgyJJcEbMRnw3xi16d=HxFi1j-m7MhH6W_w@mail.gmail.com>
<CAMbWs4_cOnpGsywj9Jt1WAgzJLW9Rxt5X13cfGz4iN2qvZQ68g@mail.gmail.com>
<CA+Tgmob0q7bRbsFTVDMjxHE6zA4uDQLQa-s0CtwUw49V53UL_A@mail.gmail.com>
<CAMbWs4-Xru_eKBeRHFduigSGihdixFWVTR8A+dtMw7Mao+RkJA@mail.gmail.com>
<CAMbWs49dLjSSQRWeud+KSN0G531ciZdYoLBd5qktXA+3JQm_UQ@mail.gmail.com>
<CAMbWs48LXGC-Y63YtzEeM-3f0NUXWCUEMs7XwGzywXTjUNMcxQ@mail.gmail.com>
<CAMbWs48XdzvnwfTHWxQ7qK-yjvdrbwsPpqhJBuKDnO+hcbsVwA@mail.gmail.com>
<CA+TgmoaO-7RHdyJuizWChXZm7EJGvDcfoePDDEyUA-y8vTB1tg@mail.gmail.com>
<CAMbWs4-+jXRpKuFMZa08bS34-TBka3qqjVMAUjF=-1RA9BKvgg@mail.gmail.com>
<CA+TgmoZapU1y59-s3o8oPt7Hv+cxRh_34FMu6MXumomLe+U1Cw@mail.gmail.com>
<CAMbWs4_sEeeBmucBzbamBMfA9uLxVmOc_MV=ZpSyDbTcrUO_XQ@mail.gmail.com>
<CA+Tgmob4fnv57PQB0Oox86mHSJQ0vVL249eT=gqPvrMkG7h1zw@mail.gmail.com>
<CAMbWs489NYyTcCTbrUi7hPXKtNY5vHrrFcHyMRAv=CA5WsszVw@mail.gmail.com>
<CA+TgmoazmDdcc7NeTo3WM5HW3DASNP4rfZw6X+2nnQKHampOng@mail.gmail.com>
<CAMbWs49bYr-ULhA+-At0iQ+NaFKy72AWB6jzughk8MPTiY+gMQ@mail.gmail.com>
<CA+TgmoYa-zexdbc5nO_D6oxPMZYs06hkYwZK5Dufq+4Hhe6uNQ@mail.gmail.com>
<CAMbWs4_aji0kME490phz6nTXnPToddUn19OF3rLm1g4TbNkuzQ@mail.gmail.com>
<CA+Tgmoa3+G_=8XuQWN+0ugv6r-WV6ruFESpOxpXAAKrne3oVDQ@mail.gmail.com>
<CAMbWs49qiox13EKb7bqgLu7Gu9oar+xe6KMwBjgFwod3JzPfUw@mail.gmail.com>
<CAMbWs48F8WGA-Lzj1Dk76mFqRFxPEwG2_9Zb7+pFs8oi6ew2pw@mail.gmail.com>
<CAMbWs484ms=WRZamOyWnVditREKFqipLsdaQjcv2uKur8SZuqw@mail.gmail.com>
<CAMbWs49bL2ZMSc0W4G8=R7bjaa-vO6grucEOFYLZFUZE7+nzrQ@mail.gmail.com>
<CA+TgmobqbeJ9iRQO4ym6OiHt71sSv2eai=01kOZjxhdof9K4Mw@mail.gmail.com>
<CAMbWs4_2BzuAX+BSO1p7rtUwmQjORrG-b906Cw-RkfRjFP0oSQ@mail.gmail.com>
<CA+TgmoYbkvYwLa+1vOP7RDY7kO2=A7rppoPusoRXe44VDOGBPg@mail.gmail.com>
<CAMbWs4_aezTYOZSj7v+aypLo0dnjAierJtdx2gf6se28p88WHg@mail.gmail.com>
<CA+TgmoaY6E5-UFTWp5BtAjBO=tDQd=UVAgeJ3dRbFFzhnP5NHg@mail.gmail.com>
<CAMbWs484dnecwXT2WzWFvzEmWPzC3U9F8SRDXg-SEegTYUFyXQ@mail.gmail.com>
<CAMbWs4-2cVfBk1HNGtqV1QFo2yKnzdxLy0BAqQaJHBt+8+kspw@mail.gmail.com>
<CAMbWs48W80HFm9b+yZPKER=MA5M_bveYvBx1AwOrxdPYbLmYmQ@mail.gmail.com>
<CAMbWs4_W6PmVXxgqM8LMerX2iwqOwR5EC5RGWSAMuvkm8o+-jw@mail.gmail.com>
On Mon, 6 Oct 2025 at 13:59, Richard Guo <[email protected]> wrote:
> Barring any objections, I'll plan to push v23 in a couple of days.
Not a complete review, but a customary look:
1. setup_base_grouped_rels() by name and the header comment claim to
operate on base relations, but the code seems to be coded to handle
OTHER_MEMBER rels too.
Note that set_base_rel_pathlists() explicitly skips anything that's
not RELOPT_BASEREL, so if you're not doing that, then you shouldn't
use "base" in the function name. It's confusing.
2. All the calls to generate_grouped_paths() pass the grouped_rel
RelOptInfo and also grouped_rel->agg_info. Is there a reason to keep
it that way rather than access the agg_info from the given grouped_rel
from within the function?
3. " * The information needed are provided by the RelAggInfo
structure." This should use "is" rather than "are"
4. standard_join_search(). I think it's worth getting rid of the
duplicate "if (!bms_equal(rel->relids, root->all_query_rels))" check.
How about setting that in a local variable rather than recalling
bms_equal(). I don't believe the compiler will optimise the extra one
away as it can't know set_cheapest() doesn't change the relids. Also,
wouldn't it be better to check rel->grouped_rel != NULL first? Won't
that be NULL in most cases, where as !bms_equal(rel->relids,
root->all_query_rels) will be true in most cases? Likewise in
generate_partitionwise_join_paths().
5. Wouldn't it be better to do 0002 first and get that into core so
you don't have to do the hacky stuff in is_partial_agg_memory_risky()?
6. Shouldn't this be using lappend()?
agg_clause_list = list_append_unique(agg_clause_list, ac_info);
I don't understand why ac_info could already be in the list. You've
just done: ac_info = makeNode(AggClauseInfo);
7. The following comment talks about "base" relations. I don't think
it should be as the RelOptInfo can be an OTHER_MEMBER rel.
* build_simple_grouped_rel
* Construct a new RelOptInfo representing a grouped version of the input
* base relation.
*/
8. Normally we check the List is NIL instead of:
if (list_length(group_clauses) == 0)
9. In get_expression_sortgroupref(), a comment claims "We ignore child
members here.". I think that's outdated since ec_members no longer has
child members.
10. I don't think this comment quite makes sense:
* "apply_at" tracks the lowest join level at which partial aggregation is
* applied.
maybe "minimum set of rels to join before partial aggregation can be applied"?
or at least swap "is" for "can be".
My confusion comes from the fact you're stating "lowest join level",
which seems to indicate that it could be applied after further
relations have been joined, but then you're saying "is applied" to
indicate that it can only be applied at that level.
11. The way you've written the header comments for typedef struct
RelAggInfo seems weird. I've only ever seen extra details in the
header comment when the inline comments have been kept to a single
line. You're spanning multiple lines, so why have the out of line
comments in the header at all?
12. This just doesn't feel like the right name for this field:
/* lowest level partial aggregation is applied at */
Relids apply_at;
I can't help think that it should be something like "agg_relids" or
"required_relids". I understand you're currently only applying the
partial grouping when you get exactly the minimum set of relids in the
join search, but if this can be made fast enough, I expect that could
be changed in the future. If you do change it, then "apply_at" is a
pretty confusing name. Perhaps I've misunderstood here and if you did
that, you'd need to create another RelAggInfo to represent that?
13. Parameter names mismatch between definition and declaration in:
extern RelOptInfo *build_simple_grouped_rel(PlannerInfo *root,
RelOptInfo *rel_plain);
extern RelOptInfo *build_grouped_rel(PlannerInfo *root,
RelOptInfo *rel_plain);
extern void generate_grouped_paths(PlannerInfo *root,
RelOptInfo *rel_grouped,
RelOptInfo *rel_plain,
RelAggInfo *agg_info);
14. Do all the regression tests need VERBOSE in EXPLAIN? It's making
the output kinda huge. It might also be nice to wrap the long queries
onto multiple lines to make them easier to read.
David
view thread (75+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Eager aggregation, take 3
In-Reply-To: <CAApHDvrxyGSLv3Sbg9fpmz6yYri_7M6SaKYnqQQv59uZfQdduA@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox