public inbox for [email protected]  
help / color / mirror / Atom feed
From: Robert Haas <[email protected]>
To: Richard Guo <[email protected]>
Cc: Matheus Alcantara <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: Tender Wang <[email protected]>
Cc: Paul George <[email protected]>
Cc: Andy Fan <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Cc: [email protected]
Subject: Re: Eager aggregation, take 3
Date: Tue, 9 Sep 2025 10:30:01 -0400
Message-ID: <CA+TgmobT156YuP5HUscbR_17nQs=KC703T=SB3VkQMcFbK4Gqw@mail.gmail.com> (raw)
In-Reply-To: <CAMbWs4-07qxWp4x+ia12D_44GPbBf4JzaUZRghBz16MrnmhdOQ@mail.gmail.com>
References: <CAMbWs48jzLrPt1J_00ZcPZXWUQKawQOFE8ROc-ADiYqsqrpBNw@mail.gmail.com>
	<[email protected]>
	<CAMbWs49=eAd2W9jCtGhaZPPp+SOC_2rg16RTG74xAht=hkr5JQ@mail.gmail.com>
	<CAMbWs49Nc4M3H+eCf1+8w8piDyEECjRb-gK_JMF4VvcyWwGEVQ@mail.gmail.com>
	<CAMbWs49E_dR0nobsExsyetpnBpHObLTsQLsEbWKQLkh0omPxNg@mail.gmail.com>
	<CAMbWs49B_qUiHvu2EqLHZRpLr3p_+QPBs50n2=L5ibYzniwTzA@mail.gmail.com>
	<CAMbWs48KCQtDymnYi4M=Vz+WMzo3fkBxffJsyk6VX6hOXXv+VA@mail.gmail.com>
	<CAMbWs49sv_MuOYqqrtmBN_oYf8VSQ2BXDwXaTpJTn_YfwyYdWQ@mail.gmail.com>
	<CAMbWs49U8Sddx_fGszPdvA3jp_nheynxaqm5Y4NqMV21VBYAuQ@mail.gmail.com>
	<CAMbWs4-LwyOg9ga+NVF7yQbMi0ZsZdN1G_sO2v=YJHV18=19+A@mail.gmail.com>
	<CALA8mJquG_zCJXfVwash5LKqHGtZXQmq7RfTSaRDUzGYeW=7Rw@mail.gmail.com>
	<CAMbWs4_EjgcBib5+y1LYcGB3EK3Y6R+OOxGKfJo42fDovadk1g@mail.gmail.com>
	<CALA8mJqe0anNM8_V6cOeOQnCHUTQggn7iOQNyQr1VaN_xMjz+w@mail.gmail.com>
	<CAMbWs48eE-s-jCicC8pSVfXk8Ws-ZvUKnsw8qH-DkVBdYv0eJQ@mail.gmail.com>
	<CAMbWs483a7-8M0pDttG44r-+8Gevn9VG0xNceE3WpkEQxJXPZw@mail.gmail.com>
	<CAHewXNmYM6DvR_kaxDL0w0fz9BwKbac+TSU3QS10aA3cXHyMmA@mail.gmail.com>
	<CA+TgmoaxH=P63hLYgyJJcEbMRnw3xi16d=HxFi1j-m7MhH6W_w@mail.gmail.com>
	<CAMbWs4_cOnpGsywj9Jt1WAgzJLW9Rxt5X13cfGz4iN2qvZQ68g@mail.gmail.com>
	<CA+Tgmob0q7bRbsFTVDMjxHE6zA4uDQLQa-s0CtwUw49V53UL_A@mail.gmail.com>
	<CAMbWs4-Xru_eKBeRHFduigSGihdixFWVTR8A+dtMw7Mao+RkJA@mail.gmail.com>
	<CAMbWs49dLjSSQRWeud+KSN0G531ciZdYoLBd5qktXA+3JQm_UQ@mail.gmail.com>
	<CAMbWs48LXGC-Y63YtzEeM-3f0NUXWCUEMs7XwGzywXTjUNMcxQ@mail.gmail.com>
	<CAMbWs48XdzvnwfTHWxQ7qK-yjvdrbwsPpqhJBuKDnO+hcbsVwA@mail.gmail.com>
	<CA+TgmoaO-7RHdyJuizWChXZm7EJGvDcfoePDDEyUA-y8vTB1tg@mail.gmail.com>
	<CAMbWs4-+jXRpKuFMZa08bS34-TBka3qqjVMAUjF=-1RA9BKvgg@mail.gmail.com>
	<CA+TgmoZapU1y59-s3o8oPt7Hv+cxRh_34FMu6MXumomLe+U1Cw@mail.gmail.com>
	<CAMbWs4_sEeeBmucBzbamBMfA9uLxVmOc_MV=ZpSyDbTcrUO_XQ@mail.gmail.com>
	<CA+Tgmob4fnv57PQB0Oox86mHSJQ0vVL249eT=gqPvrMkG7h1zw@mail.gmail.com>
	<CAMbWs489NYyTcCTbrUi7hPXKtNY5vHrrFcHyMRAv=CA5WsszVw@mail.gmail.com>
	<CA+TgmoazmDdcc7NeTo3WM5HW3DASNP4rfZw6X+2nnQKHampOng@mail.gmail.com>
	<CAMbWs49bYr-ULhA+-At0iQ+NaFKy72AWB6jzughk8MPTiY+gMQ@mail.gmail.com>
	<CA+TgmoYa-zexdbc5nO_D6oxPMZYs06hkYwZK5Dufq+4Hhe6uNQ@mail.gmail.com>
	<CAMbWs4_aji0kME490phz6nTXnPToddUn19OF3rLm1g4TbNkuzQ@mail.gmail.com>
	<CA+Tgmoa3+G_=8XuQWN+0ugv6r-WV6ruFESpOxpXAAKrne3oVDQ@mail.gmail.com>
	<CAMbWs49qiox13EKb7bqgLu7Gu9oar+xe6KMwBjgFwod3JzPfUw@mail.gmail.com>
	<CAMbWs48F8WGA-Lzj1Dk76mFqRFxPEwG2_9Zb7+pFs8oi6ew2pw@mail.gmail.com>
	<CAMbWs484ms=WRZamOyWnVditREKFqipLsdaQjcv2uKur8SZuqw@mail.gmail.com>
	<CAMbWs49bL2ZMSc0W4G8=R7bjaa-vO6grucEOFYLZFUZE7+nzrQ@mail.gmail.com>
	<[email protected]>
	<CAMbWs4_VtGu18P-jWMXAp3Q+mzGDCaT8AhxQDyWv2_rUxsjv8A@mail.gmail.com>
	<CAMbWs48KHOYWLAexTpt=0MTAhKHpBeEC2K5MQthhx+S4kRETZQ@mail.gmail.com>
	<CAMbWs4_if55Qsn1qSoDb1ALeu5L+wzx=G-rDvQNChTQ12a7dHw@mail.gmail.com>
	<CA+TgmoZh8aAadYx-j=Ahq1XRj67RDJ_5H0bUQx6rtB8=_wNkQg@mail.gmail.com>
	<CAMbWs4-07qxWp4x+ia12D_44GPbBf4JzaUZRghBz16MrnmhdOQ@mail.gmail.com>

On Tue, Sep 9, 2025 at 6:30 AM Richard Guo <[email protected]> wrote:
> > I think it would be worth considering generating the partially grouped
> > relations in a second pass. Right now, as you progress from the bottom
> > of the join tree towards the top, you created grouped rels as you go.
> > But you could equally well finish planning everything up to the
> > scan/join target first and then go back and add grouped_rels to
> > relations where it seems worthwhile.
>
> Hmm, I don't think so.  I think the presence of eager aggregation
> could change the best join order.  For example, without eager
> aggregation, the optimizer might find that (A JOIN B) JOIN C the best
> join order.  But with eager aggregation on B, the optimizer could
> prefer A JOIN (AGG(B) JOIN C).  I'm not sure how we could find the
> best join order with eager aggregation applied without building the
> join tree from the bottom up.

Oh, that is a problem, yes. :-(

> > I haven't done a detailed comparison of generate_grouped_paths() to
> > other parts of the code, but I have an uncomfortable feeling that it
> > might be rather similar to some existing code that probably already
> > exists in multiple, slightly-different versions. Is there any
> > refactoring we could do here?
>
> Yeah, we currently have several functions that do similar, but not
> exactly the same, things.  Maybe some refactoring is possible -- maybe
> not -- I haven't looked into it closely yet.  However, I'd prefer to
> address that in a separate patch if possible, since this issue also
> exists on master, and I want to avoid introducing such changes in this
> already large patch.

Well, it's not just a matter of "this already exists" -- it gets
harder and harder to unify things the more near-copies you add.

> Good point.  I do have manually tested GEQO by setting geqo_threshold
> to 2 and running the regression tests to check for any planning
> errors, crashes, or incorrect results.  However, I'm not sure where
> test cases for GEQO should be added.  I searched the regression tests
> and found only one explicit GEQO test, added back in 2009 (commit
> a43b190e3).  It's not quite clear to me what the current policy is for
> adding GEQO test cases.
>
> Anyway, I will add some test cases in eager_aggregate.sql with
> geqo_threshold set to 2.

Sounds good. I think GEQO is mostly-unmaintained these days, but if
we're updating the code, I think it is good to add tests. Being that
the code is so old, it probably lacks adequate test coverage.

-- 
Robert Haas
EDB: http://www.enterprisedb.com





view thread (75+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Eager aggregation, take 3
  In-Reply-To: <CA+TgmobT156YuP5HUscbR_17nQs=KC703T=SB3VkQMcFbK4Gqw@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox