Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uvzMS-004a3r-8F for pgsql-hackers@arkaria.postgresql.org; Tue, 09 Sep 2025 14:30:21 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uvzMQ-00CY9A-Uv for pgsql-hackers@arkaria.postgresql.org; Tue, 09 Sep 2025 14:30:19 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uvzMQ-00CY91-HG for pgsql-hackers@lists.postgresql.org; Tue, 09 Sep 2025 14:30:19 +0000 Received: from mail-ej1-x630.google.com ([2a00:1450:4864:20::630]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uvzMP-001Ual-0f for pgsql-hackers@lists.postgresql.org; Tue, 09 Sep 2025 14:30:18 +0000 Received: by mail-ej1-x630.google.com with SMTP id a640c23a62f3a-b04b55d5a2cso593513766b.2 for ; Tue, 09 Sep 2025 07:30:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1757428216; x=1758033016; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=A8wevT0yKbdrFbrdI1ylhOQnQNu8BTnfgReEB6ZdnzM=; b=E0ME8pnQSCXgh1YuWooj4j0ELKTK/Ed2bhXC5lgLLbdHJzuvyxmeq2JRtr84Q0fmmm NI4aN5KNKa3b3sy/QaDyj6tdzqdXBTudy7WXIOK9fqGWnOy94oa3VZmdFNoEHBMPmbIg tkmRtCyzY0WfbSJj9SlZyPAVrvNi3fSi1TMYF/nj1Ah2srag0ntlsC4uZq39LKwmdYHe 4E1AMKK0BFDjC+mX39V/uk70mGko+9Dfj84d6E50y9evkrGQ4r642lqoQO5gLKQ5Bi5E EkqwZ9zC865gRqa+anQl2W4fviEm/cJVodmKansAiOKc257YFn/SjgqGDYxqlVAvlIWj mqfQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1757428216; x=1758033016; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=A8wevT0yKbdrFbrdI1ylhOQnQNu8BTnfgReEB6ZdnzM=; b=VAvxMY2RhG4od93isuQNgmc3A7DpkautymzRufQ6di+IVg2SJG3UT5LIXbySjW3cM0 51TzlSkNjmQ7RhVsdgAbs8ymNCwjL+LusQInNjeXb+vLRuDh+/9skroqNZKBp1Pv9nRZ HtuOyYZrQmjhN9Fry1+2jAI/6rrR+414dzdrsItZC8dH21oxppHg/JbqF1sof5/xUX/0 e/Fy3mtVtkqTegBwDBetmKKfKZL25nGYvdxTR336AZ33WjmL1sC48Pa7PAV+QnXOsM2N aTa0iu5Vq3lxc9xjKtsgp7aai+f5sJ1bi2Cq5QFvk36t4KM5apuxqotG/96GUxdPTD1s POYw== X-Forwarded-Encrypted: i=1; AJvYcCVkqlIMLRcDh+lyWf3qctKa50kDoQq5h4YPQmJxnX6v1cHJaYtBp5Fa8thxYdewgyKG5fytAp/NbPN+a9Ui@lists.postgresql.org X-Gm-Message-State: AOJu0YxyX00uSz2HbJqDBZYzYEAg7xqjPwpTcWfvOvNhZ9toS/tHceg3 kqcs8TA5tHx+V50SKpdyZnOIJyqFa0/I3RFuQb6m6JPdjvdikCVYrioH0g0c5Rs10RFFLnlMfC4 w9ZHul3Pxt0p6NbzpV4pITppoyxva8/s= X-Gm-Gg: ASbGncsKuEghbgCUYja5rdb1qhMQlmqEXe63nOwoaMjxc0I9TSdiJUH0cP2p5qd8NvR TcHrrNHfmAr/ONsqLVJ2cklSu87HSArUn01O6YFCzy2Qp/x143DOre7MOKkajuENnIF0X7QnUNx faRsTsJMSIx8+ssy8rX8Ev9v/JSateMECXfnHV/+JhMA3Jew6U1WBKrvFTFWh6tXmx4GeazkYcR 3EkHVQK X-Google-Smtp-Source: AGHT+IHF/EC28CfVDlf5V0H1M/T95MDazDP+jmRzzY83ccAmAp/akZkMgH2iqWKPKRNnhmI5/dAyyQlGwd+/ZODK7aw= X-Received: by 2002:a17:907:c05:b0:afe:a615:39ef with SMTP id a640c23a62f3a-b04b13cfb31mr1085196066b.9.1757428215534; Tue, 09 Sep 2025 07:30:15 -0700 (PDT) MIME-Version: 1.0 References: <87il22cj51.fsf@163.com> In-Reply-To: From: Robert Haas Date: Tue, 9 Sep 2025 10:30:01 -0400 X-Gm-Features: Ac12FXzKHq9jnskmyqvC5xDSqBmJhzOpbq4d5x21lojB0wurYD6kIjRjTyChyf8 Message-ID: Subject: Re: Eager aggregation, take 3 To: Richard Guo Cc: Matheus Alcantara , Tom Lane , Tender Wang , Paul George , Andy Fan , PostgreSQL-development , pgsql-hackers@lists.postgresql.org Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, Sep 9, 2025 at 6:30=E2=80=AFAM Richard Guo = wrote: > > I think it would be worth considering generating the partially grouped > > relations in a second pass. Right now, as you progress from the bottom > > of the join tree towards the top, you created grouped rels as you go. > > But you could equally well finish planning everything up to the > > scan/join target first and then go back and add grouped_rels to > > relations where it seems worthwhile. > > Hmm, I don't think so. I think the presence of eager aggregation > could change the best join order. For example, without eager > aggregation, the optimizer might find that (A JOIN B) JOIN C the best > join order. But with eager aggregation on B, the optimizer could > prefer A JOIN (AGG(B) JOIN C). I'm not sure how we could find the > best join order with eager aggregation applied without building the > join tree from the bottom up. Oh, that is a problem, yes. :-( > > I haven't done a detailed comparison of generate_grouped_paths() to > > other parts of the code, but I have an uncomfortable feeling that it > > might be rather similar to some existing code that probably already > > exists in multiple, slightly-different versions. Is there any > > refactoring we could do here? > > Yeah, we currently have several functions that do similar, but not > exactly the same, things. Maybe some refactoring is possible -- maybe > not -- I haven't looked into it closely yet. However, I'd prefer to > address that in a separate patch if possible, since this issue also > exists on master, and I want to avoid introducing such changes in this > already large patch. Well, it's not just a matter of "this already exists" -- it gets harder and harder to unify things the more near-copies you add. > Good point. I do have manually tested GEQO by setting geqo_threshold > to 2 and running the regression tests to check for any planning > errors, crashes, or incorrect results. However, I'm not sure where > test cases for GEQO should be added. I searched the regression tests > and found only one explicit GEQO test, added back in 2009 (commit > a43b190e3). It's not quite clear to me what the current policy is for > adding GEQO test cases. > > Anyway, I will add some test cases in eager_aggregate.sql with > geqo_threshold set to 2. Sounds good. I think GEQO is mostly-unmaintained these days, but if we're updating the code, I think it is good to add tests. Being that the code is so old, it probably lacks adequate test coverage. --=20 Robert Haas EDB: http://www.enterprisedb.com