From: Tom Lane
To: WU Yan <4wuyan@gmail.com>
cc: pgsql-general@lists.postgresql.org
Subject: Re: Wasteful nested loop join when there is `limit` in the query
Date: Mon, 17 Feb 2025 02:01:41 -0500
Message-ID: <744079.1739775701@sss.pgh.pa.us>

WU Yan <4wuyan@gmail.com> writes:
> Hello everyone, I am still learning postgres planner and performance
> optimization, so please kindly point out if I missed something obvious.

An index on employee.name would likely help here.
Even if we had an optimization for pushing LIMIT down through a join
(which you are right, we don't) it could not push the LIMIT through a
sort step.  So you need presorted output from the scan of "employee".
I think this example would behave better with that.

You may also need to test with non-toy amounts of data to get the
plan you think is better: an example with only half a dozen rows is
going to be swamped by startup costs.

			regards, tom lane
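
The point about presorted input can be sketched outside the database. The
toy model below is not how the PostgreSQL executor is implemented; it only
illustrates why a sort sitting above the join forces the whole join to run
before LIMIT can take effect, while an already-ordered scan lets LIMIT stop
the join early. Only the "employee" table and its "name" column come from
the thread; the department data, the function names, and the probe counter
are assumptions made up for this illustration.

```python
from itertools import islice

# Hypothetical toy data: only "employee" and "name" appear in the thread.
employees = [
    {"name": "dave", "dept_id": 2},
    {"name": "alice", "dept_id": 1},
    {"name": "carol", "dept_id": 1},
    {"name": "bob", "dept_id": 2},
    {"name": "erin", "dept_id": 1},
    {"name": "frank", "dept_id": 2},
]
departments = {1: "eng", 2: "ops"}  # stands in for the inner side of the join


def nested_loop_join(outer_rows, probes):
    """Lazily join each outer row to its department, counting inner probes."""
    for row in outer_rows:
        probes[0] += 1  # one lookup on the inner side per outer row
        yield row["name"], departments[row["dept_id"]]


def join_then_sort_then_limit(limit):
    """Unordered scan: the sort consumes the entire join before LIMIT applies."""
    probes = [0]
    joined = sorted(nested_loop_join(employees, probes))
    return joined[:limit], probes[0]


def presorted_scan_then_join_then_limit(limit):
    """Ordered scan: LIMIT stops pulling rows, so the join stops early."""
    probes = [0]
    # sorted() here only stands in for an index scan that already returns
    # rows in name order; a real index scan would not sort at query time.
    ordered = sorted(employees, key=lambda r: r["name"])
    joined = list(islice(nested_loop_join(ordered, probes), limit))
    return joined, probes[0]


if __name__ == "__main__":
    print(join_then_sort_then_limit(2))
    print(presorted_scan_then_join_then_limit(2))
```

Both functions return the same two rows, but the first probes the inner
side once per employee row while the second probes it only as many times
as the limit. With six rows the difference is too small to matter, which
matches the caveat about testing with non-toy amounts of data.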