From: Tom Lane
To: Dimitrios Apostolou
cc: pgsql-general@lists.postgresql.org
Subject: Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions
Date: Fri, 10 May 2024 21:33:38 -0400
Message-ID: <1685688.1715391218@sss.pgh.pa.us>

Dimitrios Apostolou writes:
> On Fri, 10 May 2024, Tom Lane wrote:
>> I'd say the blame lies with that (probably-default) estimate of
>> just 200 distinct rows.  That means the planner expects to have
>> to read about 5% (10/200) of the tables to get the result, and
>> that's making fast-start plans look bad.

> In any case, even after the planner decides to execute the terrible plan
> with the parallel seqscans, why doesn't it finish right when it finds 10
> distinct values?

That plan can't emit anything at all till it finishes the Sort.

I do kind of wonder why it's producing both a hashagg and a Unique
step --- seems like it should do one or the other.

> Thanks, I'll save the ANALYZE as the last step; I feel it's a good
> opportunity to figure out more details about how postgres works. Plus I
> expect ANALYZE to last a couple of days, so I should first find quiet time
> for that. :-)

It really should not take too long --- it reads a sample, not the
whole table.

			regards, tom lane
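
As a rough illustration of the arithmetic in the quoted estimate (not the planner's actual code), the sketch below computes the fraction of the input a fast-start plan would be expected to read before it has seen 10 distinct values, assuming the values are spread uniformly. It also shows why ANALYZE is usually quick: to my understanding it samples roughly 300 rows per unit of statistics target for each table it analyzes, regardless of table size. The function names are made up for this example, and the 300x multiplier is stated from memory of the PostgreSQL source, so treat it as an assumption.

```python
# Back-of-the-envelope sketch; function names are hypothetical and this is
# not how the PostgreSQL planner is actually implemented.

DEFAULT_N_DISTINCT = 200   # the (probably-default) distinct-value estimate
                           # mentioned in the thread


def fraction_to_scan(wanted_distinct: int, n_distinct: float) -> float:
    """Expected fraction of the input read before `wanted_distinct`
    different values have been seen, assuming a uniform spread."""
    return min(1.0, wanted_distinct / n_distinct)


def analyze_sample_rows(statistics_target: int = 100) -> int:
    """Approximate rows sampled by ANALYZE per table.

    Assumption: 300 rows per unit of statistics target, independent of
    how large the table is."""
    return 300 * statistics_target


if __name__ == "__main__":
    # 10 distinct values wanted, 200 believed to exist
    print(fraction_to_scan(10, DEFAULT_N_DISTINCT))
    # rows sampled per table at the default statistics target of 100
    print(analyze_sample_rows())
```

With a realistic distinct-value estimate after ANALYZE, the first figure moves away from 10/200, which is what changes how attractive a fast-start plan looks.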