Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s5Vqi-005loJ-Pn for pgsql-general@arkaria.postgresql.org; Fri, 10 May 2024 19:24:08 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1s5Vqg-000qxZ-Sj for pgsql-general@arkaria.postgresql.org; Fri, 10 May 2024 19:24:07 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s5Vqg-000qxQ-4u for pgsql-general@lists.postgresql.org; Fri, 10 May 2024 19:24:06 +0000 Received: from mout.gmx.net ([212.227.17.22]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s5Vqd-000Nbd-Oa for pgsql-general@lists.postgresql.org; Fri, 10 May 2024 19:24:05 +0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmx.net; s=s31663417; t=1715369041; x=1715973841; i=jimis@gmx.net; bh=32fYwo2L5hTqSro9Ys79/mtS29Cqfbg7bB9KBURLTbY=; h=X-UI-Sender-Class:Date:From:To:Subject:In-Reply-To:Message-ID: References:MIME-Version:Content-Type:Content-Transfer-Encoding:cc: content-transfer-encoding:content-type:date:from:message-id: mime-version:reply-to:subject:to; b=exSBgLyFLOQqJyHcNa0KyZZIsqx2jYdKb3xrnh/bJi3bb7RgivJBVX6wFdqEBID9 yMlns2VEEqEo5Qk+UhAsxixtSNRVkepqcagBQjvjxN0CzcrsRrm1F4vY2NX8QLE/j 6NIrPfT6E4058/pgM9lWCi7nFdFpFbYaZQ2Faaud8ueHXirxZhFPJmb5SpLtaHPyX mqAQxArwxBynBeL+Z6rNwvfp24Q9NL1vrW92DG3wa1+X8HDh3DHtxUaIbgs28Lh7N gK8VuKsCbRSgPxQOjoN/0n8smK9P5wDS7hHdYxK3DBZktuNzt6xvC1r+v1pJ4elTE MTv8ibgWfnT+slgiAw== X-UI-Sender-Class: 724b4f7f-cbec-4199-ad4e-598c01a50d3a Received: from [10.9.70.35] ([185.55.106.54]) by mail.gmx.net (mrgmx105 [212.227.17.168]) with ESMTPSA (Nemesis) id 1M72oB-1rxWpS2bhn-008WkT for ; Fri, 10 May 2024 21:24:01 +0200 Date: Fri, 10 May 2024 21:24:00 +0200 (CEST) From: Dimitrios Apostolou To: pgsql-general@lists.postgresql.org Subject: Re: SELECT DISTINCT chooses parallel seqscan instead of indexscan on huge table with 1000 partitions In-Reply-To: <7886a68f-b466-2131-1747-f69f0fb71a37@gmx.net> Message-ID: <69077f15-4125-2d63-733f-21ce6eac4f01@gmx.net> References: <7886a68f-b466-2131-1747-f69f0fb71a37@gmx.net> MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed X-Provags-ID: V03:K1:FWjgms+ZM0U5G0lkgawdecj4A/J6SfqwxEdolnAU2nwYkV/xSOE 1SEFdkLejfgXT+jxP9oP9ois1+gQfY5prb9otGpoFNhDcTeJqrC96NqV0LEySlUF/ANf1Jp 69qv7eGBQUY2YY2cIxqHT70ZFCMvTl4qaKVaPvfhQgdU/8ygvpVYXDNO1V09iw+8WycovKz 3l/nVpQA2LFOPrLGNSLdw== X-Spam-Flag: NO UI-OutboundReport: notjunk:1;M01:P0:lkjQENpIVuU=;6OvVAyh0VuIUZmDsLJ9IVZTS/8Z LivfFdFyGEfHC4QUMgVjGp4M67vXCx3+UqNO5DY7H0bfnc41XPB+9XNSh77mW+EAmXjKwAUtV XJiwEq3iXl/2xfnwMEiAyq+WWmvfE/P2ZI4zwpEZKBMEuGSlAqPueIo1o/8QmKziIuqQNPjmj jypFohHDbcCbDxZvHHSJ216UD3sIQMNjwgZCgK2N+w0hBwyT7aTNEX768Nvnx234HVQJAYW7h UsMHU2+9g7Xno422NWNFdy22YTDD42jo0T0OuUlLsdoO7WWNukP0JgazW4klGZT1K8orPSGvc ieskAgKH2rkDrL0IqGQERY3x0NXZ55Z27yUY/a+S6WK49MnxIi3EVjIYE5EXvaOnQTOhULYkI 7j2DTLtsEDBohMUShiEb6CND0/qx/riLm+rgzogqsJBKPnbpgEN77FuPAyE9iFUy8Ey64NRLY gphUqfldrainykG8tkwv1wdcgrTN4CMXOBcyrB2wkbj7mrPFq+56wh/Bqlkw6B4rKhoq5iZ0s /A7coPOlt99/h0uKgCuG40Y5oXWguKn7R2b8V4WP1K1mYUETxpjOebcFzF8SACSOYec3K0dYg R7d7ZTjm1aETtS5xD7koHeHSEWWY7uuf7GoAoZx5AzqyD9Yz3MvDxkhRoUhO33LY6oo686PLc XHHIsZfxspfSYMp4FaBpO8tLh+1M/bkFYExpvXJJtGcUMF2fTvnKSjCYmXZZSk083dXCkJ2b/ 8BxOP7HovIYRug4WQtH0txGZNe7klnJwH0YuqD2dx90NCza92QT2Pvnfar58RdbRMq56fwLq7 8dxLYFL2vqS8mZAFf/i0Rynrmcd+9XRf/lWtCYMnUGQPE= Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Fri, 10 May 2024, Dimitrios Apostolou wrote: > I noticed that the following query is very very slow (too long to wait f= or it > to finish): > > SELECT DISTINCT workitem_n FROM test_runs_raw ORDER BY workitem_n DESC L= IMIT 10; Update: even the simplest SELECT DISTINCT query shows similar behaviour: EXPLAIN SELECT DISTINCT workitem_n FROM test_runs_raw LIMIT 10; Limit (cost=3D724518979.52..724518979.92 rows=3D10 width=3D4) -> Unique (cost=3D724518979.52..724518987.52 rows=3D200 width=3D4) -> Sort (cost=3D724518979.52..724518983.52 rows=3D1600 width= =3D4) Sort Key: test_runs_raw.workitem_n -> Gather (cost=3D724518732.37..724518894.37 rows=3D1600= width=3D4) Workers Planned: 4 -> HashAggregate (cost=3D724517732.37..724517734.3= 7 rows=3D200 width=3D4) Group Key: test_runs_raw.workitem_n -> Parallel Append (cost=3D0.00..704131546.9= 0 rows=3D8154474186 width=3D4) -> Parallel Index Only Scan using test_= runs_raw__part_max9600k_pkey on test_runs_raw__part_max9600k test_runs_raw= _480 (cost=3D0.57..1429238.50 rows=3D16811660 width=3D4) -> Parallel Index Only Scan using test_= runs_raw__part_max10140k_pkey on test_runs_raw__part_max10140k test_runs_r= aw_507 (cost=3D0.57..1081827.27 rows=3D12896836 width=3D4) -> Parallel Seq Scan on test_runs_raw__= part_max9500k test_runs_raw_475 (cost=3D0.00..2717185.06 rows=3D32060806 = width=3D4) -> Parallel Seq Scan on test_runs_raw__= part_max11180k test_runs_raw_559 (cost=3D0.00..2610814.95 rows=3D30806095= width=3D4) It also takes ages to return, so I have to interrupt it. I believe it should exit early, as soon as it finds 10 distinct values (which should be rather easy even with parallel seqscans, given the pattern followed when inserting the data). Thanks, Dimitris