Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uBF8V-008O3P-O0 for pgsql-general@arkaria.postgresql.org; Sat, 03 May 2025 15:50:44 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uBF8T-00FGPd-8b for pgsql-general@arkaria.postgresql.org; Sat, 03 May 2025 15:50:42 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uBF8S-00FGPU-U5 for pgsql-general@lists.postgresql.org; Sat, 03 May 2025 15:50:41 +0000 Received: from sss.pgh.pa.us ([68.162.161.243]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1uBF8R-000sLZ-3A for pgsql-general@lists.postgresql.org; Sat, 03 May 2025 15:50:41 +0000 Received: from sss1.sss.pgh.pa.us (localhost [127.0.0.1]) by sss.pgh.pa.us (8.15.2/8.15.2) with ESMTP id 543FoSia2173355; Sat, 3 May 2025 11:50:29 -0400 From: Tom Lane To: "Peter J. Holzer" cc: pgsql-general@lists.postgresql.org Subject: Re: Index not used in certain nested views but not in others In-reply-to: <20250503123447.6axo52bqm2lyy33k@hjp.at> References: <20250430151647.7kootztymzznydn5@victor> <20250503123447.6axo52bqm2lyy33k@hjp.at> Comments: In-reply-to "Peter J. Holzer" message dated "Sat, 03 May 2025 14:34:47 +0200" MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <2173353.1746287428.1@sss.pgh.pa.us> Content-Transfer-Encoding: quoted-printable Date: Sat, 03 May 2025 11:50:28 -0400 Message-ID: <2173354.1746287428@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk "Peter J. Holzer" writes: > On 2025-04-30 17:16:47 +0200, Markus Demleitner wrote: >> dfbsspec.raw_spectra is about 23 Megarows, >> -> Parallel Seq Scan on raw_spectra (cost=3D0.00..2626995.66 rows=3D5= 803266 width=3D756) (actual time=3D0.137..6841.379 rows=3D4642657 loops=3D= 5) > It estimates that it has to read 5803266 of those 23000000 rows. No, you're misreading that (I admit it's confusing). The rows report is the average per parallel worker, and the loops count indicates we had 5 workers. So actually this parallel seqscan emitted 4642657*5 =3D 23213285 rows, or the whole table, which is what should be expected given it has no filter condition and no LIMIT. (I am wondering why the estimate is only 5803266 rows, because I don't think that number is scaled for the number of workers...) >> -> Nested Loop (cost=3D0.56..4871.60 rows=3D561 width=3D0) (actual ti= me=3D2.478..2.479 rows=3D0 loops=3D1) >> -> Seq Scan on main (cost=3D0.00..52.61 rows=3D561 width=3D48) = (actual time=3D0.011..0.317 rows=3D561 loops=3D1) >> -> Index Scan using raw_spectra_pub_did on raw_spectra (cost=3D= 0.56..8.58 rows=3D1 width=3D66) (actual time=3D0.003..0.003 rows=3D0 loops= =3D561) >> Index Cond: (pub_did =3D main.obs_publisher_did) > Here you select only 561 rows. That's just a tiny fraction of the whole > table, so the optimizer estimates that doing a few hundred index lookups > is faster than reading the whole table. The point here is that the chosen plan shape allows pushing the join qual "raw_spectra.pub_did =3D main.obs_publisher_did" down to be an index condition, which is exactly what we have to do if we want to avoid reading all of raw_spectra. What Markus is complaining about is that that fails to happen if there's a UNION ALL in the way. Postgres is capable of doing that in other cases, so it's a fair question. regards, tom lane