public inbox for [email protected]
help / color / mirror / Atom feedFrom: Tom Lane <[email protected]>
To: Peter J. Holzer <[email protected]>
Cc: [email protected]
Subject: Re: Index not used in certain nested views but not in others
Date: Sat, 03 May 2025 11:50:28 -0400
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <20250430151647.7kootztymzznydn5@victor>
<[email protected]>
"Peter J. Holzer" <[email protected]> writes:
> On 2025-04-30 17:16:47 +0200, Markus Demleitner wrote:
>> dfbsspec.raw_spectra is about 23 Megarows,
>> -> Parallel Seq Scan on raw_spectra (cost=0.00..2626995.66 rows=5803266 width=756) (actual time=0.137..6841.379 rows=4642657 loops=5)
> It estimates that it has to read 5803266 of those 23000000 rows.
No, you're misreading that (I admit it's confusing). The rows report
is the average per parallel worker, and the loops count indicates we
had 5 workers. So actually this parallel seqscan emitted 4642657*5
= 23213285 rows, or the whole table, which is what should be expected
given it has no filter condition and no LIMIT.
(I am wondering why the estimate is only 5803266 rows, because I don't
think that number is scaled for the number of workers...)
>> -> Nested Loop (cost=0.56..4871.60 rows=561 width=0) (actual time=2.478..2.479 rows=0 loops=1)
>> -> Seq Scan on main (cost=0.00..52.61 rows=561 width=48) (actual time=0.011..0.317 rows=561 loops=1)
>> -> Index Scan using raw_spectra_pub_did on raw_spectra (cost=0.56..8.58 rows=1 width=66) (actual time=0.003..0.003 rows=0 loops=561)
>> Index Cond: (pub_did = main.obs_publisher_did)
> Here you select only 561 rows. That's just a tiny fraction of the whole
> table, so the optimizer estimates that doing a few hundred index lookups
> is faster than reading the whole table.
The point here is that the chosen plan shape allows pushing the join
qual "raw_spectra.pub_did = main.obs_publisher_did" down to be an
index condition, which is exactly what we have to do if we want to
avoid reading all of raw_spectra. What Markus is complaining about
is that that fails to happen if there's a UNION ALL in the way.
Postgres is capable of doing that in other cases, so it's a fair
question.
regards, tom lane
view thread (7+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: Index not used in certain nested views but not in others
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox