public inbox for [email protected]
help / color / mirror / Atom feedFrom: Peter J. Holzer <[email protected]>
To: [email protected]
Subject: Re: Index not used in certain nested views but not in others
Date: Sat, 3 May 2025 14:34:47 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <20250430151647.7kootztymzznydn5@victor>
References: <20250430151647.7kootztymzznydn5@victor>
On 2025-04-30 17:16:47 +0200, Markus Demleitner wrote:
> There is another view made up of about 20 tables, looking somewhat
> like this:
[...]
> The first view, ivoa.obs_radio, is just a few hundred records,
> dfbsspec.raw_spectra is about 23 Megarows,
[...]
> -> Parallel Seq Scan on raw_spectra (cost=0.00..2626995.66 rows=5803266 width=756) (actual time=0.137..6841.379 rows=4642657 loops=5)
[...]
>
> My problem is: I can't seem to figure out why Postgres chooses to ignore
> the pubdid index on raw_spectra.pub_did and instead does the
> time-consuming seqscan.
It estimates that it has to read 5803266 of those 23000000 rows. That's
25 %! I'm not surprised that it thinks just reading the whole table is
faster than doing almost 6 million index lookups (The actual count is
4642657, so that estimate wasn't totally off).
> Trying to investigate more closely, I wanted to simplify the
> situation and created a view like ivoa.obscore but only having the
> evil table in it:
[...]
> -> Nested Loop (cost=0.56..4871.60 rows=561 width=0) (actual time=2.478..2.479 rows=0 loops=1)
> -> Seq Scan on main (cost=0.00..52.61 rows=561 width=48) (actual time=0.011..0.317 rows=561 loops=1)
> -> Index Scan using raw_spectra_pub_did on raw_spectra (cost=0.56..8.58 rows=1 width=66) (actual time=0.003..0.003 rows=0 loops=561)
> Index Cond: (pub_did = main.obs_publisher_did)
Here you select only 561 rows. That's just a tiny fraction of the whole
table, so the optimizer estimates that doing a few hundred index lookups
is faster than reading the whole table.
> So, when the SELECT statement on dfbsspec.ssa stands along in the view
> definition, Postgres does the right thing; when the exact same query
> stands in a UNION ALL with other tables, Postgres doesn't use the
> index. Hu?
It is obviously not the exact same query if one of them need to read
10000 times as many rows.
hjp
--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | [email protected] | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"
Attachments:
[application/pgp-signature] signature.asc (833B, 2-signature.asc)
download
view thread (7+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected]
Subject: Re: Index not used in certain nested views but not in others
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox