Re: Postgres Query Plan using wrong index

public inbox for [email protected]  
help / color / mirror / Atom feed

From: Manikandan Swaminathan <[email protected]>
To: Tom Lane <[email protected]>
Cc: [email protected]
Subject: Re: Postgres Query Plan using wrong index
Date: Wed, 2 Apr 2025 16:44:32 -0700
Message-ID: <[email protected]> (raw)

Thanks so much for your help, Tom.

Sorry, I didn’t quite understand the answer — I have a few follow-up questions.  Sorry, I'm new to Postgres so I am a bit ignorant here and would appreciate any tips on the query planner you could give.

1) Why is the query currently picking the poorly performing index? I already have an index on (col_a, col_b) that performs well. When I remove the separate index on (col_b), it correctly uses the (col_a, col_b) index and the query runs efficiently. But when both indexes are present, it chooses the slower (col_b) index instead.

2) Why would the index you suggested, (col_b, col_a), perform better than (col_a, col_b)? I would’ve expected the filter on col_a to come first, followed by the aggregate on col_b. In my mind, it needs to find rows matching the col_a condition before calculating the MIN(col_b), and I assumed it would traverse the B-tree accordingly.  I'm more used to MySQL where I think it is called a "lose index scan".  I must have a gap in my understanding of how Postgres approaches this.  Thanks for your help!

3) Why does the planner choose the better-performing (col_a, col_b) index when the filter is col_a > 5000, but switch to the slower (col_b) index when the filter is not at the edge of the range, like col_a > 4996? For reference, here’s the query plan when filtering for col_a > 5000. It uses the correct index on (col_a, col_b).

postgres=# explain analyze select min(col_b) from test_table  where col_a > 5000;

 Aggregate  (cost=4.46..4.46 rows=1 width=4) (actual time=0.008..0.008 rows=1 loops=1)
   ->  Index Only Scan using idx_col_b_a on test_table  (cost=0.43..4.45 rows=1 width=4) (actual time=0.004..0.005 rows=0 loops=1)
         Index Cond: (col_a > 5000)
         Heap Fetches: 0
 Planning Time: 2.279 ms
 Execution Time: 0.028 ms
(6 rows)

> 
> On Apr 1, 2025, at 5:30 PM, Tom Lane <[email protected]> wrote:
> Manikandan Swaminathan <[email protected]> writes:
>> 4. When running the following query, I would expect the index "idx_col_b_a"
>> to be used: select min(col_b) from test_table  where col_a > 4996.
>> I have a range-based filter on col_a, and am aggregating the result with
>> min(col_b). Both columns are covered by "idx_col_b_a".
> 
> They may be covered, but sort order matters, and that index has the
> wrong sort order to help with this query.  Try
> 
> create index on test_table(col_b, col_a);
> 
>          regards, tom lane

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: Postgres Query Plan using wrong index
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox