public inbox for [email protected]  
help / color / mirror / Atom feed
Re: Unnecessary buffer usage with multicolumn index, row comparison, and equility constraint
2+ messages / 2 participants
[nested] [flat]

* Re: Unnecessary buffer usage with multicolumn index, row comparison, and equility constraint
@ 2024-05-11 04:05  Ron Johnson <[email protected]>
  0 siblings, 1 reply; 2+ messages in thread

From: Ron Johnson @ 2024-05-11 04:05 UTC (permalink / raw)
  To: pgsql-generallists.postgresql.org <[email protected]>

On Fri, May 10, 2024 at 11:28 PM WU Yan <[email protected]> wrote:

> Hi everyone, first time here. Please kindly let me know if this is not the
> right place to ask.
>
> I notice a simple query can read a lot of buffer blocks in a meaningless
> way, when
> 1. there is an index scan on a multicolumn index
> 2. there is row constructor comparison in the Index Cond
> 3. there is also an equality constraint on the leftmost column of the
> multicolumn index
>
>
> ## How to reproduce
>
> I initially noticed it on AWS Aurora RDS, but it can be reproduced in
> docker container as well.
> ```bash
> docker run --name test-postgres -e POSTGRES_PASSWORD=mysecretpassword -d
> -p 5432:5432 postgres:16.3
> ```
>
> Create a table with a multicolumn index. Populate 12 million rows with
> random integers.
> ```sql
> CREATE TABLE t(a int, b int);
> CREATE INDEX my_idx ON t USING BTREE (a, b);
>
> INSERT INTO t(a, b)
> SELECT
>     (random() * 123456)::int AS a,
>     (random() * 123456)::int AS b
> FROM
>     generate_series(1, 12345678);
>
> ANALYZE t;
> ```
>
> Simple query that uses the multicolumn index.
> ```
> postgres=# explain (analyze, buffers) select * from t where row(a, b) >
> row(123450, 123450) and a = 0 order by a, b;
>

Out of curiosity, why "where row(a, b) > row(123450, 123450)" instead of "where
a > 123450 and b > 123450"?


^ permalink  raw  reply  [nested|flat] 2+ messages in thread

* Re: Unnecessary buffer usage with multicolumn index, row comparison, and equility constraint
@ 2024-05-11 04:18  Tom Lane <[email protected]>
  parent: Ron Johnson <[email protected]>
  0 siblings, 0 replies; 2+ messages in thread

From: Tom Lane @ 2024-05-11 04:18 UTC (permalink / raw)
  To: Ron Johnson <[email protected]>; +Cc: pgsql-generallists.postgresql.org <[email protected]>

Ron Johnson <[email protected]> writes:
> On Fri, May 10, 2024 at 11:28 PM WU Yan <[email protected]> wrote:
>> Simple query that uses the multicolumn index.
>> postgres=# explain (analyze, buffers) select * from t where row(a, b) >
>> row(123450, 123450) and a = 0 order by a, b;

> Out of curiosity, why "where row(a, b) > row(123450, 123450)" instead of "where
> a > 123450 and b > 123450"?

That row() condition actually means "a > 123450 OR
(a = 123450 AND b > 123450)", which is not the same.

(It'd be a little clearer with two different values in
the row constant, perhaps.)

It does seem like there's an optimization failure here.
I don't expect btree to analyze row comparisons exactly,
but it's sad that it seems to be stupider than for the
simplified case

explain (analyze, buffers) select * from t
where a >= 123450 and a = 0
order by a, b;
                                                  QUERY PLAN                                                   
---------------------------------------------------------------------------------------------------------------
 Index Only Scan using my_idx on t  (cost=0.43..4.45 rows=1 width=8) (actual time=0.001..0.002 rows=0 loops=1)
   Index Cond: ((a >= 123450) AND (a = 0))
   Heap Fetches: 0
 Planning:
   Buffers: shared hit=4
 Planning Time: 0.081 ms
 Execution Time: 0.013 ms
(7 rows)

For that, it's able to see that the index conditions are
contradictory, so it fetches no index pages whatever.

			regards, tom lane






^ permalink  raw  reply  [nested|flat] 2+ messages in thread


end of thread, other threads:[~2024-05-11 04:18 UTC | newest]

Thread overview: 2+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-05-11 04:05 Re: Unnecessary buffer usage with multicolumn index, row comparison, and equility constraint Ron Johnson <[email protected]>
2024-05-11 04:18 ` Tom Lane <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox