Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s5dzS-0073As-Ic for pgsql-general@arkaria.postgresql.org; Sat, 11 May 2024 04:05:42 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1s5dzQ-003H5P-DD for pgsql-general@arkaria.postgresql.org; Sat, 11 May 2024 04:05:40 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1s5dzQ-003H5G-1y for pgsql-general@lists.postgresql.org; Sat, 11 May 2024 04:05:40 +0000 Received: from mail-ot1-x334.google.com ([2607:f8b0:4864:20::334]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1s5dzL-000R0f-80 for pgsql-general@lists.postgresql.org; Sat, 11 May 2024 04:05:39 +0000 Received: by mail-ot1-x334.google.com with SMTP id 46e09a7af769-6f0e3b45706so1495799a34.0 for ; Fri, 10 May 2024 21:05:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1715400334; x=1716005134; darn=lists.postgresql.org; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :from:to:cc:subject:date:message-id:reply-to; bh=NgPgwWzPWLSr5GzG//ezmo7N9no0dv2DB476PiqxbAM=; b=V4NcY7zPqDhfLhOh0a6PASdDa6NREPcUeDxP+lAKIoZzdT0xqURMOg0fYV8FhFMsoG QQ0OR/tgnE8DjcMaS2F30YCtsJ5CmTDWNnhTU17Jj7uNPmK59JbdwpLXlFKViN6H8tT+ B1ddgcZzxn7j9n6tiqv7COCGRS8YjkBEHcmdoEhnPl3TAUMDB9vuSHTMsuwUf9q0oVss 6tbR/jSo9rIXJ+4q9AmP2fuem+VtWmi4cjR532UDBTU1pOyY5UMc834kmP+fqmRmT4cG HhW8nr5aZxnldIotMsgHjFz8l3CAbF1zP/DIy41Gj0DxXATJyUr4jsdrBf1veADBHBUo 0wBQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1715400334; x=1716005134; h=to:subject:message-id:date:from:in-reply-to:references:mime-version :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=NgPgwWzPWLSr5GzG//ezmo7N9no0dv2DB476PiqxbAM=; b=SL3vaU10kF1EDi4hljJ3jYyo+UFM0SEpWuxJP8HxYkXiUJneevxRe7dcfgVdswIdhi 95GIqjTCdIBSMEZLRXuXJ+qrspHWqes+hKqE9dHhGgI3R7Cd3ssPTmpkuUfJggyhDF3n mZVGml5gZDMemXqDcnXDCs7DL7mcDgkRdiLClpdfaVf0FlXC9C/1ShNZUg5GbIXVYCSa M6jbOaiYAS6ILe/QwH1iJjXmNCjG6QTpyiOxPOlVrBTAMbOresLAttkFHC6NSxfrxBNc AaA6e3wJERUjgn5nyki6dj+v1Pja+eQd+lX734ByHuMmSRYygG5y/8GsCtl4nYfv1nni +2Kg== X-Gm-Message-State: AOJu0Yxt1YhmLRIUmWaENB9/x7yxUczsooBSoHH1NqislUhhCfCAn1Zr gayZNcjS4IKlwa7nfc8RhQDYLdu69xW79wHrbFsrJQrZm4w7zuNC/j6Ed1orNXeTJDrRO2ew4lE aqFjNXQaSnzzzOvtcQvUkLCwFbgHp8Gal X-Google-Smtp-Source: AGHT+IF9hvhdUcsLJ4nkwUANGNHMkJjM4V4b4En8XWXi8e4+vU/KerdX9evx1gRhD75IbSXdh0KZ1G3Gp615Ws5E3oI= X-Received: by 2002:a05:6870:e98c:b0:23d:510d:ea55 with SMTP id 586e51a60fabf-24172870b2cmr4755521fac.18.1715400334330; Fri, 10 May 2024 21:05:34 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Ron Johnson Date: Sat, 11 May 2024 00:05:22 -0400 Message-ID: Subject: Re: Unnecessary buffer usage with multicolumn index, row comparison, and equility constraint To: "pgsql-generallists.postgresql.org" Content-Type: multipart/alternative; boundary="0000000000009352e6061825c3aa" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000009352e6061825c3aa Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable On Fri, May 10, 2024 at 11:28=E2=80=AFPM WU Yan <4wuyan@gmail.com> wrote: > Hi everyone, first time here. Please kindly let me know if this is not th= e > right place to ask. > > I notice a simple query can read a lot of buffer blocks in a meaningless > way, when > 1. there is an index scan on a multicolumn index > 2. there is row constructor comparison in the Index Cond > 3. there is also an equality constraint on the leftmost column of the > multicolumn index > > > ## How to reproduce > > I initially noticed it on AWS Aurora RDS, but it can be reproduced in > docker container as well. > ```bash > docker run --name test-postgres -e POSTGRES_PASSWORD=3Dmysecretpassword -= d > -p 5432:5432 postgres:16.3 > ``` > > Create a table with a multicolumn index. Populate 12 million rows with > random integers. > ```sql > CREATE TABLE t(a int, b int); > CREATE INDEX my_idx ON t USING BTREE (a, b); > > INSERT INTO t(a, b) > SELECT > (random() * 123456)::int AS a, > (random() * 123456)::int AS b > FROM > generate_series(1, 12345678); > > ANALYZE t; > ``` > > Simple query that uses the multicolumn index. > ``` > postgres=3D# explain (analyze, buffers) select * from t where row(a, b) > > row(123450, 123450) and a =3D 0 order by a, b; > Out of curiosity, why "where row(a, b) > row(123450, 123450)" instead of "w= here a > 123450 and b > 123450"? --0000000000009352e6061825c3aa Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
On Fri, May 10, 2024 at 11:28=E2=80=AFPM = WU Yan <4wuyan@gmail.com> wro= te:
Hi everyone, first time here. Please kindly= let me know if this is not the right place to ask.

I notice a simpl= e query can read a lot of buffer blocks in a meaningless way, when
1. th= ere is an index scan on a multicolumn index
2. there is row constructor = comparison in the Index Cond
3. there is also an equality constraint on = the leftmost column of the multicolumn index


## How to reproduce=

I initially noticed it on AWS Aurora RDS, but it can be reproduced = in docker container as well.
```bash
docker run --name test-postgres = -e POSTGRES_PASSWORD=3Dmysecretpassword -d -p 5432:5432 postgres:16.3
``= `

Create a table with a multicolumn index. Populate 12 million rows = with random integers.
```sql
CREATE TABLE t(a int, b int);
CREATE = INDEX my_idx ON t USING BTREE (a, b);

INSERT INTO t(a, b)
SELECT<= br>=C2=A0 =C2=A0 (random() * 123456)::int AS a,
=C2=A0 =C2=A0 (random() = * 123456)::int AS b
FROM
=C2=A0 =C2=A0 generate_series(1, 12345678);<= br>
ANALYZE t;
```

Simple query that uses the multicolumn inde= x.
```
postgres=3D# explain (analyze, buffers) select * from t where = row(a, b) > row(123450, 123450) and a =3D 0 order by a, b;

Out of curiosity, why "where row(a, b) > row(123450, 123450)" instead of &qu= ot;where a > 123450 and b > 123450"?

--0000000000009352e6061825c3aa--