Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ulwRw-007buA-97 for pgsql-hackers@arkaria.postgresql.org; Tue, 12 Aug 2025 21:22:28 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ulwRu-009v08-Mt for pgsql-hackers@arkaria.postgresql.org; Tue, 12 Aug 2025 21:22:26 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ulwRu-009v00-7t for pgsql-hackers@lists.postgresql.org; Tue, 12 Aug 2025 21:22:26 +0000 Received: from mail-qk1-x734.google.com ([2607:f8b0:4864:20::734]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1ulwRr-000Q7F-2p for pgsql-hackers@lists.postgresql.org; Tue, 12 Aug 2025 21:22:25 +0000 Received: by mail-qk1-x734.google.com with SMTP id af79cd13be357-7e62a1cbf82so418743585a.2 for ; Tue, 12 Aug 2025 14:22:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bowt-ie.20230601.gappssmtp.com; s=20230601; t=1755033742; x=1755638542; darn=lists.postgresql.org; h=in-reply-to:references:subject:cc:to:from:message-id:date :content-transfer-encoding:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=SwSRlJ5RAAhJI48lZMoQR5NRQfOxjW4KcGebVpjX03o=; b=D0YumjvqG0AuYC19wgnVxU7kQo0tZ/NQljIyyraM/tjw6MoTsS2R5Xckv88EScp37I muFtFe8BN8LnA+anPQB5ZXZWe3olEJRIAmFTEizKqLt6OZI/hTdIj412A62Iw1kvOl3g I+AulaYuqHwQt6drNRYYiVBrmq676dqSTK3xDBHqjFFfMJ8bCDY/aTWQC14OzWdKTxT3 8A7olCvt4I0dN8FP0Z2E9g2SqVDPat/WqrZNkVtXjHzQckWyD48OjGk8Fx7cxtA2HKNg Zea4UnJ0h305qbFbPSHqUZ9iGX+Gt+CEwAbNNXj9UOjxGhnhYGnvNq8TX8iGCKYwIv1f cA+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1755033742; x=1755638542; h=in-reply-to:references:subject:cc:to:from:message-id:date :content-transfer-encoding:mime-version:x-gm-message-state:from:to :cc:subject:date:message-id:reply-to; bh=SwSRlJ5RAAhJI48lZMoQR5NRQfOxjW4KcGebVpjX03o=; b=c4u/WybVNYYpVNNTrOUwAb+pzDJEyTcFHitEXGdYZMe4PCQSpVtAKhnY2FnvRDrkqo 5JTdYz5b9IzZZN6XHMRdTya+1u0yW1gjYLeXzrrqQGCLpwnqrRKU+HhpTWu6gYAA3pZW zSp+P1di9ea77bY+NZM4/5e25AY0zEanT+vE/Ztl+LrGdxaOrxL8Fj/m5QOzxRYZvU5R wqVnJHGkge+Z01P0FxpFqpEGhv57u40uo1VzC8TbkbNOuc01cYa7lU4k12RMgbzYIZ9A qy0DGRMtuSfgkIBB3/n4Oshb42D3ClzbuZt2j2djJrdiQBlyaLC2Y9edO5V6IWsbsVyG 51SA== X-Forwarded-Encrypted: i=1; AJvYcCUCOZ47yDdJRx9q9r0sc1w1L3lS99nCOLnuosxKTavNxLLNET8LEnGg0kgRww/fFTvjSL8HgWH6oPv8YT3T@lists.postgresql.org X-Gm-Message-State: AOJu0Yx6kclgAjoxhlqiGdytFFDlWNDY393MiTdpoY3GWjsvMX4llVAr 7ZHA2Kr2ye3024zWs9j9LHsuWjShoPHDd/MQNqj43zrRZnszFjIz6UV4uPwhw0ZLeto= X-Gm-Gg: ASbGncuunRvyNlndSD0lQa9oD95kYkLonHiYkKvETIP6H9MCUrvwLnywQjC5DeYnmTy /+C/oSn1JH23qRDudkvM575SSnzg6haZ8cKuT3jI4eW1FIB/5MdBvvEFpRGvYGc8Qky6H6Gxh14 Z4j9/lTF4rwZc2Gsfw5ZOKxqO3HOSWpwiN29fE326vdkzCjoYoxgs1uy7qOUGSVLDnY1DS27Mvc /Xj/V571HNZs+OuNnSNKWCbBUm22nLDil1h8Nj/HCDYG+aeVwNsIJpxtbLPtwku6xoRxGJW9l8p KdUKyNQytE8zZFDWMc/y2AVrCyzN4Xd4pnCOsHvUO4s65UK14FVphxYiXnAcyxwsoyz7ALkCJ/s z4W+zoAqxhHCpaUnWPkyhQwKEg3QdvzZKehEKtv6b/1pLP3cT8v5rPqcg X-Google-Smtp-Source: AGHT+IFNsHABX88zpz/xnlpT7ErDY/w8ffbbqHvtyErOo837qqQy2jJsTa9W5NYV9YzJswhBVsMxSw== X-Received: by 2002:ae9:c119:0:b0:7e7:faaa:e7c9 with SMTP id af79cd13be357-7e86522b3fcmr95439685a.12.1755033741741; Tue, 12 Aug 2025 14:22:21 -0700 (PDT) Received: from localhost (syn-024-090-110-103.res.spectrum.com. [24.90.110.103]) by smtp.gmail.com with ESMTPSA id af79cd13be357-7e861b50961sm111155585a.25.2025.08.12.14.22.20 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 12 Aug 2025 14:22:21 -0700 (PDT) Mime-Version: 1.0 Content-Transfer-Encoding: quoted-printable Content-Type: text/plain; charset=UTF-8 Date: Tue, 12 Aug 2025 17:22:20 -0400 Message-Id: From: "Peter Geoghegan" To: "Thomas Munro" Cc: "Tomas Vondra" , "Andres Freund" , "Nazir Bilal Yavuz" , "Robert Haas" , "Melanie Plageman" , "PostgreSQL Hackers" , "Georgios" , "Konstantin Knizhnik" , "Dilip Kumar" Subject: Re: index prefetching X-Mailer: aerc 0.20.0 References: <8ed1d326-5c6e-476e-b3fd-30d3da210546@vondra.me> <38b865bd-2ae9-4a94-a788-6e3dc99ccd70@vondra.me> <306fc8c0-c882-4602-86f5-a106b9ace603@vondra.me> <152ea782-5bd4-4435-b021-0ab2da61e63d@vondra.me> <7c2f6350-6fca-4e39-b0a8-8ac735f5d58a@vondra.me> <2e63cadd-2a03-46b1-866e-7ea5d3ffd37f@vondra.me> In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue Aug 12, 2025 at 1:06 AM EDT, Thomas Munro wrote: > I'd be interested to hear more about reverse scans. Bilal was > speculating about backwards I/O combining in read_stream.c a while > back, but we didn't have anything interesting to use it yet. You'll > probably see a flood of uncombined 8KB IOs in the pg_aios view while > travelling up the heap with cache misses today. I suspect Linux does > reverse sequential prefetching with buffered I/O (less sure about > other OSes) which should help but we'd still have more overheads than > we could if we combined them, not to mention direct I/O. Doesn't look like Linux will do this, if what my local testing shows is any= thing to go on. I'm a bit surprised by this (I also thought that OS readahead on = linux was quite sophisticated). There does seem to be something fishy going on with the patch here. I can = see strange inconsistencies in EXPLAIN ANALYZE output when the server is starte= d with --debug_io_direct=3Ddata with the master, compared to what I see with = the patch. Test case =3D=3D=3D=3D=3D=3D=3D=3D=3D My test case is a minor refinement of Tomas' backwards scan test case from earlier today, though with one important difference: I ran "alter index idx set (deduplicate_items =3D off); reindex index idx;" to ge= t a pristine index without any posting lists (since the unrelated issue with po= sting list TIDs otherwise risks obscuring something relevant). master ------ pg@regression:5432 [2390630]=3D# select pg_buffercache_evict_relation('t');= select pg_prewarm('idx'); ***SNIP*** pg@regression:5432 [2390630]=3D# EXPLAIN (ANALYZE, COSTS OFF) SELECT * FROM t WHERE a BETWEEN 16336 AND 49103 ORDER BY a; =E2=94=8C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=90 =E2=94=82 QUERY PLAN = =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=A4 =E2=94=82 Index Scan using idx on t (actual time=3D0.117..982.469 rows=3D10= 48576.00 loops=3D1) =E2=94=82 =E2=94=82 Index Cond: ((a >=3D 16336) AND (a <=3D 49103)) = =E2=94=82 =E2=94=82 Index Searches: 1 = =E2=94=82 =E2=94=82 Buffers: shared hit=3D10353 read=3D49933 = =E2=94=82 =E2=94=82 I/O Timings: shared read=3D861.953 = =E2=94=82 =E2=94=82 Planning: = =E2=94=82 =E2=94=82 Buffers: shared hit=3D63 read=3D20 = =E2=94=82 =E2=94=82 I/O Timings: shared read=3D1.898 = =E2=94=82 =E2=94=82 Planning Time: 2.131 ms = =E2=94=82 =E2=94=82 Execution Time: 1015.679 ms = =E2=94=82 =E2=94=94=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=98 (10 rows) pg@regression:5432 [2390630]=3D# select pg_buffercache_evict_relation('t');= select pg_prewarm('idx'); ***SNIP*** pg@regression:5432 [2390630]=3D# EXPLAIN (ANALYZE, COSTS OFF) SELECT * FROM t WHERE a BETWEEN 16336 AND 49103 ORDER BY a desc; =E2=94=8C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =90 =E2=94=82 QUERY PLAN = =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =A4 =E2=94=82 Index Scan Backward using idx on t (actual time=3D7.919..6340.579= rows=3D1048576.00 loops=3D1) =E2=94=82 =E2=94=82 Index Cond: ((a >=3D 16336) AND (a <=3D 49103)) = =E2=94=82 =E2=94=82 Index Searches: 1 = =E2=94=82 =E2=94=82 Buffers: shared hit=3D10350 read=3D49933 = =E2=94=82 =E2=94=82 I/O Timings: shared read=3D6219.776 = =E2=94=82 =E2=94=82 Planning: = =E2=94=82 =E2=94=82 Buffers: shared hit=3D5 = =E2=94=82 =E2=94=82 Planning Time: 0.076 ms = =E2=94=82 =E2=94=82 Execution Time: 6374.008 ms = =E2=94=82 =E2=94=94=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =98 (9 rows) Notice that readahead seems to be effective with the forwards scan only (ev= en though I'm using debug_io_direct=3Ddata for this). Also notice that each q= uery shows identical "Buffers:" output -- that detail is exactly as expected. Prefetch patch -------------- Same pair of queries/prewarming/eviction steps with my working copy of the prefetching patch: pg@regression:5432 [2400564]=3D# select pg_buffercache_evict_relation('t');= select pg_prewarm('idx'); ***SNIP*** pg@regression:5432 [2400564]=3D# EXPLAIN (ANALYZE, COSTS OFF) SELECT * FROM t WHERE a BETWEEN 16336 AND 49103 ORDER BY a; =E2=94=8C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=90 =E2=94=82 QUERY PLAN = =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=A4 =E2=94=82 Index Scan using idx on t (actual time=3D0.136..298.301 rows=3D10= 48576.00 loops=3D1) =E2=94=82 =E2=94=82 Index Cond: ((a >=3D 16336) AND (a <=3D 49103)) = =E2=94=82 =E2=94=82 Index Searches: 1 = =E2=94=82 =E2=94=82 Buffers: shared hit=3D6619 read=3D49933 = =E2=94=82 =E2=94=82 I/O Timings: shared read=3D45.313 = =E2=94=82 =E2=94=82 Planning: = =E2=94=82 =E2=94=82 Buffers: shared hit=3D63 read=3D20 = =E2=94=82 =E2=94=82 I/O Timings: shared read=3D2.232 = =E2=94=82 =E2=94=82 Planning Time: 2.634 ms = =E2=94=82 =E2=94=82 Execution Time: 330.379 ms = =E2=94=82 =E2=94=94=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=98 (10 rows) pg@regression:5432 [2400564]=3D# select pg_buffercache_evict_relation('t');= select pg_prewarm('idx'); ***SNIP*** pg@regression:5432 [2400564]=3D# EXPLAIN (ANALYZE, COSTS OFF) SELECT * FROM t WHERE a BETWEEN 16336 AND 49103 ORDER BY a desc; =E2=94=8C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =90 =E2=94=82 QUERY PLAN = =E2=94=82 =E2=94=9C=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =A4 =E2=94=82 Index Scan Backward using idx on t (actual time=3D7.926..1201.988= rows=3D1048576.00 loops=3D1) =E2=94=82 =E2=94=82 Index Cond: ((a >=3D 16336) AND (a <=3D 49103)) = =E2=94=82 =E2=94=82 Index Searches: 1 = =E2=94=82 =E2=94=82 Buffers: shared hit=3D10350 read=3D49933 = =E2=94=82 =E2=94=82 I/O Timings: shared read=3D194.774 = =E2=94=82 =E2=94=82 Planning: = =E2=94=82 =E2=94=82 Buffers: shared hit=3D5 = =E2=94=82 =E2=94=82 Planning Time: 0.097 ms = =E2=94=82 =E2=94=82 Execution Time: 1236.655 ms = =E2=94=82 =E2=94=94=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80= =E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2= =94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94=80=E2=94= =98 (9 rows) It looks like the patch does significantly better with the forwards scan, compared to the backwards scan (though both are improved by a lot). But th= at's not the main thing about these results that I find interesting. The really odd thing is that we get "shared hit=3D6619 read=3D49933" for th= e forwards scan, and "shared hit=3D10350 read=3D49933" for the backwards scan= . The latter matches master (regardless of the scan direction used on master), wh= ile the former just looks wrong. What explains the "missing buffer hits" seen = with the forwards scan? Discrepancies ------------- All 4 query executions agree that "rows=3D1048576.00", so the patch doesn't= appear to simply be broken/giving wrong answers. Might it be that the "Buffers" instrumentation is broken? The premise of my original complaint was that big inconsistencies in perfor= mance shouldn't happen between similar forwards and backwards scans (at least not= with direct I/O). I now have serious doubts about that premise, since it looks = like OS readahead remains a big factor with direct I/O. Did I just miss somethi= ng obvious? >> I wonder if today's commit b4212231 from Thomas ("Fix rare bug in >> read_stream.c's split IO handling") fixed the issue, without anyone >> realizing that the bug in question could manifest like this. > > I can't explain that. If you can consistently reproduce the change at > the two base commits, maybe bisect? Commit b4212231 was a wild guess on my part. Probably should have refraine= d from that. -- Peter Geoghegan