Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uttkd-00FhsN-Mn for pgsql-hackers@arkaria.postgresql.org; Wed, 03 Sep 2025 20:06:40 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uttkc-00BP5E-QQ for pgsql-hackers@arkaria.postgresql.org; Wed, 03 Sep 2025 20:06:39 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uttkb-00BP4o-Rs for pgsql-hackers@lists.postgresql.org; Wed, 03 Sep 2025 20:06:38 +0000 Received: from fout-a4-smtp.messagingengine.com ([103.168.172.147]) by makus.postgresql.org with smtp (Exim 4.96) (envelope-from ) id 1uttkZ-000NuB-2z for pgsql-hackers@lists.postgresql.org; Wed, 03 Sep 2025 20:06:37 +0000 Received: from phl-compute-07.internal (phl-compute-07.internal [10.202.2.47]) by mailfout.phl.internal (Postfix) with ESMTP id 8ACE4EC01C3; Wed, 3 Sep 2025 16:06:34 -0400 (EDT) Received: from phl-mailfrontend-02 ([10.202.2.163]) by phl-compute-07.internal (MEProxy); Wed, 03 Sep 2025 16:06:34 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=anarazel.de; h= cc:cc:content-transfer-encoding:content-type:content-type:date :date:from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to; s=fm1; t=1756929994; x=1757016394; bh=yZVHjjziDFmyp20FerQjgEaHsFgLig0DTE3CnYJ34Qk=; b= KYMAUwbZxLj0YA9L3NqtcFivghFv4vter3DTEmY0RnBvoFXO2PObXIK2lKd1Irpw YkhmHovgUODLwFFYkMO0b+uFsUYeAfa7b4pq5HtU6oUwnWCVAb/fwazSJX5cENRH tXMetcwl8VQEabGmMUZTPNKOAyB87EzU8RkFb5obDyOzIEZnmwH2cOq9khbsxLoK dhgTNwkGwRfjXYqHUGrBKDGTjWE5mwHnKXfuio/tgCT+xWM8TrMRdry3zYanihj1 SULi4SS2dRUZWmLeGBaA4FQ9zYjR4/dqvqLn1YeFlLlZfjdsPgnjt+hwhXMtLqTR DpzpVlXtxVXJsWMHUHx+gQ== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-transfer-encoding :content-type:content-type:date:date:feedback-id:feedback-id :from:from:in-reply-to:in-reply-to:message-id:mime-version :references:reply-to:subject:subject:to:to:x-me-proxy :x-me-sender:x-me-sender:x-sasl-enc; s=fm1; t=1756929994; x= 1757016394; bh=yZVHjjziDFmyp20FerQjgEaHsFgLig0DTE3CnYJ34Qk=; b=T m11c/5GNd4D8OSt9GiOPe6Tgpa78+elRWdA6StmthihnJV8hog1O9Tr7OBMRLCKF f4+ahlkmgg47hVOMD6J10ap/q4l+mvDR6SULL+rasFDV4+3rIXkItJKEXTahseXB XAY6A0E5nyKuscxlZzoeFG/lvYAcgiwe+3+7C1F1pYcqRDzi7kxhj94KID/TtBU2 E/tI3tLO0VYQGHKtMUEkzl/rqBXsK/MQ1Y0yst/Qw7a3Geliw9dpG78Ru1Pxv7Wo n+DfutJDn6brf31/Fgyj2U0TKl6A47dNMwbwgi3bWNrTOBQJi3KL5e/EgKAEvdU4 NRzUXvkVg2dze9cgnpSHQ== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdeggdegtdehucetufdoteggodetrfdotf fvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfurfetoffkrfgpnffqhgenuceurghi lhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujfgurh epfffhvfevuffkfhggtggugfgjsehtkefstddttdejnecuhfhrohhmpeetnhgurhgvshcu hfhrvghunhguuceorghnughrvghssegrnhgrrhgriigvlhdruggvqeenucggtffrrghtth gvrhhnpedtleelvdfgjedvffeiueekfeeuleffhfegfffhgfffkeevueehieehhfeigffh vdenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepmhgrihhlfhhrohhmpegrnh gurhgvshesrghnrghrrgiivghlrdguvgdpnhgspghrtghpthhtohepuddtpdhmohguvgep shhmthhpohhuthdprhgtphhtthhopehpghessghofihtrdhivgdprhgtphhtthhopehknh hiiihhnhhikhesghgrrhhrvghtrdhruhdprhgtphhtthhopegshigrvhhuiiekudesghhm rghilhdrtghomhdprhgtphhtthhopeguihhlihhpsggrlhgruhhtsehgmhgrihhlrdgtoh hmpdhrtghpthhtohepmhgvlhgrnhhivghplhgrghgvmhgrnhesghhmrghilhdrtghomhdp rhgtphhtthhopehrohgsvghrthhmhhgrrghssehgmhgrihhlrdgtohhmpdhrtghpthhtoh epthhhohhmrghsrdhmuhhnrhhosehgmhgrihhlrdgtohhmpdhrtghpthhtohepphhgshhq lhdqhhgrtghkvghrsheslhhishhtshdrphhoshhtghhrvghsqhhlrdhorhhgpdhrtghpth htohepghhkohhkohhlrghtohhssehprhhothhonhhmrghilhdrtghomh X-ME-Proxy: Feedback-ID: id4a34324:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Wed, 3 Sep 2025 16:06:33 -0400 (EDT) Date: Wed, 3 Sep 2025 16:06:32 -0400 From: Andres Freund To: Peter Geoghegan Cc: Tomas Vondra , Thomas Munro , Nazir Bilal Yavuz , Robert Haas , Melanie Plageman , PostgreSQL Hackers , Georgios , Konstantin Knizhnik , Dilip Kumar Subject: Re: index prefetching Message-ID: <4zeu5yb73byiquvf3eefsunnrydyqfxy3eup66jrliutrtd4xl@5iifjey4n5m5> References: <6d59c277-c440-4d1f-a46e-157958c06a5f@vondra.me> <5pltwb73d7cynsxo2yb54ygjk7haviatkrx43mnzihc6kkield@ahnstpgof46i> <931afce3-8c86-4c96-9861-0ffa17c6560f@vondra.me> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On 2025-09-03 15:33:30 -0400, Peter Geoghegan wrote: > On Wed, Sep 3, 2025 at 2:47 PM Andres Freund wrote: > > I still don't think I fully understand why the impact of this is so large. The > > branch misses appear to be the only thing differentiating the two cases, but > > with resowners neutralized, the remaining difference in branch misses seems > > too large - it's not like the sequence of block numbers is more predictable > > without prefetching... > > > > The main increase in branch misses is in index_scan_stream_read_next... > > I've been working on fixing the same regressed query, but using a > completely different (though likely complementary) approach: by adding > a test to index_scan_stream_read_next that detects when prefetching > isn't favorable. If it isn't favorable, then we stop prefetching > entirely (we fall back on regular sync I/O). The issue to me is that this kind of query actually *can* substantially benefit from prefetching, no? Afaict the performance without prefetching is rather atrocious as soon as a) storage has a tad higher latency or b) DIO is used. Indeed: With DIO, readahead provides a ~2.6x improvement for the query at hand. I continue to be worried that we're optimizing for queries that have no real-world relevance. The regression afaict is contingent on 1) An access pattern that is unpredictable to the CPU (due to the use of random() as part of ORDER BY during the data generation) 2) Index and heap are somewhat correlated, but fuzzily, i.e. there are backward jumps in the heap block numbers being fetched 3) There are 1 - small_number tuples on one heap tables 4) The query scans a huge number of tuples, without actually doing any meaningful analysis on the tuples. As soon as one does meaningful work for returned tuples, the small difference in per-tuple CPU costs vanishes 5) The query visits all heap pages within a range, just not quite in order. Without that the kernel readahead would not work and the query's performance without readahead would be terrible even on low-latency storage This just doesn't strike me as a particularly realistic combination of factors? I suspect we could more than eat back the loss in performance by doing batched heap_hot_search_buffer()... Greetings, Andres Freund