Date: Thu, 26 Feb 2026 23:18:03 -0500
From: Andres Freund
To: Peter Geoghegan
Cc: Tomas Vondra, Alexandre Felipe, Thomas Munro, Nazir Bilal Yavuz, Robert Haas, Melanie Plageman, PostgreSQL Hackers, Georgios, Konstantin Knizhnik, Dilip Kumar
Subject: Re: index prefetching
References: <64a2re223ajj4popowsyu4xekbuvvyfwkrihn5yzyrkwsmsuvp@2lls3tpww5dl> <52512325-b1f2-4fff-819e-f68122b2e427@vondra.me> <64mfcfv7iihc4pmqlxarii4esnmqry52ckz5m7lmwylnfnuxuz@oxh4ioxkjtep>

Hi,

On 2026-02-24 13:13:25 -0500, Peter Geoghegan wrote:
> > Plausible. It could be that we could get away with controlling the rampup to
> > be slower in potentially problematic cases, without needing the yielding, but
> > not sure.
>
> Attached is v11, which makes the read stream yielding mechanism better
> cooperate with index prefetching, so as to avoid interefering with
> io_combine_limit. This should deal with the odd performance that you
> complained about. See
> v11-0006-Introduce-read_stream_-pause-resume-yield.patch (and the
> later prefetching patch
> v11-0007-Add-heapam-index-scan-I-O-prefetching.patch) for details.
>
> The whole idea of measuring "batch distance" is gone in this version,
> though we do still only consider whether now is a good time to yield
> at "batch boundaries". We always refuse yield on the first few batches
> of the scan, so the idea of caring about batch boundaries is still
> there, albeit in a much more limited form.

I'm planning to do some reviewing in the next few days. In preparation I just
retried a benchmark and saw some odd results.

After a while I was able to reproduce it even with a simpler setup:

-c shared_buffers=2GB -c debug_io_direct=data -c io_method=io_uring

pgbench -i -q -s 100 --fillfactor=90

QUERY PLAN
----------
Index Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.43..441511.11 rows=10000045 width=97) (actual time=0.308..6101.837 rows=10000000.00 loops=1)
  Index Searches: 1
  Buffers: shared hit=27325 read=181819
  I/O Timings: shared read=4538.003
Planning Time: 0.041 ms
Execution Time: 6433.192 ms

pgbench -i -q -s 100 --fillfactor=50

QUERY PLAN
----------
Index Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.43..593022.41 rows=9999798 width=97) (actual time=0.131..3973.698 rows=10000000.00 loops=1)
  Index Searches: 1
  Buffers: shared hit=19239 read=341420
  I/O Timings: shared read=1752.057
Planning:
  Buffers: shared hit=42 read=15
Planning Time: 1.668 ms
Execution Time: 4308.182 ms

pgbench -i -q -s 100 --fillfactor=25

QUERY PLAN
----------
Index Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.43..926358.51 rows=10000005 width=97) (actual time=0.112..3259.362 rows=10000000.00 loops=1)
  Index Searches: 1
  Buffers: shared hit=9610 read=684382
  I/O Timings: shared read=242.259
Planning:
  Buffers: shared hit=18
Planning Time: 0.097 ms
Execution Time: 3594.782 ms

Note how the increase in scanned heap pages actually *decreases* the overall
time rather substantially.

It's quite visible, both in iostat and with a query like

  SELECT pid, target_desc, off, length FROM pg_aios \watch 0.5

that the first query has basically no IO concurrency, the second has very
intermittent IO concurrency and the third one has nice IO concurrency.

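As a rough cross-check of the three plans above, the following Python sketch
turns the reported numbers into read throughput and the share of runtime
attributed to reads. It is only back-of-the-envelope arithmetic, not part of
the benchmark: it assumes the default 8 kB block size, and the `runs`
dictionary layout and the `summarize` helper are made up for illustration.
Note that "shared read" counts index blocks as well as heap blocks.

```python
# Numbers copied from the three EXPLAIN outputs above.
# Assumption: default PostgreSQL block size of 8 kB (BLCKSZ = 8192).
BLOCK_SIZE = 8192

# Keyed by fillfactor; timings are in milliseconds as printed by EXPLAIN.
runs = {
    90: {"read_blocks": 181819, "io_ms": 4538.003, "exec_ms": 6433.192},
    50: {"read_blocks": 341420, "io_ms": 1752.057, "exec_ms": 4308.182},
    25: {"read_blocks": 684382, "io_ms": 242.259, "exec_ms": 3594.782},
}


def summarize(run):
    """Derive volume read, throughput and read-time share for one run."""
    mib_read = run["read_blocks"] * BLOCK_SIZE / (1024 * 1024)
    return {
        "mib_read": mib_read,
        # Throughput over the whole execution, not just the I/O portion.
        "mib_per_s": mib_read / (run["exec_ms"] / 1000.0),
        # "I/O Timings: shared read" relative to total execution time.
        "io_share": run["io_ms"] / run["exec_ms"],
    }


summary = {ff: summarize(run) for ff, run in runs.items()}

for ff, s in summary.items():
    print(
        f"fillfactor={ff}: {s['mib_read']:.0f} MiB read, "
        f"{s['mib_per_s']:.0f} MiB/s, "
        f"read time is {s['io_share']:.0%} of execution time"
    )
```

Under those assumptions the fillfactor=90 run reads about 1420 MiB at roughly
220 MiB/s, while the fillfactor=25 run reads almost four times as much data
at close to 1490 MiB/s, which matches the difference in IO concurrency
described above.
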
If I disable the yield logic, the fillfactor=90 case is good:

QUERY PLAN
----------
Index Scan using pgbench_accounts_pkey on pgbench_accounts  (cost=0.43..441511.11 rows=10000045 width=97) (actual time=0.470..1662.331 rows=10000000.00 loops=1)
  Index Searches: 1
  Buffers: shared hit=27325 read=181819
  I/O Timings: shared read=21.113
Planning Time: 0.043 ms
Execution Time: 1995.723 ms

Of course this is a silly query, but you'd also see that with a mergejoin or
such.

Greetings,

Andres Freund