Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vrX0e-009Aci-2R for pgsql-hackers@arkaria.postgresql.org; Sun, 15 Feb 2026 07:57:40 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vrX0c-000oTe-03 for pgsql-hackers@arkaria.postgresql.org; Sun, 15 Feb 2026 07:57:38 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vrX0b-000oTW-1z for pgsql-hackers@lists.postgresql.org; Sun, 15 Feb 2026 07:57:37 +0000 Received: from relay4-d.mail.gandi.net ([2001:4b98:dc4:8::224]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1vrX0Y-00000000lGz-1B6c for pgsql-hackers@lists.postgresql.org; Sun, 15 Feb 2026 07:57:36 +0000 Received: by mail.gandi.net (Postfix) with ESMTPSA id 01A2C3EAF2; Sun, 15 Feb 2026 07:57:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=vondra.me; s=gm1; t=1771142247; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=WgytqRD0L1LhCYdNKV4EDHI5TKSqUQmzU5MZ4sxGWvo=; b=oHK2yiI8pDdbUwugSR/BzO3CmHPjDCgQIL1THTHwYKstZGeVvgGnESvL3FI0G2tnnLbPpJ 09kMANKpJz/PPg4hzgtC5ncxTQdTxyttk7l8VyNpny90dmZsFniSBKfDpxjgoL5CWDL/vG RTx7JAk2Ub2BW3lIGbnPGJzs7KqLgQaQPNvlkJmrZvbpcwOJ1brSdWBUQkN+ExVZeI2xXa I067sif0tUTEo7UQ/FXZ4EZfYq0JV3BmrrmHuUaxcn4C3mIILw37JW6a6urjLwtgdRfxZt eepnoBGRqEE/27I2Sn/ezxopuzOqd/qYjGDvcAtsHMJHKoqsI69MGG6dYC3NLQ== Message-ID: <9411f220-007d-4f1e-9c8f-ca8eb09e6788@vondra.me> Date: Sun, 15 Feb 2026 08:57:19 +0100 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: index prefetching To: Alexandre Felipe , Peter Geoghegan Cc: Andres Freund , Thomas Munro , Nazir Bilal Yavuz , Robert Haas , Melanie Plageman , PostgreSQL Hackers , Georgios , Konstantin Knizhnik , Dilip Kumar References: <6d59c277-c440-4d1f-a46e-157958c06a5f@vondra.me> Content-Language: en-US From: Tomas Vondra In-Reply-To: Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit X-GND-Sasl: tomas@vondra.me X-GND-Score: -100 X-GND-Cause: gggruggvucftvghtrhhoucdtuddrgeefgedrtddtgddvudefgeehucetufdoteggodetrfdotffvucfrrhhofhhilhgvmecuifetpfffkfdpucggtfgfnhhsuhgsshgtrhhisggvnecuuegrihhlohhuthemuceftddunecusecvtfgvtghiphhivghnthhsucdlqddutddtmdenucfjughrpefkffggfgfuvfevfhfhjggtgfesthekredttddvjeenucfhrhhomhepvfhomhgrshcugghonhgurhgruceothhomhgrshesvhhonhgurhgrrdhmvgeqnecuggftrfgrthhtvghrnhepuedvvdeifefffeekudeggfdtieeglefggeduheffveeihefggfehgfdvudetffevnecukfhppedutdelrdduieegrdegledrvdegudenucevlhhushhtvghrufhiiigvpedtnecurfgrrhgrmhepihhnvghtpedutdelrdduieegrdegledrvdeguddphhgvlhhopegluddtrddufeejrddtrddukegnpdhmrghilhhfrhhomhepthhomhgrshesvhhonhgurhgrrdhmvgdpqhhiugeptddutedvveefgfethfdvpdhmohguvgepshhmthhpohhuthdpnhgspghrtghpthhtohepuddupdhrtghpthhtohepohgrlhgvgigrnhgurhgvfhgvlhhiphgvsehgmhgrihhlrdgtohhmpdhrtghpthhtohepphhgsegsohifthdrihgvpdhrtghpthhtoheprghnughrvghssegrnhgrrhgriigvlhdruggvpdhrtghpthhtohepthhhohhmrghsmhhunhhrohesghhmrghilhdrtghomhdprhgtphhtthhopegshigrvhhuiiekudesghhmrghilhdrtghomhdprhgtp hhtthhopehrohgsvghrthhmhhgrrghssehgmhgrihhlrdgtohhm X-GND-State: clean List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On 2/15/26 01:13, Alexandre Felipe wrote: > Hi, > I decided to test this PR. > > I didn't take much time to go through the thread or the code in detail > yet. But I have my first benchmark results and I would like to share. > I'm quite confused by the scripts you shared, it seems incomplete. The run_regression.py is meant to call purge_cache.sh (which is missing), and the run_benchmark tries to call all sorts of missing .sql scripts. So how do we use that? > EXPERIMENT > > I tested [CF 4351] v10 - Index Prefetching > > I created a table with 100k rows and > Sequential is, as guessed, 1,2,3,4 (indexed) > Periodic is a quasi random (i * jump) % num_rows, where gcd(jump, > num_rows) = 1, guarantee that there are no repeated entries (indexed) > Random is a `row_number() over (order by random())` (indexed) > The payload is a fixed 200 character long string, just to make it more > realistic. > > For the tests, I disable sorting, sequential scans, index only scans and > bitmap scans. > Since buffer cache always has a significant impact on the query > performance, I shuffled the tests, and tried to adjust for the number of > buffer hit/read, but later I found that the best way to control that was > to use a table small enough to be entirely held in cache, and evict the > buffers. > That seems a bit bizarre. The whole point of index prefetching is better I/O scheduling (ahead of time), but if you "control" the impact of cache by making sure everything is cached, that kinda defeats the whole thing. A table that is just 24MB and fits into buffers is a bit useless. It means that even with random pattern (which is generally about the best for prefetching), only about ~1/30 of pages will require I/O. Each page has ~32 items, but only the first item from each page will incur an I/O. > * off: buffers are kept in cache > * pg: buffers evicted from postgres pg_buffercache_evict from > pg_buffercache extension. > * os: supported only in  python, I separated the buffer eviction in > purge_cache as it requires sudo (tested only in MacOS). > > I varied  >  * max_parallel_workers_per_gather (although I guess it wasn't exploited),  >  * enable_index_prefetch  >  * the column used as sorting key and, as a result, the index used. >  * and buffer eviction mode. > > Running from python with psycopg > On what kind of hardware? How much variance is in the results? regards -- Tomas Vondra