Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uqgvJ-00BJzJ-6A for pgsql-hackers@arkaria.postgresql.org; Mon, 25 Aug 2025 23:48:26 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uqgvH-00Fot0-Sz for pgsql-hackers@arkaria.postgresql.org; Mon, 25 Aug 2025 23:48:24 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uqgvH-00Fosr-4m for pgsql-hackers@lists.postgresql.org; Mon, 25 Aug 2025 23:48:24 +0000 Received: from fhigh-a7-smtp.messagingengine.com ([103.168.172.158]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1uqgvF-001qxY-0G for pgsql-hackers@lists.postgresql.org; Mon, 25 Aug 2025 23:48:23 +0000 Received: from phl-compute-09.internal (phl-compute-09.internal [10.202.2.49]) by mailfhigh.phl.internal (Postfix) with ESMTP id 7610614001BC; Mon, 25 Aug 2025 19:48:19 -0400 (EDT) Received: from phl-mailfrontend-02 ([10.202.2.163]) by phl-compute-09.internal (MEProxy); Mon, 25 Aug 2025 19:48:19 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=anarazel.de; h= cc:cc:content-type:content-type:date:date:from:from:in-reply-to :in-reply-to:message-id:mime-version:references:reply-to:subject :subject:to:to; s=fm3; t=1756165699; x=1756252099; bh=sOkoc/cnfx QYWrHdKITC5A8KcZK/ehuwnoVrv2y+uNI=; b=JrdwUg3JMtyzcn2VoreWqF0W8u OpbxdSm8q3evndCqS/TlpNWGtLaDDeFobkiUAes6uoYlYzJ7PYZvf/eXan8CIF1u z5VqX9fDmffDj2O7OG1mDi3t5bTsHSYTjCToUcIq3A1RWHICI2AlXTLRgnnslwuv ovQJ0xLFbF7CRP7MQJmuCaF25i5uMevmWOe5vKTO1SHymMtz0mQF0j/hOgchv4pg Azu/Bm8JII2IF1RWA7HS+ed1+/SnHr6UnFbWzhrZd85GknuIA/jTi0p+YGlweayI dat372fqTwPqY1Ph3yhBgpjLBAjI4cnExellf+WFNoQhlvwCIE7728J1L4gw== DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d= messagingengine.com; h=cc:cc:content-type:content-type:date:date :feedback-id:feedback-id:from:from:in-reply-to:in-reply-to :message-id:mime-version:references:reply-to:subject:subject:to :to:x-me-proxy:x-me-sender:x-me-sender:x-sasl-enc; s=fm3; t= 1756165699; x=1756252099; bh=sOkoc/cnfxQYWrHdKITC5A8KcZK/ehuwnoV rv2y+uNI=; b=cLgaNgjYprmxlwP1Fjf+rJSMi9FhGFI+aYosll8ZkWjQQRSgL3B bVfcaMyhNLsW7Uux1LRYUlSxv4n+nThGdO/sp68WDdQewZu9WLqLNc9nru9g9Efl 0Nhvj977nDTloLlH/7vzaggoaKk2MlOn0HCf3ShuW3/0IkI3y0M/QkdfPTlsA1un /Eoe2k8ZRIdzHZHk/sE+0GF6srxyYLPi2qmYVA+ijGE8HunXL3SHtfczwSgVGX+t Y9eBzf6QVVdl3CFo8AX0XkRlPtlXtEdmKz21LjkK33RC//7veUO5i7h58b/fmVeD B6cTem54u8xZVbCxHbNiv7WvSf/H7/RM8TA== X-ME-Sender: X-ME-Received: X-ME-Proxy-Cause: gggruggvucftvghtrhhoucdtuddrgeeffedrtdefgddujeefjeegucetufdoteggodetrf dotffvucfrrhhofhhilhgvmecuhfgrshhtofgrihhlpdfurfetoffkrfgpnffqhgenuceu rghilhhouhhtmecufedttdenucesvcftvggtihhpihgvnhhtshculddquddttddmnecujf gurhepfffhvfevuffkfhggtggujgesthdtsfdttddtvdenucfhrhhomheptehnughrvghs ucfhrhgvuhhnugcuoegrnhgurhgvshesrghnrghrrgiivghlrdguvgeqnecuggftrfgrth htvghrnhepfeffgfelvdffgedtveelgfdtgefghfdvkefggeetieevjeekteduleevjefh ueegnecuvehluhhsthgvrhfuihiivgeptdenucfrrghrrghmpehmrghilhhfrhhomheprg hnughrvghssegrnhgrrhgriigvlhdruggvpdhnsggprhgtphhtthhopedutddpmhhouggv pehsmhhtphhouhhtpdhrtghpthhtohepphhgsegsohifthdrihgvpdhrtghpthhtohepkh hnihiihhhnihhksehgrghrrhgvthdrrhhupdhrtghpthhtohepsgihrghvuhiikedusehg mhgrihhlrdgtohhmpdhrtghpthhtohepughilhhiphgsrghlrghuthesghhmrghilhdrtg homhdprhgtphhtthhopehmvghlrghnihgvphhlrghgvghmrghnsehgmhgrihhlrdgtohhm pdhrtghpthhtoheprhhosggvrhhtmhhhrggrshesghhmrghilhdrtghomhdprhgtphhtth hopehthhhomhgrshdrmhhunhhrohesghhmrghilhdrtghomhdprhgtphhtthhopehpghhs qhhlqdhhrggtkhgvrhhssehlihhsthhsrdhpohhsthhgrhgvshhqlhdrohhrghdprhgtph htthhopehgkhhokhholhgrthhoshesphhrohhtohhnmhgrihhlrdgtohhm X-ME-Proxy: Feedback-ID: id4a34324:Fastmail Received: by mail.messagingengine.com (Postfix) with ESMTPA; Mon, 25 Aug 2025 19:48:18 -0400 (EDT) Date: Mon, 25 Aug 2025 19:48:17 -0400 From: Andres Freund To: Tomas Vondra Cc: Peter Geoghegan , Thomas Munro , Nazir Bilal Yavuz , Robert Haas , Melanie Plageman , PostgreSQL Hackers , Georgios , Konstantin Knizhnik , Dilip Kumar Subject: Re: index prefetching Message-ID: References: <6wyxbnry2unm3kbcu2sabhzhs7baoedlg77xqm42chpofjq45g@igst42zpl7ok> <5v2wuxg65l5e3s6uf373zskcqqoukmraxiucnvgn4t7b5cmeqx@5mhqsurdj6xn> <6butbqln6ewi5kuxz3kfv2mwomnlgtate4mb4lpa7gb2l63j4t@stlwbi2dvvev> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On 2025-08-25 15:00:39 +0200, Tomas Vondra wrote: > Thanks. Based on the testing so far, the patch seems to be a substantial > improvement. What's needed to make this prototype committable? Mainly some testing infrastructure that can trigger this kind of stream. The logic is too finnicky for me to commit it without that. > I assume this is PG19+ improvement, right? It probably affects PG18 too, > but it's harder to hit / the impact is not as bad as on PG19. Yea. It does apply to 18 too, but I can't come up with realistic scenarios where it's a real issue. I can repro a slowdown when using many parallel seqscans with debug_io_direct=data - but that's even slower in 17... > On a related note, my test that generates random datasets / queries, and > compares index prefetching with different io_method values found a > pretty massive difference between worker and io_uring. I wonder if this > might be some issue in io_method=worker. > while with index prefetching (with the aio prototype patch), it looks > like this: > > QUERY PLAN > ---------------------------------------------------------------------- > Index Scan using idx on t (actual rows=9048576.00 loops=1) > Index Cond: ((a >= 16150) AND (a <= 4540437)) > Index Searches: 1 > Prefetch Distance: 2.032 > Prefetch Count: 868165 > Prefetch Stalls: 2140228 > Prefetch Skips: 6039906 > Prefetch Resets: 0 > Stream Ungets: 0 > Stream Forwarded: 4 > Prefetch Histogram: [2,4) => 855753, [4,8) => 12412 > Buffers: shared hit=2577599 read=455610 > Planning: > Buffers: shared hit=78 read=26 dirtied=1 > Planning Time: 1.032 ms > Execution Time: 3150.578 ms > (16 rows) > > So it's about 2x slower. The prefetch distance collapses, because > there's a lot of cache hits (about 50% of requests seem to be hits of > already visited blocks). I think that's a problem with how we adjust the > distance, but I'll post about that separately. > > Let's try to simply set io_method=io_uring: > > QUERY PLAN > ---------------------------------------------------------------------- > Index Scan using idx on t (actual rows=9048576.00 loops=1) > Index Cond: ((a >= 16150) AND (a <= 4540437)) > Index Searches: 1 > Prefetch Distance: 2.032 > Prefetch Count: 868165 > Prefetch Stalls: 2140228 > Prefetch Skips: 6039906 > Prefetch Resets: 0 > Stream Ungets: 0 > Stream Forwarded: 4 > Prefetch Histogram: [2,4) => 855753, [4,8) => 12412 > Buffers: shared hit=2577599 read=455610 > Planning: > Buffers: shared hit=78 read=26 > Planning Time: 2.212 ms > Execution Time: 1837.615 ms > (16 rows) > > That's much closer to master (and the difference could be mostly noise). > > I'm not sure what's causing this, but almost all regressions my script > is finding look like this - always io_method=worker, with distance close > to 2.0. Is this some inherent io_method=worker overhead? I think what you might be observing might be the inherent IPC / latency overhead of the worker based approach. This is particularly pronounced if the workers are idle (and the CPU they get scheduled on is clocked down). The latency impact of that is small, but if you never actually get to do much readahead it can be visible. Greetings, Andres Freund