Date: Tue, 10 Mar 2026 19:04:37 -0400
From: Andres Freund
To: Michael Paquier
Cc:
 Xuneng Zhou, pgsql-hackers, Nazir Bilal Yavuz
Subject: Re: Streamify more code paths

Hi,

On 2026-03-10 19:28:29 +0900, Michael Paquier wrote:
> On Tue, Mar 10, 2026 at 02:06:12PM +0800, Xuneng Zhou wrote:
> > Here’s v5 of the patchset. The wal_logging_large patch has been
> > removed, as no performance gains were observed in the benchmark runs.
>
> Looking at the numbers you are posting, it is harder to get excited
> about the hash, gin, bloom_vacuum and wal_logging.

It's perhaps worth emphasizing that, to allow real world usage of direct IO,
we'll need streaming implementations for most of these. Also, on Windows the
OS provided readahead is ... not aggressive, so you'll hit IO stalls much
more frequently than you would on Linux (and some of the BSDs).

It might be a good idea to run the benchmarks with debug_io_direct=data.
That'll make them very slow, since the write side doesn't yet use AIO and
thus will do a lot of synchronous writes, but it should still allow
evaluating the gains from using read streams.

The other thing that's kinda important when evaluating read streams is to
test on higher latency storage, even without direct IO. Many workloads do
not benefit at all from AIO when run on a local NVMe SSD with < 10us
latency, but are severely IO bound when run on a cloud storage disk with
0.5ms - 4ms latency.

To be able to test such higher latencies locally, I've found it quite useful
to use dm_delay above a fast disk. See [1].

> The worker method seems more efficient, may show that we are out of noise
> level.

I think that's more likely to show that memory bandwidth, probably due to
checksum computations, is a factor.
The memory copy (from the kernel page cache, with buffered IO) and the
checksum computations (when checksums are enabled) are parallelized by
worker, but not by io_uring.

Greetings,

Andres Freund

[1] https://docs.kernel.org/admin-guide/device-mapper/delay.html

Assuming /dev/md0 is mounted to /srv, and a delay of 1ms should be
introduced for it:

    umount /srv && dmsetup create delayed --table "0 $(blockdev --getsz /dev/md0) delay /dev/md0 0 1" /dev/md0 && mount /dev/mapper/delayed /srv/

To update the amount of delay to 3ms the following can be used:

    dmsetup suspend delayed && dmsetup reload delayed --table "0 $(blockdev --getsz /dev/md0) delay /dev/md0 0 3" /dev/md0 && dmsetup resume delayed

(I will often just update the delay to 0 for comparison runs, as that
doesn't require remounting)
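
For a series of comparison runs, the only part of those commands that changes is the table string passed via --table. A minimal sketch of generating it for several delays follows. The helper name delay_table and the fixed size of 2097152 sectors are assumptions for illustration; on a real system the size would come from blockdev --getsz. The script only prints the table lines and does not touch any device.

```shell
#!/bin/sh
# delay_table is a hypothetical helper, not part of dmsetup.
# It prints a dm-delay table line in the form used above:
#   <start> <size in 512-byte sectors> delay <device> <offset> <delay in ms>
delay_table() {
    sectors=$1
    dev=$2
    delay_ms=$3
    printf '0 %s delay %s 0 %s\n' "$sectors" "$dev" "$delay_ms"
}

# Assumed size for illustration only; use $(blockdev --getsz /dev/md0)
# to get the real value.
SECTORS=2097152

# Print the table line for a baseline (0ms) and two higher-latency runs.
for ms in 0 1 3; do
    delay_table "$SECTORS" /dev/md0 "$ms"
done
```

Each printed line can be used as the --table argument of the suspend / reload / resume sequence shown above, which makes it easy to repeat the same benchmark at 0ms, 1ms and 3ms without remounting.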