From: Xuneng Zhou
Date: Wed, 11 Mar 2026 09:37:57 +0800
Subject: Re: Streamify more code paths
To: Andres Freund
Cc: Michael Paquier, pgsql-hackers, Nazir Bilal Yavuz

Hi Andres,

On Wed, Mar 11, 2026 at 7:04 AM Andres Freund wrote:
>
> Hi,
>
> On 2026-03-10 19:28:29 +0900, Michael Paquier wrote:
> > On Tue, Mar 10, 2026 at 02:06:12PM +0800, Xuneng Zhou wrote:
> > > Here’s v5 of the patchset. The wal_logging_large patch has been
> > > removed, as no performance gains were observed in the benchmark runs.
> >
> > Looking at the numbers you are posting, it is harder to get excited
> > about the hash, gin, bloom_vacuum and wal_logging.
>
> It's perhaps worth emphasizing that, to allow real world usage of direct IO,
> we'll need streaming implementation for most of these. Also, on windows the OS
> provided readahead is ... not aggressive, so you'll hit IO stalls much more
> frequently than you'd on linux (and some of the BSDs).
>
> It might be a good idea to run the benchmarks with debug_io_direct=data.
> That'll make them very slow, since the write side doesn't yet use AIO and thus
> will do a lot of synchronous writes, but it should still allow to evaluate the
> gains from using read stream.
>
>
> The other thing that's kinda important to evaluate read streams is to test on
> higher latency storage, even without direct IO. Many workloads are not at all
> benefiting from AIO when run on a local NVMe SSD with < 10us latency, but are
> severely IO bound when run on a cloud storage disk with 0.5ms - 4ms latency.
>
>
> To be able to test such higher latencies locally, I've found it quite useful
> to use dm_delay above a fast disk. See [1].

Thanks for the tips! I currently don’t have access to a machine or
cloud instance with slower SSDs or HDDs that have higher latency. I’ll
try running the benchmark with debug_io_direct=data and dm_delay, as
you suggested, to see if the results vary.

>
> > The worker method seems more efficient, may show that we are out of noise
> > level.
>
> I think that's more likely to show that memory bandwidth, probably due to
> checksum computations, is a factor. The memory copy (from the kernel page
> cache, with buffered IO) and the checksum computations (when checksums are
> enabled) are parallelized by worker, but not by io_uring.
>
>
> Greetings,
>
> Andres Freund
>
>
> [1]
>
> https://docs.kernel.org/admin-guide/device-mapper/delay.html
>
> Assuming /dev/md0 is mounted to /srv, and a delay of 1ms should be
> introduced for it:
>
>     umount /srv && dmsetup create delayed --table "0 $(blockdev --getsz /dev/md0) delay /dev/md0 0 1" /dev/md0 && mount /dev/mapper/delayed /srv/
>
> To update the amount of delay to 3ms the following can be used:
>     dmsetup suspend delayed && dmsetup reload delayed --table "0 $(blockdev --getsz /dev/md0) delay /dev/md0 0 3" /dev/md0 && dmsetup resume delayed
>
> (I will often just update the delay to 0 for comparison runs, as that
> doesn't require remounting)

-- 
Best,
Xuneng
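
A rough back-of-the-envelope sketch of the latency point quoted above, to show why read streams should matter far more on 0.5ms - 4ms storage than on a local NVMe SSD. Everything in it is an illustrative assumption rather than a measurement from this thread: the function name is hypothetical, each read is assumed to take a fixed latency, a fixed number of reads is assumed to be in flight, blocks are 8 kB (the PostgreSQL default), and device bandwidth limits are ignored.

```python
def read_throughput_mb_s(latency_s: float,
                         in_flight: int = 1,
                         block_size: int = 8192) -> float:
    """Upper bound on read throughput in MB/s (hypothetical helper).

    Assumes every block read takes exactly `latency_s` seconds and that
    at most `in_flight` reads are outstanding at any time. Ignores the
    device's bandwidth ceiling, so large results are not realistic.
    """
    reads_per_second = in_flight / latency_s
    return reads_per_second * block_size / 1e6


if __name__ == "__main__":
    # Latencies taken from the ranges mentioned in the quoted text.
    cases = [
        ("local NVMe, 10us", 10e-6),
        ("cloud disk, 0.5ms", 0.5e-3),
        ("cloud disk, 4ms", 4e-3),
    ]
    for label, latency in cases:
        one = read_throughput_mb_s(latency, in_flight=1)
        many = read_throughput_mb_s(latency, in_flight=16)
        print(f"{label}: {one:.1f} MB/s with 1 read in flight, "
              f"{many:.1f} MB/s with 16")
```

Under these assumptions, a 1ms delay set up with dm_delay caps a one-read-at-a-time scan at roughly 8 MB/s, so any gain from keeping several reads in flight should be clearly visible, whereas at 10us the same scan is more likely limited by CPU or memory bandwidth than by IO waits.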