Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ulhEQ-004JIY-Iy for pgsql-hackers@arkaria.postgresql.org; Tue, 12 Aug 2025 05:07:30 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1ulhEO-005LS7-UH for pgsql-hackers@arkaria.postgresql.org; Tue, 12 Aug 2025 05:07:29 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1ulhEO-005LRz-J3 for pgsql-hackers@lists.postgresql.org; Tue, 12 Aug 2025 05:07:28 +0000 Received: from mail-pj1-x102e.google.com ([2607:f8b0:4864:20::102e]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1ulhEM-00091A-1a for pgsql-hackers@lists.postgresql.org; Tue, 12 Aug 2025 05:07:28 +0000 Received: by mail-pj1-x102e.google.com with SMTP id 98e67ed59e1d1-321b1c14413so246635a91.1 for ; Mon, 11 Aug 2025 22:07:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1754975244; x=1755580044; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Vj9JaJi0tjPAKRDMAzAHFheV24lDh/AdPQQCYc2G1tg=; b=l/36NasbG4JRQXG59r0D9GLii/Ukwev67QoJbdWEyh7NcZG2H11jyJKCLysaizbLIf mGifmvzPjGf38X9qHVbhi/0ht/NGliy9g+1GJgVDr3BOGVjQjwRrWSbJx9ohth28y7R/ KA4FivAPrtwWXFBBtS4Xf4iGkWTCAyiAL5enQLsXWwY9dXQkyzK5LcDFcfc4C5vLnLUe gA+3KamRYWMtMI8iI2Y0U3T406sa4hcU09FA9ph/rSKFptZyokLoIHyJtDbzXlW/0Uwi KSIrRqHgsJirY0JOpj3b2RDdAxF6Om6nu+JtJ5w+D4rS2qH+7ToHuJDT4LBYHP/RH/8B QrWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1754975244; x=1755580044; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Vj9JaJi0tjPAKRDMAzAHFheV24lDh/AdPQQCYc2G1tg=; b=Btb1zP4KuZhCX4ALdXnWpGo24HRyIqdbmUz7zg1ldFC1oIrZQqGX7DbFH3tAjKZCTp v3oKqqTXMClPjnliiK5NEEDd2gdkuKZD901BsvX1xx1IEJgMqrSpn5y+i9BGMgAghQg3 k8HSjWCY52kYJC4krYsvv+w3jzK013yJgf7BJ+QzvU2exhko3VzOhdQhyY81XwX0T2HY mRb1n70yr6WHqE6no+k7vQKlqcL+/ZrBgEZBXNjuwvBncglZ3f0mabcf9Kxz9I5NvO/4 FdZUZST6qU6lT7cCWcdSI2icxGVOhKVi3KWmcWwDHd6nP89ASpmMGmrAROKg0ROzxc+q e0iA== X-Forwarded-Encrypted: i=1; AJvYcCVyRnBvu6uMsn1XObYZfNxTYqLROWgPI3V/zEaB4AyNT6vCwvEfSZVj7n04cu/UyI2XO6acWbnW8PfHqK4p@lists.postgresql.org X-Gm-Message-State: AOJu0YwnQVdXITnosp8OHrvQ7sCTob9azYB/W0KITLoYShfd8K5otdI6 Rm/9VvjpZOaJTZbdyNYVWyZE2/D+0y4sOoBoani8jDE4hNZVRv+/6G1v1qcDmcgi6aQB41GpGbq lsfx8CfOq+WG57iw8G7ot3BOSvl2oEqw= X-Gm-Gg: ASbGnctMJqo7bv6UxKlpuRgfF8yiF2SHGjTCpEftShe7c7rhM4zddRfzERehbQCmG0Y 8T3VnAJ0vDHBboFAn2SSVkWE1zdFqZuKAZX5S+HKTsBKxdknexMF3HJZB4L4jTD5pXnkw8l7dPV +adq/ekEb2GlR7gIOSfl1Bul9gQsAESU5d3+8/LCSrVfcmw18jkH8iDYqP7krQTxhAZiafe7AgT 7+Iozx3niJSOnIpnf4= X-Google-Smtp-Source: AGHT+IF05r4DJQDzbxjWLfUj+39kg/DDPIdEyVXTBC3R3eujhlKxZbBbPrmjyQJxgC9Q3QjzkPGw5fO4O4S5q80osMA= X-Received: by 2002:a05:6a20:43a7:b0:23f:fa51:67c with SMTP id adf61e73a8af0-2406a12d90fmr8328048637.8.1754975244317; Mon, 11 Aug 2025 22:07:24 -0700 (PDT) MIME-Version: 1.0 References: <8ed1d326-5c6e-476e-b3fd-30d3da210546@vondra.me> <38b865bd-2ae9-4a94-a788-6e3dc99ccd70@vondra.me> <306fc8c0-c882-4602-86f5-a106b9ace603@vondra.me> <152ea782-5bd4-4435-b021-0ab2da61e63d@vondra.me> <7c2f6350-6fca-4e39-b0a8-8ac735f5d58a@vondra.me> <2e63cadd-2a03-46b1-866e-7ea5d3ffd37f@vondra.me> In-Reply-To: From: Thomas Munro Date: Tue, 12 Aug 2025 17:06:47 +1200 X-Gm-Features: Ac12FXyIA5tOdGpoUKV48nRUbLhqOqlugdvBsBMiiozjhUK-MK8S8aZ9YHdbjto Message-ID: Subject: Re: index prefetching To: Peter Geoghegan Cc: Tomas Vondra , Andres Freund , Nazir Bilal Yavuz , Robert Haas , Melanie Plageman , PostgreSQL Hackers , Georgios , Konstantin Knizhnik , Dilip Kumar Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Tue, Aug 12, 2025 at 11:42=E2=80=AFAM Peter Geoghegan wrote= : > On Mon, Aug 11, 2025 at 5:07=E2=80=AFPM Tomas Vondra wr= ote: > > I can do some tests with forward vs. backwards scans. Of course, the > > trouble with finding these weird cases is that they may be fairly rare. > > So hitting them is a matter or luck or just happening to generate the > > right data / query. But I'll give it a try and we'll see. > > I was talking more about finding "performance bugs" through a > semi-directed process of trying random things while looking out for > discrepancies. Something like that shouldn't require the usual > "benchmarking rigor", since suspicious inconsistencies should be > fairly obvious once encountered. I expect similar queries to have > similar performance, regardless of superficial differences such as > scan direction, DESC vs ASC column order, etc. I'd be interested to hear more about reverse scans. Bilal was speculating about backwards I/O combining in read_stream.c a while back, but we didn't have anything interesting to use it yet. You'll probably see a flood of uncombined 8KB IOs in the pg_aios view while travelling up the heap with cache misses today. I suspect Linux does reverse sequential prefetching with buffered I/O (less sure about other OSes) which should help but we'd still have more overheads than we could if we combined them, not to mention direct I/O. Not tested, but something like this might do it: /* Can we merge it with the pending read? */ - if (stream->pending_read_nblocks > 0 && - stream->pending_read_blocknum + stream->pending_read_nblocks =3D=3D blocknum) + if (stream->pending_read_nblocks > 0) { - stream->pending_read_nblocks++; - continue; + if (stream->pending_read_blocknum + stream->pending_read_nblocks =3D=3D + blocknum) + { + stream->pending_read_nblocks++; + continue; + } + else if (stream->pending_read_blocknum =3D=3D blocknum + 1 && + stream->forwarded_buffers =3D=3D 0= ) + { + stream->pending_read_blocknum--; + stream->pending_read_nblocks++; + continue; + } } > I tested this issue again (using my original pgbench_account query), > having rebased on top of HEAD as of today. I found that the > inconsistency seems to be much smaller now -- so much so that I don't > think that the remaining inconsistency is particularly suspicious. > > I also think that performance might have improved across the board. I > see that the same TPC-C query that took 768.454 ms a few weeks back > now takes only 617.408 ms. Also, while I originally saw "I/O Timings: > shared read=3D138.856" with this query, I now see "I/O Timings: shared > read=3D46.745". That feels like a performance bug fix to me. > > I wonder if today's commit b4212231 from Thomas ("Fix rare bug in > read_stream.c's split IO handling") fixed the issue, without anyone > realizing that the bug in question could manifest like this. I can't explain that. If you can consistently reproduce the change at the two base commits, maybe bisect? If it's a real phenomenon I'm definitely curious to know what you're seeing.