From: Melanie Plageman
Date: Tue, 31 Mar 2026 15:31:39 -0400
Subject: Re: AIO / read stream heuristics adjustments for index prefetching
To: Andres Freund
Cc: pgsql-hackers@postgresql.org, Thomas Munro, Peter Geoghegan, Tomas Vondra, Nazir Bilal Yavuz

On Tue, Mar 31, 2026 at 12:02 PM Andres Freund wrote:
>
> 0001+0002: Return whether WaitReadBuffers() needed to wait
>
> The first patch allows pgaio_wref_check_done() to work more reliably with
> io_uring. Until now it only was able to return true if userspace already
> had consumed the kernel's completion event, but returned false otherwise.
> That's not really incorrect, just suboptimal.
>
> The second patch returns whether WaitReadBuffers() needed to wait for
> IO. This is useful for a) instrumentation like in [2] and b) to provide
> information to the read_stream heuristics to control how aggressive to
> perform read ahead.

These both look good to me except in 0001 you left an XXX in
pgaio_wref_check_done() that I think is the very thing that commit
does.

> 0003: read_stream: Issue IO synchronously while in fast path

LGTM

> 0004: read_stream: Prevent distance from decaying too quickly
>
> There are two minor questions here:
> - Should read_stream_pause()/read_stream_resume() restore the "holdoff"
>   counter? I doubt it matters for the prospective user, since it will
>   only be used when the lookahead distance is very large.

I don't really understand this. We have to do this with distance
because we set it to 0 and use distance == 0 to indicate stream end.
read_stream_pause() doesn't set the distance_decay_holdoff to 0. If you
mean, should we reset holdoff to its initial value, then I don't think
so. I imagine that users doing a lot of pause and resume may not have
high distance.

> - For how long to hold off distance reductions? Initially I was torn
>   between using "max_pinned_buffers" (Min(max_ios * io_combine_limit,
>   cap)) and "max_ios" ([maintenance_]effective_io_concurrency). But I
>   think the former makes more sense, as we otherwise won't allow for far
>   enough readahead when doing IO combining, and it does seem to make sense
>   to hold off decay for long enough that the maximum lookahead could not
>   theoretically allow us to start an IO.

I agree. 0004 LGTM otherwise.

- Melanie
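
For readers following the holdoff discussion, here is a rough toy model in C
of the idea as I read it from the thread. It is not code from the patch set:
the struct, the toy_* function names, and the grow/decay rules (double on a
buffer that needed IO, shrink by one on a hit once the holdoff has run out)
are all assumptions made for illustration. The only part taken from the mail
is the holdoff length, Min(max_ios * io_combine_limit, cap).

```c
#include <assert.h>

/* Hypothetical toy model; none of these names come from the patches. */
typedef struct ToyStream
{
    int distance;      /* current look-ahead distance, in buffers */
    int max_distance;  /* stands in for max_pinned_buffers */
    int holdoff;       /* hits still to absorb before decay may resume */
} ToyStream;

static int
toy_min(int a, int b)
{
    return a < b ? a : b;
}

/* Min(max_ios * io_combine_limit, cap), as quoted in the mail. */
static int
toy_max_pinned_buffers(int max_ios, int io_combine_limit, int cap)
{
    assert(max_ios > 0 && io_combine_limit > 0 && cap > 0);
    return toy_min(max_ios * io_combine_limit, cap);
}

static void
toy_stream_init(ToyStream *s, int max_ios, int io_combine_limit, int cap)
{
    s->max_distance = toy_max_pinned_buffers(max_ios, io_combine_limit, cap);
    s->distance = 1;
    s->holdoff = 0;
}

/* Call once per consumed buffer; needed_io is nonzero if it required IO. */
static void
toy_stream_note_buffer(ToyStream *s, int needed_io)
{
    if (needed_io)
    {
        /* Assumed growth rule: double the distance, capped at the maximum. */
        s->distance = toy_min(s->distance * 2, s->max_distance);
        /* Re-arm the holdoff so the distance cannot decay right away. */
        s->holdoff = s->max_distance;
    }
    else if (s->holdoff > 0)
        s->holdoff--;       /* a hit, but decay is still held off */
    else if (s->distance > 1)
        s->distance--;      /* holdoff exhausted: decay by one buffer */
}
```

With max_ios = 4, io_combine_limit = 16 and cap = 32, one buffer that needed
IO arms a holdoff of 32, so the following 32 hits leave the distance alone and
only the 33rd shrinks it. Using max_ios (4) instead would let the distance
start decaying after four hits, which is the concern raised above about not
reading far enough ahead when IOs are combined.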