In-Reply-To: <23f490f4-8325-408c-91a0-a6757ab2441c@vondra.me>
From: Thomas Munro
Date: Sun, 20 Jul 2025 01:07:16 +1200
Subject: Re: index prefetching
To: Tomas Vondra
Cc: Peter Geoghegan, Andres
Freund, Robert Haas, Melanie Plageman, PostgreSQL Hackers, Georgios, Konstantin Knizhnik, Dilip Kumar

On Sat, Jul 19, 2025 at 11:23 PM Tomas Vondra wrote:
> Thanks for the link. It seems I came up with an almost the same patch,
> with three minor differences:
>
> 1) There's another place that sets "distance = 0" in
> read_stream_next_buffer, so maybe this should preserve the distance too?
>
> 2) I suspect we need to preserve the distance at the beginning of
> read_stream_reset, like
>
>     stream->reset_distance = Max(stream->reset_distance,
>                                  stream->distance);
>
> because what if you call _reset before reaching the end of the stream?
>
> 3) Shouldn't it reset the reset_distance to 0 after restoring it?

Probably. Hmm... an earlier version of this code didn't use distance
== 0 to indicate end-of-stream, but instead had a separate internal
end_of_stream flag. If we brought that back and didn't clobber
distance, we wouldn't need this save-and-restore dance. It seemed
shorter and sweeter without it back then, before _reset() existed in
its present form, but I wonder if end_of_stream would be nicer than
having to add this kind of stuff, without measurable downsides.

> > There was also some discussion at the time about whether "reset so I
> > can rescan", and "reset so I can continue after a temporary stop"
> > should be different operations requiring different APIs. It now seems
> > like one operation is sufficient, but it should preserve the distance
> > as you showed and then let the algorithm learn about already-cached
> > data in the rescan case (if it is even true then, which is also
> > debatable since it depends on the size of the scan). So, I think we
> > should just go ahead and commit a patch like that.
>
> Not sure.
> To me it seems more like two distinct cases, but I'm not sure
> if it requires two distinct "operations" with distinct API. Perhaps a
> simple flag for the _reset() would be enough? It'd need to track the
> distance anyway, just in case.
>
> Consider for example a nested loop, which does a rescan every time the
> outer row changes. Is there a reason to believe the outer rows will need
> the same number of inner rows? Aren't those "distinct streams"? Maybe
> I'm thinking about this wrong, of course.

Good question. Yeah, your flag idea seems like a good way to avoid
baking opinion into this level. I wonder if it should be a bitmask
rather than a boolean, in case we think of more things that need to be
included or not when resetting.

> The thing that however concerns me is that what I observed was not the
> distance getting reset to 1, and then ramping up. Which should happen
> pretty quickly, thanks to the doubling. In my experiments it *never*
> ramped up again, it stayed at 1. I still don't quite understand why.

Huh. Will look into that on Monday.
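
For illustration only, here is a toy, self-contained C model of the two
ideas discussed above: a separate end_of_stream flag, so that distance is
never clobbered and needs no save-and-restore, and a flags bitmask on
reset that lets the caller choose whether the distance is kept. This is
not PostgreSQL code. ToyStream, toy_init(), toy_ramp_up(), toy_mark_end(),
toy_reset() and TOY_RESET_KEEP_DISTANCE are hypothetical names invented
for this sketch, and the real read stream logic is considerably more
involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical reset flag (illustrative only, not a PostgreSQL API). */
#define TOY_RESET_KEEP_DISTANCE 0x01	/* "continue after a temporary stop" */

typedef struct ToyStream
{
	int			distance;		/* current look-ahead distance, always >= 1 */
	int			max_distance;	/* upper bound for the look-ahead distance */
	bool		end_of_stream;	/* separate flag; distance stays untouched */
} ToyStream;

/* Start with the minimum distance and no end-of-stream condition. */
static void
toy_init(ToyStream *s, int max_distance)
{
	assert(max_distance >= 1);
	s->distance = 1;
	s->max_distance = max_distance;
	s->end_of_stream = false;
}

/* Model the ramp-up: double the distance, capped at max_distance. */
static void
toy_ramp_up(ToyStream *s)
{
	s->distance *= 2;
	if (s->distance > s->max_distance)
		s->distance = s->max_distance;
}

/* Record that the stream ran out of blocks without changing distance. */
static void
toy_mark_end(ToyStream *s)
{
	s->end_of_stream = true;
}

/*
 * Reset the stream.  With TOY_RESET_KEEP_DISTANCE the learned distance
 * survives; without it the stream starts again from distance 1, as for
 * a rescan that is treated as a distinct stream.
 */
static void
toy_reset(ToyStream *s, int flags)
{
	s->end_of_stream = false;
	if ((flags & TOY_RESET_KEEP_DISTANCE) == 0)
		s->distance = 1;
}
```

In this model a caller that pauses and resumes would pass
TOY_RESET_KEEP_DISTANCE, while a nested-loop style rescan could pass 0 and
let the distance ramp up again from 1. Because the flags argument is a
bitmask, further options could be added later without changing the
signature.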