Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1raTNZ-009sdS-En for pgsql-hackers@arkaria.postgresql.org; Thu, 15 Feb 2024 04:29:45 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1raTNY-00AMQn-4N for pgsql-hackers@arkaria.postgresql.org; Thu, 15 Feb 2024 04:29:44 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1raTNX-00AMQe-Qz for pgsql-hackers@lists.postgresql.org; Thu, 15 Feb 2024 04:29:43 +0000 Received: from mail-ed1-x52c.google.com ([2a00:1450:4864:20::52c]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1raTNU-007Gaf-RW for pgsql-hackers@lists.postgresql.org; Thu, 15 Feb 2024 04:29:43 +0000 Received: by mail-ed1-x52c.google.com with SMTP id 4fb4d7f45d1cf-5639ffbcc10so544907a12.0 for ; Wed, 14 Feb 2024 20:29:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1707971379; x=1708576179; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=D66HRgq5loyGzmaUWsHMEQ3IXDpM5cKGPf/rFFiHtus=; b=jE4yNqoOlPB5jGDTGJ6jqE+HLdq6HN6i2BmG92Nk4tQZ5wKiW82GNgGC6urSfDMeut UaQRvC2Ujdqlug9R0eFpXKx3/JK4zP3MIxm6X6PvXkdzCS8Vj7rIWeVTzAvsIjZxXSKf x5pmsLQJQS5DRgqQ8RzKzE23Qn80ZZeiVNA84D0yt9K1uAmhru1Y1H0yGeOtFlC82Fqo 0KvTphnFGQlBa13tSC40rfgjm4T/wyZKYaII/PjfSmp2nWwZLiOoPSNY7jLQj96VZLK+ Oq61VW9cJtCvD7kU6Whlovj3nbKsua1octA5wnEnrCb3oTj9nxkFpNOkJDiPvdLnPV33 89+A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1707971379; x=1708576179; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=D66HRgq5loyGzmaUWsHMEQ3IXDpM5cKGPf/rFFiHtus=; b=El0Qt8HMIk/EvnbwZ+dw1wYd1RU8FEtkThV6h4vUrcCp8vPzNXtvYfwm2eEZVBpU9v Dx2NDjJtU3dd1TwiuzwO5zSWUjZnP3XN8hI9AjLeriSgSpFR+q9myyjBPLcKx8azd55c hVCUidE7rhr8+F4QODaSpWnMR8RUzndAJw69wLSDaIrcVVOikQX19eoIQoNTWH1wdWAt /KaRAWrJGdR9mMfNKEHg83cGrN7Sf59QWrV0pRI1lxg+oyWe5Cz3tIRAFLln18KGCxye eojms4ChX7NYuy9YQBIUAUl+hnzz1Ibdj4xJoWMcJatGVEYpRLxLyq9XIUYN0poRxgH/ qLLg== X-Forwarded-Encrypted: i=1; AJvYcCVSOiKha/JqxnWrNhiCnwn5vXIlO+n4/futU9gvjoOCQjEfByMchaFyKBF8l/ABPn5VEepsSRfzIlfMxaPofW+VutejIONjN6u5PITXNscvBVsf X-Gm-Message-State: AOJu0YzWrYKo+7wjEnPenSdft9vsCeBUuX88FxAizvKtXtQdZKCQO+ei NUnE+BmhYb6LGxD47bV3C1vcXo0hlLy4tqHK3U2pbSrArGuDl1+F43M5aNfo2hE/LbEAMQyTWOK jKci2WGTaCMDn6Rs+tLfD+mvYz6oRi/zb X-Google-Smtp-Source: AGHT+IEb6bpGoRw1mZyh1qZ/stAMNjjFJqfmVZTnzzjc9QRyl7lsXthNuFQT6oTccEOBtF51XZKI4BK45BF77XqX79s= X-Received: by 2002:a17:906:f201:b0:a3d:3aee:a854 with SMTP id gt1-20020a170906f20100b00a3d3aeea854mr312364ejb.28.1707971379214; Wed, 14 Feb 2024 20:29:39 -0800 (PST) MIME-Version: 1.0 References: <8ec36f51-b863-60e3-20e2-b9c981c5ce5e@enterprisedb.com> <98ba4b25-fae8-c1f4-1597-8093375a1986@enterprisedb.com> <20231221134314.wf2rs62d37u62j7t@alap3.anarazel.de> <20231221154352.ijtg6wloa3nowivh@alap3.anarazel.de> <482ec3ff-52ad-415d-96fd-f3832a894023@enterprisedb.com> <56176b8d-956c-487e-ab09-310db4581c07@enterprisedb.com> <4867452a-b853-4813-a6da-9bb06a336f8b@enterprisedb.com> <5fc8f150-f350-4cb3-b6b7-ffcc21ed7bc2@enterprisedb.com> In-Reply-To: <5fc8f150-f350-4cb3-b6b7-ffcc21ed7bc2@enterprisedb.com> From: Robert Haas Date: Thu, 15 Feb 2024 09:59:27 +0530 Message-ID: Subject: Re: index prefetching To: Tomas Vondra Cc: Melanie Plageman , Andres Freund , PostgreSQL Hackers , Georgios , Thomas Munro , Konstantin Knizhnik , Dilip Kumar Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, Feb 14, 2024 at 7:43=E2=80=AFPM Tomas Vondra wrote: > I don't think it's just a bookkeeping problem. In a way, nbtree already > does keep an array of tuples to kill (see btgettuple), but it's always > for the current index page. So it's not that we immediately go and kill > the prior tuple - nbtree already stashes it in an array, and kills all > those tuples when moving to the next index page. > > The way I understand the problem is that with prefetching we're bound to > determine the kill_prior_tuple flag with a delay, in which case we might > have already moved to the next index page ... Well... I'm not clear on all of the details of how this works, but this sounds broken to me, for the reasons that Peter G. mentions in his comments about desynchronization. If we currently have a rule that you hold a pin on the index page while processing the heap tuples it references, you can't just throw that out the window and expect things to keep working. Saying that kill_prior_tuple doesn't work when you throw that rule out the window is probably understating the extent of the problem very considerably. I would have thought that the way this prefetching would work is that we would bring pages into shared_buffers sooner than we currently do, but not actually pin them until we're ready to use them, so that it's possible they might be evicted again before we get around to them, if we prefetch too far and the system is too busy. Alternately, it also seems OK to read those later pages and pin them right away, as long as (1) we don't also give up pins that we would have held in the absence of prefetching and (2) we have some mechanism for limiting the number of extra pins that we're holding to a reasonable number given the size of shared_buffers. However, it doesn't seem OK at all to give up pins that the current code holds sooner than the current code would do. --=20 Robert Haas EDB: http://www.enterprisedb.com