Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uc2K4-008Ev7-0O for pgsql-hackers@arkaria.postgresql.org; Wed, 16 Jul 2025 13:37:24 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1uc2K1-00AMy7-8a for pgsql-hackers@arkaria.postgresql.org; Wed, 16 Jul 2025 13:37:21 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1uc2K0-00AMvg-Nw for pgsql-hackers@lists.postgresql.org; Wed, 16 Jul 2025 13:37:21 +0000 Received: from mail-wr1-x429.google.com ([2a00:1450:4864:20::429]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1uc2Jz-0084JL-04 for pgsql-hackers@lists.postgresql.org; Wed, 16 Jul 2025 13:37:20 +0000 Received: by mail-wr1-x429.google.com with SMTP id ffacd0b85a97d-3a4ef2c2ef3so4969889f8f.2 for ; Wed, 16 Jul 2025 06:37:18 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=bowt-ie.20230601.gappssmtp.com; s=20230601; t=1752673036; x=1753277836; darn=lists.postgresql.org; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:from:to:cc:subject:date :message-id:reply-to; bh=Y52sNIrN6yFlZKilru9jMeVSRD/yZ4Z6tU7T2hSecdI=; b=F9svMblbUD7GvaV4IQooog7MytzD3/jZtd3WEh6JIipTRQk8fqKDiO28B54SwhcoxC B0DTzTgP+ZcL/+21bIPZXSvGjuuTZ3mA3ABdLYW+yhKz5tjXWb08VSCFlkkwTjAxEMie 3CivsSH1pVJMh8SVb0NubY8IKGWm3LzYPSOhP0M8Zt3wRdCifTptqRNCDyKV5q3x8+Ib /DZJSGGHzVZDQHjv390Yc+uQYOOFwnr/h/z5kte0wg3XCFgpCLaQ0T0nE0xP4GgJIDMl bQZzqOAeLgL5HztJ8wTsiY/P9neghffL77AX3zFd2HZhClwXSc3oqseLbyynmfrrGQHd uqBA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1752673036; x=1753277836; h=content-transfer-encoding:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Y52sNIrN6yFlZKilru9jMeVSRD/yZ4Z6tU7T2hSecdI=; b=sGiQnxxeVm/WortCQbQS/tLYWg/XUp4OG1BvxIlmVvS2OY7uADyfJ4GmIpZ5zBwKob tTlId4JWrS4YhKwto2t10THQBq1pI2LaqLc4ETZAVUPyrpfU1iDuErIshTy1I4TYQdUL jDY41LmIhBmneKCSH/66QvHedLS4DAH9cO3b8LbFuKSRlPJzsoYo4puD/6CQDqV1N4p8 FtF1acSprrN7ky2HZoyz+bt06br4lHaAedHxMv6Mk/CW47WXUdTINOQxABlKYZAyzkdS kxEdvbo4/qqj3kELpmRfJ9W9JvHYnRCqI3RV7qw32+gxv5A30jbuubbZiAsGPLKPcjhI A8fg== X-Forwarded-Encrypted: i=1; AJvYcCUZI2fnrDPTITs6KuJmpPT3+v4EyIsriQDSjL/pZmgwbdQqE513REaWj/UsFd0rzW/2DOIcjMYu5M++pa6F@lists.postgresql.org X-Gm-Message-State: AOJu0YyuvdRVRvi/a4PAMJ1Y+4LPVpoyFH5shHzVCDyrmdQ3bOiedFs2 Az9ei8Hy07SRTV0lRcVcJ+VvLIUmS0T3xRisOK2aQvmsmoxNCZStisUcX8huNr4z3LZn6WbyvjB trFRb9d8k5WZeXpidlHHmv7BvE/9YmAZAewvwlV/MLw== X-Gm-Gg: ASbGncv5DB8sZu1l/cojCvcp7nhcYpU+Rxmh35OmlwxEJ8ml4x3sl70KD8dYu81+PPd KV0E8KGsDuSNkdnvEgG/rwDIDCnGXzv7TXHS0ueN5tFFAbkSJ8nR6PvDnhaut4ryZz4T+RBo0tJ 9vpAW69Ti+HDuqeXorGFHfwf4panaucw3b8tLy1T9ZXzXPz1Hk4BaT1o0VwVqGOxraXHS9Zs7WL 6Ns7xY= X-Google-Smtp-Source: AGHT+IEJBR+zpQjgEfL8nmVgHGpz1823vsE5Y4E9ro3su7o2iN1JDvq0mk3eEvF2rFAefyJyBgvXs8FrL5DBE1GfE7k= X-Received: by 2002:adf:9d8a:0:b0:3a5:2cb5:63f8 with SMTP id ffacd0b85a97d-3b60e4f2799mr1484246f8f.29.1752673035817; Wed, 16 Jul 2025 06:37:15 -0700 (PDT) MIME-Version: 1.0 References: <57d0e292-73d5-4ab9-9855-110ee9cbd90a@vondra.me> <32c15a30-6e25-4f6d-9191-76a19482c556@vondra.me> <64c8b824-6203-46a3-b045-5e95b796feee@vondra.me> In-Reply-To: <64c8b824-6203-46a3-b045-5e95b796feee@vondra.me> From: Peter Geoghegan Date: Wed, 16 Jul 2025 09:36:48 -0400 X-Gm-Features: Ac12FXzEqNbhdUIdNt3HniChnTYb_QaFsbTAEYVBE8j5LfXTTLhjEIEBRab0-RA Message-ID: Subject: Re: index prefetching To: Tomas Vondra Cc: Andres Freund , Robert Haas , Melanie Plageman , PostgreSQL Hackers , Georgios , Thomas Munro , Konstantin Knizhnik , Dilip Kumar Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk On Wed, Jul 16, 2025 at 4:40=E2=80=AFAM Tomas Vondra wrot= e: > But the thing I don't really understand it the "cyclic" dataset (for > example). And the "simple" patch performs really badly here. This data > set is designed to not work for prefetching, it's pretty much an > adversary case. There's ~100 TIDs from 100 pages for each key value, and > once you read the 100 pages you'll hit them many times for following > values. Prefetching is pointless, and skipping duplicate blocks can't > help, because the blocks are not effective. > > But how come the "complex" patch does so much better? It can't really > benefit from prefetching TID from the next leaf - not this much. Yet it > does a bit better than master. I'm looking at this since yesterday, and > it makes no sense to me. Per "perf trace" it actually does 2x many > fadvise calls compared to the "simple" patch (which is strange on it's > own, I think), yet it's apparently so much faster? The "simple" patch has _bt_readpage reset the read stream. That doesn't make any sense to me. Though it does explain why the "complex" patch does so many more fadvise calls. Another issue with the "simple" patch: it adds 2 bool fields to "BTScanPosItem". That increases its size considerably. We're very sensitive to the size of this struct (I think that you know about this already). Bloating it like this will blow up our memory usage, since right now we allocate MaxTIDsPerBTreePage/1358 such structs for so->currPos (and so->markPos). Wasting all that memory on alignment padding is probably going to have consequences beyond memory bloat. --=20 Peter Geoghegan