From: Peter Geoghegan
Date: Tue, 9 Dec 2025 17:38:10 -0500
Subject: Re: Trying out read streams in pgvector (an extension)
To: Thomas Munro
Cc: Melanie Plageman, Nazir Bilal Yavuz, "Jonathan S. Katz", pgsql-hackers

On Mon, Dec 8, 2025 at 10:47 PM Thomas Munro wrote:
> Yielding just because you've scanned N index pages/tuples/whatever is
> harder to think about. The stream shouldn't get far ahead unless it's
> recently been useful for I/O concurrency (though optimal distance
> heuristics are an open problem), but in this case a single invocation
> of the block number callback can call ReadBuffer() an arbitrary number
> of times, filtering out all the index tuples as it rampages through
> the whole index IIUC. I see why you might want to yield periodically
> if you can, but I also wonder how much that can really help if you
> still have to pick up where you left off next time.

I think of it as a necessary precaution against pathological behavior
where the amount of memory used to cache matching tuples/TIDs gets out
of hand. There's no specific reason to expect that to happen (or no
good reason). But I'm pretty sure that it'll prove necessary to pay
non-zero attention to how much work has been done since the last time
we returned a tuple (when there's a tuple available to return).

> I guess it
> depends on the distribution of matches.

To be clear, I haven't done any kind of modelling of the problems in
this area. Once I do that (in 2026), I'll be able to say more about
the requirements. Maybe Tomas could take a look sooner?

Right now my focus is on getting the basic interfaces/API revisions in
better shape. And avoiding regressions while doing so.

-- 
Peter Geoghegan