From: Thomas Munro <[email protected]>
To: Melanie Plageman <[email protected]>
Cc: Jonathan S. Katz <[email protected]>
Cc: pgsql-hackers <[email protected]>
Subject: Re: Trying out read streams in pgvector (an extension)
Date: Wed, 12 Nov 2025 12:19:26 +1300
Message-ID: <CA+hUKG+zLmkD9zus=JOjjC+j5p9R1+CSXNZgd5=exZ01ZTaKoA@mail.gmail.com>
In-Reply-To: <CAAKRu_ZVxzwRRbxedgb_LtkFaGf78XAbTO9uExvadV2DzaE=Jg@mail.gmail.com>
References: <CA+hUKGJ_7NKd46nx1wbyXWriuZSNzsTfm+rhEuvU6nxZi3-KVw@mail.gmail.com>
	<[email protected]>
	<CA+hUKG+x2BcqWzBC77cN0ewhzMF0kYhC6c4G_T2gJLPbqYQ6Ow@mail.gmail.com>
	<CA+hUKGL-3mBtkA9RTbLFHuSS5cviuv0ko7nBhCg9KM7Q-GSEkw@mail.gmail.com>
	<CAAKRu_ZVxzwRRbxedgb_LtkFaGf78XAbTO9uExvadV2DzaE=Jg@mail.gmail.com>

On Wed, Nov 12, 2025 at 11:52 AM Melanie Plageman
<[email protected]> wrote:
> On Tue, Nov 11, 2025 at 4:22 PM Thomas Munro <[email protected]> wrote:
> > But for now, to fix pgvector's woes, I wonder if it might make sense
> > to call this a bug in v18, and back-patch the tiniest possible change.
> > Something like what I posted[2] in this thread almost two years ago.
> > I don't think it really affects any core code: we use
> > read_stream_reset() only in very minimal ways there (I could
> > elaborate), and it's quite arguable that the existing policy is wrong
> > for them too, but we'd need to confirm that and perhaps think about
> > other extensions that might be using it.
>
> If we are worried about regressing other extensions using
> read_stream_reset(), we could make the read stream reset which
> preserves the distance a different function in backbranches.

Hmm, yeah, interesting idea.  Candidate names might include
read_stream_restart() and read_stream_continue().  The point being
that the block number callback reported end-of-stream, but that was
only temporary, and now it has more information and would like to
continue.  Those are some of the names I bounced around for a new
read_stream_reset() flag argument for v19 (I rather liked "continue"),
but I also like this separate function idea.  Back-patching a new
function would certainly remove all doubt about unintended
consequences for existing callers of read_stream_reset(), so yeah,
that wins on pure conservative safety grounds.  As for the future,
it might even be better to have an explicit separate API for this
operation in master too, as it is turning out to be quite a common
requirement and the naming is much clearer like that.  We don't
usually design new APIs while back-patching, which is probably why I
didn't think of that.  But if we view this as a design bug that folded
too many jobs into read_stream_reset(), and that we now want to fix by
splitting one off, maybe that's OK?  Seems pretty risk-free, anyway.
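
To make the distinction discussed above concrete, here is a toy model in
C.  It is only a sketch and is not PostgreSQL's read_stream.c: the
ToyStream struct, every toy_* function, and the doubling look-ahead
heuristic are assumptions made up for illustration.  It assumes, as the
thread implies, that the existing reset does not preserve the look-ahead
distance, while the proposed separate function (modelled here under the
candidate name "continue") only clears the end-of-stream state.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for a read stream; all names here are hypothetical. */
typedef struct ToyStream
{
	int		distance;		/* current look-ahead distance, in blocks */
	int		max_distance;	/* cap on the look-ahead distance */
	bool	ended;			/* callback has reported end-of-stream */
} ToyStream;

static void
toy_stream_init(ToyStream *s, int max_distance)
{
	assert(max_distance >= 1);
	s->distance = 1;
	s->max_distance = max_distance;
	s->ended = false;
}

/* Assumed heuristic: each I/O doubles the distance, up to the cap. */
static void
toy_stream_note_io(ToyStream *s)
{
	s->distance *= 2;
	if (s->distance > s->max_distance)
		s->distance = s->max_distance;
}

/* The block number callback ran out of blocks, possibly only for now. */
static void
toy_stream_end(ToyStream *s)
{
	s->ended = true;
}

/* Models a reset that starts over: the learned distance is thrown away. */
static void
toy_stream_reset(ToyStream *s)
{
	s->ended = false;
	s->distance = 1;
}

/* Models the proposed separate function: resume and keep the distance. */
static void
toy_stream_continue(ToyStream *s)
{
	s->ended = false;
}

/*
 * Ramp the distance up with four I/Os, hit a temporary end-of-stream,
 * then resume with either function and report the resulting distance.
 */
int
toy_distance_after(bool use_continue)
{
	ToyStream	s;

	toy_stream_init(&s, 16);
	for (int i = 0; i < 4; i++)
		toy_stream_note_io(&s);
	toy_stream_end(&s);

	if (use_continue)
		toy_stream_continue(&s);
	else
		toy_stream_reset(&s);

	return s.distance;
}
```

In this model, a caller whose callback reports end-of-stream only
temporarily (as described above for pgvector) would lose its ramped-up
distance on every toy_stream_reset() and have to build it up again,
whereas toy_stream_continue() leaves it alone.  Keeping the two
operations as separate functions also leaves existing callers of the
reset untouched.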




