From: Thomas Munro <[email protected]>
To: Noah Misch <[email protected]>
Cc: [email protected]
Cc: [email protected]
Subject: Re: BUG #19406: substring(text) fails on valid UTF-8 toasted value in PostgreSQL 15.16
Date: Sun, 15 Feb 2026 09:10:00 +1300
Message-ID: <CA+hUKGKJm6Zi0P5vp=ZnhS=WN0_13tUX2aS5EzQQu3Ci=xhVew@mail.gmail.com>
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CA+hUKGJ2Azp21o+Vm3enjA6WNao_=Xyu_EFjZwoc=YqSg+LjXQ@mail.gmail.com>
	<CA+hUKGKy7uqJv9HDMPjDR6O8bv9FSvdg2X5k8z3J1k6QhnJoCg@mail.gmail.com>
	<[email protected]>

On Sun, Feb 15, 2026 at 8:33 AM Noah Misch <[email protected]> wrote:
> - slice_len is the amount *returned* from the toaster.  It's nonnegative.

Ah, right, that makes more sense.

> > The outline I had come up with before seeing your patch was: let's
> > just delete it.  The position search can check bounds incrementally,
> > following our general approach.  This avoids the reported problem by
> > ditching the pre-flight scan through the slice (up to 4x more
> > pg_mblen_XXX calls and memory access than we strictly need), and also
> > the special cases for empty strings since they already fall out of the
> > general behaviour (am I missing something?), not leaving much code
> > behind.
>
> Like you, I made a note that it's wasteful to make two mblen passes over the
> string.  I'm only seeing a 50% reduction in mblen calls, not an 80% reduction,
> but I didn't look too closely.  I guessed such a change would be less clearly
> correct, so I figured it would be less suitable for back branches.  Hence, I
> didn't draft it.

I was comparing to unpatched master, but yeah of course your patch
already gets part of the way there.

> My first impression, hurried due to the commit ETA in 30 minutes, is that this
> is less conservative and should hold for master-only.

Got it.  Will add it to the pile of master-only fallout from this area.





