From: Thomas Munro
Date: Sat, 14 Feb 2026 10:58:50 +1300
Subject: Re: BUG #19406: substring(text) fails on valid UTF-8 toasted value in PostgreSQL 15.16
To: Noah Misch
Cc: ranvis@gmail.com, pgsql-bugs@lists.postgresql.org
References: <19406-9867fddddd724fca@postgresql.org> <20260213172702.71@rfd.leadboat.com>
In-Reply-To: <20260213172702.71@rfd.leadboat.com>

On Sat, Feb 14, 2026 at 6:27 AM Noah Misch wrote:
> On Fri, Feb 13, 2026 at 07:46:22AM +0000, PG Bug reporting form wrote:
> > After upgrading from PostgreSQL 15.15 to 15.16, substring(text) raises:
> > >ERROR: invalid byte sequence for encoding "UTF8": 0xe6 0x97
> > on valid UTF-8 text stored in a TOAST-compressed column.
>
> > user=> select substring(data from 1 for 1) from toast_repro;
> > ERROR: 22021: invalid byte sequence for encoding "UTF8": 0xe6 0x97
>
> Thanks for the report. That is a bug and a regression; I regret missing it
> during review.
> The substring operation works by taking a 4-byte slice from
> the toasted value (4 bytes being the max length of a UTF8 char in PostgreSQL),
> the finding the actual first character within those bytes. However, it
> incorrectly requires those four bytes to be a valid UTF8 string. I'll start
> on a fix.

Ack. Also looking into this.