public inbox for [email protected]
help / color / mirror / Atom feedFrom: Laurenz Albe <[email protected]>
To: Tom Lane <[email protected]>
To: Dominique Devienne <[email protected]>
Cc: [email protected]
Subject: Re: Latest patches break one of our unit-test, related to RLS
Date: Sat, 13 Sep 2025 01:34:08 +0200
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <CAFCRh-8NwJd0jq6P=R3qhHyqU7hw0BTor3W0SvUcii24et+zAw@mail.gmail.com>
<CAFCRh-93hsRvoJqQCwXq06=AJJJ_s783gVsGHKc-QjgOO6jGVg@mail.gmail.com>
<[email protected]>
<CAFCRh--DPWEHpRmdr-Jfw4guYqjc9TM_QyVH7SR4B=Le5gGe3Q@mail.gmail.com>
<CAFCRh--M1y2YpJJMb6undzvyvhd4geKtRWFkDyr1shHvvoOkpQ@mail.gmail.com>
<CAFCRh-_OBKtabe8MO0vUocNSq_O3_b852AXknPiNhL0tNUBGQg@mail.gmail.com>
<CAFCRh-_U+8fmRoc-sYM5ahp=9dNjs=Dri1LFLcHCgEJX8Zq_oA@mail.gmail.com>
<[email protected]>
On Fri, 2025-09-12 at 10:07 -0400, Tom Lane wrote:
> Dominique Devienne <[email protected]> writes:
> > > This DOES look like a bug, no? I've done regexes for a long time,
> > > and these two forms should be equivalent IMHO. --DD
>
> Yeah, I agree it's busted. You can use EXPLAIN VERBOSE to see the
> translated-to-POSIX pattern, and it's wrong:
>
> regression=# explain verbose with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar'))
> select v from t where v similar to 'foo[\d\w]_%';
> QUERY PLAN
> --------------------------------------------------------------
> Values Scan on "*VALUES*" (cost=0.00..0.05 rows=1 width=32)
> Output: "*VALUES*".column1
> Filter: ("*VALUES*".column1 ~ '^(?:foo[\d\w]_%)$'::text)
> (3 rows)
>
> The _ and % are not getting converted to their POSIX equivalents
> ("." and ".*").
Indeed, and I have to take the blame for introducing a bug in a minor
release :^(
The attached patch should fix the problem.
Yours,
Laurenz Albe
Attachments:
[text/x-patch] v1-0001-Amend-recent-fix-for-SIMILAR-TO-regex-conversion.patch (3.1K, 2-v1-0001-Amend-recent-fix-for-SIMILAR-TO-regex-conversion.patch)
download | inline diff:
From 8d0f8ebac6c42fe7da36ec8c30ee091d20270068 Mon Sep 17 00:00:00 2001
From: Laurenz Albe <[email protected]>
Date: Sat, 13 Sep 2025 01:32:55 +0200
Subject: [PATCH v1] Amend recent fix for SIMILAR TO regex conversion
Commit e3ffc3e91d fixed the translation of character classes in
SIMILAR TO regular expressions. Unfortunately the fix broke a corner
case: if there is an escape character right after the opening bracket,
(for example in "[\q]") a closing bracket right after the escape
sequence would not be seen as closing the character class.
Author: Laurenz Albe <[email protected]>
Reported-By: Dominique Devienne <[email protected]>
Reported-By: Stephan Springl <[email protected]>
Backpatch-through: 13
---
src/backend/utils/adt/regexp.c | 6 ++++++
src/test/regress/expected/strings.out | 9 +++++++++
src/test/regress/sql/strings.sql | 3 +++
3 files changed, 18 insertions(+)
diff --git a/src/backend/utils/adt/regexp.c b/src/backend/utils/adt/regexp.c
index 6e2864cbbda..29692fb1a9d 100644
--- a/src/backend/utils/adt/regexp.c
+++ b/src/backend/utils/adt/regexp.c
@@ -948,6 +948,12 @@ similar_escape_internal(text *pat_text, text *esc_text)
*/
*r++ = '\\';
*r++ = pchar;
+ /*
+ * If we encounter an escaped character in a character class,
+ * we must be past an initial ^ or ].
+ */
+ if (charclass_depth > 0)
+ charclass_start = 3;
}
afterescape = false;
}
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index ba302da51e7..2d6cb02ad60 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -693,6 +693,15 @@ EXPLAIN (COSTS OFF) SELECT * FROM TEXT_TBL WHERE f1 SIMILAR TO '[^^]^';
Filter: (f1 ~ '^(?:[^^]\^)$'::text)
(2 rows)
+-- Closing square bracket after an escape sequence at the beginning of
+-- a character closes the character class
+EXPLAIN (COSTS OFF) SELECT * FROM TEXT_TBL WHERE f1 SIMILAR TO '[|a]%' ESCAPE '|';
+ QUERY PLAN
+---------------------------------------
+ Seq Scan on text_tbl
+ Filter: (f1 ~ '^(?:[\a].*)$'::text)
+(2 rows)
+
-- Test backslash escapes in regexp_replace's replacement string
SELECT regexp_replace('1112223333', E'(\\d{3})(\\d{3})(\\d{4})', E'(\\1) \\2-\\3');
regexp_replace
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index b94004cc08c..5ed421d6205 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -218,6 +218,9 @@ EXPLAIN (COSTS OFF) SELECT * FROM TEXT_TBL WHERE f1 SIMILAR TO '[]%][^]%][^%]%';
-- Closing square bracket effective after two carets at the beginning
-- of character class.
EXPLAIN (COSTS OFF) SELECT * FROM TEXT_TBL WHERE f1 SIMILAR TO '[^^]^';
+-- Closing square bracket after an escape sequence at the beginning of
+-- a character closes the character class
+EXPLAIN (COSTS OFF) SELECT * FROM TEXT_TBL WHERE f1 SIMILAR TO '[|a]%' ESCAPE '|';
-- Test backslash escapes in regexp_replace's replacement string
SELECT regexp_replace('1112223333', E'(\\d{3})(\\d{3})(\\d{4})', E'(\\1) \\2-\\3');
--
2.51.0
view thread (9+ messages) latest in thread
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: Latest patches break one of our unit-test, related to RLS
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox