public inbox for [email protected]
help / color / mirror / Atom feedFrom: Henson Choi <[email protected]>
To: Tatsuo Ishii <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Subject: Re: Row pattern recognition
Date: Tue, 24 Feb 2026 10:28:19 +0900
Message-ID: <CAAAe_zAxEqdyoOgx4u0A2RXu829CE-RJbJjg1jVz00j3aSYryQ@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
<CAAAe_zCCApM0u9YWDAOGG+9R5XLCCuoaXhNXgQ0LVRcYBUqC4g@mail.gmail.com>
<CAAAe_zDcrF9_m5FnWFgdrCpxjTqdH2ZMDmpKXevMG4KBHJsw4w@mail.gmail.com>
<[email protected]>
Hi Tatsuo,
Here are six incremental patches on top of v43.
nocfbot-0001 through nocfbot-0004 are the same patches from the
previous round (32-bit test fix, PREV/NEXT restriction, ALT lexical
ordering, reluctant quantifiers).
nocfbot-0005: Detect zero-consumption NFA cycles
Adds a per-element visited bitmap to detect zero-consumption cycles
during DFS epsilon expansion. Before each state's DFS, the bitmap
is cleared; as nfa_advance_state() recurses through epsilon
transitions, each elemIdx is marked visited. If the same elemIdx
is reached again within the same DFS, it means an epsilon-only
loop -- the state is freed immediately.
The ad-hoc (count == 0 && min == 0) exit condition in
nfa_advance_end() is removed. The FIXME test cases are resolved
and renamed to "Zero-Consumption Cycle Detection".
nocfbot-0006: Allow A{0} quantifier
With cycle detection in place, this becomes straightforward.
Lowers the {n} bound minimum from 1 to 0. A{0} is treated as an
epsilon transition -- the variable is skipped entirely.
Next I plan to work on test reorganization, cross-database result
comparison, and a review pass over the NFA executor.
Best regards,
Henson
Attachments:
[application/octet-stream] nocfbot-0003-alt-dfs-early-termination.patch (23.8K, 3-nocfbot-0003-alt-dfs-early-termination.patch)
download | inline diff:
From cbf4da013b2cb0ffe2a1496e7f9467f1dc02997f Mon Sep 17 00:00:00 2001
From: Henson Choi <[email protected]>
Date: Fri, 20 Feb 2026 09:18:21 +0900
Subject: [PATCH 3/6] Fix ALT lexical ordering via DFS early termination,
remove altPriority
---
src/backend/executor/nodeWindowAgg.c | 169 +++++++++-------------
src/include/nodes/execnodes.h | 7 -
src/test/regress/expected/rpr_base.out | 2 +-
src/test/regress/expected/rpr_explain.out | 40 +++++
src/test/regress/expected/rpr_nfa.out | 92 ++++--------
src/test/regress/sql/rpr_explain.sql | 24 +++
src/test/regress/sql/rpr_nfa.sql | 70 +++------
7 files changed, 186 insertions(+), 218 deletions(-)
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 9b2c4b6a1d7..96b2cd4b9c4 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -265,8 +265,7 @@ static RPRNFAState *nfa_state_alloc(WindowAggState *winstate);
static void nfa_state_free(WindowAggState *winstate, RPRNFAState *state);
static void nfa_state_free_list(WindowAggState *winstate, RPRNFAState *list);
static RPRNFAState *nfa_state_create(WindowAggState *winstate, int16 elemIdx,
- int16 altPriority, int32 *counts,
- bool sourceAbsorbable);
+ int32 *counts, bool sourceAbsorbable);
static bool nfa_states_equal(WindowAggState *winstate, RPRNFAState *s1,
RPRNFAState *s2);
static bool nfa_add_state_unique(WindowAggState *winstate, RPRNFAContext *ctx,
@@ -5188,14 +5187,14 @@ nfa_state_free_list(WindowAggState *winstate, RPRNFAState *list)
/*
* nfa_state_create
*
- * Create a new state with given elemIdx, altPriority and counts.
+ * Create a new state with given elemIdx and counts.
* isAbsorbable is computed immediately: inherited AND new element's flag.
* Monotonic property: once false, stays false through all transitions.
*
* Caller is responsible for linking the returned state.
*/
static RPRNFAState *
-nfa_state_create(WindowAggState *winstate, int16 elemIdx, int16 altPriority,
+nfa_state_create(WindowAggState *winstate, int16 elemIdx,
int32 *counts, bool sourceAbsorbable)
{
RPRPattern *pattern = winstate->rpPattern;
@@ -5204,7 +5203,6 @@ nfa_state_create(WindowAggState *winstate, int16 elemIdx, int16 altPriority,
RPRPatternElement *elem = &pattern->elements[elemIdx];
state->elemIdx = elemIdx;
- state->altPriority = altPriority;
if (counts != NULL && maxDepth > 0)
memcpy(state->counts, counts, sizeof(int32) * maxDepth);
@@ -5249,7 +5247,7 @@ nfa_states_equal(WindowAggState *winstate, RPRNFAState *s1, RPRNFAState *s2)
*
* Add a state to ctx->states at the END, only if no duplicate exists.
* Returns true if state was added, false if duplicate found (state is freed).
- * Earlier states have lower altPriority (lexical order), so existing wins.
+ * Earlier states have better lexical order (DFS traversal order), so existing wins.
*/
static bool
nfa_add_state_unique(WindowAggState *winstate, RPRNFAContext *ctx, RPRNFAState *state)
@@ -5286,96 +5284,42 @@ nfa_add_state_unique(WindowAggState *winstate, RPRNFAContext *ctx, RPRNFAState *
/*
* nfa_add_matched_state
*
- * Record a matched state following SQL standard semantics.
- * Lexical order (lower altPriority) wins first. Among same lexical order,
- * longer match wins (greedy).
+ * Record a state that reached FIN, replacing any previous match.
*
- * FIXME: altPriority is a single value that only tracks the last ALT choice.
- * For patterns with repeated or nested ALTs like (A|B)+, this cannot correctly
- * implement SQL standard lexical order, which requires comparing the full path
- * from left to right. For example:
- * Pattern: (A | B)+
- * Path "A B A" vs "B A B"
- * Current: compares last choice (A vs B) → altPriority 10 vs 20
- * Correct: should compare first choice (A < B) → "A B A" wins
- *
- * A classifier structure tracking the entire ALT path is required for correct
- * implementation. Without it, patterns with repeated or nested ALTs will
- * produce incorrect match selection.
+ * For SKIP PAST LAST ROW, also prune subsequent contexts whose start row
+ * falls within the match range, as they cannot produce output rows.
*/
static void
nfa_add_matched_state(WindowAggState *winstate, RPRNFAContext *ctx,
RPRNFAState *state, int64 matchEndRow)
{
- bool shouldUpdate = false;
+ if (ctx->matchedState != NULL)
+ nfa_state_free(winstate, ctx->matchedState);
- if (ctx->matchedState == NULL)
- shouldUpdate = true;
- else if (state->altPriority < ctx->matchedState->altPriority)
- shouldUpdate = true; /* Better lexical order wins */
- else if (state->altPriority == ctx->matchedState->altPriority &&
- matchEndRow > ctx->matchEndRow)
- shouldUpdate = true; /* Same lexical order, longer wins */
+ ctx->matchedState = state;
+ state->next = NULL;
+ ctx->matchEndRow = matchEndRow;
- if (shouldUpdate)
+ /* Prune contexts that started within this match's range */
+ if (winstate->rpSkipTo == ST_PAST_LAST_ROW)
{
- /* Free old matchedState if exists */
- if (ctx->matchedState != NULL)
- nfa_state_free(winstate, ctx->matchedState);
+ RPRNFAContext *nextCtx;
+ int64 skippedLen;
- /* Take ownership of the new state */
- ctx->matchedState = state;
- state->next = NULL;
- ctx->matchEndRow = matchEndRow;
-
- /*----------
- * SKIP PAST LAST ROW: eagerly prune contexts within match range.
- *
- * This function is called whenever a FIN state is reached, including
- * during greedy matching when intermediate (shorter) matches are
- * found. Each time we update matchEndRow (whether extending a greedy
- * match or finding a new match), we can prune pending contexts that
- * started within the current match range.
- *
- * SKIP PAST LAST ROW uses lexical order (matchStartRow). Therefore,
- * any pending context that started at or before matchEndRow can never
- * produce a valid output row - it would be skipped anyway per SQL
- * standard.
- *
- * Example (greedy matching in progress):
- * Pattern: START UP+
- * Rows: 1 2 3 4 5
- * Context A starts at row 1:
- * - Matches START UP (rows 1-2) → matchEndRow=2 → prune Context B(row 2)
- * - Matches START UP UP (rows 1-3) → matchEndRow=3 → prune Context C(row 3)
- * - Continues greedy extension while pruning incrementally
- *----------
- */
- if (winstate->rpSkipTo == ST_PAST_LAST_ROW)
+ while (ctx->next != NULL &&
+ ctx->next->matchStartRow <= matchEndRow)
{
- RPRNFAContext *nextCtx;
- int64 skippedLen;
-
- while (ctx->next != NULL &&
- ctx->next->matchStartRow <= matchEndRow)
- {
- nextCtx = ctx->next;
- ctx->next = ctx->next->next;
+ nextCtx = ctx->next;
+ ctx->next = ctx->next->next;
- Assert(nextCtx->lastProcessedRow >= nextCtx->matchStartRow);
- skippedLen = nextCtx->lastProcessedRow - nextCtx->matchStartRow + 1;
- nfa_record_context_skipped(winstate, skippedLen);
+ Assert(nextCtx->lastProcessedRow >= nextCtx->matchStartRow);
+ skippedLen = nextCtx->lastProcessedRow - nextCtx->matchStartRow + 1;
+ nfa_record_context_skipped(winstate, skippedLen);
- nfa_context_free(winstate, nextCtx);
- }
- if (ctx->next == NULL)
- winstate->nfaContextTail = ctx;
+ nfa_context_free(winstate, nextCtx);
}
- }
- else
- {
- /* This state didn't win, free it */
- nfa_state_free(winstate, state);
+ if (ctx->next == NULL)
+ winstate->nfaContextTail = ctx;
}
}
@@ -6124,8 +6068,7 @@ nfa_route_to_elem(WindowAggState *winstate, RPRNFAContext *ctx,
RPRNFAState *skipState;
skipState = nfa_state_create(winstate, nextElem->next,
- state->altPriority, state->counts,
- state->isAbsorbable);
+ state->counts, state->isAbsorbable);
nfa_advance_state(winstate, ctx, skipState, currentPos, initialAdvance);
}
}
@@ -6138,8 +6081,7 @@ nfa_route_to_elem(WindowAggState *winstate, RPRNFAContext *ctx,
/*
* nfa_advance_alt
*
- * Handle ALT element: expand all branches in lexical order (DFS).
- * Sets altPriority to element index to preserve lexical order for match selection.
+ * Handle ALT element: expand all branches in lexical order via DFS.
*/
static void
nfa_advance_alt(WindowAggState *winstate, RPRNFAContext *ctx,
@@ -6163,13 +6105,12 @@ nfa_advance_alt(WindowAggState *winstate, RPRNFAContext *ctx,
if (first)
{
state->elemIdx = altIdx;
- state->altPriority = altIdx;
newState = state;
first = false;
}
else
{
- newState = nfa_state_create(winstate, altIdx, altIdx,
+ newState = nfa_state_create(winstate, altIdx,
state->counts, state->isAbsorbable);
}
@@ -6204,7 +6145,7 @@ nfa_advance_begin(WindowAggState *winstate, RPRNFAContext *ctx,
/* Optional group: create skip path (but don't route yet) */
if (elem->min == 0)
{
- skipState = nfa_state_create(winstate, elem->jump, state->altPriority,
+ skipState = nfa_state_create(winstate, elem->jump,
state->counts, state->isAbsorbable);
}
@@ -6286,21 +6227,15 @@ nfa_advance_end(WindowAggState *winstate, RPRNFAContext *ctx,
* Between min and max (with at least one iteration) - can exit or
* loop
*/
- RPRElemIdx exitAltPriority;
RPRNFAState *exitState;
RPRPatternElement *jumpElem;
RPRPatternElement *nextElem;
- /* Preserve altPriority for greedy extension */
- exitAltPriority = state->altPriority;
- if (ctx->matchedState != NULL)
- exitAltPriority = ctx->matchedState->altPriority;
-
/*
* Create exit state first (need original counts before modifying
* state)
*/
- exitState = nfa_state_create(winstate, elem->next, exitAltPriority,
+ exitState = nfa_state_create(winstate, elem->next,
state->counts, state->isAbsorbable);
exitState->counts[depth] = 0;
nextElem = &elements[exitState->elemIdx];
@@ -6349,7 +6284,7 @@ nfa_advance_var(WindowAggState *winstate, RPRNFAContext *ctx,
RPRNFAState *loopState;
RPRPatternElement *nextElem;
- loopState = nfa_state_create(winstate, state->elemIdx, state->altPriority,
+ loopState = nfa_state_create(winstate, state->elemIdx,
state->counts, state->isAbsorbable);
nfa_add_state_unique(winstate, ctx, loopState);
@@ -6358,6 +6293,20 @@ nfa_advance_var(WindowAggState *winstate, RPRNFAContext *ctx,
state->elemIdx = elem->next;
nextElem = &elements[state->elemIdx];
+ /*
+ * When a quantified VAR (e.g., A+) exits directly to an outer END,
+ * increment the END's iteration count. Simple VARs (min=max=1)
+ * handle this via inline advance in nfa_match, but quantified VARs
+ * bypass that path. Only increment when the VAR actually consumed
+ * rows (count > 0) to preserve cycle prevention for zero-progress
+ * loops.
+ */
+ if (RPRElemIsEnd(nextElem) && count > 0)
+ {
+ if (state->counts[nextElem->depth] < RPR_COUNT_MAX)
+ state->counts[nextElem->depth]++;
+ }
+
nfa_route_to_elem(winstate, ctx, state, nextElem, currentPos, initialAdvance);
}
else if (canLoop)
@@ -6374,6 +6323,13 @@ nfa_advance_var(WindowAggState *winstate, RPRNFAContext *ctx,
state->elemIdx = elem->next;
nextElem = &elements[state->elemIdx];
+ /* See comment above: increment outer END count for quantified VARs */
+ if (RPRElemIsEnd(nextElem) && count > 0)
+ {
+ if (state->counts[nextElem->depth] < RPR_COUNT_MAX)
+ state->counts[nextElem->depth]++;
+ }
+
nfa_route_to_elem(winstate, ctx, state, nextElem, currentPos, initialAdvance);
}
}
@@ -6382,9 +6338,7 @@ nfa_advance_var(WindowAggState *winstate, RPRNFAContext *ctx,
* nfa_advance_state
*
* Recursively process a single state through epsilon transitions.
- * Uses DFS traversal to maintain lexical order: lower altPriority paths
- * are fully processed before higher altPriority paths, ensuring states
- * are added to ctx->states in lexical order.
+ * DFS traversal ensures states are added to ctx->states in lexical order.
*/
static void
nfa_advance_state(WindowAggState *winstate, RPRNFAContext *ctx,
@@ -6443,13 +6397,26 @@ nfa_advance(WindowAggState *winstate, RPRNFAContext *ctx, int64 currentPos,
ctx->states = NULL; /* Will rebuild */
- /* Process each state in order */
+ /* Process each state in lexical order (DFS order from previous advance) */
while (states != NULL)
{
+ RPRNFAState *savedMatchedState = ctx->matchedState;
+
state = states;
states = states->next;
state->next = NULL;
nfa_advance_state(winstate, ctx, state, currentPos, initialAdvance);
+
+ /*
+ * Early termination: if a FIN was newly reached in this advance,
+ * remaining old states have worse lexical order and can be pruned.
+ * Only check for new FIN arrivals (not ones from previous rows).
+ */
+ if (ctx->matchedState != savedMatchedState && states != NULL)
+ {
+ nfa_state_free_list(winstate, states);
+ break;
+ }
}
}
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index cd6f794f62b..9a1561fce7e 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -2523,21 +2523,14 @@ typedef enum WindowAggStatus
* RPRNFAState - single NFA state for pattern matching
*
* counts[] tracks repetition counts at each nesting depth.
- * altPriority tracks lexical order for alternation (lower = earlier alternative).
*
* isAbsorbable tracks if state is in absorbable region (ABSORBABLE_BRANCH).
* Monotonic property: once false, stays false (can't re-enter region).
- *
- * Absorption comparison uses elemIdx and counts[depth] directly because:
- * - VAR elements consume a row, forcing states to wait for next row
- * - END loop puts states at group start with iteration count preserved
- * - At row end, comparable states are at the same position (elemIdx)
*/
typedef struct RPRNFAState
{
struct RPRNFAState *next; /* next state in linked list */
int16 elemIdx; /* current pattern element index */
- int16 altPriority; /* lexical order priority (lower = preferred) */
bool isAbsorbable; /* true if state is in absorbable region */
int32 counts[FLEXIBLE_ARRAY_MEMBER]; /* repetition counts by depth */
} RPRNFAState;
diff --git a/src/test/regress/expected/rpr_base.out b/src/test/regress/expected/rpr_base.out
index 1d42af15980..d317ec70c8d 100644
--- a/src/test/regress/expected/rpr_base.out
+++ b/src/test/regress/expected/rpr_base.out
@@ -3703,7 +3703,7 @@ WINDOW w AS (
ORDER BY id;
id | val | cnt
----+-----+-----
- 1 | 10 | 0
+ 1 | 10 | 3
2 | 20 | 0
3 | 30 | 0
4 | 40 | 0
diff --git a/src/test/regress/expected/rpr_explain.out b/src/test/regress/expected/rpr_explain.out
index bfe4ee2ba7d..dacb8d6907f 100644
--- a/src/test/regress/expected/rpr_explain.out
+++ b/src/test/regress/expected/rpr_explain.out
@@ -676,6 +676,46 @@ WINDOW w AS (
-> Function Scan on generate_series s (actual rows=100.00 loops=1)
(9 rows)
+DROP VIEW rpr_v;
+-- Early termination: first ALT branch (A) reaches FIN immediately,
+-- pruning second branch (A B+) before it can accumulate B repetitions.
+CREATE TEMP VIEW rpr_v AS
+SELECT count(*) OVER w
+FROM generate_series(1, 100) AS s(v)
+WINDOW w AS (
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN ((A | A B)+)
+ DEFINE A AS v = 1, B AS v > 1
+);
+SELECT line FROM unnest(string_to_array(pg_get_viewdef('rpr_v'), E'\n')) AS line WHERE line ~ 'PATTERN';
+ line
+-------------------------
+ PATTERN ((a | a b)+)
+(1 row)
+
+SELECT rpr_explain_filter('
+EXPLAIN (ANALYZE, BUFFERS OFF, COSTS OFF, TIMING OFF, SUMMARY OFF)
+SELECT count(*) OVER w
+FROM generate_series(1, 100) AS s(v)
+WINDOW w AS (
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN ((A | A B)+)
+ DEFINE A AS v = 1, B AS v > 1
+);');
+ rpr_explain_filter
+-----------------------------------------------------------------------
+ WindowAgg (actual rows=100.00 loops=1)
+ Window: w AS (ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
+ Pattern: (a | a b)+
+ Storage: Memory Maximum Storage: NkB
+ NFA States: 5 peak, 204 total, 0 merged
+ NFA Contexts: 3 peak, 101 total, 99 pruned
+ NFA: 1 matched (len 1/1/1.0), 0 mismatched
+ -> Function Scan on generate_series s (actual rows=100.00 loops=1)
+(8 rows)
+
DROP VIEW rpr_v;
-- Nested quantifiers causing state growth
CREATE TEMP VIEW rpr_v AS
diff --git a/src/test/regress/expected/rpr_nfa.out b/src/test/regress/expected/rpr_nfa.out
index 46a463c2597..7d38d62ac31 100644
--- a/src/test/regress/expected/rpr_nfa.out
+++ b/src/test/regress/expected/rpr_nfa.out
@@ -1514,6 +1514,34 @@ WINDOW w AS (
4 | {_,_} | |
(4 rows)
+-- ALT lexical order takes priority over greedy (longer match).
+-- Row 1 matches both A and B; A wins by lexical order (match 1-1).
+WITH test_alt_lexical_order AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A','B']), -- A and B both match
+ (2, ARRAY['_','C']) -- only C matches (would continue B C)
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_alt_lexical_order
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN ((A | B C)+)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags),
+ C AS 'C' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A,B} | 1 | 1
+ 2 | {_,C} | |
+(2 rows)
+
-- ============================================================
-- Deep Nested Groups
-- ============================================================
@@ -2403,66 +2431,6 @@ WINDOW w AS (
-- ============================================================
-- FIXME Issues - Known Limitations
-- ============================================================
--- FIXME 1 - altPriority lexical order
-WITH test_alt_priority_repeated AS (
- SELECT * FROM (VALUES
- (1, ARRAY['A','B']), -- Both A and B match
- (2, ARRAY['A','B']),
- (3, ARRAY['A','B'])
- ) AS t(id, flags)
-)
-SELECT id, flags,
- first_value(id) OVER w AS match_start,
- last_value(id) OVER w AS match_end
-FROM test_alt_priority_repeated
-WINDOW w AS (
- ORDER BY id
- ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
- AFTER MATCH SKIP TO NEXT ROW
- PATTERN ((A | B)+)
- DEFINE
- A AS 'A' = ANY(flags),
- B AS 'B' = ANY(flags)
-);
- id | flags | match_start | match_end
-----+-------+-------------+-----------
- 1 | {A,B} | 1 | 3
- 2 | {A,B} | 2 | 3
- 3 | {A,B} | 3 | 3
-(3 rows)
-
--- FIXME 1 - Nested ALT lexical order
-WITH test_alt_priority_nested AS (
- SELECT * FROM (VALUES
- (1, ARRAY['A','B']),
- (2, ARRAY['C','D']),
- (3, ARRAY['A','B']),
- (4, ARRAY['C','D'])
- ) AS t(id, flags)
-)
-SELECT id, flags,
- first_value(id) OVER w AS match_start,
- last_value(id) OVER w AS match_end
-FROM test_alt_priority_nested
-WINDOW w AS (
- ORDER BY id
- ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
- AFTER MATCH SKIP TO NEXT ROW
- PATTERN (((A | B) (C | D))+)
- DEFINE
- A AS 'A' = ANY(flags),
- B AS 'B' = ANY(flags),
- C AS 'C' = ANY(flags),
- D AS 'D' = ANY(flags)
-);
- id | flags | match_start | match_end
-----+-------+-------------+-----------
- 1 | {A,B} | 1 | 4
- 2 | {C,D} | |
- 3 | {A,B} | 3 | 4
- 4 | {C,D} | |
-(4 rows)
-
-- FIXME 2 - Cycle prevention at count > 0
WITH test_cycle_nonzero AS (
SELECT * FROM (VALUES
@@ -2516,8 +2484,8 @@ WINDOW w AS (
);
id | flags | match_start | match_end
----+-------+-------------+-----------
- 1 | {A} | 1 | 2
- 2 | {B} | 2 | 2
+ 1 | {A} | 1 | 3
+ 2 | {B} | 2 | 3
3 | {A} | 3 | 3
4 | {C} | |
(4 rows)
diff --git a/src/test/regress/sql/rpr_explain.sql b/src/test/regress/sql/rpr_explain.sql
index ea8c39b991e..d017a2292d2 100644
--- a/src/test/regress/sql/rpr_explain.sql
+++ b/src/test/regress/sql/rpr_explain.sql
@@ -451,6 +451,30 @@ WINDOW w AS (
);');
DROP VIEW rpr_v;
+-- Early termination: first ALT branch (A) reaches FIN immediately,
+-- pruning second branch (A B+) before it can accumulate B repetitions.
+CREATE TEMP VIEW rpr_v AS
+SELECT count(*) OVER w
+FROM generate_series(1, 100) AS s(v)
+WINDOW w AS (
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN ((A | A B)+)
+ DEFINE A AS v = 1, B AS v > 1
+);
+SELECT line FROM unnest(string_to_array(pg_get_viewdef('rpr_v'), E'\n')) AS line WHERE line ~ 'PATTERN';
+SELECT rpr_explain_filter('
+EXPLAIN (ANALYZE, BUFFERS OFF, COSTS OFF, TIMING OFF, SUMMARY OFF)
+SELECT count(*) OVER w
+FROM generate_series(1, 100) AS s(v)
+WINDOW w AS (
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN ((A | A B)+)
+ DEFINE A AS v = 1, B AS v > 1
+);');
+DROP VIEW rpr_v;
+
-- Nested quantifiers causing state growth
CREATE TEMP VIEW rpr_v AS
SELECT count(*) OVER w
diff --git a/src/test/regress/sql/rpr_nfa.sql b/src/test/regress/sql/rpr_nfa.sql
index 9573d1dab3b..e9894cfa691 100644
--- a/src/test/regress/sql/rpr_nfa.sql
+++ b/src/test/regress/sql/rpr_nfa.sql
@@ -1079,6 +1079,29 @@ WINDOW w AS (
D AS 'D' = ANY(flags)
);
+-- ALT lexical order takes priority over greedy (longer match).
+-- Row 1 matches both A and B; A wins by lexical order (match 1-1).
+WITH test_alt_lexical_order AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A','B']), -- A and B both match
+ (2, ARRAY['_','C']) -- only C matches (would continue B C)
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_alt_lexical_order
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN ((A | B C)+)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags),
+ C AS 'C' = ANY(flags)
+);
+
-- ============================================================
-- Deep Nested Groups
-- ============================================================
@@ -1772,53 +1795,6 @@ WINDOW w AS (
-- FIXME Issues - Known Limitations
-- ============================================================
--- FIXME 1 - altPriority lexical order
-WITH test_alt_priority_repeated AS (
- SELECT * FROM (VALUES
- (1, ARRAY['A','B']), -- Both A and B match
- (2, ARRAY['A','B']),
- (3, ARRAY['A','B'])
- ) AS t(id, flags)
-)
-SELECT id, flags,
- first_value(id) OVER w AS match_start,
- last_value(id) OVER w AS match_end
-FROM test_alt_priority_repeated
-WINDOW w AS (
- ORDER BY id
- ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
- AFTER MATCH SKIP TO NEXT ROW
- PATTERN ((A | B)+)
- DEFINE
- A AS 'A' = ANY(flags),
- B AS 'B' = ANY(flags)
-);
-
--- FIXME 1 - Nested ALT lexical order
-WITH test_alt_priority_nested AS (
- SELECT * FROM (VALUES
- (1, ARRAY['A','B']),
- (2, ARRAY['C','D']),
- (3, ARRAY['A','B']),
- (4, ARRAY['C','D'])
- ) AS t(id, flags)
-)
-SELECT id, flags,
- first_value(id) OVER w AS match_start,
- last_value(id) OVER w AS match_end
-FROM test_alt_priority_nested
-WINDOW w AS (
- ORDER BY id
- ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
- AFTER MATCH SKIP TO NEXT ROW
- PATTERN (((A | B) (C | D))+)
- DEFINE
- A AS 'A' = ANY(flags),
- B AS 'B' = ANY(flags),
- C AS 'C' = ANY(flags),
- D AS 'D' = ANY(flags)
-);
-
-- FIXME 2 - Cycle prevention at count > 0
WITH test_cycle_nonzero AS (
SELECT * FROM (VALUES
--
2.50.1 (Apple Git-155)
[application/octet-stream] nocfbot-0001-rpr-explain-32bit-fix.patch (4.2K, 4-nocfbot-0001-rpr-explain-32bit-fix.patch)
download | inline diff:
From 0f20725c68bf5a6662245ab3ef6018830ebee70b Mon Sep 17 00:00:00 2001
From: Henson Choi <[email protected]>
Date: Mon, 16 Feb 2026 23:57:50 +0900
Subject: [PATCH 1/6] Normalize Sort Method memory in rpr_explain test for
32-bit compatibility
---
src/test/regress/expected/rpr_explain.out | 13 +++++++++----
src/test/regress/sql/rpr_explain.sql | 9 +++++++--
2 files changed, 16 insertions(+), 6 deletions(-)
diff --git a/src/test/regress/expected/rpr_explain.out b/src/test/regress/expected/rpr_explain.out
index f23d06f6d59..bfe4ee2ba7d 100644
--- a/src/test/regress/expected/rpr_explain.out
+++ b/src/test/regress/expected/rpr_explain.out
@@ -33,7 +33,7 @@
-- DEFINE Expression Variations
-- Large Scale Statistics Verification
-- ============================================================
--- Filter function to normalize Storage memory values only (not NFA statistics).
+-- Filter function to normalize platform-dependent memory values (not NFA statistics).
-- NFA statistics should not change between platforms; if they do, it could
-- indicate issues such as uninitialized memory access.
-- Works for text, JSON, and XML formats.
@@ -45,7 +45,7 @@ declare
begin
for ln in execute $1
loop
- -- Normalize memory size in Storage line only (platform-dependent)
+ -- Normalize platform-dependent memory values
-- Keep NFA statistics numbers unchanged (they are test assertions)
-- Text format: "Storage: Memory Maximum Storage: 18kB"
@@ -63,6 +63,11 @@ begin
ln := regexp_replace(ln, '<Maximum-Storage>\d+</Maximum-Storage>', '<Maximum-Storage>0</Maximum-Storage>', 'g');
end if;
+ -- Sort Method memory is platform-dependent (32-bit vs 64-bit)
+ if ln ~ 'Sort Method:.*Memory:' then
+ ln := regexp_replace(ln, 'Memory: \d+kB', 'Memory: NkB');
+ end if;
+
return next ln;
end loop;
end;
@@ -1625,7 +1630,7 @@ WINDOW w AS (
NFA: 54 absorbed (len 1/1/1.0), 18 skipped (len 1/1/1.0)
-> Sort (actual rows=90.00 loops=1)
Sort Key: p.p
- Sort Method: quicksort Memory: 27kB
+ Sort Method: quicksort Memory: NkB
-> Nested Loop (actual rows=90.00 loops=1)
-> Function Scan on generate_series p (actual rows=3.00 loops=1)
-> Function Scan on generate_series v (actual rows=30.00 loops=3)
@@ -1682,7 +1687,7 @@ WINDOW w AS (
NFA: 19 absorbed (len 1/1/1.0), 5 skipped (len 1/1/1.0)
-> Sort (actual rows=50.00 loops=1)
Sort Key: (CASE WHEN (v.v <= 25) THEN 1 ELSE 2 END)
- Sort Method: quicksort Memory: 26kB
+ Sort Method: quicksort Memory: NkB
-> Function Scan on generate_series v (actual rows=50.00 loops=1)
(12 rows)
diff --git a/src/test/regress/sql/rpr_explain.sql b/src/test/regress/sql/rpr_explain.sql
index f8c8f62e594..ea8c39b991e 100644
--- a/src/test/regress/sql/rpr_explain.sql
+++ b/src/test/regress/sql/rpr_explain.sql
@@ -34,7 +34,7 @@
-- Large Scale Statistics Verification
-- ============================================================
--- Filter function to normalize Storage memory values only (not NFA statistics).
+-- Filter function to normalize platform-dependent memory values (not NFA statistics).
-- NFA statistics should not change between platforms; if they do, it could
-- indicate issues such as uninitialized memory access.
-- Works for text, JSON, and XML formats.
@@ -46,7 +46,7 @@ declare
begin
for ln in execute $1
loop
- -- Normalize memory size in Storage line only (platform-dependent)
+ -- Normalize platform-dependent memory values
-- Keep NFA statistics numbers unchanged (they are test assertions)
-- Text format: "Storage: Memory Maximum Storage: 18kB"
@@ -64,6 +64,11 @@ begin
ln := regexp_replace(ln, '<Maximum-Storage>\d+</Maximum-Storage>', '<Maximum-Storage>0</Maximum-Storage>', 'g');
end if;
+ -- Sort Method memory is platform-dependent (32-bit vs 64-bit)
+ if ln ~ 'Sort Method:.*Memory:' then
+ ln := regexp_replace(ln, 'Memory: \d+kB', 'Memory: NkB');
+ end if;
+
return next ln;
end loop;
end;
--
2.50.1 (Apple Git-155)
[application/octet-stream] nocfbot-0005-cycle-detection-bitmap.patch (6.9K, 5-nocfbot-0005-cycle-detection-bitmap.patch)
download | inline diff:
From 39fc31eca0cee7fe874aa4e0abfc7ebaaa69bc1e Mon Sep 17 00:00:00 2001
From: Henson Choi <[email protected]>
Date: Mon, 23 Feb 2026 13:02:06 +0900
Subject: [PATCH 5/6] Detect zero-consumption NFA cycles using elemIdx visited
bitmap
---
src/backend/executor/nodeWindowAgg.c | 45 ++++++++++++++++-----------
src/include/nodes/execnodes.h | 4 +++
src/test/regress/expected/rpr_nfa.out | 8 ++---
src/test/regress/sql/rpr_nfa.sql | 8 ++---
4 files changed, 38 insertions(+), 27 deletions(-)
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 371439aad91..98546fb17f7 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -343,6 +343,10 @@ static void nfa_advance(WindowAggState *winstate, RPRNFAContext *ctx,
/* calculate shift bits */
#define NN_SHIFT(pos) ((pos) % NN_ITEM_PER_VAR) * NN_BITS_PER_MEMBER
+/* Bitmap macros for NFA cycle detection (cf. bitmapset.c, tidbitmap.c) */
+#define WORDNUM(x) ((x) / BITS_PER_BITMAPWORD)
+#define BITNUM(x) ((x) % BITS_PER_BITMAPWORD)
+
/*
* initialize_windowaggregate
* parallel to initialize_aggregates in nodeAgg.c
@@ -3016,11 +3020,15 @@ ExecInitWindowAgg(WindowAgg *node, EState *estate, int eflags)
/* Set up row pattern recognition PATTERN clause (compiled NFA) */
winstate->rpPattern = node->rpPattern;
- /* Calculate NFA state size for allocation */
+ /* Calculate NFA state size and allocate cycle detection bitmap */
if (node->rpPattern != NULL)
{
winstate->nfaStateSize = offsetof(RPRNFAState, counts) +
sizeof(int32) * node->rpPattern->maxDepth;
+ winstate->nfaVisitedNWords =
+ (node->rpPattern->numElements - 1) / BITS_PER_BITMAPWORD + 1;
+ winstate->nfaVisitedElems = palloc0(sizeof(bitmapword) *
+ winstate->nfaVisitedNWords);
}
/* Set up row pattern recognition DEFINE clause */
@@ -6214,25 +6222,9 @@ nfa_advance_end(WindowAggState *winstate, RPRNFAContext *ctx,
nfa_route_to_elem(winstate, ctx, state, jumpElem, currentPos, initialAdvance);
}
- else if ((elem->max != RPR_QUANTITY_INF && count >= elem->max) ||
- (count == 0 && elem->min == 0))
+ else if (elem->max != RPR_QUANTITY_INF && count >= elem->max)
{
- /*----------
- * Must exit: either reached max iterations, or group matched empty.
- *
- * FIXME: The (count == 0 && min == 0) condition is insufficient for
- * cycle prevention. Cycles can occur at any count value when loop back
- * happens without consuming rows. For example:
- * Pattern: (A*)*
- * After matching 3 A's (count=3), loop back at a B row
- * Inner A* matches 0 times (skip path) → same (elemIdx, count=3)
- * Infinite cycle at count=3, not count=0
- *
- * Currently, cycles are silently prevented by nfa_add_state_unique
- * detecting duplicate states, but this is implicit and not guaranteed
- * for all code paths. Explicit cycle detection is needed.
- *----------
- */
+ /* Must exit: reached max iterations. */
RPRPatternElement *nextElem;
state->counts[depth] = 0;
@@ -6452,6 +6444,17 @@ nfa_advance_state(WindowAggState *winstate, RPRNFAContext *ctx,
RPRPatternElement *elem;
Assert(state->elemIdx >= 0 && state->elemIdx < pattern->numElements);
+
+ /* Cycle detection: if this elemIdx was already visited in this DFS, bail */
+ if (winstate->nfaVisitedElems[WORDNUM(state->elemIdx)] &
+ ((bitmapword) 1 << BITNUM(state->elemIdx)))
+ {
+ nfa_state_free(winstate, state);
+ return;
+ }
+ winstate->nfaVisitedElems[WORDNUM(state->elemIdx)] |=
+ ((bitmapword) 1 << BITNUM(state->elemIdx));
+
elem = &pattern->elements[state->elemIdx];
switch (elem->varId)
@@ -6506,6 +6509,10 @@ nfa_advance(WindowAggState *winstate, RPRNFAContext *ctx, int64 currentPos,
{
RPRNFAState *savedMatchedState = ctx->matchedState;
+ /* Clear visited bitmap before each state's DFS expansion */
+ memset(winstate->nfaVisitedElems, 0,
+ sizeof(bitmapword) * winstate->nfaVisitedNWords);
+
state = states;
states = states->next;
state->next = NULL;
diff --git a/src/include/nodes/execnodes.h b/src/include/nodes/execnodes.h
index 9a1561fce7e..3681d905bde 100644
--- a/src/include/nodes/execnodes.h
+++ b/src/include/nodes/execnodes.h
@@ -2651,6 +2651,10 @@ typedef struct WindowAggState
Size nfaStateSize; /* pre-calculated RPRNFAState size */
bool *nfaVarMatched; /* per-row cache: varMatched[varId] for varId
* < numDefines */
+ bitmapword *nfaVisitedElems; /* elemIdx visited bitmap for cycle
+ * detection */
+ int nfaVisitedNWords; /* number of bitmapwords in
+ * nfaVisitedElems */
int64 nfaLastProcessedRow; /* last row processed by NFA (-1 =
* none) */
diff --git a/src/test/regress/expected/rpr_nfa.out b/src/test/regress/expected/rpr_nfa.out
index a2a257329b7..8ac375cfe86 100644
--- a/src/test/regress/expected/rpr_nfa.out
+++ b/src/test/regress/expected/rpr_nfa.out
@@ -30,7 +30,7 @@
-- Special Partition Cases
-- DEFINE Special Cases
-- Absorption Dynamic Flags
--- FIXME Issues (Known Limitations)
+-- Zero-Consumption Cycle Detection
--
-- Responsibility:
-- - NFA runtime execution paths
@@ -2945,9 +2945,9 @@ WINDOW w AS (
(5 rows)
-- ============================================================
--- FIXME Issues - Known Limitations
+-- Zero-Consumption Cycle Detection
-- ============================================================
--- FIXME 2 - Cycle prevention at count > 0
+-- Cycle prevention at count > 0: (A*)* inner skip cycles at count=3
WITH test_cycle_nonzero AS (
SELECT * FROM (VALUES
(1, ARRAY['A']),
@@ -2976,7 +2976,7 @@ WINDOW w AS (
4 | {B} | |
(4 rows)
--- FIXME 2 - Cycle with mixed nullables
+-- Cycle with mixed nullables: (A* B*)* multiple nullable paths
WITH test_cycle_mixed AS (
SELECT * FROM (VALUES
(1, ARRAY['A']),
diff --git a/src/test/regress/sql/rpr_nfa.sql b/src/test/regress/sql/rpr_nfa.sql
index 0e853ca753b..be78ab760a9 100644
--- a/src/test/regress/sql/rpr_nfa.sql
+++ b/src/test/regress/sql/rpr_nfa.sql
@@ -30,7 +30,7 @@
-- Special Partition Cases
-- DEFINE Special Cases
-- Absorption Dynamic Flags
--- FIXME Issues (Known Limitations)
+-- Zero-Consumption Cycle Detection
--
-- Responsibility:
-- - NFA runtime execution paths
@@ -2190,10 +2190,10 @@ WINDOW w AS (
);
-- ============================================================
--- FIXME Issues - Known Limitations
+-- Zero-Consumption Cycle Detection
-- ============================================================
--- FIXME 2 - Cycle prevention at count > 0
+-- Cycle prevention at count > 0: (A*)* inner skip cycles at count=3
WITH test_cycle_nonzero AS (
SELECT * FROM (VALUES
(1, ARRAY['A']),
@@ -2215,7 +2215,7 @@ WINDOW w AS (
A AS 'A' = ANY(flags)
);
--- FIXME 2 - Cycle with mixed nullables
+-- Cycle with mixed nullables: (A* B*)* multiple nullable paths
WITH test_cycle_mixed AS (
SELECT * FROM (VALUES
(1, ARRAY['A']),
--
2.50.1 (Apple Git-155)
[application/octet-stream] nocfbot-0002-prev-next-define-only.patch (4.2K, 6-nocfbot-0002-prev-next-define-only.patch)
download | inline diff:
From 9d0c2be032b4495183e374becaa27b9f2a51da4e Mon Sep 17 00:00:00 2001
From: Henson Choi <[email protected]>
Date: Thu, 19 Feb 2026 13:53:55 +0900
Subject: [PATCH 2/6] Restrict PREV/NEXT functions to DEFINE clause only
Per the SQL standard, PREV() and NEXT() are only allowed in a DEFINE
clause of row pattern recognition. Previously they could be used
anywhere normal functions are allowed. Add a check in ParseFuncOrColumn
to reject them outside EXPR_KIND_RPR_DEFINE context.
Patch by Tatsuo Ishii.
---
src/backend/parser/parse_func.c | 11 +++++++++
src/test/regress/expected/rpr_base.out | 32 ++++++++++++++++++++++++++
src/test/regress/sql/rpr_base.sql | 28 ++++++++++++++++++++++
3 files changed, 71 insertions(+)
diff --git a/src/backend/parser/parse_func.c b/src/backend/parser/parse_func.c
index 357b236a818..ba9523ae3d4 100644
--- a/src/backend/parser/parse_func.c
+++ b/src/backend/parser/parse_func.c
@@ -31,6 +31,7 @@
#include "parser/parse_target.h"
#include "parser/parse_type.h"
#include "utils/builtins.h"
+#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
#include "utils/syscache.h"
@@ -756,6 +757,16 @@ ParseFuncOrColumn(ParseState *pstate, List *funcname, List *fargs,
if (retset)
check_srf_call_placement(pstate, last_srf, location);
+ /* next() and prev() are only allowed in a WINDOW DEFINE clause */
+ if (fdresult == FUNCDETAIL_NORMAL &&
+ pstate->p_expr_kind != EXPR_KIND_RPR_DEFINE &&
+ (funcid == F_PREV || funcid == F_NEXT))
+ ereport(ERROR,
+ (errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("%s can only be used in a DEFINE clause",
+ NameListToString(funcname)),
+ parser_errposition(pstate, location)));
+
/* build the appropriate output structure */
if (fdresult == FUNCDETAIL_NORMAL || fdresult == FUNCDETAIL_PROCEDURE)
{
diff --git a/src/test/regress/expected/rpr_base.out b/src/test/regress/expected/rpr_base.out
index ec67a099ee6..1d42af15980 100644
--- a/src/test/regress/expected/rpr_base.out
+++ b/src/test/regress/expected/rpr_base.out
@@ -1622,6 +1622,38 @@ ORDER BY id;
5 | 30 | 0
(5 rows)
+-- PREV function cannot be used other than in DEFINE
+SELECT PREV(id), id, val, COUNT(*) OVER w as cnt
+FROM rpr_nav
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN (A B+)
+ DEFINE
+ A AS val > 0,
+ B AS val > PREV(val)
+)
+ORDER BY id;
+ERROR: prev can only be used in a DEFINE clause
+LINE 1: SELECT PREV(id), id, val, COUNT(*) OVER w as cnt
+ ^
+-- Expected: ERROR: prev can only be used in a DEFINE clause
+-- NEXT function cannot be used other than in DEFINE
+SELECT NEXT(id), id, val, COUNT(*) OVER w as cnt
+FROM rpr_nav
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN (A B+)
+ DEFINE
+ A AS val > 0,
+ B AS val > PREV(val)
+)
+ORDER BY id;
+ERROR: next can only be used in a DEFINE clause
+LINE 1: SELECT NEXT(id), id, val, COUNT(*) OVER w as cnt
+ ^
+-- Expected: ERROR: next can only be used in a DEFINE clause
DROP TABLE rpr_nav;
-- ============================================================
-- SKIP TO / INITIAL Tests
diff --git a/src/test/regress/sql/rpr_base.sql b/src/test/regress/sql/rpr_base.sql
index 879f5fe48a2..b752ad12873 100644
--- a/src/test/regress/sql/rpr_base.sql
+++ b/src/test/regress/sql/rpr_base.sql
@@ -1219,6 +1219,34 @@ WINDOW w AS (
)
ORDER BY id;
+-- PREV function cannot be used other than in DEFINE
+SELECT PREV(id), id, val, COUNT(*) OVER w as cnt
+FROM rpr_nav
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN (A B+)
+ DEFINE
+ A AS val > 0,
+ B AS val > PREV(val)
+)
+ORDER BY id;
+-- Expected: ERROR: prev can only be used in a DEFINE clause
+
+-- NEXT function cannot be used other than in DEFINE
+SELECT NEXT(id), id, val, COUNT(*) OVER w as cnt
+FROM rpr_nav
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN (A B+)
+ DEFINE
+ A AS val > 0,
+ B AS val > PREV(val)
+)
+ORDER BY id;
+-- Expected: ERROR: next can only be used in a DEFINE clause
+
DROP TABLE rpr_nav;
-- ============================================================
--
2.50.1 (Apple Git-155)
[application/octet-stream] nocfbot-0004-reluctant-quantifiers.patch (65.6K, 7-nocfbot-0004-reluctant-quantifiers.patch)
download | inline diff:
From 6032b704c3e2930c5f55fe03a881a1bb59bc7e14 Mon Sep 17 00:00:00 2001
From: Henson Choi <[email protected]>
Date: Fri, 20 Feb 2026 13:18:19 +0900
Subject: [PATCH 4/6] Implement reluctant quantifiers for row pattern
recognition
---
src/backend/executor/nodeWindowAgg.c | 170 ++++++--
src/backend/optimizer/plan/rpr.c | 33 +-
src/backend/parser/gram.y | 6 -
src/test/regress/expected/rpr.out | 51 +--
src/test/regress/expected/rpr_base.out | 278 ++++++++++---
src/test/regress/expected/rpr_nfa.out | 516 +++++++++++++++++++++++++
src/test/regress/sql/rpr.sql | 25 +-
src/test/regress/sql/rpr_base.sql | 87 ++++-
src/test/regress/sql/rpr_nfa.sql | 398 +++++++++++++++++++
9 files changed, 1428 insertions(+), 136 deletions(-)
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 96b2cd4b9c4..371439aad91 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -6149,16 +6149,40 @@ nfa_advance_begin(WindowAggState *winstate, RPRNFAContext *ctx,
state->counts, state->isAbsorbable);
}
- /* Enter group: route to first child (lexically first) */
- state->elemIdx = elem->next;
- nfa_route_to_elem(winstate, ctx, state,
- &elements[state->elemIdx], currentPos, initialAdvance);
-
- /* Now route skip path (lexically second) */
- if (skipState != NULL)
+ if (skipState != NULL && RPRElemIsReluctant(elem))
{
+ RPRNFAState *savedMatch = ctx->matchedState;
+
+ /* Reluctant: skip first (prefer fewer iterations), enter second */
nfa_route_to_elem(winstate, ctx, skipState,
&elements[elem->jump], currentPos, initialAdvance);
+
+ /*
+ * If skip path reached FIN, shortest match is found. Skip group entry
+ * to prevent longer matches.
+ */
+ if (ctx->matchedState != savedMatch)
+ {
+ nfa_state_free(winstate, state);
+ return;
+ }
+
+ state->elemIdx = elem->next;
+ nfa_route_to_elem(winstate, ctx, state,
+ &elements[state->elemIdx], currentPos, initialAdvance);
+ }
+ else
+ {
+ /* Greedy: enter first, skip second */
+ state->elemIdx = elem->next;
+ nfa_route_to_elem(winstate, ctx, state,
+ &elements[state->elemIdx], currentPos, initialAdvance);
+
+ if (skipState != NULL)
+ {
+ nfa_route_to_elem(winstate, ctx, skipState,
+ &elements[elem->jump], currentPos, initialAdvance);
+ }
}
}
@@ -6225,7 +6249,8 @@ nfa_advance_end(WindowAggState *winstate, RPRNFAContext *ctx,
{
/*
* Between min and max (with at least one iteration) - can exit or
- * loop
+ * loop. Greedy: loop first (prefer more iterations). Reluctant: exit
+ * first (prefer fewer iterations).
*/
RPRNFAState *exitState;
RPRPatternElement *jumpElem;
@@ -6244,16 +6269,43 @@ nfa_advance_end(WindowAggState *winstate, RPRNFAContext *ctx,
if (RPRElemIsEnd(nextElem) && exitState->counts[nextElem->depth] < RPR_COUNT_MAX)
exitState->counts[nextElem->depth]++;
- /* Route loop state first (earlier in pattern = lexical order) */
+ /* Prepare loop state */
for (int d = depth + 1; d < pattern->maxDepth; d++)
state->counts[d] = 0;
state->elemIdx = elem->jump;
jumpElem = &elements[state->elemIdx];
- nfa_route_to_elem(winstate, ctx, state, jumpElem, currentPos, initialAdvance);
+ if (RPRElemIsReluctant(elem))
+ {
+ RPRNFAState *savedMatch = ctx->matchedState;
+
+ /* Exit first (preferred for reluctant) */
+ nfa_route_to_elem(winstate, ctx, exitState, nextElem,
+ currentPos, initialAdvance);
- /* Then route exit state */
- nfa_route_to_elem(winstate, ctx, exitState, nextElem, currentPos, initialAdvance);
+ /*
+ * If exit path reached FIN, shortest match is found. Skip loop to
+ * prevent longer matches from replacing it.
+ */
+ if (ctx->matchedState != savedMatch)
+ {
+ nfa_state_free(winstate, state);
+ return;
+ }
+
+ /* Loop second */
+ nfa_route_to_elem(winstate, ctx, state, jumpElem,
+ currentPos, initialAdvance);
+ }
+ else
+ {
+ /* Loop first (preferred for greedy) */
+ nfa_route_to_elem(winstate, ctx, state, jumpElem,
+ currentPos, initialAdvance);
+ /* Exit second */
+ nfa_route_to_elem(winstate, ctx, exitState, nextElem,
+ currentPos, initialAdvance);
+ }
}
}
@@ -6280,34 +6332,86 @@ nfa_advance_var(WindowAggState *winstate, RPRNFAContext *ctx,
if (canLoop && canExit)
{
- /* Both: clone for loop, modify original for exit */
- RPRNFAState *loopState;
+ /*
+ * Both loop and exit possible. Greedy: loop first (prefer longer
+ * match). Reluctant: exit first (prefer shorter match).
+ */
+ RPRNFAState *cloneState;
RPRPatternElement *nextElem;
-
- loopState = nfa_state_create(winstate, state->elemIdx,
- state->counts, state->isAbsorbable);
- nfa_add_state_unique(winstate, ctx, loopState);
-
- /* Exit: advance to next element */
- state->counts[depth] = 0;
- state->elemIdx = elem->next;
- nextElem = &elements[state->elemIdx];
+ bool reluctant = RPRElemIsReluctant(elem);
/*
- * When a quantified VAR (e.g., A+) exits directly to an outer END,
- * increment the END's iteration count. Simple VARs (min=max=1)
- * handle this via inline advance in nfa_match, but quantified VARs
- * bypass that path. Only increment when the VAR actually consumed
- * rows (count > 0) to preserve cycle prevention for zero-progress
- * loops.
+ * Clone state for the second-priority path. For greedy, clone is the
+ * loop state; for reluctant, clone is the exit state.
*/
- if (RPRElemIsEnd(nextElem) && count > 0)
+ if (reluctant)
{
- if (state->counts[nextElem->depth] < RPR_COUNT_MAX)
- state->counts[nextElem->depth]++;
+ RPRNFAState *savedMatch = ctx->matchedState;
+
+ /* Clone for exit, original stays for loop */
+ cloneState = nfa_state_create(winstate, elem->next,
+ state->counts, state->isAbsorbable);
+ cloneState->counts[depth] = 0;
+ nextElem = &elements[cloneState->elemIdx];
+
+ /*
+ * When exiting directly to an outer END, increment the END's
+ * iteration count.
+ */
+ if (RPRElemIsEnd(nextElem) && count > 0)
+ {
+ if (cloneState->counts[nextElem->depth] < RPR_COUNT_MAX)
+ cloneState->counts[nextElem->depth]++;
+ }
+
+ /* Exit first (preferred for reluctant) */
+ nfa_route_to_elem(winstate, ctx, cloneState, nextElem,
+ currentPos, initialAdvance);
+
+ /*
+ * If exit path reached FIN, the shortest match is found. Skip
+ * loop state to prevent longer matches from replacing it.
+ */
+ if (ctx->matchedState != savedMatch)
+ {
+ nfa_state_free(winstate, state);
+ return;
+ }
+
+ /* Loop second */
+ nfa_add_state_unique(winstate, ctx, state);
}
+ else
+ {
+ /* Clone for loop, original used for exit */
+ cloneState = nfa_state_create(winstate, state->elemIdx,
+ state->counts, state->isAbsorbable);
- nfa_route_to_elem(winstate, ctx, state, nextElem, currentPos, initialAdvance);
+ /* Loop first (preferred for greedy) */
+ nfa_add_state_unique(winstate, ctx, cloneState);
+
+ /* Exit second */
+ state->counts[depth] = 0;
+ state->elemIdx = elem->next;
+ nextElem = &elements[state->elemIdx];
+
+ /*
+ * When exiting directly to an outer END, increment the END's
+ * iteration count. Simple VARs (min=max=1) handle this via
+ * inline advance in nfa_match, but quantified VARs bypass that
+ * path. Only increment when the VAR actually consumed rows
+ * (count > 0) to preserve cycle prevention for zero-progress
+ * loops.
+ */
+ if (RPRElemIsEnd(nextElem) && count > 0)
+ {
+ if (state->counts[nextElem->depth] < RPR_COUNT_MAX)
+ state->counts[nextElem->depth]++;
+ }
+
+ nfa_route_to_elem(winstate, ctx, state, nextElem,
+ currentPos, initialAdvance);
+ }
}
else if (canLoop)
{
diff --git a/src/backend/optimizer/plan/rpr.c b/src/backend/optimizer/plan/rpr.c
index 112ed034fe2..987f595d434 100644
--- a/src/backend/optimizer/plan/rpr.c
+++ b/src/backend/optimizer/plan/rpr.c
@@ -835,21 +835,44 @@ tryMultiplyQuantifiers(RPRPatternNode *pattern)
* Examples:
* (A){1,1} -> A
* (A B){1,1} -> SEQ(A, B) (unwraps the inner SEQ)
+ * (A)? -> A? (propagate quantifier to single VAR child)
+ * (A)+? -> A+? (propagate quantifier including reluctant)
*
- * If GROUP has min=1, max=1, and is not reluctant, return the child directly.
- * Otherwise returns the pattern unchanged.
+ * If GROUP has min=1, max=1, return the child directly (reluctant on
+ * {1,1} is meaningless). If GROUP has a single VAR child with default
+ * quantifier {1,1}, propagate the GROUP's quantifier to the child and
+ * unwrap. Otherwise returns the pattern unchanged.
*
* Note: Parser always creates GROUP with exactly one child via list_make1().
*/
static RPRPatternNode *
tryUnwrapGroup(RPRPatternNode *pattern)
{
- if (pattern->min != 1 || pattern->max != 1 || pattern->reluctant >= 0)
- return pattern;
+ RPRPatternNode *child;
/* Parser always creates GROUP with single child */
Assert(list_length(pattern->children) == 1);
- return (RPRPatternNode *) linitial(pattern->children);
+
+ child = (RPRPatternNode *) linitial(pattern->children);
+
+ /* GROUP{1,1}: unwrap directly (reluctant on {1,1} is meaningless) */
+ if (pattern->min == 1 && pattern->max == 1)
+ return child;
+
+ /*
+ * Single VAR child with default {1,1}: propagate GROUP's quantifier to
+ * the child and unwrap. E.g., (A)?? -> A??, (A)+? -> A+?
+ */
+ if (child->nodeType == RPR_PATTERN_VAR &&
+ child->min == 1 && child->max == 1 && child->reluctant < 0)
+ {
+ child->min = pattern->min;
+ child->max = pattern->max;
+ child->reluctant = pattern->reluctant;
+ return child;
+ }
+
+ return pattern;
}
/*
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index e6ae23f368b..5a3d141e992 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -20376,12 +20376,6 @@ makeRPRQuantifier(int min, int max, ParseLoc reluctant, int location,
{
RPRPatternNode *n = makeNode(RPRPatternNode);
- if (reluctant >= 0)
- ereport(ERROR,
- (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
- errmsg("reluctant quantifiers are not yet supported"),
- parser_errposition(reluctant)));
-
n->min = min;
n->max = max;
n->reluctant = reluctant;
diff --git a/src/test/regress/expected/rpr.out b/src/test/regress/expected/rpr.out
index c921badb006..8af44392700 100644
--- a/src/test/regress/expected/rpr.out
+++ b/src/test/regress/expected/rpr.out
@@ -2814,6 +2814,7 @@ ERROR: unsupported quantifier "~"
LINE 9: PATTERN (START UP~ DOWN+)
^
HINT: Valid quantifiers are: *, +, ?, *?, +?, ??, {n}, {n,}, {,m}, {n,m} and their reluctant versions.
+-- PREV(1) missing column reference (error)
SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
FROM stock
WINDOW w AS (
@@ -2828,9 +2829,7 @@ SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER
UP AS price > PREV(1),
DOWN AS price < PREV(1)
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 9: PATTERN (START UP+? DOWN+)
- ^
+ERROR: row pattern navigation operation's argument must include at least one column reference
-- Maximum pattern variables is 251 (RPR_VARID_MAX)
-- Error: 252 variables exceeds limit of 251
DO $$
@@ -3836,15 +3835,15 @@ WINDOW w AS (
-- Expected: 1-4 (A C B C)
-- ============================================
--- RELUCTANT quantifiers (not yet supported)
+-- RELUCTANT quantifiers
-- ============================================
--- Test: A+? B (reluctant) - parser rejects with ERROR
+-- Test: A+? B (reluctant) - B available at row 3, A exits early
WITH jacob_reluctant AS (
SELECT * FROM (VALUES
- (1, ARRAY['A']),
- (2, ARRAY['A']),
- (3, ARRAY['A']),
- (4, ARRAY['B'])
+ (1, ARRAY['A','_']),
+ (2, ARRAY['A','_']),
+ (3, ARRAY['A','B']),
+ (4, ARRAY['B','_'])
) AS t(id, flags)
)
SELECT id, flags, first_value(id) OVER w AS match_start, last_value(id) OVER w AS match_end
@@ -3858,17 +3857,21 @@ WINDOW w AS (
A AS 'A' = ANY(flags),
B AS 'B' = ANY(flags)
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 15: PATTERN (A+? B)
- ^
--- Expected: ERROR (reluctant quantifiers not yet supported)
--- Test: A{1,3}? B (reluctant bounded) - parser rejects with ERROR
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A,_} | 1 | 3
+ 2 | {A,_} | |
+ 3 | {A,B} | |
+ 4 | {B,_} | |
+(4 rows)
+
+-- Test: A{1,3}? B (reluctant bounded) - same data, bounded quantifier
WITH jacob_reluctant_bounded AS (
SELECT * FROM (VALUES
- (1, ARRAY['A']),
- (2, ARRAY['A']),
- (3, ARRAY['A']),
- (4, ARRAY['B'])
+ (1, ARRAY['A','_']),
+ (2, ARRAY['A','_']),
+ (3, ARRAY['A','B']),
+ (4, ARRAY['B','_'])
) AS t(id, flags)
)
SELECT id, flags, first_value(id) OVER w AS match_start, last_value(id) OVER w AS match_end
@@ -3882,10 +3885,14 @@ WINDOW w AS (
A AS 'A' = ANY(flags),
B AS 'B' = ANY(flags)
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 15: PATTERN (A{1,3}? B)
- ^
--- Expected: ERROR (reluctant quantifiers not yet supported)
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A,_} | 1 | 3
+ 2 | {A,_} | |
+ 3 | {A,B} | |
+ 4 | {B,_} | |
+(4 rows)
+
-- ============================================
-- Nested quantifiers (pathological patterns)
-- ============================================
diff --git a/src/test/regress/expected/rpr_base.out b/src/test/regress/expected/rpr_base.out
index d317ec70c8d..43eea32130f 100644
--- a/src/test/regress/expected/rpr_base.out
+++ b/src/test/regress/expected/rpr_base.out
@@ -1076,7 +1076,7 @@ ORDER BY id;
(10 rows)
DROP TABLE rpr_quant;
--- Reluctant quantifiers (not yet supported)
+-- Reluctant quantifiers
CREATE TABLE rpr_reluctant (id INT, val INT);
INSERT INTO rpr_reluctant VALUES (1, 10), (2, 20), (3, 30);
-- *? (zero or more, reluctant)
@@ -1088,10 +1088,14 @@ WINDOW w AS (
PATTERN (A*?)
DEFINE A AS val > 0
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 6: PATTERN (A*?)
- ^
--- Expected: ERROR: reluctant quantifiers are not yet supported
+ count
+-------
+ 1
+ 1
+ 1
+(3 rows)
+
+-- Reluctant quantifier: prefer shortest match
-- +? (one or more, reluctant)
SELECT COUNT(*) OVER w
FROM rpr_reluctant
@@ -1101,10 +1105,14 @@ WINDOW w AS (
PATTERN (A+?)
DEFINE A AS val > 0
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 6: PATTERN (A+?)
- ^
--- Expected: ERROR: reluctant quantifiers are not yet supported
+ count
+-------
+ 1
+ 1
+ 1
+(3 rows)
+
+-- Reluctant quantifier: prefer shortest match
-- ?? (zero or one, reluctant)
SELECT COUNT(*) OVER w
FROM rpr_reluctant
@@ -1114,10 +1122,14 @@ WINDOW w AS (
PATTERN (A??)
DEFINE A AS val > 0
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 6: PATTERN (A??)
- ^
--- Expected: ERROR: reluctant quantifiers are not yet supported
+ count
+-------
+ 1
+ 1
+ 1
+(3 rows)
+
+-- Reluctant quantifier: prefer shortest match
-- {n,}? (n or more, reluctant)
SELECT COUNT(*) OVER w
FROM rpr_reluctant
@@ -1127,10 +1139,14 @@ WINDOW w AS (
PATTERN (A{2,}?)
DEFINE A AS val > 0
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 6: PATTERN (A{2,}?)
- ^
--- Expected: ERROR: reluctant quantifiers are not yet supported
+ count
+-------
+ 2
+ 0
+ 0
+(3 rows)
+
+-- Reluctant quantifier: prefer shortest match
-- {n,m}? (n to m, reluctant)
SELECT COUNT(*) OVER w
FROM rpr_reluctant
@@ -1140,10 +1156,14 @@ WINDOW w AS (
PATTERN (A{1,3}?)
DEFINE A AS val > 0
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 6: PATTERN (A{1,3}?)
- ^
--- Expected: ERROR: reluctant quantifiers are not yet supported
+ count
+-------
+ 1
+ 1
+ 1
+(3 rows)
+
+-- Reluctant quantifier: prefer shortest match
-- {n}? (exactly n, reluctant)
SELECT COUNT(*) OVER w
FROM rpr_reluctant
@@ -1153,10 +1173,14 @@ WINDOW w AS (
PATTERN (A{2}?)
DEFINE A AS val > 0
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 6: PATTERN (A{2}?)
- ^
--- Expected: ERROR: reluctant quantifiers are not yet supported
+ count
+-------
+ 2
+ 0
+ 0
+(3 rows)
+
+-- Reluctant quantifier: prefer shortest match
-- {,m}? (up to m, reluctant) - COMPLETELY UNTESTED RULE!
SELECT COUNT(*) OVER w
FROM rpr_reluctant
@@ -1166,10 +1190,14 @@ WINDOW w AS (
PATTERN (A{,3}?)
DEFINE A AS val > 0
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 6: PATTERN (A{,3}?)
- ^
--- Expected: ERROR: reluctant quantifiers are not yet supported
+ count
+-------
+ 1
+ 1
+ 1
+(3 rows)
+
+-- Reluctant quantifier: prefer shortest match
-- Invalid reluctant patterns (wrong token after quantifier)
-- {2}+ (should be {2}? not {2}+)
SELECT COUNT(*) OVER w
@@ -1352,10 +1380,14 @@ WINDOW w AS (
PATTERN (A* ?)
DEFINE A AS val > 0
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 6: PATTERN (A* ?)
- ^
--- Expected: ERROR: reluctant quantifiers are not yet supported
+ count
+-------
+ 1
+ 1
+ 1
+(3 rows)
+
+-- Reluctant quantifier: prefer shortest match
-- + ? (token separated)
SELECT COUNT(*) OVER w
FROM rpr_reluctant
@@ -1365,10 +1397,14 @@ WINDOW w AS (
PATTERN (A+ ?)
DEFINE A AS val > 0
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 6: PATTERN (A+ ?)
- ^
--- Expected: ERROR: reluctant quantifiers are not yet supported
+ count
+-------
+ 1
+ 1
+ 1
+(3 rows)
+
+-- Reluctant quantifier: prefer shortest match
-- {2,} ? (token separated)
SELECT COUNT(*) OVER w
FROM rpr_reluctant
@@ -1378,10 +1414,14 @@ WINDOW w AS (
PATTERN (A{2,} ?)
DEFINE A AS val > 0
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 6: PATTERN (A{2,} ?)
- ^
--- Expected: ERROR: reluctant quantifiers are not yet supported
+ count
+-------
+ 2
+ 0
+ 0
+(3 rows)
+
+-- Reluctant quantifier: prefer shortest match
-- Invalid token combinations
-- * + (invalid combination)
SELECT COUNT(*) OVER w
@@ -1418,10 +1458,14 @@ WINDOW w AS (
PATTERN (A? ?)
DEFINE A AS val > 0
);
-ERROR: reluctant quantifiers are not yet supported
-LINE 6: PATTERN (A? ?)
- ^
--- Expected: ERROR: reluctant quantifiers are not yet supported
+ count
+-------
+ 1
+ 1
+ 1
+(3 rows)
+
+-- Reluctant quantifier: prefer shortest match
DROP TABLE rpr_reluctant;
-- Quantifier boundary conditions
CREATE TABLE rpr_bounds (id INT);
@@ -3391,6 +3435,150 @@ WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
-> Seq Scan on rpr_plan
(6 rows)
+-- Reluctant optimization bypass: VAR merge
+-- A+? A stays as a+? a (greedy A+ A merges to a{2,})
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN (A+? A) DEFINE A AS val > 0);
+ QUERY PLAN
+-------------------------------------------------------------------------------
+ WindowAgg
+ Window: w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
+ Pattern: a+? a
+ -> Sort
+ Sort Key: id
+ -> Seq Scan on rpr_plan
+(6 rows)
+
+-- Reluctant optimization bypass: GROUP merge
+-- (A B)+? (A B) stays separate (greedy merges to (a b){2,})
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN ((A B)+? (A B)) DEFINE A AS val <= 50, B AS val > 50);
+ QUERY PLAN
+-------------------------------------------------------------------------------
+ WindowAgg
+ Window: w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
+ Pattern: (a b)+? a b
+ -> Sort
+ Sort Key: id
+ -> Seq Scan on rpr_plan
+(6 rows)
+
+-- Reluctant optimization bypass: quantifier multiply (outer reluctant)
+-- (A{2}){3}? stays as (a{2}){3}? (greedy merges to a{6})
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN ((A{2}){3}?) DEFINE A AS val > 0);
+ QUERY PLAN
+-------------------------------------------------------------------------------
+ WindowAgg
+ Window: w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
+ Pattern: (a{2}){3}?
+ -> Sort
+ Sort Key: id
+ -> Seq Scan on rpr_plan
+(6 rows)
+
+-- Reluctant optimization bypass: quantifier multiply (inner reluctant)
+-- (A{2}?){3} stays as (a{2}?){3} (greedy merges to a{6})
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN ((A{2}?){3}) DEFINE A AS val > 0);
+ QUERY PLAN
+-------------------------------------------------------------------------------
+ WindowAgg
+ Window: w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
+ Pattern: (a{2}?){3}
+ -> Sort
+ Sort Key: id
+ -> Seq Scan on rpr_plan
+(6 rows)
+
+-- Reluctant optimization bypass: PREFIX merge
+-- A B (A B)+? stays separate (greedy merges to (a b){2,})
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN (A B (A B)+?) DEFINE A AS val <= 50, B AS val > 50);
+ QUERY PLAN
+-------------------------------------------------------------------------------
+ WindowAgg
+ Window: w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
+ Pattern: a b (a b)+?
+ -> Sort
+ Sort Key: id
+ -> Seq Scan on rpr_plan
+(6 rows)
+
+-- Reluctant optimization bypass: SUFFIX merge
+-- (A B)+? A B stays separate (greedy merges to (a b){2,})
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN ((A B)+? A B) DEFINE A AS val <= 50, B AS val > 50);
+ QUERY PLAN
+-------------------------------------------------------------------------------
+ WindowAgg
+ Window: w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
+ Pattern: (a b)+? a b
+ -> Sort
+ Sort Key: id
+ -> Seq Scan on rpr_plan
+(6 rows)
+
+-- GROUP unwrap with quantifier propagation: (A)?? B -> a?? b
+-- Single VAR child {1,1} receives GROUP's quantifier and reluctant
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN ((A)?? B) DEFINE A AS val <= 50, B AS val > 50);
+ QUERY PLAN
+-------------------------------------------------------------------------------
+ WindowAgg
+ Window: w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
+ Pattern: a?? b
+ -> Sort
+ Sort Key: id
+ -> Seq Scan on rpr_plan
+(6 rows)
+
+-- Reluctant preserved through ALT flatten
+-- (A | (B | C))+? flattens to (a | b | c)+? - inner ALT flattened, reluctant kept
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN ((A | (B | C))+?) DEFINE A AS val <= 30, B AS val <= 60, C AS val > 60);
+ QUERY PLAN
+-------------------------------------------------------------------------------
+ WindowAgg
+ Window: w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
+ Pattern: (a | b | c)+?
+ -> Sort
+ Sort Key: id
+ -> Seq Scan on rpr_plan
+(6 rows)
+
+-- Reluctant optimization bypass: absorption flags
+-- A+? with SKIP PAST LAST ROW - no absorption markers (greedy A+ gets a+")
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW PATTERN (A+?) DEFINE A AS val > 0);
+ QUERY PLAN
+-------------------------------------------------------------------------------
+ WindowAgg
+ Window: w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING)
+ Pattern: a+?
+ -> Sort
+ Sort Key: id
+ -> Seq Scan on rpr_plan
+(6 rows)
+
-- ============================================================
-- Absorption Flag Display Tests
-- ============================================================
diff --git a/src/test/regress/expected/rpr_nfa.out b/src/test/regress/expected/rpr_nfa.out
index 7d38d62ac31..a2a257329b7 100644
--- a/src/test/regress/expected/rpr_nfa.out
+++ b/src/test/regress/expected/rpr_nfa.out
@@ -289,6 +289,39 @@ WINDOW w AS (
5 | {_} | |
(5 rows)
+-- Reluctant pattern (A+?) - not absorbable
+-- Compare with greedy A+ above: reluctant excluded from absorption.
+-- Each context produces minimum match independently.
+WITH test_reluctant_absorption AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['A']),
+ (4, ARRAY['A']),
+ (5, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_absorption
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN (A+?)
+ DEFINE
+ A AS 'A' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A} | 1 | 1
+ 2 | {A} | 2 | 2
+ 3 | {A} | 3 | 3
+ 4 | {A} | 4 | 4
+ 5 | {_} | |
+(5 rows)
+
-- ============================================================
-- Context Lifecycle
-- ============================================================
@@ -420,6 +453,38 @@ WINDOW w AS (
4 | {_,_} | |
(4 rows)
+-- Reluctant context lifecycle (A+? B with SKIP TO NEXT ROW)
+-- A+? exits early but if B not available, falls back to loop.
+-- Contexts not absorbed (reluctant), so multiple survive.
+WITH test_reluctant_context AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['B']),
+ (4, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_context
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN (A+? B)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A} | 1 | 3
+ 2 | {A} | 2 | 3
+ 3 | {B} | |
+ 4 | {_} | |
+(4 rows)
+
-- ============================================================
-- Advance Phase (Epsilon Transitions)
-- ============================================================
@@ -575,6 +640,100 @@ WINDOW w AS (
8 | {D} | |
(8 rows)
+-- Mixed greedy/reluctant sequence: A+? B+ (reluctant A, greedy B)
+-- A exits as early as possible, B consumes the rest greedily
+WITH test_mixed_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['A','B']),
+ (4, ARRAY['B']),
+ (5, ARRAY['B'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_mixed_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN (A+? B+)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A} | 1 | 5
+ 2 | {A} | |
+ 3 | {A,B} | |
+ 4 | {B} | |
+ 5 | {B} | |
+(5 rows)
+
+-- Optional reluctant group: (A B)?? C
+-- nfa_advance_begin: reluctant prefers skip (0 iterations) over enter
+WITH test_optional_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['B']),
+ (3, ARRAY['C'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_optional_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN ((A B)?? C)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags),
+ C AS 'C' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A} | 1 | 3
+ 2 | {B} | |
+ 3 | {C} | |
+(3 rows)
+
+-- Greedy/reluctant sequence: A+ B+? (greedy A, reluctant B at end)
+-- A consumes greedily, B+? exits to FIN after minimum match
+WITH test_greedy_then_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A','B']),
+ (3, ARRAY['B']),
+ (4, ARRAY['B'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_greedy_then_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN (A+ B+?)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A} | 1 | 3
+ 2 | {A,B} | |
+ 3 | {B} | |
+ 4 | {B} | |
+(4 rows)
+
-- ============================================================
-- Match Phase
-- ============================================================
@@ -810,6 +969,37 @@ WINDOW w AS (
6 | {B} | |
(6 rows)
+-- Reluctant with limited frame (A+? B with 2 FOLLOWING)
+-- Reluctant exits early, B must be within frame boundary
+WITH test_reluctant_frame AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['B']),
+ (4, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_frame
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN (A+? B)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A} | 1 | 3
+ 2 | {A} | 2 | 3
+ 3 | {B} | |
+ 4 | {_} | |
+(4 rows)
+
-- ============================================================
-- State Management
-- ============================================================
@@ -843,6 +1033,35 @@ WINDOW w AS (
3 | {D,_} | |
(3 rows)
+-- Reluctant duplicate state handling
+-- (A+? | B+?) creates exit and loop states; exit paths may converge
+WITH test_reluctant_dedup AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A','B']),
+ (2, ARRAY['A','B']),
+ (3, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_dedup
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN ((A+? | B+?))
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A,B} | 1 | 1
+ 2 | {A,B} | 2 | 2
+ 3 | {_} | |
+(3 rows)
+
-- Large pattern (stress free list)
WITH test_large_pattern AS (
SELECT * FROM (VALUES
@@ -1014,6 +1233,39 @@ WINDOW w AS (
5 | {B} | |
(5 rows)
+-- Reluctant not absorbed (A+? with SKIP TO NEXT ROW)
+-- Compare with greedy A+ below: reluctant is not absorbable,
+-- so all contexts survive independently.
+WITH test_reluctant_stats AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['A']),
+ (4, ARRAY['A']),
+ (5, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_stats
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN (A+?)
+ DEFINE
+ A AS 'A' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A} | 1 | 1
+ 2 | {A} | 2 | 2
+ 3 | {A} | 3 | 3
+ 4 | {A} | 4 | 4
+ 5 | {_} | |
+(5 rows)
+
-- Absorbed contexts
WITH test_absorbed AS (
SELECT * FROM (VALUES
@@ -1322,6 +1574,96 @@ WINDOW w AS (
7 | {B} | |
(7 rows)
+-- Greedy vs reluctant: A+ matches all rows, A+? matches minimum
+WITH test_greedy_vs_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A','_']),
+ (2, ARRAY['A','_']),
+ (3, ARRAY['A','B']),
+ (4, ARRAY['B','_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_greedy_vs_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN (A+ B)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A,_} | 1 | 4
+ 2 | {A,_} | |
+ 3 | {A,B} | |
+ 4 | {B,_} | |
+(4 rows)
+
+-- Same data, reluctant A+? exits at row 3 where B is first available
+WITH test_greedy_vs_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A','_']),
+ (2, ARRAY['A','_']),
+ (3, ARRAY['A','B']),
+ (4, ARRAY['B','_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_greedy_vs_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN (A+? B)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A,_} | 1 | 3
+ 2 | {A,_} | |
+ 3 | {A,B} | |
+ 4 | {B,_} | |
+(4 rows)
+
+-- Reluctant group: (A B)+? matches minimum 1 iteration
+WITH test_reluctant_group AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['B']),
+ (3, ARRAY['A']),
+ (4, ARRAY['B'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_group
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN ((A B)+?)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A} | 1 | 2
+ 2 | {B} | |
+ 3 | {A} | 3 | 4
+ 4 | {B} | |
+(4 rows)
+
-- ============================================================
-- Pathological Pattern Runtime Protection
-- ============================================================
@@ -1386,6 +1728,37 @@ WINDOW w AS (
4 | {B} | |
(4 rows)
+-- Reluctant nullable: A*? (prefers 0 matches)
+-- A*? always takes skip path (0 iterations preferred)
+WITH test_reluctant_nullable AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['A']),
+ (4, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_nullable
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN (A*? B)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A} | |
+ 2 | {A} | |
+ 3 | {A} | |
+ 4 | {_} | |
+(4 rows)
+
-- ============================================================
-- Alternation Runtime Behavior
-- ============================================================
@@ -1542,6 +1915,35 @@ WINDOW w AS (
2 | {_,C} | |
(2 rows)
+-- ALT with reluctant: (A+? | B+) - A branch is reluctant, B is greedy.
+-- Row 1 matches both A and B. A+? exits immediately (match 1-1).
+WITH test_alt_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A','B']),
+ (2, ARRAY['B','_']),
+ (3, ARRAY['B','_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_alt_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN ((A+? | B+))
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A,B} | 1 | 1
+ 2 | {B,_} | 2 | 3
+ 3 | {B,_} | |
+(3 rows)
+
-- ============================================================
-- Deep Nested Groups
-- ============================================================
@@ -1705,6 +2107,40 @@ WINDOW w AS (
9 | {_} | |
(9 rows)
+-- Nested reluctant group ((A B)+?) with following element C
+-- Inner group exits after minimum 1 iteration
+WITH test_nested_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['B']),
+ (3, ARRAY['A']),
+ (4, ARRAY['B']),
+ (5, ARRAY['C'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_nested_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN ((A B)+? C)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags),
+ C AS 'C' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A} | 1 | 5
+ 2 | {B} | |
+ 3 | {A} | 3 | 5
+ 4 | {B} | |
+ 5 | {C} | |
+(5 rows)
+
-- ============================================================
-- SKIP Options (Runtime)
-- ============================================================
@@ -1819,6 +2255,53 @@ ORDER BY mode, id;
SKIP PAST | 4 | {B} | |
(8 rows)
+-- Reluctant SKIP comparison: A+? with SKIP PAST vs SKIP NEXT
+WITH test_reluctant_skip AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['A']),
+ (4, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT 'SKIP PAST' AS mode, id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_skip
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN (A+?)
+ DEFINE
+ A AS 'A' = ANY(flags)
+)
+UNION ALL
+SELECT 'SKIP NEXT' AS mode, id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_skip
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN (A+?)
+ DEFINE
+ A AS 'A' = ANY(flags)
+)
+ORDER BY mode, id;
+ mode | id | flags | match_start | match_end
+-----------+----+-------+-------------+-----------
+ SKIP NEXT | 1 | {A} | 1 | 1
+ SKIP NEXT | 2 | {A} | 2 | 2
+ SKIP NEXT | 3 | {A} | 3 | 3
+ SKIP NEXT | 4 | {_} | |
+ SKIP PAST | 1 | {A} | 1 | 1
+ SKIP PAST | 2 | {A} | 2 | 2
+ SKIP PAST | 3 | {A} | 3 | 3
+ SKIP PAST | 4 | {_} | |
+(8 rows)
+
-- ============================================================
-- INITIAL Mode (Runtime)
-- ============================================================
@@ -2428,6 +2911,39 @@ WINDOW w AS (
4 | {_,_} | |
(4 rows)
+-- Reluctant branch in ALT not absorbable: (A+?) | B
+-- A+? is reluctant so not absorbable. Compare with greedy (A+) | B above.
+WITH test_reluctant_alt_absorption AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['A']),
+ (4, ARRAY['B']),
+ (5, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_alt_absorption
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN ((A+?) | B)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+ id | flags | match_start | match_end
+----+-------+-------------+-----------
+ 1 | {A} | 1 | 1
+ 2 | {A} | 2 | 2
+ 3 | {A} | 3 | 3
+ 4 | {B} | 4 | 4
+ 5 | {_} | |
+(5 rows)
+
-- ============================================================
-- FIXME Issues - Known Limitations
-- ============================================================
diff --git a/src/test/regress/sql/rpr.sql b/src/test/regress/sql/rpr.sql
index 7690c5611c0..a263074e3e9 100644
--- a/src/test/regress/sql/rpr.sql
+++ b/src/test/regress/sql/rpr.sql
@@ -1271,6 +1271,7 @@ SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER
DOWN AS price < PREV(1)
);
+-- PREV(1) missing column reference (error)
SELECT company, tdate, price, first_value(price) OVER w, last_value(price) OVER w
FROM stock
WINDOW w AS (
@@ -2059,16 +2060,16 @@ WINDOW w AS (
-- Expected: 1-4 (A C B C)
-- ============================================
--- RELUCTANT quantifiers (not yet supported)
+-- RELUCTANT quantifiers
-- ============================================
--- Test: A+? B (reluctant) - parser rejects with ERROR
+-- Test: A+? B (reluctant) - B available at row 3, A exits early
WITH jacob_reluctant AS (
SELECT * FROM (VALUES
- (1, ARRAY['A']),
- (2, ARRAY['A']),
- (3, ARRAY['A']),
- (4, ARRAY['B'])
+ (1, ARRAY['A','_']),
+ (2, ARRAY['A','_']),
+ (3, ARRAY['A','B']),
+ (4, ARRAY['B','_'])
) AS t(id, flags)
)
SELECT id, flags, first_value(id) OVER w AS match_start, last_value(id) OVER w AS match_end
@@ -2082,15 +2083,14 @@ WINDOW w AS (
A AS 'A' = ANY(flags),
B AS 'B' = ANY(flags)
);
--- Expected: ERROR (reluctant quantifiers not yet supported)
--- Test: A{1,3}? B (reluctant bounded) - parser rejects with ERROR
+-- Test: A{1,3}? B (reluctant bounded) - same data, bounded quantifier
WITH jacob_reluctant_bounded AS (
SELECT * FROM (VALUES
- (1, ARRAY['A']),
- (2, ARRAY['A']),
- (3, ARRAY['A']),
- (4, ARRAY['B'])
+ (1, ARRAY['A','_']),
+ (2, ARRAY['A','_']),
+ (3, ARRAY['A','B']),
+ (4, ARRAY['B','_'])
) AS t(id, flags)
)
SELECT id, flags, first_value(id) OVER w AS match_start, last_value(id) OVER w AS match_end
@@ -2104,7 +2104,6 @@ WINDOW w AS (
A AS 'A' = ANY(flags),
B AS 'B' = ANY(flags)
);
--- Expected: ERROR (reluctant quantifiers not yet supported)
-- ============================================
-- Nested quantifiers (pathological patterns)
diff --git a/src/test/regress/sql/rpr_base.sql b/src/test/regress/sql/rpr_base.sql
index b752ad12873..992c081205a 100644
--- a/src/test/regress/sql/rpr_base.sql
+++ b/src/test/regress/sql/rpr_base.sql
@@ -762,7 +762,7 @@ ORDER BY id;
DROP TABLE rpr_quant;
--- Reluctant quantifiers (not yet supported)
+-- Reluctant quantifiers
CREATE TABLE rpr_reluctant (id INT, val INT);
INSERT INTO rpr_reluctant VALUES (1, 10), (2, 20), (3, 30);
@@ -775,7 +775,7 @@ WINDOW w AS (
PATTERN (A*?)
DEFINE A AS val > 0
);
--- Expected: ERROR: reluctant quantifiers are not yet supported
+-- Reluctant quantifier: prefer shortest match
-- +? (one or more, reluctant)
SELECT COUNT(*) OVER w
@@ -786,7 +786,7 @@ WINDOW w AS (
PATTERN (A+?)
DEFINE A AS val > 0
);
--- Expected: ERROR: reluctant quantifiers are not yet supported
+-- Reluctant quantifier: prefer shortest match
-- ?? (zero or one, reluctant)
SELECT COUNT(*) OVER w
@@ -797,7 +797,7 @@ WINDOW w AS (
PATTERN (A??)
DEFINE A AS val > 0
);
--- Expected: ERROR: reluctant quantifiers are not yet supported
+-- Reluctant quantifier: prefer shortest match
-- {n,}? (n or more, reluctant)
SELECT COUNT(*) OVER w
@@ -808,7 +808,7 @@ WINDOW w AS (
PATTERN (A{2,}?)
DEFINE A AS val > 0
);
--- Expected: ERROR: reluctant quantifiers are not yet supported
+-- Reluctant quantifier: prefer shortest match
-- {n,m}? (n to m, reluctant)
SELECT COUNT(*) OVER w
@@ -819,7 +819,7 @@ WINDOW w AS (
PATTERN (A{1,3}?)
DEFINE A AS val > 0
);
--- Expected: ERROR: reluctant quantifiers are not yet supported
+-- Reluctant quantifier: prefer shortest match
-- {n}? (exactly n, reluctant)
SELECT COUNT(*) OVER w
@@ -830,7 +830,7 @@ WINDOW w AS (
PATTERN (A{2}?)
DEFINE A AS val > 0
);
--- Expected: ERROR: reluctant quantifiers are not yet supported
+-- Reluctant quantifier: prefer shortest match
-- {,m}? (up to m, reluctant) - COMPLETELY UNTESTED RULE!
SELECT COUNT(*) OVER w
@@ -841,7 +841,7 @@ WINDOW w AS (
PATTERN (A{,3}?)
DEFINE A AS val > 0
);
--- Expected: ERROR: reluctant quantifiers are not yet supported
+-- Reluctant quantifier: prefer shortest match
-- Invalid reluctant patterns (wrong token after quantifier)
@@ -1002,7 +1002,7 @@ WINDOW w AS (
PATTERN (A* ?)
DEFINE A AS val > 0
);
--- Expected: ERROR: reluctant quantifiers are not yet supported
+-- Reluctant quantifier: prefer shortest match
-- + ? (token separated)
SELECT COUNT(*) OVER w
@@ -1013,7 +1013,7 @@ WINDOW w AS (
PATTERN (A+ ?)
DEFINE A AS val > 0
);
--- Expected: ERROR: reluctant quantifiers are not yet supported
+-- Reluctant quantifier: prefer shortest match
-- {2,} ? (token separated)
SELECT COUNT(*) OVER w
@@ -1024,7 +1024,7 @@ WINDOW w AS (
PATTERN (A{2,} ?)
DEFINE A AS val > 0
);
--- Expected: ERROR: reluctant quantifiers are not yet supported
+-- Reluctant quantifier: prefer shortest match
-- Invalid token combinations
@@ -1059,7 +1059,7 @@ WINDOW w AS (
PATTERN (A? ?)
DEFINE A AS val > 0
);
--- Expected: ERROR: reluctant quantifiers are not yet supported
+-- Reluctant quantifier: prefer shortest match
DROP TABLE rpr_reluctant;
@@ -2278,6 +2278,69 @@ SELECT COUNT(*) OVER w FROM rpr_plan
WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
PATTERN ((A | B | C)) DEFINE A AS val <= 30, B AS val <= 60, C AS val > 60);
+-- Reluctant optimization bypass: VAR merge
+-- A+? A stays as a+? a (greedy A+ A merges to a{2,})
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN (A+? A) DEFINE A AS val > 0);
+
+-- Reluctant optimization bypass: GROUP merge
+-- (A B)+? (A B) stays separate (greedy merges to (a b){2,})
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN ((A B)+? (A B)) DEFINE A AS val <= 50, B AS val > 50);
+
+-- Reluctant optimization bypass: quantifier multiply (outer reluctant)
+-- (A{2}){3}? stays as (a{2}){3}? (greedy merges to a{6})
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN ((A{2}){3}?) DEFINE A AS val > 0);
+
+-- Reluctant optimization bypass: quantifier multiply (inner reluctant)
+-- (A{2}?){3} stays as (a{2}?){3} (greedy merges to a{6})
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN ((A{2}?){3}) DEFINE A AS val > 0);
+
+-- Reluctant optimization bypass: PREFIX merge
+-- A B (A B)+? stays separate (greedy merges to (a b){2,})
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN (A B (A B)+?) DEFINE A AS val <= 50, B AS val > 50);
+
+-- Reluctant optimization bypass: SUFFIX merge
+-- (A B)+? A B stays separate (greedy merges to (a b){2,})
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN ((A B)+? A B) DEFINE A AS val <= 50, B AS val > 50);
+
+-- GROUP unwrap with quantifier propagation: (A)?? B -> a?? b
+-- Single VAR child {1,1} receives GROUP's quantifier and reluctant
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN ((A)?? B) DEFINE A AS val <= 50, B AS val > 50);
+
+-- Reluctant preserved through ALT flatten
+-- (A | (B | C))+? flattens to (a | b | c)+? - inner ALT flattened, reluctant kept
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ PATTERN ((A | (B | C))+?) DEFINE A AS val <= 30, B AS val <= 60, C AS val > 60);
+
+-- Reluctant optimization bypass: absorption flags
+-- A+? with SKIP PAST LAST ROW - no absorption markers (greedy A+ gets a+")
+EXPLAIN (COSTS OFF)
+SELECT COUNT(*) OVER w FROM rpr_plan
+WINDOW w AS (ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW PATTERN (A+?) DEFINE A AS val > 0);
+
-- ============================================================
-- Absorption Flag Display Tests
-- ============================================================
diff --git a/src/test/regress/sql/rpr_nfa.sql b/src/test/regress/sql/rpr_nfa.sql
index e9894cfa691..0e853ca753b 100644
--- a/src/test/regress/sql/rpr_nfa.sql
+++ b/src/test/regress/sql/rpr_nfa.sql
@@ -230,6 +230,31 @@ WINDOW w AS (
B AS 'B' = ANY(flags)
);
+-- Reluctant pattern (A+?) - not absorbable
+-- Compare with greedy A+ above: reluctant excluded from absorption.
+-- Each context produces minimum match independently.
+WITH test_reluctant_absorption AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['A']),
+ (4, ARRAY['A']),
+ (5, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_absorption
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN (A+?)
+ DEFINE
+ A AS 'A' = ANY(flags)
+);
+
-- ============================================================
-- Context Lifecycle
-- ============================================================
@@ -333,6 +358,31 @@ WINDOW w AS (
D AS 'D' = ANY(flags)
);
+-- Reluctant context lifecycle (A+? B with SKIP TO NEXT ROW)
+-- A+? exits early but if B not available, falls back to loop.
+-- Contexts not absorbed (reluctant), so multiple survive.
+WITH test_reluctant_context AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['B']),
+ (4, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_context
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN (A+? B)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+
-- ============================================================
-- Advance Phase (Epsilon Transitions)
-- ============================================================
@@ -448,6 +498,79 @@ WINDOW w AS (
D AS 'D' = ANY(flags)
);
+-- Mixed greedy/reluctant sequence: A+? B+ (reluctant A, greedy B)
+-- A exits as early as possible, B consumes the rest greedily
+WITH test_mixed_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['A','B']),
+ (4, ARRAY['B']),
+ (5, ARRAY['B'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_mixed_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN (A+? B+)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+
+-- Optional reluctant group: (A B)?? C
+-- nfa_advance_begin: reluctant prefers skip (0 iterations) over enter
+WITH test_optional_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['B']),
+ (3, ARRAY['C'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_optional_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN ((A B)?? C)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags),
+ C AS 'C' = ANY(flags)
+);
+
+-- Greedy/reluctant sequence: A+ B+? (greedy A, reluctant B at end)
+-- A consumes greedily, B+? exits to FIN after minimum match
+WITH test_greedy_then_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A','B']),
+ (3, ARRAY['B']),
+ (4, ARRAY['B'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_greedy_then_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN (A+ B+?)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+
-- ============================================================
-- Match Phase
-- ============================================================
@@ -629,6 +752,30 @@ WINDOW w AS (
B AS 'B' = ANY(flags)
);
+-- Reluctant with limited frame (A+? B with 2 FOLLOWING)
+-- Reluctant exits early, B must be within frame boundary
+WITH test_reluctant_frame AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['B']),
+ (4, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_frame
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND 2 FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN (A+? B)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+
-- ============================================================
-- State Management
-- ============================================================
@@ -657,6 +804,29 @@ WINDOW w AS (
D AS 'D' = ANY(flags)
);
+-- Reluctant duplicate state handling
+-- (A+? | B+?) creates exit and loop states; exit paths may converge
+WITH test_reluctant_dedup AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A','B']),
+ (2, ARRAY['A','B']),
+ (3, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_dedup
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN ((A+? | B+?))
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+
-- Large pattern (stress free list)
WITH test_large_pattern AS (
SELECT * FROM (VALUES
@@ -790,6 +960,31 @@ WINDOW w AS (
B AS 'B' = ANY(flags)
);
+-- Reluctant not absorbed (A+? with SKIP TO NEXT ROW)
+-- Compare with greedy A+ below: reluctant is not absorbable,
+-- so all contexts survive independently.
+WITH test_reluctant_stats AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['A']),
+ (4, ARRAY['A']),
+ (5, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_stats
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN (A+?)
+ DEFINE
+ A AS 'A' = ANY(flags)
+);
+
-- Absorbed contexts
WITH test_absorbed AS (
SELECT * FROM (VALUES
@@ -934,6 +1129,75 @@ WINDOW w AS (
B AS 'B' = ANY(flags)
);
+-- Greedy vs reluctant: A+ matches all rows, A+? matches minimum
+WITH test_greedy_vs_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A','_']),
+ (2, ARRAY['A','_']),
+ (3, ARRAY['A','B']),
+ (4, ARRAY['B','_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_greedy_vs_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN (A+ B)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+
+-- Same data, reluctant A+? exits at row 3 where B is first available
+WITH test_greedy_vs_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A','_']),
+ (2, ARRAY['A','_']),
+ (3, ARRAY['A','B']),
+ (4, ARRAY['B','_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_greedy_vs_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN (A+? B)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+
+-- Reluctant group: (A B)+? matches minimum 1 iteration
+WITH test_reluctant_group AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['B']),
+ (3, ARRAY['A']),
+ (4, ARRAY['B'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_group
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN ((A B)+?)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+
-- ============================================================
-- Pathological Pattern Runtime Protection
-- ============================================================
@@ -984,6 +1248,30 @@ WINDOW w AS (
A AS 'A' = ANY(flags)
);
+-- Reluctant nullable: A*? (prefers 0 matches)
+-- A*? always takes skip path (0 iterations preferred)
+WITH test_reluctant_nullable AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['A']),
+ (4, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_nullable
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN (A*? B)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+
-- ============================================================
-- Alternation Runtime Behavior
-- ============================================================
@@ -1102,6 +1390,29 @@ WINDOW w AS (
C AS 'C' = ANY(flags)
);
+-- ALT with reluctant: (A+? | B+) - A branch is reluctant, B is greedy.
+-- Row 1 matches both A and B. A+? exits immediately (match 1-1).
+WITH test_alt_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A','B']),
+ (2, ARRAY['B','_']),
+ (3, ARRAY['B','_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_alt_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN ((A+? | B+))
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+
-- ============================================================
-- Deep Nested Groups
-- ============================================================
@@ -1220,6 +1531,32 @@ WINDOW w AS (
B AS 'B' = ANY(flags)
);
+-- Nested reluctant group ((A B)+?) with following element C
+-- Inner group exits after minimum 1 iteration
+WITH test_nested_reluctant AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['B']),
+ (3, ARRAY['A']),
+ (4, ARRAY['B']),
+ (5, ARRAY['C'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_nested_reluctant
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN ((A B)+? C)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags),
+ C AS 'C' = ANY(flags)
+);
+
-- ============================================================
-- SKIP Options (Runtime)
-- ============================================================
@@ -1308,6 +1645,42 @@ WINDOW w AS (
)
ORDER BY mode, id;
+-- Reluctant SKIP comparison: A+? with SKIP PAST vs SKIP NEXT
+WITH test_reluctant_skip AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['A']),
+ (4, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT 'SKIP PAST' AS mode, id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_skip
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP PAST LAST ROW
+ PATTERN (A+?)
+ DEFINE
+ A AS 'A' = ANY(flags)
+)
+UNION ALL
+SELECT 'SKIP NEXT' AS mode, id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_skip
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN (A+?)
+ DEFINE
+ A AS 'A' = ANY(flags)
+)
+ORDER BY mode, id;
+
-- ============================================================
-- INITIAL Mode (Runtime)
-- ============================================================
@@ -1791,6 +2164,31 @@ WINDOW w AS (
C AS 'C' = ANY(flags)
);
+-- Reluctant branch in ALT not absorbable: (A+?) | B
+-- A+? is reluctant so not absorbable. Compare with greedy (A+) | B above.
+WITH test_reluctant_alt_absorption AS (
+ SELECT * FROM (VALUES
+ (1, ARRAY['A']),
+ (2, ARRAY['A']),
+ (3, ARRAY['A']),
+ (4, ARRAY['B']),
+ (5, ARRAY['_'])
+ ) AS t(id, flags)
+)
+SELECT id, flags,
+ first_value(id) OVER w AS match_start,
+ last_value(id) OVER w AS match_end
+FROM test_reluctant_alt_absorption
+WINDOW w AS (
+ ORDER BY id
+ ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
+ AFTER MATCH SKIP TO NEXT ROW
+ PATTERN ((A+?) | B)
+ DEFINE
+ A AS 'A' = ANY(flags),
+ B AS 'B' = ANY(flags)
+);
+
-- ============================================================
-- FIXME Issues - Known Limitations
-- ============================================================
--
2.50.1 (Apple Git-155)
[application/octet-stream] nocfbot-0006-allow-a0-quantifier.patch (5.7K, 8-nocfbot-0006-allow-a0-quantifier.patch)
download | inline diff:
From 2db769b38bb4b339f57a9d2cff6bdd8f1305c551 Mon Sep 17 00:00:00 2001
From: Henson Choi <[email protected]>
Date: Mon, 23 Feb 2026 14:56:42 +0900
Subject: [PATCH 6/6] Allow A{0} quantifier
---
src/backend/executor/nodeWindowAgg.c | 7 +++++++
src/backend/parser/gram.y | 8 +++----
src/test/regress/expected/rpr_base.out | 29 ++++++++++++++++++--------
src/test/regress/sql/rpr_base.sql | 8 +++----
4 files changed, 35 insertions(+), 17 deletions(-)
diff --git a/src/backend/executor/nodeWindowAgg.c b/src/backend/executor/nodeWindowAgg.c
index 98546fb17f7..72ded5efc1f 100644
--- a/src/backend/executor/nodeWindowAgg.c
+++ b/src/backend/executor/nodeWindowAgg.c
@@ -6070,6 +6070,13 @@ nfa_route_to_elem(WindowAggState *winstate, RPRNFAContext *ctx,
{
if (RPRElemIsVar(nextElem))
{
+ if (nextElem->max == 0)
+ {
+ /* max=0: variable never matches, treat as epsilon transition */
+ state->elemIdx = nextElem->next;
+ nfa_advance_state(winstate, ctx, state, currentPos, initialAdvance);
+ return;
+ }
nfa_add_state_unique(winstate, ctx, state);
if (RPRElemCanSkip(nextElem))
{
diff --git a/src/backend/parser/gram.y b/src/backend/parser/gram.y
index 5a3d141e992..17503b38e3a 100644
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -17070,10 +17070,10 @@ row_pattern_quantifier_opt:
/* {n}, {n,}, {,m}, {n,m} quantifiers */
| '{' Iconst '}'
{
- if ($2 <= 0 || $2 >= INT_MAX)
+ if ($2 < 0 || $2 >= INT_MAX)
ereport(ERROR,
errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("quantifier bound must be between 1 and %d", INT_MAX - 1),
+ errmsg("quantifier bound must be between 0 and %d", INT_MAX - 1),
parser_errposition(@2));
$$ = (Node *) makeRPRQuantifier($2, $2, -1, @1, yyscanner);
}
@@ -17118,10 +17118,10 @@ row_pattern_quantifier_opt:
errmsg("invalid token after range quantifier"),
errhint("Only \"?\" is allowed after {n} to make it reluctant."),
parser_errposition(@4)));
- if ($2 <= 0 || $2 >= INT_MAX)
+ if ($2 < 0 || $2 >= INT_MAX)
ereport(ERROR,
errcode(ERRCODE_SYNTAX_ERROR),
- errmsg("quantifier bound must be between 1 and %d", INT_MAX - 1),
+ errmsg("quantifier bound must be between 0 and %d", INT_MAX - 1),
parser_errposition(@2));
$$ = (Node *) makeRPRQuantifier($2, $2, @4, @1, yyscanner);
}
diff --git a/src/test/regress/expected/rpr_base.out b/src/test/regress/expected/rpr_base.out
index 43eea32130f..7df71aaa904 100644
--- a/src/test/regress/expected/rpr_base.out
+++ b/src/test/regress/expected/rpr_base.out
@@ -923,7 +923,7 @@ ORDER BY id;
(10 rows)
-- Edge case quantifiers
--- {0} is not allowed (min must be >= 1)
+-- {0} always skips the variable (zero iterations)
SELECT id, val, COUNT(*) OVER w as cnt
FROM rpr_quant
WINDOW w AS (
@@ -933,10 +933,21 @@ WINDOW w AS (
DEFINE A AS val > 1000, B AS val > 0
)
ORDER BY id;
-ERROR: quantifier bound must be between 1 and 2147483646
-LINE 6: PATTERN (A{0} B)
- ^
--- Expected: ERROR: quantifier bound must be between 1 and 2147483646
+ id | val | cnt
+----+-----+-----
+ 1 | 10 | 1
+ 2 | 20 | 1
+ 3 | 30 | 1
+ 4 | 40 | 1
+ 5 | 50 | 1
+ 6 | 60 | 1
+ 7 | 70 | 1
+ 8 | 80 | 1
+ 9 | 90 | 1
+ 10 | 100 | 1
+(10 rows)
+
+-- Expected: A{0} skips A, only B matches each row
-- {0,0} is not allowed (max must be >= 1)
SELECT id, val, COUNT(*) OVER w as cnt
FROM rpr_quant
@@ -1274,10 +1285,10 @@ WINDOW w AS (
PATTERN (A{2147483647}?)
DEFINE A AS val > 0
);
-ERROR: quantifier bound must be between 1 and 2147483646
+ERROR: quantifier bound must be between 0 and 2147483646
LINE 6: PATTERN (A{2147483647}?)
^
--- Expected: ERROR: quantifier bound must be between 1 and 2147483646
+-- Expected: ERROR: quantifier bound must be between 0 and 2147483646
-- {-1,}? (negative lower bound)
SELECT COUNT(*) OVER w
FROM rpr_reluctant
@@ -1537,10 +1548,10 @@ WINDOW w AS (
PATTERN (A{2147483647})
DEFINE A AS id > 0
);
-ERROR: quantifier bound must be between 1 and 2147483646
+ERROR: quantifier bound must be between 0 and 2147483646
LINE 6: PATTERN (A{2147483647})
^
--- Expected: ERROR: quantifier bound must be between 1 and 2147483646
+-- Expected: ERROR: quantifier bound must be between 0 and 2147483646
-- {n,} boundary errors
-- Negative lower bound in {n,}
SELECT COUNT(*) OVER w
diff --git a/src/test/regress/sql/rpr_base.sql b/src/test/regress/sql/rpr_base.sql
index 992c081205a..d25ca07adaf 100644
--- a/src/test/regress/sql/rpr_base.sql
+++ b/src/test/regress/sql/rpr_base.sql
@@ -673,7 +673,7 @@ ORDER BY id;
-- Edge case quantifiers
--- {0} is not allowed (min must be >= 1)
+-- {0} always skips the variable (zero iterations)
SELECT id, val, COUNT(*) OVER w as cnt
FROM rpr_quant
WINDOW w AS (
@@ -683,7 +683,7 @@ WINDOW w AS (
DEFINE A AS val > 1000, B AS val > 0
)
ORDER BY id;
--- Expected: ERROR: quantifier bound must be between 1 and 2147483646
+-- Expected: A{0} skips A, only B matches each row
-- {0,0} is not allowed (max must be >= 1)
SELECT id, val, COUNT(*) OVER w as cnt
@@ -911,7 +911,7 @@ WINDOW w AS (
PATTERN (A{2147483647}?)
DEFINE A AS val > 0
);
--- Expected: ERROR: quantifier bound must be between 1 and 2147483646
+-- Expected: ERROR: quantifier bound must be between 0 and 2147483646
-- {-1,}? (negative lower bound)
SELECT COUNT(*) OVER w
@@ -1118,7 +1118,7 @@ WINDOW w AS (
PATTERN (A{2147483647})
DEFINE A AS id > 0
);
--- Expected: ERROR: quantifier bound must be between 1 and 2147483646
+-- Expected: ERROR: quantifier bound must be between 0 and 2147483646
-- {n,} boundary errors
--
2.50.1 (Apple Git-155)
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
Subject: Re: Row pattern recognition
In-Reply-To: <CAAAe_zAxEqdyoOgx4u0A2RXu829CE-RJbJjg1jVz00j3aSYryQ@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox