Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vyu7J-000Wid-1t for pgsql-hackers@arkaria.postgresql.org; Sat, 07 Mar 2026 16:03:01 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1vyu7H-008uvi-2F for pgsql-hackers@arkaria.postgresql.org; Sat, 07 Mar 2026 16:03:00 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1vyu7H-008uvY-0x for pgsql-hackers@lists.postgresql.org; Sat, 07 Mar 2026 16:02:59 +0000 Received: from mail-pj1-x102b.google.com ([2607:f8b0:4864:20::102b]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1vyu7D-00000001Mg4-3WPm for pgsql-hackers@postgresql.org; Sat, 07 Mar 2026 16:02:58 +0000 Received: by mail-pj1-x102b.google.com with SMTP id 98e67ed59e1d1-356337f058aso5538452a91.2 for ; Sat, 07 Mar 2026 08:02:55 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1772899373; cv=none; d=google.com; s=arc-20240605; b=XKREB4ueVG0qMm6r0Jme/yRt2WbANRPRKcHpddYhxU5gzKeVOUvudjBbLnT7kn5biI 9A8mTHt407lYyZaLf3X7usVO7Ds+qaljxylR3e9q83n0tRJcHTLTnAftDXUZy7yxMktg iJVdma+qJmWo/XNug2SLvo06QK2huQrybMm4btvBws+wd1MLJUpOoWmQXZzEyxWIerwi 2raFij/Q0ylAg5+enZTax4eV9xZZmOK0SuQSCffh1nYAbuHLeYNyVBDQW1/PTTmb4Ffo 3fJ7uR7krrvBhHiXERvQrmGIni70uBk9e9EPC/CSnr4v+lVtdsAPLZbmpVfn1F4OFq/3 lvNw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:dkim-signature; bh=yZl5062EG8u3mv75XMUm1AhrvKYlkGacgODjfWuaU+I=; fh=zLTXsuThzOw3K5S/vyAQSRdWRPjWWaGT3ErJVPafATQ=; b=UFDvXew1uQ9Ks8WnWZhEvu5kePE8phSJW4G3CZHekC0XbPkQxrMOTkTChv+RfgvRw1 EKRrHBHDac+JGUubV2uljxpeqvkVubbQTK3gpwhXtmMqMkwyyrBGVNoMui57pLiIWjrc lIHa8eE20r7TbOy/CqCQ+pVe5Y797BBWnkGAAPN0prU0/zqSVxjNmLX1OBU1ixiaIgmW 8FcsAirwEll8gO0ynALoSOLdWW55Uqy1PrJG2XwaEe2XZtQ8oJvHIhBYQp1Bng4e6xRs YYXlf+XaEEcHwpG9bVY4xfmtsE77onxBa/iGRU4o0CEqdZCgbbgX/QtHS2+YjRhbB2DZ j7RA==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1772899373; x=1773504173; darn=postgresql.org; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=yZl5062EG8u3mv75XMUm1AhrvKYlkGacgODjfWuaU+I=; b=eAnLK0HBBYZT4V80aUKUtkZuloSo5Zsi1NBLGnqqCy9A+ldUUHPTxpWegW6NInPHLT 9RXjMnuBdFjM5hZYepLz2xIr0Iv0o9ixISI12hzOHvHo1woPmrPjyFx1d+tDs6jaPjB+ JYvJk5OtMh7igflCoOevnRbXPdKPa605SzKD948MKStU8fI4n6ynlu9Cpk/vZCEgEIrq yCN79hn6ReuH+Bg+uqS9D7fpmpYSHpE5nWw72sfz17KG8ajHyRorcqXWwq0EfLnqmJ1k P5s4LCMbG6fAn9eoyRW0n7h+rcCPu2TDOTzOhgVla4FlYfIXrppaIVbIoxJ1HyU2K4G/ xfew== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1772899373; x=1773504173; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=yZl5062EG8u3mv75XMUm1AhrvKYlkGacgODjfWuaU+I=; b=wPnqqOtV/6D4dmbLScBkd9T0oRH1TPRJnrJmAOcQfVXe5wM/LKDW9nVJGTg2WkpJwg S59R1w8KSpm18NBlbb4eiBcI6jtFYYx9otAYhNUgzOcaNcni63KwzUxaPFMVuhjxIlqF CUr33lYo58olaVk7qOp3Z4KoryDFdWyeq8bPaSnkycua83L5U55PZFVbAqIT3xLfotTp M8bW4mzjOvZPIeDTpKLL5R0aBXsCAy+AxtV4HASSISOHIWju2nOHmcZoMGmXT4pk8v50 hwSylR3B5oTtcQpdV30OCPGFdrixDUi9PUuM6zCikEs7R9Wc1n+/EaCWa0XQpxMRuu+i yeZA== X-Forwarded-Encrypted: i=1; AJvYcCWse8/pF3OucgN5VqshlY4z9WqZghjQmkK/v01T/g4XENFsN4tLcFRVeJxxEZg2wHjWTx8MR2ok1Zmru+zm@postgresql.org X-Gm-Message-State: AOJu0YxeW9dIvmiOUf0c37PzkeByXDGDV7K6S5Sck63LAKVDJUeiqV4g ZMw+o2XOjRZyRl+Q2Ldbx1apcCGqRSC67LmsgpaAN7q4+zBlYhNlDy+femqTXAic/Saa4x0s7zh XMvIXA6irCzu4f6dmHam404RcqARFlkk= X-Gm-Gg: ATEYQzzngdwcYIRIUEYTYSQXgLfGS65zLu8JwIR3bkBdZewJ97IwO0Gk5A2rpLWypCK C1tQnNT2wMGL+xZTO92Y1ZEDHIrUwnvAxVdKkfGmankVYSH1zGo10fbcakZ858/iIveRGMeh5Z+ AhIT5uHwwDbM7OTMCfytUKbeb0EpDKBVbcqAtuSYRDF2XCdW4MbnhNb4NT6jdCRiUx3DyklyP52 rAO0o0qsQ+D/kLMUiXmx/nIKV3Zq0O9zwDV/ijebWpBrw/9uCwi1VFToOvO9xKChLNO3HCNF3O5 BJ7+X/45zL1bkXWCG+yw3MrUAjAh7SaXDxV8YvRF X-Received: by 2002:a17:90b:3d8a:b0:359:9a87:859f with SMTP id 98e67ed59e1d1-359be313b32mr5057784a91.26.1772899373451; Sat, 07 Mar 2026 08:02:53 -0800 (PST) MIME-Version: 1.0 References: <20260306.153837.2137322608184587391.ishii@postgresql.org> <20260307.120151.1477244845022229828.ishii@postgresql.org> In-Reply-To: <20260307.120151.1477244845022229828.ishii@postgresql.org> Reply-To: assam258@gmail.com From: Henson Choi Date: Sun, 8 Mar 2026 01:02:41 +0900 X-Gm-Features: AaiRm50-WtKhwLv6DSAb5GIs1f-kfDjpbaCdU_gCYZkAb3Gfvoielei7VI8VUK0 Message-ID: Subject: Re: Row pattern recognition To: Tatsuo Ishii Cc: sjjang112233@gmail.com, vik@postgresfriends.org, er@xs4all.nl, jacob.champion@enterprisedb.com, david.g.johnston@gmail.com, peter@eisentraut.org, pgsql-hackers@postgresql.org Content-Type: multipart/alternative; boundary="000000000000608df3064c714deb" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000608df3064c714deb Content-Type: text/plain; charset="UTF-8" Hi Tatsuo, I found Trino uses "empty match" too [2]. So for SQL users, I guess > "empty match" is more familiar wording. > Agreed. Since RPR is a feature defined by the SQL standard, using the standard's own terminology is the right choice: it lets users cross-reference the specification directly. That said, "zero-length match" is widely used in the regex literature (PCRE2, Perl, Python, Java, etc.), so developers researching the topic should treat the two terms as synonymous. In the SQL context, "empty match" is the preferred term. > > (a) Reorder within nodeWindowAgg.c: move the nfa_* functions up and > > keep the "API exposed to window functions" section at the bottom, > > matching master's layout. > > > > (b) Separate file under src/backend/executor/, keeping it close to > > nodeWindowAgg.c while making the boundary explicit. > > > > (c) A dedicated src/backend/rpr/ directory modeled on > > src/backend/regex/, giving the NFA engine its own namespace. > > This could also be an opportunity to consolidate the existing > > src/backend/optimizer/plan/rpr.c into the same directory. > > I prefer (a) or (b) for now, at least for the first commit. The reason > is, current nfa functions take a WindowAggState argument. If we prefer > (c), I think we need to change some of (or most of) nfa functions so > that they do not take the WindowAggState argument. What do you think? > That is a valid concern. (c) will be necessary in the long run, but MATCH_RECOGNIZE (R010) is not yet on the horizon, and restructuring the NFA layer now -- before that requirement is concrete -- carries a real risk of introducing defects during the refactoring. The risks outweigh the benefits at this stage. Instead, I'll try (b): a separate file under src/backend/executor/ can retain the WindowAggState argument while making the NFA boundary explicit, and the change is still reviewable. If (b) turns out to be harder than expected or surfaces unexpected issues, I'll fall back to (a). If (b) works out, I'll add a structured header comment to the new file to preserve the algorithm description for future reviewers. I am not familiar with ECPG. Do you know if ECPG has Window clause > tests? If ECPG does not have any Window clause tests, is it worth to > add RPR tests to ECPG? > I checked src/interfaces/ecpg/test/ and confirmed there are no Window clause tests there at all. If ECPG has no Window clause coverage, I see no reason to add RPR tests there at this stage. I think it would be better to leave ECPG tests out of this patch set for now. Best regards, Henson --000000000000608df3064c714deb Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Tatsuo,

I found Trino uses "empty match" too [2]. So for SQL users, I gue= ss
"empty match" is more familiar wording.

=
Agreed.=C2=A0 Since RPR is a feature defined by the SQL standard= , using the
standard's own terminology is the right choice: it lets = users
cross-reference the specification directly.=C2=A0 That said, "= ;zero-length
match" is widely used in the regex literature (PCRE2, = Perl, Python,
Java, etc.), so developers researching the topic should tr= eat the two
terms as synonymous.=C2=A0 In the SQL context, "empty m= atch" is the preferred term.

=C2=A0
>=C2=A0 =C2=A0(a) Reorder within nodeWindowAgg.c: move the nfa_* f= unctions up and
>=C2=A0 =C2=A0 =C2=A0 =C2=A0keep the "API exposed to window functio= ns" section at the bottom,
>=C2=A0 =C2=A0 =C2=A0 =C2=A0matching master's layout.
>
>=C2=A0 =C2=A0(b) Separate file under src/backend/executor/, keeping it = close to
>=C2=A0 =C2=A0 =C2=A0 =C2=A0nodeWindowAgg.c while making the boundary ex= plicit.
>
>=C2=A0 =C2=A0(c) A dedicated src/backend/rpr/ directory modeled on
>=C2=A0 =C2=A0 =C2=A0 =C2=A0src/backend/regex/, giving the NFA engine it= s own namespace.
>=C2=A0 =C2=A0 =C2=A0 =C2=A0This could also be an opportunity to consoli= date the existing
>=C2=A0 =C2=A0 =C2=A0 =C2=A0src/backend/optimizer/plan/rpr.c into the sa= me directory.

I prefer (a) or (b) for now, at least for the first commit. The reason
is, current nfa functions take a WindowAggState argument. If we prefer
(c), I think we need to change some of (or most of) nfa functions so
that they do not take the WindowAggState argument. What do you think?
=C2=A0
That is a valid concern. =C2=A0(c) will be= necessary in the long run, but
MATCH_RECOGNIZE (R010) is not yet on the= horizon, and restructuring the
NFA layer now -- before that requirement= is concrete -- carries a real
risk of introducing defects during the re= factoring.=C2=A0 The risks outweigh the
benefits at this stage.

I= nstead, I'll try (b): a separate file under src/backend/executor/
ca= n retain the WindowAggState argument while making the NFA boundary
expli= cit, and the change is still reviewable.=C2=A0 If (b) turns out to be
ha= rder than expected or surfaces unexpected issues, I'll fall back to
= (a).

If (b) works out, I'll add a structured header comment to t= he new file
to preserve the algorithm description for future reviewers.<= br>


I am not familiar w= ith ECPG. Do you know if ECPG has Window clause
tests? If ECPG does not have any Window clause tests, is it worth to
add RPR tests to ECPG?
=C2=A0
I checked src/= interfaces/ecpg/test/ and confirmed there are no Window
clause tests the= re at all.=C2=A0 If ECPG has no Window clause coverage, I
see no reason = to add RPR tests there at this stage.=C2=A0 I think it would be better to l= eave ECPG tests out of this patch set for now.


Best regards,
= Henson
--000000000000608df3064c714deb--