MIME-Version: 1.0
References: <20260517.190023.159085681032648582.ishii@postgresql.org>
 <CACJufxH-DZePhbdJM=8nNYceQiSbbXXLTw54iLhxiynQ+4hbBA@mail.gmail.com>
 <CAAAe_zDephfiDA_A3FN0hCymJRogEr=Rt3QoCTf4qMYDLk+xNA@mail.gmail.com>
 <20260531.103258.1663537611708370752.ishii@postgresql.org>
In-Reply-To: <20260531.103258.1663537611708370752.ishii@postgresql.org>
Reply-To: assam258@gmail.com
From: Henson Choi <assam258@gmail.com>
Date: Mon, 1 Jun 2026 09:40:40 +0900
Message-ID: 
 <CAAAe_zBO7wU6JEuQJ246PzS=B63ufcYUQjovEfwsQvCZdaM7tg@mail.gmail.com>
Subject: Re: Row pattern recognition
To: Tatsuo Ishii <ishii@postgresql.org>, jian.universality@gmail.com
Cc: zsolt.parragi@percona.com, sjjang112233@gmail.com,
 vik@postgresfriends.org,
	er@xs4all.nl, jacob.champion@enterprisedb.com, david.g.johnston@gmail.com,
	peter@eisentraut.org, li.evan.chao@gmail.com, pgsql-hackers@postgresql.org
Content-Type: multipart/alternative; boundary="0000000000004b3f68065326723c"
Archived-At: 
 <https://www.postgresql.org/message-id/CAAAe_zBO7wU6JEuQJ246PzS%3DB63ufcYUQjovEfwsQvCZdaM7tg%40mail.gmail.com>
Precedence: bulk

--0000000000004b3f68065326723c
Content-Type: text/plain; charset="UTF-8"

Hi Tatsuo, Jian,

Thanks, Tatsuo -- your two notes settle both open questions. Answering both
inline below, with a note at the end on the follow-up work.

> Basically I think Jian's idea is good. In addition to the size reason
> above, we would have less code changes when we adapt existing R020
> codes to R010.
>
> However it will need a wide code change as Henson said. I would like
> to focus on stabilizing our code for now. Therefore I would not want
> the refactoring in v48.

Agreed -- out of v48, stabilize first. On R010, I'd treat it as the design
lens, not the schedule. R010 is far out, so rather than hold the RPRContext
consolidation back until then, I'd do it at a sensible point after v48 but
shape it against R010 (what is shared versus per-context, how the fields
group). That way one engine core can later back both the R020 window path
and R010 without a second reshape. Not v48 -- on its own schedule once we're
stable.

I'll admit the R010 connection had been nagging at me for a while without a
clean answer, and Jian's consolidation suggestion turns out to land right on
it: framed as a size win, it's really the structural move that opens the
R010 path -- that reframing is his. Which is why I'd rather shape it
deliberately, as the shared engine core, than fold it in as churn.

> Although I don't have any particular strong preferences, keeping
> "absorption" for the runtime concept sounds good to me.

Good -- "absorption" stays reserved for the runtime context-equivalence
collapse, and the README explains it in one place.

> For AST level name changing, "prefix/suffix merging" seems to be
> already used in other areas according to Google: LLM, Linker, and
> string manipulation in DNA. In the normal expression engine area, it
> looks like "flattening nested quantifiers" or "quantifiers reduction"
> are used for the case. So, for example, "prefix/suffix quantifiers
> reduction" seems to be more appropriate?  (If you don't mind it's too
> long) In any case, I would like to respect your opinion.

Thanks -- you're right that "merging" is well-worn elsewhere, and I'll be
honest that "prefix/suffix merging" isn't a term I'd defend on the merits.
Keeping it for v48 is really a stopgap to contain the ripple: the sibling
Phase-1 rewrites are already named "consecutive variable / group / ALT
merging", so switching to the "flattening / reduction" family would force
renaming those too for consistency. So I'd treat the term itself as
genuinely open -- your "flattening / reduction" neighborhood is the right
one -- and converge on the established academic naming as the paper I'm
preparing on the algorithm, together with a university research group, takes
shape. For now I'd ship "prefix/suffix merging" only to keep the README
internally consistent, and fold the settled term into the glossary pass once
the paper lands on it. A doc-level name is cheap to revise later.

Both of these -- the RPRContext reshape and the naming/terminology -- are
"after v48, by discussion" items, and they aren't the only ones. Rather than
pin down the scope of that follow-up now, I'd suggest we pick it up together
once v48 is stable.

Thanks again,
Henson

--0000000000004b3f68065326723c--