Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wTqiC-000n3D-0i for pgsql-hackers@arkaria.postgresql.org; Mon, 01 Jun 2026 00:41:00 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wTqi8-008DdW-2j for pgsql-hackers@arkaria.postgresql.org; Mon, 01 Jun 2026 00:40:57 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wTqi8-008DdO-1T for pgsql-hackers@lists.postgresql.org; Mon, 01 Jun 2026 00:40:56 +0000 Received: from mail-ed1-x534.google.com ([2a00:1450:4864:20::534]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wTqi6-00000000Yye-1oaY for pgsql-hackers@postgresql.org; Mon, 01 Jun 2026 00:40:56 +0000 Received: by mail-ed1-x534.google.com with SMTP id 4fb4d7f45d1cf-68d233bf083so1154278a12.1 for ; Sun, 31 May 2026 17:40:53 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1780274452; cv=none; d=google.com; s=arc-20240605; b=GLQqVFots+qgj0mwdPbcDxswdp+bV3CD9HF0knTr1fsMmOMeEMj0sWgC7VDRCRVia8 NpBPsPkVIClL4G5XbxAcoE1YODqdlj0y/xU7b4RUcwPQHugouptdhNvfDU4RHGHHZ5Aq IKhBc4lsiXoGgnq0LMZCFrvh+Jjyd6rlVDXx+Qeby0NzRFrIArizlceOmmRv46JcCynu g9Mwqh+A4ZhzMTwUwJk4vjxk5gcdyglcNGpFyJFe6Szc+aHaVcAISnwS9PDue4N9y/2M ChW9WHQWOAx8zjRmCzWGEYDcrmMsIRVEYYnqsWhwF87yBATQ9qzkUpB+1ovrohCbovqf CN8A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:dkim-signature; bh=NDnIkHwQqJgLSg/gEz32ju6Q0X+yiw5yIc1A23cKu5s=; fh=Ig5uJFa3Y0R4CoaCBdH8i9O5jhZUD/QJLb1WtwbCxAc=; b=JfFyCwXJPH49jpwaqLU2Duxo2lpSY59V7SXSITS6x/ZihuYWrsC8uTOtn0bYPbsLuq 
N5EyzyrseuIHQipDrAH6ciGXwmfPoMYRYYiGnRWhoOB3NRSLRseIbIqE/kpW+dqwhMa1 N+Lk3F6wIIfyxqLYHjt5uN5c6CFdJtXszPGlA0H02BakOzQVBsWltK6UdsYtizMPN5+s ab0CrIJ5a4z3IxsChFJk2Lbx636Yn//z1XMNXCjf7EL83RUyUtyv23ms0Fn7g3Do2qb6 8Fh6fjjcUDneZegfxQBiUBOi9LYgApS3lB1mlGZyQfPz+1hKb6HIIETm4vPE6HEsQqM6 gfvA==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1780274452; x=1780879252; darn=postgresql.org; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=NDnIkHwQqJgLSg/gEz32ju6Q0X+yiw5yIc1A23cKu5s=; b=Q8SQV71bZmMKk0V1GiuQwYrX1Y7CxxsIW8wa3iBgmFxKOD55TasusWdcmwOzwyXo9l H8J5EfzXv/9aT7Obo55wsA+/+CStBzjoRGL/IwrQV9xvMzIRLK5tLA2d1M5nH3nKpnSc JqEhRRC+D4g2X3MLAAGnnMJWO1lm6ew5gZOHOX0yaOiGzhExl4dn/qkHxrm7AxGTYFia qXGCWJZInhSDmYLUqSkmAIKGWPzSlYoE+IhzG3qzBr20U4Q9qHUqvC6NqyTw9hoj/Hnr NyQFALf2+OAsC8FAfqtPVZvNRZyQfnhrh3ul1hi3+u+cir6aIO83conKqdxA+lOv7Qyy jPUQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1780274452; x=1780879252; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=NDnIkHwQqJgLSg/gEz32ju6Q0X+yiw5yIc1A23cKu5s=; b=KEgFFW3hMX3fJvDZO6ECM9X3IZxCVtEIkNuWbSynLDE8Hm0LVYrWNXGlH5JOdK3E9e uQcDSLPTnuNkXl9GkuW0UJlpwzMd5wX29ShxP+F8QmgHy5WAPziEy+IGeq+RT3EGbeIe L5/fXpclgIY8Zt+BWiieUh5Aun3x+zQx5HVd5OHhGI6xDia4bAADGezqMcYLlQzZ+bBg 3/qWODV6WIwEdOjjkDqOrAGonrwH9Nqddb++K+FJxwCw1brSArrervKEd61XbiZ5wm5j B7lkrgl3g57gImUdb25tYeRRojB+mfsTlo0eoGSzQwCa5qac6guaBLnTear5PKVC/b9Y b4ug== X-Forwarded-Encrypted: i=1; AFNElJ9ATuHnc32HF+LL84h0cxsbOvr8CsioThMqBl9JTx1KrxeMZZICNVdHsC8qFWFYUVw5SSVQzneAnB57K3Yv@postgresql.org X-Gm-Message-State: AOJu0YzE2MN9/y60hs0ateeSHb3Z0rCH0d1sZjrpTPpFsPH8fUHKLvHW wa4wl8CxYd//j3ee596IFhJzo+Q8TVelbLNq+Hz5InB+MgFvJ9jxjVwUPsEngm48oWWRKlJa+FJ 
GU1GgcifMelDoPZWN0gdzH2O6W6wRJLA= X-Gm-Gg: Acq92OFlCL4njPFlwtYywxKR+pEAev9i52Ph6nTL0Ds3h+o38dH5NwfvCVb45z2JcNz bP+G1edgx/b5JdfwPpFw6Kx1bH92YEFcSHO/ttqaev0bWFoYu4xe64Cv8jG9a50coSbOoE9piGa 9l8hDP/QRZ5HN0rGXqNy56EPTWwdQm91pOmTvgRUg/F9zgsH/+pKdxGYFbjyOB7PaK/yp9FzDtm OIGqD8dMDLWyUd2mPrQ5b9W/qHpPieJhP9i/hwp8WzcUzASCwU7rqLofBZPnd4sqIq3Li7Antiv B7A0XWhD99DrCT9P/dGxUc+WKLt7rQjVBGxPsZbk3mQ5fBfWiw== X-Received: by 2002:a50:eb48:0:b0:67c:5a2e:18b8 with SMTP id 4fb4d7f45d1cf-68c8ca394acmr2704721a12.18.1780274451659; Sun, 31 May 2026 17:40:51 -0700 (PDT) MIME-Version: 1.0 References: <20260517.190023.159085681032648582.ishii@postgresql.org> <20260531.103258.1663537611708370752.ishii@postgresql.org> In-Reply-To: <20260531.103258.1663537611708370752.ishii@postgresql.org> Reply-To: assam258@gmail.com From: Henson Choi Date: Mon, 1 Jun 2026 09:40:40 +0900 X-Gm-Features: AVHnY4JV6pv-x4_UDKkFHkiXy-eDTRssi_ZR-zQvFbVQs-lb-MJW9hLc2CJOkHY Message-ID: Subject: Re: Row pattern recognition To: Tatsuo Ishii , jian.universality@gmail.com Cc: zsolt.parragi@percona.com, sjjang112233@gmail.com, vik@postgresfriends.org, er@xs4all.nl, jacob.champion@enterprisedb.com, david.g.johnston@gmail.com, peter@eisentraut.org, li.evan.chao@gmail.com, pgsql-hackers@postgresql.org Content-Type: multipart/alternative; boundary="0000000000004b3f68065326723c" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000004b3f68065326723c Content-Type: text/plain; charset="UTF-8" Hi Tatsuo, Jian, Thanks, Tatsuo -- your two notes settle both open questions. Answering both inline below, with a note at the end on the follow-up work. > Basically I think Jian's idea is good. In addition to the size reason > above, we would have less code changes when we adapt existing R020 > codes to R010. > > However it will need a wide code change as Henson said. I would like > to focus on stabilizing our code for now. Therefore I would not want > the refactoring in v48. 
Agreed -- out of v48, stabilize first. On R010, I'd treat it as the design
lens, not the schedule: it's far out, so rather than hold RPRContext back
until then, I'd do the consolidation at a sensible point after v48 but shape
it against R010 (what is shared versus per-context, how the fields group), so
one engine core can later back both the R020 window path and R010 without a
second reshape. Not v48 -- on its own schedule once we're stable.

I'll admit the R010 connection had been nagging at me for a while without a
clean answer, and Jian's consolidation suggestion turns out to land right on
it: framed as a size win, it's really the structural move that opens the R010
path -- that reframing is his. Which is why I'd rather shape it deliberately,
as the shared engine core, than fold it in as churn.

> Although I don't have any particular strong preferences, keeping
> "absorption" for the runtime concept sounds good to me.

Good -- "absorption" stays reserved for the runtime context-equivalence
collapse, and the README explains it in one place.

> For AST level name changing, "prefix/suffix merging" seems to be
> already used in other areas according to Google: LLM, Linker, and
> string manipulation in DNA. In the normal expression engine area, it
> looks like "flattening nested quantifiers" or "quantifiers reduction"
> are used for the case. So, for example, "prefix/suffix quantifiers
> reduction" seems to be more appropriate? (If you don't mind it's too
> long) In any case, I would like to respect your opinion.

Thanks -- you're right that "merging" is well-worn elsewhere, and I'll be
honest that "prefix/suffix merging" isn't a term I'd defend on the merits.
Keeping it for v48 is really a stopgap to contain the ripple: the sibling
Phase-1 rewrites are already named "consecutive variable / group / ALT
merging", so switching to the "flattening / reduction" family would force
renaming those too for consistency. So I'd treat the term itself as genuinely
open -- your "flattening / reduction" neighborhood is the right one -- and
converge on the established academic naming as the paper I'm preparing on the
algorithm, together with a university research group, takes shape. For now I'd
ship "prefix/suffix merging" only to keep the README internally consistent,
and fold the settled term into the glossary pass once the paper lands on it. A
doc-level name is cheap to revise later.

Both of these -- the RPRContext reshape and the naming/terminology -- are
"after v48, by discussion" items, and they aren't the only ones. Rather than
pin down the scope of that follow-up now, I'd suggest we pick it up together
once v48 is stable.

Thanks again,
Henson
--0000000000004b3f68065326723c Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Tatsuo, Jian,

Thanks, Tatsuo -- your two notes settle both open questions. Answering both
inline below, with a note at the end on the follow-up work.

> Basically I think Jian's idea is good. In addition to the size reason
> above, we would have less code changes when we adapt existing R020
> codes to R010.
>
> However it will need a wide code change as Henson said. I would like
> to focus on stabilizing our code for now. Therefore I would not want
> the refactoring in v48.

Agreed -- out of v48, stabilize first. On R010, I'd treat it as the design
lens, not the schedule: it's far out, so rather than hold RPRContext back
until then, I'd do the consolidation at a sensible point after v48 but shape
it against R010 (what is shared versus per-context, how the fields group), so
one engine core can later back both the R020 window path and R010 without a
second reshape. Not v48 -- on its own schedule once we're stable.

I'll admit the R010 connection had been nagging at me for a while without a
clean answer, and Jian's consolidation suggestion turns out to land right on
it: framed as a size win, it's really the structural move that opens the R010
path -- that reframing is his. Which is why I'd rather shape it deliberately,
as the shared engine core, than fold it in as churn.

> Although I don't have any particular strong preferences, keeping
> "absorption" for the runtime concept sounds good to me.

Good -- "absorption" stays reserved for the runtime context-equivalence
collapse, and the README explains it in one place.

> For AST level name changing, "prefix/suffix merging" seems to be
> already used in other areas according to Google: LLM, Linker, and
> string manipulation in DNA. In the normal expression engine area, it
> looks like "flattening nested quantifiers" or "quantifiers reduction"
> are used for the case. So, for example, "prefix/suffix quantifiers
> reduction" seems to be more appropriate? (If you don't mind it's too
> long) In any case, I would like to respect your opinion.

Thanks -- you're right that "merging" is well-worn elsewhere, and I'll be
honest that "prefix/suffix merging" isn't a term I'd defend on the merits.
Keeping it for v48 is really a stopgap to contain the ripple: the sibling
Phase-1 rewrites are already named "consecutive variable / group / ALT
merging", so switching to the "flattening / reduction" family would force
renaming those too for consistency. So I'd treat the term itself as genuinely
open -- your "flattening / reduction" neighborhood is the right one -- and
converge on the established academic naming as the paper I'm preparing on the
algorithm, together with a university research group, takes shape. For now I'd
ship "prefix/suffix merging" only to keep the README internally consistent,
and fold the settled term into the glossary pass once the paper lands on it. A
doc-level name is cheap to revise later.

Both of these -- the RPRContext reshape and the naming/terminology -- are
"after v48, by discussion" items, and they aren't the only ones. Rather than
pin down the scope of that follow-up now, I'd suggest we pick it up together
once v48 is stable.
Thanks again,
Henson
--0000000000004b3f68065326723c--