Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wYwYx-000dzQ-15 for pgsql-hackers@arkaria.postgresql.org; Mon, 15 Jun 2026 01:56:31 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wYwYu-009Nr9-2l for pgsql-hackers@arkaria.postgresql.org; Mon, 15 Jun 2026 01:56:28 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wYwYu-009Nr1-19 for pgsql-hackers@lists.postgresql.org; Mon, 15 Jun 2026 01:56:28 +0000 Received: from mail-ed1-x52c.google.com ([2a00:1450:4864:20::52c]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wYwYs-00000000PA6-3BjF for pgsql-hackers@postgresql.org; Mon, 15 Jun 2026 01:56:27 +0000 Received: by mail-ed1-x52c.google.com with SMTP id 4fb4d7f45d1cf-6930f7e83b1so4802082a12.0 for ; Sun, 14 Jun 2026 18:56:26 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1781488585; cv=none; d=google.com; s=arc-20240605; b=AA2JybM0JKCCyqx8ou1VNnOe6FauUFG7+UgUxwuUXQY35ScIHHs40WwME5gtXiUIas ihPjUIII7ZqqyqboonKzcz+S3hMHPjvJMrWuLX+ZXfYM5VRZI/6XmuUC234sofp52pFn C6pHVaZDAqml18sHiWxC8YYrvO3XrR9Tc6smPrhX+hbbnl54eVqFTCAwusgwopUV4JG+ 978LcVXHDo/tib/5s9olpxnBbn9BvlJCSXvE9ogUBNL/kmCNlHJUDwXq4fOSzUfVP18/ YfG0aVO5iukkHiBBaNuTZoLFm+S/ELQtim1WmJGInRwhVKu54li8glyVTsM/fy3E+Fc0 hg0g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:dkim-signature; bh=hvzfrWpJSlN4irZnj18aWk/pV0aUgmHO13e5pys1lO0=; fh=/AeS6q8aKwDr8XAKAw09UixYL6LJ2MI4XqSux/pDsy0=; b=b+AAWV41uVPw7C5mCzxKOO+mK+IjCB94KazRazUvSyAyf6G+9u9Dgvcm80zfUXeeHi Ppugr5TiaMTt2XG+TBeEG021K+53qmJwRfnu3gXiZ5nT9Ph0MxocHM9g19TX9kmEHZye TUdZxSsGHC2jZ1ZhvwiVW74lhKURLEdOZJc0YOqIUln1JY/if7JisD6EnfiDQ2BpVllO nFbQjJkLHi7saOJUOi8sraCP+F/hBIF92L5ZGJG77fRZcQbhi9wT0dj1hWKJh3c29DcG xlrU9Y73GxEOV6QmFLYW1rIkRpZl1Ro70++SzhmXziAJ3NyLRLJ3nKDnUr2D+TVEM7Mc u20Q==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1781488585; x=1782093385; darn=postgresql.org; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=hvzfrWpJSlN4irZnj18aWk/pV0aUgmHO13e5pys1lO0=; b=VO/3RwbrF04l/AmRgK2PRM7l+s9FpRJUMxUjVgAfLqpGwpNvtq5vPMaLjGpHSRny09 Z+MEBcBuEOIzBbBihTKciyCOvbdm1Qg9PMoFPQ5ttWDSUn1OT4q3VAlgZAPHpY2RBXaI MK11WxCLnqxavIYj/4mYlPemAVnyKiT8+GOxe1buLI9Izjsj9KM4itQ0wS4ZYgkJ/Ye2 f4JR8uKiPeWqIDwi9QtWbizYXrl57au6nnLABBjcf8FaOABlu9QC556uFeOiB37A59w/ V6PRBqw8EzGzyUfyvzpC4nN3F6J9k9K13zHOXJV01TiVAXgph3rgqpGICCs1uvUGMjHg yW6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1781488585; x=1782093385; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=hvzfrWpJSlN4irZnj18aWk/pV0aUgmHO13e5pys1lO0=; b=na4TCxB0FM64aeIM50LkOzq1bdExaAQZYfbpWUQDxR1+yAfNlAWYBH5SKeYxT/EsZr VbE1UnOhA2UDPmdyw0R+w7Ok87GtAdo2kjke1vGR+szN8JN1B3Tp087Zys4oYCJDvvSG d6VE7jtto5uPYMkhFCDiJ0osr3tkE7Srr/f7tapRD5NvLHSyTF118FwRuSiZMXmPtUIi aYBi4qLU3nriZWJsThMrXJLdXzSMBeBKkTXwqzXaU/c/OYLRzv9B0WEPs/6F3sszNcO5 g8wu+zI5ZNg6BHA1tqNesoqNa9hjfqLl5WUZvEWj2Yxf+Z4uR0PWJ792SKTolvEG3W0f LBBw== X-Forwarded-Encrypted: i=1; AFNElJ+yCSxHBvxYmDzC+atNQs5xyCquA4AoUALIWnFVEzkzaR8Eusac8eqs+UjkjNghXCUZ4Q55Mvacl2sRBIzF@postgresql.org X-Gm-Message-State: AOJu0Yy1N+csaUCXqeZ5rp7jrgSgwdetAjggr4sMK9Z3nlBQdTRgWfuW Dqd1Dkj3h/5iGT2nzRSduq7avaaNaAPYn7Bma6qzdbjmRhlDDriB7rxqpJ92HagmpJkhCvz+5nO cT252RuSkQ6OQBM44ErIohdD8I82zqik= X-Gm-Gg: Acq92OGv63W3iCo5Qo7HfLwyHjcaJKz7vWgwsaL+cyHvh8M/GaG4LoMSXIE6q/Nel84 iGzgvUxYewO5BhQkPZJrt42O4OYkAIoyf8/kO3qJqmaJ57fh5BL9ECO0pYnMyDr1XCbAAokughh +v4qyz6OsKBJqHOx/t3sqvw/09yZxN+cjKJlf+Vynk2A+GbAeEIm1gwxzaJ5jLBVlJ2ye1puaio YibkhFYIXlKPyBGzcSZ3SG0ThWLGaa7V6erjuHNtDhNQXVdJ7a4BLV9u9ztwu1oBGtFQeGdur7j ETR+yRtxSVHEiVO2asD3F6s7BEGXxlKGGZjlIIkvfQZczNYMmtKHRVxKyXnUHS61+bvaXXWVlmM 5XHLbvi+KQgBnBbjAtNoc X-Received: by 2002:a17:906:7316:b0:beb:4d05:9a0a with SMTP id a640c23a62f3a-bfe29926e3dmr504383866b.15.1781488584510; Sun, 14 Jun 2026 18:56:24 -0700 (PDT) MIME-Version: 1.0 References: <20260604.132108.405136284364833955.ishii@postgresql.org> <20260609.171307.1883356507067957349.ishii@postgresql.org> In-Reply-To: Reply-To: assam258@gmail.com From: Henson Choi Date: Mon, 15 Jun 2026 10:56:12 +0900 X-Gm-Features: AVVi8CfXIIxlzcPE5GXlswV7L63N_AcVuAPWmLSWaOeUUocuRMACTUZpQEPjh8E Message-ID: Subject: Re: Row pattern recognition To: jian he , Tatsuo Ishii Cc: zsolt.parragi@percona.com, sjjang112233@gmail.com, vik@postgresfriends.org, er@xs4all.nl, jacob.champion@enterprisedb.com, david.g.johnston@gmail.com, peter@eisentraut.org, li.evan.chao@gmail.com, pgsql-hackers@postgresql.org Content-Type: multipart/alternative; boundary="00000000000040539b06544122ab" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000040539b06544122ab Content-Type: text/plain; charset="UTF-8" Hi jian, Thanks for the follow-up review patches. Once v48 is posted I'll base the work on it and submit the accepted ones in v49. Reasoning per patch below, with the disposition summarized at the end. > v47-0001 Replace some "AST" to "clause". Agreed on dropping "AST" -- it isn't our house term; "parse tree" is the idiom across the tree, and "AST" barely shows up outside these RPR files (just a jsonpath comment). One tweak: where the text names the data structure (the README diagram and III-2 header, the parsenodes.h comments on rpPattern), I'd use "PATTERN parse tree" rather than "clause", since "clause" loses the tree-vs-flat-NFA contrast. Plain prose that just means the SQL clause -> "clause", as you have it. > v47-0002 group Op and trailing ALT into one IF condition, this might > improve readability. Here I'd lean toward keeping the current form. The complexity here comes from the lexer (regex operators tokenized by the SQL operator lexer), and grouping the branches doesn't remove it -- it just layers more cleverness on top. I'd rather not add code complexity onto the lexer complexity; the flat one-token-one-branch form stays easy to scan. A real simplification would be lexer-side, out of scope here, so I'd leave 0002 as-is unless you feel strongly. > v47-0003 validate_rpr_define_volatility ... is removed. > contain_volatile_functions() already covers both volatile FuncExpr > callees and NextValueExpr ... The trade-off is that we lose the error > cursor position, but that seems better than maintaining extra code. You're right the detection is equivalent. But this restructures the parse-side define_walker, which the upcoming PREV/NEXT name-binding fix also reworks -- it even removes the spot where this check incidentally backstops a nav mis-binding. I'd rather make the simplify-vs-keep call against that final walker shape than churn it twice, so I'll keep the current planner-side check for now and revisit once that fix lands. > v47-0004 validateRPRPatternVarCount() ... the `rpDefs != NULL` > sentinel ... is awkward. That cross-check only needs to run once; > better expressed in the caller, transformDefineClause(). Agreed. The cross-check is a one-shot, list-level test -- a different kind of thing from the recursive per-node count, and not really what validateRPRPatternVarCount is for (validating the pattern var count). Moving it to the caller runs it once by construction and leaves the function true to its name, a pure count. Rejection behavior is unchanged. > v47-0005 ... The more intuitive order should be: transformExpr -> > coerce_to_boolean -> pull_var_clause, so pull_var_clause always sees > the final expression form. Agreed, that order is more intuitive. I'll make the change and watch for any change in error output or behavior -- coerce_to_boolean now runs per-define before pull_var_clause rather than in a second pass -- with guard tests for domain-over-bool, nav Var preserved, and non-coercible error, and submit it in v49 if those and the regression pass clean. > v47-0006 ... has_column_ref is not necessary. Column reference checks > can use contain_var_clause, and it's cheap. Also the message change: > -row pattern navigation offset must be a run-time constant > +row pattern navigation offset expression must not contain column > references Agreed contain_var_clause is simpler and the new message reads better. But this restructures define_walker too, which the nav name-binding fix also reworks -- so as with 0003 I'd rather settle it against the post-fix walker shape than churn it twice, and revisit once that fix lands. Summary: 0001 -> v49 (use "parse tree" where it names the structure, not "clause") 0002 -> keep the current flat form 0003 -> revisit after the nav name-binding fix 0004 -> v49 0005 -> v49 (after guard tests + regression pass) 0006 -> revisit after the nav name-binding fix For 0003 and 0006 -- both swap a hand-rolled walker for a standard helper (contain_volatile_functions, contain_var_clause) -- I'll think this through properly when I rework the walker: the convention of reaching for those helpers against what the current checks buy us (the error cursor, the specific messages), rather than deciding it by reflex. More broadly, the substantive correctness defects take priority: the nav name-binding fix and the rest of that series come ahead of these cleanup refactors. Once v48 is posted I'll write up the current list of known issues and send it to the thread. Thanks again for the careful pass. Best, Henson --00000000000040539b06544122ab Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi jian,

Thanks for the follow-up review patches. O= nce v48 is posted I'll base the
work on it and submit the accepted o= nes in v49. Reasoning per patch below,
with the disposition summarized a= t the end.

> v47-0001 Replace some "AST" to "claus= e".

Agreed on dropping "AST" -- it isn't our hous= e term; "parse tree" is the
idiom across the tree, and "A= ST" barely shows up outside these RPR files
(just a jsonpath commen= t). One tweak: where the text names the data
structure (the README diagr= am and III-2 header, the parsenodes.h comments
on rpPattern), I'd us= e "PATTERN parse tree" rather than "clause", since
&= quot;clause" loses the tree-vs-flat-NFA contrast. Plain prose that jus= t means
the SQL clause -> "clause", as you have it.

= > v47-0002 group Op and trailing ALT into one IF condition, this might> improve readability.

Here I'd lean toward keeping the cur= rent form. The complexity here comes
from the lexer (regex operators tok= enized by the SQL operator lexer), and
grouping the branches doesn't= remove it -- it just layers more cleverness
on top. I'd rather not = add code complexity onto the lexer complexity; the
flat one-token-one-br= anch form stays easy to scan. A real simplification
would be lexer-side,= out of scope here, so I'd leave 0002 as-is unless you
feel strongly= .

> v47-0003 validate_rpr_define_volatility ... is removed.
&g= t; contain_volatile_functions() already covers both volatile FuncExpr
&g= t; callees and NextValueExpr ... The trade-off is that we lose the error> cursor position, but that seems better than maintaining extra code.
You're right the detection is equivalent. But this restructures t= he
parse-side define_walker, which the upcoming PREV/NEXT name-binding f= ix
also reworks -- it even removes the spot where this check incidentall= y
backstops a nav mis-binding. I'd rather make the simplify-vs-keep = call
against that final walker shape than churn it twice, so I'll ke= ep the
current planner-side check for now and revisit once that fix land= s.

> v47-0004 validateRPRPatternVarCount() ... the `rpDefs !=3D N= ULL`
> sentinel ... is awkward. That cross-check only needs to run on= ce;
> better expressed in the caller, transformDefineClause().
Agreed. The cross-check is a one-shot, list-level test -- a different kind=
of thing from the recursive per-node count, and not really what
vali= dateRPRPatternVarCount is for (validating the pattern var count).
Moving= it to the caller runs it once by construction and leaves the
function t= rue to its name, a pure count. Rejection behavior is unchanged.

>= v47-0005 ... The more intuitive order should be: transformExpr ->
&g= t; coerce_to_boolean -> pull_var_clause, so pull_var_clause always sees<= br>> the final expression form.

Agreed, that order is more intuit= ive. I'll make the change and watch for
any change in error output o= r behavior -- coerce_to_boolean now runs
per-define before pull_var_clau= se rather than in a second pass -- with
guard tests for domain-over-bool= , nav Var preserved, and non-coercible
error, and submit it in v49 if th= ose and the regression pass clean.

> v47-0006 ... has_column_ref = is not necessary. Column reference checks
> can use contain_var_claus= e, and it's cheap. Also the message change:
> -row pattern naviga= tion offset must be a run-time constant
> +row pattern navigation off= set expression must not contain column
> =C2=A0references

Agre= ed contain_var_clause is simpler and the new message reads better.
But t= his restructures define_walker too, which the nav name-binding fix
also = reworks -- so as with 0003 I'd rather settle it against the post-fixwalker shape than churn it twice, and revisit once that fix lands.

= Summary:

=C2=A0 0001 =C2=A0-> v49 =C2=A0= (use "parse tree" where it names the structure, not "clause= ")
=C2=A0 0002 =C2=A0-> keep the current flat form
=C2=A0 000= 3 =C2=A0-> revisit after the nav name-binding fix
=C2=A0 0004 =C2=A0-= > v49
=C2=A0 0005 =C2=A0-> v49 =C2=A0 (after guard tests + regress= ion pass)
=C2=A0 0006 =C2=A0-> revisit after the nav name-binding fix=


For 0003 and 0006 -- both swap a hand-rolled walker for a st= andard helper
(contain_volatile_functions, contain_var_clause) -- I'= ll think this
through properly when I rework the walker: the convention = of reaching for
those helpers against what the current checks buy us (th= e error cursor,
the specific messages), rather than deciding it by refle= x.

More broadly, the substantive correctness defects take priority: = the nav
name-binding fix and the rest of that series come ahead of these= cleanup
refactors. Once v48 is posted I'll write up the current lis= t of known
issues and send it to the thread.

Thanks again for the= careful pass.

Best,
Henson
--00000000000040539b06544122ab--