Re: Row pattern recognition

public inbox for [email protected]  
help / color / mirror / Atom feed

From: jian he <[email protected]>
To: Henson Choi <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: Tatsuo Ishii <[email protected]>
Subject: Re: Row pattern recognition
Date: Tue, 9 Jun 2026 16:57:20 +0800
Message-ID: <CACJufxFnwdQSApt2vWwYCd0gtf+JjFDxT2hbxHi=+dhFJc+-1g@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CAAAe_zAZDuHSiVGvz9c6h=Pe=aN+FKZOrdNPfbTOk3XV+WFKYQ@mail.gmail.com>
	<CAAAe_zDz3z2Paidk3jHOm9S3eMVLoXRxK0Lyo=5i_9-EfSH7fA@mail.gmail.com>
	<[email protected]>
	<[email protected]>

Hi.

The below in advanced.sgml actually seems better suited for
explain.sgml (which currently has no changes).
At the very least, explain.sgml should mention it.

   <para>
    When examining query plans for Row Pattern Recognition with
    <command>EXPLAIN</command>, the pattern output may include special
    markers that indicate optimization opportunities. A double quote
    <literal>"</literal> marks where pattern absorption can occur,
    and a single quote <literal>'</literal> marks absorbable elements
    within a branch. For example, <literal>a+"</literal> indicates that
    repeated matches of <literal>a</literal> can be absorbed, while
    <literal>(a' b')+"</literal> shows that both <literal>a</literal>
    and <literal>b</literal> within the group are absorbable.
    These markers are primarily useful for understanding internal
    optimization behavior.
   </para>

<programlisting>
......
 PATTERN (LOWPRICE UP+ DOWN+)
 DEFINE
  LOWPRICE AS price &lt;= 100,
  UP AS price &gt; PREV(price),
  DOWN AS price &lt; PREV(price)
);
</programlisting>
LOWPRICE, UP, DOWN should be lower case, since they are not keywords?

scanRPRPatternRecursive
It would be better to replace all  ``non-trivial quantifier`` to
``non-trivial quantifier (not {1,1}).``

I want to rename allocateRPRPattern to makeRPRPattern, what do you think?

RPRPatternNode.reluctant_location is not necessary. we never use it
for error reporting,
and since it's restricted to the end of a pattern, it adds very little value.

Is nav_volatile_func_checker really necessary?
We already have contain_volatile_functions and
contain_volatile_functions_not_nextval.
Also per https://www.postgresql.org/message-id/[email protected]
The convention is not to check expression volatility during parse
analysis stage.

+bool        trailing_alt pg_node_attr(query_jumble_ignore);
I may check whether trailing_alt is necessary later,  but
query_jumble_ignore is not needed because it's only manipulated in
gram.y and we do JumbleQuery in parse_analyze_fixedparams, which
obviously is later than all gram.y related work.
----------------------------------------------
v47-0001-v47-rpr-reformat-gram.y-again.nocfbot
The HINT message below is somewhat misleading, since the question mark
(?) is actually after the range quantifier.
Change errmsg to errmsg("invalid token \"%s\" after range quantifier",
$5) will make the error message clearer.

+SELECT FROM rpr_err WINDOW w AS ( ROWS BETWEEN CURRENT ROW AND 1
FOLLOWING PATTERN (A {1,2}??) DEFINE A AS TRUE);
+ERROR:  invalid token "??" after range quantifier
+LINE 1: ...TWEEN CURRENT ROW AND 1 FOLLOWING PATTERN (A {1,2}??) DEFINE...
+                                                             ^
+HINT:  Only "?" is allowed after {n,m} to make it reluctant.
----------------------------------------------
v47-0002-v47-rpr-regress-tests-refactoring.nocfbot

-- Consecutive GROUP merge: (A B){2} (A B)+ -> (a b){3,}
-- Exercises the child->max == RPR_QUANTITY_INF branch in
mergeConsecutiveGroups,
--- where a finite prev ((A B){2,2}) meets an infinite child ((A B)+).

We should just drop these comments from the regress tests.
mergeConsecutiveGroups is a static function that might change, and we don't want
to have to update all the tests every time it does.  Also the comments meaning
is still totally clear without them.
----------------------------------------------
v47-0003-refactor-many-functions-in-rpr.c.nocfbot
I did some major refactoring in rpr.c. Instead of allocating new lists to
remove duplicates, we now just mutate them in place using list_delete_nth_cell
(a common pattern in the codebase).
for overflow handling, it would be better use pg_mul_s32_overflow,
pg_add_s32_overflow.
See the commit message for the details.

v47-0004-another-v47-misc-refactoring.nocfbot
Some misc refactoring, that seems to make sense to me, see the commit
message for rationale.

The above v47-0001, v47-0002, and v47-0003, v47-0004 are based on
https://github.com/assam258-5892/postgres
latest commit.



--
jian
https://www.enterprisedb.com/


Attachments:

  [application/octet-stream] v47-0004-another-v47-misc-refactoring.nocfbot (5.1K, 2-v47-0004-another-v47-misc-refactoring.nocfbot)
  download

  [application/octet-stream] v47-0002-v47-rpr-regress-tests-refactoring.nocfbot (14.5K, 3-v47-0002-v47-rpr-regress-tests-refactoring.nocfbot)
  download

  [application/octet-stream] v47-0001-v47-rpr-reformat-gram.y-again.nocfbot (11.6K, 4-v47-0001-v47-rpr-reformat-gram.y-again.nocfbot)
  download

  [application/octet-stream] v47-0003-v47-refactor-many-functions-in-rpr.c.nocfbot (21.5K, 5-v47-0003-v47-refactor-many-functions-in-rpr.c.nocfbot)
  download

view thread (109+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Row pattern recognition
  In-Reply-To: <CACJufxFnwdQSApt2vWwYCd0gtf+JjFDxT2hbxHi=+dhFJc+-1g@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox