public inbox for [email protected]  
help / color / mirror / Atom feed
From: Henson Choi <[email protected]>
To: jian he <[email protected]>
Cc: Tatsuo Ishii <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Subject: Re: Row pattern recognition
Date: Wed, 3 Jun 2026 07:14:21 +0900
Message-ID: <CAAAe_zAZDuHSiVGvz9c6h=Pe=aN+FKZOrdNPfbTOk3XV+WFKYQ@mail.gmail.com> (raw)
In-Reply-To: <CACJufxHVADC8e77pnQxSZRk7SYHCZFk6ZCM2HfTsKyD_kUji0A@mail.gmail.com>
References: <CAAAe_zDDuJofafXyhggPPPLzUXeejH-19NfcLR7jNNQxZchtog@mail.gmail.com>
	<[email protected]>
	<CAAAe_zBhzHyazC_Ot+HXBe_nToKc7AHs4r-s0nNcaBPw0L17wA@mail.gmail.com>
	<[email protected]>
	<CAAAe_zCZc8s4zWfmVVUOt0y_FU=v7YTcJJJ4UL2gBzJ2_KkUmQ@mail.gmail.com>
	<CACJufxHVADC8e77pnQxSZRk7SYHCZFk6ZCM2HfTsKyD_kUji0A@mail.gmail.com>

Subject: Re: Row pattern recognition

Hi Jian,

> 3.
> 240 variables is enough, as each variable supports multiple complex
> AND/OR conditions. [...] reserving the remaining ones in advance is a
> future-proof approach.

Thanks -- that's (3), and Tatsuo landed on the same choice, so it's
settled. I'll make the change for v48.

> XII-4. Memory Pool Management
>   Choice: Custom free list [...]
> It would be better simply to mention that:
> RPRNFAState and RPRNFAContext are allocated in a partition-lifespan
> memory context; they will be destroyed in release_partition.

Agreed -- that's clearer than the rationale list. I'll replace XII-4 with
that wording.

> If ctx->matchedState points to one of the states already in ctx->states,
> will nfa_state_free() be called on the same RPRNFAState twice? Is this
> double-free permitted, or do we have a mechanism in place to guard
> against it?

No -- they're disjoint by construction, so the two frees never touch the
same state. Step by step:

  - nfa_advance() detaches ctx->states up front (sets it to NULL) and
    rebuilds the list from scratch.
  - Each old state is pulled off and advanced; the one that reaches FIN is
    moved into matchedState and never re-added to ctx->states.
  - So at any moment a state is either on ctx->states or it is
    matchedState, never both.
  - ExecRPRFreeContext frees the list and frees matchedState -- disjoint
    sets, no overlap, so no guard is needed.
  - (When a new FIN replaces matchedState, the previous one is freed right
    there.)

> if (currentPos == ctxFrameEnd) {
>     nfa_match(winstate, ctx, NULL);
>     continue;
> }
> If I comment out the CONTINUE, the entire regression still succeeds.

It isn't dead -- it's the N-FOLLOWING boundary handler. nfa_match(ctx,
NULL) forces a mismatch that finalizes the context on its matchedState,
and the continue skips the rest since it's done for this row.

Without the continue, that finalized context is matched a second time in
the same row -- a context-matched-twice error. It only looks harmless
because the forced mismatch already emptied ctx->states, so the second
nfa_match iterates nothing (and it would also re-run
nfa_reevaluate_dependent_vars(), overwriting the shared nfaVarMatched the
other contexts read). So I'd keep it: a finalized context must not be
matched twice in the same row.

Thanks,
Henson


view thread (109+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Row pattern recognition
  In-Reply-To: <CAAAe_zAZDuHSiVGvz9c6h=Pe=aN+FKZOrdNPfbTOk3XV+WFKYQ@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox