Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wHHYE-0071dO-3D for pgsql-hackers@arkaria.postgresql.org; Mon, 27 Apr 2026 08:42:47 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wHHYE-00D5KY-0Y for pgsql-hackers@arkaria.postgresql.org; Mon, 27 Apr 2026 08:42:46 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wHHYD-00D5KP-2K for pgsql-hackers@lists.postgresql.org; Mon, 27 Apr 2026 08:42:45 +0000 Received: from meldrar.postgresql.org ([2a02:c0:301:0:ffff::31]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wHHYB-00000003J5p-096r for pgsql-hackers@postgresql.org; Mon, 27 Apr 2026 08:42:45 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=postgresql.org; s=20171124; h=Content-Transfer-Encoding:Content-Type: Mime-Version:References:In-Reply-To:From:Subject:Cc:To:Message-Id:Date:Sender :Reply-To:Content-ID:Content-Description; bh=/meA/nTIvSZOII+cuv4Mj3WTaMYD8NmMEVuCfrSJ2p4=; b=S4GgVLjoXUs3rozVL4k1eY6QUb ZpDFvABjIeqp15Wp0CUIEgAY43Q1MSDCY6iRYYRTTY1tOpoO8roMnpJoj9Tox1n5VqA6SIl2ANxwr 0W6KJlykhhXXQUvyKGJC97xFV/Mp9ff1zFfW4Qf7GEX7LoUp6FdwhwQ8XC8jn16jTk2KHYODCyHLa cx/Wx4yzJJbJLJFMO9q54+GccFewi2xOUm41EpOK2lLUbP/fWgd0cuX+Dy+FKVoAo02pWUp0Fp7yM rDi6hEAXW3+8rrwn7cZ0G1quf/0Tt7/8ogSt2YCr4hWA8lhbOftoC7SCHV+RtqpClRoj0gpAzfWHt qaVfUFkA==; Received: from [2409:11:4120:300:ace3:cafa:5520:b1b3] (helo=localhost) by meldrar.postgresql.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wHHY1-0085sH-2e; Mon, 27 Apr 2026 08:42:36 +0000 Date: Mon, 27 Apr 2026 17:42:20 +0900 (JST) Message-Id: <20260427.174220.1939160662649810289.ishii@postgresql.org> To: assam258@gmail.com Cc: zsolt.parragi@percona.com, sjjang112233@gmail.com, vik@postgresfriends.org, er@xs4all.nl, jacob.champion@enterprisedb.com, david.g.johnston@gmail.com, peter@eisentraut.org, li.evan.chao@gmail.com, pgsql-hackers@postgresql.org Subject: Re: Row pattern recognition From: Tatsuo Ishii In-Reply-To: References: <20260427.135243.1750053587183825244.ishii@postgresql.org> X-Mailer: Mew version 6.8 on Emacs 29.3 Mime-Version: 1.0 Content-Type: Text/Plain; charset=iso-2022-jp Content-Transfer-Encoding: 7bit X-Host-Lookup-Failed: Reverse DNS lookup failed for 2409:11:4120:300:ace3:cafa:5520:b1b3 (failed) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi Henson, > Thank you for pointing this out. I will analyse A1 and reflect > the findings in the next revision. Thanks in advance. >> Comment to 0009: >> >> + /* Mark VAR in visited before duplicate check to prevent DFS loops >> */ >> + winstate->nfaVisitedElems[WORDNUM(state->elemIdx)] |= >> + ((bitmapword) 1 << BITNUM(state->elemIdx)); >> >> Why do not use bms here? Maybe winstate->nfaVisitedElems could become >> quite big? >> > > Thank you for the suggestion. I was not aware of Bitmapset when > this was written; I will switch nfaVisitedElems to Bitmapset. > > On sizing, nfa->nelems is structurally bounded by RPR_ELEMIDX_MAX > (INT16_MAX), since RPRNFAState.elemIdx is int16 and scanRPRPattern() > rejects anything past that with "pattern too complex". So the > bitmap is at most 4 KB per WindowAgg even in the worst case. Hmm. From the comment from bitmapset.c: * A bitmap set can represent any set of nonnegative integers, although * it is mainly intended for sets where the maximum value is not large, * say at most a few hundred. Maybe 4 KB is tool large? If so, there's no need to rush to change the implementation. You can leave it as it is for now. >> Comment to 0024: >> >> + first ( value >> anyelement [, offset >> bigint ] ) >> + anyelement >> + >> + >> + Returns the column value at the row offset >> + rows after the match start row; >> + returns NULL if the target row is beyond the current row. >> + offset defaults to 0 if omitted, referring >> to the >> + match start row itself. >> + offset must be a non-negative integer. >> + offset must not be NULL. >> + Can only be used in a DEFINE clause. >> + >> >> "Returns the column value at the row" is not appropriate because >> first() accepts any expression as the first argument. e.g. first(col1 >> + col2). Instead what about "Returns value evaluated at the row"? >> Same thing can be said to other row pattern navigation operations. >> I think they inherits my mistakes in my earlier patch. Sorry for this. >> > > Thank you, and please do not apologise -- I should have caught this > when I carried the wording across. I will reword the synopses for > first(), last(), prev(), next() (and any other entry that inherited > the same phrasing) along the line you suggested. Thank you. >> Let's prohibit volatile functions in DEFINE. Although this needs extra >> checking, it would make proceeding codes and tests simpler. If >> others, especially Vik and Jacob, think differently, please join the >> discussion. >> > > Thank you for the decision. Since this is a policy change that may > still attract further community discussion, I would like to handle > it as a lower-priority, self-contained follow-up on top of 0031, so > it can be picked up, held, or revised independently. Tentative > plan: reject volatile functions in DEFINE during parse-analysis via > contain_volatile_functions(), fold with the existing offset > runtime-constant check where the two share traversal, and -- if the > prohibition lands -- convert the B9 test into an error-case test > and drop the XXX note. Your tentative plan looks good to me. >> > On sequencing, if we do take it up, I would suggest handling it >> > after the 31-patch set, alongside the README.rpr split as >> > follow-up work on top of 0031. Whether it ultimately lands >> > inside v47 or as a separate piece on top does not need to be >> > decided right now -- there is still room to discuss it as the >> > review progresses, and I am happy to adjust either way based on >> > your direction. >> >> I too prefer to have it after the 31-patch set. >> > > Thank you, confirmed. Both items (volatile-DEFINE prohibition and > README.rpr split) will land as follow-up patches on top of 0031, > after the current series is committed. > > One quick question: for the 0007 / 0009 / 0017 / 0024 review fixes > summarised below, would additional follow-up patches on top of 0031 > be preferable to refreshing the already-reviewed series? Yes, that's fine. I will apply the delta patch for 0007 / 0009 / 0017 / 0024 on top of 0031. > For ease of tracking, here is a one-line summary per patch of the > changes I plan to fold into the next revision: > > 0007: analyse A1 and, if the diagnosis holds, revise it so the > frame optimization bypass is observable in EXPLAIN > (adjusting the comment and expected output accordingly). > 0009: switch WindowAggState.nfaVisitedElems from a bitmapword[] > to Bitmapset, replacing the manual WORDNUM/BITNUM sites > with bms_add_member() / bms_is_member() / bms_free(). > 0017: fix the "Checks for" -> "Check for" comment, rewrite the > called-out passive-voice errmsg() strings (and the similar > ones nearby) in the active voice, and use Max() at the two > offset-update sites. > 0024: reword the navigation synopses (first/last/prev/next) to > "Returns the value of evaluated at the row ...", > and replace the non-ASCII ≠ / → in rpr.out with != / -> > (sweeping the rest of rpr*.out for other non-ASCII). Confirmed. > In addition, I plan to follow the current series with two new > patches on top of 0031 (final numbering deferred until they are > posted): > > new: README.rpr split -- extract the design notes currently > inlined in execRPR.c into src/backend/executor/README.rpr, > as previously discussed. > new: Prohibit volatile functions in DEFINE -- separate, > self-contained patch as described above; intended as a > lower-priority follow-up that can be picked up, held, or > revised independently of the main series depending on how > the community discussion settles. Ok. > Separately, one item is deferred for later discussion rather than > folded into a patch: > > deferred: B7 Recursive CTE XXX in 0007 -- whether this case > falls under ISO/IEC 9075-2 4.18.5 / 6.17.5 prohibition > ("Ok, let's discuss on this later on."). Noted, happy > to pick this up whenever convenient. Ok. > Once you let me know your preference on the sequencing question > above, I will send the corresponding patches as soon as they are > ready, and post the two follow-ups separately so that they can be > reviewed and committed on top of the main series. > > Thank you again for the thorough review. Thank you for your effort! -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp