Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wXJmK-003BuZ-2q for pgsql-hackers@arkaria.postgresql.org; Wed, 10 Jun 2026 14:19:37 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wXJmH-00Byzb-1v for pgsql-hackers@arkaria.postgresql.org; Wed, 10 Jun 2026 14:19:33 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wXJmH-00ByzS-0z for pgsql-hackers@lists.postgresql.org; Wed, 10 Jun 2026 14:19:33 +0000 Received: from meldrar.postgresql.org ([2a02:c0:301:0:ffff::31]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wXJmE-000000021O5-1hSm for pgsql-hackers@postgresql.org; Wed, 10 Jun 2026 14:19:32 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=postgresql.org; s=20171124; h=Content-Transfer-Encoding:Content-Type: Mime-Version:References:In-Reply-To:From:Subject:Cc:To:Message-Id:Date:Sender :Reply-To:Content-ID:Content-Description; bh=bpwSTC6xfbMJuceJTVhFS4qHpXq0Mz60amePXPPpXOQ=; b=QaYKTAkP0gmf1T6aJ05wYaX+OQ s/FSEEXYBvm4U2PHQOM1x/b64TbbUgelKI2nqG/IIg3XPJW6u+GjmGX2wp8PyYqgm1Ay2sTsivSTi j23JjQOiZwUF6KJUA7ehGTKgavoaWhjqqTk+nPmsmZnxs9aK1e/zP2qxtC7VutmPEWL+iZmPJ9Dg7 dZyL/J4AEJWm9kYFKhBnB3aP8HyQsbu6Kt0mQuKgCkBLQejO0Co5+X8mdZhMMubzY5TzjFiIc0HVU eiit2dwKGjYcMzXwLTVFoLJ+H4+xej/fKhj/6+MsTkaJPQMGpEV51NAV7BvpqLAZ+LQmB4thk1LlK 0QWEMx/A==; Received: from [2409:11:4120:300:6feb:c07f:709a:911d] (helo=localhost) by meldrar.postgresql.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wXJm2-006rj5-2W; Wed, 10 Jun 2026 14:19:21 +0000 Date: Wed, 10 Jun 2026 23:19:02 +0900 (JST) Message-Id: <20260610.231902.1271202110858446443.ishii@postgresql.org> To: assam258@gmail.com Cc: jian.universality@gmail.com, zsolt.parragi@percona.com, sjjang112233@gmail.com, vik@postgresfriends.org, er@xs4all.nl, jacob.champion@enterprisedb.com, david.g.johnston@gmail.com, peter@eisentraut.org, li.evan.chao@gmail.com, pgsql-hackers@postgresql.org Subject: Re: Row pattern recognition From: Tatsuo Ishii In-Reply-To: References: X-Mailer: Mew version 6.8 on Emacs 29.3 Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit X-Host-Lookup-Failed: Reverse DNS lookup failed for 2409:11:4120:300:6feb:c07f:709a:911d (failed) List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi Henson, > Hi Tatsuo, Jian, > > While doing a self-review pass over the incremental fixes on top of v47 I > ran into two issues where I'd rather agree on an approach with you before > I pick one. One of them is a regression I introduced myself in the DEFINE > memory-leak fix; the other is an original design point from v47. There is > also a third bug which I plan to handle together with the second one, since > it can be affected by that change -- I describe it at the end. > > I have verified both of the issues below on an assert-enabled build (and a > non-assert build where relevant). > > > == 1. DEFINE evaluation reuses the per-output-tuple context > (use-after-free) == > > nocfbot-0039 (the DEFINE memory-leak fix) added a ResetExprContext() in > update_reduced_frame, but it resets the wrong context. > > ps_ExprContext is the per-output-tuple context that ExecWindowAgg resets > once per output row. update_reduced_frame now resets it once per NFA row, > while the output row is still being formed -- so a pass-by-ref window > function result already datum-copied into that per-tuple memory (when > numfuncs > 1) is freed before ExecProject reads it. > > Minimal trigger -- a pass-by-ref window function plus a second one over an > RPR window: > > SELECT lag(company) OVER w, count(*) OVER w FROM stock > WINDOW w AS (PARTITION BY company ORDER BY tdate > ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING > AFTER MATCH SKIP PAST LAST ROW INITIAL > PATTERN (START UP+) > DEFINE START AS TRUE, UP AS price > PREV(price)); > > On a CLOBBER_FREED_MEMORY build the lag column comes out as 0x7F garbage; > in production it is garbage or a crash. (An aggregate is not required -- > lag + first_value hits the same reset via the frame-access path.) > > Neither v47 nor the patch is the answer on its own: v47 had no reset here, > so no use-after-free, but the DEFINE scratch accumulated over the whole > forward scan (the leak nocfbot-0039 fixed); nocfbot-0039 added the per-row > reset but on the shared per-output-tuple context. We do want a per-row > reset -- just not on that context. > > So I think this needs a dedicated ExprContext for DEFINE evaluation, reset > once per NFA row: it keeps the memory bounded without touching the > per-output-tuple results. > > Question: does a dedicated DEFINE ExprContext look right to you? Can we use winstate->tmpcontext instead? > > == 2. PREV/NEXT/FIRST/LAST placeholders collide with user functions == > > The nav operations are polymorphic pg_catalog functions (anyelement, OIDs > 8126-8133) recognized by funcid in parse_func.c, which collides with > same-name user functions. > > Outside DEFINE, a same-name function masks or clashes with the placeholder: > with public.last(anyelement), SELECT last(123) fails "cannot use last > outside a DEFINE clause"; with public.next(numeric), SELECT next(10) fails > "function next(integer) is not unique"; and even with no user function, > last(123) errors instead of "function last(integer) does not exist". > > Inside DEFINE, a same-name function with an exact-type match beats the > anyelement placeholder, so PREV(price) silently becomes a plain FuncExpr > instead of an RPRNavExpr -- a wrong match result with no error (reproduced > for numeric, text and int). And ruleutils deparses a bare PREV(, so > reparsing a view under a search_path with public.prev rebinds it (pg_dump > is safe via search_path = ''). > > This is original v47 design, not a regression. Per the standard, > PREV/NEXT/FIRST/LAST are navigation operations with dedicated syntax, not > general-namespace functions -- the collision comes from mapping them onto > catalog functions plus search-path resolution. > > I haven't found a clean approach yet. Inside DEFINE these names have to be > the navigation operation (per the standard), yet outside DEFINE they > shouldn't shadow or break same-name user functions the way the catalog > placeholders do -- and since the deparse output is unqualified (a bare > PREV(...)), whatever we choose also has to round-trip cleanly. I'm not > sure how best to reconcile those. > > My rough leaning is to not add catalog functions for these at all: leave > resolution outside DEFINE exactly as it is today, and only inside DEFINE > adjust the function-resolution path itself to recognize the navigation > operations. But that is still quite abstract. > > Question: how would you approach this? I need more time think about this. Regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp