Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wXslm-003XLH-34 for pgsql-hackers@arkaria.postgresql.org; Fri, 12 Jun 2026 03:41:23 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wXsll-000fbb-1m for pgsql-hackers@arkaria.postgresql.org; Fri, 12 Jun 2026 03:41:21 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wXsll-000fbL-0l for pgsql-hackers@lists.postgresql.org; Fri, 12 Jun 2026 03:41:21 +0000 Received: from meldrar.postgresql.org ([2a02:c0:301:0:ffff::31]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1wXsli-00000002f0s-1h2e for pgsql-hackers@postgresql.org; Fri, 12 Jun 2026 03:41:20 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=postgresql.org; s=20171124; h=Content-Transfer-Encoding:Content-Type: Mime-Version:References:In-Reply-To:From:Subject:Cc:To:Message-Id:Date:Sender :Reply-To:Content-ID:Content-Description; bh=NxiDP462Xqb+9i0g7VLGhiGBymv3ANTFfb/6/MFLmps=; b=LSZCIE4PCbo0GziGhdvent5QXZ Ue3kKwF8yGyMW3rJ4BP+IXLKkFHn8gu8tEnZEib8+IYerURZSOrElj72HDQdCOfp0w46nSaciweup uSJaKYMRoYmYTanbiiwlU2ofv3pfXXKNOBlD40vDuOWO+p9dvOS4LvEcy5mcUh6bH4WXAtyATHqZj a77DJBVgkK7f5CwEaPNedUhbgQPE4m0a35zh+iQFAHiMt8Vw0G173EvKWVXBdYp7j8b8iQgDSFTYU 9S7nCTQB2+sXtA162z7R3h/SIs5CxzteQQG8VTOO619vRnLTnOCk7ZxAZh4PoXdaV5ONi08kx3Box 6qXGYYAg==; Received: from p383008-ipoe.ipoe.ocn.ne.jp ([153.240.252.7] helo=localhost) by meldrar.postgresql.org with esmtpsa (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wXsla-007YsG-01; Fri, 12 Jun 2026 03:41:12 +0000 Date: Fri, 12 Jun 2026 12:40:50 +0900 (JST) Message-Id: <20260612.124050.1286931795409220006.ishii@postgresql.org> To: assam258@gmail.com Cc: jian.universality@gmail.com, zsolt.parragi@percona.com, sjjang112233@gmail.com, vik@postgresfriends.org, er@xs4all.nl, jacob.champion@enterprisedb.com, david.g.johnston@gmail.com, peter@eisentraut.org, li.evan.chao@gmail.com, pgsql-hackers@postgresql.org Subject: Re: Row pattern recognition From: Tatsuo Ishii In-Reply-To: References: <20260610.231902.1271202110858446443.ishii@postgresql.org> X-Mailer: Mew version 6.8 on Emacs 29.3 Mime-Version: 1.0 Content-Type: Text/Plain; charset=us-ascii Content-Transfer-Encoding: 7bit List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi Henson, > Hi Tatsuo, > > == 1. DEFINE evaluation use-after-free (to Tatsuo) == > >>> nocfbot-0039 (the DEFINE memory-leak fix) added a ResetExprContext() in >>> update_reduced_frame, but it resets the wrong context. >>> >>> ps_ExprContext is the per-output-tuple context that ExecWindowAgg resets >>> once per output row. update_reduced_frame now resets it once per NFA > row, >>> while the output row is still being formed -- so a pass-by-ref window >>> function result already datum-copied into that per-tuple memory (when >>> numfuncs > 1) is freed before ExecProject reads it. >>> >>> Neither v47 nor the patch is the answer on its own: v47 had no reset > here, >>> so no use-after-free, but the DEFINE scratch accumulated over the whole >>> forward scan (the leak nocfbot-0039 fixed); nocfbot-0039 added the > per-row >>> reset but on the shared per-output-tuple context. We do want a per-row >>> reset -- just not on that context. >>> >>> So I think this needs a dedicated ExprContext for DEFINE evaluation, > reset >>> once per NFA row: it keeps the memory bounded without touching the >>> per-output-tuple results. >>> >>> Question: does a dedicated DEFINE ExprContext look right to you? >> >> Can we use winstate->tmpcontext instead? > > I don't think we can. tmpcontext is idle where update_reduced_frame > would reset it, but not *during* DEFINE evaluation: a DEFINE with NEXT() > re-enters the spooler from inside its ExecEvalExpr (ExecEvalRPRNavSet > -> window_gettupleslot -> spool_tuples), and spool_tuples resets > winstate->tmpcontext (via ExecQualAndReset on partEqfunction) for every > input row it pulls under a PARTITION BY. ExecEvalRPRNavRestore parks a > pass-by-ref nav result in that per-tuple memory to survive until the > next reset. To give one example, > > -- the cast forces a pass-by-ref result; a by-value bigint is safe > DEFINE b AS PREV(price::numeric) < NEXT(price::numeric) > > frees PREV's result when NEXT spools the next row, before numeric_lt() > reads it -- the same use-after-free as nocfbot-0039, just on a different > context. It is the normal forward-scan path (the NFA runs ahead of the > spool), and after a tuplestore spill even NEXT-free DEFINEs hit it. > > More generally, tmpcontext fits its existing users because each one sets > its slots, evaluates, and resets within a shallow, straight-line region; > it is best kept to that limited, low-depth usage. DEFINE evaluation > re-enters the spooler mid-expression, so it cannot honor that contract. > > So I think the dedicated DEFINE ExprContext is the right call: a > once-per-NFA-row reset is a third cadence, distinct from tmpcontext > (per-input-tuple) and ps_ExprContext (per-output-tuple) -- the same > one-context-per-cadence pattern nodeWindowAgg and nodeAgg already use. Thanks for the explanation. Got it. We need the third ExprContext. Regards, -- Tatsuo Ishii SRA OSS K.K. English: http://www.sraoss.co.jp/index_en/ Japanese:http://www.sraoss.co.jp