Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wXC97-0037J6-0r for pgsql-hackers@arkaria.postgresql.org; Wed, 10 Jun 2026 06:10:37 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wXC96-009QBz-06 for pgsql-hackers@arkaria.postgresql.org; Wed, 10 Jun 2026 06:10:36 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wXC95-009QBp-1y for pgsql-hackers@lists.postgresql.org; Wed, 10 Jun 2026 06:10:35 +0000 Received: from mail-ed1-x52d.google.com ([2a00:1450:4864:20::52d]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wXC93-00000001xnQ-136y for pgsql-hackers@postgresql.org; Wed, 10 Jun 2026 06:10:34 +0000 Received: by mail-ed1-x52d.google.com with SMTP id 4fb4d7f45d1cf-6877c719cb0so8059058a12.2 for ; Tue, 09 Jun 2026 23:10:33 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1781071832; cv=none; d=google.com; s=arc-20240605; b=WoWeyg+tAH0EzkpbMPQBxvSMe42jFtw6wmCdY7eM2iIlIyac/ZI8X0l8mBrhSVcMWr qjsNWxKh/k29+D7Oe3Vs+1jbnS4Bq9KeMrNqAor8M4j/igJA9zfTME59pUOuhmUJV/r1 mGPG+bR70+cPDqXC15+l5GBB3JNUwBobwFAdVA7On+AGhotsJ49W1XetLhp1Ll6nGnWc WQerRtVQ2vQCh+P2PTM6SSJSyOk1Qo7kDQ9+eHBbXXGUBq+/6+JIPVNyX94bLM5hM9Yn aEEYSzjpmshFwU3jRZNDwi0oMKjEBFo3iVejWzWAg41rlrEW7tNOeFB0OoPHXRoEfclO 9zdw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:dkim-signature; bh=AJEkJKk4vQGd0FmfeqvCYai/izLX81oJbm/tcRk1Bu4=; fh=Z7o5jj4AkOXTLiHWpQQ+QMZpWmHrjKI6rDIljp8cL0I=; b=jc4unZCjhz73RxU3Os7KhT9WlcnDQk6/uxsYCA9HuLg4MdQ7VV6TqiXNEoHXfaIF46 4fvqeXMOlmqknj9tkcRGFGoavyyHhHZSL5t6Jj1kcJxb2DAFVqwOCjezD5jgglPUL0d6 k0xKRk5CruRJyl+4bSEPDv7YqJr0+Q6gRmmHdjCVYCLgDd22WGB0G54z7RSxpEccVt95 LfnGRpOAK3GHL+5Ag3qPbqTBTHru0wsYbVpTuwA63nEH+Wd5izG/JhoED24EB4yVqCjA 43rDPHU7SF0Dp75f7tYYCJjeYGi2xCHGovfucFGh4/t0vTWNQ2T3utNaA3hJjl9Snmyb iRIA==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1781071832; x=1781676632; darn=postgresql.org; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=AJEkJKk4vQGd0FmfeqvCYai/izLX81oJbm/tcRk1Bu4=; b=B5dlL8YhTbecTF2UWZJB7BYYN6wuWb1aHYMjxxbehANcZbAYE/tW4vcf86zJ8Tvfau gQUe5kZF2Fic8EBcQstfdZChQUbNKY+snz2VYtKFUqFOc5ZY5Z5FK1IR92gbqhTdWgn8 j+FDM1qOsEKb7jgXwgSEBhg9HRIwgCAfUH4xTfXFJVfRbwWhEptB0uWiyUvG8vpFhSdV v87U6SFY0tRs5+n2lePVy6NYjFIt30ldk3itQytpLjf01xaVgv8gBYKtYbSzS4W0tS3z duS5RxY1Htckfa1brRum+Zr5VHWP3EsPUXW3m4MO6GwWBkHaGh0u6d8XTf+I2Eevf1VI op1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1781071832; x=1781676632; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=AJEkJKk4vQGd0FmfeqvCYai/izLX81oJbm/tcRk1Bu4=; b=eVo0uRMJUNG21/p9Ay5W0jyQHK/DwsS6j7jO4ELsharPe33NpW/cRSwsFpCc3pxBmn RgdTIfuFHf1Zil2CNTE028CRRey8UR6pagHe5iNaxyeTUAtXtqu3I9Ke8Br0uqJu3hGU 0k7I8rojcojt7GU5r0gBQcsTx4ASbZBRhS6XGQ44W41Rw0Iim1tBdwrqvnclIMYiQbzm ZSKk7pG1cSXDIjM0n6/zuruJPsxmzFRti0vsDQJ0JPqGST29iygcIjxfvELBzJjQ5v8y 2Z4F8jJy5rkczEiVipHo1Cd2keUB0Z5Eoam/EOq9ad6V13vBn2vzEpyacqg1wSqG+sZR o7BQ== X-Forwarded-Encrypted: i=1; AFNElJ9fq0TsvwusEPPRdgWdXPdAqDmTC1b7MbZnN1qeQpFffleD76nfzuF7MVLPnacTN+ZeVpXysvciJGMuLWnC@postgresql.org X-Gm-Message-State: AOJu0Yxd38dl7gRDZSRqI1suGEM2iJUiSHz2JO0wfrHPrY/B27o4Wt9o RrffolzQ5rs11sxdfgEaCp4zQIIgmdcWrTZTowpoVrvCavJATFwzdh4lXRcOHZACd60Ocmoi6uT iJsuu1L2eMa/f9n5tsn2hMS7dtbCCgYs= X-Gm-Gg: Acq92OHMU1NyLwmrD9VQc1JCIlk4C83Z5aXHf3TWYqvTwfZgjRkIWtWujcks/WltTrk 654hVJ9vWgqQbEec1fXBMx69NkISOs2p/xraWb6QHGq1lelnlYbsq+9ahMqdlN3WrvMZ6QX3Zn+ 8sIXNf0O9wvr8C9AJdM+BJue8ldKPnQX+H5ggHaRGy3AmbgWhFLRxaMCLMk91qZuu7yxp6gkOxX 9tjGK2dDFnuPdW2oNbOQMna5gAifc+atZEWZM9LGUPRQQMoBLqT/9wW/P5ESOXsmZj0ASHLSsV9 wZKME3SO4kFEBSOLyMp1mO9s872ePJtu7NhxeAoQM1m/q3LfSkfD6i9M4kMtBSIsNYz8s+bksxY QYuzLZeUwWpkg8HjUdw5dojApWSvzRCPPyH+hy3rx2w== X-Received: by 2002:a17:907:1c23:b0:beb:362b:69f8 with SMTP id a640c23a62f3a-bf372f21bbamr1129124166b.41.1781071831278; Tue, 09 Jun 2026 23:10:31 -0700 (PDT) MIME-Version: 1.0 References: <20260604.132108.405136284364833955.ishii@postgresql.org> <20260609.171307.1883356507067957349.ishii@postgresql.org> In-Reply-To: Reply-To: assam258@gmail.com From: Henson Choi Date: Wed, 10 Jun 2026 15:10:19 +0900 X-Gm-Features: AVVi8Cd4rMevMRV7bwSSHFBEIwVIk1nj7F-iuqew31WJOovZuSC3Q4iv3RrkGyo Message-ID: Subject: Re: Row pattern recognition To: Tatsuo Ishii , jian he Cc: zsolt.parragi@percona.com, sjjang112233@gmail.com, vik@postgresfriends.org, er@xs4all.nl, jacob.champion@enterprisedb.com, david.g.johnston@gmail.com, peter@eisentraut.org, li.evan.chao@gmail.com, pgsql-hackers@postgresql.org Content-Type: multipart/alternative; boundary="000000000000d2a7a90653e01909" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000d2a7a90653e01909 Content-Type: text/plain; charset="UTF-8" Hi Tatsuo, Jian, While doing a self-review pass over the incremental fixes on top of v47 I ran into two issues where I'd rather agree on an approach with you before I pick one. One of them is a regression I introduced myself in the DEFINE memory-leak fix; the other is an original design point from v47. There is also a third bug which I plan to handle together with the second one, since it can be affected by that change -- I describe it at the end. I have verified both of the issues below on an assert-enabled build (and a non-assert build where relevant). == 1. DEFINE evaluation reuses the per-output-tuple context (use-after-free) == nocfbot-0039 (the DEFINE memory-leak fix) added a ResetExprContext() in update_reduced_frame, but it resets the wrong context. ps_ExprContext is the per-output-tuple context that ExecWindowAgg resets once per output row. update_reduced_frame now resets it once per NFA row, while the output row is still being formed -- so a pass-by-ref window function result already datum-copied into that per-tuple memory (when numfuncs > 1) is freed before ExecProject reads it. Minimal trigger -- a pass-by-ref window function plus a second one over an RPR window: SELECT lag(company) OVER w, count(*) OVER w FROM stock WINDOW w AS (PARTITION BY company ORDER BY tdate ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING AFTER MATCH SKIP PAST LAST ROW INITIAL PATTERN (START UP+) DEFINE START AS TRUE, UP AS price > PREV(price)); On a CLOBBER_FREED_MEMORY build the lag column comes out as 0x7F garbage; in production it is garbage or a crash. (An aggregate is not required -- lag + first_value hits the same reset via the frame-access path.) Neither v47 nor the patch is the answer on its own: v47 had no reset here, so no use-after-free, but the DEFINE scratch accumulated over the whole forward scan (the leak nocfbot-0039 fixed); nocfbot-0039 added the per-row reset but on the shared per-output-tuple context. We do want a per-row reset -- just not on that context. So I think this needs a dedicated ExprContext for DEFINE evaluation, reset once per NFA row: it keeps the memory bounded without touching the per-output-tuple results. Question: does a dedicated DEFINE ExprContext look right to you? == 2. PREV/NEXT/FIRST/LAST placeholders collide with user functions == The nav operations are polymorphic pg_catalog functions (anyelement, OIDs 8126-8133) recognized by funcid in parse_func.c, which collides with same-name user functions. Outside DEFINE, a same-name function masks or clashes with the placeholder: with public.last(anyelement), SELECT last(123) fails "cannot use last outside a DEFINE clause"; with public.next(numeric), SELECT next(10) fails "function next(integer) is not unique"; and even with no user function, last(123) errors instead of "function last(integer) does not exist". Inside DEFINE, a same-name function with an exact-type match beats the anyelement placeholder, so PREV(price) silently becomes a plain FuncExpr instead of an RPRNavExpr -- a wrong match result with no error (reproduced for numeric, text and int). And ruleutils deparses a bare PREV(, so reparsing a view under a search_path with public.prev rebinds it (pg_dump is safe via search_path = ''). This is original v47 design, not a regression. Per the standard, PREV/NEXT/FIRST/LAST are navigation operations with dedicated syntax, not general-namespace functions -- the collision comes from mapping them onto catalog functions plus search-path resolution. I haven't found a clean approach yet. Inside DEFINE these names have to be the navigation operation (per the standard), yet outside DEFINE they shouldn't shadow or break same-name user functions the way the catalog placeholders do -- and since the deparse output is unqualified (a bare PREV(...)), whatever we choose also has to round-trip cleanly. I'm not sure how best to reconcile those. My rough leaning is to not add catalog functions for these at all: leave resolution outside DEFINE exactly as it is today, and only inside DEFINE adjust the function-resolution path itself to recognize the navigation operations. But that is still quite abstract. Question: how would you approach this? == Note: a third bug, to be handled together with item 2 == A navigation operation nested inside another nav's offset argument -- e.g. PREV(price, NEXT(2::bigint, 0)) in a DEFINE clause -- slips past the parser but trips Assert(!IsA(nav->offset_arg, RPRNavExpr)) in the planner. So it aborts at plan time on an assert build; without asserts, the backward form PREV(price, PREV(2::bigint, 5)) reaches a runtime "cannot fetch row ... before mark position". The fix is to reject a nav inside an offset argument in the DEFINE walk. But since item 2 may reshape that walker substantially from how it works today, I'll do it together with item 2 and add it as a regression test there. Thanks, Henson --000000000000d2a7a90653e01909 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Tatsuo, Jian,

While doing a self-review pass ove= r the incremental fixes on top of v47 I
ran into two issues where I'= d rather agree on an approach with you before
I pick one.=C2=A0 One of t= hem is a regression I introduced myself in the DEFINE
memory-leak fix; t= he other is an original design point from v47.=C2=A0 There is
also a thi= rd bug which I plan to handle together with the second one, since
it can= be affected by that change -- I describe it at the end.

I have veri= fied both of the issues below on an assert-enabled build (and a
non-asse= rt build where relevant).


=3D=3D 1. DEFINE evaluation reuses the= per-output-tuple context (use-after-free) =3D=3D

nocfbot-0039 (the = DEFINE memory-leak fix) added a ResetExprContext() in
update_reduced_fra= me, but it resets the wrong context.

ps_ExprContext is the per-outpu= t-tuple context that ExecWindowAgg resets
once per output row. =C2=A0upd= ate_reduced_frame now resets it once per NFA row,
while the output row i= s still being formed -- so a pass-by-ref window
function result already = datum-copied into that per-tuple memory (when
numfuncs > 1) is freed = before ExecProject reads it.

Minimal trigger -- a pass-by-ref window= function plus a second one over an
RPR window:

=C2=A0 SELECT lag= (company) OVER w, count(*) OVER w FROM stock
=C2=A0 =C2=A0WINDOW w AS (P= ARTITION BY company ORDER BY tdate
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 AFTER MATCH SKIP PA= ST LAST ROW INITIAL
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 PATTERN (START UP+)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 DEFINE START AS TRUE, UP AS price > PREV(price));

On a CL= OBBER_FREED_MEMORY build the lag column comes out as 0x7F garbage;
in pr= oduction it is garbage or a crash. =C2=A0(An aggregate is not required --lag + first_value hits the same reset via the frame-access path.)

= Neither v47 nor the patch is the answer on its own: v47 had no reset here,<= br>so no use-after-free, but the DEFINE scratch accumulated over the whole<= br>forward scan (the leak nocfbot-0039 fixed); nocfbot-0039 added the per-r= ow
reset but on the shared per-output-tuple context.=C2=A0 We do want a = per-row
reset -- just not on that context.

So I think this needs = a dedicated ExprContext for DEFINE evaluation, reset
once per NFA row: i= t keeps the memory bounded without touching the
per-output-tuple results= .

Question: does a dedicated DEFINE ExprContext look right to you?

=3D=3D 2. PREV/NEXT/FIRST/LAST placeholders collide with user fun= ctions =3D=3D

The nav operations are polymorphic pg_catalog function= s (anyelement, OIDs
8126-8133) recognized by funcid in parse_func.c, whi= ch collides with
same-name user functions.

Outside DEFINE, a same= -name function masks or clashes with the placeholder:
with public.last(a= nyelement), SELECT last(123) fails "cannot use last
outside a DEFIN= E clause"; with public.next(numeric), SELECT next(10) fails
"f= unction next(integer) is not unique"; and even with no user function,<= br>last(123) errors instead of "function last(integer) does not exist&= quot;.

Inside DEFINE, a same-name function with an exact-type match = beats the
anyelement placeholder, so PREV(price) silently becomes a plai= n FuncExpr
instead of an RPRNavExpr -- a wrong match result with no erro= r (reproduced
for numeric, text and int).=C2=A0 And ruleutils deparses a= bare PREV(, so
reparsing a view under a search_path with public.prev re= binds it (pg_dump
is safe via search_path =3D '').

This i= s original v47 design, not a regression.=C2=A0 Per the standard,
PREV/NE= XT/FIRST/LAST are navigation operations with dedicated syntax, not
gener= al-namespace functions -- the collision comes from mapping them onto
cat= alog functions plus search-path resolution.

I haven't found a cl= ean approach yet.=C2=A0 Inside DEFINE these names have to be
the navigat= ion operation (per the standard), yet outside DEFINE they
shouldn't = shadow or break same-name user functions the way the catalog
placeholder= s do -- and since the deparse output is unqualified (a bare
PREV(...)), = whatever we choose also has to round-trip cleanly.=C2=A0 I'm not
sur= e how best to reconcile those.

My rough leaning is to not add catalo= g functions for these at all: leave
resolution outside DEFINE exactly as= it is today, and only inside DEFINE
adjust the function-resolution path= itself to recognize the navigation
operations.=C2=A0 But that is still = quite abstract.

Question: how would you approach this?


= =3D=3D Note: a third bug, to be handled together with item 2 =3D=3D

= A navigation operation nested inside another nav's offset argument -- e= .g.
PREV(price, NEXT(2::bigint, 0)) in a DEFINE clause -- slips past the= parser
but trips Assert(!IsA(nav->offset_arg, RPRNavExpr)) in the pl= anner.=C2=A0 So it
aborts at plan time on an assert build; without asser= ts, the backward form
PREV(price, PREV(2::bigint, 5)) reaches a runtime = "cannot fetch row ...
before mark position".

The fix is= to reject a nav inside an offset argument in the DEFINE walk.
But since= item 2 may reshape that walker substantially from how it works
today, I= 'll do it together with item 2 and add it as a regression test there.
Thanks,
Henson
--000000000000d2a7a90653e01909--