Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wB7b3-000l0L-0F for pgsql-hackers@arkaria.postgresql.org; Fri, 10 Apr 2026 08:52:13 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wB7b1-00Amed-0H for pgsql-hackers@arkaria.postgresql.org; Fri, 10 Apr 2026 08:52:11 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wB7b0-00AmeU-2A for pgsql-hackers@lists.postgresql.org; Fri, 10 Apr 2026 08:52:11 +0000 Received: from mail-pj1-x1034.google.com ([2607:f8b0:4864:20::1034]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wB7az-00000000IuT-1J2F for pgsql-hackers@postgresql.org; Fri, 10 Apr 2026 08:52:10 +0000 Received: by mail-pj1-x1034.google.com with SMTP id 98e67ed59e1d1-35c2fe0d90fso1329894a91.1 for ; Fri, 10 Apr 2026 01:52:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1775811129; cv=none; d=google.com; s=arc-20240605; b=XAcACokR/UNAxqx4E+3+SepDcI9zz9naDLH7emVTpiNaPLeqxvGTVHHZUHfcpJtsfc QUFaZMJS22KSChagw7aaFE4hKnbudiEA4rMsaJ7uZyQdc4hCSkbX7+U1Klqi9zeUcZZ8 zUKwXTYdubU8aCW1Z1nAISUauji3vdNnS11ecg1irRoDOwp1pglYHCP243t43/JO7dda gTbPpcbPRQicdOONG2Jco+2kenO6jeKok+cbmt9GF2D/BmAHN7vvgY0wZQb4ruAjIM3+ 49OVyVJ485HaKvVoKXJBaonYDZVVfOHZ54xDuRgsxm5sa3G270Ml3zVOW3w+PljLqKAk a5JA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:dkim-signature; bh=BDV0OSq8hp9fbqbPIJGYiFLi5RfLbZbjiVugffPkQq4=; fh=L00JSdjRVpcqnQyulClU5IFY8cFWeYGes4uL7LvSRic=; b=OHJr0eBiP1jSdP1QzG3wfeTXkQDsBki6xEBXNpeXJYfuR1OaMHeBDa/R3Hk5j/6F4W eRPD5140QI8m02arQHvVgW2/vf5ubMlzvjtrF8pPtWAZcVwB1R2/5Cx6gqN5o7earhrl fU9Dl6jFDSR8v5TbRODPSQPQ77A9Z7y34Zkb/RRvYhsX9wW+8kNl5yrqMxWTg9FNwGy9 qbd+T5bbbOhTWZoUOCuDtp7smgVG1aUOzHseRaHX5ILbiy8+PUwAZdJ6uZBoYeAn+jSq rn2W2i7tQGEvefCcaqTZbHINn8AW3YOnmFcmIsEcOM2FL0Vp8A6y4UOcNaHTzaSPpO5H TrUw==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1775811129; x=1776415929; darn=postgresql.org; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=BDV0OSq8hp9fbqbPIJGYiFLi5RfLbZbjiVugffPkQq4=; b=BYKJksE0hR3WV0nPsQECrkOJ1MhQ70ABCbGpkcmtwMJwGSoO+Bs+oJUtNwBSyAKBva Q5xVSkFIbJxEQoeY9n0pXtQ17djtcduzvUJiJUGVxPeOwIT8WsB2edwGMG6qa5KlHvHy FyEIJpdoccVARvQK0xeBXelH3HkzXpBRidCs8Kmq/yTw9UzJvUh/JnkGkgoE5kYtgMyb wA0zUZg/EYLeggwDYt3vwN2LNLn/649bfKi6P23qNO9ARqB3E/hHNtdZsRDVHKSLkUdY +VZl19wfiSJFZHMG5cSh8PuRlKf/aAjSkGX6P3M7WVpcxVukjaR8b4ryhDeYGDYcPW75 Au7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1775811129; x=1776415929; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=BDV0OSq8hp9fbqbPIJGYiFLi5RfLbZbjiVugffPkQq4=; b=i2n/Ibr42XZs+3QgzVv8x/nRiFCkYKoKeqBF67XBMIH3XayfSn1BRQw0xa1jNWGi9f kpqUEEL7lfTxllDNgCHu9C91zQ/88GiicIUUuzUQsgIpLBGYSI4ZFPAB8O+ZsnpaSByX ZGMjyusgup6l3LVIMq2Hm0saH56QnsImaWPqc50jpeag5esD0fJDcT15/f8JQSYYwg4I BG1Dx1Hxydntc2wf7vUn5I8vBO23X1YD6E9x4zPFxRhobnbXI9a9V5y5uk4ZE607PV6t 3B0gWCDde27XsOTHo2GfvQShqRl0RGTDKH2l36R2fB9R0WHzRjz26tlxvKJLT/Vl44s5 RqdA== X-Forwarded-Encrypted: i=1; AJvYcCXlaOhfJ+OHIWjmUaUL8NJ2Ix83WBZncX6d+aNBFyN9q3XgHxWKfm5LAqI71Xblmc9sOusFyVZFaH9s9M9+@postgresql.org X-Gm-Message-State: AOJu0YzNiH+0jBBZTrSKLGPb0FUwzzIQdWoTO5wZHeGpI3P2D0oRjRPN 87NNh0ew6wehh5Lxteg7BmIYS2GoLn0BLXZn+HLlpxhaUW+Ufw1zW2UsglppRxCXVDWknNlNnva FFeR6u/0AvBTwuqGgX9RFUWF4Opj+ypI= X-Gm-Gg: AeBDieuzgbSOeOvjDf9Oji2TOub0kEN+YghXNhBsOVKMRKoAzOo8C5HIy0/bjPODCbc toAPDfVO0uvQPZfScWsuLih/H0uJvP3IUJ1rRxuwIMdJ9eUQuISmCMH8ACWTzPrR63kt16pHyvy 67oHgiDNi+CvhQXh7LGJnxAxKxfPoozKbFduEWR8uK2sxSzqeI6SzHA3I3cJAUaYROVV7TpETS4 AOxbjK1Eqmgp8XJ4+C8X5G5e1IYTqgeQmhDjyLNe9vcrhVvkRy29XgF+cp7C047S3T/IBl79wy8 JzlB9KjUuhjFvTpAIHKT49CXxAIrND+ZbdpFWmHb X-Received: by 2002:a17:90b:2fcb:b0:35b:a3be:f1b6 with SMTP id 98e67ed59e1d1-35e4281c521mr2585734a91.16.1775811128647; Fri, 10 Apr 2026 01:52:08 -0700 (PDT) MIME-Version: 1.0 References: <20260330.133428.1197197778850463943.ishii@postgresql.org> <20260410.101847.595894887122872457.ishii@postgresql.org> In-Reply-To: <20260410.101847.595894887122872457.ishii@postgresql.org> Reply-To: assam258@gmail.com From: Henson Choi Date: Fri, 10 Apr 2026 17:51:54 +0900 X-Gm-Features: AQROBzAfwkkBMidjSA880-d1VI12hqoNzRIMyEtCCiH9Pe1OgbwKonh15d8ac64 Message-ID: Subject: Re: Row pattern recognition To: Tatsuo Ishii Cc: zsolt.parragi@percona.com, sjjang112233@gmail.com, vik@postgresfriends.org, er@xs4all.nl, jacob.champion@enterprisedb.com, david.g.johnston@gmail.com, peter@eisentraut.org, pgsql-hackers@postgresql.org Content-Type: multipart/alternative; boundary="00000000000082e52c064f173f17" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000082e52c064f173f17 Content-Type: text/plain; charset="UTF-8" Hi Tatsuo, Thank you for the careful review. Before this, there's a check in the function: > > if (tle->ressortgroupref || tle->resjunk) > continue; > > Below is the query in the regression test that is suppoed to trigger > the error. In this case column "i" in the target list should have > resjunk flag to be set to true. So I thought the if statenment above > becomes true and the column i is not removed from subquery's target > list. Am I missing something? That is a very good observation. For the test query you cited, the resjunk check does indeed protect column "i", as you noted. The allpaths.c guard is needed for a different scenario: when a Var already exists in the inner SELECT as a non-resjunk entry, but the outer query does not reference it. For example: SELECT count(*) FROM ( SELECT i, count(*) OVER w FROM generate_series(1,10) i -- ^ in SELECT -> resjunk=false WINDOW w AS ( ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING PATTERN (A+) DEFINE A AS i > PREV(i) ) ) t; -- outer uses only count(*) -> i is not in attrs_used In this case, the Var addition logic in parse_rpr.c (pull_var_clause + dup check) finds that "i" already exists in the targetlist, so it does not add a new resjunk entry. Since "i" is resjunk=false and the outer query does not reference it, remove_unused_subquery_outputs() would replace it with NULL, breaking DEFINE evaluation. Changing the existing entry to resjunk=true is not an option either, since it would alter the subquery's output schema -- the outer query may reference that column, and at parse time we do not yet know whether it will. The existing protection mechanisms (resjunk, ressortgroupref, attrs_used, volatile check) cannot express "referenced by DEFINE", since DEFINE is a new column reference path that did not exist in PostgreSQL before. The allpaths.c guard follows the same pattern as the existing SRF and volatile guards in that function. Also I think removal of the WindowAgg node is fine here because it's > not necessary to calculate the result of the sub query at > all. Actually the query above runs fine on v46. I guess some of the > incremental patches caused this to stop working. Can you elaborate? > I agree -- for this specific query, removing the WindowAgg does not change the count(*) result, since window functions do not change the number of rows. Regarding v46: the two bugs were actually latent from the initial implementation, not caused by incremental patches. The old code added the entire DEFINE expression (e.g., "i > PREV(i)") to the query targetlist via findTargetlistEntrySQL99(). In most cases, DEFINE conditions involve comparison operators, so the resulting OpExpr did not match existing Var entries in the targetlist, and a new resjunk entry was created. This accidentally prevented removal. (In fact, a bare boolean DEFINE such as "DEFINE A AS flag" would have exhibited the same removal bug even in v46, since findTargetlistEntrySQL99 would reuse the existing entry instead of creating a new resjunk one -- confirming that the issue was latent in the original implementation, not introduced by recent patches.) However, that same approach caused the SIGSEGV (Bug 1): when RPR and non-RPR windows coexist, the non-RPR WindowAgg inherits targetlist entries containing RPRNavExpr nodes that it cannot evaluate. The two bugs are two sides of the same coin -- the old approach prevented removal but caused SIGSEGV. Switching to Var-only addition fixed the SIGSEGV but required the allpaths.c guard to prevent removal. As for more precise optimization (allowing removal when all WindowFuncs for an RPR clause are unused), it would require structural changes to the loop in remove_unused_subquery_outputs(), which currently evaluates entries individually. Since this function runs for all subqueries, not just RPR, such a change would need broader testing. The patch addresses two distinct issues: the parse_rpr.c change (Var-only addition) prevents RPRNavExpr from propagating to non-RPR WindowAgg nodes, while the allpaths.c guard prevents incorrect column removal. Since both are needed, the guard cannot simply be reverted. Would it make sense to keep it as-is for now, and explore the more precise optimization as a separate follow-up patch? Best regards, Henson --00000000000082e52c064f173f17 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Tatsuo,

Thank = you for the careful review.

Before this, there's a check in the function:

=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (tle->ressort= groupref || tle->resjunk)
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 continue;

Below is the query in the regression test that is suppoed to trigger
the error.=C2=A0 In this case column "i" in the target list shoul= d have
resjunk flag to be set to true. So I thought the if statenment above
becomes true and the column i is not removed from subquery's target
list. Am I missing something?=C2=A0

That is= a very good observation. For the test query you cited, the
resjunk chec= k does indeed protect column "i", as you noted.

The allpat= hs.c guard is needed for a different scenario: when a Var
already exists= in the inner SELECT as a non-resjunk entry, but the
outer query does no= t reference it. For example:

=C2=A0 SELECT count(*) FROM (
=C2=A0= =C2=A0 SELECT i, count(*) OVER w FROM generate_series(1,10) i
=C2=A0 = =C2=A0 -- =C2=A0 =C2=A0 ^ in SELECT -> resjunk=3Dfalse
=C2=A0 =C2=A0 = WINDOW w AS (
=C2=A0 =C2=A0 =C2=A0 ROWS BETWEEN CURRENT ROW AND UNBOUNDE= D FOLLOWING
=C2=A0 =C2=A0 =C2=A0 PATTERN (A+)
=C2=A0 =C2=A0 =C2=A0 DE= FINE A AS i > PREV(i)
=C2=A0 =C2=A0 )
=C2=A0 ) t;
=C2=A0 -- out= er uses only count(*) -> i is not in attrs_used

In this case, the= Var addition logic in parse_rpr.c (pull_var_clause
+ dup check) finds t= hat "i" already exists in the targetlist, so it
does not add a= new resjunk entry. Since "i" is resjunk=3Dfalse and the
outer= query does not reference it, remove_unused_subquery_outputs()
would rep= lace it with NULL, breaking DEFINE evaluation.

Changing the existing= entry to resjunk=3Dtrue is not an option either,
since it would alter t= he subquery's output schema -- the outer query
may reference that co= lumn, and at parse time we do not yet know
whether it will. The existing= protection mechanisms (resjunk,
ressortgroupref, attrs_used, volatile c= heck) cannot express
"referenced by DEFINE", since DEFINE is a= new column reference path
that did not exist in PostgreSQL before. The = allpaths.c guard
follows the same pattern as the existing SRF and volati= le guards in
that function.

Also I think removal of the WindowAgg node is fine here because it's not necessary to calculate the result of the sub query at
all. Actually the query above runs fine on v46. I guess some of the
incremental patches caused this to stop working. Can you elaborate?

I agree -- for this specific query, removing the Wi= ndowAgg does not
change the count(*) result, since window functions do n= ot change the
number of rows.

Regarding v46: the two bugs were ac= tually latent from the initial
implementation, not caused by incremental= patches. The old code
added the entire DEFINE expression (e.g., "i= > PREV(i)") to the
query targetlist via findTargetlistEntrySQL9= 9(). In most cases,
DEFINE conditions involve comparison operators, so t= he resulting
OpExpr did not match existing Var entries in the targetlist= , and a
new resjunk entry was created. This accidentally prevented remov= al.
(In fact, a bare boolean DEFINE such as "DEFINE A AS flag"= would
have exhibited the same removal bug even in v46, since
findTar= getlistEntrySQL99 would reuse the existing entry instead
of creating a n= ew resjunk one -- confirming that the issue was
latent in the original i= mplementation, not introduced by recent
patches.)

However, that s= ame approach caused the SIGSEGV (Bug 1): when RPR
and non-RPR windows co= exist, the non-RPR WindowAgg inherits
targetlist entries containing RPRN= avExpr nodes that it cannot
evaluate.

The two bugs are two sides = of the same coin -- the old approach
prevented removal but caused SIGSEG= V. Switching to Var-only
addition fixed the SIGSEGV but required the all= paths.c guard to
prevent removal.

As for more precise optimizatio= n (allowing removal when all
WindowFuncs for an RPR clause are unused), = it would require
structural changes to the loop in remove_unused_subquer= y_outputs(),
which currently evaluates entries individually. Since this = function
runs for all subqueries, not just RPR, such a change would need=
broader testing.

The patch addresses two distinct issues: the pa= rse_rpr.c change
(Var-only addition) prevents RPRNavExpr from propagatin= g to
non-RPR WindowAgg nodes, while the allpaths.c guard prevents
inc= orrect column removal. Since both are needed, the guard
cannot simply be= reverted. Would it make sense to keep it as-is
for now, and explore the= more precise optimization as a separate
follow-up patch?

Best re= gards,
Henson=C2=A0
--00000000000082e52c064f173f17--