Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wZMQJ-000ybS-1X for pgsql-hackers@arkaria.postgresql.org; Tue, 16 Jun 2026 05:33:19 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1wZMQG-00FQV9-2q for pgsql-hackers@arkaria.postgresql.org; Tue, 16 Jun 2026 05:33:16 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1wZMQG-00FQUx-1L for pgsql-hackers@lists.postgresql.org; Tue, 16 Jun 2026 05:33:16 +0000 Received: from mail-ej1-x62b.google.com ([2a00:1450:4864:20::62b]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1wZMQE-00000000awz-1wXQ for pgsql-hackers@postgresql.org; Tue, 16 Jun 2026 05:33:15 +0000 Received: by mail-ej1-x62b.google.com with SMTP id a640c23a62f3a-bed19623d6eso525404766b.1 for ; Mon, 15 Jun 2026 22:33:13 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1781587991; cv=none; d=google.com; s=arc-20240605; b=FVMELJBWiP4h+Gm+G/D78X1MrsTYapoYK6H33Rpj2f1Gq7PhENmbAtyjzLGtrmdjOw fIxiQN1EQjdStGy0h+zgdB8fnbiEfOB4aaWekLZx+MsY+HnTOKgqresBzlVzUnXLLupn flJT0tGI2InRVguTOdO7CnhUg4vZNeapObvG7PNrlnQg91awn9wNB92AOPI5h/x5j6il gENFY8i4c1LJzqm8zMMFF3HSlm6cUfdKLWmxjmXT14eb/raAxm2p9PTKLYdUiSfr1M8t 8isgas1s+V7rfB6Bu9pcqnfWrrq8lDFLdae6MCTJXrYJkcUBOqMurLZk3rU8ih1W/T7L jJuQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:dkim-signature; bh=IIIg53QRf9dts9fCHk3crfXD91K78FH2j+XcM71vaPU=; fh=ssMBIo9fZsd/XMmduRegxsyvy59O1mGgaEG8Q2qQocU=; b=IqKdeTmxKpNrXIxjDh2HW3DqtN3cR8w2VA//xVX4i7BtK6BUnoGda8xI2Z+jICiOOv Qykp+FR5GU0XpcHMPksubey/jR+7ko7FyZ06o24HoxjCVmpJMt57tRkZI6LAxAMOp8D6 Cqy5+j+T25J+qSFPKSeqwvCLNd4vaFOLdKCauxPSsNzlgohaoH5CDv//PeHWsqYHhvDt PWk+ciVUfzjxYU0aLT9bk6h+UcSxItIooHlGJE3pvMNRe0jYdd5PbQdVFg4eh/eWvyP0 ZCuGBm6TZ8cByjtndOeG29Ar8AQV3E1bwUUqjkF2gV1kUNetpSKNj2Fwk8+zFf//HM6W P1tw==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20251104; t=1781587991; x=1782192791; darn=postgresql.org; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=IIIg53QRf9dts9fCHk3crfXD91K78FH2j+XcM71vaPU=; b=k1+T5bwzFZVHBoZLmiSnEI6RnyvWjVSUUpOYE6CiYnTImuONq3F+8QX13pttf99u2M 3NQ355HtQ+hrG85tmaN9VXd2OKR/Hl4S8E0CfvdMk2hHkGcGlZFvMKIno62drcqAVaDh Hk5NhAN3NLWS0DN9E26Fyu67leIxNDxbzplx/3S6SC6JAjksEU6B9Nwd6w+0UKbiybr4 ATjsZy5ZJOI2ITpxL3J/UFppaGVRepL5B1HR4zG1zLsjzdzRtX+i1wq/Xcp0IzbVyZ/y S1SN36Ddb2IqgBYNgcT4VbhlnCpl+qq+ETzayTSND3mpe4FZ6buhm9rpTBOeJmwoBskz 94Aw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1781587991; x=1782192791; h=cc:to:subject:message-id:date:from:reply-to:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=IIIg53QRf9dts9fCHk3crfXD91K78FH2j+XcM71vaPU=; b=T0hKjpt+WCsXwIJq8ShVTSR2XhotC40aKSeYlf0dm8OZ7rKugqRFsuUNHkKwakaO5l ctmGQscxqxC9OzLq4f+kJwDpnB2VW96wzs/jYb9ffR/+ZoSFfakjRs0GDGtjh5kBbVzS JgpEgzgSxfuSpJ0J8zQZea/Ado1OaM55BT3CMwgeCfHiWOYq0kt17Q8ald08SjQvNeuC VceGvTbub9AJrRtO3uwtaf3KnR0G+O12mLjjSWw5IKy3fUbuIDz37RSadl8dJnvaiEkN ILw7gtQAKNzRa+aCOwWew8CHv1EY+5tkfpvGujvKIIeP/w56MqVKWbfGfhv/Zk8oerOV 5JMQ== X-Forwarded-Encrypted: i=1; AFNElJ/W53t/5OFJ0DgVMHdhbsnrxHao4sL+ipIxR6GVHpQ/4ZutFfm53ZMN6cnPshefmRgrurOikdIczPHlvpiP@postgresql.org X-Gm-Message-State: AOJu0Yxhq0+Q6317zIEC7Sx6vb/HOBM02I35OjNscR8wOUbpwoDxx3dN x0SqNX4xRMD/GI1AHvTw6ulaqQR4Xcp8fos7flQzzulUl9MnwcwKlnFmQ8u9sT2o5wQRibnoxX3 4wXCj34rJCF6UgjqEPnZXhOkhRT/MFns= X-Gm-Gg: Acq92OF8KTener1mJeoymLN4FnlxoY8tRJasxLghG9UA2FSAQftpTs3gUXk/S3tMjRt tziWYj6K47xukZxq3oisnZDCs74uuwX0AM1l8nV21JEiqp90fpEC+AQJrRAU/ZdRfkISKrYF45I KnImfbRAiWz/yzpDOrcEQjUuIAq12ri+RzopNKrw6Zwow38jVyFKQhqnahl70mP3xSmqx6GoZvI AsftnchxM0E4HXzWIIr7G0yMIntTAxCRgwA35CJOqOZBDCNXUuKwyKqZ2KncP1Rs/egRafj9yA1 6N+VggZg3PoI19/bJWRJ9veRLQuSff55I1xZrvgShyXZ6VZh0iwd/sySRSY9v2qcFV6slqTVxCL Eq40dSOiWo2zlKRfMxECs X-Received: by 2002:a17:907:3fa7:b0:bd3:cb24:ae61 with SMTP id a640c23a62f3a-bfe27830142mr813286166b.6.1781587990501; Mon, 15 Jun 2026 22:33:10 -0700 (PDT) MIME-Version: 1.0 References: <20260604.132108.405136284364833955.ishii@postgresql.org> <20260609.171307.1883356507067957349.ishii@postgresql.org> In-Reply-To: Reply-To: assam258@gmail.com From: Henson Choi Date: Tue, 16 Jun 2026 14:32:58 +0900 X-Gm-Features: AVVi8CfmOzrPPh2bJzA74a1mB85pNkWzhVbGWI5k1WqYuHmLO0TTsMo2jqCLps0 Message-ID: Subject: Re: Row pattern recognition To: jian he , Tatsuo Ishii Cc: zsolt.parragi@percona.com, sjjang112233@gmail.com, vik@postgresfriends.org, er@xs4all.nl, jacob.champion@enterprisedb.com, david.g.johnston@gmail.com, peter@eisentraut.org, li.evan.chao@gmail.com, pgsql-hackers@postgresql.org Content-Type: multipart/alternative; boundary="0000000000004f5c6706545847b6" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000004f5c6706545847b6 Content-Type: text/plain; charset="UTF-8" Hi Tatsuo, Jian, I think there's a correctness problem in the RPR patch: a window function's result can change depending on which other, unrelated window functions are in the same query. Pattern matching only advances when a window function reads the frame. nth_value(x, n) returns NULL without reading the frame when n is NULL (correct per the standard), so if it is the only window function in an RPR window, the match never advances over those rows and the reduced frame no longer matches a full scan. Example -- one partition, 60 rows, price = id * 10: CREATE TABLE rpr_dormant (id int, price int); INSERT INTO rpr_dormant SELECT g, g*10 FROM generate_series(1,60) g; SELECT id, nth_value(price, CASE WHEN id < 50 THEN NULL ELSE 1 END) OVER w FROM rpr_dormant WINDOW w AS ( ORDER BY id ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING AFTER MATCH SKIP PAST LAST ROW PATTERN (A+) DEFINE A AS price > PREV(FIRST(price), 50) ); Run alone, the nth_value column does not follow the actual match structure; adding an unrelated first_value(id) OVER w, which reads the frame every row, changes it. And while the match is dormant the mark position keeps advancing, running ahead and trimming rows that the backward navigation later needs -- so the same query can instead fail with "cannot fetch row N before WindowObject's mark position". I think an RPR window should perform the match for each row up front, building its reduced frame during the row scan before the window functions are evaluated, regardless of whether any function reads the frame. The fix belongs in the executor, not in nth_value -- the early return is standard, and the same gap is reachable from any user-defined function that skips the frame. Does this direction seem right, or is the lazy, frame-driven matching intentional in a way I'm missing? Happy to prepare a patch. Best regards, Henson --0000000000004f5c6706545847b6 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi Tatsuo, Jian,

I think there's a correctness = problem in the RPR patch: a window function's
result can change depe= nding on which other, unrelated window functions are
in the same query.<= br>
Pattern matching only advances when a window function reads the fram= e.
nth_value(x, n) returns NULL without reading the frame when n is NULL=
(correct per the standard), so if it is the only window function in an = RPR
window, the match never advances over those rows and the reduced fra= me no
longer matches a full scan.

Exampl= e -- one partition, 60 rows, price =3D id * 10:

=C2=A0 CREATE TABLE = rpr_dormant (id int, price int);
=C2=A0 INSERT INTO rpr_dormant SELECT g= , g*10 FROM generate_series(1,60) g;

=C2=A0 SELECT id, nth_value(pri= ce, CASE WHEN id < 50 THEN NULL ELSE 1 END) OVER w
=C2=A0 FROM rpr_do= rmant
=C2=A0 WINDOW w AS (
=C2=A0 =C2=A0 ORDER BY id
=C2=A0 =C2=A0= ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
=C2=A0 =C2=A0 AFTER MA= TCH SKIP PAST LAST ROW
=C2=A0 =C2=A0 PATTERN (A+)
=C2=A0 =C2=A0 DEFIN= E A AS price > PREV(FIRST(price), 50)
=C2=A0 );


Run alo= ne, the nth_value column does not follow the actual match structure;
add= ing an unrelated first_value(id) OVER w, which reads the frame every row,changes it.=C2=A0 And while the match is dormant the mark position keeps<= br>advancing, running ahead and trimming rows that the backward navigation<= br>later needs -- so the same query can instead fail with "cannot fetc= h row N
before WindowObject's mark position".

I think an= RPR window should perform the match for each row up front,
building its= reduced frame during the row scan before the window functions
are evalu= ated, regardless of whether any function reads the frame.=C2=A0 The fix
= belongs in the executor, not in nth_value -- the early return is standard,<= br>and the same gap is reachable from any user-defined function that skips = the
frame.

Does this direction seem right, or is the lazy, frame-= driven matching
intentional in a way I'm missing?=C2=A0 Happy to pre= pare a patch.

Best regards,
Henson
--0000000000004f5c6706545847b6--