public inbox for [email protected]  
help / color / mirror / Atom feed
From: Henson Choi <[email protected]>
To: jian he <[email protected]>
To: Tatsuo Ishii <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Subject: Re: Row pattern recognition
Date: Tue, 16 Jun 2026 14:32:58 +0900
Message-ID: <CAAAe_zCQaV_im5WNbU1D1HAQDUk3yYK3nSduZZ8Qu_aK-aRA9A@mail.gmail.com> (raw)
In-Reply-To: <CACJufxFAQhbOD9EVCTAy-VwDbG4446N10GsxCcgdpFnjHO1Efw@mail.gmail.com>
References: <CAAAe_zAZDuHSiVGvz9c6h=Pe=aN+FKZOrdNPfbTOk3XV+WFKYQ@mail.gmail.com>
	<CAAAe_zDz3z2Paidk3jHOm9S3eMVLoXRxK0Lyo=5i_9-EfSH7fA@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<CACJufxFnwdQSApt2vWwYCd0gtf+JjFDxT2hbxHi=+dhFJc+-1g@mail.gmail.com>
	<CAAAe_zATnkqsbLYDj8MJV1TriX9Wi0wShDg3nK3qYpiupKwhFA@mail.gmail.com>
	<CAAAe_zBL+J0AYmvmcJQT7Q-gp5aRH0deJ7SE7-N21g4hWExyJw@mail.gmail.com>
	<CACJufxHJFNBJ0vHCCLestWV5b7DF5e4VzfmovqGKBGgqg+rcGA@mail.gmail.com>
	<CAAAe_zBY0rrgf+tKXMUc-Y3nDDD69hddRBKopEKAZobhY=Cy-Q@mail.gmail.com>
	<CAAAe_zDYxq0d3exCDwvKncD0kaL2uehDir6HXo4r5DXMitKrSg@mail.gmail.com>
	<CACJufxG57=ddtbN=5RZCzhxWDYXvocKmB7NtZy+DoqZuhxb_DA@mail.gmail.com>
	<CAAAe_zB4TMHnpaOrwYp7dKs553q2474ZXRytGfYOfYp4DdrgiQ@mail.gmail.com>
	<CACJufxFAQhbOD9EVCTAy-VwDbG4446N10GsxCcgdpFnjHO1Efw@mail.gmail.com>

Hi Tatsuo, Jian,

I think there's a correctness problem in the RPR patch: a window function's
result can change depending on which other, unrelated window functions are
in the same query.

Pattern matching only advances when a window function reads the frame.
nth_value(x, n) returns NULL without reading the frame when n is NULL
(correct per the standard), so if it is the only window function in an RPR
window, the match never advances over those rows and the reduced frame no
longer matches a full scan.

Example -- one partition, 60 rows, price = id * 10:

  CREATE TABLE rpr_dormant (id int, price int);
  INSERT INTO rpr_dormant SELECT g, g*10 FROM generate_series(1,60) g;

  SELECT id, nth_value(price, CASE WHEN id < 50 THEN NULL ELSE 1 END) OVER w
  FROM rpr_dormant
  WINDOW w AS (
    ORDER BY id
    ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING
    AFTER MATCH SKIP PAST LAST ROW
    PATTERN (A+)
    DEFINE A AS price > PREV(FIRST(price), 50)
  );

Run alone, the nth_value column does not follow the actual match structure;
adding an unrelated first_value(id) OVER w, which reads the frame every row,
changes it.  And while the match is dormant the mark position keeps
advancing, running ahead and trimming rows that the backward navigation
later needs -- so the same query can instead fail with "cannot fetch row N
before WindowObject's mark position".

I think an RPR window should perform the match for each row up front,
building its reduced frame during the row scan before the window functions
are evaluated, regardless of whether any function reads the frame.  The fix
belongs in the executor, not in nth_value -- the early return is standard,
and the same gap is reachable from any user-defined function that skips the
frame.

Does this direction seem right, or is the lazy, frame-driven matching
intentional in a way I'm missing?  Happy to prepare a patch.

Best regards,
Henson


view thread (129+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Row pattern recognition
  In-Reply-To: <CAAAe_zCQaV_im5WNbU1D1HAQDUk3yYK3nSduZZ8Qu_aK-aRA9A@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox