Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w2MWu-000Dsh-0L for pgsql-hackers@arkaria.postgresql.org; Tue, 17 Mar 2026 04:59:44 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w2MWr-00FsA5-2t for pgsql-hackers@arkaria.postgresql.org; Tue, 17 Mar 2026 04:59:41 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w2MWr-00Fs9w-1B for pgsql-hackers@lists.postgresql.org; Tue, 17 Mar 2026 04:59:41 +0000 Received: from mail-pj1-x102a.google.com ([2607:f8b0:4864:20::102a]) by makus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.98.2) (envelope-from ) id 1w2MWm-000000008DK-1AVd for pgsql-hackers@postgresql.org; Tue, 17 Mar 2026 04:59:40 +0000 Received: by mail-pj1-x102a.google.com with SMTP id 98e67ed59e1d1-358d80f60ccso3262852a91.3 for ; Mon, 16 Mar 2026 21:59:37 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1773723577; cv=none; d=google.com; s=arc-20240605; b=S2rxB9oHHiR1imc7t1RvxPL7pS4tXydSMx5ZSk1gaEc3XVcdbyPaZgSkRwDZtrjL5J 4r4tQKjeIaQZM+rOR02vHFy1Qaira8mqu8YdZptWVH7MYXYRRg4nNqHWj8HX4idddNnx h+JPCoQk8tSC4ywxFq3xfeWGlHaPdo9an31f4JmDVIaHqAutHokA58tMETkaItkCiD3B 3ptSYR3mk55zOqM3WsK3byWehX84+ASDCLbN10YZ6cFnx4kVOADkM4UzV/yTBNczUGUX JvHNQbd6EAtT2i1IpFpgOkKoCE8EcVadiX6xT8yQ4jyGjXcvVY+h9m/uNomw0p3BB0gA 3yOA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20240605; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:dkim-signature; bh=sj8oExj1JKlsFGRMAZ5Kit7wIkbFCq8r20UM4JCwpXc=; fh=3eQIEkDyL3O8csPJObeWbE+WdJvhtRLClaNWqEQRQZQ=; b=IEtJxB2DR1iMgkg/ysuovky+yG3rx63ER4usqIk1su/OTr9P/PgHYJjdvzcBelFQPh nyyuW4fJaQ7TxOme+GBvwFd+YISMTNRjOQzW+PTHk0854ac3B4NQ2fuyTE6GKancvOn8 XbypjGh6vMOEGeLMYKPMHN52Oa6D5/21Zch9u7F5UF19lvniHPPGpE+KGc2zQv2Ke1vx iK/Avkc/B4gQucEx9DCMCFBKlkszqxNYeIs8PoTypujrbjmv7yoLP5WN2fi8qfFnZJmT QJQNd7Sij0sKpYxjj/ZU4vxTYGVkGWrUOJAGeWOzuudyITTsHa2JcidO5rcofRX5Jg4T yMCw==; darn=postgresql.org ARC-Authentication-Results: i=1; mx.google.com; arc=none DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1773723577; x=1774328377; darn=postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=sj8oExj1JKlsFGRMAZ5Kit7wIkbFCq8r20UM4JCwpXc=; b=XEQecCzIDTt/mlmvJiReSOfn/gy64q7MRXEr136wD4GP4YfXDen1ySKIh6lICcT6SQ l+1l90dgOXMcGh7CXmK6CDFVDKLWVe4Jcr0lkrfpjYFIf5PKNsKwH1Ze5HrRHNX6n4Oc yjMP7TktSe2tLLvOuNtYUlNm/Ah0NEuxSMQzZRKYikSW6t/wjoTsaQWhqRZeWrenUWTB pVd7EG8/ROfV2E3RjV0kw3CAIKodclda+olXUGJszrz6MBVhdKybxtsWETwUizTmAMk8 a/dtrONqkt9DqiIAZVSvCFiDbeqArzN6isD8XCCIWGjaopT0ctFWW9DBxHu393g72sbC yQQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20251104; t=1773723577; x=1774328377; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-gg:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=sj8oExj1JKlsFGRMAZ5Kit7wIkbFCq8r20UM4JCwpXc=; b=Iq89rBcMwa3kowod/zgPK752ZCA16iaCJSKzo6PUhKeY61mKQxFe1qk7aVWFzjfY0v em348lq72aPlWQAmtu0NVyVvIPzDmCpT3dW0zq+6RbOtDW3DadaxoD9vwMdbVQfkbsva KmH/7LZDIc2EvP9dfJqLnoOcSIom5O93fv66zKnq1k7RR5P3H1S3sub3eGy0lgXmOjEG dAm+TRO1vsg2+m2sJr74pFdm7Ead5/+Mk4ljy3+Lw0HYbH726alpmo+nOXN9NIFhgb0e NfKhVyGyzQ4ThP+KJ0gtN/dOyiRh/eEZBvfHTsjqdK4lNT6ujWiSuBSv1lYF8X5svGRp A/Xw== X-Forwarded-Encrypted: i=1; AJvYcCUXhrWikkp8qFgGfi7WlleOaZrhRSL3ODyYLr+i0nGYzimwy1hu8SBChLyCRSgRN6n40zDfEkuCNGRRmc3p@postgresql.org X-Gm-Message-State: AOJu0YzCtrjlFVwuR7hZ0CmsM2GNc0sJ5AiWlP1fH+h0wKDupYEF/+eG F+/4m4TR5l/JUGxh4/sqwXJ+qhIxbQM6IxgA+YVxWLYN3gj8X57P7LoMbYCXcUJCdaE8vl84/hX 5Sc/YYe67HFt9Z7eHwbMqv8vjShkk5jo= X-Gm-Gg: ATEYQzxvkCAYBwbYNd3alNIhiq4vOvXvFHXBhHn0QgmzCiWlfovfjNi5872qR4Lsta0 rQjWrs1wplLlDKyG4u3T/uBOoOoXQJu+tQsX7zh/Ghgdq01s2BuNm4jRPxF98bVY5ZhX8c/6VfT wai9FUiku6dMPOWJyX6VkJo3UEf3YX6lysqxciFEdWdIjQqpsUWZZnWEa1FMv8vjHeBAZJmaxD+ bPBvKTReXzKUaRXPP3vrJueemfClvGlxnrv1E9HwQA9TusO5FkHAcJv53MtsJ05L7YunsLGa6hS OZR5I6o= X-Received: by 2002:a17:90b:4ace:b0:359:ff8a:ee5a with SMTP id 98e67ed59e1d1-35a21eb6895mr13626572a91.12.1773723576632; Mon, 16 Mar 2026 21:59:36 -0700 (PDT) MIME-Version: 1.0 References: <20260309.142202.1739855502263731478.ishii@postgresql.org> <20260311.191117.172824867943979935.ishii@postgresql.org> In-Reply-To: <20260311.191117.172824867943979935.ishii@postgresql.org> From: SungJun Jang Date: Tue, 17 Mar 2026 13:59:24 +0900 X-Gm-Features: AaiRm52aby2_kKcEG0rSEV4i6RYCQu5NqxFQDxd9nLzGUB21tBV9rl05nK4iNr0 Message-ID: Subject: Re: Row pattern recognition To: Tatsuo Ishii Cc: assam258@gmail.com, vik@postgresfriends.org, er@xs4all.nl, jacob.champion@enterprisedb.com, david.g.johnston@gmail.com, peter@eisentraut.org, pgsql-hackers@postgresql.org Content-Type: multipart/alternative; boundary="000000000000b70974064d3133c4" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000b70974064d3133c4 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hi hackers, Thank you for the excellent report! I am glad to see that PostgreSQL > RPR is much faster than Trino, especially in the match failure cases. > > Is it possible to share the data generation script and the query for > PostgreSQL so that I could locally perform the tests? Here is a self-contained guide to reproduce the RPR ABCD pattern test locally using PostgreSQL, Trino, and Oracle. Requirements: PostgreSQL 19devel build with the RPR patch applied (local) Trino and Oracle can be set up via Docker (see README for details): https://github.com/assam258-5892/docker-databases Start all services: cd ~/docker-databases && docker compose up -d trino-service oracle-servic= e Note: docker compose up does not pull images automatically if they are already cached locally. To fetch the latest images, run docker compose pull first. Note: The Oracle image requires an Oracle account. Register at https://container-registry.oracle.com, accept the license for the database image, then log in before pulling: docker login container-registry.oracle.com Connect to each database shell: Trino: docker compose exec -it trino-service trino Oracle: docker compose exec -it oracle-service sqlplus / as sysdba Step 1: Create the test table and data PostgreSQL (1x scale, 20,000 rows): DROP TABLE IF EXISTS abcd_test; CREATE TABLE abcd_test AS SELECT v, CASE WHEN v % 10000 < 3333 THEN 'A' WHEN v % 10000 >=3D 3333 AND v % 10000 < 6666 THEN 'B' WHEN v % 10000 >=3D 6666 AND v % 10000 < 9999 THEN 'C' WHEN v % 10000 =3D 9999 THEN 'D' END AS cat FROM generate_series(0, 19999) AS v; ANALYZE abcd_test; Trino (1x scale, 20,000 rows): CREATE SCHEMA IF NOT EXISTS memory.test; DROP TABLE IF EXISTS memory.test.abcd_test; CREATE TABLE memory.test.abcd_test AS WITH nums AS ( SELECT a.v * 10000 + b.v AS v FROM UNNEST(sequence(0, 1)) AS a(v) CROSS JOIN UNNEST(sequence(0, 9999)) AS b(v) ) SELECT CAST(v AS INTEGER) AS v, CASE WHEN v % 10000 < 3333 THEN 'A' WHEN v % 10000 >=3D 3333 AND v % 10000 < 6666 THEN 'B' WHEN v % 10000 >=3D 6666 AND v % 10000 < 9999 THEN 'C' WHEN v % 10000 =3D 9999 THEN 'D' END AS cat FROM nums; Note: Trino sequence() is limited to 10,000 elements per call, so a CROSS JOIN is used. For scale Sx, change sequence(0, 1) to sequence(0, S*2-1). Oracle (1x scale, 20,000 rows): DROP TABLE abcd_test PURGE; CREATE TABLE abcd_test AS SELECT v, CASE WHEN MOD(v, 10000) < 3333 THEN 'A' WHEN MOD(v, 10000) >=3D 3333 AND MOD(v, 10000) < 6666 THEN 'B' WHEN MOD(v, 10000) >=3D 6666 AND MOD(v, 10000) < 9999 THEN 'C' WHEN MOD(v, 10000) =3D 9999 THEN 'D' END AS cat FROM (SELECT LEVEL - 1 AS v FROM dual CONNECT BY LEVEL <=3D 20000); EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'ABCD_TEST'); Verify data distribution (all engines): SELECT cat, COUNT(*) AS cnt FROM abcd_test GROUP BY cat ORDER BY cat; Expected (1x): A=3D6666, B=3D6666, C=3D6666, D=3D2 Step 2: Run Test 1 =E2=80=94 A+ B+ C+ D (match expected) Expected: 2 rows returned (one match per segment) PostgreSQL: SELECT match_first, match_last, match_len FROM ( SELECT v, first_value(v) OVER w AS match_first, last_value(v) OVER w AS match_last, count(*) OVER w AS match_len FROM abcd_test WINDOW w AS ( ORDER BY v ROWS BETWEEN CURRENT ROW AND UNBOUNDED FOLLOWING AFTER MATCH SKIP PAST LAST ROW PATTERN (A+ B+ C+ D) DEFINE A AS cat =3D 'A', B AS cat =3D 'B', C AS cat =3D 'C', D AS cat =3D 'D' ) ) result WHERE match_len > 0; Trino / Oracle: SELECT match_first, match_last, match_len FROM abcd_test MATCH_RECOGNIZE ( ORDER BY v MEASURES FIRST(v) AS match_first, LAST(v) AS match_last, COUNT(*) AS match_len ONE ROW PER MATCH AFTER MATCH SKIP PAST LAST ROW PATTERN (A+ B+ C+ D) DEFINE A AS cat =3D 'A', B AS cat =3D 'B', C AS cat =3D 'C', D AS cat =3D 'D' ) mr; Note: Trino uses memory.test.abcd_test as the table name. Expected result (1x): match_first | match_last | match_len ------------|------------|---------- 0 | 9999 | 10000 10000 | 19999 | 10000 Step 3: Run Test 2 =E2=80=94 A+ B+ C+ E (match failure) Expected: 0 rows (E does not exist) Use the same queries as Test 1 with two changes: PATTERN: (A+ B+ C+ D) =E2=86=92 (A+ B+ C+ E) DEFINE: cat =3D 'D' =E2=86=92 cat =3D 'E' Warning: Trino Test 2 at 1x scale takes approximately 5-6 minutes. Step 4: Scale up (optional) Re-create the test table at 2x scale (40,000 rows) and then repeat Step 2 and Step 3. PostgreSQL (2x scale, 40,000 rows): DROP TABLE IF EXISTS abcd_test; CREATE TABLE abcd_test AS SELECT v, CASE WHEN v % 20000 < 6666 THEN 'A' WHEN v % 20000 >=3D 6666 AND v % 20000 < 13332 THEN 'B' WHEN v % 20000 >=3D 13332 AND v % 20000 < 19999 THEN 'C' WHEN v % 20000 =3D 19999 THEN 'D' END AS cat FROM generate_series(0, 39999) AS v; ANALYZE abcd_test; Trino (2x scale, 40,000 rows): CREATE SCHEMA IF NOT EXISTS memory.test; DROP TABLE IF EXISTS memory.test.abcd_test; CREATE TABLE memory.test.abcd_test AS WITH nums AS ( SELECT a.v * 10000 + b.v AS v FROM UNNEST(sequence(0, 3)) AS a(v) CROSS JOIN UNNEST(sequence(0, 9999)) AS b(v) ) SELECT CAST(v AS INTEGER) AS v, CASE WHEN v % 20000 < 6666 THEN 'A' WHEN v % 20000 >=3D 6666 AND v % 20000 < 13332 THEN 'B' WHEN v % 20000 >=3D 13332 AND v % 20000 < 19999 THEN 'C' WHEN v % 20000 =3D 19999 THEN 'D' END AS cat FROM nums; Oracle (2x scale, 40,000 rows): DROP TABLE abcd_test PURGE; CREATE TABLE abcd_test AS SELECT v, CASE WHEN MOD(v, 20000) < 6666 THEN 'A' WHEN MOD(v, 20000) >=3D 6666 AND MOD(v, 20000) < 13332 THEN 'B' WHEN MOD(v, 20000) >=3D 13332 AND MOD(v, 20000) < 19999 THEN 'C' WHEN MOD(v, 20000) =3D 19999 THEN 'D' END AS cat FROM (SELECT LEVEL - 1 AS v FROM dual CONNECT BY LEVEL <=3D 40000); EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, 'ABCD_TEST'); Expected (2x): A=3D13332, B=3D13332, C=3D13332, D=3D4 Then run Step 2 and Step 3 as-is. Warning: Trino Test 2 at 2x scale takes approximately 20-25 minutes. Please let me know if you encounter any issues reproducing this. Best regards SungJun --000000000000b70974064d3133c4 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hi h= ackers,

Thank you for the excellent report! I am glad to see that PostgreSQL
RPR is much faster than Trino, especially in the match failure cases.

Is it possible to share the data generation script and the query for
PostgreSQL so that I could locally perform the tests?

= Here is a self-contained guide to reproduce the RPR ABCD pattern test local= ly
using PostgreSQL, Trino, and Oracle.


Requirements:

= PostgreSQL 19devel build with the RPR patch applied (local)

Trino an= d Oracle can be set up via Docker (see README for details):

=C2=A0 <= a href=3D"https://github.com/assam258-5892/docker-databases">https://github= .com/assam258-5892/docker-databases

Start all services:

= =C2=A0 cd ~/docker-databases && docker compose up -d trino-service = oracle-service

Note: docker compose up does not pull images automati= cally if they are already
cached locally. To fetch the latest images, ru= n docker compose pull first.

Note: The Oracle image requires an Orac= le account. Register at
https://container-registry.oracle.com, accept the license for the dat= abase
image, then log in before pulling:

=C2=A0 docker login conta= iner-registry.oracle.com

Connect to each database shell:<= br>
=C2=A0 Trino: =C2=A0docker compose exec -it= trino-service trino
=C2=A0 Oracle: docker compose exec -it oracle-servi= ce sqlplus / as sysdba



Step 1: Create the test table and = data

PostgreSQL (1x scale, 20,000 rows):

DROP TABLE IF EXISTS abcd_test;
CREATE TABLE abcd_test AS
SELECT= v,
=C2=A0 =C2=A0 =C2=A0 =C2=A0CASE
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0WHEN v % 10000 < 3333 THEN 'A'
=C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0WHEN v % 10000 >=3D 3333 AND v % 10000 < 6666 TH= EN 'B'
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN v % 10000 &= gt;=3D 6666 AND v % 10000 < 9999 THEN 'C'
=C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0WHEN v % 10000 =3D 9999 THEN 'D'
=C2=A0 = =C2=A0 =C2=A0 =C2=A0END AS cat
FROM generate_series(0, 19999) AS v;
<= br>ANALYZE abcd_test;



Trino (1x scale, 20,000 rows):
<= br>CREATE SCHEMA IF NOT EXISTS memory.test;
DROP TABLE IF EXISTS memory.test.abcd_test;
CREATE TABLE memory.test.ab= cd_test AS
WITH nums AS (
=C2=A0 =C2=A0 SELECT a.v * 10000 + b.v AS v=
=C2=A0 =C2=A0 FROM UNNEST(sequence(0, 1)) AS a(v)
=C2=A0 =C2=A0 CROS= S JOIN UNNEST(sequence(0, 9999)) AS b(v)
)
SELECT CAST(v AS INTEGER) = AS v,
=C2=A0 =C2=A0 =C2=A0 =C2=A0CASE
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0WHEN v % 10000 < 3333 THEN 'A'
=C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0WHEN v % 10000 >=3D 3333 AND v % 10000 < 6666 TH= EN 'B'
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN v % 10000 &= gt;=3D 6666 AND v % 10000 < 9999 THEN 'C'
=C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0WHEN v % 10000 =3D 9999 THEN 'D'
=C2=A0 = =C2=A0 =C2=A0 =C2=A0END AS cat
FROM nums;


Note: Trino sequ= ence() is limited to 10,000 elements per call, so a CROSS JOIN
is used. = For scale Sx, change sequence(0, 1) to sequence(0, S*2-1).


Oracl= e (1x scale, 20,000 rows):

DROP TABLE abcd_= test PURGE;
CREATE TABLE abcd_test AS
SELECT v,
=C2=A0 =C2=A0 =C2= =A0 =C2=A0CASE
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN MOD(v, 1000= 0) < 3333 THEN 'A'
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0W= HEN MOD(v, 10000) >=3D 3333 AND MOD(v, 10000) < 6666 THEN 'B'=
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN MOD(v, 10000) >=3D 666= 6 AND MOD(v, 10000) < 9999 THEN 'C'
=C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0WHEN MOD(v, 10000) =3D 9999 THEN 'D'
=C2=A0 =C2= =A0 =C2=A0 =C2=A0END AS cat
FROM (SELECT LEVEL - 1 AS v FROM dual CONNEC= T BY LEVEL <=3D 20000);

EXEC DBMS_STATS.GATHER_TABLE_STATS(USER, = 'ABCD_TEST');



Verify data distribution (all engin= es):

SELECT cat, COUNT(*) AS cnt FROM abcd_= test GROUP BY cat ORDER BY cat;

Expected (1x): A=3D6666, B=3D= 6666, C=3D6666, D=3D2


Step 2: Run Test 1 =E2=80=94 A+ B+ C+ D (m= atch expected)

Expected: 2 rows returned (one match per segment)
=
PostgreSQL:

SELECT match_first, match_l= ast, match_len
FROM (
=C2=A0 =C2=A0 SELECT v,
=C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0first_value(v) OVER w AS match_first,
=C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0last_value(v) OVER w AS match_last,
=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0count(*) OVER w AS match_len
=C2=A0 = =C2=A0 FROM abcd_test
=C2=A0 =C2=A0 WINDOW w AS (
=C2=A0 =C2=A0 =C2= =A0 =C2=A0 ORDER BY v
=C2=A0 =C2=A0 =C2=A0 =C2=A0 ROWS BETWEEN CURRENT R= OW AND UNBOUNDED FOLLOWING
=C2=A0 =C2=A0 =C2=A0 =C2=A0 AFTER MATCH SKIP = PAST LAST ROW
=C2=A0 =C2=A0 =C2=A0 =C2=A0 PATTERN (A+ B+ C+ D)
=C2=A0= =C2=A0 =C2=A0 =C2=A0 DEFINE
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 A= AS cat =3D 'A',
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 B AS = cat =3D 'B',
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 C AS cat = =3D 'C',
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 D AS cat =3D = 'D'
=C2=A0 =C2=A0 )
) result
WHERE match_len > 0;


Trino / Oracle:

SELECT match_f= irst, match_last, match_len
FROM abcd_test
MATCH_RECOGNIZE (
=C2= =A0 =C2=A0 ORDER BY v
=C2=A0 =C2=A0 MEASURES
=C2=A0 =C2=A0 =C2=A0 =C2= =A0 FIRST(v) AS match_first,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 LAST(v) AS matc= h_last,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 COUNT(*) AS match_len
=C2=A0 =C2= =A0 ONE ROW PER MATCH
=C2=A0 =C2=A0 AFTER MATCH SKIP PAST LAST ROW
= =C2=A0 =C2=A0 PATTERN (A+ B+ C+ D)
=C2=A0 =C2=A0 DEFINE
=C2=A0 =C2=A0= =C2=A0 =C2=A0 A AS cat =3D 'A',
=C2=A0 =C2=A0 =C2=A0 =C2=A0 B A= S cat =3D 'B',
=C2=A0 =C2=A0 =C2=A0 =C2=A0 C AS cat =3D 'C&#= 39;,
=C2=A0 =C2=A0 =C2=A0 =C2=A0 D AS cat =3D 'D'
) mr;

Note: Trino uses memory.test.abcd_test as the table name.

E= xpected result (1x):

match_first | match_la= st | match_len
------------|------------|----------
=C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 0 | =C2=A0 =C2=A0 =C2=A0 9999 | =C2=A0 =C2=A0 10000
= =C2=A0 =C2=A0 =C2=A0 10000 | =C2=A0 =C2=A0 =C2=A019999 | =C2=A0 =C2=A0 1000= 0



Step 3: Run Test 2 =E2=80=94 A+ B+ C+ E (match failure)=

Expected: 0 rows (E does not exist)

Use the same queries as = Test 1 with two changes:

PATTERN: (A+ B+ C+ D) =E2=86=92 (A+ B+ C+ E= )

DEFINE: cat =3D 'D' =E2=86=92 cat =3D 'E'

W= arning: Trino Test 2 at 1x scale takes approximately 5-6 minutes.

Step 4: Scale up (optional)

Re-create the test table at 2x scale (= 40,000 rows) and then repeat Step 2 and
Step 3.

PostgreSQL (2x sc= ale, 40,000 rows):

DROP TABLE IF EXISTS abc= d_test;
CREATE TABLE abcd_test AS
SELECT v,
=C2=A0 =C2=A0 =C2=A0 = =C2=A0CASE
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN v % 20000 < = 6666 THEN 'A'
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN v % = 20000 >=3D 6666 AND v % 20000 < 13332 THEN 'B'
=C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN v % 20000 >=3D 13332 AND v % 20000 &= lt; 19999 THEN 'C'
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN= v % 20000 =3D 19999 THEN 'D'
=C2=A0 =C2=A0 =C2=A0 =C2=A0END AS = cat
FROM generate_series(0, 39999) AS v;

ANALYZE abcd_test;


Trino (2x scale, 40,000 rows):

CREATE SCHEMA IF NOT EXISTS memory.test;

DROP TABLE IF EXISTS memor= y.test.abcd_test;
CREATE TABLE memory.test.abcd_test AS
WITH nums AS = (
=C2=A0 =C2=A0 SELECT a.v * 10000 + b.v AS v
=C2=A0 =C2=A0 FROM UNNE= ST(sequence(0, 3)) AS a(v)
=C2=A0 =C2=A0 CROSS JOIN UNNEST(sequence(0, 9= 999)) AS b(v)
)
SELECT CAST(v AS INTEGER) AS v,
=C2=A0 =C2=A0 =C2= =A0 =C2=A0CASE
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN v % 20000 &= lt; 6666 THEN 'A'
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN = v % 20000 >=3D 6666 AND v % 20000 < 13332 THEN 'B'
=C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN v % 20000 >=3D 13332 AND v % 2000= 0 < 19999 THEN 'C'
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0W= HEN v % 20000 =3D 19999 THEN 'D'
=C2=A0 =C2=A0 =C2=A0 =C2=A0END = AS cat
FROM nums;



Oracle (2x scale, 40,000 rows):
<= br>DROP TABLE abcd_test PURGE;
CREATE TABLE abc= d_test AS
SELECT v,
=C2=A0 =C2=A0 =C2=A0 =C2=A0CASE
=C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN MOD(v, 20000) < 6666 THEN 'A'=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN MOD(v, 20000) >=3D 6666 A= ND MOD(v, 20000) < 13332 THEN 'B'
=C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0WHEN MOD(v, 20000) >=3D 13332 AND MOD(v, 20000) < 19999= THEN 'C'
=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0WHEN MOD(v, 2= 0000) =3D 19999 THEN 'D'
=C2=A0 =C2=A0 =C2=A0 =C2=A0END AS catFROM (SELECT LEVEL - 1 AS v FROM dual CONNECT BY LEVEL <=3D 40000);

EXEC DBMS_STATS.GATHER_TABLE_STATS(USE= R, 'ABCD_TEST');

Expected (2x): A=3D13332, B=3D13332,= C=3D13332, D=3D4

Then run Step 2 and Step 3 as-is.

Warning: = Trino Test 2 at 2x scale takes approximately 20-25 minutes.


Plea= se let me know if you encounter any issues reproducing this.

Best re= gards
SungJun
=C2=A0
--000000000000b70974064d3133c4--