public inbox for [email protected]
help / color / mirror / Atom feedLimit GRAPH_TABLE path combinations to prevent memory exhaustion
3+ messages / 2 participants
[nested] [flat]
* Limit GRAPH_TABLE path combinations to prevent memory exhaustion
@ 2026-04-29 16:35 SATYANARAYANA NARLAPURAM <[email protected]>
0 siblings, 1 reply; 3+ messages in thread
From: SATYANARAYANA NARLAPURAM @ 2026-04-29 16:35 UTC (permalink / raw)
To: PostgreSQL Hackers <[email protected]>; Ashutosh Bapat <[email protected]>
Hi hackers,
generate_queries_for_path_pattern_recurse() enumerates all path
combinations by recursing over the Cartesian product of matching elements
per pattern position. Without IS label filters, each position matches
ALL tables of that kind, leading to N^K combinations (N tables, K
pattern positions). Each combination allocates a Query node via palloc
causing unbounded memory growth.
A 8-table graph with a -element pattern reaches 81.3 GB RES in a few seconds
before I cancel the query. Tests in the patch (those were failed) can
reproduce the problem
without the fix included in the patch.
top - 15:04:19 up 43 days, 19:18, 5 users, load average: 0.43, 0.19, 0.08
Tasks: 1 total, 1 running, 0 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.9 us, 0.8 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si,
0.0 st
MiB Mem : 515766.2 total, 248412.7 free, 234847.7 used, 48014.7 buff/cache
MiB Swap: 0.0 total, 0.0 free, 0.0 used. 280918.6 avail Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
COMMAND
649642 azureus+ 20 0 212.2g 81.3g 33948 R 100.0 16.1 0:41.20
postgres
As a POC I added a pre-computation check that calculates the total number
of path combinations before entering the
generate_queries_for_path_pattern_recurse.
If the product exceeds MAX_GRAPH_TABLE_PATH_COMBINATIONS (set to 10,000),
the rewriter reports ERRCODE_PROGRAM_LIMIT_EXCEEDED with a hint suggesting
IS label filters to reduce the search space. The limit of 10,000 is
somewhat arbitrary
but conservative. It caps memory at roughly 5 MB of Query nodes.
Patterns that would exceed the limit without labels can always be made to
succeed
by adding IS expressions to pin specific positions to fewer tables.
Alternatively, we can consider adding a GUC to control the limit but appears
to be an overkill. Thoughts?
Thanks,
Satya
Attachments:
[application/octet-stream] 0001-Limit-GRAPH_TABLE-path-combinations-to-prevent-memor.patch (10.8K, 3-0001-Limit-GRAPH_TABLE-path-combinations-to-prevent-memor.patch)
download | inline diff:
From bc1cbe54442757e973cf7d6ee9592a9e5babcb54 Mon Sep 17 00:00:00 2001
From: Satya Narlapuram <[email protected]>
Date: Wed, 29 Apr 2026 16:22:42 +0000
Subject: [PATCH] Limit GRAPH_TABLE path combinations to prevent memory
exhaustion
generate_queries_for_path_pattern_recurse() enumerates all path
combinations by recursing over the Cartesian product of matching elements
per pattern position. Without IS label filters, each position matches
ALL tables of that kind, leading to N^K combinations (N tables, K
pattern positions). Each combination allocates a Query node via palloc
into process memory with no limit, causing rapid memory exhaustion and
potential OOM kills.
Add a pre-computation check that calculates the total number of path
combinations before calling
generate_queries_for_path_pattern_recurse(). If the product exceeds
MAX_GRAPH_TABLE_PATH_COMBINATIONS (10000), report an error with a hint
suggesting IS label filters to reduce the search space.
This provides protection against memory exhaustion while
preserving correct behavior for reasonable graph patterns.
---
src/backend/rewrite/rewriteGraphTable.c | 25 ++++++++
src/test/regress/expected/graph_table.out | 70 +++++++++++++++++++++++
src/test/regress/sql/graph_table.sql | 64 +++++++++++++++++++++
3 files changed, 159 insertions(+)
diff --git a/src/backend/rewrite/rewriteGraphTable.c b/src/backend/rewrite/rewriteGraphTable.c
index 6867de6d..8a866e0c 100644
--- a/src/backend/rewrite/rewriteGraphTable.c
+++ b/src/backend/rewrite/rewriteGraphTable.c
@@ -45,6 +45,9 @@
#include "utils/ruleutils.h"
#include "utils/syscache.h"
+/* Limit on total path combinations to prevent combinatorial explosion. */
+#define MAX_GRAPH_TABLE_PATH_COMBINATIONS 10000
+
/*
* Represents one path factor in a path.
@@ -343,6 +346,28 @@ generate_queries_for_path_pattern(RangeTblEntry *rte, List *path_pattern)
path_elem_lists = lappend(path_elem_lists,
get_path_elements_for_path_factor(rte->relid, pf));
+ /*
+ * Check that the total number of path combinations doesn't exceed a
+ * reasonable limit. Without explicit IS label expressions, each element
+ * pattern matches all graph elements of the same kind, producing a
+ * Cartesian product that can easily exhaust process memory.
+ */
+ {
+ int64 total_paths = 1;
+
+ foreach_ptr(List, elems, path_elem_lists)
+ {
+ if (list_length(elems) > 0)
+ total_paths *= list_length(elems);
+ if (total_paths > MAX_GRAPH_TABLE_PATH_COMBINATIONS)
+ ereport(ERROR,
+ (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
+ errmsg("too many path combinations (%lld) in GRAPH_TABLE pattern",
+ (long long) total_paths),
+ errhint("Use explicit IS label expressions to reduce the number of matching elements.")));
+ }
+ }
+
pathqueries = generate_queries_for_path_pattern_recurse(rte, pathqueries,
NIL, path_elem_lists, 0);
if (!pathqueries)
diff --git a/src/test/regress/expected/graph_table.out b/src/test/regress/expected/graph_table.out
index 12b8706b..56fa1b5f 100644
--- a/src/test/regress/expected/graph_table.out
+++ b/src/test/regress/expected/graph_table.out
@@ -1032,4 +1032,74 @@ SELECT sname, dname FROM GRAPH_TABLE (g1 MATCH (src)->(dest) WHERE src.vprop1 >
ERROR: subqueries within GRAPH_TABLE reference are not supported
SELECT sname, dname FROM GRAPH_TABLE (g1 MATCH (src)->(dest) WHERE out_degree(src.vname) > (SELECT max(out_degree(nname)) FROM GRAPH_TABLE (g1 MATCH (node) COLUMNS (node.vname AS nname))) COLUMNS(src.vname AS sname, dest.vname AS dname));
ERROR: subqueries within GRAPH_TABLE reference are not supported
+-- Test: path combination limit prevents combinatorial explosion in the
+-- rewriter when MATCH patterns lack explicit IS label expressions.
+CREATE TABLE v4 (id int PRIMARY KEY, val int);
+CREATE TABLE v5 (id int PRIMARY KEY, val int);
+CREATE TABLE v6 (id int PRIMARY KEY, val int);
+CREATE TABLE v7 (id int PRIMARY KEY, val int);
+CREATE TABLE v8 (id int PRIMARY KEY, val int);
+CREATE TABLE v9 (id int PRIMARY KEY, val int);
+CREATE TABLE v10 (id int PRIMARY KEY, val int);
+CREATE TABLE v11 (id int PRIMARY KEY, val int);
+CREATE TABLE e1 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e2 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e3 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e4 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e5 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e6 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e7 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e8 (id int PRIMARY KEY, src int, dest int);
+INSERT INTO v4 VALUES (1, 10);
+INSERT INTO e1 VALUES (1, 1, 1);
+CREATE PROPERTY GRAPH g5
+ VERTEX TABLES (
+ v4 LABEL vl PROPERTIES (id, val) LABEL vl2 PROPERTIES (id, val),
+ v5 LABEL vl PROPERTIES (id, val),
+ v6 LABEL vl PROPERTIES (id, val),
+ v7 LABEL vl PROPERTIES (id, val),
+ v8 LABEL vl PROPERTIES (id, val),
+ v9 LABEL vl PROPERTIES (id, val),
+ v10 LABEL vl PROPERTIES (id, val),
+ v11 LABEL vl PROPERTIES (id, val)
+ )
+ EDGE TABLES (
+ e1 SOURCE KEY (src) REFERENCES v4 (id) DESTINATION KEY (dest) REFERENCES v5 (id)
+ LABEL el PROPERTIES (id),
+ e2 SOURCE KEY (src) REFERENCES v5 (id) DESTINATION KEY (dest) REFERENCES v6 (id)
+ LABEL el PROPERTIES (id),
+ e3 SOURCE KEY (src) REFERENCES v6 (id) DESTINATION KEY (dest) REFERENCES v7 (id)
+ LABEL el PROPERTIES (id),
+ e4 SOURCE KEY (src) REFERENCES v7 (id) DESTINATION KEY (dest) REFERENCES v8 (id)
+ LABEL el PROPERTIES (id),
+ e5 SOURCE KEY (src) REFERENCES v8 (id) DESTINATION KEY (dest) REFERENCES v9 (id)
+ LABEL el PROPERTIES (id),
+ e6 SOURCE KEY (src) REFERENCES v9 (id) DESTINATION KEY (dest) REFERENCES v10 (id)
+ LABEL el PROPERTIES (id),
+ e7 SOURCE KEY (src) REFERENCES v10 (id) DESTINATION KEY (dest) REFERENCES v11 (id)
+ LABEL el PROPERTIES (id),
+ e8 SOURCE KEY (src) REFERENCES v11 (id) DESTINATION KEY (dest) REFERENCES v4 (id)
+ LABEL el PROPERTIES (id)
+ );
+-- 3-element unlabeled: 8 * 8 * 8 = 512 combinations (under limit, succeeds)
+SELECT count(*) FROM GRAPH_TABLE (g5 MATCH (a)-[e]->(b) COLUMNS (a.id AS aid));
+ count
+-------
+ 0
+(1 row)
+
+-- 5-element unlabeled: 8^5 = 32,768 combinations (over limit, rejected)
+SELECT count(*) FROM GRAPH_TABLE (g5 MATCH (a)-[e1]->(b)-[e2]->(c) COLUMNS (a.id AS aid));
+ERROR: too many path combinations (32768) in GRAPH_TABLE pattern
+HINT: Use explicit IS label expressions to reduce the number of matching elements.
+-- IS vl2 pins first vertex to 1 table: 1*8*8*8*8 = 4,096 (under limit)
+SELECT count(*) FROM GRAPH_TABLE (g5 MATCH (a IS vl2)-[e1]->(b)-[e2]->(c) COLUMNS (a.id AS aid));
+ count
+-------
+ 0
+(1 row)
+
+DROP PROPERTY GRAPH g5;
+DROP TABLE e8, e7, e6, e5, e4, e3, e2, e1;
+DROP TABLE v11, v10, v9, v8, v7, v6, v5, v4;
-- leave the objects behind for pg_upgrade/pg_dump tests
diff --git a/src/test/regress/sql/graph_table.sql b/src/test/regress/sql/graph_table.sql
index a5df4647..69313544 100644
--- a/src/test/regress/sql/graph_table.sql
+++ b/src/test/regress/sql/graph_table.sql
@@ -590,4 +590,68 @@ SELECT * FROM customers co WHERE co.customer_id = (SELECT customer_id FROM GRAPH
SELECT sname, dname FROM GRAPH_TABLE (g1 MATCH (src)->(dest) WHERE src.vprop1 > (SELECT max(v1.vprop1) FROM v1) COLUMNS(src.vname AS sname, dest.vname AS dname));
SELECT sname, dname FROM GRAPH_TABLE (g1 MATCH (src)->(dest) WHERE out_degree(src.vname) > (SELECT max(out_degree(nname)) FROM GRAPH_TABLE (g1 MATCH (node) COLUMNS (node.vname AS nname))) COLUMNS(src.vname AS sname, dest.vname AS dname));
+-- Test: path combination limit prevents combinatorial explosion in the
+-- rewriter when MATCH patterns lack explicit IS label expressions.
+CREATE TABLE v4 (id int PRIMARY KEY, val int);
+CREATE TABLE v5 (id int PRIMARY KEY, val int);
+CREATE TABLE v6 (id int PRIMARY KEY, val int);
+CREATE TABLE v7 (id int PRIMARY KEY, val int);
+CREATE TABLE v8 (id int PRIMARY KEY, val int);
+CREATE TABLE v9 (id int PRIMARY KEY, val int);
+CREATE TABLE v10 (id int PRIMARY KEY, val int);
+CREATE TABLE v11 (id int PRIMARY KEY, val int);
+CREATE TABLE e1 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e2 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e3 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e4 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e5 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e6 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e7 (id int PRIMARY KEY, src int, dest int);
+CREATE TABLE e8 (id int PRIMARY KEY, src int, dest int);
+INSERT INTO v4 VALUES (1, 10);
+INSERT INTO e1 VALUES (1, 1, 1);
+
+CREATE PROPERTY GRAPH g5
+ VERTEX TABLES (
+ v4 LABEL vl PROPERTIES (id, val) LABEL vl2 PROPERTIES (id, val),
+ v5 LABEL vl PROPERTIES (id, val),
+ v6 LABEL vl PROPERTIES (id, val),
+ v7 LABEL vl PROPERTIES (id, val),
+ v8 LABEL vl PROPERTIES (id, val),
+ v9 LABEL vl PROPERTIES (id, val),
+ v10 LABEL vl PROPERTIES (id, val),
+ v11 LABEL vl PROPERTIES (id, val)
+ )
+ EDGE TABLES (
+ e1 SOURCE KEY (src) REFERENCES v4 (id) DESTINATION KEY (dest) REFERENCES v5 (id)
+ LABEL el PROPERTIES (id),
+ e2 SOURCE KEY (src) REFERENCES v5 (id) DESTINATION KEY (dest) REFERENCES v6 (id)
+ LABEL el PROPERTIES (id),
+ e3 SOURCE KEY (src) REFERENCES v6 (id) DESTINATION KEY (dest) REFERENCES v7 (id)
+ LABEL el PROPERTIES (id),
+ e4 SOURCE KEY (src) REFERENCES v7 (id) DESTINATION KEY (dest) REFERENCES v8 (id)
+ LABEL el PROPERTIES (id),
+ e5 SOURCE KEY (src) REFERENCES v8 (id) DESTINATION KEY (dest) REFERENCES v9 (id)
+ LABEL el PROPERTIES (id),
+ e6 SOURCE KEY (src) REFERENCES v9 (id) DESTINATION KEY (dest) REFERENCES v10 (id)
+ LABEL el PROPERTIES (id),
+ e7 SOURCE KEY (src) REFERENCES v10 (id) DESTINATION KEY (dest) REFERENCES v11 (id)
+ LABEL el PROPERTIES (id),
+ e8 SOURCE KEY (src) REFERENCES v11 (id) DESTINATION KEY (dest) REFERENCES v4 (id)
+ LABEL el PROPERTIES (id)
+ );
+
+-- 3-element unlabeled: 8 * 8 * 8 = 512 combinations (under limit, succeeds)
+SELECT count(*) FROM GRAPH_TABLE (g5 MATCH (a)-[e]->(b) COLUMNS (a.id AS aid));
+
+-- 5-element unlabeled: 8^5 = 32,768 combinations (over limit, rejected)
+SELECT count(*) FROM GRAPH_TABLE (g5 MATCH (a)-[e1]->(b)-[e2]->(c) COLUMNS (a.id AS aid));
+
+-- IS vl2 pins first vertex to 1 table: 1*8*8*8*8 = 4,096 (under limit)
+SELECT count(*) FROM GRAPH_TABLE (g5 MATCH (a IS vl2)-[e1]->(b)-[e2]->(c) COLUMNS (a.id AS aid));
+
+DROP PROPERTY GRAPH g5;
+DROP TABLE e8, e7, e6, e5, e4, e3, e2, e1;
+DROP TABLE v11, v10, v9, v8, v7, v6, v5, v4;
+
-- leave the objects behind for pg_upgrade/pg_dump tests
--
2.43.0
^ permalink raw reply [nested|flat] 3+ messages in thread
* Re: Limit GRAPH_TABLE path combinations to prevent memory exhaustion
@ 2026-04-30 07:15 Ashutosh Bapat <[email protected]>
parent: SATYANARAYANA NARLAPURAM <[email protected]>
0 siblings, 1 reply; 3+ messages in thread
From: Ashutosh Bapat @ 2026-04-30 07:15 UTC (permalink / raw)
To: SATYANARAYANA NARLAPURAM <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Wed, Apr 29, 2026 at 10:05 PM SATYANARAYANA NARLAPURAM
<[email protected]> wrote:
>
> Hi hackers,
>
> generate_queries_for_path_pattern_recurse() enumerates all path
> combinations by recursing over the Cartesian product of matching elements
> per pattern position. Without IS label filters, each position matches
> ALL tables of that kind, leading to N^K combinations (N tables, K
> pattern positions). Each combination allocates a Query node via palloc
> causing unbounded memory growth.
>
> A 8-table graph with a -element pattern reaches 81.3 GB RES in a few seconds
> before I cancel the query. Tests in the patch (those were failed) can reproduce the problem
> without the fix included in the patch.
>
> top - 15:04:19 up 43 days, 19:18, 5 users, load average: 0.43, 0.19, 0.08
> Tasks: 1 total, 1 running, 0 sleeping, 0 stopped, 0 zombie
> %Cpu(s): 0.9 us, 0.8 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
> MiB Mem : 515766.2 total, 248412.7 free, 234847.7 used, 48014.7 buff/cache
> MiB Swap: 0.0 total, 0.0 free, 0.0 used. 280918.6 avail Mem
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 649642 azureus+ 20 0 212.2g 81.3g 33948 R 100.0 16.1 0:41.20 postgres
>
I tried to reproduce this problem on my laptop with queries included
in the patch. For me all the queries finished. First within 4 ms,
second within 133 ms and the third in 12ms. Did you try with more
edges or more rows in the tables?
>
> As a POC I added a pre-computation check that calculates the total number
> of path combinations before entering the generate_queries_for_path_pattern_recurse.
> If the product exceeds MAX_GRAPH_TABLE_PATH_COMBINATIONS (set to 10,000),
> the rewriter reports ERRCODE_PROGRAM_LIMIT_EXCEEDED with a hint suggesting
> IS label filters to reduce the search space. The limit of 10,000 is somewhat arbitrary
> but conservative. It caps memory at roughly 5 MB of Query nodes.
> Patterns that would exceed the limit without labels can always be made to succeed
> by adding IS expressions to pin specific positions to fewer tables.
> Alternatively, we can consider adding a GUC to control the limit but appears
> to be an overkill. Thoughts?
I understand the problem and can understand the pain. Somebody who is
writing a query like ()-()-()-()-() ... should know that every pattern
that doesn't have a label will be matched against each vertex/edge.
Our documentation makes it explicit [1] "The above patterns would
match any vertex, or any two vertices connected by any edge ... ". But
if they are really interested in matching every vertex or edge they
don't have any other choice. Using restrictive label expressions would
lead to wrong or incomplete results. If they can use label expressions
they should do so anyway, otherwise they will end up with incorrect
results. To me it seems like somebody who is writing such a pattern
without providing enough resources is writing a bad query.
We have other places where queries can consume large amounts of memory
during planning or execution. Simply take the SQL query equivalent to
the above pattern. We do not have a way to prohibit such queries as
far as I know. I understand that SQL/PGQ makes it easy to write such
queries by hand. But that seems to be abusing a powerful tool.
Another example is joining partitioned tables with thousands of
partitions. We have a GUC which enables or disables partitionwise join
but there is no GUC to limit the number of tables or partitions being
joined.
I think we can document that such a pattern can result in large
queries which may consume memory.
Said that 81.3 GB looks unreasonably large for
generate_queries_for_path_pattern_recurse() alone. I guess a large
portion of it comes from planning and execution. How many rows did
those tables had? Which phase of query execution consumed that much
memory? Do you have a dump of memory contexts when it reaches that
limit?
[1] https://www.postgresql.org/docs/devel/queries-graph.html
--
Best Wishes,
Ashutosh Bapat
^ permalink raw reply [nested|flat] 3+ messages in thread
* Re: Limit GRAPH_TABLE path combinations to prevent memory exhaustion
@ 2026-04-30 19:01 SATYANARAYANA NARLAPURAM <[email protected]>
parent: Ashutosh Bapat <[email protected]>
0 siblings, 0 replies; 3+ messages in thread
From: SATYANARAYANA NARLAPURAM @ 2026-04-30 19:01 UTC (permalink / raw)
To: Ashutosh Bapat <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
Hi,
On Thu, Apr 30, 2026 at 12:16 AM Ashutosh Bapat <
[email protected]> wrote:
> On Wed, Apr 29, 2026 at 10:05 PM SATYANARAYANA NARLAPURAM
> <[email protected]> wrote:
> >
> > Hi hackers,
> >
> > generate_queries_for_path_pattern_recurse() enumerates all path
> > combinations by recursing over the Cartesian product of matching elements
> > per pattern position. Without IS label filters, each position matches
> > ALL tables of that kind, leading to N^K combinations (N tables, K
> > pattern positions). Each combination allocates a Query node via palloc
> > causing unbounded memory growth.
> >
> > A 8-table graph with a -element pattern reaches 81.3 GB RES in a few
> seconds
> > before I cancel the query. Tests in the patch (those were failed) can
> reproduce the problem
> > without the fix included in the patch.
> >
> > top - 15:04:19 up 43 days, 19:18, 5 users, load average: 0.43, 0.19,
> 0.08
> > Tasks: 1 total, 1 running, 0 sleeping, 0 stopped, 0 zombie
> > %Cpu(s): 0.9 us, 0.8 sy, 0.0 ni, 98.3 id, 0.0 wa, 0.0 hi, 0.0 si,
> 0.0 st
> > MiB Mem : 515766.2 total, 248412.7 free, 234847.7 used, 48014.7
> buff/cache
> > MiB Swap: 0.0 total, 0.0 free, 0.0 used. 280918.6 avail
> Mem
> >
> > PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+
> COMMAND
> > 649642 azureus+ 20 0 212.2g 81.3g 33948 R 100.0 16.1 0:41.20
> postgres
> >
>
> I tried to reproduce this problem on my laptop with queries included
> in the patch. For me all the queries finished. First within 4 ms,
> second within 133 ms and the third in 12ms. Did you try with more
> edges or more rows in the tables?
>
> >
> > As a POC I added a pre-computation check that calculates the total number
> > of path combinations before entering the
> generate_queries_for_path_pattern_recurse.
> > If the product exceeds MAX_GRAPH_TABLE_PATH_COMBINATIONS (set to 10,000),
> > the rewriter reports ERRCODE_PROGRAM_LIMIT_EXCEEDED with a hint
> suggesting
> > IS label filters to reduce the search space. The limit of 10,000 is
> somewhat arbitrary
> > but conservative. It caps memory at roughly 5 MB of Query nodes.
> > Patterns that would exceed the limit without labels can always be made
> to succeed
> > by adding IS expressions to pin specific positions to fewer tables.
> > Alternatively, we can consider adding a GUC to control the limit but
> appears
> > to be an overkill. Thoughts?
>
> I understand the problem and can understand the pain. Somebody who is
> writing a query like ()-()-()-()-() ... should know that every pattern
> that doesn't have a label will be matched against each vertex/edge.
> Our documentation makes it explicit [1] "The above patterns would
> match any vertex, or any two vertices connected by any edge ... ". But
> if they are really interested in matching every vertex or edge they
> don't have any other choice. Using restrictive label expressions would
> lead to wrong or incomplete results. If they can use label expressions
> they should do so anyway, otherwise they will end up with incorrect
> results. To me it seems like somebody who is writing such a pattern
> without providing enough resources is writing a bad query.
>
> We have other places where queries can consume large amounts of memory
> during planning or execution. Simply take the SQL query equivalent to
> the above pattern. We do not have a way to prohibit such queries as
> far as I know. I understand that SQL/PGQ makes it easy to write such
> queries by hand. But that seems to be abusing a powerful tool.
>
> Another example is joining partitioned tables with thousands of
> partitions. We have a GUC which enables or disables partitionwise join
> but there is no GUC to limit the number of tables or partitions being
> joined.
>
> I think we can document that such a pattern can result in large
> queries which may consume memory.
>
> Said that 81.3 GB looks unreasonably large for
> generate_queries_for_path_pattern_recurse() alone. I guess a large
> portion of it comes from planning and execution. How many rows did
> those tables had? Which phase of query execution consumed that much
> memory? Do you have a dump of memory contexts when it reaches that
> limit?
>
I am worried about a potential DOS by a low privileged user or an
accidental query
causing OOM.
Please find the attached patch that added traces to print the memory
context.
Patch also includes CFI to cancel the query which we didn't have earlier.
You can see traces like below once you run the repro:
2026-04-30 18:43:51.268 UTC [927121] LOG: GRAPH_TABLE: before
generate_queries_for_path_pattern_recurse, current memory context
"MessageContext": 39392903168 bytes total
2026-04-30 18:43:51.268 UTC [927121] STATEMENT: SELECT count(*) FROM
GRAPH_TABLE (g5 MATCH (a)-[e1]->(b)-[e2]->(c)-[e3]->(d)-[e4]->(e) COLUMNS (
a.id AS aid));
2026-04-30 18:43:51.268 UTC [927121] ERROR: canceling statement due to
user request
2026-04-30 18:43:51.268 UTC [927121] STATEMENT: SELECT count(*) FROM
GRAPH_TABLE (g5 MATCH (a)-[e1]->(b)-[e2]->(c)-[e3]->(d)-[e4]->(e) COLUMNS (
a.id AS aid));
Repro:
CREATE temp TABLE v1 (id int PRIMARY KEY, val int);
CREATE temp TABLE v2 (id int PRIMARY KEY, val int);
CREATE temp TABLE v3 (id int PRIMARY KEY, val int);
CREATE temp TABLE v4 (id int PRIMARY KEY, val int);
CREATE temp TABLE v5 (id int PRIMARY KEY, val int);
CREATE temp TABLE v6 (id int PRIMARY KEY, val int);
CREATE temp TABLE v7 (id int PRIMARY KEY, val int);
CREATE temp TABLE v8 (id int PRIMARY KEY, val int);
CREATE temp TABLE e1 (id int PRIMARY KEY, src int, dest int);
CREATE temp TABLE e2 (id int PRIMARY KEY, src int, dest int);
CREATE temp TABLE e3 (id int PRIMARY KEY, src int, dest int);
CREATE temp TABLE e4 (id int PRIMARY KEY, src int, dest int);
CREATE temp TABLE e5 (id int PRIMARY KEY, src int, dest int);
CREATE temp TABLE e6 (id int PRIMARY KEY, src int, dest int);
CREATE temp TABLE e7 (id int PRIMARY KEY, src int, dest int);
CREATE temp TABLE e8 (id int PRIMARY KEY, src int, dest int);
CREATE PROPERTY GRAPH g5
VERTEX TABLES (
v1 LABEL vl PROPERTIES (id, val) LABEL vl2 PROPERTIES (id, val),
v2 LABEL vl PROPERTIES (id, val),
v3 LABEL vl PROPERTIES (id, val),
v4 LABEL vl PROPERTIES (id, val),
v5 LABEL vl PROPERTIES (id, val),
v6 LABEL vl PROPERTIES (id, val),
v7 LABEL vl PROPERTIES (id, val),
v8 LABEL vl PROPERTIES (id, val)
)
EDGE TABLES (
e1 SOURCE KEY (src) REFERENCES v1 (id) DESTINATION KEY (dest)
REFERENCES v2 (id) LABEL el PROPERTIES (id),
e2 SOURCE KEY (src) REFERENCES v2 (id) DESTINATION KEY (dest)
REFERENCES v3 (id) LABEL el PROPERTIES (id),
e3 SOURCE KEY (src) REFERENCES v3 (id) DESTINATION KEY (dest)
REFERENCES v4 (id) LABEL el PROPERTIES (id),
e4 SOURCE KEY (src) REFERENCES v4 (id) DESTINATION KEY (dest)
REFERENCES v5 (id) LABEL el PROPERTIES (id),
e5 SOURCE KEY (src) REFERENCES v5 (id) DESTINATION KEY (dest)
REFERENCES v6 (id) LABEL el PROPERTIES (id),
e6 SOURCE KEY (src) REFERENCES v6 (id) DESTINATION KEY (dest)
REFERENCES v7 (id) LABEL el PROPERTIES (id),
e7 SOURCE KEY (src) REFERENCES v7 (id) DESTINATION KEY (dest)
REFERENCES v8 (id) LABEL el PROPERTIES (id),
e8 SOURCE KEY (src) REFERENCES v8 (id) DESTINATION KEY (dest)
REFERENCES v1 (id) LABEL el PROPERTIES (id)
);
SELECT count(*) FROM GRAPH_TABLE (g5 MATCH
(a)-[e1]->(b)-[e2]->(c)-[e3]->(d)-[e4]->(e) COLUMNS (a.id AS aid));
Attachments:
[application/octet-stream] 0001-Add-memory-context-debugging-for-GRAPH_TABLE-path-ge.patch (1.8K, 3-0001-Add-memory-context-debugging-for-GRAPH_TABLE-path-ge.patch)
download | inline diff:
diff --git a/src/backend/rewrite/rewriteGraphTable.c b/src/backend/rewrite/rewriteGraphTable.c
index 6867de6d..9b25fe6f 100644
--- a/src/backend/rewrite/rewriteGraphTable.c
+++ b/src/backend/rewrite/rewriteGraphTable.c
@@ -42,6 +42,7 @@
#include "utils/builtins.h"
#include "utils/fmgroids.h"
#include "utils/lsyscache.h"
+#include "utils/memutils.h"
#include "utils/ruleutils.h"
#include "utils/syscache.h"
@@ -342,7 +343,6 @@ generate_queries_for_path_pattern(RangeTblEntry *rte, List *path_pattern)
foreach_ptr(struct path_factor, pf, path_factors)
path_elem_lists = lappend(path_elem_lists,
get_path_elements_for_path_factor(rte->relid, pf));
-
pathqueries = generate_queries_for_path_pattern_recurse(rte, pathqueries,
NIL, path_elem_lists, 0);
if (!pathqueries)
@@ -367,6 +367,8 @@ generate_queries_for_path_pattern_recurse(RangeTblEntry *rte, List *pathqueries,
foreach_ptr(struct path_element, pe, path_elems)
{
+ CHECK_FOR_INTERRUPTS();
+
/* Update current path being built with current element. */
cur_path = lappend(cur_path, pe);
@@ -383,10 +385,19 @@ generate_queries_for_path_pattern_recurse(RangeTblEntry *rte, List *pathqueries,
pathqueries = lappend(pathqueries, pathquery);
}
else
+ {
+ /* Dump memory context before recursive path query generation */
+ ereport(LOG,
+ (errmsg("GRAPH_TABLE: before generate_queries_for_path_pattern_recurse, "
+ "current memory context \"%s\": %zu bytes total",
+ CurrentMemoryContext->name,
+ MemoryContextMemAllocated(CurrentMemoryContext, true))));
pathqueries = generate_queries_for_path_pattern_recurse(rte, pathqueries,
cur_path,
path_elem_lists,
elempos + 1);
+ }
+
/* Make way for the next element at the same position. */
cur_path = list_delete_last(cur_path);
}
^ permalink raw reply [nested|flat] 3+ messages in thread
end of thread, other threads:[~2026-04-30 19:01 UTC | newest]
Thread overview: 3+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2026-04-29 16:35 Limit GRAPH_TABLE path combinations to prevent memory exhaustion SATYANARAYANA NARLAPURAM <[email protected]>
2026-04-30 07:15 ` Ashutosh Bapat <[email protected]>
2026-04-30 19:01 ` SATYANARAYANA NARLAPURAM <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox