public inbox for [email protected]  
help / color / mirror / Atom feed
Re: BUG #19362: Extremely log processing of jsonb_path_exists_opr
2+ messages / 2 participants
[nested] [flat]

* Re: BUG #19362: Extremely log processing of jsonb_path_exists_opr
@ 2025-12-23 19:02  Andrey Rachitskiy <[email protected]>
  0 siblings, 1 reply; 2+ messages in thread

From: Andrey Rachitskiy @ 2025-12-23 19:02 UTC (permalink / raw)
  To: [email protected]; [email protected]


More examples to reproduce the problem:
```
–- Create a JSON array with nesting depth of 1000
CREATE TEMP TABLE test_json AS 
SELECT ('[' || repeat('[', 1000) || '0' || repeat(']', 1000) || ']')::jsonb AS data;
 
–- Query with 4 `.**` operators executes quickly
SELECT data @? '$.**.**.**.**' FROM test_json;
 ?column? 
----------
 t
(1 row)
Time: 0,550 ms
 
–- Same query with added `.*` in LAX mode hangs > 23 minutes (not the limit)
SELECT data @? '$.**.**.**.**.*' FROM test_json;
^CCancel request sent
ERROR:  canceling statement due to user request
Time: 1390914,431 ms (23:10,914)
```
 
Code Analysis (src/backend/utils/adt/jsonpath_exec.c)
Problem 1: Recursive .** Operator Execution
The executeAnyItem() function calls itself recursively for each nested level:
```
if (level < last && v.type == jbvBinary)
{     // Recursive call for each nested level    
      res = executeAnyItem(cxt, jsp, v.val.binary.data, found, level + 1, first, last, ignoreStructuralErrors, unwrapNext);
}
```
Each additional .** operator forces this recursion to repeat for every already discovered level, creating a multiplicative effect:
 
N levels × N levels × ... — repeated as many times as there are .** operators.
 
Problem 2: .* in LAX Mode Triggers Re-entry
When processing .* for arrays in LAX mode, the code calls executeItemUnwrapTargetArray(), which restarts the recursive traversal from the beginning (adding another full pass through all levels to the already explosive complexity):
```
return executeAnyItem(cxt, jsp, jb->val.binary.data, found,
                      1, 1, 1, false, unwrapElements);  // re-entry! 
```
 
Additional Issues:
Strict Mode Problems
```
-– Memory allocation error after ~5.5 seconds 
postgres=# SELECT data @? 'strict $.**.**.**.**.**'::jsonpath from test_json;
ERROR:  invalid memory alloc request size 1073741824
Time: 5537,986 ms (00:05,538)
 
-– Hangs > 32 minutes (not the limit)
SELECT data @? 'strict $.**.**.**.**.**.*'::jsonpath FROM test_json;
^CCancel request sent
ERROR:  canceling statement due to user request
Time: 1927249,549 ms (32:07,250)
```
Strict mode exhibits similar behavior to LAX mode, consuming significant CPU resources.
 
Affected Functions: The jsonb_path_query function is also susceptible to this performance issue.
 
-- 
regards, Andrey Rachitskiy

^ permalink  raw  reply  [nested|flat] 2+ messages in thread

* Re: BUG #19362: Extremely log processing of jsonb_path_exists_opr
@ 2026-05-01 12:20  Andrey Borodin <[email protected]>
  parent: Andrey Rachitskiy <[email protected]>
  0 siblings, 0 replies; 2+ messages in thread

From: Andrey Borodin @ 2026-05-01 12:20 UTC (permalink / raw)
  To: Andrey Rachitskiy <[email protected]>; +Cc: [email protected]; [email protected]



> On 24 Dec 2025, at 00:02, Andrey Rachitskiy <[email protected]> wrote:
> 
> Strict mode exhibits similar behavior to LAX mode, consuming significant CPU resources.
> 

Hi!

Examples that you show represent an opportunity for a performance optimization,
but hardly constitute a bug.

But I think to showcase this opportunity it would be good to demonstrate more
realistic query.

I cannot imagine end user wanting a query like

SELECT data @? '$.**.**.**.**.*' FROM test_json;

or

select
'[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[[0]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]]'::jsonb
@? '++$.**.**.**.**.*'::jsonpath;

Please, construct a more realistic example of what user might want, but not get
in reasonable time.

FWIW, I've asked LLM and it thinks that optimization is worth doing:

<LLM output>

The query pattern $.**[*] ? (@ == "x") or $.**.*  ? (@ == "x") is something a 
developer might write when searching for values at any depth in JSON data that 
contains arrays (task lists, comment threads, nested categories). The intent 
is clear: "find this value anywhere in the structure, including inside arrays."
The performance degradation from O(N) to O(N²) occurs at moderate nesting 
depths that can appear in real data — 50-level comment threads, 30-level 
org charts, etc.
More importantly, if the application passes user-provided jsonpath expressions 
to @?, the $.**.**.**.**.*  pattern becomes a DoS vector requiring no special 
privileges — only the ability to submit a query.
A meaningful optimization would be to collapse consecutive .** operators 
(.** {a,b} .** {c,d} → .** {a+c, b+d}) at parse or execution time, which 
would make $.**.**.**.**.*  behave identically to $.**.*  in O(N²) time 
rather than O(N⁵).

</LLM output>

But I think it's a matter for pgsql-hackers, not pgsql-bugs.


Best regards, Andrey Borodin.





^ permalink  raw  reply  [nested|flat] 2+ messages in thread


end of thread, other threads:[~2026-05-01 12:20 UTC | newest]

Thread overview: 2+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-12-23 19:02 Re: BUG #19362: Extremely log processing of jsonb_path_exists_opr Andrey Rachitskiy <[email protected]>
2026-05-01 12:20 ` Andrey Borodin <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox