public inbox for [email protected]
help / color / mirror / Atom feedFrom: Tom Lane <[email protected]>
To: [email protected]
Cc: [email protected]
Subject: Re: BUG #19443: GIN jsonb_path_ops index not used when many rows contain nested null values (PostgreSQL 18.3)
Date: Mon, 30 Mar 2026 15:03:46 -0400
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
PG Bug reporting form <[email protected]> writes:
> When a jsonb column has a high proportion of rows with {"a": null}, and a
> small
> subset with valid nested paths like {"a":{"b":{"c":1}}}, the query planner
> chooses a sequential scan instead of using a jsonb_path_ops GIN index for
> the
> path query @? '$.a.b.c' LIMIT 10.
> This leads to unexpected performance degradation.
If I force the plan selection with enable_seqscan, the query gets
slower, and not by a small amount:
regression=# EXPLAIN (ANALYZE)
SELECT * FROM test_jsonb
WHERE data @? '$.a.b.c'
LIMIT 10;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..1.91 rows=10 width=22) (actual time=0.009..0.024 rows=10 loops=1)
-> Seq Scan on test_jsonb (cost=0.00..1938.00 rows=10130 width=22) (actual time=0.008..0.022 rows=10 loops=1)
Filter: (data @? '$."a"."b"."c"'::jsonpath)
Rows Removed by Filter: 75
Planning Time: 0.060 ms
Execution Time: 0.036 ms
(6 rows)
regression=# set enable_seqscan TO 0;
SET
regression=# EXPLAIN (ANALYZE)
SELECT * FROM test_jsonb
WHERE data @? '$.a.b.c'
LIMIT 10;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=150.51..151.31 rows=10 width=22) (actual time=4.963..4.979 rows=10 loops=1)
-> Bitmap Heap Scan on test_jsonb (cost=150.51..965.13 rows=10130 width=22) (actual time=4.963..4.977 rows=10 loops=1)
Recheck Cond: (data @? '$."a"."b"."c"'::jsonpath)
Rows Removed by Index Recheck: 75
Heap Blocks: exact=1
-> Bitmap Index Scan on idx_test_jsonb (cost=0.00..147.98 rows=10130 width=0) (actual time=4.900..4.900 rows=100000 loops=1)
Index Cond: (data @? '$."a"."b"."c"'::jsonpath)
Planning Time: 0.122 ms
Execution Time: 4.998 ms
(9 rows)
So it appears to me that the planner's decision was correct.
It looks like the index is unable to distinguish real matches
from false matches in this example, so it returns all 100000
rows and the bitmap heap scan has to filter most of them away.
So that's why it's slow. I'm not sure offhand if the planner's
cost model is accurately modeling this situation or if it
just got lucky ... but in any case, you've not demonstrated
that there's a planner bug.
regards, tom lane
view thread (2+ messages)
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: BUG #19443: GIN jsonb_path_ops index not used when many rows contain nested null values (PostgreSQL 18.3)
In-Reply-To: <[email protected]>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox