Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w7Hu0-0059L2-20 for pgsql-bugs@arkaria.postgresql.org; Mon, 30 Mar 2026 19:03:56 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.96) (envelope-from ) id 1w7Htz-005liw-0b for pgsql-bugs@arkaria.postgresql.org; Mon, 30 Mar 2026 19:03:55 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.96) (envelope-from ) id 1w7Hty-005lin-32 for pgsql-bugs@lists.postgresql.org; Mon, 30 Mar 2026 19:03:55 +0000 Received: from sss.pgh.pa.us ([68.162.161.243]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.98.2) (envelope-from ) id 1w7Htw-000000022sr-3Yke for pgsql-bugs@lists.postgresql.org; Mon, 30 Mar 2026 19:03:55 +0000 Received: from sss1.sss.pgh.pa.us (localhost [127.0.0.1]) by sss.pgh.pa.us (8.15.2/8.15.2) with ESMTP id 62UJ3kKC2546011; Mon, 30 Mar 2026 15:03:46 -0400 From: Tom Lane To: zheng_xianghang@163.com cc: pgsql-bugs@lists.postgresql.org Subject: Re: BUG #19443: GIN jsonb_path_ops index not used when many rows contain nested null values (PostgreSQL 18.3) In-reply-to: <19443-a8d2aef8b11ea452@postgresql.org> References: <19443-a8d2aef8b11ea452@postgresql.org> Comments: In-reply-to PG Bug reporting form message dated "Mon, 30 Mar 2026 03:09:55 -0000" MIME-Version: 1.0 Content-Type: text/plain; charset="us-ascii" Content-ID: <2546009.1774897426.1@sss.pgh.pa.us> Content-Transfer-Encoding: quoted-printable Date: Mon, 30 Mar 2026 15:03:46 -0400 Message-ID: <2546010.1774897426@sss.pgh.pa.us> List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk PG Bug reporting form writes: > When a jsonb column has a high proportion of rows with {"a": null}, and = a > small > subset with valid nested paths like {"a":{"b":{"c":1}}}, the query plann= er > chooses a sequential scan instead of using a jsonb_path_ops GIN index fo= r > the > path query @? '$.a.b.c' LIMIT 10. > This leads to unexpected performance degradation. If I force the plan selection with enable_seqscan, the query gets slower, and not by a small amount: regression=3D# EXPLAIN (ANALYZE) SELECT * FROM test_jsonb WHERE data @? '$.a.b.c' LIMIT 10; QUERY PLAN = = --------------------------------------------------------------------------= ----------------------------------------- Limit (cost=3D0.00..1.91 rows=3D10 width=3D22) (actual time=3D0.009..0.0= 24 rows=3D10 loops=3D1) -> Seq Scan on test_jsonb (cost=3D0.00..1938.00 rows=3D10130 width=3D= 22) (actual time=3D0.008..0.022 rows=3D10 loops=3D1) Filter: (data @? '$."a"."b"."c"'::jsonpath) Rows Removed by Filter: 75 Planning Time: 0.060 ms Execution Time: 0.036 ms (6 rows) regression=3D# set enable_seqscan TO 0; SET regression=3D# EXPLAIN (ANALYZE) SELECT * FROM test_jsonb WHERE data @? '$.a.b.c' LIMIT 10; QUERY PLAN = = --------------------------------------------------------------------------= -------------------------------------------------------------- Limit (cost=3D150.51..151.31 rows=3D10 width=3D22) (actual time=3D4.963.= .4.979 rows=3D10 loops=3D1) -> Bitmap Heap Scan on test_jsonb (cost=3D150.51..965.13 rows=3D10130= width=3D22) (actual time=3D4.963..4.977 rows=3D10 loops=3D1) Recheck Cond: (data @? '$."a"."b"."c"'::jsonpath) Rows Removed by Index Recheck: 75 Heap Blocks: exact=3D1 -> Bitmap Index Scan on idx_test_jsonb (cost=3D0.00..147.98 row= s=3D10130 width=3D0) (actual time=3D4.900..4.900 rows=3D100000 loops=3D1) Index Cond: (data @? '$."a"."b"."c"'::jsonpath) Planning Time: 0.122 ms Execution Time: 4.998 ms (9 rows) So it appears to me that the planner's decision was correct. It looks like the index is unable to distinguish real matches from false matches in this example, so it returns all 100000 rows and the bitmap heap scan has to filter most of them away. So that's why it's slow. I'm not sure offhand if the planner's cost model is accurately modeling this situation or if it just got lucky ... but in any case, you've not demonstrated that there's a planner bug. regards, tom lane