public inbox for [email protected]  
help / color / mirror / Atom feed
Re: Performance of JSON type in postgres
2+ messages / 2 participants
[nested] [flat]

* Re: Performance of JSON type in postgres
@ 2025-07-15 17:32 Merlin Moncure <[email protected]>
  2025-07-19 20:39 ` Re: Performance of JSON type in postgres veem v <[email protected]>
  0 siblings, 1 reply; 2+ messages in thread

From: Merlin Moncure @ 2025-07-15 17:32 UTC (permalink / raw)
  To: David G. Johnston <[email protected]>; +Cc: Adrian Klaver <[email protected]>; veem v <[email protected]>; pgsql-general <[email protected]>

On Mon, Jul 14, 2025 at 2:01 PM David G. Johnston <
[email protected]> wrote:

> On Mon, Jul 14, 2025 at 12:54 PM Adrian Klaver <[email protected]>
> wrote:
>
>> On 7/14/25 12:51, veem v wrote:
>> > So I want to
>> > understand the experts' opinion on this which I believe will  be
>> > crucial during design itself.
>>
>> It is spelled out here:
>>
>> https://www.postgresql.org/docs/current/datatype-json.html
>>
>>
> I've taken to heart the main takeaway from that page:
>
> "In general, most applications should prefer to store JSON data as jsonb,
> unless there are quite specialized needs, such as legacy assumptions about
> ordering of object keys."
>

I don't think the documentation is accurate at all, unless one of those
specialized needs is to 'be faster'.   json serialization is more than 2x
faster based on simple testing (see below).   This is absolutely not a
trivial difference.

I would say, use json for serialization, use jsonb for data storage, unless
the precise structure of the input document is important.

merlin

leaselock_iam@leaselock_prod=> explain analyze select json_agg(l) from (
select l from llcore.lease l limit 10000) q;
                                                          QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=405.52..405.53 rows=1 width=32) (actual
time=69.043..69.048 rows=1 loops=1)
   ->  Limit  (cost=0.00..380.52 rows=10000 width=247) (actual
time=0.017..9.764 rows=10000 loops=1)
         ->  Seq Scan on lease l  (cost=0.00..100383.89 rows=2638089
width=247) (actual time=0.016..8.831 rows=10000 loops=1)
 Planning Time: 0.109 ms
 Execution Time: 69.088 ms
(5 rows)

Time: 160.560 ms
leaselock_iam@leaselock_prod=> explain analyze select jsonb_agg(l) from (
select l from llcore.lease l limit 10000) q;
                                                          QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=405.52..405.53 rows=1 width=32) (actual
time=146.139..146.141 rows=1 loops=1)
   ->  Limit  (cost=0.00..380.52 rows=10000 width=247) (actual
time=0.017..20.837 rows=10000 loops=1)
         ->  Seq Scan on lease l  (cost=0.00..100383.89 rows=2638089
width=247) (actual time=0.016..19.975 rows=10000 loops=1)
 Planning Time: 0.108 ms
 Execution Time: 152.277 ms


^ permalink  raw  reply  [nested|flat] 2+ messages in thread

* Re: Performance of JSON type in postgres
  2025-07-15 17:32 Re: Performance of JSON type in postgres Merlin Moncure <[email protected]>
@ 2025-07-19 20:39 ` veem v <[email protected]>
  0 siblings, 0 replies; 2+ messages in thread

From: veem v @ 2025-07-19 20:39 UTC (permalink / raw)
  To: Merlin Moncure <[email protected]>; Adrian Klaver <[email protected]>; David G. Johnston <[email protected]>; +Cc: pgsql-general <[email protected]>

On Tue, 15 Jul 2025 at 23:02, Merlin Moncure <[email protected]> wrote:

> On Mon, Jul 14, 2025 at 2:01 PM David G. Johnston <
> [email protected]> wrote:
>
>> On Mon, Jul 14, 2025 at 12:54 PM Adrian Klaver <[email protected]>
>> wrote:
>>
>>> On 7/14/25 12:51, veem v wrote:
>>> > So I want to
>>> > understand the experts' opinion on this which I believe will  be
>>> > crucial during design itself.
>>>
>>> It is spelled out here:
>>>
>>> https://www.postgresql.org/docs/current/datatype-json.html
>>>
>>>
>> I've taken to heart the main takeaway from that page:
>>
>> "In general, most applications should prefer to store JSON data as jsonb,
>> unless there are quite specialized needs, such as legacy assumptions about
>> ordering of object keys."
>>
>
> I don't think the documentation is accurate at all, unless one of those
> specialized needs is to 'be faster'.   json serialization is more than 2x
> faster based on simple testing (see below).   This is absolutely not a
> trivial difference.
>
> I would say, use json for serialization, use jsonb for data storage,
> unless the precise structure of the input document is important.
>
> merlin
>
> leaselock_iam@leaselock_prod=> explain analyze select json_agg(l) from (
> select l from llcore.lease l limit 10000) q;
>                                                           QUERY PLAN
>
> ------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=405.52..405.53 rows=1 width=32) (actual
> time=69.043..69.048 rows=1 loops=1)
>    ->  Limit  (cost=0.00..380.52 rows=10000 width=247) (actual
> time=0.017..9.764 rows=10000 loops=1)
>          ->  Seq Scan on lease l  (cost=0.00..100383.89 rows=2638089
> width=247) (actual time=0.016..8.831 rows=10000 loops=1)
>  Planning Time: 0.109 ms
>  Execution Time: 69.088 ms
> (5 rows)
>
> Time: 160.560 ms
> leaselock_iam@leaselock_prod=> explain analyze select jsonb_agg(l) from (
> select l from llcore.lease l limit 10000) q;
>                                                           QUERY PLAN
>
> -------------------------------------------------------------------------------------------------------------------------------
>  Aggregate  (cost=405.52..405.53 rows=1 width=32) (actual
> time=146.139..146.141 rows=1 loops=1)
>    ->  Limit  (cost=0.00..380.52 rows=10000 width=247) (actual
> time=0.017..20.837 rows=10000 loops=1)
>          ->  Seq Scan on lease l  (cost=0.00..100383.89 rows=2638089
> width=247) (actual time=0.016..19.975 rows=10000 loops=1)
>  Planning Time: 0.108 ms
>  Execution Time: 152.277 ms
>
>


Thank you.

I tested below for sample data. I see loading or serialization seems a lot
slower(twice as slower) in JSONB as compared to JSON. Whereas storage looks
efficient in JSONB. and reading performance of nested fields are 7-8 times
slower in JSON as compared to JSONB(and ofcourse index support makes it a
better choice here). Hope i am testing it correctly here.

https://dbfiddle.uk/6P7sjL22

So I am a bit confused here . Also one of our use case is, along with
persisting this data and querying it in postgres database, We are also
going to move this data from postgres (which is a upstream OLTP system) to
a downstream OLAP system ,which is in Snowflake database which is having
data types like Variant or Varchar types. So, will it create a significant
difference if we store it in JSON vs JSONB in our postgres i.e the
source/upstream database?


^ permalink  raw  reply  [nested|flat] 2+ messages in thread


end of thread, other threads:[~2025-07-19 20:39 UTC | newest]

Thread overview: 2+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-07-15 17:32 Re: Performance of JSON type in postgres Merlin Moncure <[email protected]>
2025-07-19 20:39 ` veem v <[email protected]>

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox