public inbox for [email protected]
help / color / mirror / Atom feedFrom: Kirk Parker <[email protected]>
To: [email protected]
To: [email protected]
Subject: Re: Ambiguity in IS JSON description and logic
Date: Thu, 20 Mar 2025 07:08:50 -0700
Message-ID: <CANwZ8r=qtW3EfrzMrEB=xRoxePmx=JLUC8MvEK68kRCrfrQ4_g@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
On Thu, Mar 20, 2025 at 2:46 AM PG Doc comments form <[email protected]>
wrote:
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/17/functions-json.html
> Description:
>
> On the manual page
> https://www.postgresql.org/docs/current/functions-json.html, in the Table
> 9.48. "SQL/JSON Testing Functions" there is a description of IS JSON. It
> includes the next sentence: "If WITH UNIQUE KEYS is specified, then any
> object in the expression is also tested to see if it has duplicate keys."
> And such text is ambiguous, because the term "object" has certain meaning
> regarding json format. In reality the option WITH UNIQUE KEYS allows to
> check for duplicated keys any array element not object. For objects, both
> WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON
> ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true (it
> is at the same time with and without unique values, how it is possible?),
> i.e. it works the same as just IS JSON ARRAY. The example code that
> confirms
> my reasoning:
> SELECT
> js.vl AS "tested str",
>
>
> js.vl IS JSON OBJECT WITH UNIQUE KEYS AS ".. object w. UQ keys",
>
>
> js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS ".. object w/o UQ keys",
> js.vl IS JSON ARRAY WITH UNIQUE KEYS AS ".. array w. UQ keys",
>
>
> js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS ".. array w/o UQ keys",
> js.vl IS JSON ARRAY ".. array"
> FROM (VALUES ('{{"a": "a1"}, {"a": "a2"}}'), ('[{"a": "a1"}, {"a":
> "a2"}]'),
> ('["a", "a"]')) AS js(vl);
>
> I'm not sure what should be the right logic for this option, for me it
> looks
> now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but if
> we use an option it should be either true for WITH UNIQUE KEYS or WITHOUT
> UNIQUE KEYS but not for both at the same time. But anyway the sentence I
> showed above should contain "array" instead of "object" because for objects
> it returns false independently of applied option. I tested it on
> "PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120, 64-bit".
>
First, WITHOUT UNIQUE KEYS does not mean "confirm that there are duplicate
keys", it's just a way of stating the default explicitly. In other words it
means "w/o testing for duplicate keys". Thus IS JSON OBJECT and IS JSON
OBJECT WITHOUT UNIQUE KEYS will both always return identical results on the
same JSON expression.
Secondly, the UNIQUE test is recursive; for objects maybe the meaning is
intuitive, but for JSON arrays -- which don't have any concept of keys;
JSON arrays are just ordered lists -- it means "does this array contain any
embedded objects with duplicate keys".
See:
SELECT js,
js IS JSON "json?",
js IS JSON OBJECT "object?",
js IS JSON OBJECT WITH UNIQUE KEYS "object w. UK?",
js IS JSON OBJECT WITHOUT UNIQUE KEYS "object w/o UK?",
js IS JSON ARRAY "array?",
js IS JSON ARRAY WITH UNIQUE KEYS "array w. UK?",
js IS JSON ARRAY WITHOUT UNIQUE KEYS "array w/o UK?"
FROM (VALUES
('[{"a":1},{"b":2,"b":3}]'), -- expect t for array, array w/o UK
('[{"a":1},{"b":2,"c":3}]'), -- expect t for ALL array tests
('{"b":2,"b":3}'), -- expect t for object, object w/o UK
('{"c":2,"d":3}'), -- expect t for ALL object tests
('{"c":2,"d":{ "e": 0, "e": 1}}'), -- WITH UNIQUE is recursive for nested
objects
('{"c":2,"d":{ "e": 0, "f": {"g":1,"g":2}}}'), -- no matter how deep
('[{"a":1},{"b":2,"c":{"d":1, "d":2}}]') -- and also tests arrays
recursively for embedded objecs
) foo(js);
A couple of side notes:
1. Your first data example is not JSON at all. It's helpful for this kind
of test to include a plain IS JSON column, since any of the IS JSON X tests
can fail for two reasons: (a) it's not JSON, or (b) it is JSON but it's not
an X.
2. Curiously, the JSON spec itself is completely silent on the meaning of
objects with duplicate keys. PostgreSQL is more helpful in this
regard--the docs explicitly state that the last value is the one that is
retained by JSONB and used in processing functions.
reply
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Reply to all the recipients using the --to and --cc options:
reply via email
To: [email protected]
Cc: [email protected], [email protected], [email protected]
Subject: Re: Ambiguity in IS JSON description and logic
In-Reply-To: <CANwZ8r=qtW3EfrzMrEB=xRoxePmx=JLUC8MvEK68kRCrfrQ4_g@mail.gmail.com>
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox