public inbox for [email protected]
help / color / mirror / Atom feedAmbiguity in IS JSON description and logic
8+ messages / 7 participants
[nested] [flat]
* Ambiguity in IS JSON description and logic
@ 2025-03-19 12:14 PG Doc comments form <[email protected]>
0 siblings, 1 reply; 8+ messages in thread
From: PG Doc comments form @ 2025-03-19 12:14 UTC (permalink / raw)
To: [email protected]; +Cc: [email protected]
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/17/functions-json.html
Description:
On the manual page
https://www.postgresql.org/docs/current/functions-json.html, in the Table
9.48. "SQL/JSON Testing Functions" there is a description of IS JSON. It
includes the next sentence: "If WITH UNIQUE KEYS is specified, then any
object in the expression is also tested to see if it has duplicate keys."
And such text is ambiguous, because the term "object" has certain meaning
regarding json format. In reality the option WITH UNIQUE KEYS allows to
check for duplicated keys any array element not object. For objects, both
WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON
ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true (it
is at the same time with and without unique values, how it is possible?),
i.e. it works the same as just IS JSON ARRAY. The example code that confirms
my reasoning:
SELECT
js.vl AS "tested str",
js.vl IS JSON OBJECT WITH UNIQUE KEYS AS ".. object w. UQ keys",
js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS ".. object w/o UQ keys",
js.vl IS JSON ARRAY WITH UNIQUE KEYS AS ".. array w. UQ keys",
js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS ".. array w/o UQ keys",
js.vl IS JSON ARRAY ".. array"
FROM (VALUES ('{{"a": "a1"}, {"a": "a2"}}'), ('[{"a": "a1"}, {"a": "a2"}]'),
('["a", "a"]')) AS js(vl);
I'm not sure what should be the right logic for this option, for me it looks
now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but if
we use an option it should be either true for WITH UNIQUE KEYS or WITHOUT
UNIQUE KEYS but not for both at the same time. But anyway the sentence I
showed above should contain "array" instead of "object" because for objects
it returns false independently of applied option. I tested it on
"PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120, 64-bit".
^ permalink raw reply [nested|flat] 8+ messages in thread
* Re: Ambiguity in IS JSON description and logic
@ 2025-03-20 14:08 Kirk Parker <[email protected]>
parent: PG Doc comments form <[email protected]>
0 siblings, 1 reply; 8+ messages in thread
From: Kirk Parker @ 2025-03-20 14:08 UTC (permalink / raw)
To: [email protected]; [email protected]
On Thu, Mar 20, 2025 at 2:46 AM PG Doc comments form <[email protected]>
wrote:
> The following documentation comment has been logged on the website:
>
> Page: https://www.postgresql.org/docs/17/functions-json.html
> Description:
>
> On the manual page
> https://www.postgresql.org/docs/current/functions-json.html, in the Table
> 9.48. "SQL/JSON Testing Functions" there is a description of IS JSON. It
> includes the next sentence: "If WITH UNIQUE KEYS is specified, then any
> object in the expression is also tested to see if it has duplicate keys."
> And such text is ambiguous, because the term "object" has certain meaning
> regarding json format. In reality the option WITH UNIQUE KEYS allows to
> check for duplicated keys any array element not object. For objects, both
> WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON
> ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true (it
> is at the same time with and without unique values, how it is possible?),
> i.e. it works the same as just IS JSON ARRAY. The example code that
> confirms
> my reasoning:
> SELECT
> js.vl AS "tested str",
>
>
> js.vl IS JSON OBJECT WITH UNIQUE KEYS AS ".. object w. UQ keys",
>
>
> js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS ".. object w/o UQ keys",
> js.vl IS JSON ARRAY WITH UNIQUE KEYS AS ".. array w. UQ keys",
>
>
> js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS ".. array w/o UQ keys",
> js.vl IS JSON ARRAY ".. array"
> FROM (VALUES ('{{"a": "a1"}, {"a": "a2"}}'), ('[{"a": "a1"}, {"a":
> "a2"}]'),
> ('["a", "a"]')) AS js(vl);
>
> I'm not sure what should be the right logic for this option, for me it
> looks
> now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but if
> we use an option it should be either true for WITH UNIQUE KEYS or WITHOUT
> UNIQUE KEYS but not for both at the same time. But anyway the sentence I
> showed above should contain "array" instead of "object" because for objects
> it returns false independently of applied option. I tested it on
> "PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120, 64-bit".
>
First, WITHOUT UNIQUE KEYS does not mean "confirm that there are duplicate
keys", it's just a way of stating the default explicitly. In other words it
means "w/o testing for duplicate keys". Thus IS JSON OBJECT and IS JSON
OBJECT WITHOUT UNIQUE KEYS will both always return identical results on the
same JSON expression.
Secondly, the UNIQUE test is recursive; for objects maybe the meaning is
intuitive, but for JSON arrays -- which don't have any concept of keys;
JSON arrays are just ordered lists -- it means "does this array contain any
embedded objects with duplicate keys".
See:
SELECT js,
js IS JSON "json?",
js IS JSON OBJECT "object?",
js IS JSON OBJECT WITH UNIQUE KEYS "object w. UK?",
js IS JSON OBJECT WITHOUT UNIQUE KEYS "object w/o UK?",
js IS JSON ARRAY "array?",
js IS JSON ARRAY WITH UNIQUE KEYS "array w. UK?",
js IS JSON ARRAY WITHOUT UNIQUE KEYS "array w/o UK?"
FROM (VALUES
('[{"a":1},{"b":2,"b":3}]'), -- expect t for array, array w/o UK
('[{"a":1},{"b":2,"c":3}]'), -- expect t for ALL array tests
('{"b":2,"b":3}'), -- expect t for object, object w/o UK
('{"c":2,"d":3}'), -- expect t for ALL object tests
('{"c":2,"d":{ "e": 0, "e": 1}}'), -- WITH UNIQUE is recursive for nested
objects
('{"c":2,"d":{ "e": 0, "f": {"g":1,"g":2}}}'), -- no matter how deep
('[{"a":1},{"b":2,"c":{"d":1, "d":2}}]') -- and also tests arrays
recursively for embedded objecs
) foo(js);
A couple of side notes:
1. Your first data example is not JSON at all. It's helpful for this kind
of test to include a plain IS JSON column, since any of the IS JSON X tests
can fail for two reasons: (a) it's not JSON, or (b) it is JSON but it's not
an X.
2. Curiously, the JSON spec itself is completely silent on the meaning of
objects with duplicate keys. PostgreSQL is more helpful in this
regard--the docs explicitly state that the last value is the one that is
retained by JSONB and used in processing functions.
^ permalink raw reply [nested|flat] 8+ messages in thread
* Re: Ambiguity in IS JSON description and logic
@ 2025-03-20 14:22 Kirk Parker <[email protected]>
parent: Kirk Parker <[email protected]>
0 siblings, 2 replies; 8+ messages in thread
From: Kirk Parker @ 2025-03-20 14:22 UTC (permalink / raw)
To: [email protected]; [email protected]
On Thu, Mar 20, 2025 at 7:08 AM Kirk Parker <[email protected]> wrote:
>
> On Thu, Mar 20, 2025 at 2:46 AM PG Doc comments form <
> [email protected]> wrote:
>
>> The following documentation comment has been logged on the website:
>>
>> Page: https://www.postgresql.org/docs/17/functions-json.html
>> Description:
>>
>> On the manual page
>> https://www.postgresql.org/docs/current/functions-json.html, in the Table
>> 9.48. "SQL/JSON Testing Functions" there is a description of IS JSON. It
>> includes the next sentence: "If WITH UNIQUE KEYS is specified, then any
>> object in the expression is also tested to see if it has duplicate keys."
>> And such text is ambiguous, because the term "object" has certain meaning
>> regarding json format. In reality the option WITH UNIQUE KEYS allows to
>> check for duplicated keys any array element not object. For objects, both
>> WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON
>> ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true (it
>> is at the same time with and without unique values, how it is possible?),
>> i.e. it works the same as just IS JSON ARRAY. The example code that
>> confirms
>> my reasoning:
>> SELECT
>> js.vl AS "tested str",
>>
>>
>> js.vl IS JSON OBJECT WITH UNIQUE KEYS AS ".. object w. UQ
>> keys",
>>
>> js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS ".. object w/o UQ keys",
>> js.vl IS JSON ARRAY WITH UNIQUE KEYS AS ".. array w. UQ keys",
>>
>>
>> js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS ".. array w/o UQ keys",
>> js.vl IS JSON ARRAY ".. array"
>> FROM (VALUES ('{{"a": "a1"}, {"a": "a2"}}'), ('[{"a": "a1"}, {"a":
>> "a2"}]'),
>> ('["a", "a"]')) AS js(vl);
>>
>> I'm not sure what should be the right logic for this option, for me it
>> looks
>> now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but if
>> we use an option it should be either true for WITH UNIQUE KEYS or WITHOUT
>> UNIQUE KEYS but not for both at the same time. But anyway the sentence I
>> showed above should contain "array" instead of "object" because for
>> objects
>> it returns false independently of applied option. I tested it on
>> "PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120, 64-bit".
>>
>
> First, WITHOUT UNIQUE KEYS does not mean "confirm that there are duplicate
> keys", it's just a way of stating the default explicitly. In other words it
> means "w/o testing for duplicate keys". Thus IS JSON OBJECT and IS JSON
> OBJECT WITHOUT UNIQUE KEYS will both always return identical results on the
> same JSON expression.
>
> Secondly, the UNIQUE test is recursive; for objects maybe the meaning is
> intuitive, but for JSON arrays -- which don't have any concept of keys;
> JSON arrays are just ordered lists -- it means "does this array contain any
> embedded objects with duplicate keys".
>
> See:
>
> SELECT js,
> js IS JSON "json?",
> js IS JSON OBJECT "object?",
> js IS JSON OBJECT WITH UNIQUE KEYS "object w. UK?",
> js IS JSON OBJECT WITHOUT UNIQUE KEYS "object w/o UK?",
> js IS JSON ARRAY "array?",
> js IS JSON ARRAY WITH UNIQUE KEYS "array w. UK?",
> js IS JSON ARRAY WITHOUT UNIQUE KEYS "array w/o UK?"
> FROM (VALUES
> ('[{"a":1},{"b":2,"b":3}]'), -- expect t for array, array w/o UK
> ('[{"a":1},{"b":2,"c":3}]'), -- expect t for ALL array tests
> ('{"b":2,"b":3}'), -- expect t for object, object w/o UK
> ('{"c":2,"d":3}'), -- expect t for ALL object tests
> ('{"c":2,"d":{ "e": 0, "e": 1}}'), -- WITH UNIQUE is recursive for
> nested objects
> ('{"c":2,"d":{ "e": 0, "f": {"g":1,"g":2}}}'), -- no matter how deep
> ('[{"a":1},{"b":2,"c":{"d":1, "d":2}}]') -- and also tests arrays
> recursively for embedded objecs
> ) foo(js);
>
>
> A couple of side notes:
>
> 1. Your first data example is not JSON at all. It's helpful for this kind
> of test to include a plain IS JSON column, since any of the IS JSON X tests
> can fail for two reasons: (a) it's not JSON, or (b) it is JSON but it's not
> an X.
>
> 2. Curiously, the JSON spec itself is completely silent on the meaning of
> objects with duplicate keys. PostgreSQL is more helpful in this
> regard--the docs explicitly state that the last value is the one that is
> retained by JSONB and used in processing functions.
>
>
To improve the documentation here, I would suggest simply adding the word
"recursively" after "tested":
If WITH UNIQUE KEYS is specified, then any object in the *expression* is
also tested recursively to see if it has duplicate keys
As for clarifying -- if we need to -- that WITHOUT UNIQUE KEYS means
"without testing for uniqueness" rather than "confirming that non-unique
keys are present", I will defer to anyone who has more familiarity with the
PostgreSQL documentation style. I probed around looking at other
situations where there are phrases that make the default condition explicit
(e.g. CREATE INDEX ... NULLS DISTINCT) that didn't help me much; all my
attempts ended up too wordy for my liking.
^ permalink raw reply [nested|flat] 8+ messages in thread
* Re: Ambiguity in IS JSON description and logic
@ 2025-03-20 16:06 noname stranger <[email protected]>
parent: Kirk Parker <[email protected]>
1 sibling, 0 replies; 8+ messages in thread
From: noname stranger @ 2025-03-20 16:06 UTC (permalink / raw)
To: Kirk Parker <[email protected]>; +Cc: [email protected]
Hello Kirk,
Thank you for your answer and detailed explanation. You are absolutely
right, I made a mistake in my reasoning. On the other hand, some ambiguity
exists.
It is a perfect idea to clarify what WITHOUT UNIQUE KEYS means. Such
clarification could help with understanding the logic of IS JSON faster and
avoiding confusion and ambiguity. I found the same option for a few JSON
creation functions in table 9.47 at the same page, like json_object() and
json(). Their descriptions also don't clarify this point.
Best regards,
Valery
чт, 20 мар. 2025 г. в 15:22, Kirk Parker <[email protected]>:
> On Thu, Mar 20, 2025 at 7:08 AM Kirk Parker <[email protected]> wrote:
>
>>
>> On Thu, Mar 20, 2025 at 2:46 AM PG Doc comments form <
>> [email protected]> wrote:
>>
>>> The following documentation comment has been logged on the website:
>>>
>>> Page: https://www.postgresql.org/docs/17/functions-json.html
>>> Description:
>>>
>>> On the manual page
>>> https://www.postgresql.org/docs/current/functions-json.html, in the
>>> Table
>>> 9.48. "SQL/JSON Testing Functions" there is a description of IS JSON. It
>>> includes the next sentence: "If WITH UNIQUE KEYS is specified, then any
>>> object in the expression is also tested to see if it has duplicate keys."
>>> And such text is ambiguous, because the term "object" has certain meaning
>>> regarding json format. In reality the option WITH UNIQUE KEYS allows to
>>> check for duplicated keys any array element not object. For objects, both
>>> WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON
>>> ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true
>>> (it
>>> is at the same time with and without unique values, how it is possible?),
>>> i.e. it works the same as just IS JSON ARRAY. The example code that
>>> confirms
>>> my reasoning:
>>> SELECT
>>> js.vl AS "tested str",
>>>
>>>
>>> js.vl IS JSON OBJECT WITH UNIQUE KEYS AS ".. object w. UQ
>>> keys",
>>>
>>> js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS ".. object w/o UQ
>>> keys",
>>> js.vl IS JSON ARRAY WITH UNIQUE KEYS AS ".. array w. UQ
>>> keys",
>>>
>>> js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS ".. array w/o UQ keys",
>>> js.vl IS JSON ARRAY ".. array"
>>> FROM (VALUES ('{{"a": "a1"}, {"a": "a2"}}'), ('[{"a": "a1"}, {"a":
>>> "a2"}]'),
>>> ('["a", "a"]')) AS js(vl);
>>>
>>> I'm not sure what should be the right logic for this option, for me it
>>> looks
>>> now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but
>>> if
>>> we use an option it should be either true for WITH UNIQUE KEYS or WITHOUT
>>> UNIQUE KEYS but not for both at the same time. But anyway the sentence I
>>> showed above should contain "array" instead of "object" because for
>>> objects
>>> it returns false independently of applied option. I tested it on
>>> "PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120,
>>> 64-bit".
>>>
>>
>> First, WITHOUT UNIQUE KEYS does not mean "confirm that there are
>> duplicate keys", it's just a way of stating the default explicitly. In
>> other words it means "w/o testing for duplicate keys". Thus IS JSON OBJECT
>> and IS JSON OBJECT WITHOUT UNIQUE KEYS will both always return identical
>> results on the same JSON expression.
>>
>> Secondly, the UNIQUE test is recursive; for objects maybe the meaning is
>> intuitive, but for JSON arrays -- which don't have any concept of keys;
>> JSON arrays are just ordered lists -- it means "does this array contain any
>> embedded objects with duplicate keys".
>>
>> See:
>>
>> SELECT js,
>> js IS JSON "json?",
>> js IS JSON OBJECT "object?",
>> js IS JSON OBJECT WITH UNIQUE KEYS "object w. UK?",
>> js IS JSON OBJECT WITHOUT UNIQUE KEYS "object w/o UK?",
>> js IS JSON ARRAY "array?",
>> js IS JSON ARRAY WITH UNIQUE KEYS "array w. UK?",
>> js IS JSON ARRAY WITHOUT UNIQUE KEYS "array w/o UK?"
>> FROM (VALUES
>> ('[{"a":1},{"b":2,"b":3}]'), -- expect t for array, array w/o UK
>> ('[{"a":1},{"b":2,"c":3}]'), -- expect t for ALL array tests
>> ('{"b":2,"b":3}'), -- expect t for object, object w/o UK
>> ('{"c":2,"d":3}'), -- expect t for ALL object tests
>> ('{"c":2,"d":{ "e": 0, "e": 1}}'), -- WITH UNIQUE is recursive for
>> nested objects
>> ('{"c":2,"d":{ "e": 0, "f": {"g":1,"g":2}}}'), -- no matter how deep
>> ('[{"a":1},{"b":2,"c":{"d":1, "d":2}}]') -- and also tests arrays
>> recursively for embedded objecs
>> ) foo(js);
>>
>>
>> A couple of side notes:
>>
>> 1. Your first data example is not JSON at all. It's helpful for this
>> kind of test to include a plain IS JSON column, since any of the IS JSON X
>> tests can fail for two reasons: (a) it's not JSON, or (b) it is JSON but
>> it's not an X.
>>
>> 2. Curiously, the JSON spec itself is completely silent on the meaning of
>> objects with duplicate keys. PostgreSQL is more helpful in this
>> regard--the docs explicitly state that the last value is the one that is
>> retained by JSONB and used in processing functions.
>>
>>
> To improve the documentation here, I would suggest simply adding the word
> "recursively" after "tested":
>
> If WITH UNIQUE KEYS is specified, then any object in the *expression* is
> also tested recursively to see if it has duplicate keys
>
> As for clarifying -- if we need to -- that WITHOUT UNIQUE KEYS means
> "without testing for uniqueness" rather than "confirming that non-unique
> keys are present", I will defer to anyone who has more familiarity with the
> PostgreSQL documentation style. I probed around looking at other
> situations where there are phrases that make the default condition explicit
> (e.g. CREATE INDEX ... NULLS DISTINCT) that didn't help me much; all my
> attempts ended up too wordy for my liking.
>
--
С уважением,
Валерий
^ permalink raw reply [nested|flat] 8+ messages in thread
* Re: Ambiguity in IS JSON description and logic
@ 2025-03-21 02:04 David G. Johnston <[email protected]>
parent: Kirk Parker <[email protected]>
1 sibling, 1 reply; 8+ messages in thread
From: David G. Johnston @ 2025-03-21 02:04 UTC (permalink / raw)
To: Kirk Parker <[email protected]>; +Cc: [email protected]; [email protected]
On Thu, Mar 20, 2025 at 7:22 AM Kirk Parker <[email protected]> wrote:
> On Thu, Mar 20, 2025 at 7:08 AM Kirk Parker <[email protected]> wrote:
>
>>
>> On Thu, Mar 20, 2025 at 2:46 AM PG Doc comments form <
>> [email protected]> wrote:
>>
>>> The following documentation comment has been logged on the website:
>>>
>>> Page: https://www.postgresql.org/docs/17/functions-json.html
>>> Description:
>>>
>>> On the manual page
>>> https://www.postgresql.org/docs/current/functions-json.html, in the
>>> Table
>>> 9.48. "SQL/JSON Testing Functions" there is a description of IS JSON. It
>>> includes the next sentence: "If WITH UNIQUE KEYS is specified, then any
>>> object in the expression is also tested to see if it has duplicate keys."
>>> And such text is ambiguous, because the term "object" has certain meaning
>>> regarding json format. In reality the option WITH UNIQUE KEYS allows to
>>> check for duplicated keys any array element not object. For objects, both
>>> WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON
>>> ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true
>>> (it
>>> is at the same time with and without unique values, how it is possible?),
>>> i.e. it works the same as just IS JSON ARRAY. The example code that
>>> confirms
>>> my reasoning:
>>> SELECT
>>> js.vl AS "tested str",
>>>
>>>
>>> js.vl IS JSON OBJECT WITH UNIQUE KEYS AS ".. object w. UQ
>>> keys",
>>>
>>> js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS ".. object w/o UQ
>>> keys",
>>> js.vl IS JSON ARRAY WITH UNIQUE KEYS AS ".. array w. UQ
>>> keys",
>>>
>>> js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS ".. array w/o UQ keys",
>>> js.vl IS JSON ARRAY ".. array"
>>> FROM (VALUES ('{{"a": "a1"}, {"a": "a2"}}'), ('[{"a": "a1"}, {"a":
>>> "a2"}]'),
>>> ('["a", "a"]')) AS js(vl);
>>>
>>> I'm not sure what should be the right logic for this option, for me it
>>> looks
>>> now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but
>>> if
>>> we use an option it should be either true for WITH UNIQUE KEYS or WITHOUT
>>> UNIQUE KEYS but not for both at the same time. But anyway the sentence I
>>> showed above should contain "array" instead of "object" because for
>>> objects
>>> it returns false independently of applied option. I tested it on
>>> "PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120,
>>> 64-bit".
>>>
>>
>> First, WITHOUT UNIQUE KEYS does not mean "confirm that there are
>> duplicate keys", it's just a way of stating the default explicitly. In
>> other words it means "w/o testing for duplicate keys". Thus IS JSON OBJECT
>> and IS JSON OBJECT WITHOUT UNIQUE KEYS will both always return identical
>> results on the same JSON expression.
>>
>> Secondly, the UNIQUE test is recursive; for objects maybe the meaning is
>> intuitive, but for JSON arrays -- which don't have any concept of keys;
>> JSON arrays are just ordered lists -- it means "does this array contain any
>> embedded objects with duplicate keys".
>>
>> See:
>>
>> SELECT js,
>> js IS JSON "json?",
>> js IS JSON OBJECT "object?",
>> js IS JSON OBJECT WITH UNIQUE KEYS "object w. UK?",
>> js IS JSON OBJECT WITHOUT UNIQUE KEYS "object w/o UK?",
>> js IS JSON ARRAY "array?",
>> js IS JSON ARRAY WITH UNIQUE KEYS "array w. UK?",
>> js IS JSON ARRAY WITHOUT UNIQUE KEYS "array w/o UK?"
>> FROM (VALUES
>> ('[{"a":1},{"b":2,"b":3}]'), -- expect t for array, array w/o UK
>> ('[{"a":1},{"b":2,"c":3}]'), -- expect t for ALL array tests
>> ('{"b":2,"b":3}'), -- expect t for object, object w/o UK
>> ('{"c":2,"d":3}'), -- expect t for ALL object tests
>> ('{"c":2,"d":{ "e": 0, "e": 1}}'), -- WITH UNIQUE is recursive for
>> nested objects
>> ('{"c":2,"d":{ "e": 0, "f": {"g":1,"g":2}}}'), -- no matter how deep
>> ('[{"a":1},{"b":2,"c":{"d":1, "d":2}}]') -- and also tests arrays
>> recursively for embedded objecs
>> ) foo(js);
>>
>>
>> A couple of side notes:
>>
>> 1. Your first data example is not JSON at all. It's helpful for this
>> kind of test to include a plain IS JSON column, since any of the IS JSON X
>> tests can fail for two reasons: (a) it's not JSON, or (b) it is JSON but
>> it's not an X.
>>
>> 2. Curiously, the JSON spec itself is completely silent on the meaning of
>> objects with duplicate keys. PostgreSQL is more helpful in this
>> regard--the docs explicitly state that the last value is the one that is
>> retained by JSONB and used in processing functions.
>>
>>
> To improve the documentation here, I would suggest simply adding the word
> "recursively" after "tested":
>
> If WITH UNIQUE KEYS is specified, then any object in the *expression* is
> also tested recursively to see if it has duplicate keys
>
>
I think the existing word "any" sufficiently implies "recursively". It
also doesn't really address the complaint here. I'm thinking something
more like:
(this is changed intentionally, see below)
expression IS [ NOT ] JSON [ { SCALAR | ARRAY | OBJECT } ] [ WITH UNIQUE ]
-> boolean
This predicate tests whether expression can be parsed as JSON. Two
additional properties can be tested at the same time: the type of the JSON
value, and whether it passes the unique object keys constraint. Enable the
first test by specifying one of SCALAR, ARRAY, or OBJECT. Enable the
second test by specifying WITH UNIQUE: This test is applied to all objects
contained within the JSON value. The return value is true only if
expression can be parsed as JSON and all enabled tests pass. The return
value is inverted if NOT is specified.
The test label "array w/o UK?" has to go. Coupling with the "additional
tests" idea introduced above, and the recommended syntax, we should do
something like:
SELECT js,
js IS JSON "parses ok, no tests",
js IS JSON OBJECT "object test only",
js IS JSON ARRAY "array test only",
js IS JSON WITH UNIQUE "unique test only",
js IS JSON ARRAY WITH UNIQUE "array and unique tests"
Then, to keep the technical reference thorough, re-add the full syntax at
the end.
This is the full syntax accepted for this predicate. Both VALUE and
WITHOUT, the default explicit keywords to disable the two additional tests,
as well as KEYS, are omitted above for clarity.
expression IS [ NOT ] JSON [ { VALUE | SCALAR | ARRAY | OBJECT } ] [ { WITH
| WITHOUT } UNIQUE [ KEYS ] ]
David J.
^ permalink raw reply [nested|flat] 8+ messages in thread
* Re: Ambiguity in IS JSON description and logic
@ 2025-11-18 11:55 r314tive <[email protected]>
parent: David G. Johnston <[email protected]>
0 siblings, 2 replies; 8+ messages in thread
From: r314tive @ 2025-11-18 11:55 UTC (permalink / raw)
To: David G. Johnston <[email protected]>; +Cc: Kirk Parker <[email protected]>; [email protected]; [email protected]
Hi,
Based on this discussion, I’ve prepared a small documentation patch that
tries to clarify the behavior of the IS JSON ... UNIQUE KEYS clause.
The patch explains that:
- the WITH/WITHOUT UNIQUE KEYS clause controls an additional test on
duplicate object keys,
- WITH UNIQUE KEYS requires that no object contained in the expression
(recursively) have duplicate keys, and
- WITHOUT UNIQUE KEYS, which is also the default, just disables this
additional test and does not require duplicates to be present.
Patch is attached.
Regards,
Ilmar Yunusov
вт, 18 нояб. 2025 г. в 16:50, David G. Johnston <[email protected]
>:
> On Thu, Mar 20, 2025 at 7:22 AM Kirk Parker <[email protected]> wrote:
>
>> On Thu, Mar 20, 2025 at 7:08 AM Kirk Parker <[email protected]> wrote:
>>
>>>
>>> On Thu, Mar 20, 2025 at 2:46 AM PG Doc comments form <
>>> [email protected]> wrote:
>>>
>>>> The following documentation comment has been logged on the website:
>>>>
>>>> Page: https://www.postgresql.org/docs/17/functions-json.html
>>>> Description:
>>>>
>>>> On the manual page
>>>> https://www.postgresql.org/docs/current/functions-json.html, in the
>>>> Table
>>>> 9.48. "SQL/JSON Testing Functions" there is a description of IS JSON. It
>>>> includes the next sentence: "If WITH UNIQUE KEYS is specified, then any
>>>> object in the expression is also tested to see if it has duplicate
>>>> keys."
>>>> And such text is ambiguous, because the term "object" has certain
>>>> meaning
>>>> regarding json format. In reality the option WITH UNIQUE KEYS allows to
>>>> check for duplicated keys any array element not object. For objects,
>>>> both
>>>> WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON
>>>> ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true
>>>> (it
>>>> is at the same time with and without unique values, how it is
>>>> possible?),
>>>> i.e. it works the same as just IS JSON ARRAY. The example code that
>>>> confirms
>>>> my reasoning:
>>>> SELECT
>>>> js.vl AS "tested str",
>>>>
>>>>
>>>> js.vl IS JSON OBJECT WITH UNIQUE KEYS AS ".. object w. UQ
>>>> keys",
>>>>
>>>> js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS ".. object w/o UQ
>>>> keys",
>>>> js.vl IS JSON ARRAY WITH UNIQUE KEYS AS ".. array w. UQ
>>>> keys",
>>>>
>>>> js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS ".. array w/o UQ
>>>> keys",
>>>> js.vl IS JSON ARRAY ".. array"
>>>> FROM (VALUES ('{{"a": "a1"}, {"a": "a2"}}'), ('[{"a": "a1"}, {"a":
>>>> "a2"}]'),
>>>> ('["a", "a"]')) AS js(vl);
>>>>
>>>> I'm not sure what should be the right logic for this option, for me it
>>>> looks
>>>> now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but
>>>> if
>>>> we use an option it should be either true for WITH UNIQUE KEYS or
>>>> WITHOUT
>>>> UNIQUE KEYS but not for both at the same time. But anyway the sentence I
>>>> showed above should contain "array" instead of "object" because for
>>>> objects
>>>> it returns false independently of applied option. I tested it on
>>>> "PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120,
>>>> 64-bit".
>>>>
>>>
>>> First, WITHOUT UNIQUE KEYS does not mean "confirm that there are
>>> duplicate keys", it's just a way of stating the default explicitly. In
>>> other words it means "w/o testing for duplicate keys". Thus IS JSON OBJECT
>>> and IS JSON OBJECT WITHOUT UNIQUE KEYS will both always return identical
>>> results on the same JSON expression.
>>>
>>> Secondly, the UNIQUE test is recursive; for objects maybe the meaning is
>>> intuitive, but for JSON arrays -- which don't have any concept of keys;
>>> JSON arrays are just ordered lists -- it means "does this array contain any
>>> embedded objects with duplicate keys".
>>>
>>> See:
>>>
>>> SELECT js,
>>> js IS JSON "json?",
>>> js IS JSON OBJECT "object?",
>>> js IS JSON OBJECT WITH UNIQUE KEYS "object w. UK?",
>>> js IS JSON OBJECT WITHOUT UNIQUE KEYS "object w/o UK?",
>>> js IS JSON ARRAY "array?",
>>> js IS JSON ARRAY WITH UNIQUE KEYS "array w. UK?",
>>> js IS JSON ARRAY WITHOUT UNIQUE KEYS "array w/o UK?"
>>> FROM (VALUES
>>> ('[{"a":1},{"b":2,"b":3}]'), -- expect t for array, array w/o UK
>>> ('[{"a":1},{"b":2,"c":3}]'), -- expect t for ALL array tests
>>> ('{"b":2,"b":3}'), -- expect t for object, object w/o
>>> UK
>>> ('{"c":2,"d":3}'), -- expect t for ALL object tests
>>> ('{"c":2,"d":{ "e": 0, "e": 1}}'), -- WITH UNIQUE is recursive for
>>> nested objects
>>> ('{"c":2,"d":{ "e": 0, "f": {"g":1,"g":2}}}'), -- no matter how deep
>>> ('[{"a":1},{"b":2,"c":{"d":1, "d":2}}]') -- and also tests arrays
>>> recursively for embedded objecs
>>> ) foo(js);
>>>
>>>
>>> A couple of side notes:
>>>
>>> 1. Your first data example is not JSON at all. It's helpful for this
>>> kind of test to include a plain IS JSON column, since any of the IS JSON X
>>> tests can fail for two reasons: (a) it's not JSON, or (b) it is JSON but
>>> it's not an X.
>>>
>>> 2. Curiously, the JSON spec itself is completely silent on the meaning
>>> of objects with duplicate keys. PostgreSQL is more helpful in this
>>> regard--the docs explicitly state that the last value is the one that is
>>> retained by JSONB and used in processing functions.
>>>
>>>
>> To improve the documentation here, I would suggest simply adding the word
>> "recursively" after "tested":
>>
>> If WITH UNIQUE KEYS is specified, then any object in the *expression* is
>> also tested recursively to see if it has duplicate keys
>>
>>
> I think the existing word "any" sufficiently implies "recursively". It
> also doesn't really address the complaint here. I'm thinking something
> more like:
>
> (this is changed intentionally, see below)
>
> expression IS [ NOT ] JSON [ { SCALAR | ARRAY | OBJECT } ] [ WITH UNIQUE ]
> -> boolean
>
> This predicate tests whether expression can be parsed as JSON. Two
> additional properties can be tested at the same time: the type of the JSON
> value, and whether it passes the unique object keys constraint. Enable the
> first test by specifying one of SCALAR, ARRAY, or OBJECT. Enable the
> second test by specifying WITH UNIQUE: This test is applied to all objects
> contained within the JSON value. The return value is true only if
> expression can be parsed as JSON and all enabled tests pass. The return
> value is inverted if NOT is specified.
>
> The test label "array w/o UK?" has to go. Coupling with the "additional
> tests" idea introduced above, and the recommended syntax, we should do
> something like:
>
> SELECT js,
> js IS JSON "parses ok, no tests",
> js IS JSON OBJECT "object test only",
> js IS JSON ARRAY "array test only",
> js IS JSON WITH UNIQUE "unique test only",
> js IS JSON ARRAY WITH UNIQUE "array and unique tests"
>
>
> Then, to keep the technical reference thorough, re-add the full syntax at
> the end.
>
>
> This is the full syntax accepted for this predicate. Both VALUE and
> WITHOUT, the default explicit keywords to disable the two additional tests,
> as well as KEYS, are omitted above for clarity.
> expression IS [ NOT ] JSON [ { VALUE | SCALAR | ARRAY | OBJECT } ] [ {
> WITH | WITHOUT } UNIQUE [ KEYS ] ]
>
> David J.
>
>
Attachments:
[application/octet-stream] 0001-doc-Clarify-IS-JSON-UNIQUE-KEYS-behavior.patch (1.6K, 3-0001-doc-Clarify-IS-JSON-UNIQUE-KEYS-behavior.patch)
download | inline diff:
From ce0689941679c55bb22c4a9206fb792fa2dd909e Mon Sep 17 00:00:00 2001
From: ilmar yunusov <[email protected]>
Date: Tue, 18 Nov 2025 16:30:22 +0500
Subject: [PATCH] doc: Clarify IS JSON UNIQUE KEYS behavior
---
doc/src/sgml/func/func-json.sgml | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)
diff --git a/doc/src/sgml/func/func-json.sgml b/doc/src/sgml/func/func-json.sgml
index 1ec73cff464..ccb23dcdcb2 100644
--- a/doc/src/sgml/func/func-json.sgml
+++ b/doc/src/sgml/func/func-json.sgml
@@ -893,10 +893,15 @@
parsed as JSON, possibly of a specified type.
If <literal>SCALAR</literal> or <literal>ARRAY</literal> or
<literal>OBJECT</literal> is specified, the
- test is whether or not the JSON is of that particular type. If
- <literal>WITH UNIQUE KEYS</literal> is specified, then any object in the
- <replaceable>expression</replaceable> is also tested to see if it
- has duplicate keys.
+ test is whether or not the JSON is of that particular type.
+ The optional <literal>WITH</literal> or <literal>WITHOUT</literal>
+ <literal>UNIQUE KEYS</literal> clause controls an additional test
+ on duplicate object keys. Specifying <literal>WITH UNIQUE KEYS</literal>
+ requires that no object contained in the
+ <replaceable>expression</replaceable> (recursively) have duplicate keys.
+ Specifying <literal>WITHOUT UNIQUE KEYS</literal>, which is also the
+ default, disables this additional test; it does not require duplicate
+ keys to be present.
</para>
<para>
<programlisting>
--
2.52.0
^ permalink raw reply [nested|flat] 8+ messages in thread
* Re: Ambiguity in IS JSON description and logic
@ 2025-11-18 11:56 Ilmar Y <[email protected]>
parent: r314tive <[email protected]>
1 sibling, 0 replies; 8+ messages in thread
From: Ilmar Y @ 2025-11-18 11:56 UTC (permalink / raw)
To: [email protected]; +Cc: David Johnston <[email protected]>
I’ve sent a docs patch to pgsql-docs in the
“Ambiguity in IS JSON description and logic” thread.
The patch clarifies that the WITH/WITHOUT UNIQUE KEYS clause for IS JSON
controls an additional test for duplicate object keys, that WITH UNIQUE KEYS
enforces this constraint recursively, and that WITHOUT UNIQUE KEYS (the default)
disables this additional test rather than requiring duplicates to be present.
^ permalink raw reply [nested|flat] 8+ messages in thread
* Re: Ambiguity in IS JSON description and logic
@ 2025-11-18 16:37 Laurenz Albe <[email protected]>
parent: r314tive <[email protected]>
1 sibling, 0 replies; 8+ messages in thread
From: Laurenz Albe @ 2025-11-18 16:37 UTC (permalink / raw)
To: r314tive <[email protected]>; David G. Johnston <[email protected]>; +Cc: Kirk Parker <[email protected]>; [email protected]; [email protected]
On Tue, 2025-11-18 at 16:55 +0500, r314tive wrote:
> Based on this discussion, I’ve prepared a small documentation patch that
> tries to clarify the behavior of the IS JSON ... UNIQUE KEYS clause.
>
> The patch explains that:
> - the WITH/WITHOUT UNIQUE KEYS clause controls an additional test on
> duplicate object keys,
> - WITH UNIQUE KEYS requires that no object contained in the expression
> (recursively) have duplicate keys, and
> - WITHOUT UNIQUE KEYS, which is also the default, just disables this
> additional test and does not require duplicates to be present.
>
> Patch is attached.
I don't think that the patch is really an improvement. To me, the original
wording contains all the important information. The patch it somewhat
repetetive and contains unnecessary detail. For example, I can see from
the syntax diagram that I can use WITHOUT UNIQUE KEYS, and I have no
trouble guessing what that means.
Perhaps it would be enough to add something like "*recursively* checks for
duplicate keys" or "checks for duplicate keys *on any nesting level*".
Yours,
Laurenz Albe
^ permalink raw reply [nested|flat] 8+ messages in thread
end of thread, other threads:[~2025-11-18 16:37 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2025-03-19 12:14 Ambiguity in IS JSON description and logic PG Doc comments form <[email protected]>
2025-03-20 14:08 ` Kirk Parker <[email protected]>
2025-03-20 14:22 ` Kirk Parker <[email protected]>
2025-03-20 16:06 ` noname stranger <[email protected]>
2025-03-21 02:04 ` David G. Johnston <[email protected]>
2025-11-18 11:55 ` r314tive <[email protected]>
2025-11-18 11:56 ` Ilmar Y <[email protected]>
2025-11-18 16:37 ` Laurenz Albe <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox