Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tvIQK-003CiA-VD for pgsql-docs@arkaria.postgresql.org; Thu, 20 Mar 2025 16:07:13 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1tvIQJ-0045mS-OI for pgsql-docs@arkaria.postgresql.org; Thu, 20 Mar 2025 16:07:11 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1tvIQJ-0045lM-Ba for pgsql-docs@lists.postgresql.org; Thu, 20 Mar 2025 16:07:11 +0000 Received: from mail-ej1-x631.google.com ([2a00:1450:4864:20::631]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.96) (envelope-from ) id 1tvIQG-000AIj-29 for pgsql-docs@lists.postgresql.org; Thu, 20 Mar 2025 16:07:11 +0000 Received: by mail-ej1-x631.google.com with SMTP id a640c23a62f3a-aaf0f1adef8so180082266b.3 for ; Thu, 20 Mar 2025 09:07:08 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20230601; t=1742486828; x=1743091628; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=5a0bt2OAHWDPkNbbFHDS3TY3h8gSgvTaoPIGf0AdIWA=; b=HJTUhk410HkhtmiizVsVrVLcYliALw9WWQ+k3lMbSN/uLZuqEwELOEbMY5I+loewyn XM3qAZmn2G0HGyEV5rvqQhOYwi4I3LmpiUH4YRMUny1+inV4I97M5R0f36/02d/5ypyb HSwa0+3i/AZV+z8jzS4Njc9ymmvOWgUAzLcAGMhn7IuVlIWR2yHgrFdGybTVlfdV4qtm HZuQRvwWsf/NqfT22ZKlCpqwXqNsSYTn1qoUYlcUdbd6hUzSmhg0joOIFRHMYWyMXVJQ 8J0rkpIYP4wZTfZavvJByrDbcTHDW8b+L50aj7WNzTaUySe0FCBMHnYDqKcqT887wtFL pPww== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1742486828; x=1743091628; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=5a0bt2OAHWDPkNbbFHDS3TY3h8gSgvTaoPIGf0AdIWA=; b=NKYTW2VUBEm/mz1XLzSRbfysQtYW70K15LMbB74R0MtGzX52Id55WQsS4oJ33zOVtl BpyiDaVpxK3C8SDLS6DHP3K8IjG6cLJyZyIH6nzx7A+C50EMRrXNpoR1YXK2SFoPpax7 l1FsD1nIOI9SO7PxJlZs1VfMrz+9ycrPnT0VNGS2tz9LBevwnMh6ccbY+awRMoGWjHPa 7gnTPebatjOaKeT7w7gerFzgMTPIP949jdu5Ej/fo13AgAh33eA6qms6hmDcapaFh1W9 PbwbAvpbyb90CBDgEZg5Wr3M6Eysc7B/1z9RQGEv/hrHr4IRzGIIoSe4tO7tW0K/S7ug MXjg== X-Gm-Message-State: AOJu0YxsMwkptkiNvvQg9VWSwRbO4WVZzV8sL4CqXDlJZBMA/pblyUdc cCagspSflxZvkRE+M+T+Lu3fzJDQYNR3O2QJOVaKG+hfm7iS8TjhH37HIvrk2TbCMvPFPN2RBCW WMvlfNTWI6R4JjdVndlz6iJvTtM4= X-Gm-Gg: ASbGnctx3yi6ocU1kmZ/uQJTuFcOzuklC48TjEu99KhQARJUQW7gN0illJENBdym3lV TFzkCSqZg0vzBkJ9yLQLTbIJNqfyiqA62qSuc/zuGZJW8T1QoWLqnSiy+puhWro6H26r6TSC7Wg +BicKXkJ4Xv7jgV6XgDVuXZspQ X-Google-Smtp-Source: AGHT+IHLiW4Z2+r8pGr462psHz/Zdn3n4lwEuB+7kQm4ROJtbODReyDZ6LM+BbtRSU0XmmgL6Zu6YEJVrWZKNKObyP4= X-Received: by 2002:a17:907:97d5:b0:ac2:7be7:95c5 with SMTP id a640c23a62f3a-ac3ce05ce39mr529122366b.33.1742486827231; Thu, 20 Mar 2025 09:07:07 -0700 (PDT) MIME-Version: 1.0 References: <174238647361.682.12732328104350596711@wrigleys.postgresql.org> In-Reply-To: From: noname stranger Date: Thu, 20 Mar 2025 17:06:50 +0100 X-Gm-Features: AQ5f1JpuJTaJb7dPBe4poeS-VAE7Ip5AXlYBRSb4ZWKlllG-zdfd8m-Wo52M-tg Message-ID: Subject: Re: Ambiguity in IS JSON description and logic To: Kirk Parker Cc: pgsql-docs@lists.postgresql.org Content-Type: multipart/alternative; boundary="0000000000005cf7970630c85432" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --0000000000005cf7970630c85432 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Hello Kirk, Thank you for your answer and detailed explanation. You are absolutely right, I made a mistake in my reasoning. On the other hand, some ambiguity exists. It is a perfect idea to clarify what WITHOUT UNIQUE KEYS means. Such clarification could help with understanding the logic of IS JSON faster and avoiding confusion and ambiguity. I found the same option for a few JSON creation functions in table 9.47 at the same page, like json_object() and json(). Their descriptions also don't clarify this point. Best regards, Valery =D1=87=D1=82, 20 =D0=BC=D0=B0=D1=80. 2025=E2=80=AF=D0=B3. =D0=B2 15:22, Kir= k Parker : > On Thu, Mar 20, 2025 at 7:08=E2=80=AFAM Kirk Parker wr= ote: > >> >> On Thu, Mar 20, 2025 at 2:46=E2=80=AFAM PG Doc comments form < >> noreply@postgresql.org> wrote: >> >>> The following documentation comment has been logged on the website: >>> >>> Page: https://www.postgresql.org/docs/17/functions-json.html >>> Description: >>> >>> On the manual page >>> https://www.postgresql.org/docs/current/functions-json.html, in the >>> Table >>> 9.48. "SQL/JSON Testing Functions" there is a description of IS JSON. I= t >>> includes the next sentence: "If WITH UNIQUE KEYS is specified, then any >>> object in the expression is also tested to see if it has duplicate keys= ." >>> And such text is ambiguous, because the term "object" has certain meani= ng >>> regarding json format. In reality the option WITH UNIQUE KEYS allows to >>> check for duplicated keys any array element not object. For objects, bo= th >>> WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON >>> ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true >>> (it >>> is at the same time with and without unique values, how it is possible?= ), >>> i.e. it works the same as just IS JSON ARRAY. The example code that >>> confirms >>> my reasoning: >>> SELECT >>> js.vl AS "tested str", >>> >>> >>> js.vl IS JSON OBJECT WITH UNIQUE KEYS AS ".. object w. UQ >>> keys", >>> >>> js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS ".. object w/o UQ >>> keys", >>> js.vl IS JSON ARRAY WITH UNIQUE KEYS AS ".. array w. UQ >>> keys", >>> >>> js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS ".. array w/o UQ keys= ", >>> js.vl IS JSON ARRAY ".. array" >>> FROM (VALUES ('{{"a": "a1"}, {"a": "a2"}}'), ('[{"a": "a1"}, {"a": >>> "a2"}]'), >>> ('["a", "a"]')) AS js(vl); >>> >>> I'm not sure what should be the right logic for this option, for me it >>> looks >>> now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but >>> if >>> we use an option it should be either true for WITH UNIQUE KEYS or WITHO= UT >>> UNIQUE KEYS but not for both at the same time. But anyway the sentence = I >>> showed above should contain "array" instead of "object" because for >>> objects >>> it returns false independently of applied option. I tested it on >>> "PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120, >>> 64-bit". >>> >> >> First, WITHOUT UNIQUE KEYS does not mean "confirm that there are >> duplicate keys", it's just a way of stating the default explicitly. In >> other words it means "w/o testing for duplicate keys". Thus IS JSON OBJE= CT >> and IS JSON OBJECT WITHOUT UNIQUE KEYS will both always return identical >> results on the same JSON expression. >> >> Secondly, the UNIQUE test is recursive; for objects maybe the meaning is >> intuitive, but for JSON arrays -- which don't have any concept of keys; >> JSON arrays are just ordered lists -- it means "does this array contain = any >> embedded objects with duplicate keys". >> >> See: >> >> SELECT js, >> js IS JSON "json?", >> js IS JSON OBJECT "object?", >> js IS JSON OBJECT WITH UNIQUE KEYS "object w. UK?", >> js IS JSON OBJECT WITHOUT UNIQUE KEYS "object w/o UK?", >> js IS JSON ARRAY "array?", >> js IS JSON ARRAY WITH UNIQUE KEYS "array w. UK?", >> js IS JSON ARRAY WITHOUT UNIQUE KEYS "array w/o UK?" >> FROM (VALUES >> ('[{"a":1},{"b":2,"b":3}]'), -- expect t for array, array w/o UK >> ('[{"a":1},{"b":2,"c":3}]'), -- expect t for ALL array tests >> ('{"b":2,"b":3}'), -- expect t for object, object w/o = UK >> ('{"c":2,"d":3}'), -- expect t for ALL object tests >> ('{"c":2,"d":{ "e": 0, "e": 1}}'), -- WITH UNIQUE is recursive for >> nested objects >> ('{"c":2,"d":{ "e": 0, "f": {"g":1,"g":2}}}'), -- no matter how deep >> ('[{"a":1},{"b":2,"c":{"d":1, "d":2}}]') -- and also tests arrays >> recursively for embedded objecs >> ) foo(js); >> >> >> A couple of side notes: >> >> 1. Your first data example is not JSON at all. It's helpful for this >> kind of test to include a plain IS JSON column, since any of the IS JSON= X >> tests can fail for two reasons: (a) it's not JSON, or (b) it is JSON but >> it's not an X. >> >> 2. Curiously, the JSON spec itself is completely silent on the meaning o= f >> objects with duplicate keys. PostgreSQL is more helpful in this >> regard--the docs explicitly state that the last value is the one that is >> retained by JSONB and used in processing functions. >> >> > To improve the documentation here, I would suggest simply adding the word > "recursively" after "tested": > > If WITH UNIQUE KEYS is specified, then any object in the *expression*= is > also tested recursively to see if it has duplicate keys > > As for clarifying -- if we need to -- that WITHOUT UNIQUE KEYS means > "without testing for uniqueness" rather than "confirming that non-unique > keys are present", I will defer to anyone who has more familiarity with t= he > PostgreSQL documentation style. I probed around looking at other > situations where there are phrases that make the default condition explic= it > (e.g. CREATE INDEX ... NULLS DISTINCT) that didn't help me much; all my > attempts ended up too wordy for my liking. > --=20 =D0=A1 =D1=83=D0=B2=D0=B0=D0=B6=D0=B5=D0=BD=D0=B8=D0=B5=D0=BC, =D0=92=D0=B0=D0=BB=D0=B5=D1=80=D0=B8=D0=B9 --0000000000005cf7970630c85432 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Hello Kirk,

Thank you for yo= ur answer and detailed explanation. You are absolutely right, I made a mist= ake in my reasoning. On the other hand, some ambiguity exists.
<= div>It is a perfect idea to clarify what WITHOUT UNIQUE KEYS means. Such clarification could help with understanding the logic of IS JSON faste= r and avoiding confusion and ambiguity. I found the same option for a few JSON creation functions in table 9.47 at = the same page, like json_object() and json(). Their descriptions also don&#= 39;t clarify this point.

Best regards,
V= alery


=D1=87=D1=82, 20 =D0=BC=D0=B0=D1=80. 2025=E2=80= =AF=D0=B3. =D0=B2 15:22, Kirk Parker <khp@equatoria.us>:
On Thu, Mar 20, 2025 at 7:08= =E2=80=AFAM Kirk Parker <khp@equatoria.us> wrote:

O= n Thu, Mar 20, 2025 at 2:46=E2=80=AFAM PG Doc comments form <noreply@postgresql.org= > wrote:
The = following documentation comment has been logged on the website:

Page: https://www.postgresql.org/docs/17/funct= ions-json.html
Description:

On the manual page
https://www.postgresql.org/docs/current/f= unctions-json.html, in the Table
9.48. "SQL/JSON Testing Functions" there is a description of IS J= SON. It
includes the next sentence: "If WITH UNIQUE KEYS is specified, then an= y
object in the expression is also tested to see if it has duplicate keys.&qu= ot;
And such text is ambiguous, because the term "object" has certain= meaning
regarding json format. In reality the option WITH UNIQUE KEYS allows to
check for duplicated keys any array element not object. For objects, both WITH UNIQUE KEYS and WITHOUT UNIQUE KEYS return false, and both IS JSON
ARRAY WITH UNIQUE KEY and IS JSON ARRAY WITHOUT UNIQUE KEY return true (it<= br> is at the same time with and without unique values, how it is possible?), i.e. it works the same as just IS JSON ARRAY. The example code that confirm= s
my reasoning:=C2=A0
SELECT
=C2=A0 =C2=A0 js.vl AS=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 "tested str",=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0

=C2=A0 =C2=A0 js.vl IS JSON OBJECT WITH UNIQUE KEYS AS=C2=A0 =C2=A0 =C2=A0 = ".. object w. UQ keys",=C2=A0 =C2=A0

=C2=A0 =C2=A0 js.vl IS JSON OBJECT WITHOUT UNIQUE KEYS AS=C2=A0 =C2=A0"= ;.. object w/o UQ keys",
=C2=A0 =C2=A0 js.vl IS JSON ARRAY WITH UNIQUE KEYS AS=C2=A0 =C2=A0 =C2=A0 = =C2=A0".. array w. UQ keys",=C2=A0 =C2=A0

=C2=A0 =C2=A0 js.vl IS JSON ARRAY WITHOUT UNIQUE KEYS AS=C2=A0 =C2=A0 "= ;.. array w/o UQ keys",
=C2=A0 =C2=A0 js.vl IS JSON ARRAY ".. array"
FROM (VALUES ('{{"a": "a1"}, {"a": "= a2"}}'), ('[{"a": "a1"}, {"a": &= quot;a2"}]'),
('["a", "a"]')) AS js(vl);

I'm not sure what should be the right logic for this option, for me it = looks
now the same as simple IS JSON ARRAY without any UNIQUE KEY option, but if<= br> we use an option it should be either true for WITH UNIQUE KEYS or WITHOUT UNIQUE KEYS but not for both at the same time. But anyway the sentence I showed above should contain "array" instead of "object"= because for objects
it returns false independently of applied option.=C2=A0 I tested it on
"PostgreSQL 17.0 on x86_64-windows, compiled by msvc-19.41.34120, 64-b= it".

First, WITHOUT UNIQUE KEYS do= es not mean "confirm that there are duplicate keys", it's jus= t a way of stating the default explicitly. In other words it means "w/= o testing for duplicate keys". Thus IS JSON OBJECT and IS JSON OBJECT = WITHOUT UNIQUE KEYS will both always return identical results on the same J= SON expression.=C2=A0

Secondly, the UNIQUE test is= recursive; for objects maybe the meaning is intuitive, but for JSON arrays= -- which don't have any concept of keys; JSON arrays are just ordered = lists -- it means "does this array contain any embedded objects with d= uplicate keys".

See:
=C2=A0SELECT js,
=C2=A0 js IS JSON= "json?",
=C2=A0 js IS JSON OBJECT "object?",
=C2=A0 js= IS JSON OBJECT WITH UNIQUE KEYS "object w. UK?",
=C2=A0 js IS JS= ON OBJECT WITHOUT UNIQUE KEYS "object w/o UK?",
=C2=A0 js IS JSON= ARRAY "array?",
=C2=A0 js IS JSON ARRAY WITH UNIQUE KEYS "a= rray w. UK?",
=C2=A0 js IS JSON ARRAY WITHOUT UNIQUE KEYS "array = w/o UK?"
FROM (VALUES
=C2=A0 ('[{"a":1},{"b":= 2,"b":3}]'), =C2=A0 =C2=A0 =C2=A0 -- expect t for array, arra= y w/o UK
=C2=A0 ('[{"a":1},{"b":2,"c":3}]= '), =C2=A0 =C2=A0 =C2=A0 -- expect t for ALL array tests
=C2=A0 ('{= "b":2,"b":3}'), =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 -- expect t for object, object w/o UK
=C2=A0 ('{&q= uot;c":2,"d":3}'), =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 -- expect t for ALL object tests
=C2=A0 ('{"c= ":2,"d":{ "e": 0, "e": 1}}'), -- WIT= H UNIQUE is recursive for nested objects
=C2=A0 ('{"c":2,&quo= t;d":{ "e": 0, "f": {"g":1,"g"= :2}}}'), -- no matter how deep
=C2=A0 ('[{"a":1},{"b= ":2,"c":{"d":1, "d":2}}]') -- and al= so tests arrays recursively for embedded objecs
) foo(js);

A couple of side notes:

1. Your first data exam= ple is not JSON at all.=C2=A0 It's helpful for this kind of test to inc= lude a plain IS JSON column, since any of the IS JSON X tests can fail for = two reasons: (a) it's not JSON, or (b) it is JSON but it's not an X= .

<= font face=3D"arial, sans-serif">2. Curiously, the JSON spec itself is compl= etely silent on the meaning of objects with duplicate=C2=A0keys.=C2=A0=C2= =A0PostgreSQL is more helpful in this regard--th= e docs explicitly state that the last value is the one that is retained by = JSONB and used in processing functions.

=C2=A0
To improve the documentation=C2=A0he= re, I would suggest simply adding the word "recursively" after &q= uot;tested":

=C2=A0 =C2=A0 If=C2=A0WITH UNIQUE KEYS=C2=A0is= specified, then any object in the=C2=A0expression=C2=A0is also tested recursively to see if it has duplicate keys

As for clarifying -- if we n= eed to -- that WITHOUT UNIQUE KEYS means "without testing for uniquene= ss" rather than "confirming that non-unique keys are present"= ;, I will defer to anyone who has more familiarity with the PostgreSQL docu= mentation=C2=A0style.=C2=A0 I probed around looking at other situations whe= re there are phrases that make the default condition explicit (e.g. CREATE = INDEX ... NULLS DISTINCT) that didn't help me much; all my attempts end= ed up too wordy for my liking.


--
=D0=A1 =D1=83=D0=B2=D0=B0=D0=B6=D0=B5=D0=BD=D0=B8=D0=B5= =D0=BC,
=D0=92=D0=B0=D0=BB=D0=B5=D1=80=D0=B8=D0=B9
--0000000000005cf7970630c85432--