Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1iZOiv-0003u5-PG for pgsql-docs@arkaria.postgresql.org; Tue, 26 Nov 2019 00:28:57 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.89) (envelope-from ) id 1iZOiu-0003KO-2R for pgsql-docs@arkaria.postgresql.org; Tue, 26 Nov 2019 00:28:56 +0000 Received: from makus.postgresql.org ([2001:4800:3e1:1::229]) by malur.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1iZOit-0003Js-FZ for pgsql-docs@lists.postgresql.org; Tue, 26 Nov 2019 00:28:55 +0000 Received: from meldrar.postgresql.org ([2a02:c0:301:0:ffff::31]) by makus.postgresql.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.92) (envelope-from ) id 1iZOim-0003zB-Ge for pgsql-docs@lists.postgresql.org; Tue, 26 Nov 2019 00:28:54 +0000 DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=postgresql.org; s=20171124; h=Content-Type:In-Reply-To:MIME-Version:Date: Message-ID:From:References:To:Subject:Sender:Reply-To:Cc: Content-Transfer-Encoding:Content-ID:Content-Description; bh=S8E8X6q/+HZtkVVk5VylAc5k2EfdBkNcpeuU/b2NdOA=; b=YJ+NMkNLJ3emsu/NjPDWWqR+zN RIQJCixsuyHu6MsdfgQPvD/nH8e+I0DbG5q97BrmwOC7+p+SB9RYnQhsCfw8FgNQhjeUyauT+QihX ADzDtyWFYoyisrd7Waw0Z9bDmuai3p0ri2OmidlUqZEBidxvV9LRtRFwfWPjWhkhTmUQL99pegRyQ p+AJizRQgOaEs36osMyPJwXFlCA6NRcvkEMrlu4bFr1erp7NKE7gUIa8hSF3IsrA6wlVolbHlVmRj u24hTin56mV4x+aB55oCwbO3T/X39oWiQycBq/sxmIcJcllCvRvGmKDZKGh5BcrZjlzf93DVglyXH 6tXGtcYQ==; Received: from pool-173-56-10-231.nycmny.fios.verizon.net ([173.56.10.231] helo=Ph33r-Retina.home) by meldrar.postgresql.org with esmtpsa (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.89) (envelope-from ) id 1iZOih-0000QB-Vm; Tue, 26 Nov 2019 00:28:47 +0000 Subject: Re: legacy assumptions To: jonathan@buhacoff.net, pgsql-docs@lists.postgresql.org, PG Doc comments form References: <157470402952.7229.3931536721568065202@wrigleys.postgresql.org> From: "Jonathan S. Katz" Autocrypt: addr=jkatz@postgresql.org; keydata= mQINBFtShwABEAC7PNHDUOTYuifpUCk23KqfxdQQkn4nkoxOXRK0+rAj36FiwqGB4TJFuOVZ sDFAEiSlC8Jt4y5Cs7B5tetT8JNd2cs6zp/udMJJDz9d65O9PDpdlMgAmIiTzpLlSdx8FG56 DTksaDv1d8j3cTJPSE4/fWSxqzA7o3Y9UuL7atZPrzfImgpRKs0of7elIHwOa8GucjyhYqcR h60wFBJc2KXqQdDYRTZy43DSnY/0VNc0omiH355fustvpm+m5HjD3w7qZyfN3fpKJpnX1LCF f3MnPHaDGITIYGRCBXvf0UqUtD6OEVWPv2C2gyqWMIpWmZTOgDufltKyIByKBoS9x0PlFkij 04X3KODCngt+N8Ssc9OICc6QSxhjoP48PYPdmiTmkrGuf0LX084wj1xeo1NX7XxZK39F6dTJ DhsIiW0sNS0xMxQHLHG9VLbPjx3SANQBh6BuryPz5ZupW9/TIDmkvprtU/oXfKgtfYm3fxmk EctxbWrEPsFTFPyuMqQu6l+xyQv0s1VLZfjNWaua6H1/gGoIt6kRnn5qMXDVVpijuWkHbv7G ngaQMd258UrrOEHnnjzhQ7jxMWV9D+emxbAtlIxnYvCWlV4IwAQhEHfvudqYaIY3hNWrvQ6H GB2KXoTZYN9g5djm14/5nj1IU5zOcovkjJnKhoo9iStnpFF2cwARAQABtCdKb25hdGhhbiBT LiBLYXR6IDxqa2F0ekBwb3N0Z3Jlc3FsLm9yZz6JAlQEEwEIAD4CGwMFCQeGH4AFCwkIBwIG FQoJCAsCBBYCAwECHgECF4AWIQT6hLaVryv3miBkP/HxBJxynxxlJwUCW1KJJQAKCRDxBJxy nxxlJwjrD/kBgqsW4QpNpTFw7ifRokZV08CCX4huPBJQ91rrv+UEWlEcotFBHVkYyHnpzARl tcZxhJ9CbFxjniH9cOTty5T/O1yolbOHtZSW8Z8aWV6BVEbjMb+BFxSSLm7RnvJdzQbGCZq2 ZZvfVpB6z3EHYph4KDdVKvMFjoLskxmdS1DE0tE3zTxvoQsi24Q+HOS07kUjs6fsu/WICMfz mgO++AWG9Y0CvN0mm4TkujESzyKM9E5irD+leEMIcddl51Aa2c/VMfBXQbRmpHIgUFTmuHQD CnQih+9i3OJAksDg66SP8a7yiXv5mwvyDi1EfTGVKYR2j+pwyjwnC3oIbvDMmB3uTn2JIjnT iZKPVtAcAylXjubFltihQgNyuShdP4W+kBwZizhUFqUVL8Anx+KoytYmJPfMRFLGuK4obXKq a2ZS3k9KB+H+isOx2nFJOsc7V360Zp1DVaNmuiK10TT6QndShSPaqkJqFtCb6r92rZ9sZM/L 3vtCI4Rrl3Pt1MgtENXupS8gZpJnAYS0j5A1PAZ09r6ANoaeMHspF+5J5fOHeEvqphXr36mm a83Vl1t4orPb0+QmmijmlpseDU63M88Aw5p3c4qj7t8Qr2EZ5zrn7/sFn5wOfbs8Nymxafif QCnlV2vg9p0m7vSk/yLJ4PFZvs52FgqAGRCdRn0s2EC99bkCDQRbUocAARAAv8ho/toQ9DG3 j4f9h9n1aRHr2FlviN2Utpy6L8+dfDggO0geilmkGQOolZ2E60gGfye/kUtF9W3NByO4hxDR 9u6qbOXcdqnuA+cc68EfqlWFJrVtYFxt0h4ElWYOYnIezKthriWch/FY70FGrxs3z8UHOHq5 0wBW433eTvZm90WixBiXEt2v1DgW4Vr3ymfO7Aap/IYyPuE4JzgudAuAl0HKPyNEHWHG1dAb jX1RiCw9gknIDWQOF0B4UAaJctWGVcnZ3A2ULwNGMa1P9ZJlBWf1vcj01aiHMU0yQ7JjJiSp vfm9eM0uSLwRdDrJjyl5ZZqVumjdv2SMNQ8GvYRbEMys3GGDSt9zXgfCSUnPnJfYxjzBHRI6 x44Wfsx8S6hWxepOogCJJ/g67Bk9mY8YV4klWIXDJVOL5jnBC09DbsZG81JaE2QxB8Y7W36Z Mroi9XMxg3s805hQAQUvdG/poU8hN8BWdrnTm/+4eQQp7gDY1ePDmGM6bJC+OHOSnFtR/f+7 0zpKJ10cc7cBygGnl1yR3KjhFyAWUFvP4ZGziKCcpMwXZfe9PGuyA/YOubMphxIn3YsK2wrd faKZYX2GMZCZhMMcvx9IpQrxIJgU+VlwXu/O+Lk10VIPcxPJJwmpdI6HzcS8ZgG6IMcC444X XTuLaP8j2mgcMvYak3ScCykAEQEAAYkCPAQYAQgAJhYhBPqEtpWvK/eaIGQ/8fEEnHKfHGUn BQJbUocAAhsMBQkHhh+AAAoJEPEEnHKfHGUnReMP/RA2UhGQj+G4uBshkRLjpRysabdPqgQB dEBk6wYbio88Wg/2/hgY7UzmDDEwX3sZfQDcrI6+vIobI8uqstZID+WgAAa1JLfChMyVQnSy 0zfWMOABXscc5tGuvFRZvJklTissMFjXUwaREEKp4ZikTvJ/62MCjSdtrUhnPLvoTsHTKRKD ichE+b5A54alwsubTijw12O6N22r5IjZiiSZV0u7dsShyKw+7wCSax9fuBoE43NMYf+dnjMK nerAQYUcZWYMnk+EC8RaqYAxv6XZ2tKx1AkGGktwkQIBwrz9IlDSvJ3LWJ2UIIuLRTdngNgg GIL4zzuUa2F56FqskQIuYMaNETk6LYfalBDQ6TVLAcgCPQxp4k4i/PRsZ3lZ2ZhRHRYciOvm kp+I6EfHwllQpYrWs1thluBGqlJSVJgKl0IOFvKLsQ6KKfqzAwh6FxrO5qajp4viNIgtWoFw O5Bp0jgFTbH2OrMWIRfUdUCH1Djbuo65svhNj7FNsQVYzHDI+Nd1I/LOBoPc3UorRMF2M4JV kUR7skOHWUbPTSNUr7qc14NSMY3PKjGeVGAsBVHBPvmRx/Ss2tW/TpJWpxg4pmquFQSXuaYh Yf4FN8Sxy320pcr+FqN8AhZkYJcNY82OTtR5VEKRC/mYcyq61qXIKqngydn79bJjGxHipxyF FmEC Message-ID: Date: Mon, 25 Nov 2019 19:28:43 -0500 User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.12; rv:68.0) Gecko/20100101 Thunderbird/68.2.2 MIME-Version: 1.0 In-Reply-To: <157470402952.7229.3931536721568065202@wrigleys.postgresql.org> Content-Type: multipart/signed; micalg=pgp-sha256; protocol="application/pgp-signature"; boundary="R8Owqe9B9Ft8o04M03Ftgxh4h1pIMaN1R" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Precedence: bulk This is an OpenPGP/MIME signed message (RFC 4880 and 3156) --R8Owqe9B9Ft8o04M03Ftgxh4h1pIMaN1R Content-Type: multipart/mixed; boundary="0yHdWjcq3PvoiEN2a2hulCBf3ePVMsFU7" --0yHdWjcq3PvoiEN2a2hulCBf3ePVMsFU7 Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: quoted-printable Hi, On 11/25/19 12:47 PM, PG Doc comments form wrote: > The following documentation comment has been logged on the website: >=20 > Page: https://www.postgresql.org/docs/12/datatype-json.html > Description: >=20 > I'm wondering if this one line of section 8.14 JSON Types > (https://www.postgresql.org/docs/current/datatype-json.html) can be edi= ted > to remove the word "legacy": >=20 > "In general, most applications should prefer to store JSON data as json= b, > unless there are quite specialized needs, such as legacy assumptions ab= out > ordering of object keys." >=20 > I'm concerned that with the word "legacy" there, someone might come alo= ng > eventually and decide the json column type isn't needed anymore because= it's > "legacy", where in fact there are modern and legitimate uses for a fiel= d > that allows you to retrieve the data exactly as it was stored and allow= s > JSON queries on that data (even if they are slower). While I'm certainly sensitive to this need as once upon a time I had a similar requirement, slightly less strict requirement, I made sure to not rely on the PostgreSQL JSON type itself to ensure ordering was preserved (and in my case I was able to rely on a solution external to PostgreSQL). The JSON RFC states that objects should be considered "unordered", and mentions that while different parsing libraries may preserve key ordering, "implementations whose behavior does not depend on member ordering will be interoperable in the sense that they will not be affected by these differences."[1] > An alternative would be to store the > plaintext as binary data for the integrity check and have a separate js= onb > column with a second copy of the same data. Since different application= s > have different time/space tradeoffs, it's good to have the choice. Another approach is to leverage PostgreSQL's expression index capabilities, which would allow you to limit the data duplication. For example: CREATE TABLE docs (doc bytea); -- populating some test data INSERT INTO docs SELECT ('{"id": ' || x || ', "data": [1,2,3] }')::bytea FROM generate_series(1, 100000) x; -- create an expression index that maps to the operators supported by GIN= CREATE INDEX docs_doc_json_idx ON docs USING gin(jsonb(encode(doc, 'escape'))); and in one test run: EXPLAIN SELECT doc FROM docs WHERE encode(doc, 'escape')::jsonb @> '{"id": 567}'; I got a plan similar to: QUERY PLAN -------------------------------------------------------------------------= ----------- Bitmap Heap Scan on docs (cost=3D28.77..306.00 rows=3D100 width=3D31) Recheck Cond: ((encode(doc, 'escape'::text))::jsonb @> '{"id": 567}'::jsonb) -> Bitmap Index Scan on docs_doc_json_idx (cost=3D0.00..28.75 rows=3D100 width=3D0) Index Cond: ((encode(doc, 'escape'::text))::jsonb @> '{"id": 567}'::jsonb) In this way, you can: - Keep the key ordering preserved and perform any integrity checks, etc. that your application requires - Limit your data duplication to that of the index - Still get the benefits of the JSONB lookup functions that work with the indexing - Still perform JSON validation: INSERT INTO docs VALUES ('{]'::bytea); ERROR: invalid input syntax for type json DETAIL: Expected string or "}", but found "]". CONTEXT: JSON data, line 1: {] > My suggestion for that sentence: >=20 > "In general, most applications should prefer to store JSON data as json= b, > unless there are quite specialized needs, such as assumptions about ord= ering > of object keys or the need to retrieve the data exactly as it was store= d." My preference would be that we guide in the documentation on what to do if one has an application sensitive to ordering. I'm not opposed to the wording, but I'd prefer we encourage people to leverage JSONB for storage & retrieval. Thanks! Jonathan [1] https://tools.ietf.org/html/rfc7159#section-4 --0yHdWjcq3PvoiEN2a2hulCBf3ePVMsFU7-- --R8Owqe9B9Ft8o04M03Ftgxh4h1pIMaN1R Content-Type: application/pgp-signature; name="signature.asc" Content-Description: OpenPGP digital signature Content-Disposition: attachment; filename="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEE+oS2la8r95ogZD/x8QSccp8cZScFAl3ccbsACgkQ8QSccp8c ZSclERAAmTRLngRe3QlrP0SYjWEuf5/t86kXevq/CyFjUDiZ26TkPybPHwpSRbkb n0A7vg6w6WYNZ9LFeb6+lE3TQnCPc/QbDDn7xKBUGr6ohBTJvoEJfxKb8GJTYWB/ I6nGVpxv1hNCRReVwNuiPy5dR26I1EOZLsPr0iaS4iV0wlr/Dz+bKx/+LMXz8Bh2 BCSWwgdWNCPYvE3Ev9yvsH72Ov9Fylf5mRFKwdWiDVWzu3EE89e6de5dcmVWgvGE r6tcs9Sac5I+ObX1zkPquuux6d8dFpLO+WHFOamPGUC2KtGow4RkEOq4sJsO7/ZP 9T78Vt2qCJtNDtw1skQIMxuyHklMKisfmwE8G7ZhcJd6UetU2/sYtINOEU/4qm2w Kc4LZEhVHhfI0tgX+frRamMAlYaJDwFb6kHghxSqpGxVY3KSwvpEyO1NTMK5pmwj 1mitirpuQf3M3crXgOjl5DxuNtSTLpHNkO4lPT2HkM7h1Mwx3ufOY4j4+2UOgnMZ dQj7uRxWQPp50S7RvIIUXrwQuUZk1ZtKa6o0ZlSUQuWiwvPwXbDueEYJdPQ5sGs4 jpBpwvyteWejVxlPJ56tVWbqCHobBR3Ic8O8s2U4r27z86CiXcCVK3KO0UWOptJd mViHC1JeAJ6Bn7WTPOxhow9jyeDOkWFVQRRqufJcAPPBn2N4O2E= =eVlG -----END PGP SIGNATURE----- --R8Owqe9B9Ft8o04M03Ftgxh4h1pIMaN1R--