public inbox for [email protected]  
help / color / mirror / Atom feed
From: Dominique Devienne <[email protected]>
To: Tom Lane <[email protected]>
Cc: David G. Johnston <[email protected]>
Cc: Adrian Klaver <[email protected]>
Cc: Davin Shearer <[email protected]>
Cc: [email protected] <[email protected]>
Subject: Re: Emitting JSON to file using COPY TO
Date: Mon, 27 Nov 2023 16:26:43 +0100
Message-ID: <CAFCRh-_GdiUvjd5z5FfvTfhruOnYqBu163XU47zZE8RNATCJGQ@mail.gmail.com> (raw)
In-Reply-To: <[email protected]>
References: <CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU=kcg@mail.gmail.com>
	<[email protected]>
	<CAFCRh-_LhP5Ln4-beVz=1pQ9VycU3hAxnJHXhPHZRmX0BSCVEQ@mail.gmail.com>
	<CAFCRh--Vhx0105S7SWAYHGcVMTSRtbcoArDUGSbYSWRUGunW_g@mail.gmail.com>
	<CAKFQuwZb3zaSqkOvGXRRLZEW61az+4xyHZ7tBPR91JfrcdCKvA@mail.gmail.com>
	<[email protected]>

On Mon, Nov 27, 2023 at 3:56 PM Tom Lane <[email protected]> wrote:

> "David G. Johnston" <[email protected]> writes:
> > I agree there should be a copy option for “not formatted” so if you dump
> a
> > single column result in that format you get the raw unescaped contents of
> > the column.
>
> I'm not sure I even buy that.  JSON data in particular is typically
> multi-line, so how will you know where the row boundaries are?
> That is, is a newline a row separator or part of the data?
>
> You can debate the intelligence of any particular quoting/escaping
> scheme, but imagining that you can get away without having one at
> all will just create its own problems.
>

What I was suggesting is not about a "not formatted" option.
But rather than JSON values (i.e. typed `json` or `jsonb`) in a
JSON-formatted COPY operator, the JSON values should not be
serialized to text that is simply output as a JSON-text-value by COPY,
but "inlined" as a "real" JSON value without the JSON document output by
COPY.

This is a special case, where the inner and outer "values" (for lack of a
better terminology)
are *both* JSON documents, and given that JSON is hierarchical, the inner
JSON value can
either by 1) serializing to text first, which must thus be escaped using
the JSON escaping rules,
2) NOT serialized, but "inline" or "spliced-in" the outer COPY JSON
document.

I guess COPY in JSON mode supports only #1 now? While #2 makes more sense
to me.
But both options are valid. Is that clearer?

BTW, JSON is not multi-line, except for insignificant whitespace.
So even COPY in JSON mode is not supposed to be line based I guess?
Unless COPY in JSON mode is more like NDJSON (https://ndjson.org/)? --DD


view thread (35+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Emitting JSON to file using COPY TO
  In-Reply-To: <CAFCRh-_GdiUvjd5z5FfvTfhruOnYqBu163XU47zZE8RNATCJGQ@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox