public inbox for [email protected]  
help / color / mirror / Atom feed
From: Joe Conway <[email protected]>
To: Davin Shearer <[email protected]>
To: PostgreSQL-development <[email protected]>
Cc: Nathan Bossart <[email protected]>
Subject: Re: Emitting JSON to file using COPY TO
Date: Fri, 1 Dec 2023 22:10:54 -0500
Message-ID: <[email protected]> (raw)
In-Reply-To: <CALvfUkByf6eou6J36-2Hw7Kdro80JD_Ch1AeUDAjsX8RpcA==A@mail.gmail.com>
References: <CALvfUkBxTYy5uWPFVwpk_7ii2zgT07t3d-yR_cy4sfrrLU=kcg@mail.gmail.com>
	<[email protected]>
	<CAFCRh-_LhP5Ln4-beVz=1pQ9VycU3hAxnJHXhPHZRmX0BSCVEQ@mail.gmail.com>
	<CAFCRh--Vhx0105S7SWAYHGcVMTSRtbcoArDUGSbYSWRUGunW_g@mail.gmail.com>
	<CAKFQuwZb3zaSqkOvGXRRLZEW61az+4xyHZ7tBPR91JfrcdCKvA@mail.gmail.com>
	<[email protected]>
	<CAFCRh-_GdiUvjd5z5FfvTfhruOnYqBu163XU47zZE8RNATCJGQ@mail.gmail.com>
	<[email protected]>
	<CALvfUkCq3pvkmniWEKZwkbA6hNNvkANF3OamMfhC170RgY+HJg@mail.gmail.com>
	<[email protected]>
	<20231201230958.GA1786735@nathanxps13>
	<CALvfUkByf6eou6J36-2Hw7Kdro80JD_Ch1AeUDAjsX8RpcA==A@mail.gmail.com>

On 12/1/23 22:00, Davin Shearer wrote:
> I'm really glad to see this taken up as a possible new feature and will 
> definitely use it if it gets released.  I'm impressed with how clean, 
> understandable, and approachable the postgres codebase is in general and 
> how easy it is to read and understand this patch.
> 
> I reviewed the patch (though I didn't build and test the code) and have 
> a concern with adding the '[' at the beginning and ']' at the end of the 
> json output.  Those are already added by `json_agg` 
> (https://www.postgresql.org/docs/current/functions-aggregate.html 
> <https://www.postgresql.org/docs/current/functions-aggregate.html;) as 
> you can see in my initial email.  Adding them in the COPY TO may be 
> redundant (e.g., [[{"key":"value"...}....]]).

With this patch in place you don't use json_agg() at all. See the 
example output (this is real output with the patch applied):

(oops -- I meant to send this with the same email as the patch)

8<-------------------------------------------------
create table foo(id int8, f1 text, f2 timestamptz);
insert into foo
   select g.i,
          'line: ' || g.i::text,
          clock_timestamp()
   from generate_series(1,4) as g(i);

copy foo to stdout (format 'json');
[
  {"id":1,"f1":"line: 1","f2":"2023-12-01T12:58:16.776863-05:00"}
,{"id":2,"f1":"line: 2","f2":"2023-12-01T12:58:16.777084-05:00"}
,{"id":3,"f1":"line: 3","f2":"2023-12-01T12:58:16.777096-05:00"}
,{"id":4,"f1":"line: 4","f2":"2023-12-01T12:58:16.777103-05:00"}
]
8<-------------------------------------------------


> I think COPY TO makes good sense to support, though COPY FROM maybe not 
> so much as JSON isn't necessarily flat and rectangular like CSV.

Yeah -- definitely not as straight forward but possibly we just support 
the array-of-jsonobj-rows as input as well?

> For my use-case, I'm emitting JSON files to Apache NiFi for processing, 
> and NiFi has superior handling of JSON (via JOLT parsers) versus CSV 
> where parsing is generally done with regex.  I want to be able to emit 
> JSON using a postgres function and thus COPY TO.
> 
> Definitely +1 for COPY TO.
> 
> I don't think COPY FROM will work out well unless the JSON is required 
> to be flat and rectangular.  I would vote -1 to leave it out due to the 
> necessary restrictions making it not generally useful.


-- 
Joe Conway
PostgreSQL Contributors Team
RDS Open Source Databases
Amazon Web Services: https://aws.amazon.com







view thread (35+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected]
  Subject: Re: Emitting JSON to file using COPY TO
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox