public inbox for [email protected]  
help / color / mirror / Atom feed
From: Masahiko Sawada <[email protected]>
To: jian he <[email protected]>
Cc: Andrew Dunstan <[email protected]>
Cc: Joe Conway <[email protected]>
Cc: Junwang Zhao <[email protected]>
Cc: Florents Tselai <[email protected]>
Cc: Andrey M. Borodin <[email protected]>
Cc: Dean Rasheed <[email protected]>
Cc: Daniel Verite <[email protected]>
Cc: Davin Shearer <[email protected]>
Cc: PostgreSQL-development <[email protected]>
Subject: Re: Emitting JSON to file using COPY TO
Date: Mon, 16 Mar 2026 11:24:58 -0700
Message-ID: <CAD21AoASG2uXJXN=23jkSgKWJNVz2TPazQmrOYA7ZvO3GxjVhw@mail.gmail.com> (raw)
In-Reply-To: <CACJufxGm3OhbG3=bj4nLxMzq92A84v6Q1dG+hR_ZFrfLg1B99w@mail.gmail.com>
References: <[email protected]>
	<CACJufxGbiNtsHnn=ZGw2J4VcjD5za5znbcxsaEvVd7vMdxQQ+g@mail.gmail.com>
	<CACJufxG7T2_LF=UuJ4iFYMJZsK37L6d3XJvBrcR-3Pp3z+BiGw@mail.gmail.com>
	<CACJufxG0=CoG64Ng7DfMP4zy0MORhTW7rRioQnNVCw-6GxRaXg@mail.gmail.com>
	<CACJufxGq6aJszBhfrUQKHeKmp0r+ka7=SwHSY7Qs_6LQMwiEbw@mail.gmail.com>
	<CA+v5N43-F9-Wiktg_-+aqKJz+YiCNJMAh5ootDeTKkOJ=kiaHA@mail.gmail.com>
	<CACJufxEjZwrocCpt29xtmJTwhWZUu1Nt0GfHFarNGXUS5AAecw@mail.gmail.com>
	<CAEG8a3LB7q1eQ0AzFTEBDFDxs3kJ=5iJA+HTmiYGya6Wb5jsRA@mail.gmail.com>
	<CACJufxFLpMsbW41T65xJsw925MSE1bvOr6X+h_7sw8_qmDpTpA@mail.gmail.com>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<CACJufxELGPA86veVNNBoqUDLwwQbQKdiYLKtwLb7DoSgV6f8sw@mail.gmail.com>
	<[email protected]>
	<CACJufxFFZqxC3p4WjpTEi4riaJm=pADX+py0yQ0=RWTn5cqK3Q@mail.gmail.com>
	<[email protected]>
	<CACJufxGm3OhbG3=bj4nLxMzq92A84v6Q1dG+hR_ZFrfLg1B99w@mail.gmail.com>

On Sun, Mar 8, 2026 at 8:49 PM jian he <[email protected]> wrote:
>
> On Mon, Mar 9, 2026 at 3:44 AM Andrew Dunstan <[email protected]> wrote:
> >
> > Hmm. But should we be scribbling on slot->tts_tupleDescriptor like that?
> > How about something like this?:
> >
> > -        * Full table or query without column list.  Ensure the slot uses
> > -        * cstate->tupDesc so that the datum is stamped with the right type;
> > -        * for queries output type is RECORDOID this must be the blessed
> > -        * descriptor so that composite_to_json can look it up via
> > -        * lookup_rowtype_tupdesc.
> > +        * Full table or query without column list.  For queries, the slot's
> > +        * TupleDesc may carry RECORDOID, which is not registered in the
> > type
> > +        * cache and would cause composite_to_json's lookup_rowtype_tupdesc
> > +        * call to fail.  Build a HeapTuple stamped with the blessed
> > +        * descriptor so the type can be looked up correctly.
> >           */
> >          if (!cstate->rel && slot->tts_tupleDescriptor->tdtypeid ==
> > RECORDOID)
> > -           slot->tts_tupleDescriptor = cstate->queryDesc->tupDesc;
> > +       {
> > +           HeapTuple   tup;
> >
> > -       rowdata = ExecFetchSlotHeapTupleDatum(slot);
> > +           tup = heap_form_tuple(cstate->tupDesc,
> > +                                 slot->tts_values,
> > +                                 slot->tts_isnull);
> > +           rowdata = HeapTupleGetDatum(tup);
> > +       }
> > +       else
> > +       {
> > +           rowdata = ExecFetchSlotHeapTupleDatum(slot);
> > +       }
> >
> This is better. I've tried to get rid of json_projvalues and json_projnulls.
> Just using heap_form_tuple, but it won't work.
>
> I incorporated the v28-0004 COPY column list into v9-0002.
> With this patch set, we added four fields to the struct CopyToStateData.
>
> +    StringInfo    json_buf;        /* reusable buffer for JSON output,
> +                                 * initialized in BeginCopyTo */
> +    TupleDesc    tupDesc;        /* Descriptor for JSON output; for a column
> +                                 * list this is a projected descriptor */
> +    Datum       *json_projvalues;    /* pre-allocated projection values, or
> +                                     * NULL */
> +    bool       *json_projnulls; /* pre-allocated projection nulls, or NULL */
>
> Using the script in
> https://www.postgresql.org/message-id/CACJufxFFZqxC3p4WjpTEi4riaJm%3DpADX%2Bpy0yQ0%3DRWTn5cqK3Q%40ma...
> I tested it again on macOS and Linux, and there are no regressions for
> COPY TO with the TEXT and CSV formats.

I've reviewed the patch and have some comments:

---
I got a SEGV in the following scenario:

postgres(1:1197708)=# create table test (a int, b text, c jsonb);
CREATE TABLE
postgres(1:1197708)=# copy test(a, b) to stdout with (format 'json' );
TRAP: failed Assert("tupdesc->firstNonCachedOffsetAttr >= 0"), File:
"execTuples.c", Line: 2328, PID: 1197708
postgres: masahiko postgres [local] COPY(ExceptionalCondition+0x9e) [0xbebe48]
postgres: masahiko postgres [local] COPY(BlessTupleDesc+0x2b) [0x729b50]
postgres: masahiko postgres [local] COPY(BeginCopyTo+0xc94) [0x637bdf]
postgres: masahiko postgres [local] COPY(DoCopy+0xb68) [0x62afbc]
postgres: masahiko postgres [local]
COPY(standard_ProcessUtility+0xa22) [0xa0ba48]
postgres: masahiko postgres [local] COPY(ProcessUtility+0x10e) [0xa0b01f]
postgres: masahiko postgres [local] COPY() [0xa09872]
postgres: masahiko postgres [local] COPY() [0xa09acf]
postgres: masahiko postgres [local] COPY(PortalRun+0x2c8) [0xa0901d]
postgres: masahiko postgres [local] COPY() [0xa02055]
postgres: masahiko postgres [local] COPY(PostgresMain+0xaf1) [0xa0724e]
postgres: masahiko postgres [local] COPY() [0x9fdab9]
postgres: masahiko postgres [local]
COPY(postmaster_child_launch+0x165) [0x905378]
postgres: masahiko postgres [local] COPY() [0x90b600]
postgres: masahiko postgres [local] COPY() [0x908e6a]
postgres: masahiko postgres [local] COPY(PostmasterMain+0x14fe) [0x90880c]
postgres: masahiko postgres [local] COPY(main+0x340) [0x7a1f9c]

It seems to forget to call TupleDescFinalize(). And I think we need
some regression tests for this case.

---
+       if (cstate->opts.format == COPY_FORMAT_JSON)
+       {
+               /*
+                * If FORCE_ARRAY has been specified, send the opening bracket.
+                */
+               if (cstate->opts.force_array)
+               {
+                       CopySendChar(cstate, '[');
+                       CopySendTextLikeEndOfRow(cstate);
+               }
+       }

We can conjunct the two if statement conditions.

Regards,

-- 
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com






reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Emitting JSON to file using COPY TO
  In-Reply-To: <CAD21AoASG2uXJXN=23jkSgKWJNVz2TPazQmrOYA7ZvO3GxjVhw@mail.gmail.com>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox