Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sTEUE-00EF44-SN for pgsql-hackers@arkaria.postgresql.org; Mon, 15 Jul 2024 05:42:59 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1sTEUD-008OwN-At for pgsql-hackers@arkaria.postgresql.org; Mon, 15 Jul 2024 05:42:57 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1sTEUC-008OwF-Vt for pgsql-hackers@lists.postgresql.org; Mon, 15 Jul 2024 05:42:57 +0000 Received: from mail-vk1-xa35.google.com ([2607:f8b0:4864:20::a35]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1sTEU9-002BgY-L4 for pgsql-hackers@lists.postgresql.org; Mon, 15 Jul 2024 05:42:55 +0000 Received: by mail-vk1-xa35.google.com with SMTP id 71dfb90a1353d-4f2f31a9410so1465611e0c.3 for ; Sun, 14 Jul 2024 22:42:52 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=enterprisedb.com; s=google; t=1721022171; x=1721626971; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=lyYssVfDI08FKZ/Ve6+Z0a8eAKS78exT27IN1QT15Gw=; b=Ix2Fn7MyLqBGUECh8Fhju0HG3ULezxdfn99yaXZx/kttQ+3P7Bp1v6jGgy1d8XzP61 /u6NjTLf+/s/O26QdRZ/kLESPZtRvJBkngZXy6PRVpNoWHxsFNQ4E5jnwQxxV51+rYV6 Vory4UGuCiktlAqSHJ9PnJDdXn5fCj5GQzzuTl9CxUMF4qIg23Py02cZDoX9J8mEQE2p VEFmmrWrtRKdV6Z/d0NRjeueseG8xXlKDApR6uKqpYpUN29voiAX9OY3iaV/YaG+LgtN EG6nAgjyqXFOKpno4N4wqjr2GAw/QStaEvXOCeOg/JS0pbVlVXmmmcIKn6C1jwIH9aMj 8AEw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1721022171; x=1721626971; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=lyYssVfDI08FKZ/Ve6+Z0a8eAKS78exT27IN1QT15Gw=; b=skPNtzNWcoaJHh9D8BrNvPbVx2ItkA9I42z6sQJjvoU5z30DYjnA2y6DcCkNayKGPi Qj50aiJvLVClOUzaVjDDweeHKgFyqYwZTzgzX6Csb54mhF3rYm0HdgsKTrU84+BUwKRd 4Y4e8e/tX97cDZTEZXcdv1XeADAkQxjtR5g2qWd2+jGLXjLK0MerihYfPLpgevRO16n1 d+/O9rpbIidmem6eyYNy2OxacTIX91kXGQFprS2PDP9llAOtE7ShFvGsR0WasCgkHsOh nSbZKiEhqmBlRcuy3JWn0rTxwowKRH80vKcIqijO2T3Uka/IaKhcxebCmhV/V4bax55n HGlA== X-Gm-Message-State: AOJu0YyO28Zx1kQ1IqbpyuUwtxSx8NN5lhjxXc2Mkt/RP0blxL3xkZwa KC78ouzGS+mTUzRSO2Hre2UKIOz5bRCMG4UAS02j8O1EBnoRo05uBpmUwCcGPvRDandzHYrwrVa 7UuG/WWGq09h2P9ItycnoRehQWH+pDjZ9FAR2 X-Google-Smtp-Source: AGHT+IGWLk0MbVWhfQ+K+oYgP24y/DEUPi9OsjNrpeG+5NR8iFTo3a3+lAV39l6Zu1uv2SVK+95YhHdm2qOXHOFpf5U= X-Received: by 2002:ac5:c3c6:0:b0:4ef:58d4:70f5 with SMTP id 71dfb90a1353d-4f33f15fea9mr18112975e0c.2.1721022171034; Sun, 14 Jul 2024 22:42:51 -0700 (PDT) MIME-Version: 1.0 References: In-Reply-To: From: Nishant Sharma Date: Mon, 15 Jul 2024 11:12:40 +0530 Message-ID: Subject: Re: on_error table, saving error info to a table To: jian he Cc: PostgreSQL Hackers Content-Type: multipart/alternative; boundary="00000000000027d0bb061d42b344" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --00000000000027d0bb061d42b344 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Thanks for the patch! I was not able to understand why we need a special error table for COPY? Can't we just log it in a new log error file for COPY in a proper format? Like using table approach, PG would be unnecessarily be utilising its resources like extra CPU to format the data in pages to store in its table format, writing and keeping the table in its store (which may or may not be required), the user would be required to make sure it creates the error table with proper columns to be used in COPY, etc.. Meanwhile, please find some quick review comments:- 1) Patch needs re-base. 2) If the columns of the error table are fixed, then why not create it internally using some function or something instead of making the user create the table correctly with all the columns? 3) I think, error messages can be improved, which looks big to me. 4) I think no need of below pstrdup, As CStringGetTextDatum creates a text copy for the same:- err_code =3D pstrdup(unpack_sql_state(cstate->escontext->error_data->sqlerrcode)); t_values[9] =3D CStringGetTextDatum(err_code); 5) Should 'on_error_rel' as not NULL be checked along with below, as I can see it is passed as NULL from two locations? + if (cstate->opts.on_error =3D=3D COPY_ON_ERROR_TABLE) + { 6) Below declarations can be shifted to the if block, where they are used. Instead of keeping them as global in function? + HeapTuple on_error_tup; + TupleDesc on_error_tupDesc; + if (cstate->opts.on_error =3D=3D COPY_ON_ERROR_TABLE) + { 7) Below comment going beyond 80 char width:- * if on_error is specified with 'table', then on_error_rel is the error saving table 8) Need space after 'false' err_detail ? false: true; 9) Below call can fit in a single line. No need to keep the 2nd and 3rd parameter in nextlines. + on_error_tup =3D heap_form_tuple(on_error_tupDesc, + t_values, + t_isnull); 10) Variable declarations Tab Spacing issue at multiple places. 11) Comments in the patch are not matched to PG comment style. Kindly note I have not tested the patch properly yet. Only checked it with a positive test case. As I will wait for a conclusion on my opinion of the proposed patch. Regards, Nishant Sharma. EnterpriseDB, Pune. On Sat, Feb 3, 2024 at 11:52=E2=80=AFAM jian he wrote: > Hi. > I previously did some work in COPY FROM save error information to a table= . > still based on this suggestion: > https://www.postgresql.org/message-id/752672.1699474336%40sss.pgh.pa.us > Now I refactored it. > > the syntax: > ON_ERROR 'table', TABLE 'error_saving_tbl' > > if ON_ERROR is not specified with 'table', TABLE is specified, then error= . > if ON_ERROR is specified with 'table', TABLE is not specified or > error_saving_tbl does not exist, then error. > > In BeginCopyFrom, we check the data definition of error_saving_table, > we also check if the user has INSERT privilege to error_saving_table > (all the columns). > We also did a preliminary check of the lock condition of > error_saving_table. > > if it does not meet these conditions, we quickly error out. > error_saving_table will be the same schema as the copy from table. > > Because "table" is a keyword, I have to add the following changes to > gram.y. > --- a/src/backend/parser/gram.y > +++ b/src/backend/parser/gram.y > @@ -3420,6 +3420,10 @@ copy_opt_item: > { > $$ =3D makeDefElem("null", (Node *) makeString($3), @1); > } > + | TABLE opt_as Sconst > + { > + $$ =3D makeDefElem("table", (Node *) makeString($3), @1); > + } > > since "table" is already a keyword, so there is no influence on the > parsing speed? > > demo: > > create table err_tbl( > userid oid, -- the user oid while copy generated this entry > copy_tbl oid, --copy table > filename text, > lineno int8, > line text, > colname text, > raw_field_value text, > err_message text, > err_detail text, > errorcode text > ); > create table t_copy_tbl(a int, b int, c int); > COPY t_copy_tbl FROM STDIN WITH (delimiter ',', on_error 'table', > table err_tbl); > 1,2,a > \. > > table err_tbl \gx > -[ RECORD 1 ]---+------------------------------------------- > userid | 10 > copy_tbl | 17920 > filename | STDIN > lineno | 1 > line | 1,2,a > colname | c > raw_field_value | a > err_message | invalid input syntax for type integer: "a" > err_detail | > errorcode | 22P02 > --00000000000027d0bb061d42b344 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Thanks for the patch!

I was not able to= understand why we need a special error table for COPY?
Can't= we just log it in a new log error file for COPY in a proper format? Like
using table approach, PG would be unnecessarily be utilising its r= esources like
extra CPU to format the data in pages to store in i= ts table format, writing and
keeping the table in its store (whic= h may or may not be required), the user
would be required to make= sure it creates the error table with proper columns
to be used i= n COPY, etc..


Meanwhile, please= find some quick review comments:-

1) Patch needs re-base.

2)= If the columns of the error table are fixed, then why not create it intern= ally using
some function or something instead of making the user = create the table correctly
with all the columns?

3) I thin= k, error messages can be improved, which looks big to me.

4) I think= no need of below pstrdup, As CStringGetTextDatum creates a text copy for
the same:-
err_code =3D pstrdup(unpack_sql_state(cstate->esc= ontext->error_data->sqlerrcode));

t_values[9] =3D CStringGetTe= xtDatum(err_code);

5) Should 'on_error_rel' as not NULL be c= hecked along with below, as I can see it is
passed as NULL from t= wo locations?
+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 if (cst= ate->opts.on_error =3D=3D COPY_ON_ERROR_TABLE)
+ =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 {

6) Below declarations can be shifted = to the if block, where they are used. Instead of
keeping them as = global in function?
+ =C2=A0 HeapTuple =C2=A0 on_error_tup;
+ =C2=A0 = TupleDesc =C2=A0 on_error_tupDesc;

+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 if (cstate->opts.on_error =3D=3D COPY_ON_ERROR_TABLE)<= br>+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 {

7) Below com= ment going beyond 80 char width:-
* if on_error is specified with 't= able', then on_error_rel is the error saving table

8) Need space= after 'false'
err_detail ? false: true;

9) Below call ca= n fit in a single line. No need to keep the 2nd and 3rd parameter in
<= div>nextlines.
+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 on_error_tup =3D heap_form_tuple(on_error_tupDesc,
+ =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 t_values,
+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 t_isnull);
<= br>10) Variable declarations Tab Spacing issue at multiple places.

1= 1) Comments in the patch are not matched to PG comment style.

Kindly note I have not tested the patch properly yet. Only checked it = with a positive test
case. As I will wait for a conclusion on my = opinion of the proposed patch.


Rega= rds,
Nishant Sharma.
EnterpriseDB, Pune.

=

On Sat, Feb 3, 2024 at 11:52=E2=80=AFAM jian he <jian.universality@gmail.com> wrote:
Hi.
I previously did some work in COPY FROM save error information to a table.<= br> still based on this suggestion:
https://www.postgresql.org/me= ssage-id/752672.1699474336%40sss.pgh.pa.us
Now I refactored it.

the syntax:
ON_ERROR 'table', TABLE 'error_saving_tbl'

if ON_ERROR is not specified with 'table', TABLE is specified, then= error.
if ON_ERROR is specified with 'table', TABLE is not specified or error_saving_tbl does not exist, then error.

In BeginCopyFrom, we check the data definition of error_saving_table,
we also check if the user has INSERT privilege to error_saving_table
(all the columns).
We also did a preliminary check of the lock condition of error_saving_table= .

if it does not meet these conditions, we quickly error out.
error_saving_table will be the same schema as the copy from table.

Because "table" is a keyword, I have to add the following changes= to gram.y.
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3420,6 +3420,10 @@ copy_opt_item:
=C2=A0 {
=C2=A0 $$ =3D makeDefElem("null", (Node *) makeString($3), @1); =C2=A0 }
+ | TABLE opt_as Sconst
+ {
+ $$ =3D makeDefElem("table", (Node *) makeString($3), @1);
+ }

since "table" is already a keyword, so there is no influence on t= he
parsing speed?

demo:

create table err_tbl(
=C2=A0 =C2=A0 userid oid, -- the user oid while copy generated this entry =C2=A0 =C2=A0 copy_tbl oid, --copy table
=C2=A0 =C2=A0 filename text,
=C2=A0 =C2=A0 lineno=C2=A0 int8,
=C2=A0 =C2=A0 line=C2=A0 =C2=A0 text,
=C2=A0 =C2=A0 colname text,
=C2=A0 =C2=A0 raw_field_value text,
=C2=A0 =C2=A0 err_message text,
=C2=A0 =C2=A0 err_detail text,
=C2=A0 =C2=A0 errorcode text
);
create table t_copy_tbl(a int, b int, c int);
COPY t_copy_tbl FROM STDIN WITH (delimiter ',', on_error 'table= ',
table err_tbl);
1,2,a
\.

table err_tbl \gx
-[ RECORD 1 ]---+-------------------------------------------
userid=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | 10
copy_tbl=C2=A0 =C2=A0 =C2=A0 =C2=A0 | 17920
filename=C2=A0 =C2=A0 =C2=A0 =C2=A0 | STDIN
lineno=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | 1
line=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 | 1,2,a
colname=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0| c
raw_field_value | a
err_message=C2=A0 =C2=A0 =C2=A0| invalid input syntax for type integer: &qu= ot;a"
err_detail=C2=A0 =C2=A0 =C2=A0 |
errorcode=C2=A0 =C2=A0 =C2=A0 =C2=A0| 22P02
--00000000000027d0bb061d42b344--