Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1t8Gpm-0036A0-Cc for pgsql-hackers@arkaria.postgresql.org; Tue, 05 Nov 2024 10:30:49 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1t8Gpj-00ByqP-NG for pgsql-hackers@arkaria.postgresql.org; Tue, 05 Nov 2024 10:30:48 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1t8Gpj-00ByqE-B8 for pgsql-hackers@lists.postgresql.org; Tue, 05 Nov 2024 10:30:47 +0000 Received: from mail-ua1-x92f.google.com ([2607:f8b0:4864:20::92f]) by magus.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 (Exim 4.94.2) (envelope-from ) id 1t8Gpf-000Iko-BL for pgsql-hackers@lists.postgresql.org; Tue, 05 Nov 2024 10:30:46 +0000 Received: by mail-ua1-x92f.google.com with SMTP id a1e0cc1a2514c-84ff43e87cbso1786045241.1 for ; Tue, 05 Nov 2024 02:30:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=enterprisedb.com; s=google; t=1730802642; x=1731407442; darn=lists.postgresql.org; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date:message-id:reply-to; bh=MvHhGJPhqvF82S6KgA4L9NkjMk6g6kkbwOHZTNQVIqw=; b=RNX5LA/Sq/Lz+oy9xBbNssi77smzDgMvQTf/k0q0P3/rWLE4GFRMhUFycoPakR7/pH IfbAp0iRQX4ewuIxTAulu+dgXt0/IOJrgihCKa2zM6I41taR3AwdIoaPFLrmSqGdPUpm KDW8YcueB36h/KKGDVcyCJmuWvmMO1N4mkkzFs9Vpk2/vPei5YmQSL2iybgf0zfyXFVb wdJg2NxcprgMEZVFOhj7nOPlj7VyL6wT/AR/gbsX7djXFOjyXyz45QJj7XR+FdwSw6cR 5skDxbve11QIk09RcmdpISthiAOFaTV7VAs9/k6uf1b4lIT3/aT9msr0O0Q+9MbnJg+s rMTQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1730802642; x=1731407442; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=MvHhGJPhqvF82S6KgA4L9NkjMk6g6kkbwOHZTNQVIqw=; b=ZYdCKpiqJPA4HzQJarr2rObUT5RCDoMsboA3alSV/hveoISIGrpOTzz/PZ5hGnbn+3 rlPMQ+MuoqE7wGOWfFaZiovj1rhAkI9ZdglXoy2B/pdC+lzZT5G4FjNJ4Hm4C7COwWjf QuJ/uokx9ga1Xrr+u5N+BRcM5CytpoErbMWJUL53Oaf7m3sa3ic1OzbvWAi1+L237IWl zVA51YPNPS13X4QCADPkXw18FNVFM+pR05ZqUmnxZvafaPedF40HXM+1e5XMLGMYX4tx aSDGeICgGiGOxBn98awQyfUv9cMOgYWEMkYJxtPLUXjhXsR5bPibrHMi9PFJ40qN7WGQ lqzA== X-Gm-Message-State: AOJu0YxVMql1Q8tgbqZozrWpazjeSlq2dQyf7ZIi1Er2yO5v1WBdv+qR MUG5dlQaq9YJVvB+Jfu46BKBJQf2lI6SiIPPY8X7CWUjzvDtkvOCELwfKNz1jF2k4iBAd1YUxsn kZ5vJ7CzA63DbGu5PgVlB5Bw0MhXtOS8o3eHD X-Google-Smtp-Source: AGHT+IF5MGUmTUQEYOxug/1pHJp3SbY3PHUK1mWb1y9PjoY++sfKr6vWP1J4njasNfYZqrfvPw7PI1C5LcYHGjJPjJE= X-Received: by 2002:a05:6102:290b:b0:4a5:ad80:a9e1 with SMTP id ada2fe7eead31-4a962f3705emr12826407137.28.1730802641960; Tue, 05 Nov 2024 02:30:41 -0800 (PST) MIME-Version: 1.0 References: In-Reply-To: From: Nishant Sharma Date: Tue, 5 Nov 2024 16:00:30 +0530 Message-ID: Subject: Re: on_error table, saving error info to a table To: jian he Cc: PostgreSQL Hackers Content-Type: multipart/alternative; boundary="000000000000a69fbd062627e420" List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk --000000000000a69fbd062627e420 Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Thanks for the v2 patch! I see v1 review comments got addressed in v2 along with some further improvements. 1) v2 Patch again needs re-base. 2) I think we need not worry whether table name is unique or not, table name can be provided by user and we can check if it does not exists then simply we can create it with appropriate columns, if it exists we use it to check if its correct on_error table and proceed. 3) Using #define in between the code? I don't see that style in copyfromparse.c file. I do see such style in other src file. So, not sure if committer would allow it or not. #define ERROR_TBL_COLUMNS 10 4) Below appears redundant to me, it was not the case in v1 patch set, where it had only one return and one increment of error as new added code was at the end of the block:- + cstate->num_errors++; + return true; + } cstate->num_errors++; I was not able to test the v2 due to conflicts in v2, I did attempt to resolve but I saw many failures in make world. Regards, Nishant. On Tue, Aug 20, 2024 at 5:31=E2=80=AFAM jian he wrote: > On Mon, Jul 15, 2024 at 1:42=E2=80=AFPM Nishant Sharma > wrote: > > > > Thanks for the patch! > > > > I was not able to understand why we need a special error table for COPY= ? > > Can't we just log it in a new log error file for COPY in a proper > format? Like > > using table approach, PG would be unnecessarily be utilising its > resources like > > extra CPU to format the data in pages to store in its table format, > writing and > > keeping the table in its store (which may or may not be required), the > user > > would be required to make sure it creates the error table with proper > columns > > to be used in COPY, etc.. > > > > > > Meanwhile, please find some quick review comments:- > > > > 1) Patch needs re-base. > > > > 2) If the columns of the error table are fixed, then why not create it > internally using > > some function or something instead of making the user create the table > correctly > > with all the columns? > > I'll think about it more. > previously, i tried to use SPI to create tables, but at that time, i > thought that's kind of excessive. > you need to create the table, check whether the table name is unique, > check the privilege. > now we quickly error out if the error saving table definition does not > meet. I guess that's less work to do. > > > > 3) I think, error messages can be improved, which looks big to me. > > > > 4) I think no need of below pstrdup, As CStringGetTextDatum creates a > text copy for > > the same:- > > err_code =3D > pstrdup(unpack_sql_state(cstate->escontext->error_data->sqlerrcode)); > > > > t_values[9] =3D CStringGetTextDatum(err_code); > > > > 5) Should 'on_error_rel' as not NULL be checked along with below, as I > can see it is > > passed as NULL from two locations? > > + if (cstate->opts.on_error =3D=3D COPY_ON_ERROR_TABLE) > > + { > > > > 6) Below declarations can be shifted to the if block, where they are > used. Instead of > > keeping them as global in function? > > + HeapTuple on_error_tup; > > + TupleDesc on_error_tupDesc; > > > > + if (cstate->opts.on_error =3D=3D COPY_ON_ERROR_TABLE) > > + { > > > > 7) Below comment going beyond 80 char width:- > > * if on_error is specified with 'table', then on_error_rel is the error > saving table > > > > 8) Need space after 'false' > > err_detail ? false: true; > > > > 9) Below call can fit in a single line. No need to keep the 2nd and 3rd > parameter in > > nextlines. > > + on_error_tup =3D heap_form_tuple(on_error_tupDesc, > > + t_values, > > + t_isnull); > > > > 10) Variable declarations Tab Spacing issue at multiple places. > > > > 11) Comments in the patch are not matched to PG comment style. > > > > > > Kindly note I have not tested the patch properly yet. Only checked it > with a positive test > > case. As I will wait for a conclusion on my opinion of the proposed > patch. > > > almost all these issues have been addressed. > The error messages are still not so good. I need to polish it. > --000000000000a69fbd062627e420 Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
Thanks for the v2 patch!

I see v1 revie= w comments got addressed in v2 along with some
further improvemen= ts.

1) v2 Patch again needs re-base.
2) I think we need not worry whether table name is unique or no= t,
table name can be provided by user and we can check if it does=
not exists then simply we can create it with appropriate columns= ,
if it exists we use it to check if its correct on_error table a= nd
proceed.

3) Using #define in between = the code? I don't see that style in
copyfromparse.c file. I d= o see such style in other src file. So, not
sure if committer wou= ld allow it or not.
#define ERROR_TBL_COLUMNS =C2=A0 10

4)=C2=A0Below appears redundant to me, it was not the case in v1 p= atch
set, where it had only one return and one increment of error= as new
added code was at the end of the block:-
+ =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 cstate->num_= errors++;
+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 return true;
+ =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 }=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 cstate->num_err= ors++;


I was not able to test the v= 2 due to conflicts in v2, I did attempt to
resolve but I saw many= failures in make world.


Regards,
Nishant.

On Tue, Aug 20, 2024 at 5:31=E2=80=AFAM jian he <= jian.universality@gmail.com<= /a>> wrote:
O= n Mon, Jul 15, 2024 at 1:42=E2=80=AFPM Nishant Sharma
<
ni= shant.sharma@enterprisedb.com> wrote:
>
> Thanks for the patch!
>
> I was not able to understand why we need a special error table for COP= Y?
> Can't we just log it in a new log error file for COPY in a proper = format? Like
> using table approach, PG would be unnecessarily be utilising its resou= rces like
> extra CPU to format the data in pages to store in its table format, wr= iting and
> keeping the table in its store (which may or may not be required), the= user
> would be required to make sure it creates the error table with proper = columns
> to be used in COPY, etc..
>
>
> Meanwhile, please find some quick review comments:-
>
> 1) Patch needs re-base.
>
> 2) If the columns of the error table are fixed, then why not create it= internally using
> some function or something instead of making the user create the table= correctly
> with all the columns?

I'll think about it more.
previously, i tried to use SPI to create tables, but at that time, i
thought that's kind of excessive.
you need to create the table, check whether the table name is unique,
check the privilege.
now we quickly error out if the error saving table definition does not
meet. I guess that's less work to do.


> 3) I think, error messages can be improved, which looks big to me.
>
> 4) I think no need of below pstrdup, As CStringGetTextDatum creates a = text copy for
> the same:-
> err_code =3D pstrdup(unpack_sql_state(cstate->escontext->error_d= ata->sqlerrcode));
>
> t_values[9] =3D CStringGetTextDatum(err_code);
>
> 5) Should 'on_error_rel' as not NULL be checked along with bel= ow, as I can see it is
> passed as NULL from two locations?
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (cstate->= ;opts.on_error =3D=3D COPY_ON_ERROR_TABLE)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0{
>
> 6) Below declarations can be shifted to the if block, where they are u= sed. Instead of
> keeping them as global in function?
> +=C2=A0 =C2=A0HeapTuple=C2=A0 =C2=A0on_error_tup;
> +=C2=A0 =C2=A0TupleDesc=C2=A0 =C2=A0on_error_tupDesc;
>
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (cstate->= ;opts.on_error =3D=3D COPY_ON_ERROR_TABLE)
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0{
>
> 7) Below comment going beyond 80 char width:-
> * if on_error is specified with 'table', then on_error_rel is = the error saving table
>
> 8) Need space after 'false'
> err_detail ? false: true;
>
> 9) Below call can fit in a single line. No need to keep the 2nd and 3r= d parameter in
> nextlines.
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= on_error_tup =3D heap_form_tuple(on_error_tupDesc,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0t_values,
> +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0= =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0t_isnull);
>
> 10) Variable declarations Tab Spacing issue at multiple places.
>
> 11) Comments in the patch are not matched to PG comment style.
>
>
> Kindly note I have not tested the patch properly yet. Only checked it = with a positive test
> case. As I will wait for a conclusion on my opinion of the proposed pa= tch.
>
almost all these issues have been addressed.
The error messages are still not so good. I need to polish it.
--000000000000a69fbd062627e420--