public inbox for [email protected]  
help / color / mirror / Atom feed
From: torikoshia <[email protected]>
To: jian he <[email protected]>
Cc: Masahiko Sawada <[email protected]>
Cc: vignesh C <[email protected]>
Cc: Jim Jones <[email protected]>
Cc: Kirill Reshke <[email protected]>
Cc: Fujii Masao <[email protected]>
Cc: David G. Johnston <[email protected]>
Cc: Yugo NAGATA <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
Date: Tue, 01 Jul 2025 23:54:15 +0900
Message-ID: <[email protected]> (raw)
In-Reply-To: <CACJufxGEHmijmP-QqvrmqU6cxmhgpdjY7ewQBQ=E9NmdyEcqmw@mail.gmail.com>
References: <CAKFQuwawy1e6YR4S=j+y7pXqg_Dw1WBVrgvf=BP3d1_aSfe_+Q@mail.gmail.com>
	<CACJufxGsGqU0BnHw9+Vk3KhBSA3951fZs+JQnM387CiJD3Qn+g@mail.gmail.com>
	<CALdSSPh13qiKmhqwj=bR_3seZJVkP9E6BDbLfHXQ_DbaxQL4FA@mail.gmail.com>
	<CACJufxEq9aQX6ddkHeEX-RqjTZfDF02BLWGVS28ixRQQxoz0LA@mail.gmail.com>
	<CALdSSPjiR8cqPrJOjoGPvTnDufuWYj94-1whCGtGox=_QeZQEA@mail.gmail.com>
	<CACJufxEKq25CvQj4rYxcvHC_0MZQxzfwEKVWo9i-bKKxbVw3BA@mail.gmail.com>
	<CALdSSPgcuW1omVFUpnL6H63H+2qvtvEkD+4H3jUh1iEaNn_nEg@mail.gmail.com>
	<CACJufxFTvtMy_aYFQPtcBwVwJ-2=rJRVE0SBWv6efv86J5DyAQ@mail.gmail.com>
	<CALdSSPhxprTyzzr4tk-QU-GUroBKYNuJJ2GxWnExbB=bjQjJ7w@mail.gmail.com>
	<CACJufxHpTC-kqmW0LKNd_ZgnPJkhiAebPQkX2YE-CVxBP4LDfg@mail.gmail.com>
	<CALdSSPjpi0w6US75_st_-zOdUfr5q2vXmh08x0pL4FGJGYZvsQ@mail.gmail.com>
	<CACJufxE420KH=SLQSMEmi8QfEhXgVL8kGkuZJTviZ0o=TEVQgw@mail.gmail.com>
	<[email protected]>
	<CACJufxFej2xUCERoWU+Eb+R+L_xzMNq4ZERNtL=SPF8O0T0ygw@mail.gmail.com>
	<[email protected]>
	<CACJufxHj0PejJC8BMq1cCiFC5uFdnazqrHh_VEm=s65-cNSnwQ@mail.gmail.com>
	<CALDaNm0Q3hA9w76gvB=goOY4ktKb-wjrbZbOcrqSwvxdm0yrcw@mail.gmail.com>
	<CACJufxEueK_MFebYqp73e+P+ykbD4+QbONptMg=iaaxUEGU7EQ@mail.gmail.com>
	<CALDaNm0FUtPjH9wz5dgwOBNtsAXEZj=0-TYGsyQXchQy-hXGrw@mail.gmail.com>
	<CACJufxFpm2Gzx4AD9qKtiQSiiiunX02wNTxu0JoFm7nEKF2KUw@mail.gmail.com>
	<CAD21AoBjmEYjZT9A6P-vAuwboiiYtXaMc12pX2ySmh3RXi=v8w@mail.gmail.com>
	<CACJufxF6_YwAboiCaVYLRtNpO4kGbXqkXzH_7W=pUvrNXK8WuQ@mail.gmail.com>
	<CAD21AoBWwEzpp3Z6tp-O-AQGftK-kuj6vRva2L5daWoWBtbnRg@mail.gmail.com>
	<CACJufxGEHmijmP-QqvrmqU6cxmhgpdjY7ewQBQ=E9NmdyEcqmw@mail.gmail.com>

Hi,

Thanks for updating the patch and I've read 
v17-0001-COPY-on_error-set_null.patch and here are some comments.

> +COPY x from stdin (on_error set_null, reject_limit 2);
> +ERROR:  COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE

I understand that REJECT_LIMIT is out of scope for this patch, but 
personally, I feel that supporting REJECT_LIMIT with ON_ERROR SET_NULL 
would be a natural extension.
- Both IGNORE and SET_NULL share the common behavior of allowing COPY to 
continue despite soft errors.
- Since REJECT_LIMIT defines the threshold for how many soft errors can 
be tolerated before COPY fails, it seems consistent to allow it with 
SET_NULL as well.


+       if (current_row_erroneous)
+               cstate->num_errors++;

Is there any reason this error counting isn't placed inside the "if 
(cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)" block?
As far as I can tell, current_row_erroneous is only modified within that 
block, so it might make sense to keep this logic together for clarity.


These may be very minor, but I noticed a few inconsistencies in casing 
and wording:

+                * If ON_ERROR is specified with IGNORE, skip rows with 
soft errors.
+                * If ON_ERROR is specified with set_null, try to 
replace with null.

IGNORE is in uppercase, but set_null is lowercase.

+                                * we use it to count number of rows 
(not fields!) that
+                                * successfully applied on_error 
set_null.

The sentence should begin with a capital: "We use it..."
Also, I felt it's unclear what "we use it" means. Does it necessary?


+COPY x to stdout (on_error set_null);
+ERROR:  COPY ON_ERROR cannot be used with COPY TO
+LINE 1: COPY x to stdout (on_error set_null);

COPY is uppercase, but to is lowercase.


+COPY x from stdin (format BINARY, on_error set_null);
+ERROR:  only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (on_error set_null, reject_limit 2);
+ERROR:  COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
...
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+ERROR:  domain d_int_not_null does not allow null values
+CONTEXT:  COPY t_on_error_null, line 1, column a: null input

It might be better to consider standardizing casing across all COPY 
statements (e.g., COPY ... TO, COPY ... FROM STDIN) for consistency.


-- 
Regards,

--
Atsushi Torikoshi
Seconded from NTT DATA Japan Corporation to SRA OSS K.K.





view thread (30+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox