public inbox for [email protected]  
help / color / mirror / Atom feed
From: Jim Jones <[email protected]>
To: jian he <[email protected]>
To: Kirill Reshke <[email protected]>
Cc: Fujii Masao <[email protected]>
Cc: David G. Johnston <[email protected]>
Cc: Yugo NAGATA <[email protected]>
Cc: torikoshia <[email protected]>
Cc: PostgreSQL Hackers <[email protected]>
Subject: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
Date: Tue, 11 Mar 2025 11:31:06 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <CACJufxE420KH=SLQSMEmi8QfEhXgVL8kGkuZJTviZ0o=TEVQgw@mail.gmail.com>
References: <CAKFQuwawy1e6YR4S=j+y7pXqg_Dw1WBVrgvf=BP3d1_aSfe_+Q@mail.gmail.com>
	<[email protected]>
	<CALdSSPjYw5g7_sc++bRcxOnC7jW6O2qiSkgdKRUYFXZZv3-Ktw@mail.gmail.com>
	<CACJufxGsGqU0BnHw9+Vk3KhBSA3951fZs+JQnM387CiJD3Qn+g@mail.gmail.com>
	<CALdSSPh13qiKmhqwj=bR_3seZJVkP9E6BDbLfHXQ_DbaxQL4FA@mail.gmail.com>
	<CACJufxEq9aQX6ddkHeEX-RqjTZfDF02BLWGVS28ixRQQxoz0LA@mail.gmail.com>
	<CALdSSPjiR8cqPrJOjoGPvTnDufuWYj94-1whCGtGox=_QeZQEA@mail.gmail.com>
	<CACJufxEKq25CvQj4rYxcvHC_0MZQxzfwEKVWo9i-bKKxbVw3BA@mail.gmail.com>
	<CALdSSPgcuW1omVFUpnL6H63H+2qvtvEkD+4H3jUh1iEaNn_nEg@mail.gmail.com>
	<CACJufxFTvtMy_aYFQPtcBwVwJ-2=rJRVE0SBWv6efv86J5DyAQ@mail.gmail.com>
	<CALdSSPhxprTyzzr4tk-QU-GUroBKYNuJJ2GxWnExbB=bjQjJ7w@mail.gmail.com>
	<CACJufxHpTC-kqmW0LKNd_ZgnPJkhiAebPQkX2YE-CVxBP4LDfg@mail.gmail.com>
	<CALdSSPjpi0w6US75_st_-zOdUfr5q2vXmh08x0pL4FGJGYZvsQ@mail.gmail.com>
	<CACJufxE420KH=SLQSMEmi8QfEhXgVL8kGkuZJTviZ0o=TEVQgw@mail.gmail.com>

Hi Jian

On 07.03.25 04:48, jian he wrote:
> hi.
> rebase only.


I revisited this patch today. It applies and builds cleanly, and it
works as expected.

Some tests and minor comments:

====

1) WARNING might be a better fit than NOTICE here.

postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error set_to_null, format csv);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
NOTICE:  erroneous values in 3 rows were replaced with null
COPY 5
postgres=# SELECT * FROM t;
  x   |  y   |  z   
------+------+------
    1 | NULL | NULL
    2 |    1 | NULL
    3 |    2 | NULL
    4 | NULL | NULL
 NULL | NULL | NULL
(5 rows)


postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error set_to_null, format
csv, log_verbosity verbose);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
NOTICE:  column "y" was set to null due to data type incompatibility at
line 1
NOTICE:  column "y" was set to null due to data type incompatibility at
line 4
NOTICE:  column "x" was set to null due to data type incompatibility at
line 5
NOTICE:  column "y" was set to null due to data type incompatibility at
line 5
NOTICE:  erroneous values in 3 rows were replaced with null
COPY 5
postgres=# SELECT * FROM t;
  x   |  y   |  z   
------+------+------
    1 | NULL | NULL
    2 |    1 | NULL
    3 |    2 | NULL
    4 | NULL | NULL
 NULL | NULL | NULL
(5 rows)


I would still leave the extra messages from "log_verbosity verbose" as
NOTICE though. What do you think?

====

2) Inconsistent terminology. Invalid values in "on_error set_to_null"
mode are names as "erroneous", but as "invalid" in "on_error stop" mode.
I don't want to get into the semantics of erroneous or invalid, but
sticking to one terminology would IMHO look better.

postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error stop, format csv,
log_verbosity verbose);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
ERROR:  invalid input syntax for type integer: "a"
CONTEXT:  COPY t, line 1, column y: "a"
postgres=# SELECT * FROM t;
 x | y | z
---+---+---
(0 rows)

====

3) same as in 1)

postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error ignore, format csv,
log_verbosity verbose);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
NOTICE:  skipping row due to data type incompatibility at line 1 for
column "y": "a"
NOTICE:  skipping row due to data type incompatibility at line 4 for
column "y": "b"
NOTICE:  skipping row due to data type incompatibility at line 5 for
column "x": "a"
NOTICE:  3 rows were skipped due to data type incompatibility
COPY 2
postgres=# SELECT * FROM t;
 x | y |  z   
---+---+------
 2 | 1 | NULL
 3 | 2 | NULL
(2 rows)====

====

"on_error ignore" works well with "reject_limit #"

postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error ignore, format csv,
log_verbosity verbose, reject_limit 1);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
NOTICE:  skipping row due to data type incompatibility at line 1 for
column "y": "a"
NOTICE:  skipping row due to data type incompatibility at line 4 for
column "y": "b"
ERROR:  skipped more than REJECT_LIMIT (1) rows due to data type
incompatibility
CONTEXT:  COPY t, line 4, column y: "b"
postgres=# SELECT * FROM t;
 x | y | z
---+---+---
(0 rows)

best regards, Jim





view thread (30+ messages)  latest in thread

reply

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Reply to all the recipients using the --to and --cc options:
  reply via email

  To: [email protected]
  Cc: [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]
  Subject: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
  In-Reply-To: <[email protected]>

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox