public inbox for [email protected]
help / color / mirror / Atom feedChange COPY ... ON_ERROR ignore to ON_ERROR ignore_row
31+ messages / 9 participants
[nested] [flat]
* Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2024-01-26 15:08 David G. Johnston <[email protected]>
0 siblings, 3 replies; 31+ messages in thread
From: David G. Johnston @ 2024-01-26 15:08 UTC (permalink / raw)
To: PostgreSQL Hackers <[email protected]>
Hi,
The option choice of "ignore" in the COPY ON_ERROR clause seems overly
generic. There would seem to be two relevant ways to ignore bad column
input data - drop the entire row or just set the column value to null. I
can see us wanting to provide the set to null option and in any case having
the option name be explicit that it ignores the row seems like a good idea.
David J.
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2024-01-28 23:50 jian he <[email protected]>
parent: David G. Johnston <[email protected]>
2 siblings, 1 reply; 31+ messages in thread
From: jian he @ 2024-01-28 23:50 UTC (permalink / raw)
To: David G. Johnston <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Fri, Jan 26, 2024 at 11:09 PM David G. Johnston
<[email protected]> wrote:
>
> Hi,
>
> The option choice of "ignore" in the COPY ON_ERROR clause seems overly generic. There would seem to be two relevant ways to ignore bad column input data - drop the entire row or just set the column value to null. I can see us wanting to provide the set to null option and in any case having the option name be explicit that it ignores the row seems like a good idea.
two issue I found out while playing around with it;
create table x1(a int not null, b int not null );
you can only do:
COPY x1 from stdin (on_error 'null');
but you cannot do
COPY x1 from stdin (on_error null);
we need to hack the gram.y to escape the "null". I don't know how to
make it work.
related post I found:
https://stackoverflow.com/questions/31786611/how-to-escape-flex-keyword
another issue:
COPY x1 from stdin (on_error null);
when we already have `not null` top level constraint for table x1.
Do we need an error immediately?
"on_error null" seems to conflict with `not null` constraint (assume
refers to the same column).
it may fail while doing bulk inserts while on_error is set to null
because of violating a not null constraint.
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2024-01-29 00:19 David G. Johnston <[email protected]>
parent: jian he <[email protected]>
0 siblings, 0 replies; 31+ messages in thread
From: David G. Johnston @ 2024-01-29 00:19 UTC (permalink / raw)
To: jian he <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Sun, Jan 28, 2024 at 4:51 PM jian he <[email protected]> wrote:
> On Fri, Jan 26, 2024 at 11:09 PM David G. Johnston
> <[email protected]> wrote:
> >
> > Hi,
> >
> > The option choice of "ignore" in the COPY ON_ERROR clause seems overly
> generic. There would seem to be two relevant ways to ignore bad column
> input data - drop the entire row or just set the column value to null. I
> can see us wanting to provide the set to null option and in any case having
> the option name be explicit that it ignores the row seems like a good idea.
>
> two issue I found out while playing around with it;
> create table x1(a int not null, b int not null );
>
> another issue:
> COPY x1 from stdin (on_error null);
>
> when we already have `not null` top level constraint for table x1.
> Do we need an error immediately?
> "on_error null" seems to conflict with `not null` constraint (assume
> refers to the same column).
> it may fail while doing bulk inserts while on_error is set to null
> because of violating a not null constraint.
>
You should not error immediately since whether or not there is a problem is
table and data dependent. I would not check for the case of all columns
being defined not null and just let the mismatch happen.
That said, maybe with this being a string we can accept something like:
'null, ignore'
And so if attempting to place any one null fails, assuming we can make that
a soft error too, we would then ignore the entire row.
David J.
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2024-01-29 08:28 Yugo NAGATA <[email protected]>
parent: David G. Johnston <[email protected]>
2 siblings, 0 replies; 31+ messages in thread
From: Yugo NAGATA @ 2024-01-29 08:28 UTC (permalink / raw)
To: David G. Johnston <[email protected]>; +Cc: PostgreSQL Hackers <[email protected]>
On Fri, 26 Jan 2024 08:08:29 -0700
"David G. Johnston" <[email protected]> wrote:
> Hi,
>
> The option choice of "ignore" in the COPY ON_ERROR clause seems overly
> generic. There would seem to be two relevant ways to ignore bad column
> input data - drop the entire row or just set the column value to null. I
> can see us wanting to provide the set to null option and in any case having
> the option name be explicit that it ignores the row seems like a good idea.
I am not in favour of renaming the option name "ignore", instead we can
use another style of name for the option to set the column value to NULL,
for example, "set_to_null".
(Maybe, we can make a more generic option "set_to (col, val)" that can set
the value of column specified by "col" value to the specified value "val"
(e.g. 'N/A') on a soft error, although the syntax would be a bit complex...)
IMO, it is more simple to define "ignore" as to skip the entire row rather
than having variety of "ignore". Once defined it so, the option to set the
column value to NULL should not be called "ignore" because values in other
columns will be inserted.
Regards,
Yugo Nagata
>
> David J.
--
Yugo NAGATA <[email protected]>
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2025-03-07 03:48 jian he <[email protected]>
parent: David G. Johnston <[email protected]>
2 siblings, 1 reply; 31+ messages in thread
From: jian he @ 2025-03-07 03:48 UTC (permalink / raw)
To: Kirill Reshke <[email protected]>; +Cc: Fujii Masao <[email protected]>; Jim Jones <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; torikoshia <[email protected]>; PostgreSQL Hackers <[email protected]>
hi.
rebase only.
Attachments:
[text/x-patch] v12-0001-COPY-on_error-set_to_null.patch (19.8K, 2-v12-0001-COPY-on_error-set_to_null.patch)
download | inline diff:
From ce0ce6438094cad553e509db65b7fd27de2b9af6 Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Fri, 7 Mar 2025 11:43:51 +0800
Subject: [PATCH v12 1/1] COPY (on_error set_to_null)
Extent "on_error action", introduce new option: on_error set_to_null.
Current grammar makes us unable to use "on_error null", so we choose "on_error set_to_null".
Any data type conversion errors during the COPY FROM process will result in the
affected column being set to NULL. This only applicable when using the
non-binary format for COPY FROM. However, the not-null constraint will still be
enforced. If a conversion error leads to a NULL value in a column that has a
not-null constraint, a not-null constraint violation error will be triggered.
A regression test for a domain with a not-null constraint has been added.
Author: Jian He <[email protected]>
Author: Kirill Reshke <[email protected]>
Reviewed-by:
Fujii Masao <[email protected]>
Jim Jones <[email protected]>
"David G. Johnston" <[email protected]>
Yugo NAGATA <[email protected]>
torikoshia <[email protected]>
discussion: https://postgr.es/m/CAKFQuwawy1e6YR4S=j+y7pXqg_Dw1WBVrgvf=BP3d1_aSfe_+Q@mail.gmail.com
---
doc/src/sgml/ref/copy.sgml | 34 +++++++++++-----
src/backend/commands/copy.c | 6 ++-
src/backend/commands/copyfrom.c | 29 +++++++++-----
src/backend/commands/copyfromparse.c | 43 +++++++++++++++++++-
src/bin/psql/tab-complete.in.c | 2 +-
src/include/commands/copy.h | 1 +
src/test/regress/expected/copy2.out | 60 ++++++++++++++++++++++++++++
src/test/regress/sql/copy2.sql | 46 +++++++++++++++++++++
8 files changed, 196 insertions(+), 25 deletions(-)
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index df093da97c5..91bc25b9ab3 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -394,23 +394,34 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
Specifies how to behave when encountering an error converting a column's
input value into its data type.
An <replaceable class="parameter">error_action</replaceable> value of
- <literal>stop</literal> means fail the command, while
- <literal>ignore</literal> means discard the input row and continue with the next one.
+ <literal>stop</literal> means fail the command,
+ <literal>ignore</literal> means discard the input row and continue with the next one, and
+ <literal>set_to_null</literal> means replace columns containing erroneous input values with
+ <literal>NULL</literal> and move to the next field.
The default is <literal>stop</literal>.
</para>
<para>
- The <literal>ignore</literal> option is applicable only for <command>COPY FROM</command>
+ The <literal>ignore</literal> and <literal>set_to_null</literal>
+ options are applicable only for <command>COPY FROM</command>
when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
</para>
<para>
- A <literal>NOTICE</literal> message containing the ignored row count is
- emitted at the end of the <command>COPY FROM</command> if at least one
- row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
- <literal>verbose</literal>, a <literal>NOTICE</literal> message
+ For <literal>ignore</literal> option,
+ a <literal>NOTICE</literal> message containing the ignored row count is
+ emitted at the end of the <command>COPY FROM</command> if at least one row was discarded.
+ For <literal>set_to_null</literal> option,
+ a <literal>NOTICE</literal> message containing the row count that erroneous input values replaced by to null
+ happened is emitted at the end of the <command>COPY FROM</command> if at least one row was replaced.
+ </para>
+ <para>
+ When <literal>LOG_VERBOSITY</literal> option is set to <literal>verbose</literal>,
+ for <literal>ignore</literal> option, a <literal>NOTICE</literal> message
containing the line of the input file and the column name whose input
- conversion has failed is emitted for each discarded row.
- When it is set to <literal>silent</literal>, no message is emitted
- regarding ignored rows.
+ conversion has failed is emitted for each discarded row;
+ for <literal>set_to_null</literal> option,
+ a <literal>NOTICE</literal> message containing the line of the input file and the column name
+ where value was replaced with <literal>NULL</literal> for each input conversion failure.
+ When it is set to <literal>silent</literal>, no message is emitted regarding input conversion failed rows.
</para>
</listitem>
</varlistentry>
@@ -458,7 +469,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
</para>
<para>
This is currently used in <command>COPY FROM</command> command when
- <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+ <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>
+ or <literal>set_to_null</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index cfca9d9dc29..afe60758d40 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -403,12 +403,14 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
parser_errposition(pstate, def->location)));
/*
- * Allow "stop", or "ignore" values.
+ * Allow "stop", "ignore", "set_to_null" values.
*/
if (pg_strcasecmp(sval, "stop") == 0)
return COPY_ON_ERROR_STOP;
if (pg_strcasecmp(sval, "ignore") == 0)
return COPY_ON_ERROR_IGNORE;
+ if (pg_strcasecmp(sval, "set_to_null") == 0)
+ return COPY_ON_ERROR_NULL;
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -918,7 +920,7 @@ ProcessCopyOptions(ParseState *pstate,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("only ON_ERROR STOP is allowed in BINARY mode")));
- if (opts_out->reject_limit && !opts_out->on_error)
+ if (opts_out->reject_limit && opts_out->on_error != COPY_ON_ERROR_IGNORE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first and second %s are the names of COPY option, e.g.
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index bcf66f0adf8..4502ce2d366 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1467,14 +1467,22 @@ CopyFrom(CopyFromState cstate)
/* Done, clean up */
error_context_stack = errcallback.previous;
- if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
- cstate->num_errors > 0 &&
+ if (cstate->num_errors > 0 &&
cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
- ereport(NOTICE,
- errmsg_plural("%llu row was skipped due to data type incompatibility",
- "%llu rows were skipped due to data type incompatibility",
- (unsigned long long) cstate->num_errors,
- (unsigned long long) cstate->num_errors));
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg_plural("%llu row was skipped due to data type incompatibility",
+ "%llu rows were skipped due to data type incompatibility",
+ (unsigned long long) cstate->num_errors,
+ (unsigned long long) cstate->num_errors));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_NULL)
+ ereport(NOTICE,
+ errmsg_plural("erroneous values in %llu row was replaced with null",
+ "erroneous values in %llu rows were replaced with null",
+ (unsigned long long) cstate->num_errors,
+ (unsigned long long) cstate->num_errors));
+ }
if (bistate != NULL)
FreeBulkInsertState(bistate);
@@ -1622,10 +1630,11 @@ BeginCopyFrom(ParseState *pstate,
cstate->escontext->error_occurred = false;
/*
- * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
- * options later
+ * Currently we only support COPY_ON_ERROR_IGNORE, COPY_ON_ERROR_NULL.
+ * We'll add other options later
*/
- if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE ||
+ cstate->opts.on_error == COPY_ON_ERROR_NULL)
cstate->escontext->details_wanted = false;
}
else
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index e8128f85e6b..e2b4d1f7ec9 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -947,6 +947,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
int fldct;
int fieldno;
char *string;
+ bool current_row_erroneous = false;
tupDesc = RelationGetDescr(cstate->rel);
attr_count = list_length(cstate->attnumlist);
@@ -1025,6 +1026,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
/*
* If ON_ERROR is specified with IGNORE, skip rows with soft errors
+ * If ON_ERROR is specified with set_to_null, try to replace with null.
*/
else if (!InputFunctionCallSafe(&in_functions[m],
string,
@@ -1035,9 +1037,46 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
{
Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+ if (cstate->opts.on_error == COPY_ON_ERROR_NULL)
+ {
+ /*
+ * we use this count the number of rows (not fields) that
+ * successfully applied the on_error set_to_null
+ */
+ if (!current_row_erroneous)
+ current_row_erroneous = true;
+
+ /*
+ * we need another InputFunctionCallSafe so we can error out
+ * not-null violation for domain with not-null constraint.
+ */
+ cstate->escontext->error_occurred = false;
+ if (InputFunctionCallSafe(&in_functions[m],
+ NULL,
+ typioparams[m],
+ att->atttypmod,
+ (Node *) cstate->escontext,
+ &values[m]))
+ {
+ nulls[m] = true;
+ values[m] = (Datum) 0;
+ if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+ ereport(NOTICE,
+ errmsg("column \"%s\" was set to null due to data type incompatibility at line %llu",
+ cstate->cur_attname,
+ (unsigned long long) cstate->cur_lineno));
+ continue;
+ }
+ else
+ ereport(ERROR,
+ errcode(ERRCODE_NOT_NULL_VIOLATION),
+ errmsg("domain %s does not allow null values", format_type_be(typioparams[m])),
+ errdatatype(typioparams[m]));
+ }
cstate->num_errors++;
- if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+ if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE &&
+ cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
{
/*
* Since we emit line number and column info in the below
@@ -1076,6 +1115,8 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
cstate->cur_attval = NULL;
}
+ if (current_row_erroneous)
+ cstate->num_errors++;
Assert(fieldno == attr_count);
return true;
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 8432be641ac..fcdc61bcb57 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -3279,7 +3279,7 @@ match_previous_words(int pattern_id,
COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
"HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
"FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
- "ON_ERROR", "LOG_VERBOSITY");
+ "ON_ERROR", "SET_TO_NULL", "LOG_VERBOSITY");
/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 06dfdfef721..7ebf4f78933 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -38,6 +38,7 @@ typedef enum CopyOnErrorChoice
{
COPY_ON_ERROR_STOP = 0, /* immediately throw errors, default */
COPY_ON_ERROR_IGNORE, /* ignore errors */
+ COPY_ON_ERROR_NULL, /* set error field to null */
} CopyOnErrorChoice;
/*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 64ea33aeae8..9a5acef8db0 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
^
+COPY x from stdin (on_error set_to_null, on_error ignore);
+ERROR: conflicting or redundant options
+LINE 1: COPY x from stdin (on_error set_to_null, on_error ignore);
+ ^
COPY x from stdin (log_verbosity default, log_verbosity verbose);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
@@ -92,6 +96,10 @@ COPY x from stdin (format BINARY, null 'x');
ERROR: cannot specify NULL in BINARY mode
COPY x from stdin (format BINARY, on_error ignore);
ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (format BINARY, on_error set_to_null);
+ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x FROM stdin (on_error set_to_null, reject_limit 2);
+ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
@@ -124,6 +132,10 @@ COPY x to stdout (format BINARY, on_error unsupported);
ERROR: COPY ON_ERROR cannot be used with COPY TO
LINE 1: COPY x to stdout (format BINARY, on_error unsupported);
^
+COPY x to stdin (on_error set_to_null);
+ERROR: COPY ON_ERROR cannot be used with COPY TO
+LINE 1: COPY x to stdin (on_error set_to_null);
+ ^
COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
@@ -769,6 +781,51 @@ CONTEXT: COPY check_ign_err
NOTICE: skipping row due to data type incompatibility at line 8 for column "k": "a"
CONTEXT: COPY check_ign_err
NOTICE: 6 rows were skipped due to data type incompatibility
+CREATE DOMAIN d_int_not_null AS INT NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS INT CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c INT);
+\pset null NULL
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: null input
+--fail, column a is domain with not-null constraint
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: "a"
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: "-1"
+--fail. less data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_to_null);
+ERROR: missing data for column "c"
+CONTEXT: COPY t_on_error_null, line 1: "1,1"
+--fail. extra data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_to_null);
+ERROR: extra data after last expected column
+CONTEXT: COPY t_on_error_null, line 1: "1,2,3,4"
+--ok
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null, log_verbosity verbose);
+NOTICE: column "b" was set to null due to data type incompatibility at line 1
+CONTEXT: COPY t_on_error_null, line 1, column b: "a"
+NOTICE: column "c" was set to null due to data type incompatibility at line 1
+CONTEXT: COPY t_on_error_null, line 1, column c: "d"
+NOTICE: column "b" was set to null due to data type incompatibility at line 2
+CONTEXT: COPY t_on_error_null, line 2, column b: "b"
+NOTICE: column "c" was set to null due to data type incompatibility at line 3
+CONTEXT: COPY t_on_error_null, line 3, column c: "e"
+NOTICE: erroneous values in 3 rows were replaced with null
+-- check inserted content
+select * from t_on_error_null;
+ a | b | c
+----+------+------
+ 10 | NULL | NULL
+ 11 | NULL | 12
+ 13 | 14 | NULL
+(3 rows)
+
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -828,6 +885,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 45273557ce0..003a91648e2 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,12 +67,15 @@ COPY x from stdin (force_null (a), force_null (b));
COPY x from stdin (convert_selectively (a), convert_selectively (b));
COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (on_error set_to_null, on_error ignore);
COPY x from stdin (log_verbosity default, log_verbosity verbose);
-- incorrect options
COPY x from stdin (format BINARY, delimiter ',');
COPY x from stdin (format BINARY, null 'x');
COPY x from stdin (format BINARY, on_error ignore);
+COPY x from stdin (format BINARY, on_error set_to_null);
+COPY x FROM stdin (on_error set_to_null, reject_limit 2);
COPY x from stdin (on_error unsupported);
COPY x from stdin (format TEXT, force_quote(a));
COPY x from stdin (format TEXT, force_quote *);
@@ -87,6 +90,7 @@ COPY x from stdin (format TEXT, force_null *);
COPY x to stdout (format CSV, force_null(a));
COPY x to stdout (format CSV, force_null *);
COPY x to stdout (format BINARY, on_error unsupported);
+COPY x to stdin (on_error set_to_null);
COPY x from stdin (log_verbosity unsupported);
COPY x from stdin with (reject_limit 1);
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -534,6 +538,45 @@ a {2} 2
8 {8} 8
\.
+CREATE DOMAIN d_int_not_null AS INT NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS INT CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c INT);
+
+\pset null NULL
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+\N 11 13
+\.
+
+--fail, column a is domain with not-null constraint
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+a 11 14
+\.
+
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+-1 11 13
+\.
+
+--fail. less data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_to_null);
+1,1
+\.
+--fail. extra data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_to_null);
+1,2,3,4
+\.
+
+--ok
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null, log_verbosity verbose);
+10 a d
+11 b 12
+13 14 e
+\.
+
+-- check inserted content
+select * from t_on_error_null;
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -603,6 +646,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
--
2.34.1
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2025-03-11 10:31 Jim Jones <[email protected]>
parent: jian he <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: Jim Jones @ 2025-03-11 10:31 UTC (permalink / raw)
To: jian he <[email protected]>; Kirill Reshke <[email protected]>; +Cc: Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; torikoshia <[email protected]>; PostgreSQL Hackers <[email protected]>
Hi Jian
On 07.03.25 04:48, jian he wrote:
> hi.
> rebase only.
I revisited this patch today. It applies and builds cleanly, and it
works as expected.
Some tests and minor comments:
====
1) WARNING might be a better fit than NOTICE here.
postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error set_to_null, format csv);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
NOTICE: erroneous values in 3 rows were replaced with null
COPY 5
postgres=# SELECT * FROM t;
x | y | z
------+------+------
1 | NULL | NULL
2 | 1 | NULL
3 | 2 | NULL
4 | NULL | NULL
NULL | NULL | NULL
(5 rows)
postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error set_to_null, format
csv, log_verbosity verbose);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
NOTICE: column "y" was set to null due to data type incompatibility at
line 1
NOTICE: column "y" was set to null due to data type incompatibility at
line 4
NOTICE: column "x" was set to null due to data type incompatibility at
line 5
NOTICE: column "y" was set to null due to data type incompatibility at
line 5
NOTICE: erroneous values in 3 rows were replaced with null
COPY 5
postgres=# SELECT * FROM t;
x | y | z
------+------+------
1 | NULL | NULL
2 | 1 | NULL
3 | 2 | NULL
4 | NULL | NULL
NULL | NULL | NULL
(5 rows)
I would still leave the extra messages from "log_verbosity verbose" as
NOTICE though. What do you think?
====
2) Inconsistent terminology. Invalid values in "on_error set_to_null"
mode are names as "erroneous", but as "invalid" in "on_error stop" mode.
I don't want to get into the semantics of erroneous or invalid, but
sticking to one terminology would IMHO look better.
postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error stop, format csv,
log_verbosity verbose);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
ERROR: invalid input syntax for type integer: "a"
CONTEXT: COPY t, line 1, column y: "a"
postgres=# SELECT * FROM t;
x | y | z
---+---+---
(0 rows)
====
3) same as in 1)
postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error ignore, format csv,
log_verbosity verbose);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
NOTICE: skipping row due to data type incompatibility at line 1 for
column "y": "a"
NOTICE: skipping row due to data type incompatibility at line 4 for
column "y": "b"
NOTICE: skipping row due to data type incompatibility at line 5 for
column "x": "a"
NOTICE: 3 rows were skipped due to data type incompatibility
COPY 2
postgres=# SELECT * FROM t;
x | y | z
---+---+------
2 | 1 | NULL
3 | 2 | NULL
(2 rows)====
====
"on_error ignore" works well with "reject_limit #"
postgres=# \pset null NULL
Null display is "NULL".
postgres=# CREATE TEMPORARY TABLE t (x int, y int, z text);
CREATE TABLE
postgres=# COPY t (x,y) FROM STDIN WITH (on_error ignore, format csv,
log_verbosity verbose, reject_limit 1);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,a
>> 2,1
>> 3,2
>> 4,b
>> a,c
>> \.
NOTICE: skipping row due to data type incompatibility at line 1 for
column "y": "a"
NOTICE: skipping row due to data type incompatibility at line 4 for
column "y": "b"
ERROR: skipped more than REJECT_LIMIT (1) rows due to data type
incompatibility
CONTEXT: COPY t, line 4, column y: "b"
postgres=# SELECT * FROM t;
x | y | z
---+---+---
(0 rows)
best regards, Jim
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2025-03-12 08:00 jian he <[email protected]>
parent: Jim Jones <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: jian he @ 2025-03-12 08:00 UTC (permalink / raw)
To: Jim Jones <[email protected]>; +Cc: Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; torikoshia <[email protected]>; PostgreSQL Hackers <[email protected]>
On Tue, Mar 11, 2025 at 6:31 PM Jim Jones <[email protected]> wrote:
>
>
> I revisited this patch today. It applies and builds cleanly, and it
> works as expected.
>
> Some tests and minor comments:
>
hi. Jim Jones.
thanks for testsing it again!
> ====
>
> 1) WARNING might be a better fit than NOTICE here.
>
but NOTICE, on_errror set_to_null is aligned with on_errror ignore.
>
> I would still leave the extra messages from "log_verbosity verbose" as
> NOTICE though. What do you think?
>
> ====
When LOG_VERBOSITY option is set to verbose,
for ignore option, a NOTICE message containing the line of the input
file and the column name
whose input conversion has failed is emitted for each discarded row;
for set_to_null option, a NOTICE message containing the line of the
input file and the column name
where value was replaced with NULL for each input conversion failure.
see the above desciption,
on_errror set_to_null is aligned with on_errror ignore.
it's just on_errror ignore is per row, on_errror set_to_null is per
column/field.
so NOTICE is aligned with other on_error option.
>
> 2) Inconsistent terminology. Invalid values in "on_error set_to_null"
> mode are names as "erroneous", but as "invalid" in "on_error stop" mode.
> I don't want to get into the semantics of erroneous or invalid, but
> sticking to one terminology would IMHO look better.
>
I am open to changing it.
what do you think "invalid values in %llu row was replaced with null"?
> ====
>
> "on_error ignore" works well with "reject_limit #"
>
i remember there was some confusion about on_error set_to_null with
reject_limit option.
I choose to not suport it.
obviously, if there is consenses, we can support it later.
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2025-03-12 08:25 Jim Jones <[email protected]>
parent: jian he <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: Jim Jones @ 2025-03-12 08:25 UTC (permalink / raw)
To: jian he <[email protected]>; +Cc: Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; torikoshia <[email protected]>; PostgreSQL Hackers <[email protected]>
On 12.03.25 09:00, jian he wrote:
>> 1) WARNING might be a better fit than NOTICE here.
>>
> but NOTICE, on_errror set_to_null is aligned with on_errror ignore.
>
>> I would still leave the extra messages from "log_verbosity verbose" as
>> NOTICE though. What do you think?
>>
>> ====
> When LOG_VERBOSITY option is set to verbose,
> for ignore option, a NOTICE message containing the line of the input
> file and the column name
> whose input conversion has failed is emitted for each discarded row;
> for set_to_null option, a NOTICE message containing the line of the
> input file and the column name
> where value was replaced with NULL for each input conversion failure.
>
> see the above desciption,
> on_errror set_to_null is aligned with on_errror ignore.
> it's just on_errror ignore is per row, on_errror set_to_null is per
> column/field.
> so NOTICE is aligned with other on_error option.
I considered using a WARNING due to the severity of the issue - the
failure to import data - but either NOTICE or WARNING works for me.
>> 2) Inconsistent terminology. Invalid values in "on_error set_to_null"
>> mode are names as "erroneous", but as "invalid" in "on_error stop" mode.
>> I don't want to get into the semantics of erroneous or invalid, but
>> sticking to one terminology would IMHO look better.
>>
> I am open to changing it.
> what do you think "invalid values in %llu row was replaced with null"?
LGTM: "invalid values in %llu rows were replaced with null"
Thanks for the patch!
Best, Jim
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2025-03-18 03:55 jian he <[email protected]>
parent: Jim Jones <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: jian he @ 2025-03-18 03:55 UTC (permalink / raw)
To: Jim Jones <[email protected]>; +Cc: Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; torikoshia <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed, Mar 12, 2025 at 4:26 PM Jim Jones <[email protected]> wrote:
>
> >> 2) Inconsistent terminology. Invalid values in "on_error set_to_null"
> >> mode are names as "erroneous", but as "invalid" in "on_error stop" mode.
> >> I don't want to get into the semantics of erroneous or invalid, but
> >> sticking to one terminology would IMHO look better.
> >>
> > I am open to changing it.
> > what do you think "invalid values in %llu row was replaced with null"?
>
> LGTM: "invalid values in %llu rows were replaced with null"
>
changed based on this.
also minor documentation tweaks.
Attachments:
[text/x-patch] v13-0001-COPY-on_error-set_to_null.patch (19.8K, 2-v13-0001-COPY-on_error-set_to_null.patch)
download | inline diff:
From 3553eee56c8dd0c3ce334d1f37b511acbbc640af Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Tue, 18 Mar 2025 11:51:48 +0800
Subject: [PATCH v13 1/1] COPY (on_error set_to_null)
Extent "on_error action", introduce new option: on_error set_to_null.
Current grammar makes us unable to use "on_error null", so we choose "on_error set_to_null".
Any data type conversion errors during the COPY FROM process will result in the
affected column being set to NULL. This only applies when using the non-binary
format for COPY FROM.
However, the not-null constraint will still be enforced.
If a column have not-null constraint, successful (on_error set_to_null)
action will cause not-null constraint violation.
This also apply to column type is domain with not-null constraint.
A regression test for a domain with a not-null constraint has been added.
Author: Jian He <[email protected]>
Author: Kirill Reshke <[email protected]>
Reviewed-by:
Fujii Masao <[email protected]>
Jim Jones <[email protected]>
"David G. Johnston" <[email protected]>
Yugo NAGATA <[email protected]>
torikoshia <[email protected]>
discussion: https://postgr.es/m/CAKFQuwawy1e6YR4S=j+y7pXqg_Dw1WBVrgvf=BP3d1_aSfe_+Q@mail.gmail.com
---
doc/src/sgml/ref/copy.sgml | 34 +++++++++++-----
src/backend/commands/copy.c | 6 ++-
src/backend/commands/copyfrom.c | 29 +++++++++-----
src/backend/commands/copyfromparse.c | 43 +++++++++++++++++++-
src/bin/psql/tab-complete.in.c | 2 +-
src/include/commands/copy.h | 1 +
src/test/regress/expected/copy2.out | 60 ++++++++++++++++++++++++++++
src/test/regress/sql/copy2.sql | 46 +++++++++++++++++++++
8 files changed, 196 insertions(+), 25 deletions(-)
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index df093da97c5..cdb725d7565 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -394,23 +394,34 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
Specifies how to behave when encountering an error converting a column's
input value into its data type.
An <replaceable class="parameter">error_action</replaceable> value of
- <literal>stop</literal> means fail the command, while
- <literal>ignore</literal> means discard the input row and continue with the next one.
+ <literal>stop</literal> means fail the command,
+ <literal>ignore</literal> means discard the input row and continue with the next one, and
+ <literal>set_to_null</literal> means replace columns containing invalid input values with
+ <literal>NULL</literal> and move to the next field.
The default is <literal>stop</literal>.
</para>
<para>
- The <literal>ignore</literal> option is applicable only for <command>COPY FROM</command>
+ The <literal>ignore</literal> and <literal>set_to_null</literal>
+ options are applicable only for <command>COPY FROM</command>
when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
</para>
<para>
- A <literal>NOTICE</literal> message containing the ignored row count is
- emitted at the end of the <command>COPY FROM</command> if at least one
- row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
- <literal>verbose</literal>, a <literal>NOTICE</literal> message
+ For <literal>ignore</literal> option,
+ a <literal>NOTICE</literal> message containing the ignored row count is
+ emitted at the end of the <command>COPY FROM</command> if at least one row was discarded.
+ For <literal>set_to_null</literal> option,
+ a <literal>NOTICE</literal> message indicating the number of rows where invalid input values were replaced with
+ null is emitted at the end of the <command>COPY FROM</command> if at least one row was replaced.
+ </para>
+ <para>
+ When <literal>LOG_VERBOSITY</literal> option is set to <literal>verbose</literal>,
+ for <literal>ignore</literal> option, a <literal>NOTICE</literal> message
containing the line of the input file and the column name whose input
- conversion has failed is emitted for each discarded row.
- When it is set to <literal>silent</literal>, no message is emitted
- regarding ignored rows.
+ conversion has failed is emitted for each discarded row;
+ for <literal>set_to_null</literal> option,
+ a <literal>NOTICE</literal> message containing the line of the input file and the column name
+ where value was replaced with <literal>NULL</literal> for each input conversion failure.
+ When it is set to <literal>silent</literal>, no message is emitted regarding input conversion failed rows.
</para>
</listitem>
</varlistentry>
@@ -458,7 +469,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
</para>
<para>
This is currently used in <command>COPY FROM</command> command when
- <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+ <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>
+ or <literal>set_to_null</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index cfca9d9dc29..afe60758d40 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -403,12 +403,14 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
parser_errposition(pstate, def->location)));
/*
- * Allow "stop", or "ignore" values.
+ * Allow "stop", "ignore", "set_to_null" values.
*/
if (pg_strcasecmp(sval, "stop") == 0)
return COPY_ON_ERROR_STOP;
if (pg_strcasecmp(sval, "ignore") == 0)
return COPY_ON_ERROR_IGNORE;
+ if (pg_strcasecmp(sval, "set_to_null") == 0)
+ return COPY_ON_ERROR_NULL;
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -918,7 +920,7 @@ ProcessCopyOptions(ParseState *pstate,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("only ON_ERROR STOP is allowed in BINARY mode")));
- if (opts_out->reject_limit && !opts_out->on_error)
+ if (opts_out->reject_limit && opts_out->on_error != COPY_ON_ERROR_IGNORE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first and second %s are the names of COPY option, e.g.
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index bcf66f0adf8..43a227eae72 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1467,14 +1467,22 @@ CopyFrom(CopyFromState cstate)
/* Done, clean up */
error_context_stack = errcallback.previous;
- if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
- cstate->num_errors > 0 &&
+ if (cstate->num_errors > 0 &&
cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
- ereport(NOTICE,
- errmsg_plural("%llu row was skipped due to data type incompatibility",
- "%llu rows were skipped due to data type incompatibility",
- (unsigned long long) cstate->num_errors,
- (unsigned long long) cstate->num_errors));
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg_plural("%llu row was skipped due to data type incompatibility",
+ "%llu rows were skipped due to data type incompatibility",
+ (unsigned long long) cstate->num_errors,
+ (unsigned long long) cstate->num_errors));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_NULL)
+ ereport(NOTICE,
+ errmsg_plural("invalid values in %llu row was replaced with null",
+ "invalid values in %llu rows were replaced with null",
+ (unsigned long long) cstate->num_errors,
+ (unsigned long long) cstate->num_errors));
+ }
if (bistate != NULL)
FreeBulkInsertState(bistate);
@@ -1622,10 +1630,11 @@ BeginCopyFrom(ParseState *pstate,
cstate->escontext->error_occurred = false;
/*
- * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
- * options later
+ * Currently we only support COPY_ON_ERROR_IGNORE, COPY_ON_ERROR_NULL.
+ * We'll add other options later
*/
- if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE ||
+ cstate->opts.on_error == COPY_ON_ERROR_NULL)
cstate->escontext->details_wanted = false;
}
else
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index e8128f85e6b..e2b4d1f7ec9 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -947,6 +947,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
int fldct;
int fieldno;
char *string;
+ bool current_row_erroneous = false;
tupDesc = RelationGetDescr(cstate->rel);
attr_count = list_length(cstate->attnumlist);
@@ -1025,6 +1026,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
/*
* If ON_ERROR is specified with IGNORE, skip rows with soft errors
+ * If ON_ERROR is specified with set_to_null, try to replace with null.
*/
else if (!InputFunctionCallSafe(&in_functions[m],
string,
@@ -1035,9 +1037,46 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
{
Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+ if (cstate->opts.on_error == COPY_ON_ERROR_NULL)
+ {
+ /*
+ * we use this count the number of rows (not fields) that
+ * successfully applied the on_error set_to_null
+ */
+ if (!current_row_erroneous)
+ current_row_erroneous = true;
+
+ /*
+ * we need another InputFunctionCallSafe so we can error out
+ * not-null violation for domain with not-null constraint.
+ */
+ cstate->escontext->error_occurred = false;
+ if (InputFunctionCallSafe(&in_functions[m],
+ NULL,
+ typioparams[m],
+ att->atttypmod,
+ (Node *) cstate->escontext,
+ &values[m]))
+ {
+ nulls[m] = true;
+ values[m] = (Datum) 0;
+ if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+ ereport(NOTICE,
+ errmsg("column \"%s\" was set to null due to data type incompatibility at line %llu",
+ cstate->cur_attname,
+ (unsigned long long) cstate->cur_lineno));
+ continue;
+ }
+ else
+ ereport(ERROR,
+ errcode(ERRCODE_NOT_NULL_VIOLATION),
+ errmsg("domain %s does not allow null values", format_type_be(typioparams[m])),
+ errdatatype(typioparams[m]));
+ }
cstate->num_errors++;
- if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+ if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE &&
+ cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
{
/*
* Since we emit line number and column info in the below
@@ -1076,6 +1115,8 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
cstate->cur_attval = NULL;
}
+ if (current_row_erroneous)
+ cstate->num_errors++;
Assert(fieldno == attr_count);
return true;
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 9a4d993e2bc..7980513a9bd 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -3280,7 +3280,7 @@ match_previous_words(int pattern_id,
COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
"HEADER", "QUOTE", "ESCAPE", "FORCE_QUOTE",
"FORCE_NOT_NULL", "FORCE_NULL", "ENCODING", "DEFAULT",
- "ON_ERROR", "LOG_VERBOSITY");
+ "ON_ERROR", "SET_TO_NULL", "LOG_VERBOSITY");
/* Complete COPY <sth> FROM|TO filename WITH (FORMAT */
else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "FORMAT"))
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 06dfdfef721..7ebf4f78933 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -38,6 +38,7 @@ typedef enum CopyOnErrorChoice
{
COPY_ON_ERROR_STOP = 0, /* immediately throw errors, default */
COPY_ON_ERROR_IGNORE, /* ignore errors */
+ COPY_ON_ERROR_NULL, /* set error field to null */
} CopyOnErrorChoice;
/*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 64ea33aeae8..caa94bfd526 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
^
+COPY x from stdin (on_error set_to_null, on_error ignore);
+ERROR: conflicting or redundant options
+LINE 1: COPY x from stdin (on_error set_to_null, on_error ignore);
+ ^
COPY x from stdin (log_verbosity default, log_verbosity verbose);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
@@ -92,6 +96,10 @@ COPY x from stdin (format BINARY, null 'x');
ERROR: cannot specify NULL in BINARY mode
COPY x from stdin (format BINARY, on_error ignore);
ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (format BINARY, on_error set_to_null);
+ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x FROM stdin (on_error set_to_null, reject_limit 2);
+ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
@@ -124,6 +132,10 @@ COPY x to stdout (format BINARY, on_error unsupported);
ERROR: COPY ON_ERROR cannot be used with COPY TO
LINE 1: COPY x to stdout (format BINARY, on_error unsupported);
^
+COPY x to stdin (on_error set_to_null);
+ERROR: COPY ON_ERROR cannot be used with COPY TO
+LINE 1: COPY x to stdin (on_error set_to_null);
+ ^
COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
@@ -769,6 +781,51 @@ CONTEXT: COPY check_ign_err
NOTICE: skipping row due to data type incompatibility at line 8 for column "k": "a"
CONTEXT: COPY check_ign_err
NOTICE: 6 rows were skipped due to data type incompatibility
+CREATE DOMAIN d_int_not_null AS INT NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS INT CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c INT);
+\pset null NULL
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: null input
+--fail, column a is domain with not-null constraint
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: "a"
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: "-1"
+--fail. less data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_to_null);
+ERROR: missing data for column "c"
+CONTEXT: COPY t_on_error_null, line 1: "1,1"
+--fail. extra data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_to_null);
+ERROR: extra data after last expected column
+CONTEXT: COPY t_on_error_null, line 1: "1,2,3,4"
+--ok
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null, log_verbosity verbose);
+NOTICE: column "b" was set to null due to data type incompatibility at line 1
+CONTEXT: COPY t_on_error_null, line 1, column b: "a"
+NOTICE: column "c" was set to null due to data type incompatibility at line 1
+CONTEXT: COPY t_on_error_null, line 1, column c: "d"
+NOTICE: column "b" was set to null due to data type incompatibility at line 2
+CONTEXT: COPY t_on_error_null, line 2, column b: "b"
+NOTICE: column "c" was set to null due to data type incompatibility at line 3
+CONTEXT: COPY t_on_error_null, line 3, column c: "e"
+NOTICE: invalid values in 3 rows were replaced with null
+-- check inserted content
+select * from t_on_error_null;
+ a | b | c
+----+------+------
+ 10 | NULL | NULL
+ 11 | NULL | 12
+ 13 | 14 | NULL
+(3 rows)
+
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -828,6 +885,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 45273557ce0..003a91648e2 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,12 +67,15 @@ COPY x from stdin (force_null (a), force_null (b));
COPY x from stdin (convert_selectively (a), convert_selectively (b));
COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (on_error set_to_null, on_error ignore);
COPY x from stdin (log_verbosity default, log_verbosity verbose);
-- incorrect options
COPY x from stdin (format BINARY, delimiter ',');
COPY x from stdin (format BINARY, null 'x');
COPY x from stdin (format BINARY, on_error ignore);
+COPY x from stdin (format BINARY, on_error set_to_null);
+COPY x FROM stdin (on_error set_to_null, reject_limit 2);
COPY x from stdin (on_error unsupported);
COPY x from stdin (format TEXT, force_quote(a));
COPY x from stdin (format TEXT, force_quote *);
@@ -87,6 +90,7 @@ COPY x from stdin (format TEXT, force_null *);
COPY x to stdout (format CSV, force_null(a));
COPY x to stdout (format CSV, force_null *);
COPY x to stdout (format BINARY, on_error unsupported);
+COPY x to stdin (on_error set_to_null);
COPY x from stdin (log_verbosity unsupported);
COPY x from stdin with (reject_limit 1);
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -534,6 +538,45 @@ a {2} 2
8 {8} 8
\.
+CREATE DOMAIN d_int_not_null AS INT NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS INT CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c INT);
+
+\pset null NULL
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+\N 11 13
+\.
+
+--fail, column a is domain with not-null constraint
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+a 11 14
+\.
+
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+-1 11 13
+\.
+
+--fail. less data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_to_null);
+1,1
+\.
+--fail. extra data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_to_null);
+1,2,3,4
+\.
+
+--ok
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null, log_verbosity verbose);
+10 a d
+11 b 12
+13 14 e
+\.
+
+-- check inserted content
+select * from t_on_error_null;
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -603,6 +646,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
--
2.34.1
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2025-03-21 06:34 vignesh C <[email protected]>
parent: jian he <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: vignesh C @ 2025-03-21 06:34 UTC (permalink / raw)
To: jian he <[email protected]>; +Cc: Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; torikoshia <[email protected]>; PostgreSQL Hackers <[email protected]>
On Tue, 18 Mar 2025 at 09:26, jian he <[email protected]> wrote:
>
> changed based on this.
>
> also minor documentation tweaks.
Few comments:
1) I felt this is wrong:
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 9a4d993e2bc..7980513a9bd 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -3280,7 +3280,7 @@ match_previous_words(int pattern_id,
COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
"HEADER", "QUOTE", "ESCAPE",
"FORCE_QUOTE",
"FORCE_NOT_NULL",
"FORCE_NULL", "ENCODING", "DEFAULT",
- "ON_ERROR", "LOG_VERBOSITY");
+ "ON_ERROR", "SET_TO_NULL",
"LOG_VERBOSITY");
as the following fails:
postgres=# copy t_on_error_null from stdin WITH ( set_to_null );
ERROR: option "set_to_null" not recognized
LINE 1: copy t_on_error_null from stdin WITH ( set_to_null );
2) Can you limit this to 80 chars if possible to improve the readability:
+ <literal>stop</literal> means fail the command,
+ <literal>ignore</literal> means discard the input row and
continue with the next one, and
+ <literal>set_to_null</literal> means replace columns containing
invalid input values with
+ <literal>NULL</literal> and move to the next field.
3) similarly here too:
+ For <literal>ignore</literal> option,
+ a <literal>NOTICE</literal> message containing the ignored row count is
+ emitted at the end of the <command>COPY FROM</command> if at
least one row was discarded.
+ For <literal>set_to_null</literal> option,
+ a <literal>NOTICE</literal> message indicating the number of
rows where invalid input values were replaced with
+ null is emitted at the end of the <command>COPY FROM</command>
if at least one row was replaced.
4) Could you mention a brief one line in the commit message as to why
"on_error null" cannot be used:
Extent "on_error action", introduce new option: on_error set_to_null.
Current grammar makes us unable to use "on_error null", so we choose
"on_error set_to_null".
Regards,
Vignesh
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2025-03-24 07:50 jian he <[email protected]>
parent: vignesh C <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: jian he @ 2025-03-24 07:50 UTC (permalink / raw)
To: vignesh C <[email protected]>; +Cc: Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; torikoshia <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri, Mar 21, 2025 at 2:34 PM vignesh C <[email protected]> wrote:
>
> Few comments:
> 1) I felt this is wrong:
> diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
> index 9a4d993e2bc..7980513a9bd 100644
> --- a/src/bin/psql/tab-complete.in.c
> +++ b/src/bin/psql/tab-complete.in.c
> @@ -3280,7 +3280,7 @@ match_previous_words(int pattern_id,
> COMPLETE_WITH("FORMAT", "FREEZE", "DELIMITER", "NULL",
> "HEADER", "QUOTE", "ESCAPE",
> "FORCE_QUOTE",
> "FORCE_NOT_NULL",
> "FORCE_NULL", "ENCODING", "DEFAULT",
> - "ON_ERROR", "LOG_VERBOSITY");
> + "ON_ERROR", "SET_TO_NULL",
> "LOG_VERBOSITY");
>
> as the following fails:
> postgres=# copy t_on_error_null from stdin WITH ( set_to_null );
> ERROR: option "set_to_null" not recognized
> LINE 1: copy t_on_error_null from stdin WITH ( set_to_null );
>
- COMPLETE_WITH("stop", "ignore");
+ COMPLETE_WITH("stop", "ignore", "set_to_null");
yech. I think I fixed this.
> 2) Can you limit this to 80 chars if possible to improve the readability:
> + <literal>stop</literal> means fail the command,
> + <literal>ignore</literal> means discard the input row and
> continue with the next one, and
> + <literal>set_to_null</literal> means replace columns containing
> invalid input values with
> + <literal>NULL</literal> and move to the next field.
>
> 3) similarly here too:
> + For <literal>ignore</literal> option,
> + a <literal>NOTICE</literal> message containing the ignored row count is
> + emitted at the end of the <command>COPY FROM</command> if at
> least one row was discarded.
> + For <literal>set_to_null</literal> option,
> + a <literal>NOTICE</literal> message indicating the number of
> rows where invalid input values were replaced with
> + null is emitted at the end of the <command>COPY FROM</command>
> if at least one row was replaced.
>
sure.
> 4) Could you mention a brief one line in the commit message as to why
> "on_error null" cannot be used:
> Extent "on_error action", introduce new option: on_error set_to_null.
> Current grammar makes us unable to use "on_error null", so we choose
> "on_error set_to_null".
by the following changes, we can change to (on_error null).
--- a/src/backend/parser/gram.y
+++ b/src/backend/parser/gram.y
@@ -3579,6 +3579,7 @@ copy_generic_opt_elem:
copy_generic_opt_arg:
opt_boolean_or_string { $$ =
(Node *) makeString($1); }
+ | NULL_P
{ $$ = (Node *) makeString("null"); }
| NumericOnly
{ $$ = (Node *) $1; }
| '*'
{ $$ = (Node *) makeNode(A_Star); }
| DEFAULT { $$ = (Node
*) makeString("default"); }
COPY x from stdin (format null);
ERROR: syntax error at or near "null"
LINE 1: COPY x from stdin (format null);
^
will become
COPY x from stdin (format null);
ERROR: COPY format "null" not recognized
LINE 1: COPY x from stdin (format null);
^
it will cause NULL_P from reserved word to
non-reserved word in the COPY related command.
I am not sure this is what we want.
Anyway, I attached both two version
(ON_ERROR SET_TO_NULL) (ON_ERROR NULL).
Attachments:
[text/x-patch] v14-0001-COPY-on_error-set_to_null.patch (19.9K, 2-v14-0001-COPY-on_error-set_to_null.patch)
download | inline diff:
From 7a6b7edf0877bd9c5bb2629888d3d9c29618ca5d Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Mon, 24 Mar 2025 15:14:17 +0800
Subject: [PATCH v14 1/1] COPY (on_error set_to_null)
Extent "on_error action", introduce new option: on_error set_to_null.
Current grammar makes us unable to use "on_error null". if we did it, then in
all the COPY command options's value, null will becomen reserved to non-reserved
word. so we choose "on_error set_to_null".
Any data type conversion errors during the COPY FROM process will result in the
affected column being set to NULL. This only applies when using the non-binary
format for COPY FROM.
However, the not-null constraint will still be enforced.
If a column have not-null constraint, successful (on_error set_to_null)
action will cause not-null constraint violation.
This also apply to column type is domain with not-null constraint.
A regression test for a domain with a not-null constraint has been added.
Author: Jian He <[email protected]>
Author: Kirill Reshke <[email protected]>
Reviewed-by:
Fujii Masao <[email protected]>
Jim Jones <[email protected]>
"David G. Johnston" <[email protected]>
Yugo NAGATA <[email protected]>
torikoshia <[email protected]>
discussion: https://postgr.es/m/CAKFQuwawy1e6YR4S=j+y7pXqg_Dw1WBVrgvf=BP3d1_aSfe_+Q@mail.gmail.com
---
doc/src/sgml/ref/copy.sgml | 36 ++++++++++++-----
src/backend/commands/copy.c | 6 ++-
src/backend/commands/copyfrom.c | 29 +++++++++-----
src/backend/commands/copyfromparse.c | 43 +++++++++++++++++++-
src/bin/psql/tab-complete.in.c | 2 +-
src/include/commands/copy.h | 1 +
src/test/regress/expected/copy2.out | 60 ++++++++++++++++++++++++++++
src/test/regress/sql/copy2.sql | 46 +++++++++++++++++++++
8 files changed, 198 insertions(+), 25 deletions(-)
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index df093da97c5..1909c11edff 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -394,23 +394,36 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
Specifies how to behave when encountering an error converting a column's
input value into its data type.
An <replaceable class="parameter">error_action</replaceable> value of
- <literal>stop</literal> means fail the command, while
- <literal>ignore</literal> means discard the input row and continue with the next one.
+ <literal>stop</literal> means fail the command,
+ <literal>ignore</literal> means discard the input row and continue with the next one,
+ and <literal>set_to_null</literal> means replace columns containing invalid
+ input values with <literal>NULL</literal> and move to the next field.
The default is <literal>stop</literal>.
</para>
<para>
- The <literal>ignore</literal> option is applicable only for <command>COPY FROM</command>
+ The <literal>ignore</literal> and <literal>set_to_null</literal>
+ options are applicable only for <command>COPY FROM</command>
when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
</para>
+ <para>
+ For <literal>ignore</literal> option, a <literal>NOTICE</literal> message
+ containing the ignored row count is emitted at the end of the <command>COPY
+ FROM</command> if at least one row was discarded.
+ For <literal>set_to_null</literal> option,
+ a <literal>NOTICE</literal> message indicating the number of rows
+ where invalid input values were replaced with null is emitted
+ at the end of the <command>COPY FROM</command> if at least one row was replaced.
+ </para>
<para>
- A <literal>NOTICE</literal> message containing the ignored row count is
- emitted at the end of the <command>COPY FROM</command> if at least one
- row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
- <literal>verbose</literal>, a <literal>NOTICE</literal> message
+ When <literal>LOG_VERBOSITY</literal> option is set to <literal>verbose</literal>,
+ for <literal>ignore</literal> option, a <literal>NOTICE</literal> message
containing the line of the input file and the column name whose input
- conversion has failed is emitted for each discarded row.
- When it is set to <literal>silent</literal>, no message is emitted
- regarding ignored rows.
+ conversion has failed is emitted for each discarded row;
+ for <literal>set_to_null</literal> option, a <literal>NOTICE</literal>
+ message containing the line of the input file and the column name where
+ value was replaced with <literal>NULL</literal> for each input conversion
+ failure.
+ When it is set to <literal>silent</literal>, no message is emitted regarding input conversion failed rows.
</para>
</listitem>
</varlistentry>
@@ -458,7 +471,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
</para>
<para>
This is currently used in <command>COPY FROM</command> command when
- <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+ <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>
+ or <literal>set_to_null</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index cfca9d9dc29..afe60758d40 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -403,12 +403,14 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
parser_errposition(pstate, def->location)));
/*
- * Allow "stop", or "ignore" values.
+ * Allow "stop", "ignore", "set_to_null" values.
*/
if (pg_strcasecmp(sval, "stop") == 0)
return COPY_ON_ERROR_STOP;
if (pg_strcasecmp(sval, "ignore") == 0)
return COPY_ON_ERROR_IGNORE;
+ if (pg_strcasecmp(sval, "set_to_null") == 0)
+ return COPY_ON_ERROR_NULL;
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -918,7 +920,7 @@ ProcessCopyOptions(ParseState *pstate,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("only ON_ERROR STOP is allowed in BINARY mode")));
- if (opts_out->reject_limit && !opts_out->on_error)
+ if (opts_out->reject_limit && opts_out->on_error != COPY_ON_ERROR_IGNORE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first and second %s are the names of COPY option, e.g.
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index bcf66f0adf8..43a227eae72 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1467,14 +1467,22 @@ CopyFrom(CopyFromState cstate)
/* Done, clean up */
error_context_stack = errcallback.previous;
- if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
- cstate->num_errors > 0 &&
+ if (cstate->num_errors > 0 &&
cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
- ereport(NOTICE,
- errmsg_plural("%llu row was skipped due to data type incompatibility",
- "%llu rows were skipped due to data type incompatibility",
- (unsigned long long) cstate->num_errors,
- (unsigned long long) cstate->num_errors));
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg_plural("%llu row was skipped due to data type incompatibility",
+ "%llu rows were skipped due to data type incompatibility",
+ (unsigned long long) cstate->num_errors,
+ (unsigned long long) cstate->num_errors));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_NULL)
+ ereport(NOTICE,
+ errmsg_plural("invalid values in %llu row was replaced with null",
+ "invalid values in %llu rows were replaced with null",
+ (unsigned long long) cstate->num_errors,
+ (unsigned long long) cstate->num_errors));
+ }
if (bistate != NULL)
FreeBulkInsertState(bistate);
@@ -1622,10 +1630,11 @@ BeginCopyFrom(ParseState *pstate,
cstate->escontext->error_occurred = false;
/*
- * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
- * options later
+ * Currently we only support COPY_ON_ERROR_IGNORE, COPY_ON_ERROR_NULL.
+ * We'll add other options later
*/
- if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE ||
+ cstate->opts.on_error == COPY_ON_ERROR_NULL)
cstate->escontext->details_wanted = false;
}
else
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index e8128f85e6b..e2b4d1f7ec9 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -947,6 +947,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
int fldct;
int fieldno;
char *string;
+ bool current_row_erroneous = false;
tupDesc = RelationGetDescr(cstate->rel);
attr_count = list_length(cstate->attnumlist);
@@ -1025,6 +1026,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
/*
* If ON_ERROR is specified with IGNORE, skip rows with soft errors
+ * If ON_ERROR is specified with set_to_null, try to replace with null.
*/
else if (!InputFunctionCallSafe(&in_functions[m],
string,
@@ -1035,9 +1037,46 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
{
Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
+ if (cstate->opts.on_error == COPY_ON_ERROR_NULL)
+ {
+ /*
+ * we use this count the number of rows (not fields) that
+ * successfully applied the on_error set_to_null
+ */
+ if (!current_row_erroneous)
+ current_row_erroneous = true;
+
+ /*
+ * we need another InputFunctionCallSafe so we can error out
+ * not-null violation for domain with not-null constraint.
+ */
+ cstate->escontext->error_occurred = false;
+ if (InputFunctionCallSafe(&in_functions[m],
+ NULL,
+ typioparams[m],
+ att->atttypmod,
+ (Node *) cstate->escontext,
+ &values[m]))
+ {
+ nulls[m] = true;
+ values[m] = (Datum) 0;
+ if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+ ereport(NOTICE,
+ errmsg("column \"%s\" was set to null due to data type incompatibility at line %llu",
+ cstate->cur_attname,
+ (unsigned long long) cstate->cur_lineno));
+ continue;
+ }
+ else
+ ereport(ERROR,
+ errcode(ERRCODE_NOT_NULL_VIOLATION),
+ errmsg("domain %s does not allow null values", format_type_be(typioparams[m])),
+ errdatatype(typioparams[m]));
+ }
cstate->num_errors++;
- if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
+ if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE &&
+ cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
{
/*
* Since we emit line number and column info in the below
@@ -1076,6 +1115,8 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
cstate->cur_attval = NULL;
}
+ if (current_row_erroneous)
+ cstate->num_errors++;
Assert(fieldno == attr_count);
return true;
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 98951aef82c..c79b3af0495 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -3291,7 +3291,7 @@ match_previous_words(int pattern_id,
/* Complete COPY <sth> FROM filename WITH (ON_ERROR */
else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
- COMPLETE_WITH("stop", "ignore");
+ COMPLETE_WITH("stop", "ignore", "set_to_null");
/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 06dfdfef721..7ebf4f78933 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -38,6 +38,7 @@ typedef enum CopyOnErrorChoice
{
COPY_ON_ERROR_STOP = 0, /* immediately throw errors, default */
COPY_ON_ERROR_IGNORE, /* ignore errors */
+ COPY_ON_ERROR_NULL, /* set error field to null */
} CopyOnErrorChoice;
/*
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 64ea33aeae8..caa94bfd526 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
^
+COPY x from stdin (on_error set_to_null, on_error ignore);
+ERROR: conflicting or redundant options
+LINE 1: COPY x from stdin (on_error set_to_null, on_error ignore);
+ ^
COPY x from stdin (log_verbosity default, log_verbosity verbose);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
@@ -92,6 +96,10 @@ COPY x from stdin (format BINARY, null 'x');
ERROR: cannot specify NULL in BINARY mode
COPY x from stdin (format BINARY, on_error ignore);
ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (format BINARY, on_error set_to_null);
+ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x FROM stdin (on_error set_to_null, reject_limit 2);
+ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
@@ -124,6 +132,10 @@ COPY x to stdout (format BINARY, on_error unsupported);
ERROR: COPY ON_ERROR cannot be used with COPY TO
LINE 1: COPY x to stdout (format BINARY, on_error unsupported);
^
+COPY x to stdin (on_error set_to_null);
+ERROR: COPY ON_ERROR cannot be used with COPY TO
+LINE 1: COPY x to stdin (on_error set_to_null);
+ ^
COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
@@ -769,6 +781,51 @@ CONTEXT: COPY check_ign_err
NOTICE: skipping row due to data type incompatibility at line 8 for column "k": "a"
CONTEXT: COPY check_ign_err
NOTICE: 6 rows were skipped due to data type incompatibility
+CREATE DOMAIN d_int_not_null AS INT NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS INT CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c INT);
+\pset null NULL
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: null input
+--fail, column a is domain with not-null constraint
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: "a"
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: "-1"
+--fail. less data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_to_null);
+ERROR: missing data for column "c"
+CONTEXT: COPY t_on_error_null, line 1: "1,1"
+--fail. extra data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_to_null);
+ERROR: extra data after last expected column
+CONTEXT: COPY t_on_error_null, line 1: "1,2,3,4"
+--ok
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null, log_verbosity verbose);
+NOTICE: column "b" was set to null due to data type incompatibility at line 1
+CONTEXT: COPY t_on_error_null, line 1, column b: "a"
+NOTICE: column "c" was set to null due to data type incompatibility at line 1
+CONTEXT: COPY t_on_error_null, line 1, column c: "d"
+NOTICE: column "b" was set to null due to data type incompatibility at line 2
+CONTEXT: COPY t_on_error_null, line 2, column b: "b"
+NOTICE: column "c" was set to null due to data type incompatibility at line 3
+CONTEXT: COPY t_on_error_null, line 3, column c: "e"
+NOTICE: invalid values in 3 rows were replaced with null
+-- check inserted content
+select * from t_on_error_null;
+ a | b | c
+----+------+------
+ 10 | NULL | NULL
+ 11 | NULL | 12
+ 13 | 14 | NULL
+(3 rows)
+
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -828,6 +885,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 45273557ce0..003a91648e2 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,12 +67,15 @@ COPY x from stdin (force_null (a), force_null (b));
COPY x from stdin (convert_selectively (a), convert_selectively (b));
COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (on_error set_to_null, on_error ignore);
COPY x from stdin (log_verbosity default, log_verbosity verbose);
-- incorrect options
COPY x from stdin (format BINARY, delimiter ',');
COPY x from stdin (format BINARY, null 'x');
COPY x from stdin (format BINARY, on_error ignore);
+COPY x from stdin (format BINARY, on_error set_to_null);
+COPY x FROM stdin (on_error set_to_null, reject_limit 2);
COPY x from stdin (on_error unsupported);
COPY x from stdin (format TEXT, force_quote(a));
COPY x from stdin (format TEXT, force_quote *);
@@ -87,6 +90,7 @@ COPY x from stdin (format TEXT, force_null *);
COPY x to stdout (format CSV, force_null(a));
COPY x to stdout (format CSV, force_null *);
COPY x to stdout (format BINARY, on_error unsupported);
+COPY x to stdin (on_error set_to_null);
COPY x from stdin (log_verbosity unsupported);
COPY x from stdin with (reject_limit 1);
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -534,6 +538,45 @@ a {2} 2
8 {8} 8
\.
+CREATE DOMAIN d_int_not_null AS INT NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS INT CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c INT);
+
+\pset null NULL
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+\N 11 13
+\.
+
+--fail, column a is domain with not-null constraint
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+a 11 14
+\.
+
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null);
+-1 11 13
+\.
+
+--fail. less data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_to_null);
+1,1
+\.
+--fail. extra data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_to_null);
+1,2,3,4
+\.
+
+--ok
+COPY t_on_error_null FROM STDIN WITH (on_error set_to_null, log_verbosity verbose);
+10 a d
+11 b 12
+13 14 e
+\.
+
+-- check inserted content
+select * from t_on_error_null;
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -603,6 +646,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
--
2.34.1
[application/octet-stream] v14-0001-COPY-on_error-null.nocfbot (20.0K, 3-v14-0001-COPY-on_error-null.nocfbot)
download
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2025-03-25 06:31 vignesh C <[email protected]>
parent: jian he <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: vignesh C @ 2025-03-25 06:31 UTC (permalink / raw)
To: jian he <[email protected]>; +Cc: Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; torikoshia <[email protected]>; PostgreSQL Hackers <[email protected]>
On Mon, 24 Mar 2025 at 13:21, jian he <[email protected]> wrote:
>
> I am not sure this is what we want.
> Anyway, I attached both two version
Few comments
1) I understood the problem, your first approach is ok for me.
2) Here in error we say column c1 violates not-null constraint and in
the context we show column c2, should the context also display c2
column:
postgres=# create table t3(c1 int not null, c2 int, check (c1 > 10));
CREATE TABLE
postgres=# COPY t3 FROM STDIN WITH (on_error set_to_null);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> a b
>> \.
ERROR: null value in column "c1" of relation "t3" violates not-null constraint
DETAIL: Failing row contains (null, null).
CONTEXT: COPY t3, line 1, column c2: "b"
3) typo becomen should be become:
null will becomen reserved to non-reserved
4) There is a whitespace error while applying patch
Applying: COPY (on_error set_to_null)
.git/rebase-apply/patch:39: trailing whitespace.
a <literal>NOTICE</literal> message indicating the number of rows
warning: 1 line adds whitespace errors.
Regards,
Vignesh
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2025-07-01 14:54 torikoshia <[email protected]>
parent: vignesh C <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: torikoshia @ 2025-07-01 14:54 UTC (permalink / raw)
To: jian he <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
Hi,
Thanks for updating the patch and I've read
v17-0001-COPY-on_error-set_null.patch and here are some comments.
> +COPY x from stdin (on_error set_null, reject_limit 2);
> +ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
I understand that REJECT_LIMIT is out of scope for this patch, but
personally, I feel that supporting REJECT_LIMIT with ON_ERROR SET_NULL
would be a natural extension.
- Both IGNORE and SET_NULL share the common behavior of allowing COPY to
continue despite soft errors.
- Since REJECT_LIMIT defines the threshold for how many soft errors can
be tolerated before COPY fails, it seems consistent to allow it with
SET_NULL as well.
+ if (current_row_erroneous)
+ cstate->num_errors++;
Is there any reason this error counting isn't placed inside the "if
(cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)" block?
As far as I can tell, current_row_erroneous is only modified within that
block, so it might make sense to keep this logic together for clarity.
These may be very minor, but I noticed a few inconsistencies in casing
and wording:
+ * If ON_ERROR is specified with IGNORE, skip rows with
soft errors.
+ * If ON_ERROR is specified with set_null, try to
replace with null.
IGNORE is in uppercase, but set_null is lowercase.
+ * we use it to count number of rows
(not fields!) that
+ * successfully applied on_error
set_null.
The sentence should begin with a capital: "We use it..."
Also, I felt it's unclear what "we use it" means. Does it necessary?
+COPY x to stdout (on_error set_null);
+ERROR: COPY ON_ERROR cannot be used with COPY TO
+LINE 1: COPY x to stdout (on_error set_null);
COPY is uppercase, but to is lowercase.
+COPY x from stdin (format BINARY, on_error set_null);
+ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (on_error set_null, reject_limit 2);
+ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
...
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: null input
It might be better to consider standardizing casing across all COPY
statements (e.g., COPY ... TO, COPY ... FROM STDIN) for consistency.
--
Regards,
--
Atsushi Torikoshi
Seconded from NTT DATA Japan Corporation to SRA OSS K.K.
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2025-07-02 09:25 jian he <[email protected]>
parent: torikoshia <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: jian he @ 2025-07-02 09:25 UTC (permalink / raw)
To: torikoshia <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Tue, Jul 1, 2025 at 10:54 PM torikoshia <[email protected]> wrote:
>
> Hi,
>
> Thanks for updating the patch and I've read
> v17-0001-COPY-on_error-set_null.patch and here are some comments.
>
> + if (current_row_erroneous)
> + cstate->num_errors++;
>
> Is there any reason this error counting isn't placed inside the "if
> (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)" block?
> As far as I can tell, current_row_erroneous is only modified within that
> block, so it might make sense to keep this logic together for clarity.
>
> These may be very minor, but I noticed a few inconsistencies in casing
> and wording:
>
> + * If ON_ERROR is specified with IGNORE, skip rows with
> soft errors.
> + * If ON_ERROR is specified with set_null, try to
> replace with null.
>
> IGNORE is in uppercase, but set_null is lowercase.
>
> + * we use it to count number of rows
> (not fields!) that
> + * successfully applied on_error
> set_null.
>
> The sentence should begin with a capital: "We use it..."
> Also, I felt it's unclear what "we use it" means. Does it necessary?
>
hi.
I changed this comment, also heavily refactored CopyFromTextLikeOneRow based on
v17-0001-COPY-on_error-set_null.patch.
Now it looks way more intuitive, IMHO.
CopyFromTextLikeOneRow
else if (!InputFunctionCallSafe(&in_functions[m],
string,
typioparams[m],
att->atttypmod,
(Node *) cstate->escontext,
&values[m]))
{
if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
////code for on_errr ignore
else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
////code for on_errr set_null
if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
//code for verbose message for on_error ignore or on_error set_null
if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
////code for on_errr ignore loop control
else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
////code for on_errr set_null loop control
}
> +COPY x to stdout (on_error set_null);
> +ERROR: COPY ON_ERROR cannot be used with COPY TO
> +LINE 1: COPY x to stdout (on_error set_null);
>
> COPY is uppercase, but to is lowercase.
>
> +COPY x from stdin (format BINARY, on_error set_null);
> +ERROR: only ON_ERROR STOP is allowed in BINARY mode
> +COPY x from stdin (on_error set_null, reject_limit 2);
> +ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
> ...
> +COPY t_on_error_null FROM STDIN WITH (on_error set_null);
> +ERROR: domain d_int_not_null does not allow null values
> +CONTEXT: COPY t_on_error_null, line 1, column a: null input
>
> It might be better to consider standardizing casing across all COPY
> statements (e.g., COPY ... TO, COPY ... FROM STDIN) for consistency.
>
I followed near code conventions, changing the casing here seems not necessary.
Attachments:
[text/x-patch] v18-0001-COPY-on_error-set_null.patch (22.7K, 2-v18-0001-COPY-on_error-set_null.patch)
download | inline diff:
From feded9f7562f608ec97cbb08399661c6494df021 Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Wed, 2 Jul 2025 16:32:57 +0800
Subject: [PATCH v18 1/1] COPY (on_error set_null)
Current grammar makes us unable to use "on_error null". if we did it, then in
all the COPY command options's value, null will become reserved to non-reserved
words. so we choose "on_error set_null".
When COPY FROM, if ON_ERROR SET_NULL is specified, any data type conversion
errors will result in the affected column being set to NULL. However, column's
not-null constraints are still enforced, attempting to set a NULL value in such
columns will raise a constraint violation error. This applies to column data
type is a domain with a NOT NULL constraint.
Author: Jian He <[email protected]>
Author: Kirill Reshke <[email protected]>
Reviewed-by:
Fujii Masao <[email protected]>
Jim Jones <[email protected]>
"David G. Johnston" <[email protected]>
Yugo NAGATA <[email protected]>
torikoshia <[email protected]>
Masahiko Sawada <[email protected]>
Atsushi Torikoshi <[email protected]>
discussion: https://postgr.es/m/CAKFQuwawy1e6YR4S=j+y7pXqg_Dw1WBVrgvf=BP3d1_aSfe_+Q@mail.gmail.com
commitfest entry: https://commitfest.postgresql.org/patch/4810
---
doc/src/sgml/ref/copy.sgml | 35 +++++++---
src/backend/commands/copy.c | 6 +-
src/backend/commands/copyfrom.c | 42 +++++++++---
src/backend/commands/copyfromparse.c | 84 ++++++++++++++++++++----
src/bin/psql/tab-complete.in.c | 2 +-
src/include/commands/copy.h | 1 +
src/include/commands/copyfrom_internal.h | 7 ++
src/test/regress/expected/copy2.out | 60 +++++++++++++++++
src/test/regress/sql/copy2.sql | 46 +++++++++++++
9 files changed, 247 insertions(+), 36 deletions(-)
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 8433344e5b6..26fb4be1709 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -394,23 +394,37 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
Specifies how to behave when encountering an error converting a column's
input value into its data type.
An <replaceable class="parameter">error_action</replaceable> value of
- <literal>stop</literal> means fail the command, while
- <literal>ignore</literal> means discard the input row and continue with the next one.
+ <literal>stop</literal> means fail the command,
+ <literal>ignore</literal> means discard the input row and continue with the next one,
+ and <literal>set_null</literal> means replace column containing invalid
+ input value with <literal>NULL</literal> and move to the next field.
The default is <literal>stop</literal>.
</para>
<para>
- The <literal>ignore</literal> option is applicable only for <command>COPY FROM</command>
+ The <literal>ignore</literal> and <literal>set_null</literal>
+ options are applicable only for <command>COPY FROM</command>
when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
</para>
+ <para>
+ For <literal>ignore</literal> option, a <literal>NOTICE</literal> message
+ containing the ignored row count is emitted at the end of the <command>COPY
+ FROM</command> if at least one row was discarded.
+ For <literal>set_null</literal> option,
+ a <literal>NOTICE</literal> message indicating the number of rows
+ where invalid input values were replaced with null is emitted
+ at the end of the <command>COPY FROM</command> if at least one row was replaced.
+ </para>
<para>
- A <literal>NOTICE</literal> message containing the ignored row count is
- emitted at the end of the <command>COPY FROM</command> if at least one
- row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
- <literal>verbose</literal>, a <literal>NOTICE</literal> message
+ When <literal>LOG_VERBOSITY</literal> option is set to <literal>verbose</literal>,
+ for <literal>ignore</literal> option, a <literal>NOTICE</literal> message
containing the line of the input file and the column name whose input
- conversion has failed is emitted for each discarded row.
+ conversion has failed is emitted for each discarded row;
+ for <literal>set_null</literal> option, a <literal>NOTICE</literal>
+ message containing the line of the input file and the column name where
+ value was replaced with <literal>NULL</literal> for each input conversion
+ failure.
When it is set to <literal>silent</literal>, no message is emitted
- regarding ignored rows.
+ regarding input conversion failed rows.
</para>
</listitem>
</varlistentry>
@@ -458,7 +472,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
</para>
<para>
This is currently used in <command>COPY FROM</command> command when
- <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+ <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>
+ or <literal>set_null</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 74ae42b19a7..f963d0e51ff 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -403,12 +403,14 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
parser_errposition(pstate, def->location)));
/*
- * Allow "stop", or "ignore" values.
+ * Allow "stop", "ignore", "set_null" values.
*/
if (pg_strcasecmp(sval, "stop") == 0)
return COPY_ON_ERROR_STOP;
if (pg_strcasecmp(sval, "ignore") == 0)
return COPY_ON_ERROR_IGNORE;
+ if (pg_strcasecmp(sval, "set_null") == 0)
+ return COPY_ON_ERROR_SET_NULL;
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -918,7 +920,7 @@ ProcessCopyOptions(ParseState *pstate,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("only ON_ERROR STOP is allowed in BINARY mode")));
- if (opts_out->reject_limit && !opts_out->on_error)
+ if (opts_out->reject_limit && opts_out->on_error != COPY_ON_ERROR_IGNORE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first and second %s are the names of COPY option, e.g.
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index fbbbc09a97b..750d597d4d0 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1467,14 +1467,22 @@ CopyFrom(CopyFromState cstate)
/* Done, clean up */
error_context_stack = errcallback.previous;
- if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
- cstate->num_errors > 0 &&
+ if (cstate->num_errors > 0 &&
cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
- ereport(NOTICE,
- errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
- "%" PRIu64 " rows were skipped due to data type incompatibility",
- cstate->num_errors,
- cstate->num_errors));
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
+ "%" PRIu64 " rows were skipped due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg_plural("invalid values in %" PRIu64 " row was replaced with null due to data type incompatibility",
+ "invalid values in %" PRIu64 " rows were replaced with null due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ }
if (bistate != NULL)
FreeBulkInsertState(bistate);
@@ -1614,6 +1622,19 @@ BeginCopyFrom(ParseState *pstate,
}
}
+ if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ int attr_count = list_length(cstate->attnumlist);
+
+ cstate->domain_with_constraint = (bool *) palloc0(attr_count * sizeof(bool));
+ foreach_int(attno, cstate->attnumlist)
+ {
+ int i = foreach_current_index(attno);
+ Form_pg_attribute att = TupleDescAttr(tupDesc, attno - 1);
+ cstate->domain_with_constraint[i] = DomainHasConstraints(att->atttypid);
+ }
+ }
+
/* Set up soft error handler for ON_ERROR */
if (cstate->opts.on_error != COPY_ON_ERROR_STOP)
{
@@ -1622,10 +1643,11 @@ BeginCopyFrom(ParseState *pstate,
cstate->escontext->error_occurred = false;
/*
- * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
- * options later
+ * Currently we only support COPY_ON_ERROR_IGNORE, COPY_ON_ERROR_SET_NULL.
+ * We'll add other options later
*/
- if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE ||
+ cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
cstate->escontext->details_wanted = false;
}
else
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index f52f2477df1..2147d8423fd 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -947,6 +947,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
int fldct;
int fieldno;
char *string;
+ bool current_row_erroneous = false;
tupDesc = RelationGetDescr(cstate->rel);
attr_count = list_length(cstate->attnumlist);
@@ -1024,7 +1025,8 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
}
/*
- * If ON_ERROR is specified with IGNORE, skip rows with soft errors
+ * If ON_ERROR is specified with IGNORE, skip rows with soft errors.
+ * If ON_ERROR is specified with SET_NULL, try to replace with null.
*/
else if (!InputFunctionCallSafe(&in_functions[m],
string,
@@ -1035,7 +1037,50 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
{
Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
- cstate->num_errors++;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ cstate->num_errors++;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ cstate->escontext->error_occurred = false;
+ Assert(cstate->domain_with_constraint != NULL);
+
+ /*
+ * When the column's type is a domain with constraints, an
+ * additional InputFunctionCallSafe may be needed to raise
+ * errors for domain constraint violations.
+ */
+ if (!cstate->domain_with_constraint[m] ||
+ InputFunctionCallSafe(&in_functions[m],
+ NULL,
+ typioparams[m],
+ att->atttypmod,
+ (Node *) cstate->escontext,
+ &values[m]))
+ {
+ nulls[m] = true;
+ values[m] = (Datum) 0;
+ }
+ else if (string == NULL)
+ ereport(ERROR,
+ errcode(ERRCODE_NOT_NULL_VIOLATION),
+ errmsg("domain %s does not allow null values", format_type_be(typioparams[m])),
+ errdatatype(typioparams[m]));
+ else
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("invalid input value for domain %s: \"%s\"",
+ format_type_be(typioparams[m]), string));
+
+ /*
+ * We count only the number of rows (not individual fields)
+ * where ON_ERROR SET_NULL was successfully applied.
+ */
+ if (!current_row_erroneous)
+ {
+ current_row_erroneous = true;
+ cstate->num_errors++;
+ }
+ }
if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
{
@@ -1052,24 +1097,37 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
char *attval;
attval = CopyLimitPrintoutLength(cstate->cur_attval);
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
- cstate->cur_lineno,
- cstate->cur_attname,
- attval));
+
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg("setting to null due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
pfree(attval);
}
else
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
- cstate->cur_lineno,
- cstate->cur_attname));
-
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
+ cstate->cur_lineno,
+ cstate->cur_attname));
+ }
/* reset relname_only */
cstate->relname_only = false;
}
- return true;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ return true;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ continue;
}
cstate->cur_attname = NULL;
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 8c2ea0b9587..a587f4162ee 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -3305,7 +3305,7 @@ match_previous_words(int pattern_id,
/* Complete COPY <sth> FROM filename WITH (ON_ERROR */
else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "ON_ERROR"))
- COMPLETE_WITH("stop", "ignore");
+ COMPLETE_WITH("stop", "ignore", "set_null");
/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
else if (Matches("COPY|\\copy", MatchAny, "FROM|TO", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 06dfdfef721..935d21ee77a 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -38,6 +38,7 @@ typedef enum CopyOnErrorChoice
{
COPY_ON_ERROR_STOP = 0, /* immediately throw errors, default */
COPY_ON_ERROR_IGNORE, /* ignore errors */
+ COPY_ON_ERROR_SET_NULL, /* set error field to null */
} CopyOnErrorChoice;
/*
diff --git a/src/include/commands/copyfrom_internal.h b/src/include/commands/copyfrom_internal.h
index c8b22af22d8..c82bfab4636 100644
--- a/src/include/commands/copyfrom_internal.h
+++ b/src/include/commands/copyfrom_internal.h
@@ -108,6 +108,13 @@ typedef struct CopyFromStateData
* att */
bool *defaults; /* if DEFAULT marker was found for
* corresponding att */
+ /*
+ * Set to true if the corresponding attribute's data type is a domain with
+ * constraints. This field is usually NULL, except when ON_ERROR is set to
+ * SET_NULL.
+ */
+ bool *domain_with_constraint;
+
bool volatile_defexprs; /* is any of defexprs volatile? */
List *range_table; /* single element list of RangeTblEntry */
List *rteperminfos; /* single element list of RTEPermissionInfo */
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 64ea33aeae8..3f843d1cd5c 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
^
+COPY x from stdin (on_error set_null, on_error ignore);
+ERROR: conflicting or redundant options
+LINE 1: COPY x from stdin (on_error set_null, on_error ignore);
+ ^
COPY x from stdin (log_verbosity default, log_verbosity verbose);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
@@ -92,6 +96,10 @@ COPY x from stdin (format BINARY, null 'x');
ERROR: cannot specify NULL in BINARY mode
COPY x from stdin (format BINARY, on_error ignore);
ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (format BINARY, on_error set_null);
+ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (on_error set_null, reject_limit 2);
+ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
@@ -124,6 +132,10 @@ COPY x to stdout (format BINARY, on_error unsupported);
ERROR: COPY ON_ERROR cannot be used with COPY TO
LINE 1: COPY x to stdout (format BINARY, on_error unsupported);
^
+COPY x to stdout (on_error set_null);
+ERROR: COPY ON_ERROR cannot be used with COPY TO
+LINE 1: COPY x to stdout (on_error set_null);
+ ^
COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
@@ -769,6 +781,51 @@ CONTEXT: COPY check_ign_err
NOTICE: skipping row due to data type incompatibility at line 8 for column "k": "a"
CONTEXT: COPY check_ign_err
NOTICE: 6 rows were skipped due to data type incompatibility
+CREATE DOMAIN d_int_not_null AS INT NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS INT CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c INT);
+\pset null NULL
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: null input
+--fail, column a is domain with not-null constraint
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+ERROR: invalid input value for domain d_int_not_null: "ss"
+CONTEXT: COPY t_on_error_null, line 1, column a: "ss"
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+ERROR: invalid input value for domain d_int_not_null: "-1"
+CONTEXT: COPY t_on_error_null, line 1, column a: "-1"
+--fail. less data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: missing data for column "c"
+CONTEXT: COPY t_on_error_null, line 1: "1,1"
+--fail. extra data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: extra data after last expected column
+CONTEXT: COPY t_on_error_null, line 1: "1,2,3,4"
+--ok
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose);
+NOTICE: setting to null due to data type incompatibility at line 1 for column "b": "x1"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 1 for column "c": "yx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 2 for column "b": "zx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 3 for column "c": "ea"
+CONTEXT: COPY t_on_error_null
+NOTICE: invalid values in 3 rows were replaced with null due to data type incompatibility
+-- check inserted content
+select * from t_on_error_null;
+ a | b | c
+----+------+------
+ 10 | NULL | NULL
+ 11 | NULL | 12
+ 13 | 14 | NULL
+(3 rows)
+
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -828,6 +885,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 45273557ce0..d77a06668e8 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,12 +67,15 @@ COPY x from stdin (force_null (a), force_null (b));
COPY x from stdin (convert_selectively (a), convert_selectively (b));
COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (on_error set_null, on_error ignore);
COPY x from stdin (log_verbosity default, log_verbosity verbose);
-- incorrect options
COPY x from stdin (format BINARY, delimiter ',');
COPY x from stdin (format BINARY, null 'x');
COPY x from stdin (format BINARY, on_error ignore);
+COPY x from stdin (format BINARY, on_error set_null);
+COPY x from stdin (on_error set_null, reject_limit 2);
COPY x from stdin (on_error unsupported);
COPY x from stdin (format TEXT, force_quote(a));
COPY x from stdin (format TEXT, force_quote *);
@@ -87,6 +90,7 @@ COPY x from stdin (format TEXT, force_null *);
COPY x to stdout (format CSV, force_null(a));
COPY x to stdout (format CSV, force_null *);
COPY x to stdout (format BINARY, on_error unsupported);
+COPY x to stdout (on_error set_null);
COPY x from stdin (log_verbosity unsupported);
COPY x from stdin with (reject_limit 1);
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -534,6 +538,45 @@ a {2} 2
8 {8} 8
\.
+CREATE DOMAIN d_int_not_null AS INT NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS INT CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c INT);
+
+\pset null NULL
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+\N 11 13
+\.
+
+--fail, column a is domain with not-null constraint
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+ss 11 14
+\.
+
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+-1 11 13
+\.
+
+--fail. less data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,1
+\.
+--fail. extra data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,2,3,4
+\.
+
+--ok
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose);
+10 x1 yx
+11 zx 12
+13 14 ea
+\.
+
+-- check inserted content
+select * from t_on_error_null;
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -603,6 +646,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
--
2.34.1
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2025-07-30 04:44 jian he <[email protected]>
parent: jian he <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: jian he @ 2025-07-30 04:44 UTC (permalink / raw)
To: torikoshia <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
hi.
rebase.
Attachments:
[text/x-patch] v19-0001-COPY-on_error-set_null.patch (22.7K, 2-v19-0001-COPY-on_error-set_null.patch)
download | inline diff:
From b3b2d794c83b36cf129d917d527ebf2cac46ca3b Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Wed, 30 Jul 2025 11:06:17 +0800
Subject: [PATCH v19 1/1] COPY (on_error set_null)
Current grammar makes us unable to use "on_error null". if we did it, then in
all the COPY command options's value, null will become reserved to non-reserved
words. so we choose "on_error set_null".
When COPY FROM, if ON_ERROR SET_NULL is specified, any data type conversion
errors will result in the affected column being set to NULL. However, column's
not-null constraints are still enforced, attempting to set a NULL value in such
columns will raise a constraint violation error. This applies to column data
type is a domain with a NOT NULL constraint.
Author: Jian He <[email protected]>
Author: Kirill Reshke <[email protected]>
Reviewed-by:
Fujii Masao <[email protected]>
Jim Jones <[email protected]>
"David G. Johnston" <[email protected]>
Yugo NAGATA <[email protected]>
torikoshia <[email protected]>
Masahiko Sawada <[email protected]>
Atsushi Torikoshi <[email protected]>
discussion: https://postgr.es/m/CAKFQuwawy1e6YR4S=j+y7pXqg_Dw1WBVrgvf=BP3d1_aSfe_+Q@mail.gmail.com
commitfest entry: https://commitfest.postgresql.org/patch/4810
---
doc/src/sgml/ref/copy.sgml | 35 +++++++---
src/backend/commands/copy.c | 6 +-
src/backend/commands/copyfrom.c | 42 +++++++++---
src/backend/commands/copyfromparse.c | 84 ++++++++++++++++++++----
src/bin/psql/tab-complete.in.c | 2 +-
src/include/commands/copy.h | 1 +
src/include/commands/copyfrom_internal.h | 7 ++
src/test/regress/expected/copy2.out | 60 +++++++++++++++++
src/test/regress/sql/copy2.sql | 46 +++++++++++++
9 files changed, 247 insertions(+), 36 deletions(-)
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index c2d1fbc1fbe..a36e33f320f 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -412,23 +412,37 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
Specifies how to behave when encountering an error converting a column's
input value into its data type.
An <replaceable class="parameter">error_action</replaceable> value of
- <literal>stop</literal> means fail the command, while
- <literal>ignore</literal> means discard the input row and continue with the next one.
+ <literal>stop</literal> means fail the command,
+ <literal>ignore</literal> means discard the input row and continue with the next one,
+ and <literal>set_null</literal> means replace column containing invalid
+ input value with <literal>NULL</literal> and move to the next field.
The default is <literal>stop</literal>.
</para>
<para>
- The <literal>ignore</literal> option is applicable only for <command>COPY FROM</command>
+ The <literal>ignore</literal> and <literal>set_null</literal>
+ options are applicable only for <command>COPY FROM</command>
when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
</para>
+ <para>
+ For <literal>ignore</literal> option, a <literal>NOTICE</literal> message
+ containing the ignored row count is emitted at the end of the <command>COPY
+ FROM</command> if at least one row was discarded.
+ For <literal>set_null</literal> option,
+ a <literal>NOTICE</literal> message indicating the number of rows
+ where invalid input values were replaced with null is emitted
+ at the end of the <command>COPY FROM</command> if at least one row was replaced.
+ </para>
<para>
- A <literal>NOTICE</literal> message containing the ignored row count is
- emitted at the end of the <command>COPY FROM</command> if at least one
- row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
- <literal>verbose</literal>, a <literal>NOTICE</literal> message
+ When <literal>LOG_VERBOSITY</literal> option is set to <literal>verbose</literal>,
+ for <literal>ignore</literal> option, a <literal>NOTICE</literal> message
containing the line of the input file and the column name whose input
- conversion has failed is emitted for each discarded row.
+ conversion has failed is emitted for each discarded row;
+ for <literal>set_null</literal> option, a <literal>NOTICE</literal>
+ message containing the line of the input file and the column name where
+ value was replaced with <literal>NULL</literal> for each input conversion
+ failure.
When it is set to <literal>silent</literal>, no message is emitted
- regarding ignored rows.
+ regarding input conversion failed rows.
</para>
</listitem>
</varlistentry>
@@ -476,7 +490,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
</para>
<para>
This is currently used in <command>COPY FROM</command> command when
- <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+ <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>
+ or <literal>set_null</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index fae9c41db65..9213bfb167f 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -413,12 +413,14 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
parser_errposition(pstate, def->location)));
/*
- * Allow "stop", or "ignore" values.
+ * Allow "stop", "ignore", "set_null" values.
*/
if (pg_strcasecmp(sval, "stop") == 0)
return COPY_ON_ERROR_STOP;
if (pg_strcasecmp(sval, "ignore") == 0)
return COPY_ON_ERROR_IGNORE;
+ if (pg_strcasecmp(sval, "set_null") == 0)
+ return COPY_ON_ERROR_SET_NULL;
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -928,7 +930,7 @@ ProcessCopyOptions(ParseState *pstate,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("only ON_ERROR STOP is allowed in BINARY mode")));
- if (opts_out->reject_limit && !opts_out->on_error)
+ if (opts_out->reject_limit && opts_out->on_error != COPY_ON_ERROR_IGNORE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first and second %s are the names of COPY option, e.g.
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index fbbbc09a97b..750d597d4d0 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1467,14 +1467,22 @@ CopyFrom(CopyFromState cstate)
/* Done, clean up */
error_context_stack = errcallback.previous;
- if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
- cstate->num_errors > 0 &&
+ if (cstate->num_errors > 0 &&
cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
- ereport(NOTICE,
- errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
- "%" PRIu64 " rows were skipped due to data type incompatibility",
- cstate->num_errors,
- cstate->num_errors));
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
+ "%" PRIu64 " rows were skipped due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg_plural("invalid values in %" PRIu64 " row was replaced with null due to data type incompatibility",
+ "invalid values in %" PRIu64 " rows were replaced with null due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ }
if (bistate != NULL)
FreeBulkInsertState(bistate);
@@ -1614,6 +1622,19 @@ BeginCopyFrom(ParseState *pstate,
}
}
+ if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ int attr_count = list_length(cstate->attnumlist);
+
+ cstate->domain_with_constraint = (bool *) palloc0(attr_count * sizeof(bool));
+ foreach_int(attno, cstate->attnumlist)
+ {
+ int i = foreach_current_index(attno);
+ Form_pg_attribute att = TupleDescAttr(tupDesc, attno - 1);
+ cstate->domain_with_constraint[i] = DomainHasConstraints(att->atttypid);
+ }
+ }
+
/* Set up soft error handler for ON_ERROR */
if (cstate->opts.on_error != COPY_ON_ERROR_STOP)
{
@@ -1622,10 +1643,11 @@ BeginCopyFrom(ParseState *pstate,
cstate->escontext->error_occurred = false;
/*
- * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
- * options later
+ * Currently we only support COPY_ON_ERROR_IGNORE, COPY_ON_ERROR_SET_NULL.
+ * We'll add other options later
*/
- if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE ||
+ cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
cstate->escontext->details_wanted = false;
}
else
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index b1ae97b833d..ee887a37afd 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -956,6 +956,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
int fldct;
int fieldno;
char *string;
+ bool current_row_erroneous = false;
tupDesc = RelationGetDescr(cstate->rel);
attr_count = list_length(cstate->attnumlist);
@@ -1033,7 +1034,8 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
}
/*
- * If ON_ERROR is specified with IGNORE, skip rows with soft errors
+ * If ON_ERROR is specified with IGNORE, skip rows with soft errors.
+ * If ON_ERROR is specified with SET_NULL, try to replace with null.
*/
else if (!InputFunctionCallSafe(&in_functions[m],
string,
@@ -1044,7 +1046,50 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
{
Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
- cstate->num_errors++;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ cstate->num_errors++;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ cstate->escontext->error_occurred = false;
+ Assert(cstate->domain_with_constraint != NULL);
+
+ /*
+ * If the column type is a domain with constraints, an
+ * additional InputFunctionCallSafe may be needed to raise
+ * errors for domain constraint violations.
+ */
+ if (!cstate->domain_with_constraint[m] ||
+ InputFunctionCallSafe(&in_functions[m],
+ NULL,
+ typioparams[m],
+ att->atttypmod,
+ (Node *) cstate->escontext,
+ &values[m]))
+ {
+ nulls[m] = true;
+ values[m] = (Datum) 0;
+ }
+ else if (string == NULL)
+ ereport(ERROR,
+ errcode(ERRCODE_NOT_NULL_VIOLATION),
+ errmsg("domain %s does not allow null values", format_type_be(typioparams[m])),
+ errdatatype(typioparams[m]));
+ else
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("invalid input value for domain %s: \"%s\"",
+ format_type_be(typioparams[m]), string));
+
+ /*
+ * We count only the number of rows (not individual fields)
+ * where ON_ERROR SET_NULL was successfully applied.
+ */
+ if (!current_row_erroneous)
+ {
+ current_row_erroneous = true;
+ cstate->num_errors++;
+ }
+ }
if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
{
@@ -1061,24 +1106,37 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
char *attval;
attval = CopyLimitPrintoutLength(cstate->cur_attval);
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
- cstate->cur_lineno,
- cstate->cur_attname,
- attval));
+
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg("setting to null due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
pfree(attval);
}
else
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
- cstate->cur_lineno,
- cstate->cur_attname));
-
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
+ cstate->cur_lineno,
+ cstate->cur_attname));
+ }
/* reset relname_only */
cstate->relname_only = false;
}
- return true;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ return true;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ continue;
}
cstate->cur_attname = NULL;
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index dbc586c5bc3..c88593e4158 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -3343,7 +3343,7 @@ match_previous_words(int pattern_id,
/* Complete COPY <sth> FROM filename WITH (ON_ERROR */
else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", "(", "ON_ERROR"))
- COMPLETE_WITH("stop", "ignore");
+ COMPLETE_WITH("stop", "ignore", "set_null");
/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAny, "WITH", "(", "LOG_VERBOSITY"))
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 541176e1980..da3622028e7 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -35,6 +35,7 @@ typedef enum CopyOnErrorChoice
{
COPY_ON_ERROR_STOP = 0, /* immediately throw errors, default */
COPY_ON_ERROR_IGNORE, /* ignore errors */
+ COPY_ON_ERROR_SET_NULL, /* set error field to null */
} CopyOnErrorChoice;
/*
diff --git a/src/include/commands/copyfrom_internal.h b/src/include/commands/copyfrom_internal.h
index c8b22af22d8..c82bfab4636 100644
--- a/src/include/commands/copyfrom_internal.h
+++ b/src/include/commands/copyfrom_internal.h
@@ -108,6 +108,13 @@ typedef struct CopyFromStateData
* att */
bool *defaults; /* if DEFAULT marker was found for
* corresponding att */
+ /*
+ * Set to true if the corresponding attribute's data type is a domain with
+ * constraints. This field is usually NULL, except when ON_ERROR is set to
+ * SET_NULL.
+ */
+ bool *domain_with_constraint;
+
bool volatile_defexprs; /* is any of defexprs volatile? */
List *range_table; /* single element list of RangeTblEntry */
List *rteperminfos; /* single element list of RTEPermissionInfo */
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index caa3c44f0d0..919a2296c27 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
^
+COPY x from stdin (on_error set_null, on_error ignore);
+ERROR: conflicting or redundant options
+LINE 1: COPY x from stdin (on_error set_null, on_error ignore);
+ ^
COPY x from stdin (log_verbosity default, log_verbosity verbose);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
@@ -92,6 +96,10 @@ COPY x from stdin (format BINARY, null 'x');
ERROR: cannot specify NULL in BINARY mode
COPY x from stdin (format BINARY, on_error ignore);
ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (format BINARY, on_error set_null);
+ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (on_error set_null, reject_limit 2);
+ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
@@ -124,6 +132,10 @@ COPY x to stdout (format BINARY, on_error unsupported);
ERROR: COPY ON_ERROR cannot be used with COPY TO
LINE 1: COPY x to stdout (format BINARY, on_error unsupported);
^
+COPY x to stdout (on_error set_null);
+ERROR: COPY ON_ERROR cannot be used with COPY TO
+LINE 1: COPY x to stdout (on_error set_null);
+ ^
COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
@@ -775,6 +787,51 @@ CONTEXT: COPY check_ign_err
NOTICE: skipping row due to data type incompatibility at line 8 for column "k": "a"
CONTEXT: COPY check_ign_err
NOTICE: 6 rows were skipped due to data type incompatibility
+CREATE DOMAIN d_int_not_null AS INT NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS INT CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c INT);
+\pset null NULL
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: null input
+--fail, column a is domain with not-null constraint
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+ERROR: invalid input value for domain d_int_not_null: "ss"
+CONTEXT: COPY t_on_error_null, line 1, column a: "ss"
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+ERROR: invalid input value for domain d_int_not_null: "-1"
+CONTEXT: COPY t_on_error_null, line 1, column a: "-1"
+--fail. less data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: missing data for column "c"
+CONTEXT: COPY t_on_error_null, line 1: "1,1"
+--fail. extra data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: extra data after last expected column
+CONTEXT: COPY t_on_error_null, line 1: "1,2,3,4"
+--ok
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose);
+NOTICE: setting to null due to data type incompatibility at line 1 for column "b": "x1"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 1 for column "c": "yx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 2 for column "b": "zx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 3 for column "c": "ea"
+CONTEXT: COPY t_on_error_null
+NOTICE: invalid values in 3 rows were replaced with null due to data type incompatibility
+-- check inserted content
+select * from t_on_error_null;
+ a | b | c
+----+------+------
+ 10 | NULL | NULL
+ 11 | NULL | 12
+ 13 | 14 | NULL
+(3 rows)
+
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -834,6 +891,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index cef45868db5..be05ed52def 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,12 +67,15 @@ COPY x from stdin (force_null (a), force_null (b));
COPY x from stdin (convert_selectively (a), convert_selectively (b));
COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (on_error set_null, on_error ignore);
COPY x from stdin (log_verbosity default, log_verbosity verbose);
-- incorrect options
COPY x from stdin (format BINARY, delimiter ',');
COPY x from stdin (format BINARY, null 'x');
COPY x from stdin (format BINARY, on_error ignore);
+COPY x from stdin (format BINARY, on_error set_null);
+COPY x from stdin (on_error set_null, reject_limit 2);
COPY x from stdin (on_error unsupported);
COPY x from stdin (format TEXT, force_quote(a));
COPY x from stdin (format TEXT, force_quote *);
@@ -87,6 +90,7 @@ COPY x from stdin (format TEXT, force_null *);
COPY x to stdout (format CSV, force_null(a));
COPY x to stdout (format CSV, force_null *);
COPY x to stdout (format BINARY, on_error unsupported);
+COPY x to stdout (on_error set_null);
COPY x from stdin (log_verbosity unsupported);
COPY x from stdin with (reject_limit 1);
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -537,6 +541,45 @@ a {2} 2
8 {8} 8
\.
+CREATE DOMAIN d_int_not_null AS INT NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS INT CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c INT);
+
+\pset null NULL
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+\N 11 13
+\.
+
+--fail, column a is domain with not-null constraint
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+ss 11 14
+\.
+
+--fail, column a cannot set to null value
+COPY t_on_error_null FROM STDIN WITH (on_error set_null);
+-1 11 13
+\.
+
+--fail. less data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,1
+\.
+--fail. extra data
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,2,3,4
+\.
+
+--ok
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose);
+10 x1 yx
+11 zx 12
+13 14 ea
+\.
+
+-- check inserted content
+select * from t_on_error_null;
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -606,6 +649,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
--
2.34.1
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2025-11-10 10:22 jian he <[email protected]>
parent: jian he <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: jian he @ 2025-11-10 10:22 UTC (permalink / raw)
To: torikoshia <[email protected]>; +Cc: Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
hi.
rebase and minor cosmetic changes.
--
jian
https://www.enterprisedb.com/
Attachments:
[text/x-patch] v20-0001-COPY-on_error-set_null.patch (22.4K, 2-v20-0001-COPY-on_error-set_null.patch)
download | inline diff:
From 05574920774c9bb2f93f9cb6ecea136aa673cba7 Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Mon, 10 Nov 2025 18:17:09 +0800
Subject: [PATCH v20 1/1] COPY (on_error set_null)
When COPY FROM, if ON_ERROR SET_NULL is specified, any data type conversion
errors will result in the affected column being set to NULL. However, column's
not-null constraints are still enforced, attempting to set a NULL value in such
columns will raise a constraint violation error. This applies to column data
type is a domain with a NOT NULL constraint.
Author: Jian He <[email protected]>
Author: Kirill Reshke <[email protected]>
Reviewed-by:
Fujii Masao <[email protected]>
Jim Jones <[email protected]>
"David G. Johnston" <[email protected]>
Yugo NAGATA <[email protected]>
torikoshia <[email protected]>
Masahiko Sawada <[email protected]>
Atsushi Torikoshi <[email protected]>
discussion: https://postgr.es/m/CAKFQuwawy1e6YR4S=j+y7pXqg_Dw1WBVrgvf=BP3d1_aSfe_+Q@mail.gmail.com
commitfest entry: https://commitfest.postgresql.org/patch/4810
---
doc/src/sgml/ref/copy.sgml | 35 +++++++---
src/backend/commands/copy.c | 6 +-
src/backend/commands/copyfrom.c | 42 +++++++++---
src/backend/commands/copyfromparse.c | 84 ++++++++++++++++++++----
src/bin/psql/tab-complete.in.c | 2 +-
src/include/commands/copy.h | 1 +
src/include/commands/copyfrom_internal.h | 7 ++
src/test/regress/expected/copy2.out | 55 ++++++++++++++++
src/test/regress/sql/copy2.sql | 43 ++++++++++++
9 files changed, 239 insertions(+), 36 deletions(-)
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index fdc24b36bb8..1f42fd0972d 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -412,23 +412,37 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
Specifies how to behave when encountering an error converting a column's
input value into its data type.
An <replaceable class="parameter">error_action</replaceable> value of
- <literal>stop</literal> means fail the command, while
- <literal>ignore</literal> means discard the input row and continue with the next one.
+ <literal>stop</literal> means fail the command,
+ <literal>ignore</literal> means discard the input row and continue with the next one,
+ and <literal>set_null</literal> means replace column containing invalid
+ input value with <literal>NULL</literal> and move to the next field.
The default is <literal>stop</literal>.
</para>
<para>
- The <literal>ignore</literal> option is applicable only for <command>COPY FROM</command>
+ The <literal>ignore</literal> and <literal>set_null</literal>
+ options are applicable only for <command>COPY FROM</command>
when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
</para>
+ <para>
+ For <literal>ignore</literal> option, a <literal>NOTICE</literal> message
+ containing the ignored row count is emitted at the end of the <command>COPY
+ FROM</command> if at least one row was discarded.
+ For <literal>set_null</literal> option,
+ a <literal>NOTICE</literal> message indicating the number of rows
+ where invalid input values were replaced with null is emitted
+ at the end of the <command>COPY FROM</command> if at least one row was replaced.
+ </para>
<para>
- A <literal>NOTICE</literal> message containing the ignored row count is
- emitted at the end of the <command>COPY FROM</command> if at least one
- row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
- <literal>verbose</literal>, a <literal>NOTICE</literal> message
+ When <literal>LOG_VERBOSITY</literal> option is set to <literal>verbose</literal>,
+ for <literal>ignore</literal> option, a <literal>NOTICE</literal> message
containing the line of the input file and the column name whose input
- conversion has failed is emitted for each discarded row.
+ conversion has failed is emitted for each discarded row;
+ for <literal>set_null</literal> option, a <literal>NOTICE</literal>
+ message containing the line of the input file and the column name where
+ value was replaced with <literal>NULL</literal> for each input conversion
+ failure.
When it is set to <literal>silent</literal>, no message is emitted
- regarding ignored rows.
+ regarding input conversion failed rows.
</para>
</listitem>
</varlistentry>
@@ -476,7 +490,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
</para>
<para>
This is currently used in <command>COPY FROM</command> command when
- <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+ <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>
+ or <literal>set_null</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 28e878c3688..9d0413f4cfa 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -456,12 +456,14 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
parser_errposition(pstate, def->location)));
/*
- * Allow "stop", or "ignore" values.
+ * Allow "stop", "ignore", "set_null" values.
*/
if (pg_strcasecmp(sval, "stop") == 0)
return COPY_ON_ERROR_STOP;
if (pg_strcasecmp(sval, "ignore") == 0)
return COPY_ON_ERROR_IGNORE;
+ if (pg_strcasecmp(sval, "set_null") == 0)
+ return COPY_ON_ERROR_SET_NULL;
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -971,7 +973,7 @@ ProcessCopyOptions(ParseState *pstate,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("only ON_ERROR STOP is allowed in BINARY mode")));
- if (opts_out->reject_limit && !opts_out->on_error)
+ if (opts_out->reject_limit && opts_out->on_error != COPY_ON_ERROR_IGNORE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first and second %s are the names of COPY option, e.g.
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 12781963b4f..87588b688f2 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -1467,14 +1467,22 @@ CopyFrom(CopyFromState cstate)
/* Done, clean up */
error_context_stack = errcallback.previous;
- if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
- cstate->num_errors > 0 &&
+ if (cstate->num_errors > 0 &&
cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
- ereport(NOTICE,
- errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
- "%" PRIu64 " rows were skipped due to data type incompatibility",
- cstate->num_errors,
- cstate->num_errors));
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
+ "%" PRIu64 " rows were skipped due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg_plural("invalid values in %" PRIu64 " row was replaced with null due to data type incompatibility",
+ "invalid values in %" PRIu64 " rows were replaced with null due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ }
if (bistate != NULL)
FreeBulkInsertState(bistate);
@@ -1614,6 +1622,19 @@ BeginCopyFrom(ParseState *pstate,
}
}
+ if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ int attr_count = list_length(cstate->attnumlist);
+
+ cstate->domain_with_constraint = (bool *) palloc0(attr_count * sizeof(bool));
+ foreach_int(attno, cstate->attnumlist)
+ {
+ int i = foreach_current_index(attno);
+ Form_pg_attribute att = TupleDescAttr(tupDesc, attno - 1);
+ cstate->domain_with_constraint[i] = DomainHasConstraints(att->atttypid);
+ }
+ }
+
/* Set up soft error handler for ON_ERROR */
if (cstate->opts.on_error != COPY_ON_ERROR_STOP)
{
@@ -1622,10 +1643,11 @@ BeginCopyFrom(ParseState *pstate,
cstate->escontext->error_occurred = false;
/*
- * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
- * options later
+ * Currently we only support COPY_ON_ERROR_IGNORE, COPY_ON_ERROR_SET_NULL.
+ * We'll add other options later
*/
- if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE ||
+ cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
cstate->escontext->details_wanted = false;
}
else
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index b1ae97b833d..7c0d13ab38b 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -956,6 +956,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
int fldct;
int fieldno;
char *string;
+ bool current_row_erroneous = false;
tupDesc = RelationGetDescr(cstate->rel);
attr_count = list_length(cstate->attnumlist);
@@ -1033,7 +1034,8 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
}
/*
- * If ON_ERROR is specified with IGNORE, skip rows with soft errors
+ * If ON_ERROR is specified with IGNORE, skip rows with soft errors.
+ * If ON_ERROR is specified with SET_NULL, try to replace with null.
*/
else if (!InputFunctionCallSafe(&in_functions[m],
string,
@@ -1044,7 +1046,50 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
{
Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
- cstate->num_errors++;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ cstate->num_errors++;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ cstate->escontext->error_occurred = false;
+ Assert(cstate->domain_with_constraint != NULL);
+
+ /*
+ * If the column type is a domain with constraints, an
+ * additional InputFunctionCallSafe may be needed to raise
+ * errors for domain constraint violations.
+ */
+ if (!cstate->domain_with_constraint[m] ||
+ InputFunctionCallSafe(&in_functions[m],
+ NULL,
+ typioparams[m],
+ att->atttypmod,
+ (Node *) cstate->escontext,
+ &values[m]))
+ {
+ nulls[m] = true;
+ values[m] = (Datum) 0;
+ }
+ else if (string == NULL)
+ ereport(ERROR,
+ errcode(ERRCODE_NOT_NULL_VIOLATION),
+ errmsg("domain %s does not allow null values", format_type_be(typioparams[m])),
+ errdatatype(typioparams[m]));
+ else
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("invalid input value for domain %s: \"%s\"",
+ format_type_be(typioparams[m]), string));
+
+ /*
+ * We count only the number of rows (not individual fields)
+ * where ON_ERROR SET_NULL was successfully applied.
+ */
+ if (!current_row_erroneous)
+ {
+ current_row_erroneous = true;
+ cstate->num_errors++;
+ }
+ }
if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
{
@@ -1061,24 +1106,37 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
char *attval;
attval = CopyLimitPrintoutLength(cstate->cur_attval);
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
- cstate->cur_lineno,
- cstate->cur_attname,
- attval));
+
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg("setting to null due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
pfree(attval);
}
else
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
- cstate->cur_lineno,
- cstate->cur_attname));
-
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
+ cstate->cur_lineno,
+ cstate->cur_attname));
+ }
/* reset relname_only */
cstate->relname_only = false;
}
- return true;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ return true;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ continue;
}
cstate->cur_attname = NULL;
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 316a2dafbf1..7124f468ba4 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -3381,7 +3381,7 @@ match_previous_words(int pattern_id,
/* Complete COPY <sth> FROM [PROGRAM] filename WITH (ON_ERROR */
else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAnyExcept("PROGRAM"), "WITH", "(", "ON_ERROR") ||
Matches("COPY|\\copy", MatchAny, "FROM", "PROGRAM", MatchAny, "WITH", "(", "ON_ERROR"))
- COMPLETE_WITH("stop", "ignore");
+ COMPLETE_WITH("stop", "ignore", "set_null");
/* Complete COPY <sth> FROM [PROGRAM] filename WITH (LOG_VERBOSITY */
else if (Matches("COPY|\\copy", MatchAny, "FROM", MatchAnyExcept("PROGRAM"), "WITH", "(", "LOG_VERBOSITY") ||
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 541176e1980..da3622028e7 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -35,6 +35,7 @@ typedef enum CopyOnErrorChoice
{
COPY_ON_ERROR_STOP = 0, /* immediately throw errors, default */
COPY_ON_ERROR_IGNORE, /* ignore errors */
+ COPY_ON_ERROR_SET_NULL, /* set error field to null */
} CopyOnErrorChoice;
/*
diff --git a/src/include/commands/copyfrom_internal.h b/src/include/commands/copyfrom_internal.h
index c8b22af22d8..a606a38cc23 100644
--- a/src/include/commands/copyfrom_internal.h
+++ b/src/include/commands/copyfrom_internal.h
@@ -108,6 +108,13 @@ typedef struct CopyFromStateData
* att */
bool *defaults; /* if DEFAULT marker was found for
* corresponding att */
+ /*
+ * Set to true if the corresponding attribute's data type is a domain with
+ * constraints. This field is usually NULL, except when ON_ERROR is set to
+ * SET_NULL.
+ */
+ bool *domain_with_constraint;
+
bool volatile_defexprs; /* is any of defexprs volatile? */
List *range_table; /* single element list of RangeTblEntry */
List *rteperminfos; /* single element list of RTEPermissionInfo */
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f3fdce23459..16e70ccb813 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
^
+COPY x from stdin (on_error set_null, on_error set_null);
+ERROR: conflicting or redundant options
+LINE 1: COPY x from stdin (on_error set_null, on_error set_null);
+ ^
COPY x from stdin (log_verbosity default, log_verbosity verbose);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
@@ -92,6 +96,10 @@ COPY x from stdin (format BINARY, null 'x');
ERROR: cannot specify NULL in BINARY mode
COPY x from stdin (format BINARY, on_error ignore);
ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (format BINARY, on_error set_null);
+ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (on_error set_null, reject_limit 2);
+ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
@@ -124,6 +132,10 @@ COPY x to stdout (format BINARY, on_error unsupported);
ERROR: COPY ON_ERROR cannot be used with COPY TO
LINE 1: COPY x to stdout (format BINARY, on_error unsupported);
^
+COPY x to stdout (on_error set_null);
+ERROR: COPY ON_ERROR cannot be used with COPY TO
+LINE 1: COPY x to stdout (on_error set_null);
+ ^
COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
@@ -776,6 +788,46 @@ CONTEXT: COPY check_ign_err
NOTICE: skipping row due to data type incompatibility at line 8 for column "k": "a"
CONTEXT: COPY check_ign_err
NOTICE: 6 rows were skipped due to data type incompatibility
+CREATE DOMAIN d_int_not_null AS INT NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS INT CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c INT);
+\pset null NULL
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: null input
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ERROR: invalid input value for domain d_int_not_null: "ss"
+CONTEXT: COPY t_on_error_null, line 1, column a: "ss"
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ERROR: invalid input value for domain d_int_not_null: "-1"
+CONTEXT: COPY t_on_error_null, line 1, column a: "-1"
+--fail, less data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: missing data for column "c"
+CONTEXT: COPY t_on_error_null, line 1: "1,1"
+--fail, extra data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: extra data after last expected column
+CONTEXT: COPY t_on_error_null, line 1: "1,2,3,4"
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose); --ok
+NOTICE: setting to null due to data type incompatibility at line 1 for column "b": "x1"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 1 for column "c": "yx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 2 for column "b": "zx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 3 for column "c": "ea"
+CONTEXT: COPY t_on_error_null
+NOTICE: invalid values in 3 rows were replaced with null due to data type incompatibility
+SELECT * FROM t_on_error_null;
+ a | b | c
+----+------+------
+ 10 | NULL | NULL
+ 11 | NULL | 12
+ 13 | 14 | NULL
+(3 rows)
+
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -835,6 +887,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index cef45868db5..49200cf064d 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,12 +67,15 @@ COPY x from stdin (force_null (a), force_null (b));
COPY x from stdin (convert_selectively (a), convert_selectively (b));
COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (on_error set_null, on_error set_null);
COPY x from stdin (log_verbosity default, log_verbosity verbose);
-- incorrect options
COPY x from stdin (format BINARY, delimiter ',');
COPY x from stdin (format BINARY, null 'x');
COPY x from stdin (format BINARY, on_error ignore);
+COPY x from stdin (format BINARY, on_error set_null);
+COPY x from stdin (on_error set_null, reject_limit 2);
COPY x from stdin (on_error unsupported);
COPY x from stdin (format TEXT, force_quote(a));
COPY x from stdin (format TEXT, force_quote *);
@@ -87,6 +90,7 @@ COPY x from stdin (format TEXT, force_null *);
COPY x to stdout (format CSV, force_null(a));
COPY x to stdout (format CSV, force_null *);
COPY x to stdout (format BINARY, on_error unsupported);
+COPY x to stdout (on_error set_null);
COPY x from stdin (log_verbosity unsupported);
COPY x from stdin with (reject_limit 1);
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -537,6 +541,42 @@ a {2} 2
8 {8} 8
\.
+CREATE DOMAIN d_int_not_null AS INT NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS INT CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c INT);
+
+\pset null NULL
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+\N 11 13
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ss 11 14
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+-1 11 13
+\.
+
+--fail, less data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,1
+\.
+--fail, extra data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,2,3,4
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose); --ok
+10 x1 yx
+11 zx 12
+13 14 ea
+\.
+
+SELECT * FROM t_on_error_null;
+
+\pset null ''
+
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -606,6 +646,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
--
2.34.1
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-01-21 07:37 jian he <[email protected]>
parent: jian he <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: jian he @ 2026-01-21 07:37 UTC (permalink / raw)
To: Matheus Alcantara <[email protected]>; +Cc: torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed, Jan 21, 2026 at 3:55 AM Matheus Alcantara
<[email protected]> wrote:
> Hi,
>
> The patch needs a new rebase, could you please send a new version?
sure. please check the attached.
Attachments:
[text/x-patch] v21-0001-COPY-on_error-set_null.patch (22.3K, 2-v21-0001-COPY-on_error-set_null.patch)
download | inline diff:
From 30056bdb56d0d8f63333ccc4cb60808abfa91644 Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Wed, 21 Jan 2026 11:22:47 +0800
Subject: [PATCH v21 1/1] COPY (on_error set_null)
If ON_ERROR SET_NULL is specified during COPY FROM, any data type conversion
errors will result in the affected column being set to NULL. However, column's
not-null constraints are still enforced, attempting to set a NULL value in such
columns will raise a constraint violation error. This applies to column data
type is a domain with a NOT NULL constraint.
Author: Jian He <[email protected]>
Author: Kirill Reshke <[email protected]>
Reviewed-by:
Fujii Masao <[email protected]>
Jim Jones <[email protected]>
"David G. Johnston" <[email protected]>
Yugo NAGATA <[email protected]>
torikoshia <[email protected]>
Masahiko Sawada <[email protected]>
Atsushi Torikoshi <[email protected]>
discussion: https://postgr.es/m/CAKFQuwawy1e6YR4S=j+y7pXqg_Dw1WBVrgvf=BP3d1_aSfe_+Q@mail.gmail.com
commitfest entry: https://commitfest.postgresql.org/patch/4810
---
doc/src/sgml/ref/copy.sgml | 35 +++++++---
src/backend/commands/copy.c | 6 +-
src/backend/commands/copyfrom.c | 45 ++++++++++---
src/backend/commands/copyfromparse.c | 85 ++++++++++++++++++++----
src/bin/psql/tab-complete.in.c | 2 +-
src/include/commands/copy.h | 1 +
src/include/commands/copyfrom_internal.h | 7 ++
src/test/regress/expected/copy2.out | 55 +++++++++++++++
src/test/regress/sql/copy2.sql | 43 ++++++++++++
9 files changed, 244 insertions(+), 35 deletions(-)
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 53b0ea8f573..4d039b66221 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -413,22 +413,38 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
input value into its data type.
An <replaceable class="parameter">error_action</replaceable> value of
<literal>stop</literal> means fail the command, while
- <literal>ignore</literal> means discard the input row and continue with the next one.
+ <literal>ignore</literal> means discard the input row and continue with the next one,
+ and <literal>set_null</literal> means replace column containing invalid
+ input value with <literal>NULL</literal> and continue to the next field.
The default is <literal>stop</literal>.
</para>
<para>
- The <literal>ignore</literal> option is applicable only for <command>COPY FROM</command>
+ The <literal>ignore</literal> and <literal>set_null</literal>
+ options are applicable only for <command>COPY FROM</command>
when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
</para>
+
+ <para>
+ For <literal>ignore</literal> option, a <literal>NOTICE</literal> message
+ containing the ignored row count is emitted at the end of the <command>COPY FROM</command>
+ if at least one row was discarded.
+ For <literal>set_null</literal> option, a <literal>NOTICE</literal>
+ message indicating the number of rows where invalid input values were
+ replaced with null is emitted at the end of the <command>COPY FROM</command>
+ if at least one row was replaced.
+ </para>
+
<para>
- A <literal>NOTICE</literal> message containing the ignored row count is
- emitted at the end of the <command>COPY FROM</command> if at least one
- row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
- <literal>verbose</literal>, a <literal>NOTICE</literal> message
+ When <literal>LOG_VERBOSITY</literal> option is set to <literal>verbose</literal>,
+ for <literal>ignore</literal> option, a <literal>NOTICE</literal> message
containing the line of the input file and the column name whose input
- conversion has failed is emitted for each discarded row.
+ conversion has failed is emitted for each discarded row;
+ for <literal>set_null</literal> option, a <literal>NOTICE</literal>
+ message containing the line of the input file and the column name where
+ value was replaced with <literal>NULL</literal> for each input conversion
+ failure.
When it is set to <literal>silent</literal>, no message is emitted
- regarding ignored rows.
+ regarding input conversion failed rows.
</para>
</listitem>
</varlistentry>
@@ -476,7 +492,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
</para>
<para>
This is currently used in <command>COPY FROM</command> command when
- <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+ <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>
+ or <literal>set_null</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 9c51384ab92..c3eb31556b8 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -456,12 +456,14 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
parser_errposition(pstate, def->location)));
/*
- * Allow "stop", or "ignore" values.
+ * Allow "stop", "ignore", "set_null" values.
*/
if (pg_strcasecmp(sval, "stop") == 0)
return COPY_ON_ERROR_STOP;
if (pg_strcasecmp(sval, "ignore") == 0)
return COPY_ON_ERROR_IGNORE;
+ if (pg_strcasecmp(sval, "set_null") == 0)
+ return COPY_ON_ERROR_SET_NULL;
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -971,7 +973,7 @@ ProcessCopyOptions(ParseState *pstate,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("only ON_ERROR STOP is allowed in BINARY mode")));
- if (opts_out->reject_limit && !opts_out->on_error)
+ if (opts_out->reject_limit && opts_out->on_error != COPY_ON_ERROR_IGNORE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first and second %s are the names of COPY option, e.g.
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 25ee20b23db..f54168fbb70 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -50,6 +50,7 @@
#include "utils/portal.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
+#include "utils/typcache.h"
/*
* No more than this many tuples per CopyMultiInsertBuffer
@@ -1467,14 +1468,22 @@ CopyFrom(CopyFromState cstate)
/* Done, clean up */
error_context_stack = errcallback.previous;
- if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
- cstate->num_errors > 0 &&
+ if (cstate->num_errors > 0 &&
cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
- ereport(NOTICE,
- errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
- "%" PRIu64 " rows were skipped due to data type incompatibility",
- cstate->num_errors,
- cstate->num_errors));
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
+ "%" PRIu64 " rows were skipped due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg_plural("invalid values in %" PRIu64 " row was replaced with null due to data type incompatibility",
+ "invalid values in %" PRIu64 " rows were replaced with null due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ }
if (bistate != NULL)
FreeBulkInsertState(bistate);
@@ -1622,15 +1631,31 @@ BeginCopyFrom(ParseState *pstate,
cstate->escontext->error_occurred = false;
/*
- * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
- * options later
+ * Currently we only support COPY_ON_ERROR_IGNORE,
+ * COPY_ON_ERROR_SET_NULL. We'll add other options later
*/
- if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE ||
+ cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
cstate->escontext->details_wanted = false;
}
else
cstate->escontext = NULL;
+ if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ int attr_count = list_length(cstate->attnumlist);
+
+ cstate->domain_with_constraint = (bool *) palloc0(attr_count * sizeof(bool));
+ foreach_int(attno, cstate->attnumlist)
+ {
+ int i = foreach_current_index(attno);
+
+ Form_pg_attribute att = TupleDescAttr(tupDesc, attno - 1);
+
+ cstate->domain_with_constraint[i] = DomainHasConstraints(att->atttypid);
+ }
+ }
+
/* Convert FORCE_NULL name list to per-column flags, check validity */
cstate->opts.force_null_flags = (bool *) palloc0(num_phys_attrs * sizeof(bool));
if (cstate->opts.force_null_all)
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 5868a7fa11f..1d8a9dbb792 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -956,6 +956,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
int fldct;
int fieldno;
char *string;
+ bool current_row_erroneous = false;
tupDesc = RelationGetDescr(cstate->rel);
attr_count = list_length(cstate->attnumlist);
@@ -1033,7 +1034,8 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
}
/*
- * If ON_ERROR is specified with IGNORE, skip rows with soft errors
+ * If ON_ERROR is specified with IGNORE, skip rows with soft errors.
+ * If ON_ERROR is specified with SET_NULL, try to replace with null.
*/
else if (!InputFunctionCallSafe(&in_functions[m],
string,
@@ -1044,7 +1046,51 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
{
Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
- cstate->num_errors++;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ cstate->num_errors++;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ cstate->escontext->error_occurred = false;
+
+ Assert(cstate->domain_with_constraint != NULL);
+
+ /*
+ * If the column type is a constrained domain, an additional
+ * InputFunctionCallSafe may be needed to raise error for
+ * domain constraint violation.
+ */
+ if (!cstate->domain_with_constraint[m] ||
+ InputFunctionCallSafe(&in_functions[m],
+ NULL,
+ typioparams[m],
+ att->atttypmod,
+ (Node *) cstate->escontext,
+ &values[m]))
+ {
+ nulls[m] = true;
+ values[m] = (Datum) 0;
+ }
+ else if (string == NULL)
+ ereport(ERROR,
+ errcode(ERRCODE_NOT_NULL_VIOLATION),
+ errmsg("domain %s does not allow null values", format_type_be(typioparams[m])),
+ errdatatype(typioparams[m]));
+ else
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("invalid input value for domain %s: \"%s\"",
+ format_type_be(typioparams[m]), string));
+
+ /*
+ * We count only the number of rows (not fields) where
+ * ON_ERROR SET_NULL was applied.
+ */
+ if (!current_row_erroneous)
+ {
+ current_row_erroneous = true;
+ cstate->num_errors++;
+ }
+ }
if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
{
@@ -1061,24 +1107,37 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
char *attval;
attval = CopyLimitPrintoutLength(cstate->cur_attval);
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
- cstate->cur_lineno,
- cstate->cur_attname,
- attval));
+
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg("setting to null due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
pfree(attval);
}
else
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
- cstate->cur_lineno,
- cstate->cur_attname));
-
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
+ cstate->cur_lineno,
+ cstate->cur_attname));
+ }
/* reset relname_only */
cstate->relname_only = false;
}
- return true;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ return true;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ continue;
}
cstate->cur_attname = NULL;
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 8b91bc00062..e2c72bd3059 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -3425,7 +3425,7 @@ match_previous_words(int pattern_id,
/* Complete COPY <sth> FROM filename WITH (ON_ERROR */
else if (TailMatches("ON_ERROR"))
- COMPLETE_WITH("stop", "ignore");
+ COMPLETE_WITH("stop", "ignore", "set_null");
/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
else if (TailMatches("LOG_VERBOSITY"))
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 43c2580539f..877202af67b 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -35,6 +35,7 @@ typedef enum CopyOnErrorChoice
{
COPY_ON_ERROR_STOP = 0, /* immediately throw errors, default */
COPY_ON_ERROR_IGNORE, /* ignore errors */
+ COPY_ON_ERROR_SET_NULL, /* set error field to null */
} CopyOnErrorChoice;
/*
diff --git a/src/include/commands/copyfrom_internal.h b/src/include/commands/copyfrom_internal.h
index 822ef33cf69..f892c343157 100644
--- a/src/include/commands/copyfrom_internal.h
+++ b/src/include/commands/copyfrom_internal.h
@@ -108,6 +108,13 @@ typedef struct CopyFromStateData
* att */
bool *defaults; /* if DEFAULT marker was found for
* corresponding att */
+
+ /*
+ * True if the corresponding attribute's is a constrained domain. This
+ * will be populated only when ON_ERROR is SET_NULL, otherwise NULL.
+ */
+ bool *domain_with_constraint;
+
bool volatile_defexprs; /* is any of defexprs volatile? */
List *range_table; /* single element list of RangeTblEntry */
List *rteperminfos; /* single element list of RTEPermissionInfo */
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index f3fdce23459..2e1637f2134 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
^
+COPY x from stdin (on_error set_null, on_error set_null);
+ERROR: conflicting or redundant options
+LINE 1: COPY x from stdin (on_error set_null, on_error set_null);
+ ^
COPY x from stdin (log_verbosity default, log_verbosity verbose);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
@@ -92,6 +96,10 @@ COPY x from stdin (format BINARY, null 'x');
ERROR: cannot specify NULL in BINARY mode
COPY x from stdin (format BINARY, on_error ignore);
ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (format BINARY, on_error set_null);
+ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (on_error set_null, reject_limit 2);
+ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
@@ -124,6 +132,10 @@ COPY x to stdout (format BINARY, on_error unsupported);
ERROR: COPY ON_ERROR cannot be used with COPY TO
LINE 1: COPY x to stdout (format BINARY, on_error unsupported);
^
+COPY x to stdout (on_error set_null);
+ERROR: COPY ON_ERROR cannot be used with COPY TO
+LINE 1: COPY x to stdout (on_error set_null);
+ ^
COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
@@ -776,6 +788,46 @@ CONTEXT: COPY check_ign_err
NOTICE: skipping row due to data type incompatibility at line 8 for column "k": "a"
CONTEXT: COPY check_ign_err
NOTICE: 6 rows were skipped due to data type incompatibility
+CREATE DOMAIN d_int_not_null AS integer NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS integer CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c integer);
+\pset null NULL
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: null input
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ERROR: invalid input value for domain d_int_not_null: "ss"
+CONTEXT: COPY t_on_error_null, line 1, column a: "ss"
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ERROR: invalid input value for domain d_int_not_null: "-1"
+CONTEXT: COPY t_on_error_null, line 1, column a: "-1"
+--fail, less data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: missing data for column "c"
+CONTEXT: COPY t_on_error_null, line 1: "1,1"
+--fail, extra data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: extra data after last expected column
+CONTEXT: COPY t_on_error_null, line 1: "1,2,3,4"
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose); --ok
+NOTICE: setting to null due to data type incompatibility at line 1 for column "b": "x1"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 1 for column "c": "yx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 2 for column "b": "zx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 3 for column "c": "ea"
+CONTEXT: COPY t_on_error_null
+NOTICE: invalid values in 3 rows were replaced with null due to data type incompatibility
+SELECT * FROM t_on_error_null ORDER BY a;
+ a | b | c
+----+------+------
+ 10 | NULL | NULL
+ 11 | NULL | 12
+ 13 | 14 | NULL
+(3 rows)
+
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -835,6 +887,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index cef45868db5..132f67d6c11 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,12 +67,15 @@ COPY x from stdin (force_null (a), force_null (b));
COPY x from stdin (convert_selectively (a), convert_selectively (b));
COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (on_error set_null, on_error set_null);
COPY x from stdin (log_verbosity default, log_verbosity verbose);
-- incorrect options
COPY x from stdin (format BINARY, delimiter ',');
COPY x from stdin (format BINARY, null 'x');
COPY x from stdin (format BINARY, on_error ignore);
+COPY x from stdin (format BINARY, on_error set_null);
+COPY x from stdin (on_error set_null, reject_limit 2);
COPY x from stdin (on_error unsupported);
COPY x from stdin (format TEXT, force_quote(a));
COPY x from stdin (format TEXT, force_quote *);
@@ -87,6 +90,7 @@ COPY x from stdin (format TEXT, force_null *);
COPY x to stdout (format CSV, force_null(a));
COPY x to stdout (format CSV, force_null *);
COPY x to stdout (format BINARY, on_error unsupported);
+COPY x to stdout (on_error set_null);
COPY x from stdin (log_verbosity unsupported);
COPY x from stdin with (reject_limit 1);
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -537,6 +541,42 @@ a {2} 2
8 {8} 8
\.
+CREATE DOMAIN d_int_not_null AS integer NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS integer CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c integer);
+
+\pset null NULL
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+\N 11 13
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ss 11 14
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+-1 11 13
+\.
+
+--fail, less data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,1
+\.
+--fail, extra data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,2,3,4
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose); --ok
+10 x1 yx
+11 zx 12
+13 14 ea
+\.
+
+SELECT * FROM t_on_error_null ORDER BY a;
+
+\pset null ''
+
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -606,6 +646,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
--
2.34.1
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-01-21 18:47 Matheus Alcantara <[email protected]>
parent: jian he <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: Matheus Alcantara @ 2026-01-21 18:47 UTC (permalink / raw)
To: jian he <[email protected]>; +Cc: torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed Jan 21, 2026 at 4:37 AM -03, jian he wrote:
> On Wed, Jan 21, 2026 at 3:55 AM Matheus Alcantara
> <[email protected]> wrote:
>> Hi,
>>
>> The patch needs a new rebase, could you please send a new version?
>
> sure. please check the attached.
Thanks for the new version. I have some comments on this first round of
review:
+ errmsg_plural("invalid values in %" PRIu64 " row was replaced with null due to data type incompatibility",
+ "invalid values in %" PRIu64 " rows were replaced with null due to data type incompatibility",
I think that we could remove the "invalid values in" to make it
consistency with the COPY_ON_ERROR_IGNORE NOTICE
----------
+ cstate->domain_with_constraint = (bool *) palloc0(attr_count * sizeof(bool));
I think that we can use palloc_array?
----------
Should FORCE_NOT_NULL be allowed to be used with ON_ERROR set_null? It
seems to me that ON_ERROR set_null overwrite the FORCE_NOT_NULL
behaviour:
postgres=# create table t4(a int, b varchar(5));
CREATE TABLE
postgres=# copy t4 from 'data.csv' with (FORCE_NOT_NULL(b), format csv, delimiter ',', NULL 'NULL', ON_ERROR set_null);
NOTICE: invalid values in 2 rows were replaced with null due to data type incompatibility
COPY 5
postgres=# \pset null 'NULL'
Null display is "NULL".
postgres=# select * from t4;
a | b
---+------
1 | aaaa
2 | bbbb
2 | NULL
2 | NULL
5 | NULL
(5 rows)
Note that only the ccccc rows on .csv file was inserted with a NULL
value on b column. The 5,NULL row was inserted with a "NULL" string as a
value:
postgres=# select * from t4 where b is null;
a | b
---+------
2 | NULL
2 | NULL
(2 rows)
The contents of data.csv:
1,aaaa
2,bbbb
2,ccccc
2,ccccc
5,NULL
Perhaps we should block the usage of FORCE_NOT_NULL with ON_ERROR
SET_NULL?
----------
On monitoring.sgml we have the following for pg_stat_progress_copy
tuples_skipped:
Number of tuples skipped because they contain malformed data.
This counter only advances when a value other than
<literal>stop</literal> is specified to the <literal>ON_ERROR</literal>
IIUC we are not updating this view if we set a column to NULL due to an
error, perhaps this documentation should be updated to mention that it
will not be updated with ON_ERROR set_null?
----------
I may have missing something, but we are still considering implementing
the REJECT_LIMIT + ON_ERROR set_null?
--
Matheus Alcantara
EDB: https://www.enterprisedb.com
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-01-22 14:45 jian he <[email protected]>
parent: Matheus Alcantara <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: jian he @ 2026-01-22 14:45 UTC (permalink / raw)
To: Matheus Alcantara <[email protected]>; +Cc: torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Thu, Jan 22, 2026 at 2:47 AM Matheus Alcantara
<[email protected]> wrote:
>
> Thanks for the new version. I have some comments on this first round of
> review:
>
> + errmsg_plural("invalid values in %" PRIu64 " row was replaced with null due to data type incompatibility",
> + "invalid values in %" PRIu64 " rows were replaced with null due to data type incompatibility",
>
> I think that we could remove the "invalid values in" to make it
> consistency with the COPY_ON_ERROR_IGNORE NOTICE
>
sure.
> ----------
>
> + cstate->domain_with_constraint = (bool *) palloc0(attr_count * sizeof(bool));
>
> I think that we can use palloc_array?
>
sure.
> ----------
>
> Should FORCE_NOT_NULL be allowed to be used with ON_ERROR set_null? It
> seems to me that ON_ERROR set_null overwrite the FORCE_NOT_NULL
> behaviour:
>
> postgres=# create table t4(a int, b varchar(5));
> CREATE TABLE
>
> postgres=# copy t4 from 'data.csv' with (FORCE_NOT_NULL(b), format csv, delimiter ',', NULL 'NULL', ON_ERROR set_null);
> NOTICE: invalid values in 2 rows were replaced with null due to data type incompatibility
> COPY 5
>
> postgres=# \pset null 'NULL'
> Null display is "NULL".
> postgres=# select * from t4;
> a | b
> ---+------
> 1 | aaaa
> 2 | bbbb
> 2 | NULL
> 2 | NULL
> 5 | NULL
> (5 rows)
>
> Note that only the ccccc rows on .csv file was inserted with a NULL
> value on b column. The 5,NULL row was inserted with a "NULL" string as a
> value:
>
> postgres=# select * from t4 where b is null;
> a | b
> ---+------
> 2 | NULL
> 2 | NULL
> (2 rows)
>
> The contents of data.csv:
> 1,aaaa
> 2,bbbb
> 2,ccccc
> 2,ccccc
> 5,NULL
>
> Perhaps we should block the usage of FORCE_NOT_NULL with ON_ERROR
> SET_NULL?
>
FORCE_NOT_NULL is related to how we handle NULL string in column value.
We first process cstate->opts.force_notnull_flags, cstate->opts.force_null_flags
then InputFunctionCallSafe.
see copyfromparse.c, CopyFromTextLikeOneRow ``if (is_csv)``loop.
I think these two are unrelated things, FORCE_NOT_NULL should be fine with
ON_ERROR SET_NULL.
you can see related tests in
https://git.postgresql.org/cgit/postgresql.git/tree/src/test/regress/sql/copy2.sql#n330
Am I missing something?
>
> On monitoring.sgml we have the following for pg_stat_progress_copy
> tuples_skipped:
> Number of tuples skipped because they contain malformed data.
> This counter only advances when a value other than
> <literal>stop</literal> is specified to the <literal>ON_ERROR</literal>
>
> IIUC we are not updating this view if we set a column to NULL due to an
> error, perhaps this documentation should be updated to mention that it
> will not be updated with ON_ERROR set_null?
>
IMHO, we don't need to mention ON_ERROR set_null, since we do not support it.
change to the following should be ok, i think.
<para>
Number of tuples skipped because they contain malformed data.
This counter only advances when
<literal>ignore</literal> is specified to the <literal>ON_ERROR</literal>
option.
</para></entry>
>
> I may have missing something, but we are still considering implementing
> the REJECT_LIMIT + ON_ERROR set_null?
Possibly as a separate patch later.
--
jian
https://www.enterprisedb.com/
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-01-27 18:37 Matheus Alcantara <[email protected]>
parent: jian he <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: Matheus Alcantara @ 2026-01-27 18:37 UTC (permalink / raw)
To: jian he <[email protected]>; +Cc: torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On 22/01/26 11:45, jian he wrote:
>> Should FORCE_NOT_NULL be allowed to be used with ON_ERROR set_null? It
>> seems to me that ON_ERROR set_null overwrite the FORCE_NOT_NULL
>> behaviour:
>>
>> postgres=# create table t4(a int, b varchar(5));
>> CREATE TABLE
>>
>> postgres=# copy t4 from 'data.csv' with (FORCE_NOT_NULL(b), format csv, delimiter ',', NULL 'NULL', ON_ERROR set_null);
>> NOTICE: invalid values in 2 rows were replaced with null due to data type incompatibility
>> COPY 5
>>
>> postgres=# \pset null 'NULL'
>> Null display is "NULL".
>> postgres=# select * from t4;
>> a | b
>> ---+------
>> 1 | aaaa
>> 2 | bbbb
>> 2 | NULL
>> 2 | NULL
>> 5 | NULL
>> (5 rows)
>>
>> Note that only the ccccc rows on .csv file was inserted with a NULL
>> value on b column. The 5,NULL row was inserted with a "NULL" string as a
>> value:
>>
>> postgres=# select * from t4 where b is null;
>> a | b
>> ---+------
>> 2 | NULL
>> 2 | NULL
>> (2 rows)
>>
>> The contents of data.csv:
>> 1,aaaa
>> 2,bbbb
>> 2,ccccc
>> 2,ccccc
>> 5,NULL
>>
>> Perhaps we should block the usage of FORCE_NOT_NULL with ON_ERROR
>> SET_NULL?
>>
> FORCE_NOT_NULL is related to how we handle NULL string in column value.
>
> We first process cstate->opts.force_notnull_flags, cstate->opts.force_null_flags
> then InputFunctionCallSafe.
> see copyfromparse.c, CopyFromTextLikeOneRow ``if (is_csv)``loop.
>
> I think these two are unrelated things, FORCE_NOT_NULL should be fine with
> ON_ERROR SET_NULL.
> you can see related tests in
> https://git.postgresql.org/cgit/postgresql.git/tree/src/test/regress/sql/copy2.sql#n330
>
> Am I missing something?
Yeah, after some more thinking it seems ok to use both options
together. I just found a bit strange when using integer columns.
Consider this example:
cat data.csv
1,11
2,22
3,
4,44
postgres=# create table t(a int not null, b int);
CREATE TABLE
postgres=# copy t from
'/Users/matheus/dev/pgdev/copy-on-error-set-null/data.csv' with
(FORCE_NOT_NULL(b), format csv, delimiter ',', ON_ERROR set_null);
NOTICE: 1 row was replaced with null due to data type incompatibility
COPY 4
postgres=# select * from t where b is null;
a | b
---+---
3 |
(1 row)
We are requiring a not null value on column b but we are still
generating rows with null values on b.
The reasoning on this is that the row 3 would generate a "invalid
input syntax for type integer" error and the ON_ERROR set_null fix
this by inserting a NULL value. It make sense I think but I'm
wondering if it could cause any confusion?
>> On monitoring.sgml we have the following for pg_stat_progress_copy
>> tuples_skipped:
>> Number of tuples skipped because they contain malformed data.
>> This counter only advances when a value other than
>> <literal>stop</literal> is specified to the <literal>ON_ERROR</literal>
>>
>> IIUC we are not updating this view if we set a column to NULL due to an
>> error, perhaps this documentation should be updated to mention that it
>> will not be updated with ON_ERROR set_null?
>>
>
> IMHO, we don't need to mention ON_ERROR set_null, since we do not support it.
> change to the following should be ok, i think.
>
> <para>
> Number of tuples skipped because they contain malformed data.
> This counter only advances when
> <literal>ignore</literal> is specified to the <literal>ON_ERROR</literal>
> option.
> </para></entry>
It looks good, I was thinking in something like this.
>>
>> I may have missing something, but we are still considering implementing
>> the REJECT_LIMIT + ON_ERROR set_null?
> Possibly as a separate patch later.
Ok, good, thanks.
--
Matheus Alcantara
EDB: https://www.enterprisedb.com
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-02-06 03:40 jian he <[email protected]>
parent: Matheus Alcantara <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: jian he @ 2026-02-06 03:40 UTC (permalink / raw)
To: Matheus Alcantara <[email protected]>; +Cc: torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed, Jan 28, 2026 at 2:37 AM Matheus Alcantara
<[email protected]> wrote:
>
> Yeah, after some more thinking it seems ok to use both options
> together. I just found a bit strange when using integer columns.
> Consider this example:
>
> cat data.csv
> 1,11
> 2,22
> 3,
> 4,44
>
> postgres=# create table t(a int not null, b int);
> CREATE TABLE
>
> postgres=# copy t from
> '/Users/matheus/dev/pgdev/copy-on-error-set-null/data.csv' with
> (FORCE_NOT_NULL(b), format csv, delimiter ',', ON_ERROR set_null);
> NOTICE: 1 row was replaced with null due to data type incompatibility
> COPY 4
>
> postgres=# select * from t where b is null;
> a | b
> ---+---
> 3 |
> (1 row)
>
> We are requiring a not null value on column b but we are still
> generating rows with null values on b.
>
> The reasoning on this is that the row 3 would generate a "invalid
> input syntax for type integer" error and the ON_ERROR set_null fix
> this by inserting a NULL value. It make sense I think but I'm
> wondering if it could cause any confusion?
>
After careful reading the FORCE_NOT_NULL, FORCE_NULL option.
It's about dealing with empty value and NULL strings.
copy t from stdin with(FORCE_NOT_NULL (b), format csv, delimiter ',');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,
>> \.
ERROR: invalid input syntax for type integer: ""
CONTEXT: COPY t, line 1, column b: ""
in this case, FORCE_NOT_NULL will convert the empty value to null
string (empty double quote)
another FORCE_NULL.
copy t from stdin with(FORCE_NULL (b), format csv, delimiter ',');
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> 1,""
>> \.
COPY 1
src9=# table t;
a | b
---+---
1 |
(1 row)
In this case, FORCE_NULL will convert null string (empty double quote) to NULL.
ON_ERROR explanation, the first sentence:
"""
Specifies how to behave when encountering an error converting a
column's input value into its data type.
"""
FORCE_NULL, FORCE_NOT_NULL is a special handling of input value, ON_ERROR is
about converting the input value to data type, so it's before ON_ERROR.
Overall the current doc is fine, IMHO.
The attached patch has addressed your other points.
--
jian
https://www.enterprisedb.com/
Attachments:
[text/x-patch] v22-0001-COPY-on_error-set_null.patch (22.9K, 2-v22-0001-COPY-on_error-set_null.patch)
download | inline diff:
From b479c57753b51a868d5d19080a070c7aed9026de Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Thu, 5 Feb 2026 16:09:59 +0800
Subject: [PATCH v22 1/1] COPY (on_error set_null)
If ON_ERROR SET_NULL is specified during COPY FROM, any data type conversion
errors will result in the affected column being set to NULL. However, column's
not-null constraints are still enforced, attempting to set a NULL value in such
columns will raise a constraint violation error. This applies to column data
type is a domain with a NOT NULL constraint.
Author: Jian He <[email protected]>
Author: Kirill Reshke <[email protected]>
Reviewed-by:
Fujii Masao <[email protected]>
Jim Jones <[email protected]>
"David G. Johnston" <[email protected]>
Yugo NAGATA <[email protected]>
torikoshia <[email protected]>
Masahiko Sawada <[email protected]>
Atsushi Torikoshi <[email protected]>
Matheus Alcantara <[email protected]>
discussion: https://postgr.es/m/CAKFQuwawy1e6YR4S=j+y7pXqg_Dw1WBVrgvf=BP3d1_aSfe_+Q@mail.gmail.com
commitfest entry: https://commitfest.postgresql.org/patch/4810
---
doc/src/sgml/monitoring.sgml | 4 +-
doc/src/sgml/ref/copy.sgml | 35 +++++++---
src/backend/commands/copy.c | 6 +-
src/backend/commands/copyfrom.c | 46 ++++++++++---
src/backend/commands/copyfromparse.c | 85 ++++++++++++++++++++----
src/bin/psql/tab-complete.in.c | 2 +-
src/include/commands/copy.h | 1 +
src/include/commands/copyfrom_internal.h | 7 ++
src/test/regress/expected/copy2.out | 55 +++++++++++++++
src/test/regress/sql/copy2.sql | 43 ++++++++++++
10 files changed, 247 insertions(+), 37 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b77d189a500..dcf6e6a2f48 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6266,8 +6266,8 @@ FROM pg_stat_get_backend_idset() AS backendid;
</para>
<para>
Number of tuples skipped because they contain malformed data.
- This counter only advances when a value other than
- <literal>stop</literal> is specified to the <literal>ON_ERROR</literal>
+ This counter only advances when
+ <literal>ignore</literal> is specified to the <literal>ON_ERROR</literal>
option.
</para></entry>
</row>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index afea9caf5ad..98c0e2924ea 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -432,22 +432,38 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
input value into its data type.
An <replaceable class="parameter">error_action</replaceable> value of
<literal>stop</literal> means fail the command, while
- <literal>ignore</literal> means discard the input row and continue with the next one.
+ <literal>ignore</literal> means discard the input row and continue with the next one,
+ and <literal>set_null</literal> means replace field containing invalid
+ input value with <literal>NULL</literal> and continue to the next field.
The default is <literal>stop</literal>.
</para>
<para>
- The <literal>ignore</literal> option is applicable only for <command>COPY FROM</command>
+ The <literal>ignore</literal> and <literal>set_null</literal>
+ options are applicable only for <command>COPY FROM</command>
when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
</para>
+
+ <para>
+ For <literal>ignore</literal> option, a <literal>NOTICE</literal> message
+ containing the ignored row count is emitted at the end of the <command>COPY FROM</command>
+ if at least one row was discarded.
+ For <literal>set_null</literal> option, a <literal>NOTICE</literal>
+ message indicating the number of rows where invalid input values were
+ replaced with null is emitted at the end of the <command>COPY FROM</command>
+ if at least one row was replaced.
+ </para>
+
<para>
- A <literal>NOTICE</literal> message containing the ignored row count is
- emitted at the end of the <command>COPY FROM</command> if at least one
- row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
- <literal>verbose</literal>, a <literal>NOTICE</literal> message
+ When <literal>LOG_VERBOSITY</literal> option is set to <literal>verbose</literal>,
+ for <literal>ignore</literal> option, a <literal>NOTICE</literal> message
containing the line of the input file and the column name whose input
- conversion has failed is emitted for each discarded row.
+ conversion has failed is emitted for each discarded row;
+ for <literal>set_null</literal> option, a <literal>NOTICE</literal>
+ message containing the line of the input file and the column name where
+ value was replaced with <literal>NULL</literal> for each input conversion
+ failure.
When it is set to <literal>silent</literal>, no message is emitted
- regarding ignored rows.
+ regarding input conversion failed rows.
</para>
</listitem>
</varlistentry>
@@ -495,7 +511,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
</para>
<para>
This is currently used in <command>COPY FROM</command> command when
- <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+ <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>
+ or <literal>set_null</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 5d9b08b822c..6263767a903 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -469,12 +469,14 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
parser_errposition(pstate, def->location)));
/*
- * Allow "stop", or "ignore" values.
+ * Allow "stop", "ignore", "set_null" values.
*/
if (pg_strcasecmp(sval, "stop") == 0)
return COPY_ON_ERROR_STOP;
if (pg_strcasecmp(sval, "ignore") == 0)
return COPY_ON_ERROR_IGNORE;
+ if (pg_strcasecmp(sval, "set_null") == 0)
+ return COPY_ON_ERROR_SET_NULL;
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -1035,7 +1037,7 @@ ProcessCopyOptions(ParseState *pstate,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("only ON_ERROR STOP is allowed in BINARY mode")));
- if (opts_out->reject_limit && !opts_out->on_error)
+ if (opts_out->reject_limit && opts_out->on_error != COPY_ON_ERROR_IGNORE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first and second %s are the names of COPY option, e.g.
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 0c73b0259c3..cc4a2dab348 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -50,6 +50,7 @@
#include "utils/portal.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
+#include "utils/typcache.h"
/*
* No more than this many tuples per CopyMultiInsertBuffer
@@ -1467,14 +1468,22 @@ CopyFrom(CopyFromState cstate)
/* Done, clean up */
error_context_stack = errcallback.previous;
- if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
- cstate->num_errors > 0 &&
+ if (cstate->num_errors > 0 &&
cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
- ereport(NOTICE,
- errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
- "%" PRIu64 " rows were skipped due to data type incompatibility",
- cstate->num_errors,
- cstate->num_errors));
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
+ "%" PRIu64 " rows were skipped due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg_plural("%" PRIu64 " row was replaced with null due to data type incompatibility",
+ "%" PRIu64 " rows were replaced with null due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ }
if (bistate != NULL)
FreeBulkInsertState(bistate);
@@ -1622,15 +1631,32 @@ BeginCopyFrom(ParseState *pstate,
cstate->escontext->error_occurred = false;
/*
- * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
- * options later
+ * Currently we only support COPY_ON_ERROR_IGNORE,
+ * COPY_ON_ERROR_SET_NULL. We'll add other options later
*/
- if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE ||
+ cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
cstate->escontext->details_wanted = false;
}
else
cstate->escontext = NULL;
+ if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ int attr_count = list_length(cstate->attnumlist);
+
+ cstate->domain_with_constraint = palloc0_array(bool, attr_count);
+
+ foreach_int(attno, cstate->attnumlist)
+ {
+ int i = foreach_current_index(attno);
+
+ Form_pg_attribute att = TupleDescAttr(tupDesc, attno - 1);
+
+ cstate->domain_with_constraint[i] = DomainHasConstraints(att->atttypid);
+ }
+ }
+
/* Convert FORCE_NULL name list to per-column flags, check validity */
cstate->opts.force_null_flags = (bool *) palloc0(num_phys_attrs * sizeof(bool));
if (cstate->opts.force_null_all)
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 0d9199f5ce8..e12decaa81a 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -959,6 +959,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
int fldct;
int fieldno;
char *string;
+ bool current_row_erroneous = false;
tupDesc = RelationGetDescr(cstate->rel);
attr_count = list_length(cstate->attnumlist);
@@ -1036,7 +1037,8 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
}
/*
- * If ON_ERROR is specified with IGNORE, skip rows with soft errors
+ * If ON_ERROR is specified with IGNORE, skip rows with soft errors.
+ * If ON_ERROR is specified with SET_NULL, try to replace with null.
*/
else if (!InputFunctionCallSafe(&in_functions[m],
string,
@@ -1047,7 +1049,51 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
{
Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
- cstate->num_errors++;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ cstate->num_errors++;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ cstate->escontext->error_occurred = false;
+
+ Assert(cstate->domain_with_constraint != NULL);
+
+ /*
+ * If the column type is a constrained domain, an additional
+ * InputFunctionCallSafe may be needed to raise error for
+ * domain constraint violation.
+ */
+ if (!cstate->domain_with_constraint[m] ||
+ InputFunctionCallSafe(&in_functions[m],
+ NULL,
+ typioparams[m],
+ att->atttypmod,
+ (Node *) cstate->escontext,
+ &values[m]))
+ {
+ nulls[m] = true;
+ values[m] = (Datum) 0;
+ }
+ else if (string == NULL)
+ ereport(ERROR,
+ errcode(ERRCODE_NOT_NULL_VIOLATION),
+ errmsg("domain %s does not allow null values", format_type_be(typioparams[m])),
+ errdatatype(typioparams[m]));
+ else
+ ereport(ERROR,
+ errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+ errmsg("invalid input value for domain %s: \"%s\"",
+ format_type_be(typioparams[m]), string));
+
+ /*
+ * We count only the number of rows (not fields) where
+ * ON_ERROR SET_NULL was applied.
+ */
+ if (!current_row_erroneous)
+ {
+ current_row_erroneous = true;
+ cstate->num_errors++;
+ }
+ }
if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
{
@@ -1064,24 +1110,37 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
char *attval;
attval = CopyLimitPrintoutLength(cstate->cur_attval);
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
- cstate->cur_lineno,
- cstate->cur_attname,
- attval));
+
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg("setting to null due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
pfree(attval);
}
else
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
- cstate->cur_lineno,
- cstate->cur_attname));
-
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
+ cstate->cur_lineno,
+ cstate->cur_attname));
+ }
/* reset relname_only */
cstate->relname_only = false;
}
- return true;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ return true;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ continue;
}
cstate->cur_attname = NULL;
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index a610ef603f4..6810a6d04e6 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -3425,7 +3425,7 @@ match_previous_words(int pattern_id,
/* Complete COPY <sth> FROM filename WITH (ON_ERROR */
else if (TailMatches("ON_ERROR"))
- COMPLETE_WITH("stop", "ignore");
+ COMPLETE_WITH("stop", "ignore", "set_null");
/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
else if (TailMatches("LOG_VERBOSITY"))
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index b2a68fa6407..abecfe51098 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -35,6 +35,7 @@ typedef enum CopyOnErrorChoice
{
COPY_ON_ERROR_STOP = 0, /* immediately throw errors, default */
COPY_ON_ERROR_IGNORE, /* ignore errors */
+ COPY_ON_ERROR_SET_NULL, /* set error field to null */
} CopyOnErrorChoice;
/*
diff --git a/src/include/commands/copyfrom_internal.h b/src/include/commands/copyfrom_internal.h
index 822ef33cf69..f892c343157 100644
--- a/src/include/commands/copyfrom_internal.h
+++ b/src/include/commands/copyfrom_internal.h
@@ -108,6 +108,13 @@ typedef struct CopyFromStateData
* att */
bool *defaults; /* if DEFAULT marker was found for
* corresponding att */
+
+ /*
+ * True if the corresponding attribute's is a constrained domain. This
+ * will be populated only when ON_ERROR is SET_NULL, otherwise NULL.
+ */
+ bool *domain_with_constraint;
+
bool volatile_defexprs; /* is any of defexprs volatile? */
List *range_table; /* single element list of RangeTblEntry */
List *rteperminfos; /* single element list of RTEPermissionInfo */
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 3145b314e48..72034796aca 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
^
+COPY x from stdin (on_error set_null, on_error set_null);
+ERROR: conflicting or redundant options
+LINE 1: COPY x from stdin (on_error set_null, on_error set_null);
+ ^
COPY x from stdin (log_verbosity default, log_verbosity verbose);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
@@ -92,6 +96,10 @@ COPY x from stdin (format BINARY, null 'x');
ERROR: cannot specify NULL in BINARY mode
COPY x from stdin (format BINARY, on_error ignore);
ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (format BINARY, on_error set_null);
+ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (on_error set_null, reject_limit 2);
+ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
@@ -124,6 +132,10 @@ COPY x to stdout (format BINARY, on_error unsupported);
ERROR: COPY ON_ERROR cannot be used with COPY TO
LINE 1: COPY x to stdout (format BINARY, on_error unsupported);
^
+COPY x to stdout (on_error set_null);
+ERROR: COPY ON_ERROR cannot be used with COPY TO
+LINE 1: COPY x to stdout (on_error set_null);
+ ^
COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
@@ -782,6 +794,46 @@ CONTEXT: COPY check_ign_err
NOTICE: skipping row due to data type incompatibility at line 8 for column "k": "a"
CONTEXT: COPY check_ign_err
NOTICE: 6 rows were skipped due to data type incompatibility
+CREATE DOMAIN d_int_not_null AS integer NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS integer CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c integer);
+\pset null NULL
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ERROR: domain d_int_not_null does not allow null values
+CONTEXT: COPY t_on_error_null, line 1, column a: null input
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ERROR: invalid input value for domain d_int_not_null: "ss"
+CONTEXT: COPY t_on_error_null, line 1, column a: "ss"
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ERROR: invalid input value for domain d_int_not_null: "-1"
+CONTEXT: COPY t_on_error_null, line 1, column a: "-1"
+--fail, less data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: missing data for column "c"
+CONTEXT: COPY t_on_error_null, line 1: "1,1"
+--fail, extra data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: extra data after last expected column
+CONTEXT: COPY t_on_error_null, line 1: "1,2,3,4"
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose); --ok
+NOTICE: setting to null due to data type incompatibility at line 1 for column "b": "x1"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 1 for column "c": "yx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 2 for column "b": "zx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 3 for column "c": "ea"
+CONTEXT: COPY t_on_error_null
+NOTICE: 3 rows were replaced with null due to data type incompatibility
+SELECT * FROM t_on_error_null ORDER BY a;
+ a | b | c
+----+------+------
+ 10 | NULL | NULL
+ 11 | NULL | 12
+ 13 | 14 | NULL
+(3 rows)
+
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -841,6 +893,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 66435167500..5e9f6af2c5c 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,12 +67,15 @@ COPY x from stdin (force_null (a), force_null (b));
COPY x from stdin (convert_selectively (a), convert_selectively (b));
COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (on_error set_null, on_error set_null);
COPY x from stdin (log_verbosity default, log_verbosity verbose);
-- incorrect options
COPY x from stdin (format BINARY, delimiter ',');
COPY x from stdin (format BINARY, null 'x');
COPY x from stdin (format BINARY, on_error ignore);
+COPY x from stdin (format BINARY, on_error set_null);
+COPY x from stdin (on_error set_null, reject_limit 2);
COPY x from stdin (on_error unsupported);
COPY x from stdin (format TEXT, force_quote(a));
COPY x from stdin (format TEXT, force_quote *);
@@ -87,6 +90,7 @@ COPY x from stdin (format TEXT, force_null *);
COPY x to stdout (format CSV, force_null(a));
COPY x to stdout (format CSV, force_null *);
COPY x to stdout (format BINARY, on_error unsupported);
+COPY x to stdout (on_error set_null);
COPY x from stdin (log_verbosity unsupported);
COPY x from stdin with (reject_limit 1);
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -540,6 +544,42 @@ a {2} 2
8 {8} 8
\.
+CREATE DOMAIN d_int_not_null AS integer NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS integer CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c integer);
+
+\pset null NULL
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+\N 11 13
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ss 11 14
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+-1 11 13
+\.
+
+--fail, less data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,1
+\.
+--fail, extra data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,2,3,4
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose); --ok
+10 x1 yx
+11 zx 12
+13 14 ea
+\.
+
+SELECT * FROM t_on_error_null ORDER BY a;
+
+\pset null ''
+
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -609,6 +649,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
--
2.34.1
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-02-06 12:58 Matheus Alcantara <[email protected]>
parent: jian he <[email protected]>
0 siblings, 2 replies; 31+ messages in thread
From: Matheus Alcantara @ 2026-02-06 12:58 UTC (permalink / raw)
To: jian he <[email protected]>; Matheus Alcantara <[email protected]>; +Cc: torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri Feb 6, 2026 at 12:40 AM -03, jian he wrote:
>> Yeah, after some more thinking it seems ok to use both options
>> together. I just found a bit strange when using integer columns.
>> Consider this example:
>>
>> cat data.csv
>> 1,11
>> 2,22
>> 3,
>> 4,44
>>
>> postgres=# create table t(a int not null, b int);
>> CREATE TABLE
>>
>> postgres=# copy t from
>> '/Users/matheus/dev/pgdev/copy-on-error-set-null/data.csv' with
>> (FORCE_NOT_NULL(b), format csv, delimiter ',', ON_ERROR set_null);
>> NOTICE: 1 row was replaced with null due to data type incompatibility
>> COPY 4
>>
>> postgres=# select * from t where b is null;
>> a | b
>> ---+---
>> 3 |
>> (1 row)
>>
>> We are requiring a not null value on column b but we are still
>> generating rows with null values on b.
>>
>> The reasoning on this is that the row 3 would generate a "invalid
>> input syntax for type integer" error and the ON_ERROR set_null fix
>> this by inserting a NULL value. It make sense I think but I'm
>> wondering if it could cause any confusion?
>>
>
> After careful reading the FORCE_NOT_NULL, FORCE_NULL option.
> It's about dealing with empty value and NULL strings.
>
> copy t from stdin with(FORCE_NOT_NULL (b), format csv, delimiter ',');
> Enter data to be copied followed by a newline.
> End with a backslash and a period on a line by itself, or an EOF signal.
>>> 1,
>>> \.
> ERROR: invalid input syntax for type integer: ""
> CONTEXT: COPY t, line 1, column b: ""
>
> in this case, FORCE_NOT_NULL will convert the empty value to null
> string (empty double quote)
>
> another FORCE_NULL.
> copy t from stdin with(FORCE_NULL (b), format csv, delimiter ',');
> Enter data to be copied followed by a newline.
> End with a backslash and a period on a line by itself, or an EOF signal.
>>> 1,""
>>> \.
> COPY 1
> src9=# table t;
> a | b
> ---+---
> 1 |
> (1 row)
>
> In this case, FORCE_NULL will convert null string (empty double quote) to NULL.
>
> ON_ERROR explanation, the first sentence:
> """
> Specifies how to behave when encountering an error converting a
> column's input value into its data type.
> """
> FORCE_NULL, FORCE_NOT_NULL is a special handling of input value, ON_ERROR is
> about converting the input value to data type, so it's before ON_ERROR.
>
> Overall the current doc is fine, IMHO.
>
Yeah, after also reading more carefully it's sounds correct to me too.
> The attached patch has addressed your other points.
>
Thanks, overall the patch looks good to me. I'm attaching a diff with
just some small tweaks on documentation and error messages. Please see
and check if it's make sense.
I'm wondering if we should have an else if block on
CopyFromTextLikeOneRow() when cstate->cur_attval is NULL to handle
COPY_ON_ERROR_SET_NULL when log_verbosity is set to
COPY_LOG_VERBOSITY_VERBOSE
if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
ereport(NOTICE,
errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
cstate->cur_lineno,
cstate->cur_attname));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg("setting to null due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
+ cstate->cur_lineno,
+ cstate->cur_attname));
--
Matheus Alcantara
EDB: https://www.enterprisedb.com
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-02-06 13:00 Matheus Alcantara <[email protected]>
parent: Matheus Alcantara <[email protected]>
1 sibling, 0 replies; 31+ messages in thread
From: Matheus Alcantara @ 2026-02-06 13:00 UTC (permalink / raw)
To: jian he <[email protected]>; +Cc: torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri Feb 6, 2026 at 9:58 AM -03, Matheus Alcantara wrote:
> Thanks, overall the patch looks good to me. I'm attaching a diff with
> just some small tweaks on documentation and error messages. Please see
> and check if it's make sense.
>
I miss to include the diff file, sorry about that.
--
Matheus Alcantara
EDB: https://www.enterprisedb.com
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index dcf6e6a2f48..2aeb38a6e5f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6265,10 +6265,13 @@ FROM pg_stat_get_backend_idset() AS backendid;
<structfield>tuples_skipped</structfield> <type>bigint</type>
</para>
<para>
- Number of tuples skipped because they contain malformed data.
- This counter only advances when
- <literal>ignore</literal> is specified to the <literal>ON_ERROR</literal>
- option.
+ Number of tuples that contained malformed data. When
+ <literal>ON_ERROR</literal> is set to <literal>ignore</literal>,
+ this counts rows that were skipped. When set to
+ <literal>set_null</literal>, this counts rows where at least one
+ column was set to null due to a conversion error.
+ This counter only advances when <literal>ON_ERROR</literal>
+ is set to <literal>ignore</literal> or <literal>set_null</literal>.
</para></entry>
</row>
</tbody>
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 96ba23e961c..0a75800b8ab 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -1052,6 +1052,11 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
cstate->num_errors++;
else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
{
+ /*
+ * Reset error state so the subsequent InputFunctionCallSafe
+ * call (for domain constraint check) can properly report
+ * whether it succeeded or failed.
+ */
cstate->escontext->error_occurred = false;
Assert(cstate->domain_with_constraint != NULL);
@@ -1075,13 +1080,17 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
else if (string == NULL)
ereport(ERROR,
errcode(ERRCODE_NOT_NULL_VIOLATION),
- errmsg("domain %s does not allow null values", format_type_be(typioparams[m])),
+ errmsg("null value in column \"%s\" violates not-null constraint of domain %s",
+ cstate->cur_attname, format_type_be(typioparams[m])),
errdatatype(typioparams[m]));
else
ereport(ERROR,
- errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
- errmsg("invalid input value for domain %s: \"%s\"",
- format_type_be(typioparams[m]), string));
+ errcode(ERRCODE_NOT_NULL_VIOLATION),
+ errmsg("cannot set null value for column \"%s\" with domain %s",
+ cstate->cur_attname, format_type_be(typioparams[m])),
+ errdetail("Column \"%s\" does not accept null values, so ON_ERROR SET_NULL cannot be applied.",
+ cstate->cur_attname),
+ errdatatype(typioparams[m]));
/*
* We count only the number of rows (not fields) where
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 72034796aca..d9cc7bf5f48 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -799,13 +799,15 @@ CREATE DOMAIN d_int_positive_maybe_null AS integer CHECK(value > 0);
CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c integer);
\pset null NULL
COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
-ERROR: domain d_int_not_null does not allow null values
+ERROR: null value in column "a" violates not-null constraint of domain d_int_not_null
CONTEXT: COPY t_on_error_null, line 1, column a: null input
COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
-ERROR: invalid input value for domain d_int_not_null: "ss"
+ERROR: cannot set null value for column "a" with domain d_int_not_null
+DETAIL: Column "a" does not accept null values, so ON_ERROR SET_NULL cannot be applied.
CONTEXT: COPY t_on_error_null, line 1, column a: "ss"
COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
-ERROR: invalid input value for domain d_int_not_null: "-1"
+ERROR: cannot set null value for column "a" with domain d_int_not_null
+DETAIL: Column "a" does not accept null values, so ON_ERROR SET_NULL cannot be applied.
CONTEXT: COPY t_on_error_null, line 1, column a: "-1"
--fail, less data.
COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
Attachments:
[text/plain] v22-0001-tweaks.diff.nocfbot (4.2K, 2-v22-0001-tweaks.diff.nocfbot)
download | inline diff:
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index dcf6e6a2f48..2aeb38a6e5f 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6265,10 +6265,13 @@ FROM pg_stat_get_backend_idset() AS backendid;
<structfield>tuples_skipped</structfield> <type>bigint</type>
</para>
<para>
- Number of tuples skipped because they contain malformed data.
- This counter only advances when
- <literal>ignore</literal> is specified to the <literal>ON_ERROR</literal>
- option.
+ Number of tuples that contained malformed data. When
+ <literal>ON_ERROR</literal> is set to <literal>ignore</literal>,
+ this counts rows that were skipped. When set to
+ <literal>set_null</literal>, this counts rows where at least one
+ column was set to null due to a conversion error.
+ This counter only advances when <literal>ON_ERROR</literal>
+ is set to <literal>ignore</literal> or <literal>set_null</literal>.
</para></entry>
</row>
</tbody>
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 96ba23e961c..0a75800b8ab 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -1052,6 +1052,11 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
cstate->num_errors++;
else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
{
+ /*
+ * Reset error state so the subsequent InputFunctionCallSafe
+ * call (for domain constraint check) can properly report
+ * whether it succeeded or failed.
+ */
cstate->escontext->error_occurred = false;
Assert(cstate->domain_with_constraint != NULL);
@@ -1075,13 +1080,17 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
else if (string == NULL)
ereport(ERROR,
errcode(ERRCODE_NOT_NULL_VIOLATION),
- errmsg("domain %s does not allow null values", format_type_be(typioparams[m])),
+ errmsg("null value in column \"%s\" violates not-null constraint of domain %s",
+ cstate->cur_attname, format_type_be(typioparams[m])),
errdatatype(typioparams[m]));
else
ereport(ERROR,
- errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
- errmsg("invalid input value for domain %s: \"%s\"",
- format_type_be(typioparams[m]), string));
+ errcode(ERRCODE_NOT_NULL_VIOLATION),
+ errmsg("cannot set null value for column \"%s\" with domain %s",
+ cstate->cur_attname, format_type_be(typioparams[m])),
+ errdetail("Column \"%s\" does not accept null values, so ON_ERROR SET_NULL cannot be applied.",
+ cstate->cur_attname),
+ errdatatype(typioparams[m]));
/*
* We count only the number of rows (not fields) where
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 72034796aca..d9cc7bf5f48 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -799,13 +799,15 @@ CREATE DOMAIN d_int_positive_maybe_null AS integer CHECK(value > 0);
CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c integer);
\pset null NULL
COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
-ERROR: domain d_int_not_null does not allow null values
+ERROR: null value in column "a" violates not-null constraint of domain d_int_not_null
CONTEXT: COPY t_on_error_null, line 1, column a: null input
COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
-ERROR: invalid input value for domain d_int_not_null: "ss"
+ERROR: cannot set null value for column "a" with domain d_int_not_null
+DETAIL: Column "a" does not accept null values, so ON_ERROR SET_NULL cannot be applied.
CONTEXT: COPY t_on_error_null, line 1, column a: "ss"
COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
-ERROR: invalid input value for domain d_int_not_null: "-1"
+ERROR: cannot set null value for column "a" with domain d_int_not_null
+DETAIL: Column "a" does not accept null values, so ON_ERROR SET_NULL cannot be applied.
CONTEXT: COPY t_on_error_null, line 1, column a: "-1"
--fail, less data.
COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-02-09 03:59 jian he <[email protected]>
parent: Matheus Alcantara <[email protected]>
1 sibling, 2 replies; 31+ messages in thread
From: jian he @ 2026-02-09 03:59 UTC (permalink / raw)
To: Matheus Alcantara <[email protected]>; +Cc: torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri, Feb 6, 2026 at 8:58 PM Matheus Alcantara
<[email protected]> wrote:
> >
> Thanks, overall the patch looks good to me. I'm attaching a diff with
> just some small tweaks on documentation and error messages. Please see
> and check if it's make sense.
>
In the function CopyFrom, we have:
if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE &&
cstate->escontext->error_occurred)
{
cstate->escontext->error_occurred = false;
pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
cstate->num_errors);
That means PROGRESS_COPY_TUPLES_SKIPPED applied for COPY_ON_ERROR_IGNORE only.
So
<para>
Number of tuples skipped because they contain malformed data.
This counter only advances when
<literal>ignore</literal> is specified to the <literal>ON_ERROR</literal>
option.
</para></entry>
should be ok.
> I'm wondering if we should have an else if block on
> CopyFromTextLikeOneRow() when cstate->cur_attval is NULL to handle
> COPY_ON_ERROR_SET_NULL when log_verbosity is set to
> COPY_LOG_VERBOSITY_VERBOSE
>
> if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
> ereport(NOTICE,
> errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
> cstate->cur_lineno,
> cstate->cur_attname));
> + else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
> + ereport(NOTICE,
> + errmsg("setting to null due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
> + cstate->cur_lineno,
> + cstate->cur_attname));
>
CopyFromTextLikeOneRow, we have:
cstate->cur_attname = NameStr(att->attname);
cstate->cur_attval = string;
even if "string" is NULL (two InputFunctionCallSafe function call with
"str" value as NULL), it will fail at
```
else if (string == NULL)
ereport(ERROR,
errcode(ERRCODE_NOT_NULL_VIOLATION),
errmsg("null value in column \"%s\"
violates not-null constraint of domain %s",
cstate->cur_attname,
format_type_be(typioparams[m])),
errdatatype(typioparams[m]));
```
so i think condition like:
if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE &&
cstate->cur_attval == NULL &&
cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
is not reachable.
therefore I didn't add the ELSE IF block.
inspired by your change, I further simplified the error handling code.
--
jian
https://www.enterprisedb.com/
Attachments:
[text/x-patch] v23-0001-COPY-on_error-set_null.patch (23.4K, 2-v23-0001-COPY-on_error-set_null.patch)
download | inline diff:
From 114eeb0338510b6c55d8b0c7657242c6a275eb8d Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Sun, 8 Feb 2026 23:47:03 +0800
Subject: [PATCH v23 1/1] COPY (on_error set_null)
If ON_ERROR SET_NULL is specified during COPY FROM, any data type conversion
errors will result in the affected column being set to NULL. However, column's
not-null constraints are still enforced, attempting to set a NULL value in such
columns will raise a constraint violation error. This applies to column data
type is a domain with a NOT NULL constraint.
Author: Jian He <[email protected]>
Author: Kirill Reshke <[email protected]>
Reviewed-by:
Fujii Masao <[email protected]>
Jim Jones <[email protected]>
"David G. Johnston" <[email protected]>
Yugo NAGATA <[email protected]>
torikoshia <[email protected]>
Masahiko Sawada <[email protected]>
Atsushi Torikoshi <[email protected]>
Matheus Alcantara <[email protected]>
discussion: https://postgr.es/m/CAKFQuwawy1e6YR4S=j+y7pXqg_Dw1WBVrgvf=BP3d1_aSfe_+Q@mail.gmail.com
commitfest entry: https://commitfest.postgresql.org/patch/4810
---
doc/src/sgml/monitoring.sgml | 4 +-
doc/src/sgml/ref/copy.sgml | 35 +++++++---
src/backend/commands/copy.c | 6 +-
src/backend/commands/copyfrom.c | 46 +++++++++---
src/backend/commands/copyfromparse.c | 89 ++++++++++++++++++++----
src/bin/psql/tab-complete.in.c | 2 +-
src/include/commands/copy.h | 1 +
src/include/commands/copyfrom_internal.h | 7 ++
src/test/regress/expected/copy2.out | 58 +++++++++++++++
src/test/regress/sql/copy2.sql | 43 ++++++++++++
10 files changed, 254 insertions(+), 37 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b77d189a500..dcf6e6a2f48 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6266,8 +6266,8 @@ FROM pg_stat_get_backend_idset() AS backendid;
</para>
<para>
Number of tuples skipped because they contain malformed data.
- This counter only advances when a value other than
- <literal>stop</literal> is specified to the <literal>ON_ERROR</literal>
+ This counter only advances when
+ <literal>ignore</literal> is specified to the <literal>ON_ERROR</literal>
option.
</para></entry>
</row>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 53b0ea8f573..acc808dbb6e 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -413,22 +413,38 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
input value into its data type.
An <replaceable class="parameter">error_action</replaceable> value of
<literal>stop</literal> means fail the command, while
- <literal>ignore</literal> means discard the input row and continue with the next one.
+ <literal>ignore</literal> means discard the input row and continue with the next one,
+ and <literal>set_null</literal> means replace field containing invalid
+ input value with <literal>NULL</literal> and continue to the next field.
The default is <literal>stop</literal>.
</para>
<para>
- The <literal>ignore</literal> option is applicable only for <command>COPY FROM</command>
+ The <literal>ignore</literal> and <literal>set_null</literal>
+ options are applicable only for <command>COPY FROM</command>
when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
</para>
+
+ <para>
+ For <literal>ignore</literal> option, a <literal>NOTICE</literal> message
+ containing the ignored row count is emitted at the end of the <command>COPY FROM</command>
+ if at least one row was discarded.
+ For <literal>set_null</literal> option, a <literal>NOTICE</literal>
+ message indicating the number of rows where invalid input values were
+ replaced with null is emitted at the end of the <command>COPY FROM</command>
+ if at least one row was replaced.
+ </para>
+
<para>
- A <literal>NOTICE</literal> message containing the ignored row count is
- emitted at the end of the <command>COPY FROM</command> if at least one
- row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
- <literal>verbose</literal>, a <literal>NOTICE</literal> message
+ When <literal>LOG_VERBOSITY</literal> option is set to <literal>verbose</literal>,
+ for <literal>ignore</literal> option, a <literal>NOTICE</literal> message
containing the line of the input file and the column name whose input
- conversion has failed is emitted for each discarded row.
+ conversion has failed is emitted for each discarded row;
+ for <literal>set_null</literal> option, a <literal>NOTICE</literal>
+ message containing the line of the input file and the column name where
+ value was replaced with <literal>NULL</literal> for each input conversion
+ failure.
When it is set to <literal>silent</literal>, no message is emitted
- regarding ignored rows.
+ regarding input conversion failed rows.
</para>
</listitem>
</varlistentry>
@@ -476,7 +492,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
</para>
<para>
This is currently used in <command>COPY FROM</command> command when
- <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+ <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>
+ or <literal>set_null</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 155a79a70c5..6de0b643da1 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -469,12 +469,14 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
parser_errposition(pstate, def->location)));
/*
- * Allow "stop", or "ignore" values.
+ * Allow "stop", "ignore", "set_null" values.
*/
if (pg_strcasecmp(sval, "stop") == 0)
return COPY_ON_ERROR_STOP;
if (pg_strcasecmp(sval, "ignore") == 0)
return COPY_ON_ERROR_IGNORE;
+ if (pg_strcasecmp(sval, "set_null") == 0)
+ return COPY_ON_ERROR_SET_NULL;
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -984,7 +986,7 @@ ProcessCopyOptions(ParseState *pstate,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("only ON_ERROR STOP is allowed in BINARY mode")));
- if (opts_out->reject_limit && !opts_out->on_error)
+ if (opts_out->reject_limit && opts_out->on_error != COPY_ON_ERROR_IGNORE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first and second %s are the names of COPY option, e.g.
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 25ee20b23db..1213f3d8832 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -50,6 +50,7 @@
#include "utils/portal.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
+#include "utils/typcache.h"
/*
* No more than this many tuples per CopyMultiInsertBuffer
@@ -1467,14 +1468,22 @@ CopyFrom(CopyFromState cstate)
/* Done, clean up */
error_context_stack = errcallback.previous;
- if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
- cstate->num_errors > 0 &&
+ if (cstate->num_errors > 0 &&
cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
- ereport(NOTICE,
- errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
- "%" PRIu64 " rows were skipped due to data type incompatibility",
- cstate->num_errors,
- cstate->num_errors));
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
+ "%" PRIu64 " rows were skipped due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg_plural("%" PRIu64 " row was replaced with null due to data type incompatibility",
+ "%" PRIu64 " rows were replaced with null due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ }
if (bistate != NULL)
FreeBulkInsertState(bistate);
@@ -1622,15 +1631,32 @@ BeginCopyFrom(ParseState *pstate,
cstate->escontext->error_occurred = false;
/*
- * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
- * options later
+ * Currently we only support COPY_ON_ERROR_IGNORE,
+ * COPY_ON_ERROR_SET_NULL. We'll add other options later
*/
- if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE ||
+ cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
cstate->escontext->details_wanted = false;
}
else
cstate->escontext = NULL;
+ if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ int attr_count = list_length(cstate->attnumlist);
+
+ cstate->domain_with_constraint = palloc0_array(bool, attr_count);
+
+ foreach_int(attno, cstate->attnumlist)
+ {
+ int i = foreach_current_index(attno);
+
+ Form_pg_attribute att = TupleDescAttr(tupDesc, attno - 1);
+
+ cstate->domain_with_constraint[i] = DomainHasConstraints(att->atttypid);
+ }
+ }
+
/* Convert FORCE_NULL name list to per-column flags, check validity */
cstate->opts.force_null_flags = (bool *) palloc0(num_phys_attrs * sizeof(bool));
if (cstate->opts.force_null_all)
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 94d6f415a06..c34dd7fcd72 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -958,6 +958,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
int fldct;
int fieldno;
char *string;
+ bool current_row_erroneous = false;
tupDesc = RelationGetDescr(cstate->rel);
attr_count = list_length(cstate->attnumlist);
@@ -1035,7 +1036,8 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
}
/*
- * If ON_ERROR is specified with IGNORE, skip rows with soft errors
+ * If ON_ERROR is specified with IGNORE, skip rows with soft errors.
+ * If ON_ERROR is specified with SET_NULL, try to replace with null.
*/
else if (!InputFunctionCallSafe(&in_functions[m],
string,
@@ -1046,7 +1048,55 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
{
Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
- cstate->num_errors++;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ cstate->num_errors++;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ /*
+ * Reset error state so the subsequent InputFunctionCallSafe
+ * call (for domain constraint check) can properly report
+ * whether it succeeded or failed.
+ */
+ cstate->escontext->error_occurred = false;
+
+ Assert(cstate->domain_with_constraint != NULL);
+
+ /*
+ * If the column type is a constrained domain, an additional
+ * InputFunctionCallSafe may be needed to raise error for
+ * domain constraint violation.
+ */
+ if (!cstate->domain_with_constraint[m] ||
+ InputFunctionCallSafe(&in_functions[m],
+ NULL,
+ typioparams[m],
+ att->atttypmod,
+ (Node *) cstate->escontext,
+ &values[m]))
+ {
+ nulls[m] = true;
+ values[m] = (Datum) 0;
+ }
+ else
+ ereport(ERROR,
+ errcode(ERRCODE_NOT_NULL_VIOLATION),
+ errmsg("domain %s does not allow null values",
+ format_type_be(typioparams[m])),
+ errdetail("ON_ERROR SET_NULL cannot be applied because column \"%s\" (domain %s) does not accept null values.",
+ cstate->cur_attname,
+ format_type_be(typioparams[m])),
+ errdatatype(typioparams[m]));
+
+ /*
+ * We count only the number of rows (not fields) where
+ * ON_ERROR SET_NULL was applied.
+ */
+ if (!current_row_erroneous)
+ {
+ current_row_erroneous = true;
+ cstate->num_errors++;
+ }
+ }
if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
{
@@ -1063,24 +1113,37 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
char *attval;
attval = CopyLimitPrintoutLength(cstate->cur_attval);
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
- cstate->cur_lineno,
- cstate->cur_attname,
- attval));
+
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg("setting to null due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
pfree(attval);
}
else
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
- cstate->cur_lineno,
- cstate->cur_attname));
-
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
+ cstate->cur_lineno,
+ cstate->cur_attname));
+ }
/* reset relname_only */
cstate->relname_only = false;
}
- return true;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ return true;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ continue;
}
cstate->cur_attname = NULL;
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index 8b91bc00062..e2c72bd3059 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -3425,7 +3425,7 @@ match_previous_words(int pattern_id,
/* Complete COPY <sth> FROM filename WITH (ON_ERROR */
else if (TailMatches("ON_ERROR"))
- COMPLETE_WITH("stop", "ignore");
+ COMPLETE_WITH("stop", "ignore", "set_null");
/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
else if (TailMatches("LOG_VERBOSITY"))
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 43c2580539f..877202af67b 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -35,6 +35,7 @@ typedef enum CopyOnErrorChoice
{
COPY_ON_ERROR_STOP = 0, /* immediately throw errors, default */
COPY_ON_ERROR_IGNORE, /* ignore errors */
+ COPY_ON_ERROR_SET_NULL, /* set error field to null */
} CopyOnErrorChoice;
/*
diff --git a/src/include/commands/copyfrom_internal.h b/src/include/commands/copyfrom_internal.h
index 822ef33cf69..f892c343157 100644
--- a/src/include/commands/copyfrom_internal.h
+++ b/src/include/commands/copyfrom_internal.h
@@ -108,6 +108,13 @@ typedef struct CopyFromStateData
* att */
bool *defaults; /* if DEFAULT marker was found for
* corresponding att */
+
+ /*
+ * True if the corresponding attribute's is a constrained domain. This
+ * will be populated only when ON_ERROR is SET_NULL, otherwise NULL.
+ */
+ bool *domain_with_constraint;
+
bool volatile_defexprs; /* is any of defexprs volatile? */
List *range_table; /* single element list of RangeTblEntry */
List *rteperminfos; /* single element list of RTEPermissionInfo */
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 3145b314e48..045c0f81243 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
^
+COPY x from stdin (on_error set_null, on_error set_null);
+ERROR: conflicting or redundant options
+LINE 1: COPY x from stdin (on_error set_null, on_error set_null);
+ ^
COPY x from stdin (log_verbosity default, log_verbosity verbose);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
@@ -92,6 +96,10 @@ COPY x from stdin (format BINARY, null 'x');
ERROR: cannot specify NULL in BINARY mode
COPY x from stdin (format BINARY, on_error ignore);
ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (format BINARY, on_error set_null);
+ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (on_error set_null, reject_limit 2);
+ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
@@ -124,6 +132,10 @@ COPY x to stdout (format BINARY, on_error unsupported);
ERROR: COPY ON_ERROR cannot be used with COPY TO
LINE 1: COPY x to stdout (format BINARY, on_error unsupported);
^
+COPY x to stdout (on_error set_null);
+ERROR: COPY ON_ERROR cannot be used with COPY TO
+LINE 1: COPY x to stdout (on_error set_null);
+ ^
COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
@@ -782,6 +794,49 @@ CONTEXT: COPY check_ign_err
NOTICE: skipping row due to data type incompatibility at line 8 for column "k": "a"
CONTEXT: COPY check_ign_err
NOTICE: 6 rows were skipped due to data type incompatibility
+CREATE DOMAIN d_int_not_null AS integer NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS integer CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c integer);
+\pset null NULL
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ERROR: domain d_int_not_null does not allow null values
+DETAIL: ON_ERROR SET_NULL cannot be applied because column "a" (domain d_int_not_null) does not accept null values.
+CONTEXT: COPY t_on_error_null, line 1, column a: null input
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ERROR: domain d_int_not_null does not allow null values
+DETAIL: ON_ERROR SET_NULL cannot be applied because column "a" (domain d_int_not_null) does not accept null values.
+CONTEXT: COPY t_on_error_null, line 1, column a: "ss"
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ERROR: domain d_int_not_null does not allow null values
+DETAIL: ON_ERROR SET_NULL cannot be applied because column "a" (domain d_int_not_null) does not accept null values.
+CONTEXT: COPY t_on_error_null, line 1, column a: "-1"
+--fail, less data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: missing data for column "c"
+CONTEXT: COPY t_on_error_null, line 1: "1,1"
+--fail, extra data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: extra data after last expected column
+CONTEXT: COPY t_on_error_null, line 1: "1,2,3,4"
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose); --ok
+NOTICE: setting to null due to data type incompatibility at line 1 for column "b": "x1"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 1 for column "c": "yx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 2 for column "b": "zx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 3 for column "c": "ea"
+CONTEXT: COPY t_on_error_null
+NOTICE: 3 rows were replaced with null due to data type incompatibility
+SELECT * FROM t_on_error_null ORDER BY a;
+ a | b | c
+----+------+------
+ 10 | NULL | NULL
+ 11 | NULL | 12
+ 13 | 14 | NULL
+(3 rows)
+
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -841,6 +896,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 66435167500..5e9f6af2c5c 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,12 +67,15 @@ COPY x from stdin (force_null (a), force_null (b));
COPY x from stdin (convert_selectively (a), convert_selectively (b));
COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (on_error set_null, on_error set_null);
COPY x from stdin (log_verbosity default, log_verbosity verbose);
-- incorrect options
COPY x from stdin (format BINARY, delimiter ',');
COPY x from stdin (format BINARY, null 'x');
COPY x from stdin (format BINARY, on_error ignore);
+COPY x from stdin (format BINARY, on_error set_null);
+COPY x from stdin (on_error set_null, reject_limit 2);
COPY x from stdin (on_error unsupported);
COPY x from stdin (format TEXT, force_quote(a));
COPY x from stdin (format TEXT, force_quote *);
@@ -87,6 +90,7 @@ COPY x from stdin (format TEXT, force_null *);
COPY x to stdout (format CSV, force_null(a));
COPY x to stdout (format CSV, force_null *);
COPY x to stdout (format BINARY, on_error unsupported);
+COPY x to stdout (on_error set_null);
COPY x from stdin (log_verbosity unsupported);
COPY x from stdin with (reject_limit 1);
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -540,6 +544,42 @@ a {2} 2
8 {8} 8
\.
+CREATE DOMAIN d_int_not_null AS integer NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS integer CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c integer);
+
+\pset null NULL
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+\N 11 13
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+ss 11 14
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); --fail
+-1 11 13
+\.
+
+--fail, less data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,1
+\.
+--fail, extra data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,2,3,4
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose); --ok
+10 x1 yx
+11 zx 12
+13 14 ea
+\.
+
+SELECT * FROM t_on_error_null ORDER BY a;
+
+\pset null ''
+
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -609,6 +649,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
--
2.34.1
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-02-09 15:36 Matheus Alcantara <[email protected]>
parent: jian he <[email protected]>
1 sibling, 0 replies; 31+ messages in thread
From: Matheus Alcantara @ 2026-02-09 15:36 UTC (permalink / raw)
To: jian he <[email protected]>; +Cc: torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On 09/02/26 00:59, jian he wrote:
>> Thanks, overall the patch looks good to me. I'm attaching a diff with
>> just some small tweaks on documentation and error messages. Please see
>> and check if it's make sense.
>>
> In the function CopyFrom, we have:
> if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE &&
> cstate->escontext->error_occurred)
> {
> cstate->escontext->error_occurred = false;
> pgstat_progress_update_param(PROGRESS_COPY_TUPLES_SKIPPED,
> cstate->num_errors);
>
> That means PROGRESS_COPY_TUPLES_SKIPPED applied for COPY_ON_ERROR_IGNORE only.
> So
> <para>
> Number of tuples skipped because they contain malformed data.
> This counter only advances when
> <literal>ignore</literal> is specified to the <literal>ON_ERROR</literal>
> option.
> </para></entry>
> should be ok.
>
Ok, agree.
>> I'm wondering if we should have an else if block on
>> CopyFromTextLikeOneRow() when cstate->cur_attval is NULL to handle
>> COPY_ON_ERROR_SET_NULL when log_verbosity is set to
>> COPY_LOG_VERBOSITY_VERBOSE
>>
>> if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
>> ereport(NOTICE,
>> errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
>> cstate->cur_lineno,
>> cstate->cur_attname));
>> + else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
>> + ereport(NOTICE,
>> + errmsg("setting to null due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
>> + cstate->cur_lineno,
>> + cstate->cur_attname));
>>
>
> CopyFromTextLikeOneRow, we have:
> cstate->cur_attname = NameStr(att->attname);
> cstate->cur_attval = string;
>
> even if "string" is NULL (two InputFunctionCallSafe function call with
> "str" value as NULL), it will fail at
> ```
> else if (string == NULL)
> ereport(ERROR,
> errcode(ERRCODE_NOT_NULL_VIOLATION),
> errmsg("null value in column \"%s\"
> violates not-null constraint of domain %s",
> cstate->cur_attname,
> format_type_be(typioparams[m])),
> errdatatype(typioparams[m]));
> ```
> so i think condition like:
> if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE &&
> cstate->cur_attval == NULL &&
> cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
> is not reachable.
> therefore I didn't add the ELSE IF block.
>
Ok, make sense. I've tested and it seems correct.
> inspired by your change, I further simplified the error handling code.
>
Thanks for the new version. It looks good to me. I don't have any
other comments.
--
Matheus Alcantara
EDB: https://www.enterprisedb.com
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-02-25 07:36 Peter Eisentraut <[email protected]>
parent: jian he <[email protected]>
1 sibling, 1 reply; 31+ messages in thread
From: Peter Eisentraut @ 2026-02-25 07:36 UTC (permalink / raw)
To: jian he <[email protected]>; Matheus Alcantara <[email protected]>; +Cc: torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
I have a few more cosmetic changes to suggest:
- doc/src/sgml/ref/copy.sgml
+ and <literal>set_null</literal> means replace field containing
invalid
should be "the field" and "the invalid"
+ input value with <literal>NULL</literal> and continue to the next
field.
change <literal>NULL</literal> to "a null value"
+ <para>
+ For <literal>ignore</literal> option, a <literal>NOTICE</literal>
message
+ containing the ignored row count is emitted at the end of the
<command>COPY FROM</command>
+ if at least one row was discarded.
+ For <literal>set_null</literal> option, a <literal>NOTICE</literal>
+ message indicating the number of rows where invalid input values were
+ replaced with null is emitted at the end of the <command>COPY
FROM</command>
+ if at least one row was replaced.
+ </para>
I think this could be written more compactly, like
If on_error is set to ignore or set_null, a NOTICE message is emitted
at the end of the COPY FROM command containing the count of rows that
were ignored or changed, if at least one row was affected.
- src/backend/commands/copy.c
/*
- * Allow "stop", or "ignore" values.
+ * Allow "stop", "ignore", "set_null" values.
*/
Just remove that comment. It is evident from the following code.
- src/backend/commands/copyfrom.c
+ "%" PRIu64 " rows were replaced with null due to data type
incompatibility"
I think this is not quite correctly worded. It should be something like
in NNN rows, columns were set to null due to ...
because you are not setting the whole row to null.
/*
- * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
- * options later
+ * Currently we only support COPY_ON_ERROR_IGNORE,
+ * COPY_ON_ERROR_SET_NULL. We'll add other options later
*/
Delete this comment.
+ if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ int attr_count = list_length(cstate->attnumlist);
+
+ cstate->domain_with_constraint = palloc0_array(bool, attr_count);
Maybe add a comment for this block to explain that you are collecting
information about domains for later.
- src/backend/commands/copyfromparse.c
/*
- * If ON_ERROR is specified with IGNORE, skip rows with soft errors
+ * If ON_ERROR is specified with IGNORE, skip rows with soft errors.
+ * If ON_ERROR is specified with SET_NULL, try to replace with null.
*/
Trim this comment. Maybe "If ON_ERROR is specified, handle the
different options". We don't need to re-explain here what the options do.
+ /*
+ * If the column type is a constrained domain, an additional
+ * InputFunctionCallSafe may be needed to raise error for
+ * domain constraint violation.
+ */
Why "may be needed"? Is it sometimes not needed? Why, under what
circumstances?
The subsequent error message writes "domain ... does not allow null
values", but AFAICT a domain input failure could also be due to a check
constraint failure? How would that be handled? The flow here is a bit
confusing.
- src/test/regress/sql/copy2.sql
I suggest adding a space after "--" in a comment, like "-- error"
instead of "--error".
Similarly, a space after CHECK, like "CHECK (...)".
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-02-28 02:04 jian he <[email protected]>
parent: Peter Eisentraut <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: jian he @ 2026-02-28 02:04 UTC (permalink / raw)
To: Peter Eisentraut <[email protected]>; +Cc: Matheus Alcantara <[email protected]>; torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed, Feb 25, 2026 at 3:36 PM Peter Eisentraut <[email protected]> wrote:
>
> + if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
> + {
> + int attr_count = list_length(cstate->attnumlist);
> +
> + cstate->domain_with_constraint = palloc0_array(bool, attr_count);
>
> Maybe add a comment for this block to explain that you are collecting
> information about domains for later.
>
Here is what I came up with:
+ /*
+ * When data type conversion fails and ON_ERROR is SET_NULL, we need
+ * ensure that input column allows NULL value, ExecConstraints will
+ * cover most of the cases, however it does not vertify domain
+ * constraints. Therefore, for constrained domains, NULL value check
+ * must be performed during the initial string-to-datum conversion
+ * (see CopyFromTextLikeOneRow).
+ */
> + /*
> + * If the column type is a constrained domain, an additional
> + * InputFunctionCallSafe may be needed to raise error for
> + * domain constraint violation.
> + */
>
> Why "may be needed"? Is it sometimes not needed? Why, under what
> circumstances?
I changed the comments to:
+ /*
+ * For constrained domain types, we need an additional
+ * InputFunctionCallSafe to ensure that an error is thrown if
+ * the domain constraint rejects NULL.
+ */
>
> The subsequent error message writes "domain ... does not allow null
> values", but AFAICT a domain input failure could also be due to a check
> constraint failure? How would that be handled? The flow here is a bit
> confusing.
>
create domain d3 as int check (value is not null);
create table t(a d3);
src4=# copy t1 from stdin (on_error set_null);
Enter data to be copied followed by a newline.
End with a backslash and a period on a line by itself, or an EOF signal.
>> \N
>> \.
ERROR: domain d1 does not allow null values
DETAIL: ON_ERROR SET_NULL cannot be applied because column "a"
(domain d1) does not accept null values.
CONTEXT: COPY t1, line 1, column a: null input
It's more about whether all domain constraints allow a NULL value,
In this context, the domain constraint is a CHECK constraint.
``InputFunctionCallSafe(&in_functions[m], NULL,``
this check whether a NULL value is allowed for this domain.
ExecConstraints does not handle domain constraints, so this is needed.
The error message:
``errmsg("domain %s does not allow null values",``
should be fine?
All other suggestions have been incorporated into v24.
--
jian
https://www.enterprisedb.com/
Attachments:
[text/x-patch] v24-0001-COPY-on_error-set_null.patch (23.6K, 2-v24-0001-COPY-on_error-set_null.patch)
download | inline diff:
From 61d2a81c50ac1691b4a328ae5a8fc24254f23c8d Mon Sep 17 00:00:00 2001
From: jian he <[email protected]>
Date: Sat, 28 Feb 2026 09:57:42 +0800
Subject: [PATCH v24 1/1] COPY (on_error set_null)
If ON_ERROR SET_NULL is specified during COPY FROM, any data type conversion
errors will result in the affected column being set to NULL. However, column's
not-null constraints are still enforced, attempting to set a NULL value in such
columns will raise a constraint violation error. This applies to column data
type is a domain with a NOT NULL constraint.
Author: Jian He <[email protected]>
Author: Kirill Reshke <[email protected]>
Reviewed-by:
Fujii Masao <[email protected]>
Jim Jones <[email protected]>
"David G. Johnston" <[email protected]>
Yugo NAGATA <[email protected]>
torikoshia <[email protected]>
Masahiko Sawada <[email protected]>
Atsushi Torikoshi <[email protected]>
Matheus Alcantara <[email protected]>
Peter Eisentraut <[email protected]>
discussion: https://postgr.es/m/CAKFQuwawy1e6YR4S=j+y7pXqg_Dw1WBVrgvf=BP3d1_aSfe_+Q@mail.gmail.com
commitfest entry: https://commitfest.postgresql.org/patch/4810
---
doc/src/sgml/monitoring.sgml | 4 +-
doc/src/sgml/ref/copy.sgml | 28 +++++---
src/backend/commands/copy.c | 7 +-
src/backend/commands/copyfrom.c | 54 ++++++++++----
src/backend/commands/copyfromparse.c | 90 ++++++++++++++++++++----
src/bin/psql/tab-complete.in.c | 2 +-
src/include/commands/copy.h | 1 +
src/include/commands/copyfrom_internal.h | 7 ++
src/test/regress/expected/copy2.out | 58 +++++++++++++++
src/test/regress/sql/copy2.sql | 43 +++++++++++
10 files changed, 251 insertions(+), 43 deletions(-)
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index b77d189a500..dcf6e6a2f48 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -6266,8 +6266,8 @@ FROM pg_stat_get_backend_idset() AS backendid;
</para>
<para>
Number of tuples skipped because they contain malformed data.
- This counter only advances when a value other than
- <literal>stop</literal> is specified to the <literal>ON_ERROR</literal>
+ This counter only advances when
+ <literal>ignore</literal> is specified to the <literal>ON_ERROR</literal>
option.
</para></entry>
</row>
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 53b0ea8f573..b7270848bab 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -413,22 +413,31 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
input value into its data type.
An <replaceable class="parameter">error_action</replaceable> value of
<literal>stop</literal> means fail the command, while
- <literal>ignore</literal> means discard the input row and continue with the next one.
+ <literal>ignore</literal> means discard the input row and continue with the next one,
+ and <literal>set_null</literal> means replace the field containing the invalid
+ input value with a null value and continue to the next field.
The default is <literal>stop</literal>.
</para>
<para>
- The <literal>ignore</literal> option is applicable only for <command>COPY FROM</command>
+ The <literal>ignore</literal> and <literal>set_null</literal>
+ options are applicable only for <command>COPY FROM</command>
when the <literal>FORMAT</literal> is <literal>text</literal> or <literal>csv</literal>.
+ If <literal>ON_ERROR</literal> is set to <literal>ignore</literal> or
+ <literal>set_null</literal>, a NOTICE message is emitted at the end of the
+ COPY FROM command containing the count of rows that were ignored or
+ changed, if at least one row was affected.
</para>
<para>
- A <literal>NOTICE</literal> message containing the ignored row count is
- emitted at the end of the <command>COPY FROM</command> if at least one
- row was discarded. When <literal>LOG_VERBOSITY</literal> option is set to
- <literal>verbose</literal>, a <literal>NOTICE</literal> message
+ When <literal>LOG_VERBOSITY</literal> option is set to <literal>verbose</literal>,
+ for <literal>ignore</literal> option, a <literal>NOTICE</literal> message
containing the line of the input file and the column name whose input
- conversion has failed is emitted for each discarded row.
+ conversion has failed is emitted for each discarded row;
+ for <literal>set_null</literal> option, a <literal>NOTICE</literal>
+ message containing the line of the input file and the column name where
+ value was replaced with <literal>NULL</literal> for each input conversion
+ failure.
When it is set to <literal>silent</literal>, no message is emitted
- regarding ignored rows.
+ regarding input conversion failed rows.
</para>
</listitem>
</varlistentry>
@@ -476,7 +485,8 @@ COPY { <replaceable class="parameter">table_name</replaceable> [ ( <replaceable
</para>
<para>
This is currently used in <command>COPY FROM</command> command when
- <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>.
+ <literal>ON_ERROR</literal> option is set to <literal>ignore</literal>
+ or <literal>set_null</literal>.
</para>
</listitem>
</varlistentry>
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index 155a79a70c5..63b86802ba2 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -468,13 +468,12 @@ defGetCopyOnErrorChoice(DefElem *def, ParseState *pstate, bool is_from)
errmsg("COPY %s cannot be used with %s", "ON_ERROR", "COPY TO"),
parser_errposition(pstate, def->location)));
- /*
- * Allow "stop", or "ignore" values.
- */
if (pg_strcasecmp(sval, "stop") == 0)
return COPY_ON_ERROR_STOP;
if (pg_strcasecmp(sval, "ignore") == 0)
return COPY_ON_ERROR_IGNORE;
+ if (pg_strcasecmp(sval, "set_null") == 0)
+ return COPY_ON_ERROR_SET_NULL;
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
@@ -984,7 +983,7 @@ ProcessCopyOptions(ParseState *pstate,
(errcode(ERRCODE_SYNTAX_ERROR),
errmsg("only ON_ERROR STOP is allowed in BINARY mode")));
- if (opts_out->reject_limit && !opts_out->on_error)
+ if (opts_out->reject_limit && opts_out->on_error != COPY_ON_ERROR_IGNORE)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
/*- translator: first and second %s are the names of COPY option, e.g.
diff --git a/src/backend/commands/copyfrom.c b/src/backend/commands/copyfrom.c
index 2b7556b287c..607bab3fc7c 100644
--- a/src/backend/commands/copyfrom.c
+++ b/src/backend/commands/copyfrom.c
@@ -50,6 +50,7 @@
#include "utils/portal.h"
#include "utils/rel.h"
#include "utils/snapmgr.h"
+#include "utils/typcache.h"
/*
* No more than this many tuples per CopyMultiInsertBuffer
@@ -1463,14 +1464,22 @@ CopyFrom(CopyFromState cstate)
/* Done, clean up */
error_context_stack = errcallback.previous;
- if (cstate->opts.on_error != COPY_ON_ERROR_STOP &&
- cstate->num_errors > 0 &&
+ if (cstate->num_errors > 0 &&
cstate->opts.log_verbosity >= COPY_LOG_VERBOSITY_DEFAULT)
- ereport(NOTICE,
- errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
- "%" PRIu64 " rows were skipped due to data type incompatibility",
- cstate->num_errors,
- cstate->num_errors));
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg_plural("%" PRIu64 " row was skipped due to data type incompatibility",
+ "%" PRIu64 " rows were skipped due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg_plural("in %" PRIu64 " row, columns was set to null due to data type incompatibility",
+ "in %" PRIu64 " rows, columns were set to null due to data type incompatibility",
+ cstate->num_errors,
+ cstate->num_errors));
+ }
if (bistate != NULL)
FreeBulkInsertState(bistate);
@@ -1617,16 +1626,37 @@ BeginCopyFrom(ParseState *pstate,
cstate->escontext->type = T_ErrorSaveContext;
cstate->escontext->error_occurred = false;
- /*
- * Currently we only support COPY_ON_ERROR_IGNORE. We'll add other
- * options later
- */
- if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE ||
+ cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
cstate->escontext->details_wanted = false;
}
else
cstate->escontext = NULL;
+ if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ int attr_count = list_length(cstate->attnumlist);
+
+ /*
+ * When data type conversion fails and ON_ERROR is SET_NULL, we need
+ * ensure that input column allows NULL value, ExecConstraints will
+ * cover most of the cases, however it does not vertify domain
+ * constraints. Therefore, for constrained domains, NULL value check
+ * must be performed during the initial string-to-datum conversion
+ * (see CopyFromTextLikeOneRow).
+ */
+ cstate->domain_with_constraint = palloc0_array(bool, attr_count);
+
+ foreach_int(attno, cstate->attnumlist)
+ {
+ int i = foreach_current_index(attno);
+
+ Form_pg_attribute att = TupleDescAttr(tupDesc, attno - 1);
+
+ cstate->domain_with_constraint[i] = DomainHasConstraints(att->atttypid);
+ }
+ }
+
/* Convert FORCE_NULL name list to per-column flags, check validity */
cstate->opts.force_null_flags = (bool *) palloc0(num_phys_attrs * sizeof(bool));
if (cstate->opts.force_null_all)
diff --git a/src/backend/commands/copyfromparse.c b/src/backend/commands/copyfromparse.c
index 6b00d49c50f..670b188d9a0 100644
--- a/src/backend/commands/copyfromparse.c
+++ b/src/backend/commands/copyfromparse.c
@@ -959,6 +959,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
int fldct;
int fieldno;
char *string;
+ bool current_row_erroneous = false;
tupDesc = RelationGetDescr(cstate->rel);
attr_count = list_length(cstate->attnumlist);
@@ -1035,9 +1036,7 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
values[m] = ExecEvalExpr(defexprs[m], econtext, &nulls[m]);
}
- /*
- * If ON_ERROR is specified with IGNORE, skip rows with soft errors
- */
+ /* If ON_ERROR is specified, handle the different options */
else if (!InputFunctionCallSafe(&in_functions[m],
string,
typioparams[m],
@@ -1047,7 +1046,55 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
{
Assert(cstate->opts.on_error != COPY_ON_ERROR_STOP);
- cstate->num_errors++;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ cstate->num_errors++;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ {
+ /*
+ * Reset error state so the subsequent InputFunctionCallSafe
+ * call (for domain constraint check) can properly report
+ * whether it succeeded or failed.
+ */
+ cstate->escontext->error_occurred = false;
+
+ Assert(cstate->domain_with_constraint != NULL);
+
+ /*
+ * For constrained domain types, we need an additional
+ * InputFunctionCallSafe to ensure that an error is thrown if
+ * the domain constraint rejects NULL.
+ */
+ if (!cstate->domain_with_constraint[m] ||
+ InputFunctionCallSafe(&in_functions[m],
+ NULL,
+ typioparams[m],
+ att->atttypmod,
+ (Node *) cstate->escontext,
+ &values[m]))
+ {
+ nulls[m] = true;
+ values[m] = (Datum) 0;
+ }
+ else
+ ereport(ERROR,
+ errcode(ERRCODE_NOT_NULL_VIOLATION),
+ errmsg("domain %s does not allow null values",
+ format_type_be(typioparams[m])),
+ errdetail("ON_ERROR SET_NULL cannot be applied because column \"%s\" (domain %s) does not accept null values.",
+ cstate->cur_attname,
+ format_type_be(typioparams[m])),
+ errdatatype(typioparams[m]));
+
+ /*
+ * We count only the number of rows (not fields) where
+ * ON_ERROR SET_NULL was applied.
+ */
+ if (!current_row_erroneous)
+ {
+ current_row_erroneous = true;
+ cstate->num_errors++;
+ }
+ }
if (cstate->opts.log_verbosity == COPY_LOG_VERBOSITY_VERBOSE)
{
@@ -1064,24 +1111,37 @@ CopyFromTextLikeOneRow(CopyFromState cstate, ExprContext *econtext,
char *attval;
attval = CopyLimitPrintoutLength(cstate->cur_attval);
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
- cstate->cur_lineno,
- cstate->cur_attname,
- attval));
+
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ ereport(NOTICE,
+ errmsg("setting to null due to data type incompatibility at line %" PRIu64 " for column \"%s\": \"%s\"",
+ cstate->cur_lineno,
+ cstate->cur_attname,
+ attval));
pfree(attval);
}
else
- ereport(NOTICE,
- errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
- cstate->cur_lineno,
- cstate->cur_attname));
-
+ {
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ ereport(NOTICE,
+ errmsg("skipping row due to data type incompatibility at line %" PRIu64 " for column \"%s\": null input",
+ cstate->cur_lineno,
+ cstate->cur_attname));
+ }
/* reset relname_only */
cstate->relname_only = false;
}
- return true;
+ if (cstate->opts.on_error == COPY_ON_ERROR_IGNORE)
+ return true;
+ else if (cstate->opts.on_error == COPY_ON_ERROR_SET_NULL)
+ continue;
}
cstate->cur_attname = NULL;
diff --git a/src/bin/psql/tab-complete.in.c b/src/bin/psql/tab-complete.in.c
index b2dba6d10ab..61c08241b04 100644
--- a/src/bin/psql/tab-complete.in.c
+++ b/src/bin/psql/tab-complete.in.c
@@ -3425,7 +3425,7 @@ match_previous_words(int pattern_id,
/* Complete COPY <sth> FROM filename WITH (ON_ERROR */
else if (TailMatches("ON_ERROR"))
- COMPLETE_WITH("stop", "ignore");
+ COMPLETE_WITH("stop", "ignore", "set_null");
/* Complete COPY <sth> FROM filename WITH (LOG_VERBOSITY */
else if (TailMatches("LOG_VERBOSITY"))
diff --git a/src/include/commands/copy.h b/src/include/commands/copy.h
index 43c2580539f..877202af67b 100644
--- a/src/include/commands/copy.h
+++ b/src/include/commands/copy.h
@@ -35,6 +35,7 @@ typedef enum CopyOnErrorChoice
{
COPY_ON_ERROR_STOP = 0, /* immediately throw errors, default */
COPY_ON_ERROR_IGNORE, /* ignore errors */
+ COPY_ON_ERROR_SET_NULL, /* set error field to null */
} CopyOnErrorChoice;
/*
diff --git a/src/include/commands/copyfrom_internal.h b/src/include/commands/copyfrom_internal.h
index 822ef33cf69..f892c343157 100644
--- a/src/include/commands/copyfrom_internal.h
+++ b/src/include/commands/copyfrom_internal.h
@@ -108,6 +108,13 @@ typedef struct CopyFromStateData
* att */
bool *defaults; /* if DEFAULT marker was found for
* corresponding att */
+
+ /*
+ * True if the corresponding attribute's is a constrained domain. This
+ * will be populated only when ON_ERROR is SET_NULL, otherwise NULL.
+ */
+ bool *domain_with_constraint;
+
bool volatile_defexprs; /* is any of defexprs volatile? */
List *range_table; /* single element list of RangeTblEntry */
List *rteperminfos; /* single element list of RTEPermissionInfo */
diff --git a/src/test/regress/expected/copy2.out b/src/test/regress/expected/copy2.out
index 3145b314e48..1dd9a63bb80 100644
--- a/src/test/regress/expected/copy2.out
+++ b/src/test/regress/expected/copy2.out
@@ -81,6 +81,10 @@ COPY x from stdin (on_error ignore, on_error ignore);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (on_error ignore, on_error ignore);
^
+COPY x from stdin (on_error set_null, on_error set_null);
+ERROR: conflicting or redundant options
+LINE 1: COPY x from stdin (on_error set_null, on_error set_null);
+ ^
COPY x from stdin (log_verbosity default, log_verbosity verbose);
ERROR: conflicting or redundant options
LINE 1: COPY x from stdin (log_verbosity default, log_verbosity verb...
@@ -92,6 +96,10 @@ COPY x from stdin (format BINARY, null 'x');
ERROR: cannot specify NULL in BINARY mode
COPY x from stdin (format BINARY, on_error ignore);
ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (format BINARY, on_error set_null);
+ERROR: only ON_ERROR STOP is allowed in BINARY mode
+COPY x from stdin (on_error set_null, reject_limit 2);
+ERROR: COPY REJECT_LIMIT requires ON_ERROR to be set to IGNORE
COPY x from stdin (on_error unsupported);
ERROR: COPY ON_ERROR "unsupported" not recognized
LINE 1: COPY x from stdin (on_error unsupported);
@@ -124,6 +132,10 @@ COPY x to stdout (format BINARY, on_error unsupported);
ERROR: COPY ON_ERROR cannot be used with COPY TO
LINE 1: COPY x to stdout (format BINARY, on_error unsupported);
^
+COPY x to stdout (on_error set_null);
+ERROR: COPY ON_ERROR cannot be used with COPY TO
+LINE 1: COPY x to stdout (on_error set_null);
+ ^
COPY x from stdin (log_verbosity unsupported);
ERROR: COPY LOG_VERBOSITY "unsupported" not recognized
LINE 1: COPY x from stdin (log_verbosity unsupported);
@@ -782,6 +794,49 @@ CONTEXT: COPY check_ign_err
NOTICE: skipping row due to data type incompatibility at line 8 for column "k": "a"
CONTEXT: COPY check_ign_err
NOTICE: 6 rows were skipped due to data type incompatibility
+CREATE DOMAIN d_int_not_null AS integer NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS integer CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c integer);
+\pset null NULL
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); -- fail
+ERROR: domain d_int_not_null does not allow null values
+DETAIL: ON_ERROR SET_NULL cannot be applied because column "a" (domain d_int_not_null) does not accept null values.
+CONTEXT: COPY t_on_error_null, line 1, column a: null input
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); -- fail
+ERROR: domain d_int_not_null does not allow null values
+DETAIL: ON_ERROR SET_NULL cannot be applied because column "a" (domain d_int_not_null) does not accept null values.
+CONTEXT: COPY t_on_error_null, line 1, column a: "ss"
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); -- fail
+ERROR: domain d_int_not_null does not allow null values
+DETAIL: ON_ERROR SET_NULL cannot be applied because column "a" (domain d_int_not_null) does not accept null values.
+CONTEXT: COPY t_on_error_null, line 1, column a: "-1"
+-- fail, less data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: missing data for column "c"
+CONTEXT: COPY t_on_error_null, line 1: "1,1"
+-- fail, extra data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+ERROR: extra data after last expected column
+CONTEXT: COPY t_on_error_null, line 1: "1,2,3,4"
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose); -- ok
+NOTICE: setting to null due to data type incompatibility at line 1 for column "b": "x1"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 1 for column "c": "yx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 2 for column "b": "zx"
+CONTEXT: COPY t_on_error_null
+NOTICE: setting to null due to data type incompatibility at line 3 for column "c": "ea"
+CONTEXT: COPY t_on_error_null
+NOTICE: in 3 rows, columns were set to null due to data type incompatibility
+SELECT * FROM t_on_error_null ORDER BY a;
+ a | b | c
+----+------+------
+ 10 | NULL | NULL
+ 11 | NULL | 12
+ 13 | 14 | NULL
+(3 rows)
+
+\pset null ''
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -841,6 +896,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
diff --git a/src/test/regress/sql/copy2.sql b/src/test/regress/sql/copy2.sql
index 66435167500..d3b427857ab 100644
--- a/src/test/regress/sql/copy2.sql
+++ b/src/test/regress/sql/copy2.sql
@@ -67,12 +67,15 @@ COPY x from stdin (force_null (a), force_null (b));
COPY x from stdin (convert_selectively (a), convert_selectively (b));
COPY x from stdin (encoding 'sql_ascii', encoding 'sql_ascii');
COPY x from stdin (on_error ignore, on_error ignore);
+COPY x from stdin (on_error set_null, on_error set_null);
COPY x from stdin (log_verbosity default, log_verbosity verbose);
-- incorrect options
COPY x from stdin (format BINARY, delimiter ',');
COPY x from stdin (format BINARY, null 'x');
COPY x from stdin (format BINARY, on_error ignore);
+COPY x from stdin (format BINARY, on_error set_null);
+COPY x from stdin (on_error set_null, reject_limit 2);
COPY x from stdin (on_error unsupported);
COPY x from stdin (format TEXT, force_quote(a));
COPY x from stdin (format TEXT, force_quote *);
@@ -87,6 +90,7 @@ COPY x from stdin (format TEXT, force_null *);
COPY x to stdout (format CSV, force_null(a));
COPY x to stdout (format CSV, force_null *);
COPY x to stdout (format BINARY, on_error unsupported);
+COPY x to stdout (on_error set_null);
COPY x from stdin (log_verbosity unsupported);
COPY x from stdin with (reject_limit 1);
COPY x from stdin with (on_error ignore, reject_limit 0);
@@ -540,6 +544,42 @@ a {2} 2
8 {8} 8
\.
+CREATE DOMAIN d_int_not_null AS integer NOT NULL CHECK(value > 0);
+CREATE DOMAIN d_int_positive_maybe_null AS integer CHECK(value > 0);
+CREATE TABLE t_on_error_null (a d_int_not_null, b d_int_positive_maybe_null, c integer);
+
+\pset null NULL
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); -- fail
+\N 11 13
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); -- fail
+ss 11 14
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null); -- fail
+-1 11 13
+\.
+
+-- fail, less data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,1
+\.
+-- fail, extra data.
+COPY t_on_error_null FROM STDIN WITH (delimiter ',', on_error set_null);
+1,2,3,4
+\.
+
+COPY t_on_error_null FROM STDIN WITH (on_error set_null, log_verbosity verbose); -- ok
+10 x1 yx
+11 zx 12
+13 14 ea
+\.
+
+SELECT * FROM t_on_error_null ORDER BY a;
+
+\pset null ''
+
-- tests for on_error option with log_verbosity and null constraint via domain
CREATE DOMAIN dcheck_ign_err2 varchar(15) NOT NULL;
CREATE TABLE check_ign_err2 (n int, m int[], k int, l dcheck_ign_err2);
@@ -609,6 +649,9 @@ DROP VIEW instead_of_insert_tbl_view;
DROP VIEW instead_of_insert_tbl_view_2;
DROP FUNCTION fun_instead_of_insert_tbl();
DROP TABLE check_ign_err;
+DROP TABLE t_on_error_null;
+DROP DOMAIN d_int_not_null;
+DROP DOMAIN d_int_positive_maybe_null;
DROP TABLE check_ign_err2;
DROP DOMAIN dcheck_ign_err2;
DROP TABLE hard_err;
--
2.34.1
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-03-04 13:25 Fujii Masao <[email protected]>
parent: jian he <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: Fujii Masao @ 2026-03-04 13:25 UTC (permalink / raw)
To: Peter Eisentraut <[email protected]>; +Cc: jian he <[email protected]>; Matheus Alcantara <[email protected]>; torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Tue, Mar 3, 2026 at 3:38 PM Peter Eisentraut <[email protected]> wrote:
> Thanks, committed.
Thanks for committing the patch!
With this change, ON_ERROR = 'set_null' can now be used with foreign tables
backed by file_fdw. However, unlike ON_ERROR = 'ignore', there is currently
no regression test covering this behavior in file_fdw.
How about adding a regression test to ensure that file_fdw works correctly
with ON_ERROR = 'set_null', and to improve test coverage? Patch attached.
Regards,
--
Fujii Masao
Attachments:
[application/octet-stream] v1-0001-file_fdw-Add-regression-test-for-file_fdw-with-ON.patch (2.3K, 2-v1-0001-file_fdw-Add-regression-test-for-file_fdw-with-ON.patch)
download | inline diff:
From 400adb16a50cc817eaea816fc6735a7b4e7af9aa Mon Sep 17 00:00:00 2001
From: Fujii Masao <[email protected]>
Date: Wed, 4 Mar 2026 22:19:06 +0900
Subject: [PATCH v1] file_fdw: Add regression test for file_fdw with
ON_ERROR='set_null'.
Commit 2a525cc97e1 introduced the ON_ERROR = 'set_null' option for COPY,
allowing it to be used with foreign tables backed by file_fdw. However,
unlike ON_ERROR = 'ignore', no regression test was added to verify
this behavior for file_fdw.
This commit adds a regression test to ensure that foreign tables using
file_fdw work correctly with ON_ERROR = 'set_null', improving test coverage.
---
contrib/file_fdw/expected/file_fdw.out | 12 +++++++++++-
contrib/file_fdw/sql/file_fdw.sql | 4 +++-
2 files changed, 14 insertions(+), 2 deletions(-)
diff --git a/contrib/file_fdw/expected/file_fdw.out b/contrib/file_fdw/expected/file_fdw.out
index 251f00bd258..640986528ae 100644
--- a/contrib/file_fdw/expected/file_fdw.out
+++ b/contrib/file_fdw/expected/file_fdw.out
@@ -246,7 +246,17 @@ SELECT * FROM agg_bad; -- ERROR
ERROR: invalid input syntax for type real: "aaa"
CONTEXT: COPY agg_bad, line 3, column b: "aaa"
-- on_error, log_verbosity and reject_limit tests
-ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'ignore');
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'set_null');
+SELECT * FROM agg_bad;
+ a | b
+-----+--------
+ 100 | 99.097
+ 0 | _null_
+ 42 | 324.78
+ 1 | _null_
+(4 rows)
+
+ALTER FOREIGN TABLE agg_bad OPTIONS (SET on_error 'ignore');
SELECT * FROM agg_bad;
NOTICE: 2 rows were skipped due to data type incompatibility
a | b
diff --git a/contrib/file_fdw/sql/file_fdw.sql b/contrib/file_fdw/sql/file_fdw.sql
index 2cba84b1db7..56bfc926c00 100644
--- a/contrib/file_fdw/sql/file_fdw.sql
+++ b/contrib/file_fdw/sql/file_fdw.sql
@@ -171,7 +171,9 @@ SELECT * FROM agg_csv c JOIN agg_text t ON (t.a = c.a) ORDER BY c.a;
SELECT * FROM agg_bad; -- ERROR
-- on_error, log_verbosity and reject_limit tests
-ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'ignore');
+ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'set_null');
+SELECT * FROM agg_bad;
+ALTER FOREIGN TABLE agg_bad OPTIONS (SET on_error 'ignore');
SELECT * FROM agg_bad;
ALTER FOREIGN TABLE agg_bad OPTIONS (ADD log_verbosity 'silent');
SELECT * FROM agg_bad;
--
2.51.2
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-03-13 02:57 Fujii Masao <[email protected]>
parent: Fujii Masao <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: Fujii Masao @ 2026-03-13 02:57 UTC (permalink / raw)
To: Peter Eisentraut <[email protected]>; +Cc: jian he <[email protected]>; Matheus Alcantara <[email protected]>; torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Wed, Mar 4, 2026 at 10:25 PM Fujii Masao <[email protected]> wrote:
>
> On Tue, Mar 3, 2026 at 3:38 PM Peter Eisentraut <[email protected]> wrote:
> > Thanks, committed.
>
> Thanks for committing the patch!
>
> With this change, ON_ERROR = 'set_null' can now be used with foreign tables
> backed by file_fdw. However, unlike ON_ERROR = 'ignore', there is currently
> no regression test covering this behavior in file_fdw.
>
> How about adding a regression test to ensure that file_fdw works correctly
> with ON_ERROR = 'set_null', and to improve test coverage? Patch attached.
Barring any objections, I will commit the patch.
Regards,
--
Fujii Masao
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-03-13 13:50 Fujii Masao <[email protected]>
parent: Fujii Masao <[email protected]>
0 siblings, 1 reply; 31+ messages in thread
From: Fujii Masao @ 2026-03-13 13:50 UTC (permalink / raw)
To: Yi Ding <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; jian he <[email protected]>; Matheus Alcantara <[email protected]>; torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Fri, Mar 13, 2026 at 2:51 PM Yi Ding <[email protected]> wrote:
> The new test added in v1 makes sense to me. A small suggestion is that to verify if a field is really null, we can do:
>
> ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'set_null');
> SELECT a, b IS NULL FROM agg_bad;
Since the file_fdw test runs "\pset null _null_", a NULL value is displayed as
"_null_". So you can verify that the value is NULL by checking whether
"_null_" is shown.
One could argue that this cannot be distinguished from the literal text value
"_null_". However, relying on "\pset null _null_" is sufficient for this test,
I think.
Regards,
--
Fujii Masao
^ permalink raw reply [nested|flat] 31+ messages in thread
* Re: Re: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
@ 2026-03-16 03:15 Fujii Masao <[email protected]>
parent: Fujii Masao <[email protected]>
0 siblings, 0 replies; 31+ messages in thread
From: Fujii Masao @ 2026-03-16 03:15 UTC (permalink / raw)
To: Yi Ding <[email protected]>; +Cc: Peter Eisentraut <[email protected]>; jian he <[email protected]>; Matheus Alcantara <[email protected]>; torikoshia <[email protected]>; Masahiko Sawada <[email protected]>; vignesh C <[email protected]>; Jim Jones <[email protected]>; Kirill Reshke <[email protected]>; Fujii Masao <[email protected]>; David G. Johnston <[email protected]>; Yugo NAGATA <[email protected]>; PostgreSQL Hackers <[email protected]>
On Mon, Mar 16, 2026 at 11:39 AM Yi Ding <[email protected]> wrote:
>
>
> At 2026-03-13 21:50:07, "Fujii Masao" <[email protected]> wrote:
> >On Fri, Mar 13, 2026 at 2:51 PM Yi Ding <[email protected]> wrote:
> >> The new test added in v1 makes sense to me. A small suggestion is that to verify if a field is really null, we can do:
> >>
> >> ALTER FOREIGN TABLE agg_bad OPTIONS (ADD on_error 'set_null');
> >> SELECT a, b IS NULL FROM agg_bad;
> >
> >Since the file_fdw test runs "\pset null _null_", a NULL value is displayed as
> >"_null_". So you can verify that the value is NULL by checking whether
> >"_null_" is shown.
> >
> >One could argue that this cannot be distinguished from the literal text value
> >"_null_". However, relying on "\pset null _null_" is sufficient for this test,
> >I think.
> >
> >Regards,
> >
> >--
> >Fujii Masao
> >
>
> Sounds reasonable,that addressed my comment.
I've pushed the patch. Thanks!
Regards,
--
Fujii Masao
^ permalink raw reply [nested|flat] 31+ messages in thread
end of thread, other threads:[~2026-03-16 03:15 UTC | newest]
Thread overview: 31+ messages (download: mbox mbox.gz follow: Atom feed)
-- links below jump to the message on this page --
2024-01-26 15:08 Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row David G. Johnston <[email protected]>
2024-01-28 23:50 ` jian he <[email protected]>
2024-01-29 00:19 ` David G. Johnston <[email protected]>
2024-01-29 08:28 ` Yugo NAGATA <[email protected]>
2025-03-07 03:48 ` jian he <[email protected]>
2025-03-11 10:31 ` Jim Jones <[email protected]>
2025-03-12 08:00 ` jian he <[email protected]>
2025-03-12 08:25 ` Jim Jones <[email protected]>
2025-03-18 03:55 ` jian he <[email protected]>
2025-03-21 06:34 ` vignesh C <[email protected]>
2025-03-24 07:50 ` jian he <[email protected]>
2025-03-25 06:31 ` vignesh C <[email protected]>
2025-07-01 14:54 ` torikoshia <[email protected]>
2025-07-02 09:25 ` jian he <[email protected]>
2025-07-30 04:44 ` jian he <[email protected]>
2025-11-10 10:22 ` jian he <[email protected]>
2026-01-21 07:37 ` jian he <[email protected]>
2026-01-21 18:47 ` Matheus Alcantara <[email protected]>
2026-01-22 14:45 ` jian he <[email protected]>
2026-01-27 18:37 ` Matheus Alcantara <[email protected]>
2026-02-06 03:40 ` jian he <[email protected]>
2026-02-06 12:58 ` Matheus Alcantara <[email protected]>
2026-02-06 13:00 ` Matheus Alcantara <[email protected]>
2026-02-09 03:59 ` jian he <[email protected]>
2026-02-09 15:36 ` Matheus Alcantara <[email protected]>
2026-02-25 07:36 ` Peter Eisentraut <[email protected]>
2026-02-28 02:04 ` jian he <[email protected]>
2026-03-04 13:25 ` Fujii Masao <[email protected]>
2026-03-13 02:57 ` Fujii Masao <[email protected]>
2026-03-13 13:50 ` Fujii Masao <[email protected]>
2026-03-16 03:15 ` Fujii Masao <[email protected]>
This inbox is served by agora; see mirroring instructions
for how to clone and mirror all data and code used for this inbox