Received: from malur.postgresql.org ([217.196.149.56]) by arkaria.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rWojS-007MAc-GV for pgsql-hackers@arkaria.postgresql.org; Mon, 05 Feb 2024 02:29:15 +0000 Received: from localhost ([127.0.0.1] helo=malur.postgresql.org) by malur.postgresql.org with esmtp (Exim 4.94.2) (envelope-from ) id 1rWojR-00AE7s-7v for pgsql-hackers@arkaria.postgresql.org; Mon, 05 Feb 2024 02:29:13 +0000 Received: from magus.postgresql.org ([2a02:c0:301:0:ffff::29]) by malur.postgresql.org with esmtps (TLS1.3) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rWojQ-00AE7k-Td for pgsql-hackers@lists.postgresql.org; Mon, 05 Feb 2024 02:29:12 +0000 Received: from oss.nttdata.com ([49.212.34.109]) by magus.postgresql.org with esmtps (TLS1.2) tls TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384 (Exim 4.94.2) (envelope-from ) id 1rWojJ-005VSI-R3 for pgsql-hackers@lists.postgresql.org; Mon, 05 Feb 2024 02:29:12 +0000 Received: from oss.nttdata.com (localhost [127.0.0.1]) by oss.nttdata.com (Postfix) with ESMTPA id CFA24606A9; Mon, 5 Feb 2024 11:28:59 +0900 (JST) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.11 at oss.nttdata.com MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII; format=flowed Content-Transfer-Encoding: 7bit Date: Mon, 05 Feb 2024 11:28:59 +0900 From: torikoshia To: jian he Cc: Yugo NAGATA , "David G. Johnston" , PostgreSQL Hackers Subject: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row In-Reply-To: References: <20240129172858.ccb6c77c3be95a295e2b2b44@sraoss.co.jp> User-Agent: Roundcube Webmail/1.4.11 Message-ID: <04bf425ad1b15a4daefe96c478a5253b@oss.nttdata.com> X-Sender: torikoshia@oss.nttdata.com List-Id: List-Help: List-Subscribe: List-Post: List-Owner: List-Archive: Archived-At: Precedence: bulk Hi, On 2024-02-03 15:22, jian he wrote: > The idea of on_error is to tolerate errors, I think. > if a column has a not null constraint, let it cannot be used with > (on_error 'null') > + /* > + * we can specify on_error 'null', but it can only apply to > columns > + * don't have not null constraint. > + */ > + if (att->attnotnull && cstate->opts.on_error == > COPY_ON_ERROR_NULL) > + ereport(ERROR, > + (errcode(ERRCODE_BAD_COPY_FILE_FORMAT), > + errmsg("copy on_error 'null' cannot be used with > not null constraint column"))); This means we cannot use ON_ERROR 'null' even when there is one column which have NOT NULL constraint, i.e. primary key, right? IMHO this is strong constraint and will decrease the opportunity to use this feature. It might be better to allow error_action 'null' for tables which have NOT NULL constraint columns, and when facing soft errors for those rows, skip that row or stop COPY. > Based on this, I've made a patch. > based on COPY Synopsis: ON_ERROR 'error_action' > on_error 'null', the keyword NULL should be single quoted. As you mentioned, single quotation seems a little odd.. I'm not sure what is the best name and syntax for this feature, but since current error_action are verbs('stop' and 'ignore'), I feel 'null' might not be appropriate. > demo: > COPY check_ign_err FROM STDIN WITH (on_error 'null'); > 1 {1} a > 2 {2} 1 > 3 {3} 2 > 4 {4} b > a {5} c > \. > > \pset null NULL > > SELECT * FROM check_ign_err; > n | m | k > ------+-----+------ > 1 | {1} | NULL > 2 | {2} | 1 > 3 | {3} | 2 > 4 | {4} | NULL > NULL | {5} | NULL Since we notice the number of ignored rows when ON_ERROR is 'ignore', users may want to know the number of rows which was changed to NULL when using ON_ERROR 'null'. -- Regards, -- Atsushi Torikoshi NTT DATA Group Corporation