From: Fujii Masao
To: Kirill Reshke
Cc: jian he, Jim Jones, "David G. Johnston", Yugo NAGATA, torikoshia, PostgreSQL Hackers
Subject: Re: Change COPY ... ON_ERROR ignore to ON_ERROR ignore_row
Date: Fri, 8 Nov 2024 03:00:54 +0900
Message-ID: <07587c36-18b3-4ccb-b5fb-579bcb04ed37@oss.nttdata.com>

On 2024/10/26 6:03, Kirill Reshke wrote:
> when the REJECT LIMIT is set to some non-zero number and the number of
> row NULL replacements exceeds the limit, is it OK to fail. Because
> there WAS errors, and we should not tolerate more than $limit errors .
> I do find this behavior to be consistent.

+1

> But what if we don't set a REJECT LIMIT, it is sane to do all
> replacements, as if REJECT LIMIT is inf.

+1

> But our REJECT LIMIT is zero
> (not set).
> So, we ignore zero REJECT LIMIT if set_to_null is set.

REJECT_LIMIT currently has to be greater than zero, so it won’t ever be zero.

> But while I was trying to implement that, I realized that I don't
> understand v4 of this patch. My misunderstanding is about
> `t_on_error_null` tests. We are allowed to insert a NULL value for the
> first column of t_on_error_null using COPY ON_ERROR SET_TO_NULL. Why
> do we do that? My thought is we should try to execute
> InputFunctionCallSafe with NULL value (i mean, here [1]) for the
> column after we failed to insert the input value. And, if this second
> call is successful, we do replacement, otherwise we count the row as
> erroneous.

Your concern is valid. Allowing NULL to be stored in a column with a NOT NULL
constraint via COPY ON_ERROR=SET_TO_NULL does seem unexpected.
As you suggested, NULL values set by SET_TO_NULL should probably be re-evaluated.

> Hm, good catch. Applied almost as you suggested. I did tweak this
> "replace columns with invalid input values with " into "replace
> columns containing erroneous input values with". Is that OK?

Yes, sounds good.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION