Date: Mon, 11 May 2026 11:39:09 +0900 (JST)
Message-Id: <20260511.113909.1529691212419813991.ishii@postgresql.org>
To: chenloveit@gmail.com
Cc: pgsql-hackers@lists.postgresql.org
Subject: Re: Proposal: tighten validation for legacy EUC encodings or document that accepted byte sequences may be unconvertible to UTF8
From: Tatsuo Ishii
References: <20260511.104013.2069487042346308197.ishii@postgresql.org>

[Add Cc: to pgsql-hackers]

From: Zhongpu Chen
Subject: Re: Proposal: tighten validation for legacy EUC encodings or document that accepted byte sequences may be unconvertible to UTF8
Date: Mon, 11 May 2026 09:56:20 +0800

> I see. The settings may be used in a finer way. For example, `set
> euc-cn-encoding-valiation = 'read_compatible'`.

That would break pg_dumpall. Suppose there's a database populated with
`set euc-cn-encoding-valiation = 'native'`.

1. Dump the database cluster using pg_dumpall.
2. Create a new database cluster using initdb.
3. Set euc-cn-encoding-valiation = 'read_compatible' in postgresql.conf.
4. Restore from the dump. This fails because of disallowed EUC_CN
   characters.

I think encoding properties (including character validation) should
belong to the encoding itself, rather than to GUC parameters. If you
want to have "strict" EUC_CN and "non-strict" EUC_CN at the same time,
I think the best way to implement that is to add a new EUC_CN variant
encoding.

Regards,
--
Tatsuo Ishii
SRA OSS K.K.
English: http://www.sraoss.co.jp/index_en/
Japanese:http://www.sraoss.co.jp
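
To make the pg_dumpall scenario in steps 1 to 4 concrete, here is a toy
model in Python. It is only a sketch under assumptions, not PostgreSQL
code: the `Cluster` class, the `dump`/`restore` helpers, and the exact
meaning of 'native' (structural byte-range check only) and
'read_compatible' (must also be convertible) are all hypothetical, and
Python's gb2312 codec merely stands in for the server's EUC_CN to UTF8
conversion.

```python
def structurally_valid(data: bytes) -> bool:
    """Loose check (assumed 'native'): ASCII, or byte pairs in 0xA1..0xFE."""
    i = 0
    while i < len(data):
        if data[i] < 0x80:
            i += 1
        elif (i + 1 < len(data)
              and 0xA1 <= data[i] <= 0xFE
              and 0xA1 <= data[i + 1] <= 0xFE):
            i += 2
        else:
            return False
    return True


def convertible(data: bytes) -> bool:
    """Strict check (assumed 'read_compatible'): must also decode."""
    if not structurally_valid(data):
        return False
    try:
        data.decode("gb2312")  # stand-in for conversion to UTF8
    except UnicodeDecodeError:
        return False
    return True


# Hypothetical setting values mapped to validators.
VALIDATORS = {"native": structurally_valid, "read_compatible": convertible}


class Cluster:
    """Toy cluster whose validation is a server setting, not an encoding property."""

    def __init__(self, validation: str) -> None:
        self.validation = validation
        self.rows: list[bytes] = []

    def insert(self, data: bytes) -> None:
        if not VALIDATORS[self.validation](data):
            raise ValueError(f"invalid byte sequence for EUC_CN: {data!r}")
        self.rows.append(data)


def dump(cluster: Cluster) -> list[bytes]:
    """Step 1: the dump carries the bytes but not the validation setting."""
    return list(cluster.rows)


def restore(rows: list[bytes], cluster: Cluster) -> None:
    """Step 4: every row is re-validated under the new cluster's setting."""
    for row in rows:
        cluster.insert(row)


if __name__ == "__main__":
    old = Cluster("native")
    old.insert(b"\xd6\xd0")  # assigned GB2312 code point
    old.insert(b"\xaa\xa1")  # in range, but row 10 of GB2312 is unassigned
    new = Cluster("read_compatible")  # steps 2 and 3
    try:
        restore(dump(old), new)
    except ValueError as exc:
        print("restore failed:", exc)
```

In this model the dump itself has no way to say which rule its data was
accepted under, which is the argument for tying the rule to the encoding
(for example, a separate EUC_CN variant) instead of a setting.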