public inbox for [email protected]
From: Zhongpu Chen <[email protected]>
To: Junwang Zhao <[email protected]>
Cc: [email protected]
Subject: Re: Character with byte sequence 0xa2 0xa3 in encoding "EUC_CN" has no equivalent in encoding "UTF8"
Date: Sat, 2 May 2026 00:09:19 +0800
Message-ID: <CA+1gyqJMtuTofZDy+CeomGGhsFGXw6JrdyAhqvnLii44oKePGg@mail.gmail.com> (raw)
In-Reply-To: <CAEG8a3+m6Hx-VzMBX92Y6EZECHhGDKS+2zHNkZC5FE0WkvWR3Q@mail.gmail.com>
References: <CA+1gyqL7uiQhfLcYWpHNUKQgHjQc7sOPthSTiaxLDZzcrGFYSg@mail.gmail.com>
<CAEG8a3+m6Hx-VzMBX92Y6EZECHhGDKS+2zHNkZC5FE0WkvWR3Q@mail.gmail.com>
```
demo_euc_cn_db=# SET client_encoding TO 'EUC_CN';
SET
demo_euc_cn_db=# SELECT * FROM t WHERE id = 1;
id | s
----+----
1 | ��
(1 row)
```
Since 0xA2A3 is not a valid character in EUC-CN, it cannot be mapped to any
meaningful character. Currently, the EUC-CN check accepts every 2-byte
sequence within A1-EF, but this coarse-grained approach is flawed.
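
To illustrate the difference between a range-only check and a mapping-based
check, here is a rough Python sketch. It is not the PostgreSQL code: the
function names are hypothetical, the A1-EF bounds are simply the ones quoted
above (applied to both bytes, which is my assumption), and Python's "gb2312"
codec stands in for a real EUC-CN mapping table.

```python
# Sketch only, not PostgreSQL's implementation.
# Assumption: the coarse check applies the A1-EF range to both bytes.
# Assumption: Python's "gb2312" codec approximates the EUC-CN mapping table.

def coarse_euc_cn_valid(data: bytes, lo: int = 0xA1, hi: int = 0xEF) -> bool:
    """Range-only check: ASCII passes; anything else must be a 2-byte
    sequence whose bytes both fall within [lo, hi]."""
    i = 0
    while i < len(data):
        if data[i] < 0x80:          # plain ASCII byte
            i += 1
            continue
        if i + 1 >= len(data):      # truncated 2-byte sequence
            return False
        if not (lo <= data[i] <= hi and lo <= data[i + 1] <= hi):
            return False
        i += 2
    return True


def mapped_euc_cn_valid(data: bytes) -> bool:
    """Mapping-based check: valid only if every sequence decodes to a
    character in the stand-in table."""
    try:
        data.decode("gb2312")
    except UnicodeDecodeError:
        return False
    return True


if __name__ == "__main__":
    sample = b"\xa2\xa3"            # the byte sequence from the report
    print("coarse check accepts:", coarse_euc_cn_valid(sample))
    print("mapping check accepts:", mapped_euc_cn_valid(sample))
```

With the sequence from the report, the range-only check should accept the
bytes while the mapping-based check should reject them, which is the gap
that lets the INSERT succeed and the later conversion to UTF8 fail.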
On Fri, May 1, 2026 at 11:07 PM Junwang Zhao <[email protected]> wrote:
> On Fri, May 1, 2026 at 9:59 PM Zhongpu Chen <[email protected]> wrote:
> >
> > ## Description
> >
> > The legacy encodings allow some invalid bytes, which will cause errors
> > during SELECT operations.
> >
> > ## How to reproduce
> >
> > ```shell
> > createdb -E EUC_CN -T template0 --locale=C demo_euc_cn_db
> > ```
> >
> > ```sql
> > demo_euc_cn_db=# CREATE TABLE t(id int, s varchar(10));
> >
> > demo_euc_cn_db=# INSERT INTO t VALUES(1, E'\xA2\xA3');
> > INSERT 0 1
> > demo_euc_cn_db=# SELECT * FROM t WHERE id = 1;
> > ERROR: character with byte sequence 0xa2 0xa3 in encoding "EUC_CN" has no equivalent in encoding "UTF8"
>
> Can you try the following statement before select?
> SET client_encoding TO 'EUC_CN';
>
> > ```
> >
> > --
> > Zhongpu Chen
>
>
>
> --
> Regards
> Junwang Zhao
>
--
Zhongpu Chen