From: Henson Choi <[email protected]>
To: Thomas Munro <[email protected]>
Cc: Heikki Linnakangas <[email protected]>
Cc: Robert Haas <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: Jeroen Vermeulen <[email protected]>
Cc: VASUKI M <[email protected]>
Cc: [email protected]
Subject: Re: BUG #19354: JOHAB rejects valid byte sequences
Date: Wed, 15 Apr 2026 14:57:50 +0900
Message-ID: <CAAAe_zAFz1v-3b7Je4L+=wZM3UGAczXV47YVZfZi9wbJxspxeA@mail.gmail.com> (raw)
In-Reply-To: <CAAAe_zCwaccH7h+GOtHbo_docCY-o0c5NMRuYkdz15f=KL4f0g@mail.gmail.com>
References: <[email protected]>
<CA+TgmoaRGSezRaA7x00X495Qho8WGTzggbDSUt-JsruXceZWug@mail.gmail.com>
<CA+zULE4L4rA2DLAcfy=eQL7w_ZexV4P5zpQRbP=_qrhJBEOzjg@mail.gmail.com>
<[email protected]>
<CAE2r8H5vaSyaC_t1FcpHBo-BB_=SrFj7GFnOC-SxC6WDf5c9VA@mail.gmail.com>
<CA+zULE47EXZOp7qKYODd+mjSgDiR-WX5ZNBkwdKnj-Zc0FT58w@mail.gmail.com>
<CA+TgmoZaoc37ohnhF5inoPxWzfoznV483xQw8Fmw+ELFScv47g@mail.gmail.com>
<[email protected]>
<CA+TgmoaoW4F2rRzYcQQim9ddT4-6H3oi0UYV9Ucw-rRQ5MdHsg@mail.gmail.com>
<CA+hUKGKy-ViGBXdOjcPownBM=OdWiULO8H1RyH1r_8qNp=U4CA@mail.gmail.com>
<[email protected]>
<CAAAe_zCLVunjt1u+2E86shwc3hk1x4bzUyU86nY1fq-nAVYN0Q@mail.gmail.com>
<CA+hUKGJMrcS=hBkqVk=5pjM4w8edG=_ArASC82RqB6HQro-v-g@mail.gmail.com>
<CAAAe_zCwaccH7h+GOtHbo_docCY-o0c5NMRuYkdz15f=KL4f0g@mail.gmail.com>
Subject: Fix and expand comments for Korean encodings in encnames.c
Hi hackers,
While reading through the encoding alias table in src/common/encnames.c,
I noticed a few long-standing inaccuracies and omissions in the per-entry
comments for the three Korean encodings.
The most visible issue is the JOHAB entry, whose comment describes it as
"Extended Unix Code for simplified Chinese" -- apparently a copy/paste
slip from a neighboring EUC entry. JOHAB is in fact the Korean
combining-style encoding defined in KS X 1001 annex 3.
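To illustrate what "combining-style" means here, the following is a minimal sketch of how a precomposed Unicode Hangul syllable maps to a two-byte Johab code. It assumes the commonly documented Johab layout (high bit set, followed by three 5-bit fields for initial, medial and final jamo); the function name johab_from_ucs and the two lookup tables are hypothetical and are not taken from the patch or from PostgreSQL's conversion code.

```c
#include <stdint.h>

/* Unicode precomposed Hangul syllables occupy U+AC00..U+D7A3. */
#define HANGUL_BASE 0xAC00u
#define HANGUL_LAST 0xD7A3u
#define N_MEDIAL    21u
#define N_FINAL     28u

/* Assumed 5-bit Johab codes for the 21 medial vowels, in Unicode order. */
static const uint8_t johab_medial[21] = {
    3, 4, 5, 6, 7, 10, 11, 12, 13, 14, 15,
    18, 19, 20, 21, 22, 23, 26, 27, 28, 29
};

/* Assumed 5-bit Johab codes for the 28 finals; index 0 means "no final". */
static const uint8_t johab_final[28] = {
    1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17,
    19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29
};

/*
 * Hypothetical helper: return the 16-bit Johab code for a precomposed
 * Hangul syllable, or 0 if the code point is outside U+AC00..U+D7A3.
 */
uint16_t
johab_from_ucs(uint32_t ucs)
{
    uint32_t s, initial, medial, fin;

    if (ucs < HANGUL_BASE || ucs > HANGUL_LAST)
        return 0;

    /* Split the syllable index into its three jamo indexes. */
    s = ucs - HANGUL_BASE;
    initial = s / (N_MEDIAL * N_FINAL);
    medial = (s / N_FINAL) % N_MEDIAL;
    fin = s % N_FINAL;

    /* Initial consonant codes are assumed to run 2..20 in Unicode order. */
    return (uint16_t) (0x8000u |
                       ((initial + 2u) << 10) |
                       ((uint32_t) johab_medial[medial] << 5) |
                       johab_final[fin]);
}
```

Under these assumptions, U+AC00 maps to 0x8861 and U+D55C to 0xD065; each byte pair encodes the jamo directly rather than indexing a row/cell table as EUC-KR does.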
The attached 0002 patch makes comment-only adjustments to the three
Korean encodings:
* JOHAB: replace the incorrect "simplified Chinese" description with
  a correct one that identifies it as the Korean combining (Johab)
  encoding standardized in KS X 1001 annex 3.
* EUC_KR: drop a stray space before the comma in the existing
  comment, and note that the encoding covers the KS X 1001
  precomposed (Wansung) form.
* UHC: spell out "Unified Hangul Code", clarify that it is
  Microsoft Windows CodePage 949, and describe its relationship to
  EUC-KR (superset covering all 11,172 precomposed Hangul syllables).
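For reference, the 11,172 figure in the UHC comment follows from the jamo combinations: 19 initials, 21 medials, and 28 finals (counting "no final"). By comparison, the KS X 1001 repertoire used by EUC-KR contains 2,350 Hangul syllables. A small sketch of the arithmetic, with hypothetical helper names that are not part of the patch:

```c
#include <stdbool.h>
#include <stdint.h>

/* Jamo counts used to build modern precomposed Hangul syllables. */
enum
{
    HANGUL_INITIALS = 19,
    HANGUL_MEDIALS = 21,
    HANGUL_FINALS = 28          /* includes the "no final" case */
};

/* Hypothetical helper: number of precomposed Hangul syllables. */
unsigned
hangul_syllable_count(void)
{
    return HANGUL_INITIALS * HANGUL_MEDIALS * HANGUL_FINALS;
}

/* Hypothetical helper: is this code point in the Unicode block U+AC00..U+D7A3? */
bool
is_precomposed_hangul(uint32_t ucs)
{
    return ucs >= 0xAC00u && ucs < 0xAC00u + hangul_syllable_count();
}
```

The Unicode Hangul Syllables block U+AC00..U+D7A3 has exactly this many code points, so the range check above accepts U+D7A3 and rejects U+D7A4.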
No behavior change, no catalog change, no pg_wchar.h change -- this
touches comments in src/common/encnames.c only. pgindent is clean.
Thanks,
Henson Choi
From c7a7335d2cf5a2881b25d9091fd020a2d62f7661 Mon Sep 17 00:00:00 2001
From: Henson Choi <[email protected]>
Date: Wed, 15 Apr 2026 14:52:35 +0900
Subject: [PATCH v1] Fix and expand comments for Korean encodings in encnames.c
---
src/common/encnames.c | 13 ++++++++-----
1 file changed, 8 insertions(+), 5 deletions(-)
diff --git a/src/common/encnames.c b/src/common/encnames.c
index 9085dbecce1..959b991dde4 100644
--- a/src/common/encnames.c
+++ b/src/common/encnames.c
@@ -61,8 +61,9 @@ static const pg_encname pg_encname_tbl[] =
* Japanese, standard OSF */
{
"euckr", PG_EUC_KR
- }, /* EUC-KR; Extended Unix Code for Korean , KS
- * X 1001 standard */
+ }, /* EUC-KR; Extended Unix Code for Korean
+ * precomposed (Wansung) encoding, standard KS
+ * X 1001 */
{
"euctw", PG_EUC_TW
}, /* EUC-TW; Extended Unix Code for
@@ -119,8 +120,8 @@ static const pg_encname pg_encname_tbl[] =
}, /* ISO-8859-9; RFC1345,KXS2 */
{
"johab", PG_JOHAB
- }, /* JOHAB; Extended Unix Code for simplified
- * Chinese */
+ }, /* JOHAB; Korean combining (Johab) encoding,
+ * standard KS X 1001 annex 3 */
{
"koi8", PG_KOI8R
}, /* _dirty_ alias for KOI8-R (backward
@@ -186,7 +187,9 @@ static const pg_encname pg_encname_tbl[] =
}, /* alias for WIN1258 */
{
"uhc", PG_UHC
- }, /* UHC; Korean Windows CodePage 949 */
+ }, /* UHC; Unified Hangul Code, Microsoft Windows
+ * CodePage 949; superset of EUC-KR covering
+ * all 11,172 precomposed Hangul syllables */
{
"unicode", PG_UTF8
}, /* alias for UTF8 */
--
2.50.1 (Apple Git-155)
Attachments:
[text/plain] 0002-Fix-and-expand-comments-for-Korean-encodings.txt (1.7K, 3-0002-Fix-and-expand-comments-for-Korean-encodings.txt)