From: Bruce Momjian
Message-Id: <200502210420.j1L4KEP14031@candle.pha.pa.us>
Subject: Re: Suggestion for Encodings table
To: Preston Landers
Date: Sun, 20 Feb 2005 23:20:14 -0500 (EST)
Cc: pgsql-docs@postgresql.org

Preston Landers wrote:
>
> http://www.postgresql.org/docs/8.0/interactive/multibyte.html#CHARSET-TABLE
>
> I would humbly suggest a few improvements to that Encodings table to
> improve the clarity.
>
> Many of the entries clearly indicate the language or writing system, such
> as WIN1256 = "Windows CP1256 (Arabic)"
>
> I would suggest that every single entry should be described that way with
> the common language or writing system name. Even Unicode could say "All
> languages".
>
> In particular, the "WIN" encoding just says "CP1251" -- this is Cyrillic
> (Russian) but some people might just see the WIN and assume it's the
> character set that Western/US Windows uses (CP 1252).
>
> It's an easy mistake to make and one I see repeated frequently on other
> web pages (calling Windows "Western" CP 1251.) Someone reading English
> language docs and seeing a "WIN" character set might naturally assume that
> it is the English Windows character set. (Which BTW is not currently
> supported by PG for conversions.)
>
> Some more examples that might improve clarity:
>
> LATIN5 should say "Turkish"
>
> LATIN6 should say "Nordic"
>
> ALT and KOI8 should say "Cyrillic" (or Russian)

Great. Would you submit a patch to the SGML sources?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073