Date: Fri, 9 Dec 2005 09:37:39 +1100 (EST)
From: Gavin Sherry
To: Bruce Momjian
cc: Peter Eisentraut, pgsql-hackers@postgresql.org
Subject: Re: Upcoming PG re-releases

On Tue, 6 Dec 2005, Bruce Momjian wrote:

> Exactly what does vim do that iconv does not? Fuzzy encoding sounds
> scary to me.

Right. It actually makes assumptions about the source encoding. People who care about their data unfortunately need to spend a bit of time on this problem. I've been discussing the same issue on the slony1 mailing list, because it can affect people's ability to upgrade using slony1.

http://gborg.postgresql.org/pipermail/slony1-general/2005-December/003430.html

It would be good if we had the script I suggest in that email:

A script which identifies non-UTF-8 characters and provides some context, line numbers, etc., will greatly speed up the process of remedying the situation.

Thoughts?

Gavin
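
For illustration only, here is a rough sketch of the kind of script described above, written in Python. It is not the script from the slony1 thread; the function name find_invalid_utf8, the context width, and the output format are all assumptions made for this example. It reads a dump file as raw bytes, reports each byte sequence that is not valid UTF-8 with its line number and column, and shows a little surrounding context. It deliberately does not guess the source encoding or rewrite anything.

```python
import sys


def find_invalid_utf8(data: bytes, context: int = 20):
    """Yield (line_no, column, bad_bytes, context_bytes) for every
    byte sequence in `data` that is not valid UTF-8.

    The name and the return shape are hypothetical, chosen for this sketch.
    """
    # Splitting on b"\n" is safe: 0x0A never occurs inside a multi-byte
    # UTF-8 sequence.
    for line_no, line in enumerate(data.split(b"\n"), start=1):
        pos = 0
        while pos < len(line):
            try:
                line[pos:].decode("utf-8")
                break  # the rest of the line is valid
            except UnicodeDecodeError as err:
                # err.start / err.end are offsets into the slice we decoded
                start = pos + err.start
                end = pos + err.end
                snippet = line[max(0, start - context):end + context]
                yield line_no, start + 1, line[start:end], snippet
                pos = end  # continue scanning after the bad bytes


def report(path: str) -> int:
    """Print one line per problem found in `path`; return the problem count."""
    with open(path, "rb") as handle:
        data = handle.read()
    count = 0
    for line_no, column, bad, snippet in find_invalid_utf8(data):
        count += 1
        shown = snippet.decode("utf-8", errors="replace")
        print(f"{path}:{line_no}:{column}: invalid bytes {bad.hex()} near {shown!r}")
    return count


if __name__ == "__main__":
    # Usage (file names are placeholders): python find_bad_utf8.py dump.sql
    for file_path in sys.argv[1:]:
        report(file_path)
```

Run against a plain-text dump, this prints one line per offending sequence, which a person can then fix by hand or feed to iconv once the real source encoding is known. It reads the whole file into memory, so a very large dump would need a line-by-line variant.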