From: Gavin Sherry <[email protected]>
To: Tom Lane <[email protected]>
Cc: Paul Lindner <[email protected]>
Cc: Bruce Momjian <[email protected]>
Cc: Neil Conway <[email protected]>
Cc: [email protected]
Subject: Re: Upcoming PG re-releases
Date: Mon, 5 Dec 2005 15:43:59 +1100 (EST)
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
Hi all,
On Sun, 4 Dec 2005, Tom Lane wrote:
> Paul Lindner <[email protected]> writes:
> > To convert your pre-8.1 database to 8.1 you may have to remove and/or
> > fix the offending characters. One simple way to fix the problem is to
> > run your pg_dump output through the iconv command like this:
>
> > iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
>
> Is that really a one-size-fits-all solution? Especially with -c?
>
It's definitely not one-size-fits-all. The reassuring thing is that
others have tried to deal with this problem before.
Omar Kilani and I have spent a few hours looking at the problem. For
situations where there is a lot of invalid encoding, manual fixing is just
not viable. The vim project has a kind of fuzzy encoding conversion which
accounts for a lot of the non-UTF8 sequences in UTF8 data. You can use vim
to modify your text dump as follows:
vim -c ":wq! ++enc=utf8 fixed.dump" original.dump
Our testing of this is far from exhaustive, but it's a lot better than
just cutting the invalid data out of the original dump. Those suffering
from the problem should definitely check this out, particularly if you
have a non-trivial amount of data.
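For anyone who wants to see how much invalid data a dump contains before
choosing a fix, here is a minimal Python sketch. It is not vim's algorithm
and not part of our testing. It decodes the dump as UTF-8, records the
offset of every byte run that fails to decode, and reinterprets only those
runs in a fallback encoding instead of dropping them. The function name
repair_utf8, the handler name "dump_fallback", and the choice of Latin-1 as
the fallback are all assumptions made for illustration; the real legacy
encoding of your data may well be something else.

```python
import codecs


def repair_utf8(data: bytes, fallback: str = "latin-1"):
    """Decode data as UTF-8, reinterpreting invalid byte runs via fallback.

    Returns (text, bad), where bad is a list of (byte_offset, raw_bytes)
    for every run that was not valid UTF-8. The fallback encoding is an
    assumption about where the stray bytes came from, not a known fact.
    """
    bad = []

    def handler(err):
        # Only decoding errors are expected here; re-raise anything else.
        if not isinstance(err, UnicodeDecodeError):
            raise err
        chunk = err.object[err.start:err.end]
        bad.append((err.start, chunk))
        # Substitute the fallback decoding and resume after the bad run.
        return chunk.decode(fallback), err.end

    # "dump_fallback" is a hypothetical handler name chosen for this sketch.
    codecs.register_error("dump_fallback", handler)
    text = data.decode("utf-8", errors="dump_fallback")
    return text, bad


if __name__ == "__main__":
    # A valid UTF-8 string with one stray Latin-1 byte (0xEF) in the middle.
    sample = "café".encode("utf-8") + b" na\xefve"
    fixed, problems = repair_utf8(sample)
    print(fixed)
    print(problems)
```

If the list of recorded offsets is short, fixing by hand may still be
practical. If it is long, encoding the returned text back to UTF-8 gives a
candidate fixed dump, which should be spot-checked before loading.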
Thanks,
Gavin