From: Bruce Momjian <[email protected]>
To: Paul Lindner <[email protected]>
Cc: Neil Conway <[email protected]>
Cc: Tom Lane <[email protected]>
Cc: [email protected]
Subject: Re: Upcoming PG re-releases
Date: Tue, 6 Dec 2005 14:26:38 -0500 (EST)
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
I have added your suggestions to the 8.1.X release notes.
---------------------------------------------------------------------------
Paul Lindner wrote:
-- Start of PGP signed section.
> On Sat, Dec 03, 2005 at 10:54:08AM -0500, Bruce Momjian wrote:
> > Neil Conway wrote:
> > > On Wed, 2005-11-30 at 10:56 -0500, Tom Lane wrote:
> > > > It's been about a month since 8.1.0 was released, and we've found about
> > > > the usual number of bugs for a new release, so it seems like it's time
> > > > for 8.1.1.
> > >
> > > I think one fix that should be made in time for 8.1.1 is adding a note
> > > to the "version migration" section of the 8.1 release notes describing
> > > the "invalid UTF-8 byte sequence" problems that some people have run
> > > into when upgrading from prior versions. I'm not familiar enough with
> > > the problem or its remedies to add the note myself, though.
> >
> > Agreed, but I don't understand the problem well enough either. Does
> > anyone?
>
> There was a thread a couple of weeks back about this problem. Here's
> my sample writeup -- I give my permission for anyone to use it as they
> see fit:
>
>
> Upgrading UNICODE databases to 8.1
>
> Postgres 8.1 includes a number of bug-fixes and improvements to
> Unicode and UTF-8 character handling. Unfortunately, previous releases
> would accept character sequences that were not valid UTF-8. This
> may cause problems when upgrading your database using
> pg_dump/pg_restore, resulting in an error message like this:
>
> Invalid UNICODE byte sequence detected near byte ...
>
> To convert your pre-8.1 database to 8.1 you may have to remove and/or
> fix the offending characters. One simple way to fix the problem is to
> run your pg_dump output through the iconv command like this:
>
> iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
>
> The -c flag tells iconv to omit invalid characters from output.
>
> There is one problem with this. Most versions of iconv try to read
> the entire input file into memory. If your dump is quite large, you
> will need to split the dump into multiple files and convert each one
> individually. You must use the -l flag for split to ensure that the
> Unicode byte sequences are not split.
>
> split -l 10000 dump.sql
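
If splitting the dump by hand is inconvenient, the same cleanup can be
sketched as a small line-at-a-time filter, so only one line is held in
memory at once. This is an illustration rather than a tested migration
tool: the function names below are made up for the example, and it
assumes Python's strict UTF-8 decoder is an acceptable definition of
"valid", which may not match the server's checks byte for byte.
Processing per line is safe for the reason given above: a newline byte
never occurs inside a multibyte UTF-8 sequence.

```python
import io


def clean_utf8_stream(src, dst):
    """Copy src to dst line by line, dropping bytes that are not valid UTF-8.

    src and dst are binary file objects. Returns the number of lines
    that were altered. (Hypothetical helper, not part of PostgreSQL.)
    """
    changed = 0
    for raw in src:  # iterating a binary file yields one line at a time
        # errors="ignore" discards undecodable bytes, similar in spirit
        # to what the -c flag does for iconv.
        cleaned = raw.decode("utf-8", errors="ignore").encode("utf-8")
        if cleaned != raw:
            changed += 1
        dst.write(cleaned)
    return changed


def clean_utf8_file(in_path, out_path):
    """Convenience wrapper: clean the file at in_path into out_path."""
    with open(in_path, "rb") as src, open(out_path, "wb") as dst:
        return clean_utf8_stream(src, dst)


# Small in-memory demonstration with two stray bytes on the second line.
demo_in = io.BytesIO(b"ok\nbad\xff\xfeline\n")
demo_out = io.BytesIO()
print(clean_utf8_stream(demo_in, demo_out), demo_out.getvalue())
```

For a real dump one would call something like
clean_utf8_file("dump.sql", "fixed.sql"). As with iconv -c, the bad
bytes are silently dropped, so it is worth checking the returned count
and reviewing the affected rows.
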
>
> Another possible solution is to use the --inserts flag to pg_dump.
> When you load the resulting data dump in 8.1 this will result in the
> problem rows showing up in your error log.
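
As a different route to the same goal, the offending lines can also be
located in the dump file itself before anything is loaded. The sketch
below is only an assumption about how one might do that: the function
name is hypothetical, and it relies on Python's strict UTF-8 decoder,
which may flag a slightly different set of sequences than the 8.1
server does.

```python
import io


def find_invalid_utf8_lines(src):
    """Yield (line_number, byte_offset) for each line that is not valid UTF-8.

    src is a binary file object. byte_offset is the position, within
    the line, of the first byte the decoder rejected.
    (Hypothetical helper, not part of PostgreSQL.)
    """
    for lineno, raw in enumerate(src, start=1):
        try:
            raw.decode("utf-8")  # strict decoding raises on bad bytes
        except UnicodeDecodeError as exc:
            yield lineno, exc.start


# Small in-memory demonstration: only the second line contains a bad byte.
sample = io.BytesIO(b"a\nb\xffc\n\xc3\xa9\n")
for lineno, offset in find_invalid_utf8_lines(sample):
    print(f"line {lineno}: invalid byte at offset {offset}")
```

With a dump made using --inserts, each reported line corresponds to one
INSERT statement, so the line number points at a single row.
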
>
> --
> Paul Lindner ||||| | | | | | | | | |
> [email protected]
-- End of PGP section, PGP failed!
--
Bruce Momjian | http://candle.pha.pa.us
[email protected] | (610) 359-1001
+ If your life is a hard drive, | 13 Roberts Road
+ Christ can be your backup. | Newtown Square, Pennsylvania 19073