From: Gregory Maxwell <[email protected]>
To: [email protected]
Subject: Upcoming PG re-releases
Date: Sun, 4 Dec 2005 12:19:32 -0500
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>

On 12/4/05, Tom Lane <[email protected]> wrote:
> Paul Lindner <[email protected]> writes:
> > On Sun, Dec 04, 2005 at 11:34:16AM -0500, Tom Lane wrote:
> >> Paul Lindner <[email protected]> writes:
> >>> iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
> >>
> >> Is that really a one-size-fits-all solution?  Especially with -c?
>
> > I'd say yes, and the -c flag is needed so iconv strips out the
> > invalid characters.
>
> That's exactly what's bothering me about it.  If we recommend that
> we had better put a large THIS WILL DESTROY YOUR DATA warning first.
> The problem is that the data is not "invalid" from the user's point
> of view --- more likely, it's in some non-UTF8 encoding --- and so
> just throwing away some of the characters is unlikely to make people
> happy.

Nor is it even guaranteed to make the data load: if the column has a
unique constraint and removing the non-UTF8 characters makes two
rows hold the same data where they didn't before, the load will fail.
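
To make the collision concrete, here is a minimal Python sketch. The
two sample values are hypothetical, and decoding with errors="ignore"
is used only as a rough stand-in for what `iconv -c` does, not as a
reproduction of iconv itself.

```python
# Hypothetical sample data: two different Latin-1 strings sitting in a
# column that the database believes holds UTF-8.
rows = [b"caf\xe9s", b"caf\xe8s"]

def strip_invalid_utf8(raw: bytes) -> str:
    # Drop every byte that is not part of a valid UTF-8 sequence.
    # This approximates `iconv -c -f UTF8 -t UTF8`.
    return raw.decode("utf-8", errors="ignore")

stripped = [strip_invalid_utf8(r) for r in rows]

# Distinct values before stripping, distinct values after stripping.
print(len(set(rows)), len(set(stripped)))  # prints: 2 1
```

Both rows collapse to "cafs", so a unique index on that column would
reject the second one on reload.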

The way to preserve the data is to switch the column to be a bytea.
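
As a rough illustration of why keeping the raw bytes helps, the sketch
below models a bytea value as a Python bytes object. The sample values
and the guess that the real encoding was Latin-1 are assumptions made
for the example only.

```python
# Hypothetical raw column values, kept byte-for-byte as bytea would.
raw_rows = [b"caf\xe9s", b"caf\xe8s"]

def decode_later(raw: bytes, encoding: str) -> str:
    # Decode only once the true source encoding has been identified.
    return raw.decode(encoding)

# Nothing was discarded, so the rows are still distinct.
still_distinct = len(set(raw_rows)) == len(raw_rows)

# Assumption: the data turned out to be Latin-1.
recovered = [decode_later(r, "latin-1") for r in raw_rows]
print(still_distinct, recovered)
```

Nothing is lost up front, and the conversion to text can wait until
someone has worked out what encoding the data was really in.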
