From: Gregory Maxwell <[email protected]>
To: Bruce Momjian <[email protected]>
Cc: Gavin Sherry <[email protected]>
Cc: Peter Eisentraut <[email protected]>
Cc: [email protected]
Subject: Re: Upcoming PG re-releases
Date: Thu, 8 Dec 2005 17:54:35 -0500
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>

On 12/8/05, Bruce Momjian <[email protected]> wrote:
> >       A script which identifies non-utf-8 characters and provides some
> >       context, line numbers, etc, will greatly speed up the process of
> >       remedying the situation.
>
> I think the best we can do is the "iconv -c with the diff" idea, which
> is already in the release notes.  I suppose we could merge the iconv and
> diff into a single command, but I don't see a portable way to output the
> iconv output to stdout, /dev/stdin not being portable.

No, what people who care about fixing their data need is a loadable
strip_invalid_utf8() function that works in older versions. Then they
can just run
select * from bar where foo != strip_invalid_utf8(foo);
to find the affected rows. The function would be useful in general,
too: for example, if you have an application which doesn't already
have much UTF-8 logic, you want to use a text field, and stripping is
the behaviour you want. Lots of simple web applications are in that
position.
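
To make the idea concrete, here is a rough client-side sketch in
Python of the same strip-and-compare check. It is not the loadable
server function itself. The name strip_invalid_utf8 comes from the
text above; rows_needing_repair and the sample data are hypothetical
and exist only for illustration. It assumes that "stripping" means
dropping malformed bytes, which is what Python's errors="ignore"
decoder option does.

```python
def strip_invalid_utf8(data: bytes) -> bytes:
    """Return data with every malformed UTF-8 byte sequence removed."""
    # errors="ignore" makes the decoder skip bytes it cannot decode;
    # re-encoding then yields well-formed UTF-8 only.
    return data.decode("utf-8", errors="ignore").encode("utf-8")


def rows_needing_repair(rows):
    """Yield (index, original, stripped) for each row that would change.

    Hypothetical helper: mirrors the
    "where foo != strip_invalid_utf8(foo)" comparison.
    """
    for index, value in enumerate(rows):
        stripped = strip_invalid_utf8(value)
        if value != stripped:
            yield index, value, stripped


if __name__ == "__main__":
    # Illustrative data: two valid rows and one Latin-1 encoded row.
    sample = [
        b"plain ascii",
        "caf\u00e9".encode("utf-8"),
        b"caf\xe9 latin-1",
    ]
    for index, original, stripped in rows_needing_repair(sample):
        print(index, original, stripped)
```

Only rows that differ from their stripped form are reported, so valid
data passes through untouched and the list of hits is exactly what
needs fixing by hand.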


