From: Gregory Maxwell
To: pgsql-hackers@postgresql.org
Subject: Upcoming PG re-releases
Date: Sun, 4 Dec 2005 12:19:32 -0500

On 12/4/05, Tom Lane wrote:
> Paul Lindner writes:
> > On Sun, Dec 04, 2005 at 11:34:16AM -0500, Tom Lane wrote:
> >> Paul Lindner writes:
> >>> iconv -c -f UTF8 -t UTF8 -o fixed.sql dump.sql
> >>
> >> Is that really a one-size-fits-all solution?  Especially with -c?
>
> > I'd say yes, and the -c flag is needed so iconv strips out the
> > invalid characters.
>
> That's exactly what's bothering me about it.  If we recommend that
> we had better put a large THIS WILL DESTROY YOUR DATA warning first.
> The problem is that the data is not "invalid" from the user's point
> of view --- more likely, it's in some non-UTF8 encoding --- and so
> just throwing away some of the characters is unlikely to make people
> happy.

Nor is it even guaranteed to make the data load: if the column has a
unique constraint and removing the non-UTF8 characters makes two rows
hold the same data where they didn't before, the load will fail on the
duplicate.

The way to preserve the data is to switch the column to bytea.
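
To make the collision concrete, here is a minimal Python sketch. The two sample values are made up, and decoding with errors="ignore" is only a stand-in for what iconv -c does (it is not iconv itself), so treat it as an illustration under those assumptions rather than a reproduction of any real dump.

```python
# Hypothetical sample values: two distinct Latin-1 strings ("caf" followed by
# e-acute or e-grave) sitting in a column that is supposed to hold UTF-8.
rows = [b"caf\xe9", b"caf\xe8"]


def strip_invalid_utf8(raw: bytes) -> str:
    """Drop every byte that is not valid UTF-8 (assumed stand-in for iconv -c)."""
    return raw.decode("utf-8", errors="ignore")


stripped = [strip_invalid_utf8(r) for r in rows]

# As raw bytes the two values are distinct, which is what a bytea column
# would keep.
distinct_as_bytes = len(set(rows))

# After stripping, both collapse to the same text, so a UNIQUE constraint
# on the column would reject the second row.
distinct_after_strip = len(set(stripped))

print(stripped)
print(distinct_as_bytes, distinct_after_strip)
```

Both values come out as the same string once the stray bytes are dropped, while the raw byte strings stay distinct, which is the case for bytea.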