X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org Received: from localhost (av.hub.org [200.46.204.144]) by postgresql.org (Postfix) with ESMTP id 0E7E59DD79E for ; Sat, 10 Dec 2005 19:50:43 -0400 (AST) Received: from postgresql.org ([200.46.204.71]) by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) with ESMTP id 60513-07 for ; Sat, 10 Dec 2005 19:50:45 -0400 (AST) X-Greylist: from auto-whitelisted by SQLgrey- Received: from eastrmmtao03.cox.net (eastrmmtao03.cox.net [68.230.240.36]) by postgresql.org (Postfix) with ESMTP id 5A9149DCD78 for ; Sat, 10 Dec 2005 19:50:40 -0400 (AST) Received: from [192.168.0.17] (really [24.136.39.213]) by eastrmmtao03.cox.net (InterMail vM.6.01.05.02 201-2131-123-102-20050715) with ESMTP id <20051210234908.HLPJ29285.eastrmmtao03.cox.net@[192.168.0.17]>; Sat, 10 Dec 2005 18:49:08 -0500 From: Robert Treat To: pgsql-hackers@postgresql.org, Martijn van Oosterhout Subject: Re: Upcoming PG re-releases Date: Sat, 10 Dec 2005 18:50:41 -0500 User-Agent: KMail/1.6.2 Cc: Bruce Momjian , Gregory Maxwell , Gavin Sherry , Peter Eisentraut References: <20051209164409.GG20352@svana.org> <200512091738.jB9HcLI18465@candle.pha.pa.us> <20051209182811.GH20352@svana.org> In-Reply-To: <20051209182811.GH20352@svana.org> MIME-Version: 1.0 Content-Disposition: inline Content-Type: text/plain; charset="utf-8" Content-Transfer-Encoding: 7bit Message-Id: <200512101850.41863.xzilla@users.sourceforge.net> X-Virus-Scanned: by amavisd-new at hub.org X-Spam-Status: No, score=0 required=5 tests=[none] X-Spam-Score: 0 X-Spam-Level: X-Archive-Number: 200512/553 X-Sequence-Number: 77401 Was thinking if someone could summarize this all it would make a really good FAQ entry. Robert Treat On Friday 09 December 2005 13:28, Martijn van Oosterhout wrote: > On Fri, Dec 09, 2005 at 12:38:21PM -0500, Bruce Momjian wrote: > > > This means someone who is planning on upgrading to 8.1 in two months > > > can use this function now to weed out the bad data before the upgrade > > > even starts. > > > > Oh, so you back-load it into the old database. Interesting. I assume > > to be useful you would have to write something that checked every column > > values in every table and database. > > Umm, yeah. I was thinking about how to do that. pl/pgsql is not the > best language to do that in. In any case I found a bug in the version I > posted and also added a function that does: > > test=# select * from db_utf8_verify(); > tab | fld | location > ------+-----+---------- > tbl1 | foo | (12,3) > (1 row) > > It gives the table, field and ctid of any values that failed. It skips > pg_catalog. It's also *really* slow for long strings. Just executing it > on the pg_rewrite in the default installation takes forever. If someone > really wanted this for a large database maybe they should recode it in > C. > > http://svana.org/kleptog/pgsql/utf8_verify.sql > > Have a nice day, -- Robert Treat Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL