X-Original-To: pgsql-hackers-postgresql.org@localhost.postgresql.org Received: from localhost (av.hub.org [200.46.204.144]) by postgresql.org (Postfix) with ESMTP id 967789DCBDE for ; Fri, 9 Dec 2005 14:28:26 -0400 (AST) Received: from postgresql.org ([200.46.204.71]) by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) with ESMTP id 47098-06 for ; Fri, 9 Dec 2005 14:28:25 -0400 (AST) X-Greylist: from auto-whitelisted by SQLgrey- Received: from svana.org (svana.org [203.20.62.76]) by postgresql.org (Postfix) with ESMTP id 6EC5F9DCABC for ; Fri, 9 Dec 2005 14:28:23 -0400 (AST) Received: from kleptog by svana.org with local (Exim 3.35 #1 (Debian)) id 1Ekmyb-0006pr-00; Sat, 10 Dec 2005 05:28:13 +1100 Date: Fri, 9 Dec 2005 19:28:12 +0100 From: Martijn van Oosterhout To: Bruce Momjian Cc: Gregory Maxwell , Gavin Sherry , Peter Eisentraut , pgsql-hackers@postgresql.org Subject: Re: Upcoming PG re-releases Message-ID: <20051209182811.GH20352@svana.org> Reply-To: Martijn van Oosterhout References: <20051209164409.GG20352@svana.org> <200512091738.jB9HcLI18465@candle.pha.pa.us> Mime-Version: 1.0 Content-Type: multipart/signed; micalg=pgp-sha1; protocol="application/pgp-signature"; boundary="/GPgYEyhnw15BExa" Content-Disposition: inline In-Reply-To: <200512091738.jB9HcLI18465@candle.pha.pa.us> User-Agent: Mutt/1.3.28i X-PGP-Key-ID: Length=1024; ID=0x0DC67BE6 X-PGP-Key-Fingerprint: 295F A899 A81A 156D B522 48A7 6394 F08A 0DC6 7BE6 X-PGP-Key-URL: X-Virus-Scanned: by amavisd-new at hub.org X-Spam-Status: No, score=0.077 required=5 tests=[AWL=0.077] X-Spam-Score: 0.077 X-Spam-Level: X-Archive-Number: 200512/511 X-Sequence-Number: 77359 --/GPgYEyhnw15BExa Content-Type: text/plain; charset=us-ascii Content-Disposition: inline Content-Transfer-Encoding: quoted-printable On Fri, Dec 09, 2005 at 12:38:21PM -0500, Bruce Momjian wrote: > > This means someone who is planning on upgrading to 8.1 in two months > > can use this function now to weed out the bad data before the upgrade > > even starts. >=20 > Oh, so you back-load it into the old database. Interesting. I assume > to be useful you would have to write something that checked every column > values in every table and database. Umm, yeah. I was thinking about how to do that. pl/pgsql is not the best language to do that in. In any case I found a bug in the version I posted and also added a function that does: test=3D# select * from db_utf8_verify(); tab | fld | location=20 ------+-----+---------- tbl1 | foo | (12,3) (1 row) It gives the table, field and ctid of any values that failed. It skips pg_catalog. It's also *really* slow for long strings. Just executing it on the pg_rewrite in the default installation takes forever. If someone really wanted this for a large database maybe they should recode it in C. http://svana.org/kleptog/pgsql/utf8_verify.sql Have a nice day, --=20 Martijn van Oosterhout http://svana.org/kleptog/ > Patent. n. Genius is 5% inspiration and 95% perspiration. A patent is a > tool for doing 5% of the work and then sitting around waiting for someone > else to do the other 95% so you can sue them. --/GPgYEyhnw15BExa Content-Type: application/pgp-signature Content-Disposition: inline -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.0.6 (GNU/Linux) Comment: For info see http://www.gnupg.org iD8DBQFDmcy7IB7bNG8LQkwRAlB+AJwL1Aa+cPEblMipOsf8+fT9DsRvRwCbBcQ4 LXQJV6a5dx+GnxY6cv5LIjY= =kybc -----END PGP SIGNATURE----- --/GPgYEyhnw15BExa--