Date: Wed, 9 Jun 2010 10:24:17 +0200
Subject: Re: Cognitive dissonance
From: Dave Coventry
Cc: pgsql-general@postgresql.org

My tupp'th:

Formatted text, whether PDF, HTML or (heaven forbid!) Word documents, is easier to read than unformatted plain text, and those of us without the OP's very admirable proficiency in vi remain at the mercy of the various readers and their associated search functions.

However, I'm sure that it's not too arduous a task to extract the text from these documents and strip it of its formatting? Or am I missing something?