From: "Joshua D. Drake"
Organization: Command Prompt, Inc.
To: PostgreSQL WWW
Date: Mon, 28 Aug 2006 20:12:28 -0700
Subject: A counter productive conversation about search.
Message-ID: <44F3B09C.3010104@commandprompt.com>

Hello,

Now that I have effectively slapped myself silly by being rude to Tom about search, let me bring up some points about search and see if there is a way to resolve them.

The problem: Search really isn't that good.
Tom has good results with it, but I am guessing that is because he is looking for specific things, likely just in the archives, as I doubt he often searches the documentation ;).

A quick search on Google:

site:archives.postgresql.org index bloat

  archives.postgresql.org/pgsql-performance/2005-04/msg00617.php
  archives.postgresql.org/pgsql-performance/2005-04/msg00594.php
  archives.postgresql.org/pgsql-performance/2005-04/msg00608.php

archives.postgresql.org:

  http://archives.postgresql.org/pgsql-performance/2005-04/msg00575.php
  http://archives.postgresql.org/pgsql-general/2004-12/msg00288.php
  http://archives.postgresql.org/pgsql-general/2005-07/msg00186.php

site:www.postgresql.org create index

  www.postgresql.org/docs/7.4/static/sql-createindex.html
  www.postgresql.org/docs/8.1/static/sql-createindex.html
  www.postgresql.org/files/documentation/books/aw_pgsql/node216.html

search.postgresql.org:

  http://www.postgresql.org/files/documentation/books/aw_pgsql/node216.html
  http://www.postgresql.org/files/documentation/books/pghandbuch/html/sql-createindex.html
  http://developer.postgresql.org/~petere/past-events/lsm2003-slides/foil20.html

The first search is "reasonable" in both cases, although our own search does not appear to correctly follow the thread path. The second search, to me, is completely wrong. CREATE INDEX should always return the current documentation first. I can forgive Google for showing 7.4 first because it has been around longer and is still widely in use.

I have on multiple occasions brought up the idea of another search engine. I wrote the pgsql.ru guys and asked if they would share their code. To their credit, they said they would be willing but didn't have the time to install it for us. I told them I would be happy to muscle through it if they would just answer some emails. I never heard back.

Other options include Lucene and rolling our own. Rolling our own really wouldn't be that hard "if" we can create a reasonably smart web page grabber.
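
To make the page grabber idea a little more concrete, here is a minimal sketch of one piece of it: taking HTML that has already been downloaded and pulling out the title, the visible text, and the same-host links to crawl next. It is only an illustration under stated assumptions, not code from this thread or from any existing postgresql.org tool; the names `PageGrabber` and `grab` are hypothetical, and fetching, politeness rules, and loading the text into a searchable table are left out.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageGrabber(HTMLParser):
    """Hypothetical helper: collects title, visible text and same-host links."""

    SKIP = {"script", "style"}  # element content that is never shown to readers

    def __init__(self, base_url):
        super().__init__(convert_charrefs=True)
        self.base_url = base_url
        self.host = urlparse(base_url).netloc
        self.title_parts = []
        self.text_parts = []
        self.links = []
        self._in_title = False
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag in self.SKIP:
            self._skip_depth += 1
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links and drop the #fragment part.
                url = urljoin(self.base_url, href).split("#", 1)[0]
                if urlparse(url).netloc == self.host and url not in self.links:
                    self.links.append(url)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag in self.SKIP and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth:
            return  # inside <script> or <style>
        if self._in_title:
            self.title_parts.append(data)
        else:
            self.text_parts.append(data)


def grab(base_url, html):
    """Parse one already-fetched page into a dict ready for indexing."""
    parser = PageGrabber(base_url)
    parser.feed(html)
    parser.close()
    title = " ".join("".join(parser.title_parts).split())
    text = " ".join(" ".join(parser.text_parts).split())
    return {"url": base_url, "title": title, "text": text, "links": parser.links}
```

Keeping the title separate from the body text is a deliberate choice in this sketch: it would let a ranking step weight a title match more heavily, and keeping the URL lets it prefer the current documentation version over older ones.
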
We have all the tools (tsearch2 and pg_trgm) to easily do the searches.

So is anyone up for helping develop a page grabber?

Sincerely,

Joshua D. Drake

-- 
=== The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
Providing the most comprehensive PostgreSQL solutions since 1997
http://www.commandprompt.com/