Date: Thu, 28 Feb 2008 12:10:08 +0300 (MSK)
From: Oleg Bartunov
To: Magnus Hagander
cc: Gevik Babakhani, "'PostgreSQL - WWW ML'"
Subject: Re: www search behaviour

On Thu, 28 Feb 2008, Magnus Hagander wrote:

> On Thu, Feb 28, 2008 at 11:54:34AM +0300, Oleg Bartunov wrote:
>> On Thu, 28 Feb 2008, Gevik Babakhani wrote:
>>
>>>> Yeah, it's climbing up my TODO list...
>>>
>>> Anything that I can do to help?
>>
>> Several variants:
>>
>> 1. Create a custom parser for pgweb, which doesn't treat '_' as a
>> separator
>> 2. Create a synonym dictionary for pgweb, which lists all pg-specific
>> terms
>>
>> 1 is the most correct way, 2 is the easiest way: just create word stats,
>> see all terms with '_' and create a pgvars dictionary like
>> some_word some_word
>
> Didn't we also talk about option 3, a custom dictionary that strips the
> underscores? Where we could potentially use the regexp one as well?

I forgot about it :)

>
> If you do pt 2, I can put it in right away. It'll take a reindexing of the
> whole db though. (Well, any of the options will)
>
>
>> Then create a text search configuration, test it and give it to Magnus.
>> btw, Magnus, did you move to 8.3?
>
> Yes, we're on 8.3. And it's great ;-)

I'm busy; Teodor and I are going to Toulouse (France), btw.
I think Gevik could help you. A custom parser will not be difficult.
About the dictionary - http://vo.astronet.ru/arxiv/dict_regex.html,
I'm not sure if it's ported to 8.3. I'll ping the author.

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru),
Sternberg Astronomical Institute, Moscow University, Russia
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(495)939-16-83, +007(495)939-23-83
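
P.S. For variant 2, the "create word stats, see all terms with '_'" step could be sketched roughly as below. This is only an illustration, not what pgweb actually runs: the function names (underscore_term_stats, synonym_lines), the regular expression and the min_count threshold are all assumptions made for the example. The output lines follow the "some_word some_word" form quoted above.

```python
import re
from collections import Counter

# Assumed definition of a "term with '_'": alphanumeric pieces joined by
# one or more underscores, starting with a letter (e.g. shared_buffers).
UNDERSCORE_TERM = re.compile(r"\b[A-Za-z][A-Za-z0-9]*(?:_[A-Za-z0-9]+)+\b")


def underscore_term_stats(texts):
    """Count how often each underscore-containing term occurs (lowercased)."""
    counts = Counter()
    for text in texts:
        counts.update(m.group(0).lower() for m in UNDERSCORE_TERM.finditer(text))
    return counts


def synonym_lines(counts, min_count=1):
    """Build 'some_word some_word' lines for terms seen at least min_count times."""
    return [f"{term} {term}" for term in sorted(counts) if counts[term] >= min_count]


if __name__ == "__main__":
    # Hypothetical sample text standing in for the indexed documents.
    docs = [
        "Set shared_buffers and work_mem.",
        "shared_buffers again, plus plain words.",
    ]
    for line in synonym_lines(underscore_term_stats(docs)):
        print(line)
```

The printed lines can be reviewed by hand before going into the dictionary file; raising min_count is one way to drop rare terms or typos from the list.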