From: Joshua D. Drake <[email protected]>
To: Dave Page <[email protected]>
Cc: PostgreSQL WWW <[email protected]>
Subject: Re: A counter productive conversation about search.
Date: Tue, 29 Aug 2006 07:28:05 -0700
Message-ID: <[email protected]>
In-Reply-To: <E7F85A1B5FF8D44C8A1AF6885BC9A0E40154C88D@ratbert.vale-housing.co.uk>
References: <E7F85A1B5FF8D44C8A1AF6885BC9A0E40154C88D@ratbert.vale-housing.co.uk>


>> Other options include lucene, and rolling our own.
> 
> Is Lucene capable of handling the size of our index? This has always

I am going to say "yes" without any actual knowledge of Lucene, but
that is because I am putting more trust in the fact that it is an
Apache project than in anything else. I will check.

> been the problem we've had with other projects like MnogoSearch. They
> work well until you load them up with the archives after which they
> simply can't cope without ridiculous amounts of hardware.
> 
>> Rolling our own really wouldn't be that hard "if" we can create a
>> reasonably smart web page grabber. We have all the tools
>> (tsearch2 and pg_trgm) to easily do the searches.
>>
>> So is anyone up for helping develop a page grabber?
> 
> We have one - it builds the static version of the main site by spidering
> it hourly.

Should we look at that then?

> 
> Regards, Dave.
> 


-- 

    === The PostgreSQL Company: Command Prompt, Inc. ===
Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240
    Providing the most comprehensive  PostgreSQL solutions since 1997
              http://www.commandprompt.com/




