From: Oleg Bartunov <[email protected]>
To: Marc G. Fournier <[email protected]>
Cc: Josh Berkus <[email protected]>
Cc: Dave Page <[email protected]>
Cc: [email protected]
Subject: Re: Postgresql.org search engine.
Date: Sat, 31 Jan 2004 15:45:28 +0300 (MSK)
Message-ID: <[email protected]>
In-Reply-To: <[email protected]>
References: <[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>
	<[email protected]>

On Sat, 31 Jan 2004, Marc G. Fournier wrote:

> On Fri, 30 Jan 2004, Josh Berkus wrote:
>
> > Guys,
> >
> > > Do you have software to do this, including all the inter-posting
> > > references and followups?  Or do you propose we write this all from
> > > scratch?
> >
> > Robert Bernier apparently wrote something to break up mail for inclusion in a
> > database, and should be able to help in a couple months.  Josh Drake is also
> > willing to help, and has already done a prototype without header searching.
>
> Dumping mail into a database isn't that hard to do ... there are several
> projects on the 'Net right now doing that, including one that connects a
> POP3 daemon into the database to download the mail ... in fact, from what
> I recall of fts.postgresql.org, isn't that what Oleg/Teodor's stuff does?
>
> I'm kinda curious here ... exactly what problem are we trying to solve
> here?
>
> Me, I'm just trying to clean up the archives so that when someone gets
> their search results, they don't all show the same 'text', which I've
> already accomplished ... Dave is working on improving the speed of the
> searches, which he has accomplished with ASPseek ...
>
> If I can figure out how to get the Date: of the posting into the
> Last-Modified field (I know *how* it should work, but last time I tried it
> ended up generating a whack of errors), then that should satisfy Oleg's
> beef ...
>
> Oleg, one question ... what do you recommend setting max-age to for
> Cache-control?  Right now, I have it set to 30 days ... too long?  not
> long enough?

In my experience Cache-Control is not effective, because it's an
HTTP/1.1 feature and a lot of users come through proxies which still
don't support HTTP/1.1.
The Last-Modified header is the most universal way.
See http://www.mnot.net/cache_docs/#CACHE-CONTROL
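
To illustrate the two headers under discussion, here is a minimal sketch
in Python of deriving Last-Modified from a posting's Date: header and
pairing it with a 30-day max-age. The function name caching_headers and
the constant THIRTY_DAYS are made up for this example; this is not the
code the archives actually run, and it assumes the Date: header is
well-formed.

```python
from datetime import timezone
from email.utils import format_datetime, parsedate_to_datetime

# 30 days in seconds, matching the max-age mentioned in the thread.
THIRTY_DAYS = 30 * 24 * 60 * 60


def caching_headers(mail_date: str, max_age: int = THIRTY_DAYS) -> dict:
    """Build HTTP caching headers from an email Date: header (hypothetical helper)."""
    # Parse the RFC 2822 date, keeping its UTC offset.
    posted = parsedate_to_datetime(mail_date)
    # HTTP dates are always expressed in GMT, so convert first.
    posted_utc = posted.astimezone(timezone.utc)
    return {
        # Understood by HTTP/1.0 and HTTP/1.1 caches alike.
        "Last-Modified": format_datetime(posted_utc, usegmt=True),
        # HTTP/1.1 only; older proxies will ignore it.
        "Cache-Control": f"max-age={max_age}",
    }


headers = caching_headers("Sat, 31 Jan 2004 15:45:28 +0300")
for name, value in headers.items():
    print(f"{name}: {value}")
```

Sending both headers costs nothing: HTTP/1.1 caches can use max-age,
while older proxies fall back on Last-Modified.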

>
> ----
> Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
> Email: [email protected]           Yahoo!: yscrappy              ICQ: 7615664
>

	Regards,
		Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: [email protected], http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83


