Date: Tue, 13 Jan 2004 15:49:00 -0400 (AST)
From: "Marc G. Fournier"
To: Oleg Bartunov
Cc: pgsql-www@postgresql.org
Subject: Re: incomplete headers: archives.postgresql.org
Message-ID: <20040113154826.Q45512@ganymede.hub.org>

let me look into it ... I don't think adding that info is particularly
difficult, just a matter of adding a couple of 'header()' calls to the PHP
on top ... added to my TODO list ...

On Tue, 13 Jan 2004, Oleg Bartunov wrote:

> Hi there,
>
> crawling archives.postgresql.org is a pain, because there is no
> Last-Modified information in the headers and the crawler has to download
> each message again.
> For example:
>
> megera@mira:~$ curl -I http://archives.postgresql.org/pgsql-hackers/2004-01/msg00282.php
> HTTP/1.1 200 OK
> Date: Tue, 13 Jan 2004 17:38:26 GMT
> Server: Apache/1.3.28 (Unix) PHP/4.3.3RC1
> X-Powered-By: PHP/4.3.3RC1
> Content-Type: text/html
>
> Is it possible to add, at least, a 'Last-Modified' header, so a crawler could
> tell whether the page should be downloaded again? It would save bandwidth
> and crawl time. I think the best way is to set the 'Last-Modified' header
> to the date of the message from its 'Date:' field. Of course, there should be
> protection against 'bad clocks', so the default time may be the arrival time.
>
> Also, it could be useful to add an 'Expires' header.
> I think the headers should be added only to pages with individual messages,
> not to indexes, because index pages do change.
>
> I don't think it's very difficult, but it would help the site and people.
>
>
> btw, I use cacheability to check whether a page can be cached:
> http://www.sai.msu.su/admin/cacheability/?query=http%3A%2F%2Farchives.postgresql.org%2Fpgsql-hackers%2F2004-01%2Fmsg00282.php&descend=on
>
> http://archives.postgresql.org/pgsql-hackers/2004-01/msg00282.php
>   Expires         -
>   Cache-Control   -
>   Last-Modified   -
>   ETag            -
>   Content-Length  - (actual size: 13277)
>   Server          Apache/1.3.28 (Unix) PHP/4.3.3RC1
>
> This object will be considered stale, because it doesn't have any freshness
> information assigned. It doesn't have a validator present. It doesn't have a
> Content-Length header present, so it can't be used in a HTTP/1.0 persistent
> connection.
>
> Regards,
> Oleg
> _____________________________________________________________
> Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
> Sternberg Astronomical Institute, Moscow University (Russia)
> Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
> phone: +007(095)939-16-83, +007(095)939-23-83
>

----
Marc G. Fournier           Hub.Org Networking Services (http://www.hub.org)
Email: scrappy@hub.org           Yahoo!: yscrappy              ICQ: 7615664
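
The scheme proposed in the quoted message (Last-Modified taken from the
message's 'Date:' field, falling back to the arrival time when the sender's
clock looks wrong, plus a 304 answer to conditional requests) can be sketched
roughly as follows. This is a minimal illustration in Python, not the
archive's actual PHP; the function names last_modified and respond and the
one-day skew threshold are assumptions made for the example and do not come
from the thread.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime, parsedate_to_datetime


def _parse_http_date(value):
    """Parse an RFC 2822 / HTTP date string; return an aware UTC datetime or None."""
    try:
        parsed = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None  # unparseable header
    if parsed.tzinfo is None:
        # "-0000" yields a naive datetime; treat it as UTC.
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


def last_modified(date_header, arrival, max_skew=timedelta(days=1)):
    """Choose the Last-Modified time for one archived message.

    date_header: the message's 'Date:' field as a string.
    arrival:     timezone-aware datetime at which the list server received it.
    max_skew:    hypothetical 'bad clock' threshold (an assumption).
    """
    arrival = arrival.astimezone(timezone.utc)
    sent = _parse_http_date(date_header)
    if sent is None or abs(sent - arrival) > max_skew:
        return arrival  # sender's clock is missing or implausible
    return sent


def respond(modified, if_modified_since=None):
    """Return (status, headers) for a GET of an individual message page."""
    modified = modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(modified, usegmt=True)}
    since = _parse_http_date(if_modified_since) if if_modified_since else None
    if since is not None and modified <= since:
        return 304, headers  # crawler's copy is still current; send no body
    return 200, headers
```

With the 'Date:' value of this very message and an arrival time a few minutes
later, last_modified returns 19:49:00 UTC, and a crawler that sends that value
back in If-Modified-Since gets status 304 from respond. Index pages would be
left out of this scheme, since they keep changing.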