public inbox for [email protected]  
Mirror.php performance
4+ messages / 2 participants

* Mirror.php performance
@ 2004-11-04 09:39 Dave Page <[email protected]>
  2004-11-04 10:43 ` Re: Mirror.php performance Alexey Borzov <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Dave Page @ 2004-11-04 09:39 UTC (permalink / raw)
  To: pgsql-www; +Cc: Alexey Borzov <[email protected]>

Well, mirror.php's performance is *far* better than it was, though there
is clearly still room for improvement. However, something is not right:
there are over 7000 docs pages counting both static and interactive, yet
the run below saved only 1027:

Nov 04 08:53:38 mirror [info] Mirroring started
Nov 04 08:57:09 mirror [error] HTTP error 404 at page
http://wwwdevel.postgresql.org/images/editorschoice2003.jpg
Nov 04 09:01:46 mirror [error] HTTP error 404 at page
http://wwwdevel.postgresql.org/presskit/en/presskit74.html
Nov 04 09:02:47 mirror [error] HTTP error 404 at page
http://wwwdevel.postgresql.org/[email protected]
Nov 04 09:16:04 mirror [info] Mirroring finished. 1027 page(s) saved,
1346 second(s) spent

It appears to have saved everything in the root directory afaict, and
the 7.4 static docs, but nothing else.
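
As a cross-check on the figures in the log, here is a minimal Python sketch that parses the quoted lines and compares the timestamps with the reported totals. The regexes, the `summarize` helper, and the dictionary keys are illustrative assumptions and not part of mirror.php. The log excerpt uses the first two 404 lines with their wrapped URLs rejoined, and the run is assumed not to cross midnight.

```python
import re

# Log lines copied from the message, with wrapped lines rejoined.
LOG = """\
Nov 04 08:53:38 mirror [info] Mirroring started
Nov 04 08:57:09 mirror [error] HTTP error 404 at page http://wwwdevel.postgresql.org/images/editorschoice2003.jpg
Nov 04 09:01:46 mirror [error] HTTP error 404 at page http://wwwdevel.postgresql.org/presskit/en/presskit74.html
Nov 04 09:16:04 mirror [info] Mirroring finished. 1027 page(s) saved, 1346 second(s) spent
"""

# Hypothetical patterns matching the line layout seen in the excerpt.
ENTRY = re.compile(r"^\w{3} \d{2} (\d{2}):(\d{2}):(\d{2}) mirror \[(\w+)\] (.*)$")
SUMMARY = re.compile(r"(\d+) page\(s\) saved, (\d+) second\(s\) spent")


def summarize(log):
    """Return page count, elapsed time, and failed URLs for one mirror run."""
    entries = []
    for line in log.splitlines():
        match = ENTRY.match(line)
        if match is None:
            continue  # skip anything that is not a log entry
        hours, minutes, seconds = (int(match.group(i)) for i in (1, 2, 3))
        # Seconds since midnight; assumes the run does not cross midnight.
        stamp = hours * 3600 + minutes * 60 + seconds
        entries.append((stamp, match.group(4), match.group(5)))

    first, last = entries[0], entries[-1]
    done = SUMMARY.search(last[2])
    pages, reported = int(done.group(1)), int(done.group(2))
    return {
        "pages": pages,
        "reported_seconds": reported,
        "elapsed_seconds": last[0] - first[0],
        "pages_per_second": pages / reported,
        # The URL is the last token of each error message.
        "error_urls": [msg.split()[-1] for _, level, msg in entries if level == "error"],
    }


if __name__ == "__main__":
    print(summarize(LOG))
```

The span between the first and last timestamps agrees with the 1346 seconds the script reports, so the low page count is not a timing artifact: the run really finished after saving 1027 pages, at roughly 0.76 pages per second.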

Any ideas?

Regards, Dave.




* Re: Mirror.php performance
  2004-11-04 09:39 Mirror.php performance Dave Page <[email protected]>
@ 2004-11-04 10:43 ` Alexey Borzov <[email protected]>
  2004-11-04 13:02   ` Re: Mirror.php performance Alexey Borzov <[email protected]>
  0 siblings, 1 reply; 4+ messages in thread

From: Alexey Borzov @ 2004-11-04 10:43 UTC (permalink / raw)
  To: Dave Page <[email protected]>; +Cc: pgsql-www

Hi,

Dave Page wrote:
> Nov 04 09:16:04 mirror [info] Mirroring finished. 1027 page(s) saved,
> 1346 second(s) spent
> 
> It appears to have saved everything in the root directory afaict, and
> the 7.4 static docs, but nothing else.
> 
> Any ideas?

Ouch. It did the same for me, will look into this: seems as if some 
links are dropped / not followed.




* Re: Mirror.php performance
  2004-11-04 09:39 Mirror.php performance Dave Page <[email protected]>
  2004-11-04 10:43 ` Re: Mirror.php performance Alexey Borzov <[email protected]>
@ 2004-11-04 13:02   ` Alexey Borzov <[email protected]>
  0 siblings, 0 replies; 4+ messages in thread

From: Alexey Borzov @ 2004-11-04 13:02 UTC (permalink / raw)
  To: Dave Page <[email protected]>; +Cc: pgsql-www

Hi,

Alexey Borzov wrote:
>> Nov 04 09:16:04 mirror [info] Mirroring finished. 1027 page(s) saved,
>> 1346 second(s) spent
>>
>> It appears to have saved everything in the root directory afaict, and
>> the 7.4 static docs, but nothing else.
>>
>> Any ideas?
> 
> Ouch. It did the same for me, will look into this: seems as if some 
> links are dropped / not followed.

Fixed. Turned out the regexes to extract links from pages were broken 
and some of the links (including the main menu, unfortunately) were thus 
not crawled.
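
The thread does not show the actual patterns from mirror.php, so the following is only a hedged illustration, in Python and not the script's PHP, of how an overly strict link-extraction regex can silently skip links. Both patterns, the `extract_links` helper, and the sample markup are hypothetical.

```python
import re

# A deliberately narrow pattern: lowercase tag, href as the first attribute,
# double quotes only. Hypothetical; not taken from mirror.php.
NARROW = re.compile(r'<a href="([^"]+)"')

# A more tolerant pattern: any case, href anywhere in the tag,
# value in double quotes, single quotes, or unquoted.
TOLERANT = re.compile(
    r'''<a\b[^>]*?\bhref\s*=\s*(?:"([^"]*)"|'([^']*)'|([^\s>]+))''',
    re.IGNORECASE,
)


def extract_links(html, pattern):
    """Return every href value that `pattern` finds in `html`, in order."""
    links = []
    for match in pattern.finditer(html):
        # Exactly one capture group is set for each match.
        links.append(next(g for g in match.groups() if g is not None))
    return links


# Made-up markup with three common ways of writing the same kind of link.
SAMPLE = """
<a href="/about/">About</a>
<A HREF='/docs/'>Docs</A>
<a class="menu" href=/download/>Download</a>
"""

if __name__ == "__main__":
    print(extract_links(SAMPLE, NARROW))
    print(extract_links(SAMPLE, TOLERANT))
```

On this sample the narrow pattern finds only the first link, while the tolerant one finds all three. A crawler that relies on the narrow form would never queue the other two pages, which is the kind of symptom described above. For arbitrary real-world HTML a proper parser is usually more robust than any regex.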






* Re: Mirror.php performance
@ 2004-11-04 13:16 Dave Page <[email protected]>
  0 siblings, 0 replies; 4+ messages in thread

From: Dave Page @ 2004-11-04 13:16 UTC (permalink / raw)
  To: Alexey Borzov <[email protected]>; +Cc: pgsql-www

 

> -----Original Message-----
> From: Alexey Borzov [mailto:[email protected]] 
> Sent: 04 November 2004 13:03
> To: Dave Page
> Cc: [email protected]
> Subject: Re: [pgsql-www] Mirror.php performance
> 
> Hi,
> 
> Alexey Borzov wrote:
> >> Nov 04 09:16:04 mirror [info] Mirroring finished. 1027 page(s) saved,
> >> 1346 second(s) spent
> >>
> >> It appears to have saved everything in the root directory afaict, and
> >> the 7.4 static docs, but nothing else.
> >>
> >> Any ideas?
> > 
> > Ouch. It did the same for me, will look into this: seems as if some 
> > links are dropped / not followed.
> 
> Fixed. Turned out the regexes to extract links from pages 
> were broken and some of the links (including the main menu, 
> unfortunately) were thus not crawled.

Thanks, I'll give it a try.

Regards, Dave.






end of thread, other threads:[~2004-11-04 13:16 UTC | newest]

Thread overview: 4+ messages
2004-11-04 09:39 Mirror.php performance Dave Page <[email protected]>
2004-11-04 10:43 ` Alexey Borzov <[email protected]>
2004-11-04 13:02   ` Alexey Borzov <[email protected]>
2004-11-04 13:16 Re: Mirror.php performance Dave Page <[email protected]>
