X-Original-To: pgsql-www-postgresql.org@localhost.postgresql.org
Received: from localhost (av.hub.org [200.46.204.144]) by postgresql.org (Postfix) with ESMTP id E718E9DC881; Mon, 16 Jan 2006 20:12:44 -0400 (AST)
Received: from postgresql.org ([200.46.204.71]) by localhost (av.hub.org [200.46.204.144]) (amavisd-new, port 10024) with ESMTP id 74826-09; Mon, 16 Jan 2006 20:12:47 -0400 (AST)
X-Greylist: from auto-whitelisted by SQLgrey-
Received: from noel.decibel.org (noel.decibel.org [67.100.216.10]) by postgresql.org (Postfix) with ESMTP id 7F4799DC81C; Mon, 16 Jan 2006 20:12:42 -0400 (AST)
Received: by noel.decibel.org (Postfix, from userid 1001) id A03AA39834; Mon, 16 Jan 2006 18:12:45 -0600 (CST)
Date: Mon, 16 Jan 2006 18:12:45 -0600
From: "Jim C. Nasby"
To: Magnus Hagander
Cc: "Marc G. Fournier" , Josh Berkus , John Hansen , pgsql-www@postgresql.org
Subject: Re: Infrastructure monitoring
Message-ID: <20060117001245.GR67693@pervasive.com>
References: <6BCB9D8A16AC4241919521715F4D8BCE92E9BA@algol.sollentuna.se>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <6BCB9D8A16AC4241919521715F4D8BCE92E9BA@algol.sollentuna.se>
X-Operating-System: FreeBSD 6.0-RELEASE amd64
X-Distributed: Join the Effort! http://www.distributed.net
User-Agent: Mutt/1.5.11
X-Virus-Scanned: by amavisd-new at hub.org
X-Spam-Status: No, score=0.093 required=5 tests=[AWL=0.093]
X-Spam-Score: 0.093
X-Spam-Level:
X-Archive-Number: 200601/184
X-Sequence-Number: 9372

On Sat, Jan 14, 2006 at 12:16:25PM +0100, Magnus Hagander wrote:
> BTW, we already do content monitoring on the actual website mirrors. If
> a mirror does not answer, *or* does not update properly, it will
> automatically be removed from the DNS record, and thus get out of
> "public view" after 10-30 minutes.

And this is how all the services should work, at least from a
monitoring standpoint.
If any public service (any of the websites, search, archives, email,
ftp, etc.) goes down, multiple people should get pages. Along those
lines, disk space should also be monitored to make sure nothing fills up.

> What I think would be good in cases like this is just information -
> AFAIK nobody on the web team knew the servers were being moved. (I may
> be wrong here - I know I didn't know and I also spoke to Dave about it,
> but those are the only ones I polled. Anyway, -www should know)

And info is one of the other keys to keeping things running smoothly...
ISTM any service changes or outages should certainly be posted someplace
where the people monitoring things will see them and know what's going on.
-- 
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461
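
As a rough illustration of the checks discussed in this message (a service
that must answer, content that must stay fresh, disks that must not fill up,
and alerts that go to more than one person), here is a minimal Python sketch.
Every name, threshold, and recipient in it is hypothetical; it does not
describe the monitoring actually deployed on postgresql.org, and the paging
step is left as a list of messages rather than a real pager integration.

```python
import shutil
import urllib.request
from dataclasses import dataclass


@dataclass
class CheckResult:
    name: str
    ok: bool
    detail: str


def check_http(url: str, timeout: float = 10.0) -> CheckResult:
    """Report whether a service answers an HTTP request at all."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return CheckResult(url, True, f"HTTP {resp.status}")
    except OSError as exc:
        # URLError, HTTPError and socket timeouts all derive from OSError.
        return CheckResult(url, False, str(exc))


def check_fresh(name: str, last_update: float, now: float,
                max_age_s: float = 1800.0) -> CheckResult:
    """Report whether content was updated recently enough.

    max_age_s is an assumed threshold, not a value taken from the thread.
    """
    age = now - last_update
    return CheckResult(name, age <= max_age_s, f"last update {age:.0f}s ago")


def check_disk(path: str, max_used_fraction: float = 0.9) -> CheckResult:
    """Report whether the filesystem holding `path` is below a usage limit."""
    usage = shutil.disk_usage(path)
    used_fraction = usage.used / usage.total if usage.total else 0.0
    ok = used_fraction < max_used_fraction
    return CheckResult(f"disk:{path}", ok, f"{used_fraction:.0%} used")


def alerts_for(results, recipients):
    """Build one (recipient, message) pair per failed check per recipient,
    so that every failure reaches multiple people."""
    return [
        (person, f"{res.name} FAILED: {res.detail}")
        for res in results
        if not res.ok
        for person in recipients
    ]
```

A cron job could run the checks every few minutes and hand the output of
`alerts_for` to whatever pager or mail gateway is available; keeping the
checks as plain functions makes it easy to add one per public service.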