Date: Wed, 12 Mar 2008 16:32:26 +0100
From: Magnus Hagander
To: Dave Page
Cc: Alvaro Herrera, Bruce Momjian, PostgreSQL www, Stefan Kaltenbrunner
Subject: Re: Email not searchable in our archives
Message-ID: <20080312153226.GP29649@svr2.hagander.net>
In-Reply-To: <937d27e10803120825k67fc19c1h27f977a8038dfdd7@mail.gmail.com>

On Wed, Mar 12, 2008 at 03:25:00PM +0000, Dave Page wrote:
> On Wed, Mar 12, 2008 at 2:14 PM, Dave Page wrote:
> > Confirmed. I added a test mode to a copy of the archives indexer, and
> > running that it claims it would index a further 715 messages, which
> > would give us a total of 1187.
> >
> > So I guess the next step is to try running out of test mode to see if
> > the data actually makes it into the index now, but I didn't want to do
> > that and stomp on any testing you're doing.
>
> OK, so running it properly has added those missing 715 messages.
> I think we need to run a full index run which should restore any missing
> pages, but before we do that, I'd kinda like to gather any ideas on
> why this has happened before removing any evidence.
>
> My best guess is simply that the indexer failed for some time and
> no one noticed for a few weeks. By the time it was re-run, some
> messages that it had missed were outside the timeframe that an
> incremental crawl would have picked up (the current, plus last month).
> Thoughts?
>
> Stefan; any thoughts on how we might monitor that the indexer has been
> running correctly? I assume that should be fairly easy if we have it
> drop a timestamp someplace?

I admit to having a ticket on pmt to get that set up.

Actually, it might be better to look into the actual database and find
the latest email indexed? If it's older than some threshold, something
is wrong. It could be the archives that are wrong and not the indexer,
of course, but the point is we'll get notified and someone can look
into it.

Do you think we need to track it on a per-list basis, or just check for
the latest timestamp across all lists?

//Magnus
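
A rough sketch of the freshness check described above, in Python, to make the per-list versus all-lists question concrete. Everything named here is an assumption for illustration: the two-day threshold, the function names, the sample timestamps, and the idea that the latest indexed timestamp per list has already been fetched from the search database (the actual schema is not known).

```python
from datetime import datetime, timedelta

# Hypothetical threshold: how old the newest indexed message may be
# before the indexer (or the archives) is considered broken.
MAX_AGE = timedelta(days=2)


def stale_lists(latest_indexed, now, max_age=MAX_AGE):
    """Per-list check: names of lists whose newest indexed message is too old.

    latest_indexed maps a list name to the timestamp of the most recent
    message found in the index for that list (assumed input, e.g. the
    result of a MAX(date) query grouped by list).
    """
    return sorted(
        name for name, ts in latest_indexed.items() if now - ts > max_age
    )


def index_is_stale(latest_indexed, now, max_age=MAX_AGE):
    """Global check: only the newest timestamp across all lists matters."""
    if not latest_indexed:
        return True  # nothing indexed at all is worth an alert
    return now - max(latest_indexed.values()) > max_age


# Made-up sample data: one list is current, one stopped a while ago.
now = datetime(2008, 3, 12, 16, 0)
latest = {
    "pgsql-hackers": datetime(2008, 3, 12, 9, 30),
    "pgsql-www": datetime(2008, 2, 1, 12, 0),
}
print(stale_lists(latest, now))     # ['pgsql-www']
print(index_is_stale(latest, now))  # False
```

With this sample data the global check stays quiet while the per-list check flags one list, which is the trade-off in question: a single timestamp across all lists is simpler, but it can hide a problem that affects only some lists. A per-list check would probably need a longer threshold for low-traffic lists to avoid false alarms.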