Message-ID: <937d27e10803120825k67fc19c1h27f977a8038dfdd7@mail.gmail.com>
Date: Wed, 12 Mar 2008 15:25:00 +0000
From: "Dave Page"
To: "Alvaro Herrera"
Subject: Re: Email not searchable in our archives
Cc: "Bruce Momjian", "PostgreSQL www", "Stefan Kaltenbrunner"
In-Reply-To: <937d27e10803120714h21f1f661s657fad641f35c3d5@mail.gmail.com>

On Wed, Mar 12, 2008 at 2:14 PM, Dave Page wrote:
> Confirmed. I added a test mode to a copy of the archives indexer, and
> running that it claims it would index a further 715 messages, which
> would give us a total of 1187.
>
> So I guess the next step is to try running out of test mode to see if
> the data actually makes it into the index now, but I didn't want to do
> that and stomp on any testing you're doing.
OK, so running it properly has added those missing 715 messages. I think we need to do a full index run, which should restore any missing pages, but before we do that and remove any evidence, I'd like to gather ideas on why this happened.

My best guess is simply that the indexer failed for some time and no one noticed for a few weeks. By the time it was re-run, some of the messages it had missed fell outside the window that an incremental crawl picks up (the current month plus the previous one).

Thoughts?

Stefan, any thoughts on how we might monitor that the indexer has been running correctly? I assume that should be fairly easy if we have it drop a timestamp someplace?

-- 
Dave Page
EnterpriseDB UK Ltd: http://www.enterprisedb.com
PostgreSQL UK 2008 Conference: http://www.postgresql.org.uk
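
For illustration only, here is roughly what the "drop a timestamp someplace" idea could look like. This is a minimal sketch, not the actual archives indexer: the function names, the file location and the age threshold are all hypothetical. The indexer would call write_heartbeat() at the end of each successful run, and a separate monitoring check would call heartbeat_is_fresh() and raise an alert when it returns False.

```python
import os
import tempfile
import time


def write_heartbeat(path, now=None):
    """Record the completion time of a successful indexer run.

    The path is a hypothetical location chosen by the operator. The
    timestamp is written to a temporary file and renamed into place so
    that a monitor never reads a half-written file.
    """
    stamp = time.time() if now is None else now
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(f"{stamp:.0f}\n")
    os.replace(tmp_path, path)


def heartbeat_is_fresh(path, max_age_seconds, now=None):
    """Return True if the last recorded run is recent enough.

    A missing or unreadable heartbeat file counts as stale, so an
    indexer that has never completed a run is also reported.
    """
    current = time.time() if now is None else now
    try:
        with open(path) as handle:
            last_run = float(handle.read().strip())
    except (OSError, ValueError):
        return False
    return current - last_run <= max_age_seconds
```

The threshold would need to be somewhat longer than the interval between scheduled runs, so that a single slow run does not trigger a false alarm while a failure lasting weeks is still caught well inside the incremental crawl window.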