From: Magnus Hagander <[email protected]>
To: Dave Page <[email protected]>
Cc: Alvaro Herrera <[email protected]>
Cc: Bruce Momjian <[email protected]>
Cc: PostgreSQL www <[email protected]>
Cc: Stefan Kaltenbrunner <[email protected]>
Subject: Re: Email not searchable in our archives
Date: Wed, 12 Mar 2008 16:32:26 +0100
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>
References: <[email protected]>
<[email protected]>
<[email protected]>
<[email protected]>
On Wed, Mar 12, 2008 at 03:25:00PM +0000, Dave Page wrote:
> On Wed, Mar 12, 2008 at 2:14 PM, Dave Page <[email protected]> wrote:
> > Confirmed. I added a test mode to a copy of the archives indexer, and
> > running that it claims it would index a further 715 messages, which
> > would give us a total of 1187.
> >
> > So I guess the next step is to try running out of test mode to see if
> > the data actually makes it into the index now, but I didn't want to do
> > that and stomp on any testing you're doing.
>
> OK, so running it properly has added those missing 715 messages. I
> think we need to run a full index run which should restore any missing
> pages, but before we do that, I'd kinda like to gather any ideas on
> why this has happened before removing any evidence.
>
> My best guess is simply that the indexer failed for some time and
> noone noticed for a few weeks. By the time it was re-run, some
> messages that it had missed were outside the timeframe that an
> incremental crawl would have picked up (the current, plus last month).
> Thoughts?
>
> Stefan; any thoughts on how we might monitor that the indexer has been
> running correctly? I assume that should be fairly easy if we have it
> drop a timestamp someplace?
I admit to having a ticket on pmt to get that set up.

Actually, it might be better to look into the actual database, and find the
latest email indexed? If it's older than <nn> something is wrong. It could
be the archives that are wrong and not the indexer of course, but the point is
we'll get notified and someone can look into it.
Do you think we need to track it on a per-list basis, or just check for the
latest timestamp across all lists?
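
To make the per-list versus global question concrete, here is a minimal sketch in Python. It assumes the newest indexed timestamp for each list has already been fetched from the database; the function names, the list data, and the two-day threshold are all hypothetical and are not taken from the actual indexer.

```python
from datetime import datetime, timedelta


def stale_lists(latest_by_list, now, max_age):
    """Per-list check: names of lists whose newest indexed message is older than max_age."""
    return sorted(
        name for name, ts in latest_by_list.items() if now - ts > max_age
    )


def index_is_stale(latest_by_list, now, max_age):
    """Global check: only the newest timestamp across all lists is considered."""
    if not latest_by_list:
        # Nothing indexed at all is treated as a failure.
        return True
    return now - max(latest_by_list.values()) > max_age


# Hypothetical sample data: one list is current, one stopped being indexed.
now = datetime(2008, 3, 12, 16, 0)
latest = {
    "pgsql-hackers": datetime(2008, 3, 12, 15, 30),
    "pgsql-www": datetime(2008, 2, 1, 9, 0),
}
max_age = timedelta(days=2)  # stands in for the unspecified <nn> threshold

per_list_result = stale_lists(latest, now, max_age)
global_result = index_is_stale(latest, now, max_age)
```

With this sample data the global check reports nothing wrong, because one busy list keeps the overall timestamp fresh, while the per-list check flags the list that stalled. The trade-off is that a per-list check would probably need a more generous threshold for low-traffic lists, which can legitimately go days without new mail.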
//Magnus