Message-ID: <937d27e10808010623v1fdce54cx63ea27da385a18f8@mail.gmail.com>
Date: Fri, 1 Aug 2008 14:23:39 +0100
From: dpage@pgadmin.org
To: "Magnus Hagander"
Subject: Re: Email search failure
Cc: "Joshua D. Drake" , "Tom Lane" , "Bruce Momjian" , "PostgreSQL www"
In-Reply-To: <489303E6.40508@hagander.net>
References: <200807312019.m6VKJvR02505@momjian.us> <23897.1217539560@sss.pgh.pa.us> <48927FEF.8070200@commandprompt.com> <489303E6.40508@hagander.net>

Any idea why there were no alerts? Are we monitoring pgsql-aardvarks
instead of pgsql-zebras?

On 8/1/08, Magnus Hagander wrote:
> Joshua D. Drake wrote:
>> Tom Lane wrote:
>>> Bruce Momjian writes:
>>>> Why is the email below now appearing in a search?
>>>
>>> Probably because nothing has gotten indexed for a month or more.
>>> Whoever is supposed to maintain the archive indexer has been
>>> on vacation since it broke ...
>>
>> That would be Magnus and you are correct. He just got back. The problem
>> (last I checked) is an issue with Russian emails.
>
> Looking at it now. That clearly wasn't the only problem, because there
> was a "sleep 1800" process that had been running since July 3. Logfiles
> weren't touched etc. Just restarting it fixed that part, which clearly
> somebody else could've done as well ;)
>
> I found the bug with the Russian emails, btw. It seems mhonarc encoded
> the invalid UTF8 sequences inside valid HTML escape entities. And the
> code applied the "fix broken UTF8" logic *before* it unescaped the HTML
> entities. Now it does it both before and after.
>
> Oh, and this should never have affected messages on -hackers for
> example, because it was always processed before ru-general. It would hit
> the PUG lists, -www, -patches and a few others.
>
> //Magnus

-- 
Dave Page
EnterpriseDB UK: http://www.enterprisedb.com
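
For anyone following along, here is a rough sketch of the ordering bug Magnus describes above. It is not the actual indexer code. It assumes the invalid bytes show up in the MHonArc output as numeric entities such as `&#255;`, and that unescaping turns each one back into a raw byte. The names `unescape_to_bytes`, `fix_broken_utf8`, `old_pipeline` and `new_pipeline` are made up for illustration.

```python
import re

# Assumption: invalid bytes are hidden as decimal numeric entities (&#NNN;).
_NUMERIC_ENTITY = re.compile(rb"&#(\d+);")


def unescape_to_bytes(data: bytes) -> bytes:
    """Hypothetical unescape step: turn &#NNN; (NNN < 256) into raw byte NNN."""
    def repl(match: "re.Match[bytes]") -> bytes:
        value = int(match.group(1))
        # Leave anything that does not fit in one byte untouched.
        return bytes([value]) if value < 256 else match.group(0)
    return _NUMERIC_ENTITY.sub(repl, data)


def fix_broken_utf8(data: bytes) -> bytes:
    """Replace invalid UTF-8 sequences with U+FFFD; valid text is unchanged."""
    return data.decode("utf-8", errors="replace").encode("utf-8")


def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def old_pipeline(data: bytes) -> bytes:
    # Fix first, unescape second: the entities are plain ASCII, so the
    # fix sees nothing wrong and the bad bytes reappear afterwards.
    return unescape_to_bytes(fix_broken_utf8(data))


def new_pipeline(data: bytes) -> bytes:
    # Fix both before and after unescaping, as described in the thread.
    return fix_broken_utf8(unescape_to_bytes(fix_broken_utf8(data)))


# 0xD0 followed by 0xFF is not a valid UTF-8 sequence.
sample = b"Subject: &#208;&#255; test"
print(is_valid_utf8(old_pipeline(sample)))
print(is_valid_utf8(new_pipeline(sample)))
```

The second pass costs little, since `fix_broken_utf8` leaves already-valid input unchanged, so running it both before and after is safe for messages that were never broken.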