public inbox for [email protected]  
From: Greg Sabino Mullane <[email protected]>
To: [email protected]
Subject: Re: Spam filtering on the mailing lists
Date: Thu, 17 Jul 2008 15:54:41 -0000
Message-ID: <[email protected]> (raw)
In-Reply-To: <[email protected]>


-----BEGIN PGP SIGNED MESSAGE-----
Hash: RIPEMD160


> Its sad how this is such an ongoing problem, but this is the first that I hear
> that ppl are having problems ... looking at the message headers for a random
> few, I notice that they are scoring >4, but just below 5:
...
> I can change the quarantining to be >4 if ppl want, which should greatly reduce
> the # of messages going through ...

I think that would be a good start, but there are definitely some other problems.
First, the example you gave:

> X-Spam-Status: No, hits=4.855 tagged_above=0 required=5 tests=AWL=-1.994,
> DCC_CHECK=1.37, DIGEST_MULTIPLE=0.001, HTML_MESSAGE=0.001,
> MIME_HTML_ONLY=1.672, RAZOR2_CHECK=0.5, RCVD_IN_BL_SPAMCOP_NET=2.188,
> RCVD_IN_SORBS_WEB=1.117

A score of 0.001 for HTML_MESSAGE? Might as well not have the check at all. Same
with things like DIGEST_MULTIPLE. I think we need more checks, and much higher
scores for many of them.
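
To make the arithmetic in that header concrete, here is a minimal Python
sketch that parses the quoted X-Spam-Status value, sums the per-test scores,
and compares the total against the current cutoff of 5 and the proposed
cutoff of 4. The function name parse_spam_status is my own, and the ">="
comparison is an assumption about how the cutoff is applied, not something
taken from the list server's configuration.

```python
import re

def parse_spam_status(header):
    """Parse an X-Spam-Status value into (hits, required, tests)."""
    # Unfold continuation lines into one space-separated string.
    flat = " ".join(header.split())
    hits = float(re.search(r"hits=(-?[\d.]+)", flat).group(1))
    required = float(re.search(r"required=(-?[\d.]+)", flat).group(1))
    tests = {}
    # Everything after the first "tests=" is a comma-separated NAME=score list.
    for item in flat.split("tests=", 1)[1].split(","):
        name, _, score = item.strip().partition("=")
        tests[name] = float(score)
    return hits, required, tests

# Header value copied from the quoted example above.
HEADER = """No, hits=4.855 tagged_above=0 required=5 tests=AWL=-1.994,
 DCC_CHECK=1.37, DIGEST_MULTIPLE=0.001, HTML_MESSAGE=0.001,
 MIME_HTML_ONLY=1.672, RAZOR2_CHECK=0.5, RCVD_IN_BL_SPAMCOP_NET=2.188,
 RCVD_IN_SORBS_WEB=1.117"""

hits, required, tests = parse_spam_status(HEADER)
total = round(sum(tests.values()), 3)

print("reported hits:", hits, "sum of tests:", total)
# Assumption: a message is quarantined when its score is >= the cutoff.
for cutoff in (5.0, 4.0):
    print("cutoff", cutoff, "->", "quarantine" if hits >= cutoff else "deliver")
```

The two 0.001 tests contribute 0.002 in total, which is why they make no
practical difference to the outcome.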

I grabbed a few random messages from the bugs list last night. Most interesting
was that some had no X-Spam-Status headers at all - does this mean they slipped
through the spam filtering entirely? Here's one of them:

===
Return-Path: <[email protected]>
Delivered-To: [email protected]
Received: from localhost (unknown [200.46.204.183])
        by postgresql.org (Postfix) with ESMTP id C3148650275
        for <[email protected]>; Wed, 16 Jul 2008 15:40:45 -0300 (ADT)
Received: from postgresql.org ([200.46.204.86])
 by localhost (mx1.hub.org [200.46.204.183]) (amavisd-maia, port 10024)
 with ESMTP id 48600-04-3 for <[email protected]>;
 Wed, 16 Jul 2008 15:40:43 -0300 (ADT)
X-Greylist: from auto-whitelisted by SQLgrey-1.7.6
Received: from wwwmaster.postgresql.org (wwwmaster.postgresql.org [217.196.146.204])
        by postgresql.org (Postfix) with ESMTP id AB1D565026D
        for <[email protected]>; Wed, 16 Jul 2008 15:40:44 -0300 (ADT)
Received: from wwwmaster.postgresql.org (wwwmaster.postgresql.org [217.196.146.204])
        by wwwmaster.postgresql.org (8.13.8/8.13.8) with ESMTP id m6GIehuA007983
        for <[email protected]>; Wed, 16 Jul 2008 18:40:43 GMT
        (envelope-from [email protected])
Received: (from www@localhost)
        by wwwmaster.postgresql.org (8.13.8/8.13.8/Submit) id m6GIehIP007982;
        Wed, 16 Jul 2008 18:40:43 GMT
        (envelope-from www)
Date: Wed, 16 Jul 2008 18:40:43 GMT
Message-Id: <[email protected]>
To: [email protected]
Subject: BUG #4310: PkMERMInZQ
From: "make money on line" <[email protected]>
Content-Type: text/plain; charset=utf-8
X-Virus-Scanned: Maia Mailguard 1.0.1


The following bug has been logged online:

Bug reference:      4310
Logged by:          make money on line
Email address:      [email protected]
PostgreSQL version: IUrjkiPgQkQXNgo
Operating system:   aJzBuaSGetA
Description:        PkMERMInZQ
Details:

<a href=" http://www.divinecaroline.com/public/user/profile?user_id=83997
">work at home jobs 101waystoincome.com</a>

===

Did it get whitelisted because it came from our form? I still think we
should scan it - the "make money on line" is a dead giveaway, and
when I ran a local spamassassin on it, I even found:

 2.0 URIBL_BLACK            Contains an URL listed in the URIBL blacklist
                            [URIs: 101waystoincome.com]
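
For anyone who wants to repeat the "no X-Spam-Status header" check on their
own copies of list mail, a minimal sketch using Python's standard email
module follows. The helper name lacks_spam_status and the two trimmed sample
messages are illustrative only; the header lines in them are copied from
the examples earlier in this message.

```python
from email import message_from_string

def lacks_spam_status(raw_message):
    """Return True if the message has no X-Spam-Status header at all."""
    msg = message_from_string(raw_message)
    return msg.get("X-Spam-Status") is None

# Trimmed copy of the bug-form message: virus-scanned, but no spam header.
UNSCANNED = """Subject: BUG #4310: PkMERMInZQ
Content-Type: text/plain; charset=utf-8
X-Virus-Scanned: Maia Mailguard 1.0.1

The following bug has been logged online:
"""

# Illustrative message carrying the spam header from the quoted example.
SCANNED = """X-Spam-Status: No, hits=4.855 tagged_above=0 required=5 tests=AWL=-1.994,
 DCC_CHECK=1.37, DIGEST_MULTIPLE=0.001, HTML_MESSAGE=0.001,
 MIME_HTML_ONLY=1.672, RAZOR2_CHECK=0.5, RCVD_IN_BL_SPAMCOP_NET=2.188,
 RCVD_IN_SORBS_WEB=1.117

(body omitted)
"""

for label, raw in (("bug form", UNSCANNED), ("quoted example", SCANNED)):
    print(label, "-> missing X-Spam-Status:", lacks_spam_status(raw))
```

Header lookup in the email module is case-insensitive, so a differently
capitalized header name would still be found.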


Here's another one from last night that did have a spam header. I apologize
for how long this post is getting, but I'm trying to provide some hard data:
===

Return-Path: <[email protected]>
Delivered-To: [email protected]
Received: from localhost (unknown [200.46.204.183])
        by postgresql.org (Postfix) with ESMTP id AFB3A64FD01
        for <[email protected]>; Wed, 16 Jul 2008 23:15:20 -0300 (ADT)
Received: from postgresql.org ([200.46.204.86])
 by localhost (mx1.hub.org [200.46.204.183]) (amavisd-maia, port 10024)
 with ESMTP id 35883-07 for <[email protected]>;
 Wed, 16 Jul 2008 23:15:11 -0300 (ADT)
X-Greylist: domain auto-whitelisted by SQLgrey-1.7.6
Received: from smtp1web.tin.it (smtp1web.tin.it [212.216.176.195])
        by postgresql.org (Postfix) with ESMTP id 8ECBB64FCE4
        for <[email protected]>; Wed, 16 Jul 2008 23:15:17 -0300 (ADT)
Received: from pswm6.cp.tin.it (192.168.70.26) by smtp1web.tin.it (8.0.016.5)
        id 48623AD8015C5727; Thu, 17 Jul 2008 03:59:43 +0200
Message-ID: <[email protected]>
Date: Thu, 17 Jul 2008 02:59:41 +0100 (GMT+01:00)
From: "Tajana for(Mrs. Lucy Berg)" <[email protected]>
Reply-To: [email protected]
Subject: REMINDER NOTIFICATION
Mime-Version: 1.0
Content-Type: text/plain;charset="UTF-8"
Content-Transfer-Encoding: 7bit
X-Originating-IP: 62.163.243.54
To: undisclosed-recipients:;
X-Virus-Scanned: Maia Mailguard 1.0.1
X-Spam-Status: No, hits=1.806 tagged_above=0 required=5
 tests=SUBJ_ALL_CAPS=1.806
X-Spam-Level: *

REMINDER NOTIFICATION

This email is to notify you that your Email
Address attached to a
Ticket Number(140408) has won an Award Sum of
($500,000.00)(Five
Hundred Thousand Dollars)In an Email Sweepstakes
program held in
The Netherlands these year 2008.Please contact the
claim officer
through the below given contact information.

MR.HANSON
CHRIS.
TEL. +31-643-502-787.
FAX: +31-847-290-539.
E-mail:cpinans@aol.
nl

WINNING INFORMATIONS
Ref Number:Nl50286
lucky Numbers:
07,12,24,36,45
Batch Number:EU-175508
Ticket Number:360208

Please
forward the above stated winning information to your Claim
Agent and do
include the following,

Your Name:
Telephone Number:

Congratulations!!!

Yours Sincerely,
Mrs. Lucy Berg.
Public Relation
Officer.

===

The only spam trigger found by postgresql.org was:

X-Spam-Status: No, hits=1.806 tagged_above=0 required=5
 tests=SUBJ_ALL_CAPS=1.806

There are numerous triggers in the body of the email that should
have boosted the score up. Personally, I'd also like to see
SUBJ_ALL_CAPS raised to 3 or 4.

So, to reiterate, I'd like to request the following:

1) Spam filtering is run on all messages
2) The default score threshold for rejection is lowered from 5 to 4 or below
3) The values get raised significantly for some tests
4) More SA tests get added (are we at least cronning sa-update?)
5) If 3 and 4 are too much trouble to maintain, outsource the
filtering to someone who does have the time, or who specializes
in it (economies of scale)
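
As a rough what-if for items 2 and 3, the sketch below recomputes a message's
total after overriding individual test scores. The rescore helper is
hypothetical, the scores are the ones shown in this message, and the ">="
comparison against a cutoff of 4 is an assumption about how the threshold
would be applied.

```python
def rescore(tests, overrides):
    """Sum test scores, replacing any test named in overrides."""
    merged = {name: overrides.get(name, score) for name, score in tests.items()}
    return round(sum(merged.values()), 3)

CUTOFF = 4.0  # proposed threshold from item 2 (assumed to be inclusive)

# The sweepstakes message: the only test that fired on postgresql.org.
lottery = {"SUBJ_ALL_CAPS": 1.806}
# The bug-form message: the only hit from my local spamassassin run.
bug_form = {"URIBL_BLACK": 2.0}

as_scored = rescore(lottery, {})
raised = rescore(lottery, {"SUBJ_ALL_CAPS": 4.0})
form_total = rescore(bug_form, {})

print("lottery, current score:", as_scored, "caught:", as_scored >= CUTOFF)
print("lottery, SUBJ_ALL_CAPS=4:", raised, "caught:", raised >= CUTOFF)
print("bug form, URIBL_BLACK only:", form_total, "caught:", form_total >= CUTOFF)
```

Under these assumptions, lowering the threshold alone catches neither
message; the lottery mail is caught only once SUBJ_ALL_CAPS is raised, and
the bug-form mail would still need additional tests to fire.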

I did #5 myself years ago, after getting tired of updating SA rules,
messing with DNS lookups, blacklists, etc. and now just let
maillaunder.com handle it all.

- --
Greg Sabino Mullane [email protected]
PGP Key: 0x14964AC8 200807171149
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8

-----BEGIN PGP SIGNATURE-----

iEYEAREDAAYFAkh/atsACgkQvJuQZxSWSsjKKwCg4Pc0SNrYjfUZuJRQZjU6jDHR
oc0An0vTdKzfIJ3+CxQXpw7TZyWu0Tb6
=a3/E
-----END PGP SIGNATURE-----




