[SpamCop-List] Re: New Spam Filters

John E. Malmberg wb8tyw at qsl.network
Thu Apr 13 12:36:54 EDT 2006


In article <e1hbc8$l71$1 at news.spamcop.net>,
  Garen Erdoisa <scamper at trisk.com> writes:
> John E. Malmberg wrote:
>> The difference is if you want the error rate of your spam filtering to be
>> visible to outsiders.
>
> Heh, I don't like the idea of providing stats to outsiders on that
> level. If I did I'd post the stats on my website. Giving such reject
> stats to outsiders also helps spammers improve their methods to bypass
> filtering. So, I choose to keep most of the stats private. I'd rather
> let the spammers wonder.

I meant legitimate senders that had their e-mail mis-classified.

>>
>> Have you tried using the feature in SpamAssassin known as URIBL_SBL?
>
> I don't use SpamAssassin, I use SpamBouncer which has similar features.
> SpamAssassin is perl based, while SpamBouncer is procmail based. Over
> the years I've grown so familiar with procmail that I actually prefer
> using it over most other email filters. It's a personal choice.

Have you tried the feature that checks the I.P. addresses of URLs in suspect
e-mails against the sbl-xbl.spamhaus.org list?

I suspect that will prove a very accurate addition to the scoring.
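
For anyone who has not set this up, here is a minimal sketch of that check in
Python. It assumes the usual DNSBL convention (reverse the IPv4 octets, append
the zone, and treat any A record as a listing). The function names, the URL
regular expression, and the injectable `resolve` parameter are hypothetical
and are not taken from SpamAssassin or SpamBouncer.

```python
import ipaddress
import re
import socket
from urllib.parse import urlparse

ZONE = "sbl-xbl.spamhaus.org"  # zone named in the message above


def dnsbl_query_name(ip, zone=ZONE):
    """Build the DNSBL lookup name: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError if not valid IPv4
    return ".".join(reversed(str(addr).split("."))) + "." + zone


def url_hosts(body):
    """Return the distinct host names of http/https URLs found in a body."""
    hosts = []
    for url in re.findall(r"https?://[^\s<>\"']+", body):
        try:
            host = urlparse(url).hostname
        except ValueError:
            continue  # malformed URL, skip it
        if host and host not in hosts:
            hosts.append(host)
    return hosts


def is_listed(ip, zone=ZONE, resolve=socket.gethostbyname):
    """True if the zone returns an A record for the IP.

    Assumption: any answer means "listed". A real filter should also
    inspect the returned 127.0.0.x code.
    """
    try:
        resolve(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False  # no record (or lookup failure): treat as not listed


def listed_url_hosts(body, resolve=socket.gethostbyname):
    """Resolve each URL host and return those whose address is listed."""
    hits = []
    for host in url_hosts(body):
        try:
            ip = resolve(host)
        except socket.gaierror:
            continue  # host does not resolve, nothing to check
        if is_listed(ip, resolve=resolve):
            hits.append(host)
    return hits
```

A hit from `listed_url_hosts` would then add to the message score rather than
reject outright; how much weight to give it is a local policy choice.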

> I have to disagree with you on one point though. In my experience
> bayesian filtering is quite effective on a per user basis, so long as
> each user trains their own filter.

The one built into my copy of Mozilla has proved to be untrainable, and makes
errors both in missing spam and in flagging good mail.

>> I would only do content filtering (except de-worming) on e-mail that has
>> something suspect in the headers:
>>
>>   1. Bad rDNS
>>   2. Missing rDNS unless you decide to just reject these.
>>   3. I.P. address on aggressive list such as spamcop.net, or a multihop list
>>      such as multihop.dsbl.org.
>>   4. A "With http" shows up in the headers indicating a web mailer.
>>      (419 favorites)
>>   5. I.P. address in a DHCP pool unless you just reject these.
>>   6. rDNS has "ppp", "dyn", "pool" or "dhcp" in a subnet.
>>
>> From what I have seen, sbl-xbl.spamhaus.org + dul.dnsbl.sorbs.net +
>> list.dsbl.org will catch well over 90% and close to 99% of the spam delivery
>> attempts, with the remaining spam coming from one of the sources that can be
>> classified above.
>
> I agree for the stats, however any of the above can also have a lot of
> false positives. I've tried most of those techniques above at one point
> or other. The most I do with any of them is bump the spam score a bit
> and/or tag the email.

I have not heard of any false positives yet from xbl.spamhaus.org, and in
the past 8 years have only had two good e-mails rejected by a DHCP list. In
both cases the sending network's TOS prohibited running mail servers on
DHCP addresses.

list.dsbl.org may have some hits from networks that have had a security
problem, and have a significant error in their mail server configuration,
such as the ECN bug, non-working Postmaster/abuse addresses, bad rDNS, or
silently deleting Postmaster/abuse e-mail.

> Unfortunately the RFC's do not require a valid reverse DNS to transmit
> email.

Actually current RFCs require a valid reverse DNS on all servers connected
to the public internet, not just for sending e-mail.

I do not have the number handy, but I have seen multiple confirmations
of its existence and the requirement.

AOL for quite a while has been refusing e-mail from I.P. addresses with no
rDNS.  They and ATT discovered that while many real mail servers have a broken
rDNS, almost none have no rDNS at all.
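
The distinction between "no rDNS" and "broken rDNS" can be sketched in Python
as follows. This is only an illustration of the idea, not how AOL or anyone
else implements it. The function name and the three result labels are
hypothetical, and the lookups are passed in as parameters so the logic can be
tried without touching real DNS.

```python
import socket


def classify_rdns(ip, reverse=socket.gethostbyaddr,
                  forward=socket.gethostbyname_ex):
    """Return 'none', 'broken', or 'ok' for the reverse DNS of an IP address.

    'none'   : no PTR record at all
    'broken' : a PTR record exists but does not map back to the same IP
    'ok'     : PTR name resolves forward to a set of addresses including ip

    Note: a temporary DNS failure is not distinguished from a missing
    record here; a real mail server would want to defer in that case.
    """
    try:
        # Both calls return (hostname, aliases, addresses).
        name = reverse(ip)[0]
    except (socket.herror, socket.gaierror):
        return "none"
    try:
        addresses = forward(name)[2]
    except (socket.herror, socket.gaierror):
        return "broken"  # the PTR name does not resolve at all
    return "ok" if ip in addresses else "broken"
```

Under the policy described above, only the "none" result would be refused
outright; "broken" would at most raise suspicion.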

> As for the web mailers, I can't afford to block email that has such
> headers. That also would result in false positives. I get a lot of good
> email that use those methods to send email.

I was not advocating blocking, just giving them a higher potential spam
score for special handling.
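
To make the "score, do not block" idea concrete, here is a rough Python sketch
of turning the header checks quoted earlier into signals and a score. The
signal names and weights are made up for illustration (the discussion gives no
numbers), and `rdns_state` is assumed to be one of the hypothetical labels
"none", "broken", or "ok" produced by some earlier rDNS check.

```python
import re

# Substrings from the quoted list that suggest a dynamic or dial-up host.
# Plain substring matching is crude: "dyn" also matches "dynamo.example.com".
DYNAMIC_HINTS = ("ppp", "dyn", "pool", "dhcp")

# Hypothetical weights, chosen only for the example.
WEIGHTS = {
    "rdns_missing": 3,
    "rdns_broken": 2,
    "rdns_dynamic": 2,
    "listed_ip": 3,
    "web_mailer": 1,
}


def header_signals(rdns_name, rdns_state, received_headers, listed=False):
    """Collect the header-level reasons to look at a message more closely."""
    signals = set()
    if rdns_state == "none":
        signals.add("rdns_missing")
    elif rdns_state == "broken":
        signals.add("rdns_broken")
    if rdns_name and any(h in rdns_name.lower() for h in DYNAMIC_HINTS):
        signals.add("rdns_dynamic")
    if listed:  # sending IP found on an aggressive or multihop list
        signals.add("listed_ip")
    # A "with HTTP" clause in a Received header indicates a web mailer.
    if any(re.search(r"\bwith\s+http", h, re.IGNORECASE)
           for h in received_headers):
        signals.add("web_mailer")
    return signals


def header_score(signals):
    """Sum the weights of the signals; nothing is rejected here."""
    return sum(WEIGHTS[s] for s in signals)


def needs_content_filtering(signals):
    """Only messages with something suspect in the headers get content checks."""
    return bool(signals)
```

A web-mailer message with otherwise clean headers would pick up only the
small "web_mailer" weight, so it is flagged for closer inspection rather than
refused.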

Myself, if I were running a mail server, I would only use tagging for an
unproven or new spam detection method, with a notification to the users
that if they see tagged good mail, they need to contact me immediately,
because anything that is tagged by that algorithm may eventually be rejected
by the mail server.

>>
>> You generally do not want to block discussions about spam containing samples,
>> nor do you want to issue SMTP rejects to a mailing list that may have had a
>> slight spam leakage.
>
> I agree. But that is what postmaster@ and abuse@ are for, at least to
> start such discussions.

I do not get e-mail from any of those addresses, as I am not a postmaster.

My broadband ISP prohibits running any server, and will not sell that service
to any of their customers.

Things might be different when their new owner takes over, but based on
other discussions, the available I.P. addresses will probably be in SPEWS
and other private blocking lists.

-John
wb8tyw at qsl.network
Personal Opinion Only

