
[SpamCop-List] Re: Formatting of known spam domains [was URL not chased at all?]

John E. Malmberg wb8tyw at qsl.network
Mon Jun 27 02:06:48 EDT 2005


Sean Sowell wrote:
> On Sunday, June 26, 2005 0209, Berny noted in a separate thread:
> 
>>It seems that the spamvertised URL http://paperiness .net/ ...
>  
> Am rather new to the Spamcop list, so pardon me if this isn't appropriate.  I
> subscribe via the parallel mailing list, not the newsgroup itself.

The newsgroup is much easier to use.

> The above domain is apparently well-recognized by _all_ of the SURBLs.
> SpamAssassin scored it as such, making all three of these messages false
> positives.

Of course it will.  A lot of false positives are going to happen if you 
apply a content-based spam filter to all of your incoming messages, 
especially those from a spam discussion group.

> For known spam domains, could posters please add an extra space or some other
> 'munge' character by the dot?  This way, similar messages will get past SA and
> into the intended folder to be read by human eyes.

Why not fix the flaw in your SpamAssassin setup?

A hit in the SURBLs should not be enough by itself to classify a message 
as spam, whether it is then rejected (preferred) or tagged, deleted, or 
reported.

First require at least one of the following tests to also fail:

1. Sender has no rDNS - Over 99% chance of spam.  No real reason for 
most mail servers to even accept the connection, so no reason to analyze 
content unless you know you must communicate with someone whose mail 
server is that broken and who does not need to communicate with AOL.COM 
and a lot of other domains.

2. Sender rDNS is bad - Over 80% chance of spam.  This should be usable 
as an absolute test, but apparently the operators of a small number of 
mail servers that ATT and AOL cannot get away with blocking cannot make 
them comply with the RFC requirements for valid rDNS.
[This reason also tends to invalidate most of the FUSSPs that people 
dream up, like SPF, because they require the sending mail server to do 
something even more complicated than the simple setting of a DNS record. 
Do a web search on FUSSP and spam.]

3. Sender I.P. in list.dsbl.org, sbl-xbl.spamhaus.org or ordb.org - Over 
99.999% chance of spam.  No real reason for most mail servers to accept 
the connection, so no reason to bother with a content check.

4. Sender I.P. in dul.dnsbl.sorbs.net - Over 95% chance of spam.  Most 
mail server operators I know seem to think that is enough to reject the 
message.  But you might get "lucky" and find a non-spam in that.

5. Sender I.P. in bl.spamcop.net - From 80 to 95% chance of spam.

6. Sender I.P. lookup in dnsbl.sorbs.net returns 127.0.0.6 - There is a 
chance that the message is spam.  The 127.0.0.6 zone tends to list one 
or more major ISP mail servers at any given time, so using it alone 
will produce a lot of false positives.

7. Sender I.P. in unconfirmed.dsbl.org or multihop.dsbl.org - There is 
a chance the message is spam, but it could be a real message.

8. Sender I.P. is in another aggressive list like SPEWS, or in one of 
several country-specific DNSbls (China, Korea, or Brazil, for example) 
or ISP-specific ones, covering sources that have given you too much spam 
in the past but that you do not want to block completely.

9. The rDNS indicates that this is a DYNAMIC I.P. either by having too 
many numbers in it, or by having the strings "pool", "dial", "dhcp", 
"dip", or "dyn" in it.

10. There is a " HTTP" mentioned in the headers, indicating it came from 
a web mailer.  This is probably an Advance Fee Scam without any URLs 
that will fail a test.  The presence of "PHP" or "PHP-NUKE" with the " 
HTTP" also seems to reliably indicate a Nigerian 419 or Advance Fee 
Lottery Scam.
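
As a rough illustration of test 9, the rDNS name check could be sketched 
in Python as below.  The substrings are the ones listed in the test; the 
threshold for "too many numbers" is not given above, so MAX_DIGIT_GROUPS 
is purely an assumption, as is the function name.

```python
import re

# Substrings named in test 9.
DYNAMIC_HINTS = ("pool", "dial", "dhcp", "dip", "dyn")

# Assumption: "too many numbers" means more than this many digit runs.
MAX_DIGIT_GROUPS = 3


def looks_dynamic(rdns_name: str) -> bool:
    """Guess whether an rDNS name belongs to a dynamic I.P. address."""
    name = rdns_name.lower()
    if any(hint in name for hint in DYNAMIC_HINTS):
        return True
    # Count separate runs of digits, e.g. "host-192-0-2-17" has four.
    digit_groups = re.findall(r"\d+", name)
    return len(digit_groups) > MAX_DIGIT_GROUPS
```

A plain substring match is crude: "dip" would also match a host such as 
diplomat.example.com, so a result like this is better treated as one 
defect among several than as a verdict.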

In short, the URL check should be a tie breaker only, and never an 
absolute check.
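
The tie-breaker rule can be sketched in a few lines of Python.  This is 
only an illustration of the rule as stated here, not SpamAssassin 
configuration; the function and parameter names are made up for the 
example.

```python
def url_hit_decides(header_defects: int, url_listed: bool) -> bool:
    """Return True when a listed URL should tip a message into spam.

    header_defects is the number of header tests (1-10 above) that failed.
    """
    if header_defects == 0:
        # Clean headers: a SURBL hit alone never classifies the message.
        return False
    return url_listed


# A list post with clean headers that merely quotes a listed domain:
print(url_hit_decides(header_defects=0, url_listed=True))   # False
# The same URL from a sender that also failed a header test:
print(url_hit_decides(header_defects=1, url_listed=True))   # True
```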

And since several spamvertised URLs seem to have a lifetime of less than 
72 hours, it may be more accurate to use the test in SpamAssassin 3.0 
that resolves the URL to an I.P. address and then tests that I.P. 
address against sbl-xbl.spamhaus.org.
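
For anyone curious what such a test involves, here is a minimal Python 
sketch of the idea.  It is not SpamAssassin's implementation; the 
function names and the injectable resolve parameter are assumptions made 
for the example.  A DNSBL is queried by reversing the octets of the I.P. 
address and looking that name up under the list's zone; any answer means 
"listed".

```python
import ipaddress
import socket
from urllib.parse import urlsplit

# Zone named in this message; pass a different one as needed.
DEFAULT_ZONE = "sbl-xbl.spamhaus.org"


def dnsbl_query_name(ip: str, zone: str = DEFAULT_ZONE) -> str:
    """Build the DNSBL lookup name: reversed IPv4 octets plus the zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def url_host_listed(url: str, resolve=socket.gethostbyname,
                    zone: str = DEFAULT_ZONE) -> bool:
    """Return True if the URL's host resolves to an I.P. listed in zone."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    try:
        ip = resolve(host)
    except OSError:
        return False  # host does not resolve, so this test cannot hit
    try:
        resolve(dnsbl_query_name(ip, zone))
    except OSError:
        return False  # no record in the zone: not listed
    return True
```

The resolve argument exists only so the lookup can be replaced with a 
stub when trying the code without touching real DNS.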

The spamvertised URL list will tend to be about 4 to 8 hours behind on 
what the spammers are currently using.  The test that checks the I.P. 
address that the URL resolves to will mainly fail on URLs that the 
spammer has long abandoned, on URLs whose DNS has been pulled, and in a 
few cases on URLs where the spammer started the spam run before getting 
their DNS working.

The spammers change URLs often, but there are only a few networks that 
will sell them I.P. addresses.  The spammers stopped using bare I.P. 
addresses at the same time AOL.COM implemented a policy not to accept 
e-mail containing a URL with a bare I.P. address.

An e-mail with a URL that does not resolve is almost as suspicious to a 
scoring system as one that does resolve.  There is a slight chance 
that it is a typographical error by a human, and if a mail server were 
to reject on that, the human who sent the original message would quickly 
send a correction.


And the total score for any URL check should be the same.  Once a URL 
check plus the header defects have indicated that a message is spam, 
there is no reason to run any more tests on the message.  The extra 
data is meaningless at that point.

From what I have seen, most content filtering besides a virus scan, when 
applied to messages that have correct rDNS, are not even in an 
"aggressive" DNSbl, and have no other header anomalies such as coming 
from an HTTP mailer, is just as likely to produce false positives as it 
is to catch additional spam among those same types of messages.

-John
wb8tyw at qsl.network
Personal Opinion Only

