Re: Formatting of known spam domains [was URL not chased at all?]
John E. Malmberg
wb8tyw at qsl.network
Mon Jun 27 02:06:48 EDT 2005
Sean Sowell wrote:
> On Sunday, June 26, 2005 0209, Berny noted in a separate thread:
>>It seems that the spamvertised URL http://paperiness .net/ ...
> Am rather new to the Spamcop list, so pardon me if this isn't appropriate. I
> subscribe via the parallel mailing list, not the newsgroup itself.
The newsgroup is much easier to use.
> The above domain is apparently well-recognized by _all_ of the SURBLs.
> SpamAssassin scored it as such, making all three of these messages false
Of course it will, a lot of false positives are going to happen if you
apply a content based spam filter to all of your incoming messages,
especially from a spam discussion group.
> For known spam domains, could posters please add an extra space or some other
> 'munge' character by the dot? This way, similar messages will get past SA and
> into the intended folder to be read by human eyes.
Why not fix the flaw in your SpamAssassin setup?
A hit in the SURBLs should not be enough on its own to classify a message
as spam to be rejected (preferred) or tagged/deleted/reported.
First require at least one of the following tests to also fail:
1. Sender has no rDNS - Over 99% chance of spam. No real reason for
most mail servers to even accept the connection, so no reason to analyze
content unless you know you must communicate with someone whose mail
server is that broken and who does not need to communicate with AOL.COM
and a lot of other domains.
2. Sender rDNS is bad - Over 80% chance of spam. This should be usable
as an absolute test, but apparently the operators of a small number of
mail servers that ATT and AOL cannot get away with blocking cannot make
those servers comply with the RFC requirements for valid rDNS.
[This reason also tends to invalidate most of the FUSSP that people
dream up, like SPF, because they require the sending mail server operator
to do something even more complicated than simply setting a DNS record.
Do a web search on FUSSP and spam.]
3. Sender I.P. in list.dsbl.org, sbl-xbl.spamhaus.org or ordb.org - over
99.999% chance of spam. No real reason for most mail servers to accept
the connection, so no reason to bother with a content check.
4. Sender I.P. in dul.dnsbl.sorbs.net over 95 % chance of spam. Most
mail server operators I know seem to think that is enough to reject the
message. But you might get "lucky" and find a non-spam in that.
5. Sender I.P. in bl.spamcop.net from 80 to 95 % chance of spam.
6. Sender I.P. in dnsbl.sorbs.net returning 127.0.0.6 means that there is
a chance that the message is spam. The 127.0.0.6 zone tends to list
one or more major ISP mail servers at any given time, so using it alone
will produce a lot of false positives.
7. Sender I.P. in unconfirmed.dsbl.org or multihop.dsbl.org means that
there is a chance the message is spam, but it could be a real message.
8. Sender I.P. is in another aggressive list like SPEWS, or one of several
country-specific DNSbls like China, Korea, or Brazil, or ISP-specific
ones for networks that have given you too much spam in the past but that
you do not want to completely block.
9. The rDNS indicates that this is a DYNAMIC I.P. either by having too
many numbers in it, or by having the strings "pool", "dial", "dhcp",
"dip", or "dyn" in it.
10. There is a " HTTP" mentioned in the headers, indicating it came from
a web mailer. This is probably an Advance Fee Scam without any URLs
that will fail a test. The presence of "PHP" or "PHP-NUKE" with the "
HTTP" also seems to reliably indicate a Nigerian 419 or Advance Fee
Scam.
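Test 9 in the list above is simple enough to sketch in a few lines of
Python. This is only an illustration, not anything SpamAssassin ships:
the function name looks_dynamic and the threshold for "too many numbers"
(more than three separate digit groups) are assumptions of mine, since
no exact count is given.

```python
import re

# Substrings named in test 9 as hints of a dynamic address.
DYNAMIC_HINTS = ("pool", "dial", "dhcp", "dip", "dyn")

# Assumption: "too many numbers" is taken to mean more than three
# separate digit groups in the host name; no threshold is given.
MAX_DIGIT_GROUPS = 3


def looks_dynamic(rdns_name: str) -> bool:
    """Return True if the rDNS name resembles a dynamic-pool host."""
    name = rdns_name.lower()
    # Plain substring match on the hint strings.
    if any(hint in name for hint in DYNAMIC_HINTS):
        return True
    # Count runs of digits, e.g. "host-10-20-30-40" has four.
    digit_groups = re.findall(r"\d+", name)
    return len(digit_groups) > MAX_DIGIT_GROUPS
```

A plain substring match is crude: a legitimate host such as
diplomat.example.com contains "dip" and would be flagged, which is one
more reason to treat this as a contributing test and not a verdict.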
In short, the URL check should be a tie breaker only, and never an
And since several spamvertised URLs seem to have a lifetime of less than
72 hours, it may be more accurate to use the test in SpamAssassin 3.0
that resolves the URL to an I.P. address and then tests that I.P.
address against sbl-xbl.spamhaus.org.
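For anyone who has not looked at how such a check works, here is a
rough Python sketch of the idea. It is not SpamAssassin's code; the
function names are made up for illustration. It relies only on the usual
DNSBL convention of reversing the IPv4 octets and appending the zone
name, and it treats any A record answer as "listed", which ignores the
return codes that some lists use.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNSBL lookup name: reversed IPv4 octets + zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on bad input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers with an A record for this IP."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False  # no such name: not listed (or the lookup failed)


def url_host_listed(hostname: str,
                    zone: str = "sbl-xbl.spamhaus.org") -> bool:
    """Resolve a URL's host name, then test its address against the zone."""
    try:
        ip = socket.gethostbyname(hostname)
    except socket.gaierror:
        return False  # host does not resolve; score that case separately
    return is_listed(ip, zone)
```

For example, dnsbl_query_name("192.0.2.1", "sbl-xbl.spamhaus.org") gives
"1.2.0.192.sbl-xbl.spamhaus.org", and that name is what gets looked up.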
The spamvertised URL list will tend to be about 4 to 8 hours behind
what the spammers are currently using. The test that checks the I.P.
address that the URL resolves to will mainly fail on URLs that the
spammer has long abandoned, on URLs whose DNS has been pulled, and in a
few cases on spam runs started before the spammer got their DNS working.
The spammers change URLs often, but there are only a few networks that
will sell them I.P. addresses. The spammers stopped using bare I.P.
addresses at the same time AOL.COM implemented a policy not to accept
e-mail containing a URL with a bare I.P. address.
An e-mail with a URL that does not resolve is almost as suspicious to a
scoring system as one that does resolve. There is a slight chance
that it is a typographical error by a human, and if a mail server were
to reject on that, the human who sent the original message would quickly
send a correction.
And the total score for any URL check should be the same. Once a URL
check plus the header defects have indicated that a message is spam, there
is no reason for any more tests to be done on the message. The extra
data is meaningless for that.
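Put as code, the tie-breaker rule might look like the Python sketch
below. The function name classify and its two counters are hypothetical,
not part of any real filter. It models only the URL rule: a URL hit
counts the same no matter how many URL tests fired, and it decides
nothing unless at least one header test has also failed. Header defects
that justify refusing the connection outright (tests 1 and 3, for
instance) would be handled before this point.

```python
def classify(header_defects: int, url_hits: int) -> str:
    """Apply the URL check as a tie breaker only.

    header_defects: number of connection/header tests that failed.
    url_hits: number of URL tests that fired; any count above zero
              contributes the same, so extra hits add nothing.
    """
    if header_defects >= 1 and url_hits >= 1:
        return "spam"  # decided; no further tests are needed
    # A URL hit alone, or header defects alone, is left to other rules.
    return "undecided"
```

With this rule a message from a clean sender that merely quotes a listed
domain, such as a post to a spam discussion list, comes out "undecided"
and not "spam".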
From what I have seen, content filtering (other than a virus scan) on
messages that have correct rDNS, are not in even an "aggressive" DNSbl,
and have no other header anomalies such as coming from an HTTP mailer,
is just as likely to produce false positives as it is to catch
additional spam among those same messages.
wb8tyw at qsl.network
Personal Opinion Only