
[SC-Help] Re: What's wrong now, Spamcop?

Garen Erdoisa scamper at trisk.com
Sun Mar 19 20:43:27 EST 2006


Mike Easter wrote:
> Garen Erdoisa wrote:
> 
>> I really don't think that we are in disagreement here, as I stated in
>> my post quoting myself "These are my thoughts on this issue". That
>> means that it's my opinion. No where in it did I say or even suggest
>> that my opinion was a fact. You seem to have inferred otherwise.
> 
> Oh, I see.  I tho't maybe you had some other information as a basis for
> your opinion.
> 
> If I'm understanding you correctly, you are saying "On the basis of
> this, I think the stats reflect reporter reports and not spamtrap
> reports."

It's my opinion (and assumption) that the green data on the charts
currently reflects spam submitted by reporters, excluding spamtrap data
and data from other sources such as spamfilter data feeds.

It's also my opinion (and assumption) that the blue line on those charts
represents the number of reports sent to various abuse desks when
reporters choose to submit.

I have looked at those charts off and on for a couple of years. For a
time (well over a year ago) the charts appeared to show both spamtrap
and reporter data, and then the numbers suddenly dropped off. So I think
spamcop has in the past changed the data sets on the fly without
indicating the change to the rest of us. Again, that's just my opinion
and assumption. I never asked spamcop for clarification and didn't
really care at the time.

That data has since scrolled off the chart so isn't relevant any longer 
anyway.

> 
> Then I have another question.  Do you think that the stats reflect a
> 'sizeable percentage' of the reporter reports, and therefore one could
> surmise that 75% of the 3 million spams are derived from spamtraps, or

I currently think that the charts show 100% of the reporter reports, and 
0% of data from spamtrap sources and other data sources. I do know for a 
fact that spamcop has other data sources than reporters and spamtraps.

Unless a spamcop admin wants to pipe in and clarify the issue, as far as 
I'm concerned that assumption stands for the purposes of this discussion 
in this thread. I could change my opinion at any time if new information 
becomes available.

To clarify, when I say "other data sources" I mean sources other than
spamtraps. These include, but are not necessarily limited to,
distributed spam feeds from both SpamAssassin (with a plugin) and
Spambouncer spamfilter software. There are possibly others, though I
don't know of any offhand.

Such distributed spam data feeds wouldn't necessarily be put in the same
category as a spamtrap. Though similar in nature, they would most likely
be treated differently, since the data feeds are coming from real email
accounts in many if not most cases.

I don't know if that data is included on the charts or not, but at this
time it's my opinion and assumption that it's not included on those charts.

Those spamfilters are also capable of automatic submission of normal 
reports that are later confirmed by the users.

Somewhere on these boards a long time ago I read a post about
SpamAssassin that stated it had its own reporting address for the filter
itself.

Spambouncer definitely has its own reporting address for the spam
filter that is completely independent from a "reporter" account. From my
perspective (from the outside), it's like a black hole address. Stuff
fed to it is never seen again by me. Only spamcop admins would have
access to that data feed. I know this for a fact because I wrote that
section of code in Spambouncer, and I use Spambouncer to auto submit
spam that meets the appropriate criteria to spamcop for accounts on my
domain that are so enabled.

I also know for a fact that others use Spambouncer in this mode. I've
helped others set this up. I don't know how many use SpamAssassin like
this, and I haven't actually looked hard at that code, but from what I
have read on usenet and these spamcop boards I can make a reasonable
assumption at this point in time that people use SpamAssassin to do the
same thing as I do with Spambouncer.

Based on all of that, I can make a reasonable assumption that such data
feeds are not included on the charts. If they were included, I would
have expected to see a significant increase in the data on the charts
when those feeds went live.

The charts did show an increase around last October to November, but not
really enough to say whether the increase was due to those data feeds
going live or due to other things going on. (By live I mean the stable
release of Spambouncer, as opposed to beta testing.)

To be honest I've never asked spamcop for more information about this 
and didn't really care. Still don't. I care more that they have the data 
available to make use of than how it affects the statistics they choose 
to display to the public.

I think that based on Don's statement, I can make a reasonable 
assumption that spamtrap data is not currently included on the charts.

But yes, these are just assumptions and opinions we are talking about
here, i.e. educated guesses.

I don't know how many users spamcop has, and I don't really care.
That would have a bearing on the statistics though, since you could
then infer the average number of reports submitted per person per day,
which might be interesting to know.

> do you think that the stats only reflect a small percentage of the
> reporter reports, and therefore one would surmise that a much much
> smaller percentage than 75% of the 3 million are derived from spamtraps?
> 
> Along the lines of what Jim was asking:
> 
> Jim wrote:
>>> Does he know how many users there are of SPAMCOP?
>> I would like to know how many users of SpamCop there are.  What
>> percentage of spam reports are coming from users versus Spam traps.
> 
> 

With all of that said, I think I'm more curious now than I was before
this thread about just how spamcop derives its data for the charts.

I'd also like to see more charts that show a spam feed breakdown by
data source: reporters, spamtraps, SpamAssassin and Spambouncer
distributed feeds, etc. I doubt spamcop will do that, though I guess we
could ask them to. Who knows, maybe if enough people show an interest in
that sort of data they'll do it.

Garen

