[SC-Help] Re: What's wrong now, Spamcop?
Mike Easter
MikeE at ster.invalid
Sun Mar 19 16:21:58 EST 2006
Garen Erdoisa wrote:
> Mike Easter wrote:
>> Garen Erdoisa wrote:
>>
>>> The spamtrap data is not included on the statistics charts they make
>>> available to the public.
>>
>> How do you know that? I always assumed that it did. Or rather that
>> the statistics were some 'subset' of the total spam processed, but
>> not
> Quote from http://members.spamcop.net/spamstats.shtml
Not only did I read that par, I cited a sentence from it. But I will
leave the par cited below so that we can all look at it again.
> <quote>
> These graphs show the number of messages submitted as spam along with
> the number of reports consumated regarding those messages. This data
> reflects more about SpamCop's usage patterns than it does about the
> spam. These numbers now reflect only a small fraction of total spam
> being processed by SpamCop, but they are still representative of the
> total. </quote>
>
> It's pretty clear to me with that statement on the website that the
> graphs do not include spamtrap data or data from other sources.
Data from other sources? That isn't even being debated here. The data
is reflective of spamcop spam, not some other spam.
But the par above, and the information from the rest of that page and
others, don't convince me of what is apparently 'clear' to you: that the
graphs do not include spamtrap data.
There isn't anything in the par you cited that sez anything about
spamtrap data one way or the other. The par only sez that the data
shows a small fraction of spam being processed. It gives no clue about
which fraction.
> Combine that with what Don D'Minion just admitted in his post further
> up in this thread about spamcop processing about 3 million emails per
> day, the rest is deductive reasoning based on the information I have
> available.
The fact that he used the term 3 million and the fact that the stats
show less only confirms that there is a subset, which is what the
sentence I cited said, not how the subset is 'chosen'.
> Don D'Minion just stated in his post further up in this thread that
> spamcop process about 3 million spams per day. Again, the rest is just
> deductive reasoning.
I am debating the quality or accuracy of your deductive reasoning.
> He's posting as a spamcop admin, so I would tend
> to trust his statement as being somewhat factual since he does have
> access to insider knowledge.
I'm not saying his information isn't factual -- I'm saying you are
making unwarranted assumptions from the 3 million term, from the subset
concept, and from the numbers shown on the graph.
If we are looking at a set of stats, and it is acknowledged that the
stats don't represent all of the data, I don't know how it makes any
difference one way or another which data, or which fraction of which
data, is 'left out'. In fact, only a 'small fraction' of the data is
shown.
A /small/ fraction? Is a small fraction 1% or 10% or 50% or what? A
'fraction' implies a fraction or a part, which one might think of as
being 'not all' but a 'significant' or noticeable fraction -- certainly
not a 'sizeable' fraction. But when you start saying a *small*
fraction, it makes me think of something perhaps less than 25% or even
below 10% -- whereas a 'tiny' fraction might be more in the 1% range.
If the 3 million figure were actually representative of the last week or
so, during which 5 million spams were shown on the 7 day graph, then the
fraction would be about 25% -- and saying 25% is a 'small' fraction
would be OK with me.
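
For what it's worth, the arithmetic behind that 'about 25%' can be
sketched as below. The variable names are made up for illustration, and
both inputs are only the round figures quoted in this thread (about 3
million processed per day, about 5 million shown on the 7 day graph),
not official SpamCop numbers.

```python
# Rough figures as quoted in this thread; both are approximations.
processed_per_day = 3_000_000   # "about 3 million" processed per day
graphed_per_week = 5_000_000    # approximate total shown on the 7 day graph

# Scale the daily figure up to the same 7 day window as the graph.
processed_per_week = processed_per_day * 7

# Share of the week's processed spam that appears on the graph.
fraction_on_graph = graphed_per_week / processed_per_week

# Share that does not appear on the graph. Nothing here says where
# that remainder comes from (spamtraps, reporters, or anything else).
fraction_not_on_graph = 1 - fraction_on_graph

print(f"on graph:     {fraction_on_graph:.1%}")      # 23.8%
print(f"not on graph: {fraction_not_on_graph:.1%}")  # 76.2%
```

So the fraction comes out a little under a quarter, and that only holds
if the 3 million per day figure really applies to the same week as the
graph.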
But, I think that if you were to 'insist' on saying that the graph is
showing some subset of reporter reports and no spamtraps, or insist on
saying that the graph is showing all of the reporter reports and no
spamtrap reports and therefore 75% of the spam processed is from
spamtraps, then I would 'insist' on saying your assumptions are
unwarranted.
--
Mike Easter
kibitzer, not SC admin