
[SpamCop-List] Can SpamCop be improved to handle non-English messages?

Patto nobody at devnull.spamcop.net
Tue Aug 16 15:13:26 EDT 2005


http://www.spamcop.net/sc?id=z796577385z87fa23e572278be28e1c21c5500232cbz

I am just a SpamCop user; I do not know what abuse departments do with 
SC complaints, i.e. whether they look at the original spam messages or 
even read them.

Whatever they do, they will not be able to read anything if the original 
spam message is not in a Western character set.  Japanese (as in this 
sample), Chinese, and Russian text is silently replaced with question marks.

Wouldn't it be better if SpamCop forwarded the original message intact, 
with the original character set encoding? I even think that this would 
not be very difficult to do.
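
To illustrate the difference, here is a minimal Python sketch. It is not 
SpamCop's actual code; the sample text and the variable names are made up 
for illustration. It shows how forcing Japanese text into a Western 
character set produces question marks, while encoding it in its own 
character set keeps it intact.

```python
# Hypothetical spam text; it stands in for the real message body.
text = "こんにちは"

# Lossy path: a Western charset cannot represent these characters,
# so each one is replaced with "?" when errors="replace" is used.
lossy = text.encode("iso-8859-1", errors="replace").decode("iso-8859-1")

# Preserving path: encode in the message's own charset instead.
preserved = text.encode("iso-2022-jp")

# ISO-2022-JP uses only 7-bit bytes, so it survives plain mail transport.
is_seven_bit = all(byte < 0x80 for byte in preserved)

print(lossy)                            # question marks only
print(preserved.decode("iso-2022-jp"))  # the original text
```

A report built from the second form would also need to carry a matching 
"Content-Type: text/plain; charset=ISO-2022-JP" header so that the 
recipient's mail client knows how to decode it.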

Below are the (munged) headers from the above report, which I user-copied 
to myself; it seems that no character set encoding is specified at all.

Microsoft Mail Internet Headers Version 2.0
Received: from em03.cincom.com ([192.168.1.13]) by im02.cincom.com with 
Microsoft SMTPSVC(6.0.3790.211);
	 Tue, 16 Aug 2005 00:57:02 -0400
Received: from vmx2.spamcop.net ([64.74.133.250]) by em03.cincom.com 
with Microsoft SMTPSVC(6.0.3790.211);
	 Tue, 16 Aug 2005 01:00:06 -0400
Received: from sc-app2.eq.ironport.com (HELO spamcop.net) (192.168.19.202)
   by vmx2.spamcop.net with SMTP; 15 Aug 2005 21:56:15 -0700
Received: from [218.42.148.249] by spamcop.net
	with HTTP; Tue, 16 Aug 2005 04:56:13 GMT
From: <1489166764 at reports.spamcop.net>
To: x
Subject: [SpamCop ( ) 
id:1489166764]=?ISO-2022-JP?B?GyRCNVUxZyRDJEZLXEV2JEskIiRrJE4kKy..
Precedence: list
Message-ID: <rid_1489166764 at msgid.spamcop.net>
Date: Tue, 16 Aug 2005 00:38:04 -0400
X-SpamCop-sourceip: 61.197.117.209
X-Mailer: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7.10) 
Gecko/20050721 Firefox/1.0.6
	via http://www.spamcop.net/ v1.471
Return-Path: 1489166764.e90161a2 at bounces.spamcop.net
X-OriginalArrivalTime: 16 Aug 2005 05:00:07.0046 (UTC) 
FILETIME=[5EF0EA60:01C5A21F]

P.S. I just noticed that the original Japanese spam message also didn't 
specify a charset encoding, so my sample is a bit flawed. Still, if a 
charset encoding *is* specified in the spam, I think it would be 
beneficial for the outgoing report to use the same encoding (and for 
multibyte characters not to be replaced by question marks).
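
As a rough sketch of that idea (again, not SpamCop's implementation; the 
function names declared_charset and report_content_type are hypothetical), 
the charset declared by the original message could be read with Python's 
standard email package and reused for the outgoing report. When the spam 
declares nothing, as in my sample, the sketch simply omits the charset 
parameter rather than guessing one.

```python
from email import message_from_bytes
from typing import Optional


def declared_charset(raw_message: bytes) -> Optional[str]:
    """Return the charset named in Content-Type, or None if none is declared."""
    msg = message_from_bytes(raw_message)
    return msg.get_content_charset()


def report_content_type(raw_message: bytes) -> str:
    """Build a Content-Type value for the report that mirrors the original."""
    charset = declared_charset(raw_message)
    if charset is None:
        # Nothing declared in the spam: do not invent a charset.
        return "text/plain"
    return f"text/plain; charset={charset}"


# Hypothetical inputs: one message declares a charset, the other does not.
with_charset = b"Content-Type: text/plain; charset=ISO-2022-JP\r\n\r\nbody\r\n"
without_charset = b"Subject: test\r\n\r\nbody\r\n"

print(report_content_type(with_charset))
print(report_content_type(without_charset))
```

Note that get_content_charset() returns the name in lower case, which is 
harmless because charset names are case-insensitive.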

