Re: Non English Spam
- From: Erik Norgaard <norgaard@xxxxxxxxxxxx>
- Date: Sat, 14 Oct 2006 14:38:08 +0200
Beech Rintoul wrote:
I'm getting a ton of spam every day that comes from China, Japan and Korea. SpamAssassin completely ignores it because it consists entirely of non-English characters, and it slows kmail to a crawl while loading. Is there a way to filter on non-English text using either SpamAssassin or procmail?
I get none since adding a few simple filter rules to Postfix:
# Accepted mime headers: (ASCII, UTF-8 and ISO-8859-X)
/^Content-Type:.*?charset\s*=\s*"?(us-ascii|iso-8859-\d+|utf-8)"?/
    OK HDR2000 Accepted charset: $1
Strictly speaking, you could reject every other character set, but I chose to make the rejections explicit:
# Reject specific character sets
# Chinese, Japanese and Korean
/^Content-Type:.*?charset\s*=\s*"?(Big5|gb2312|euc-cn)"?/
    REJECT HDR2100: Unaccepted character set: "$1"
/^Content-Type:.*?charset\s*=\s*"?(euc-kr|iso-2022-kr)"?/
    REJECT HDR2110: Unaccepted character set: "$1"
/^Content-Type:.*?charset\s*=\s*"?(iso-2022-\w+|euc-jp|shift_jis)"?/
    REJECT HDR2120: Unaccepted character set: "$1"
# Cyrillic character sets: Russian/Ukrainian
/^Content-Type:.*?charset\s*=\s*"?(koi8-(?:r|u))"?/
    REJECT HDR2200: Unaccepted character set: "$1"
/^Content-Type:.*?charset\s*=\s*"?(windows-(?:1250|1251))"?/
    REJECT HDR2210: Unaccepted character set: "$1"
And then you may want a catch-all rule at the end to catch unknown character sets:
/^Content-Type:.*?charset\s*=\s*"?([\w-]+)"?/
    WARN HDR2299: Unknown character set: "$1"
You may change WARN to REJECT.
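To try the rule order before touching a live mail server, the first-match behaviour can be approximated in a few lines of Python. This is only a rough sketch and not how Postfix evaluates the table internally: the function name check_header and the "DUNNO" fallback value are made up for illustration, the catch-all pattern is assumed to use [\w-]+ so that it captures the whole charset name, and matching is made case-insensitive because Postfix PCRE tables match case-insensitively by default.

```python
import re

# (pattern, action) pairs in the same order as the rules in the post.
# As in a Postfix lookup table, the first matching pattern wins.
RULES = [
    (r'^Content-Type:.*?charset\s*=\s*"?(us-ascii|iso-8859-\d+|utf-8)"?', "OK"),
    (r'^Content-Type:.*?charset\s*=\s*"?(Big5|gb2312|euc-cn)"?', "REJECT"),
    (r'^Content-Type:.*?charset\s*=\s*"?(euc-kr|iso-2022-kr)"?', "REJECT"),
    (r'^Content-Type:.*?charset\s*=\s*"?(iso-2022-\w+|euc-jp|shift_jis)"?', "REJECT"),
    (r'^Content-Type:.*?charset\s*=\s*"?(koi8-(?:r|u))"?', "REJECT"),
    (r'^Content-Type:.*?charset\s*=\s*"?(windows-(?:1250|1251))"?', "REJECT"),
    # Catch-all (assumed form): any other charset name.
    (r'^Content-Type:.*?charset\s*=\s*"?([\w-]+)"?', "WARN"),
]


def check_header(line):
    """Return (action, charset) for the first rule that matches the header line.

    Hypothetical helper for experimenting with the rules; returns
    ("DUNNO", None) when no rule matches at all.
    """
    for pattern, action in RULES:
        match = re.search(pattern, line, re.IGNORECASE)
        if match:
            return action, match.group(1)
    return "DUNNO", None
```

Calling check_header('Content-Type: text/plain; charset="gb2312"') falls through the accept rule and stops at the first REJECT rule, while a charset that none of the explicit rules name, such as windows-1252, only reaches the catch-all.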
I have noticed, however, that some subscribers to this list write English encoded in one of the above character sets. I don't know enough about the character set definitions, but it seems that the English characters are a subset of any character set?
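The subset observation can be checked for a handful of these character sets with Python's standard codecs. The sketch below is only a spot check under stated assumptions: the sample string and the list of codec names are chosen for illustration, the sample avoids backslash and tilde (which some Japanese encodings treat specially), and the ISO-2022 family is left out because those encodings use escape sequences.

```python
# Plain English text, restricted to unproblematic ASCII characters.
SAMPLE = "Hello, list. This is plain English text."

# Python codec names for several of the character sets rejected above.
CODECS = [
    "gb2312", "big5", "euc_kr", "euc_jp", "shift_jis",
    "koi8_r", "koi8_u", "cp1250", "cp1251",
]


def ascii_compatible(codec, text=SAMPLE):
    """Return True if `text` encodes to the same bytes in `codec` as in ASCII."""
    return text.encode(codec) == text.encode("ascii")


# Map each codec name to the result of the comparison.
results = {codec: ascii_compatible(codec) for codec in CODECS}
```

For these codecs the encoded bytes are identical to the ASCII bytes, which is why an English message labelled with one of these character sets still reads normally even though the header rules reject it.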
What is the recommended policy here? Should subscribers be advised to change character set when posting to the list?
Cheers, Erik
--
Ph: +34.666334818 web: http://www.locolomo.org
X.509 Certificate: http://www.locolomo.org/crt/8D03551FFCE04F0C.crt
Key ID: 69:79:B8:2C:E3:8F:E7:BE:5D:C3:C3:B1:74:62:B8:3F:9F:1F:69:B9