Spam Filtering

Last Updated: May 10, 2004
For help in setting up SpamAssassin and procmail, see the MFCF FAQ

SpamAssassin

I get just over 1000 junk messages a week (as of April 1, 2004). SpamAssassin catches a lot of junk email, but I've found that it still lets a lot through. By watching which tests the accepted spam messages trigger and increasing the scores of those tests, I have further reduced the amount of junk email that gets past SpamAssassin.

To use my scores (last updated May 10, 2004), insert them into your SpamAssassin user_prefs file (~/.spamassassin/user_prefs on Unix). My user_prefs file has some comments at the top; find the same comments in your user_prefs file and insert my scores there.

Whitelists

Using my scores, some non-junk messages will be tagged as spam. You may want to store your spam messages and scan them to find mis-classified messages. In particular, conference email for reviews and for paper acceptance notification might be classified as spam.

You will probably want to set up a whitelist. I whitelist uwaterloo.ca (although this means I get a fair amount of spam that has a forged From: at uwaterloo.ca) by putting the following lines in my user_prefs file:

whitelist_from *.uwaterloo.ca
whitelist_from *@uwaterloo.ca
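
Both lines are needed because each pattern covers a different form of address. The Python sketch below illustrates this. It approximates SpamAssassin's glob-style matching with Python's fnmatch module, which is not SpamAssassin's own code; the sample addresses are made up, and ignoring case is an assumption.

```python
from fnmatch import fnmatchcase

# The two whitelist patterns from the user_prefs lines above.
PATTERNS = ["*.uwaterloo.ca", "*@uwaterloo.ca"]

def matching_patterns(address):
    """Return the whitelist patterns that match a sender address."""
    # Lowercase the address so the comparison ignores case (an assumption).
    addr = address.lower()
    return [p for p in PATTERNS if fnmatchcase(addr, p)]

# Hypothetical sender addresses, for illustration only.
for sender in ["alice@uwaterloo.ca", "bob@math.uwaterloo.ca", "eve@example.com"]:
    print(sender, "->", matching_patterns(sender))
```

An address directly at uwaterloo.ca matches only the second pattern, while an address at a subdomain such as math.uwaterloo.ca matches only the first.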

If you look at the scores in my user_prefs file, a few are listed as 95. This is an attempt to reduce the amount of spam coming from uwaterloo accounts. If you don't whitelist uwaterloo, you may wish to reduce these scores from 95 to 4 or 5. Then again, since these 95 scores are all for forged headers, a score of 95 may be fine.
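
The arithmetic behind the 95 can be sketched in a few lines of Python. This is only an illustration, assuming the stock SpamAssassin values: a whitelist match scores -100 and a message is tagged as spam at 5 points or more. The 12 points from "other tests" is a made-up figure.

```python
# Assumed stock values: a whitelist match scores -100, spam threshold is 5.
WHITELIST_SCORE = -100.0
THRESHOLD = 5.0

def is_spam(test_scores, whitelisted):
    """Sum the triggered test scores; return (tagged as spam?, total)."""
    total = sum(test_scores) + (WHITELIST_SCORE if whitelisted else 0.0)
    return total >= THRESHOLD, total

# Whitelisted sender, forged-header test at 95, plus 12 points of other tests.
print(is_spam([95.0, 12.0], whitelisted=True))   # (True, 7.0)
# Same message with the forged-header test at 5: the whitelist wins.
print(is_spam([5.0, 12.0], whitelisted=True))    # (False, -83.0)
# Sender not whitelisted: the 95 alone is far over the threshold.
print(is_spam([95.0], whitelisted=False))        # (True, 95.0)
```

With the whitelist in place, a 95 nearly cancels the -100, so a forged uwaterloo message needs only a few more points to be tagged. Without the whitelist, 95 and 5 both tag the message on their own.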

You will also likely want to whitelist any conference accounts, colleagues, etc. Air Canada is another one to whitelist: if you cash in frequent flier miles online, their email is usually classified as spam even with the default SpamAssassin score settings:

whitelist_from *@aircanada.ca

Blacklists

You will probably want to blacklist the following addresses, since these accounts don't exist and any message from them is likely a virus:

blacklist_from support@uwaterloo.ca
blacklist_from staff@uwaterloo.ca
blacklist_from management@uwaterloo.ca

Course Accounts

If you forward email from a course account to your own account, you have an additional source of spam. For CS488, we took a more heavy-handed approach: we only accept email from uwaterloo.ca accounts or messages that have "cg:" in the subject line (we tell students that they must include this in messages sent to the course account from off campus). We used procmail to achieve this, putting the line
"|/software/procmail/bin/procmail"
in the course .forward file and using this .procmailrc file. In this .procmailrc file, you will want to change the part that reads
user1@uwaterloo.ca user2@uwaterloo.ca
to the list of accounts to which the course email is to be forwarded. Also, you may want to use a subject string other than "cg:" to accept messages for your course ("cg" stands for "computer graphics").
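
The acceptance rule described above can be modelled in a few lines of Python. This is only a sketch of the logic, not the contents of the .procmailrc file. The function name accept, the case-insensitive subject match, and the acceptance of uwaterloo.ca subdomains are assumptions made for illustration.

```python
from email.utils import parseaddr

SUBJECT_TAG = "cg:"             # course-specific tag from the text
CAMPUS_DOMAIN = "uwaterloo.ca"

def accept(from_header, subject):
    """Return True if the course account should forward this message."""
    # Pull the bare address out of a header such as 'Alice <alice@uwaterloo.ca>'.
    _, addr = parseaddr(from_header)
    domain = addr.rpartition("@")[2].lower()
    # Assumption: subdomains of uwaterloo.ca count as on campus.
    on_campus = domain == CAMPUS_DOMAIN or domain.endswith("." + CAMPUS_DOMAIN)
    # Assumption: the subject tag is matched without regard to case.
    tagged = SUBJECT_TAG in subject.lower()
    return on_campus or tagged

# Hypothetical messages, for illustration only.
print(accept("Alice <alice@uwaterloo.ca>", "Question about A2"))
print(accept("carol@example.com", "cg: question from home"))
print(accept("dave@example.com", "Cheap meds"))
```

To adapt the rule to another course, change SUBJECT_TAG in this model, just as you would change the "cg:" string in the procmail recipe.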