Spam

From Antiflux Wiki

Current revision (00:54, 30 January 2011)

Overview

We hate spam and viruses. Our goal is to block 100% of them without blocking any legitimate email. We also realize that our goal is nearly impossible to reach in the real world. As a compromise, we attack the problem in a few different ways.

Spamhaus: Conservative policy to bounce messages based on sending server address

The server uses the Spamhaus SBL, XBL, and PBL to reject mail from known spammers. Rejected messages are bounced back to the sender with an explanation of why they were rejected. End users do not need to configure anything.

Postgrey: Make mail servers work a tiny bit harder to prove they're for real

As a second line of defence, okcomputer uses Postgrey to temporarily delay mail from servers the first time they connect. The idea behind greylisting is that spammers use bulk mail servers that simply try to deliver messages as quickly as possible. Their business is built on volume, so they don't care about reliability - if they can't deliver a message to a particular email address right away, they just give up and try the next one. When a real mail server can't deliver a message immediately, it queues the message and tries again later. Postgrey on okcomputer tells the remote mail server to try again in 5 minutes. If the mail server comes back later and tries again, Postgrey will accept the message and remember the exchange for next time. Once a mail server proves that it isn't a spammer, Postgrey stops delaying mail from that server.
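
The greylisting behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how Postgrey is implemented: the Greylist class, its method names, and the addresses are made up for the example, and it keys on the server address alone, whereas Postgrey itself tracks more detail about each delivery attempt. The 300 second delay corresponds to the "try again in 5 minutes" behaviour.

```python
class Greylist:
    """Toy greylist keyed on the connecting server's IP address (an assumption)."""

    def __init__(self, delay_seconds=300):
        self.delay_seconds = delay_seconds
        self.first_seen = {}  # server IP -> time of its first delivery attempt

    def check(self, client_ip, now):
        """Return "defer" (come back later) or "accept" for one delivery attempt."""
        # Remember when this server first showed up; later calls reuse that time.
        first = self.first_seen.setdefault(client_ip, now)
        if now - first < self.delay_seconds:
            return "defer"
        return "accept"


greylist = Greylist()
server = "192.0.2.10"  # hypothetical sending server

first_try = greylist.check(server, now=0)      # first contact: told to retry
early_retry = greylist.check(server, now=60)   # retried too soon: still deferred
later_retry = greylist.check(server, now=300)  # retried after 5 minutes: accepted
print(first_try, early_retry, later_retry)     # defer defer accept
```

A bulk sender that never retries only ever reaches the first call, so its message is never accepted. A real mail server queues the message, retries, and is accepted from then on.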

Greylisting has proven very effective on okcomputer. It catches a lot of spam and uses little in the way of system resources. However, some mail servers may be very broken and not understand the "wait for 5 minutes and try again" message. If mail sent to your antiflux.org account bounces, let root@antiflux.org know and we can probably fix the problem.

SpamAssassin: More aggressive spam analysis with user-configurable filtering

Okcomputer runs incoming mail through SpamAssassin before delivering it. If SpamAssassin determines that the email is spam, it tags the message with special headers like the ones below but it does not reject it. Users can configure their email programs to automatically delete the tagged email or sort it into a special folder. This gives users control over how they want to filter their email, by choosing thresholds and particular tests.

X-Spam-Status: Yes, hits=10.7 required=5.0 tests=FROM_STARTS_WITH_NUMS,
        FROM_ENDS_IN_NUMS,NO_REAL_NAME,CLICK_BELOW,WEB_BUGS,BIG_FONT,
        CLICK_HERE_LINK,CTYPE_JUST_HTML version=2.20
X-Spam-Level: **********

For specific directions on configuring your email program to filter mail based on header information, we suggest reading the UBC spam filtering page. The examples below show the headers we insert to let your email program identify spam and viruses.

You can configure your email application to check the email header (not the email body!) for either "X-Spam-Status: Yes" if you trust the system default threshold or "X-Spam-Level: ****" (adjust the number of * characters) if you want to pick your own threshold.

You can also filter based on the "tests=" section. SpamAssassin performs a long list of tests on each message and tags the message with the names of the tests that indicate the message's possible spamminess. For example, if you want to filter out email from servers listed in the relays.ordb.org database, set your email program to check the X-Spam-Status header for the string " RCVD_IN_RELAYS_ORDB_ORG".
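
If your email program lets you script its filters, the same header checks can be done in code. The following is a rough Python sketch, assuming the header layout shown in the example earlier in this section. The spam_info function and the SAMPLE message are made up for illustration, and other SpamAssassin versions may lay out the header differently.

```python
import re
from email import message_from_string

# The example headers from this page, followed by a dummy body.
SAMPLE = (
    "X-Spam-Status: Yes, hits=10.7 required=5.0 tests=FROM_STARTS_WITH_NUMS,\n"
    "        FROM_ENDS_IN_NUMS,NO_REAL_NAME,CLICK_BELOW,WEB_BUGS,BIG_FONT,\n"
    "        CLICK_HERE_LINK,CTYPE_JUST_HTML version=2.20\n"
    "X-Spam-Level: **********\n"
    "\n"
    "message body\n"
)


def spam_info(message_text):
    """Pull the spam flag, star count, and test names out of the headers."""
    msg = message_from_string(message_text)
    # The header is folded over several lines; join it back into one.
    status = " ".join(msg.get("X-Spam-Status", "").split())
    # The test names run from "tests=" up to the next "name=" field.
    tests = re.search(r"tests=(.*?)(?: \w+=|$)", status)
    names = tests.group(1).split(",") if tests else []
    return {
        "flagged": status.startswith("Yes"),
        "stars": msg.get("X-Spam-Level", "").count("*"),
        "tests": [name.strip() for name in names if name.strip()],
    }


info = spam_info(SAMPLE)
# Pick your own rule: four or more stars, or one particular test.
wants_filtering = info["stars"] >= 4 or "RCVD_IN_RELAYS_ORDB_ORG" in info["tests"]
```

Only the headers are inspected, never the message body, which matches the advice above.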

DSPAM: no longer used

We used to use DSPAM as another layer of defence against spam. However, it used a lot of system resources and greylisting has been so effective that we decided to disable DSPAM.

Reference: System Log message

Amavis: Virus scanning

On top of analyzing messages for spam characteristics, okcomputer also uses Amavis to scan for viruses. Like SpamAssassin, Amavis inserts special headers like the ones below into the messages to let the end user decide what to do with incoming viruses.

X-Virus-Scanned: by amavisd-new-20030616-p10 (Debian) at antiflux.org
X-Amavis-Alert: INFECTED, message contains virus: Worm.Bagle.Gen-zippwd,
    Worm.Bagle.Gen-zippwd

Statistics

We also keep some statistics: http://antiflux.org/mrtg/spam-day.png

Filtering spam using procmail

Procmail is a very flexible mail processor with many uses, including sorting incoming mail into different folders. One big advantage of procmail is that it processes your mail when it arrives instead of when you check it. This can save you the time and frustration associated with downloading many spam messages.

Step 1: You don't need to configure your account to use procmail for mail delivery

This is not really a step, just a reference for people reading other procmail tutorials. Most places will tell you to create a .forward file with something like "| /usr/bin/procmail" in it. Don't do that for your Antiflux account. All mail already goes through procmail.

Step 2: Create files and directories

We suggest that you keep a procmail log so that you can keep track of what it's doing. This is helpful in case anything goes wrong. Create a .procmail directory in your home directory.

Create a .procmailrc file in your home directory and add the following lines to it. This assumes that your mail is stored in a directory called "mail" in your home directory. If you use Pine, this is the default.

MAILDIR=$HOME/mail
PMDIR=$HOME/.procmail
LOGFILE=$PMDIR/log

Step 3: Filter spam detected by SpamAssassin

Add the following to your .procmailrc file to filter anything SpamAssassin classifies as spam into a mailbox called "spamassassin".

:0:
* ^X-Spam-Status: Yes
spamassassin

This uses the default SpamAssassin threshold set by the Antiflux management. We tend to be a little conservative, so we may set the threshold a bit high to prevent legitimate email from being tagged as spam, even if it means a few spams slip by. If you would like to set your own threshold, you can filter based on the X-Spam-Level header like this.

:0:
* ^X-Spam-Level: \*\*\*\*
spamassassin.level4
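
Note that the condition above matches any message with four or more stars, because the pattern is only anchored at the start of the header line. The Python sketch below illustrates this. It uses Python's re module as a stand-in for procmail's own matcher (the two agree for a pattern this simple), and the level_condition and matches helpers are made up for the example.

```python
import re


def level_condition(stars):
    """Build a procmail-style condition line for `stars` or more asterisks."""
    return "* ^X-Spam-Level: " + r"\*" * stars


def matches(condition, header_line):
    """Check a header line against a condition, roughly the way procmail would."""
    pattern = condition[2:]  # drop the leading "* " that marks a condition line
    # procmail ignores case by default, so do the same here.
    return re.search(pattern, header_line, re.IGNORECASE) is not None


condition = level_condition(4)
print(condition)                                      # * ^X-Spam-Level: \*\*\*\*
print(matches(condition, "X-Spam-Level: ***"))         # False
print(matches(condition, "X-Spam-Level: ****"))        # True
print(matches(condition, "X-Spam-Level: **********"))  # True
```

To pick a different threshold, change the number of \* groups in the recipe. More stars means fewer messages get filed as spam.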

Step 4: Filter viruses

Add the following to the end of your .procmailrc file to have procmail dump mail containing identified viruses to a folder called "virus".

:0:
* ^X-Amavis-Alert: INFECTED
virus

A note about "This email scanned by [...]" messages

Some systems, typically run by corporate IT departments with something to prove, like to advertise that they scan outgoing email for viruses and spam. You'll often see something like this.


Date: Fri, 5 Mar 2004 13:40:22 -0700 (MST)
From: William 'Bill' Lumbergh
To: Peter Gibbons
Subject: new cover sheets for TPS reports

Hey Peter, what's happening? Just wanted to let you know that we're putting
those new cover sheets on all TPS reports before sending them out now, so
if you can remember to do that from now on, that would be great.

Bill Lumbergh
"My other car is also a Porsche"

==================================================================
This message certified virus-free by CompuGlobal HyperScanner 2000
Enterprise Edition.
http://www.compuglobalhypermeganet.com/
==================================================================

That text at the end is worthless from a security point of view and is really just advertising for the scanner software. Since it's only plain text, it would be trivial for a virus to add it to every message it sends out. Indeed, some viruses are starting to do just that. There might be some value in cryptographically signing the message so that people can verify it using a public key, but that's beyond the abilities of casual email users.

Scanning outgoing mail is essentially worthless because there's no easy, secure way for the recipient to trust the sender's scanner. It's up to the recipient's mail client (or mail server) to scan incoming mail. It's also nicer to add headers to the email rather than adding text to the message body so that it's less distracting to the user.
