The Anti-Spam Wizard

Basics

KMail does not have a built-in spam detection solution: the developers believe that using external, specialized tools is the better approach. KMail uses these tools through its flexible filter architecture. The Anti-Spam Wizard helps you with the initial filter setup.

What can the wizard do to help you?

It will give you some choices about how you want the spam filtering to be set up. Afterwards it will automatically create the appropriate filter rules.

What are the limitations of the wizard?

All it can do is set up the filters for you; it will provide a standard setup. Manual modifications that have been applied to existing anti-spam filters are not recognized. Instead, such filters are overwritten by the wizard.

You can activate the wizard via Tools → Anti-Spam Wizard.... If this choice is not available, click Settings → Configure KMail... → Plugins and check the box next to Antispam. You will be prompted to restart KMail; when you do, the wizard will appear on the Tools menu.

The wizard scans for known anti-spam tools on your computer. It is also possible to use the results of spam checks made by your service provider, by evaluating header information which has been added to the messages. You can let the wizard prepare KMail to use one or more of these in parallel. However, note that anti-spam tool operations can be time consuming, and KMail can appear to be frozen while messages are scanned for spam. Please consider deleting the filter rules created by the wizard if the filtering becomes too slow for you. (This has been a problem with older hardware. It probably won't afflict more modern machines.)

Here are some observations about a few anti-spam tools.

Bogofilter

Bogofilter is a Bayesian filter. Its spam detection ability relies on an initial training phase. On the other hand, it's a pretty fast tool. That's why it is recommended for people who want fast spam detection, and who aren't worried about putting some effort into the initial training, before the detection rate increases significantly.

SpamAssassin

SpamAssassin is a fairly complex tool to use against spam. Although its behavior depends heavily on its configuration, it can detect spam quite well without any training. However, scanning a message takes a little longer compared to pure Bayesian filters. It is not the tool of choice for people without some background knowledge of SpamAssassin's capabilities.

Annoyance Filter

Annoyance Filter is perhaps not used so often, at least until more distributions pick it up. It's clearly a tool for specialists.

GMX Spam Filter

If you get your mail via the GMX freemail provider, your messages have already been scanned for spam. The result of that process is documented in a special header field in each message. It's possible to use the content of this header field to filter out spam. There is very little slowdown in the filtering when this tool is used, as the messages have already been processed by the external email server.
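As a rough illustration of header-based filtering, the following Python sketch reads a provider's verdict from a message header instead of running a local scanner. The header name X-Provider-Spam and the value "yes" are placeholders invented for this example; the actual field and values written by GMX are not given here.

```python
from email import message_from_string

# Hypothetical header name and value, used for illustration only.
SPAM_HEADER = "X-Provider-Spam"
SPAM_VALUE = "yes"


def flagged_by_provider(raw_message: str) -> bool:
    """Return True if the (assumed) provider header marks the message as spam."""
    msg = message_from_string(raw_message)
    verdict = msg.get(SPAM_HEADER, "")
    return verdict.strip().lower() == SPAM_VALUE


sample = (
    "From: someone@example.com\n"
    "Subject: Cheap offers\n"
    "X-Provider-Spam: Yes\n"
    "\n"
    "Buy now!\n"
)
print(flagged_by_provider(sample))
```

Because only a header lookup is involved, no external program has to be started for each message, which is why this kind of check adds very little delay.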

Advanced

KMail can use several external tools to detect spam messages; it will try to automatically find out which tools are installed on your system, and will display all of these in a list. The list is ordered by the average speed of the filtering process of the tools. You can mark the tools which you want KMail to utilize to detect spam. If you want more choices, you can simply close the wizard, install a new tool, then restart the wizard.

If you have marked at least one tool, KMail is able to provide filters which allow the classification of the messages as spam or not spam. It will also provide actions to let you manually classify messages. These actions will be available in the Message → Apply Filter menu, and also via a pair of icons on the toolbar. If any of the tools you selected support Bayesian filtering (i.e. a method to detect spam based on statistical analysis of the messages), then the messages you classify manually are not only marked but additionally piped through the tools to help them learn, thereby improving their detection rate.
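To show why manual classification improves the detection rate, here is a toy word-frequency classifier in Python. It is a simplified sketch of the general Bayesian idea, not the algorithm used by Bogofilter, SpamAssassin, or any other real tool, and the class and method names are made up for this example.

```python
import math
from collections import Counter


class ToyBayesFilter:
    """Toy word-frequency classifier; not the algorithm of any real tool."""

    def __init__(self):
        self.words = {"spam": Counter(), "ham": Counter()}
        self.messages = {"spam": 0, "ham": 0}

    def learn(self, text, label):
        """Record the words of a message that the user classified by hand."""
        self.words[label].update(text.lower().split())
        self.messages[label] += 1

    def _log_score(self, text, label):
        vocab = len(set(self.words["spam"]) | set(self.words["ham"]))
        total = sum(self.words[label].values())
        # Add-one smoothing keeps every probability above zero.
        score = math.log(
            (self.messages[label] + 1) / (sum(self.messages.values()) + 2)
        )
        for word in text.lower().split():
            score += math.log((self.words[label][word] + 1) / (total + vocab + 1))
        return score

    def is_spam(self, text):
        return self._log_score(text, "spam") > self._log_score(text, "ham")


toy = ToyBayesFilter()
toy.learn("win money now", "spam")
toy.learn("cheap money offer", "spam")
toy.learn("meeting agenda for monday", "ham")
toy.learn("lunch on monday", "ham")
print(toy.is_spam("cheap money"), toy.is_spam("monday meeting"))
```

Before any call to learn, the two scores are equal and nothing is reported as spam, which mirrors the need for an initial training phase.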

On the second page, you will be able to select some additional actions to be performed in KMail with regard to spam messages: if you want messages detected as spam to be moved into a certain folder, select the appropriate folder and mark the Move known spam to: option; if messages detected as spam should additionally be marked as read, then mark the Mark detected spam messages as read option.

Selecting at least one of the available tools will allow the wizard to finish the filter setup. The wizard ignores any manual changes made to filters it created earlier; it will either append new filters or replace the existing ones. In any case you may want to inspect the result of this process in the Filter Dialog. The wizard will also create toolbar buttons for marking messages as spam or as ham. Keep in mind that, if you selected the option to move spam, classifying a message as spam will also move it to the folder you have specified for spam messages.

Some More Details for Experts

The wizard uses information stored in a special configuration file named kmail.antispamrc (stored in the global or local KDE config directory). It will first check the global config file and then the local config file. If the local config file contains an entry with a higher (newer) version number, the configuration data from the local file (for that tool) is used, so both administrators and users can update the wizard's configuration.
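The precedence rule can be sketched in a few lines of Python. The dictionaries below stand in for the parsed global and local files; the tool identifiers, the "version" and "command" keys, and the handling of a tool that appears only in the local file are assumptions made for this example, not the actual layout of kmail.antispamrc.

```python
def effective_tool_configs(global_cfg, local_cfg):
    """Merge per-tool entries; a local entry wins only if its version is higher."""
    merged = dict(global_cfg)
    for tool, entry in local_cfg.items():
        current = merged.get(tool)
        # Assumption: a tool defined only locally is used as-is.
        if current is None or entry["version"] > current["version"]:
            merged[tool] = entry
    return merged


# Hypothetical entries, for illustration only.
global_cfg = {
    "tool-a": {"version": 2, "command": "tool-a --scan"},
    "tool-b": {"version": 5, "command": "tool-b --check"},
}
local_cfg = {
    "tool-a": {"version": 3, "command": "tool-a --scan --fast"},
    "tool-b": {"version": 4, "command": "tool-b --old"},
}

merged = effective_tool_configs(global_cfg, local_cfg)
for tool in sorted(merged):
    print(tool, merged[tool]["version"])
```

The comparison is made per tool, so a user can override one tool's entry while the administrator's global entries continue to apply to the others.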

The local detection of spam messages is achieved by creating one Pipe Through action per tool within a special filter. Another filter contains rules to check for detected spam messages and actions to mark them and (optionally, depending on the choice in the wizard) to move them into a folder. Both filters are configured to be applied to incoming messages and for manual filtering.
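The two-step arrangement (first pipe the message through a tool, then test the result) can be imitated outside KMail with a short Python sketch. The stand-in scanner and the X-Example-Spam header are invented so that the example runs without any anti-spam tool installed; a real tool would add its own header with its own name and values.

```python
import subprocess
import sys
from email import message_from_string

# Stand-in for a real scanner: copies the message from stdin to stdout
# and prepends a made-up verdict header.
FAKE_SCANNER = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.write('X-Example-Spam: yes\\n' + sys.stdin.read())",
]


def pipe_through(command, raw_message):
    """Step 1: feed the message to the command and return its rewritten output."""
    result = subprocess.run(
        command, input=raw_message, capture_output=True, text=True, check=True
    )
    return result.stdout


def is_detected_spam(raw_message):
    """Step 2: check the header that the (hypothetical) scanner added."""
    msg = message_from_string(raw_message)
    return msg.get("X-Example-Spam", "").strip().lower() == "yes"


raw = "Subject: hello\n\nbody text\n"
scanned = pipe_through(FAKE_SCANNER, raw)
print(is_detected_spam(raw), is_detected_spam(scanned))
```

Keeping the scan and the header check in separate steps means the second filter only has to match on a header, regardless of which tool produced it.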

Two filters are needed for the classification of ham and spam. They contain actions to mark the messages appropriately. As mentioned above, the filter for classification as spam can have another supplementary action to move the messages into a predefined folder. If the selected tools support Bayesian filtering, the wizard will create additional filter actions to pass the messages to the tools (via Execute Command actions) in the appropriate learning mode.
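As an illustration of what passing a message to a tool "in the appropriate learning mode" can look like, the sketch below maps a manual classification to a training command. The options shown (bogofilter -s and -n, sa-learn --spam and --ham) are documented by the respective tools, but the exact command lines the wizard generates are taken from its configuration file and may differ; the function name and table are made up for this example.

```python
# Training commands per (tool, classification); the message is expected on stdin.
LEARN_COMMANDS = {
    ("bogofilter", "spam"): ["bogofilter", "-s"],
    ("bogofilter", "ham"): ["bogofilter", "-n"],
    ("spamassassin", "spam"): ["sa-learn", "--spam"],
    ("spamassassin", "ham"): ["sa-learn", "--ham"],
}


def learn_command(tool, label):
    """Return the argument list that would train the tool with one message."""
    return list(LEARN_COMMANDS[(tool, label)])


# The command is only built here, not executed, so no tool needs to be installed.
print(" ".join(learn_command("bogofilter", "spam")))
print(" ".join(learn_command("spamassassin", "ham")))
```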

If you want to fine-tune the filtering process, you might be interested in the chapter about Filter Optimization.