Blocking Comment Spam

If you run a blog, you will get hit by comment spam. In many cases, you can use a filtering service like Akismet to block it. But if your blog is on a low-end server with a small pipe, like mine is, then this is not ideal: you have to receive each spam comment, forward it to Akismet, then wait for their response.

I will show you how to block a significant amount of spam using mod_security.

Contents

  1. A word about referer spam
  2. Install mod_security
  3. The anti-spam rules

A word about referer spam

First, you should check your logs to see if you get referer spam (Google it if you don't know what it is). Referer spam is much easier to identify at the HTTP level than comment spam, and by dynamically blocking IP addresses that send you referer spam, you will also block a significant number of comment spammers as a side effect. It seems that comment spammers and referer spammers often use the same zombie machines.

Please read my tutorial on blocking referer spam for information on how to detect referer spam and add spammers to an IP blacklist in real time.

Install mod_security

First, you need to install mod_security. This article is not a guide to installing Apache or mod_security, so please consult other resources if you need help with that. On CentOS and Fedora, you can get mod_security as an Apache module just by running yum install mod_security as root and restarting Apache. For other distros, consult Google or the documentation. This guide is for mod_security version 2 or higher.

The anti-spam rules

Below you will find the rules I use to block comment spam. Of course, every blog is set up differently, so let me take a second to explain the setup of my blog.

The blog is at http://www.icydog.net/zhanga. This page displays blog posts and comments. When a user attempts to post a new comment, the POST action is set to /post/comment_post.php, which processes the request and redirects the user back to /zhanga. We can write some simple yet effective rules based on just these facts.

First, initialize a collection to track how spammy this IP is. (This block of code is also used in my blocking referer spam guide. You should only include it once in the configuration file.) For more information on exactly what is going on here, consult the mod_security documentation. Basically, each IP has a spam counter which is increased whenever we detect a spammy request and slowly decremented every day. Requests are dropped for any IP with more than 15 points. Note that how often and how much to decrement the counter, how high to set the blocking threshold, and how many points to assign to each spammy request are entirely up to you. Generally you will want to assign higher point scores to rules that have little or no chance of being a false positive.

# Make sure to clear the default action
SecDefaultAction phase:1,pass

# Initialize collection and deprecate by 3 points per day (86400 seconds)
SecAction phase:1,initcol:IP=%{REMOTE_ADDR},deprecatevar:IP.spam=3/86400,nolog

# If there are already >15 spam points for this IP, then drop
# the connection and add 1 point (instead of the larger scores
# assigned by the rules below).
SecRule IP:spam "@gt 15" phase:1,setvar:IP.spam=+1,drop,setenv:spam=spam
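
If you want to experiment with thresholds and decay rates before touching your Apache configuration, the counter logic above can be approximated in a few lines of Python. This is only a rough sketch: the names (IpRecord, handle_request and so on) are made up for illustration, and the decay is modeled as whole-day steps, which is a simplifying assumption and not necessarily how mod_security computes deprecatevar internally.

```python
from dataclasses import dataclass

THRESHOLD = 15        # block when the score is strictly greater than this (@gt 15)
DECAY_POINTS = 3      # points removed per decay period
DECAY_PERIOD = 86400  # seconds in one day
BLOCKED_PENALTY = 1   # extra point each time an already-blocked IP retries


@dataclass
class IpRecord:
    """Hypothetical stand-in for the per-IP collection."""
    spam: int = 0
    last_update: int = 0  # time of the last decay step, in seconds


def decay(record: IpRecord, now: int) -> None:
    """Assumed model: remove DECAY_POINTS for every full period elapsed."""
    periods = (now - record.last_update) // DECAY_PERIOD
    if periods > 0:
        record.spam = max(0, record.spam - periods * DECAY_POINTS)
        record.last_update += periods * DECAY_PERIOD


def handle_request(record: IpRecord, now: int, rule_points: int = 0) -> bool:
    """Return True if the request is dropped.

    rule_points is the score of the spam rule this request matched,
    or 0 if it matched none.
    """
    decay(record, now)
    if record.spam > THRESHOLD:
        record.spam += BLOCKED_PENALTY
        return True
    if rule_points:
        record.spam += rule_points
        return True  # the matching rule drops this request itself
    return False
```

In this model, an IP that trips a 20-point rule is blocked on its next request and gains another point each time it retries. Note that a single 15-point hit leaves the counter at exactly 15, which does not exceed the threshold, so it takes one more point before every request from that IP is dropped.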

If the client is trying to access comment_post.php, and also claims that page is its referer, then it is either horribly broken or a spammer. Some spammers do this to get around referer filters, but on my site this script never refers to itself. Assign 15 points to IPs that do this.

SecRule REQUEST_URI "^/post/comment_post\.php" chain,drop
SecRule REQUEST_HEADERS:Referer "/post/comment_post\.php" \
	setenv:spam=comment,setvar:IP.spam=+15
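
The rule above is a chain, so it only fires when both patterns match the same request. A small Python sketch of that logic, using the same two regular expressions, might look like this. The function name is hypothetical, and a missing Referer header is assumed to be passed in as an empty string.

```python
import re

# Same patterns as the two chained SecRule lines.
URI_PATTERN = re.compile(r"^/post/comment_post\.php")
REFERER_PATTERN = re.compile(r"/post/comment_post\.php")


def is_self_referred_post(uri: str, referer: str) -> bool:
    """True only if the URI and the Referer both point at the comment script."""
    uri_hit = URI_PATTERN.search(uri) is not None
    referer_hit = REFERER_PATTERN.search(referer) is not None
    return uri_hit and referer_hit
```

A request for the script with a normal referer such as http://www.icydog.net/zhanga does not match, and neither does a request with no referer at all.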

If the client is trying to access comment_post.php, and also claims to be referred from some URI outside of my domain, then it's spam, since no other site can legitimately be posting comments to my blog. This regex matches referers that begin with http:// not followed by icydog.net/ (optionally prefixed by www.). We can't block empty referers because that would block many legitimate clients. Assign 20 points to IPs that do this:

SecRule REQUEST_URI "/post/comment_post\.php" chain,drop
SecRule REQUEST_HEADERS:Referer "^http://(?!(www\.)?icydog\.net/)" \
	setenv:spam=comment,setvar:IP.spam=+20
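
Negative lookaheads are easy to get subtly wrong, so it is worth trying the referer pattern against a few sample values before deploying it. The sketch below uses Python's re module, which handles this particular pattern the same way as the PCRE engine mod_security uses. The function name and the sample hosts (spam.example, evil.example) are placeholders I made up for the test.

```python
import re

# Same pattern as the Referer rule above.
OFFSITE_REFERER = re.compile(r"^http://(?!(www\.)?icydog\.net/)")


def is_offsite_referer(referer: str) -> bool:
    """True if the referer starts with http:// but is not on icydog.net."""
    return OFFSITE_REFERER.search(referer) is not None


samples = [
    "http://www.icydog.net/zhanga",     # own site, with www.
    "http://icydog.net/zhanga",         # own site, without www.
    "",                                 # empty referer
    "http://spam.example/page",         # foreign site
    "http://icydog.net.evil.example/",  # lookalike host
    "https://spam.example/page",        # foreign site over https
]
for referer in samples:
    print(repr(referer), is_offsite_referer(referer))
```

The first three samples are not flagged, and the foreign and lookalike hosts are. The last sample is not flagged either: as written, the pattern only looks at referers that start with http://, so an https:// referer from another site slips past this rule.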

Some broken bots will try to POST to URLs like /zhanga/2004/05/post/comment_post.php?page=20040516 (which doesn't exist) and /zhanga?msg=Thanks+for+commenting.. These are never legitimate POST targets because no forms on my site point there. Block these and assign 16 points to any IP that tries:

# There is no reason to ever POST to anything under /zhanga.
SecRule REQUEST_METHOD "^POST$" chain,drop
SecRule REQUEST_URI "^/zhanga" setenv:spam=comment,setvar:IP.spam=+16
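
To check which requests this last rule would catch, here is the same method-plus-URI test as a short Python sketch. The function name is hypothetical, and the URI is assumed to include the query string.

```python
import re

# Same patterns as the two chained SecRule lines.
METHOD_PATTERN = re.compile(r"^POST$")
URI_PATTERN = re.compile(r"^/zhanga")


def is_bogus_post(method: str, uri: str) -> bool:
    """True for a POST whose target starts with /zhanga."""
    method_hit = METHOD_PATTERN.search(method) is not None
    uri_hit = URI_PATTERN.search(uri) is not None
    return method_hit and uri_hit
```

Ordinary GET requests for the blog pages are unaffected, as are POSTs to the real comment script at /post/comment_post.php. Keep in mind that the URI pattern is only a prefix match, so it would also catch a POST to any other path that happens to begin with /zhanga.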

Happy hunting!