5 Layer Spam Filter
It’s finally finished. I created a 5 layer spam filter for form submissions. In my opinion, this method will filter out 99.9% of spam. And best of all, it doesn’t use any CAPTCHA.
First of all, thanks to Allan Odgaard for his spam filtering method, which I modified a little.
I will go through the security layers that the script uses and I’ll show you a little demo too.
Layer 1.
The main idea behind securing a form is to add some extra form fields that are hidden from human users but visible to bots, and then, on form submission, to simply check whether those hidden fields were filled in. If we’re dealing with a human user, these fields should be empty, but bots don’t know that.
The naming of the form fields is very important, because spam bots are intelligent: they recognize what kind of field they are dealing with (username, name, password, etc.).
This security layer picks the name of every hidden field at random from a list of plausible names, so the name can change from one request to the next.
$decoy = array('d_name','d_password','d_pw','d_user','d_username','d_comment');
$default_1 = $decoy[array_rand($decoy)];
Layer 2.
The second security layer is the actual extra text field that human users can’t see. I didn’t use a hidden form field, because in my experience spam bots don’t like them very much.
<input type="text" name="<?= $default_1 ?>" style="display:none;position:absolute; left:9000px; top:9000px;" />
As you can see, the name of this field is random. Its inline CSS keeps it out of sight of human users. I used positive values, because, as I mentioned, spam bots are intelligent…
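The server-side check for these two layers is not shown above, so here is a minimal sketch of how it might look. The function name passes_decoy_check is hypothetical and not taken from the downloadable script. Because the field name is picked at random, the sketch simply checks every name in the $decoy list.

```php
<?php
// Hypothetical helper: returns true when none of the possible decoy
// fields carries a value. $post would normally be $_POST.
function passes_decoy_check(array $post, array $decoyNames): bool
{
    foreach ($decoyNames as $name) {
        $value = $post[$name] ?? '';
        // Anything other than an empty string means the hidden field was filled in.
        if (!is_string($value) || trim($value) !== '') {
            return false;
        }
    }
    return true;
}

// Same list as the one used to pick the random field name.
$decoy = array('d_name','d_password','d_pw','d_user','d_username','d_comment');
// In the form handler: $decoyOk = passes_decoy_check($_POST, $decoy);
```

A human user never sees the field, so it arrives empty; a bot that fills in every text input trips the check.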
Layer 3.
The idea of using a timer is simple. Spam bots usually fill in the form automatically and submit it in under 3 – 4 seconds, which humans simply can’t do. So by creating a field that is, of course, not visible to human users and saving the current time in it, we can later check whether enough time has passed according to our settings. It’s a field similar to the other one, except that we save the time in the name of the field, not in the value, in case the spam bot fills this one in too. As with the other field, the script picks a random name for it.
This is the PHP part:
$timer = array('t_name_','t_pw_','t_username_','t_age_');
$default_2 = $timer[array_rand($timer)] .time();
This is the HTML part:
<input type="text" name="<?= $default_2 ?>" style="display:none;position:absolute; left:9999px; top:9999px;" />
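On submission the timestamp has to be read back out of the field name. The following is a rough sketch under a few assumptions: the function names are hypothetical, the 4 second minimum is taken from the "3 – 4 seconds" figure above, and the current time is passed in as a parameter so the check is easy to test.

```php
<?php
// Hypothetical helper: finds a field whose name is one of the timer
// prefixes followed by digits, and returns those digits as a timestamp.
function find_form_timestamp(array $post, array $prefixes): ?int
{
    foreach (array_keys($post) as $name) {
        $name = (string) $name;
        foreach ($prefixes as $prefix) {
            if (strpos($name, $prefix) !== 0) {
                continue; // field name does not start with this prefix
            }
            $suffix = (string) substr($name, strlen($prefix));
            if (ctype_digit($suffix)) {
                return (int) $suffix;
            }
        }
    }
    return null;
}

// Returns true when at least $minSeconds passed between rendering and submitting.
function passes_timer_check(array $post, array $prefixes, int $now, int $minSeconds = 4): bool
{
    $startedAt = find_form_timestamp($post, $prefixes);
    if ($startedAt === null) {
        return false; // timer field is missing or was renamed
    }
    return ($now - $startedAt) >= $minSeconds;
}

$timer = array('t_name_','t_pw_','t_username_','t_age_');
// In the form handler: $timerOk = passes_timer_check($_POST, $timer, time());
```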
Layer 4.
This one is tricky. Spam bots usually don’t run JavaScript; they simply forge a POST request and submit it directly. I used JavaScript to create a cookie with a simple value in it, and to be more secure I set the cookie’s name in a .htaccess file, which makes it available as a $_SERVER variable. From time to time it can be changed for added security.
This simple line is in the .htaccess file: SetEnv SPAM_SECRET "sp_secret"
This is the JavaScript which is loaded along with the form:
function createCookie() {
    document.cookie = "<?=$_SERVER['SPAM_SECRET']?>=1; path=/";
}
If the cookie value is set to 1, it means the form was loaded in a JavaScript-enabled browser and, hopefully, by a human user.
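On the PHP side the check could be as small as the sketch below. The function name passes_cookie_check is hypothetical, and the sketch assumes the cookie name is read from $_SERVER['SPAM_SECRET'] exactly as in the JavaScript above.

```php
<?php
// Hypothetical helper: true when the cookie created by the JavaScript
// is present and holds the expected value. $cookies would normally be $_COOKIE.
function passes_cookie_check(array $cookies, string $secretName): bool
{
    return isset($cookies[$secretName]) && $cookies[$secretName] === '1';
}

// In the form handler:
// $cookieOk = passes_cookie_check($_COOKIE, $_SERVER['SPAM_SECRET']);
```

Cookie values always arrive as strings, which is why the comparison is against '1' rather than the integer 1.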
Layer 5.
After all these validations, the form is sent to the Akismet service, which evaluates the content of the form fields. You will need a WordPress API key to use the service. You can get one for free here: http://wordpress.com/api-keys/
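As a rough illustration of what gets sent, the sketch below builds a request for Akismet's comment-check call and interprets the answer. It is written from my understanding of the Akismet REST API (key-prefixed host, plain-text "true" or "false" response), so check the field names against the current Akismet documentation. Both function names and the keys of the $comment array are hypothetical, and actually sending the request (for example with cURL) is left out.

```php
<?php
// Hypothetical helper: builds the URL and POST body for a comment-check call.
function build_akismet_request(string $apiKey, string $siteUrl, array $comment): array
{
    $fields = array(
        'blog'                 => $siteUrl,            // the site the form lives on
        'user_ip'              => $comment['ip'],
        'user_agent'           => $comment['user_agent'],
        'comment_type'         => 'contact-form',
        'comment_author'       => $comment['name'],
        'comment_author_email' => $comment['email'],
        'comment_content'      => $comment['message'],
    );
    return array(
        'url'  => 'https://' . $apiKey . '.rest.akismet.com/1.1/comment-check',
        'body' => http_build_query($fields),
    );
}

// Akismet is assumed to answer "true" for spam and "false" otherwise.
function akismet_says_spam(string $responseBody): bool
{
    return trim($responseBody) === 'true';
}
```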
Demo:
You can check it out at: http://playground.primalskill.com/spam-filter/
Download at: http://playground.primalskill.com/spam-filter/spam_filter.zip
on Friday 26, 2008
I would stop using short tags – they will be gone in PHP 6.
on Friday 26, 2008
Oh BTW – good article.
on Friday 26, 2008
I didn’t know about the short tags. Good tip. Thanks!
on Friday 26, 2008
Thank you. What you need))
on Friday 26, 2008
I’m trying to implement your code, but I’m getting the following:
failed tests: 1 array(4) { ["decoy"]=> bool(false) ["timer"]=> bool(false) ["cookie"]=> bool(true) ["akismet"]=> bool(false) } Spam.
Any chance you can let me know what I’m doing wrong from this? I’ve copied your code and updated my htaccess, but I’m doing something wrong. Thanks.
on Friday 26, 2008
Can you send me an e-mail with your source code and .htaccess file? Just to see what is wrong.
As a hint, the cookie didn’t pass the test. Either cookies are disabled in the browser, or the pass-phrase you get from the .htaccess file somehow came out wrong…
on Friday 26, 2008
Great post, it’s good to see someone tackling the accessibility nightmare that is CAPTCHA! However, I think Layers 2 and 3 may be flawed: a lot of people now use automated form fillers, for example RoboForm or Google Toolbar’s Autofill feature. This means that any fields a spam bot would fill in will be filled in, and it also enables us to fill in most forms in just a couple of seconds.
Your system would potentially label these users as spammers. I guess the solution might be to use all the above methods in combination, perhaps giving a weighting to each… gives me something to think about! Pete
on Friday 26, 2008
@Pete Williams: Very interesting, I didn’t account for automatic form fillers…
Hmm. That’s a challenge!
on Friday 26, 2008
Well, I did some thinking about how we could best avoid using CAPTCHA and posted my comments on my blog: http://petewilliams.info/blog/2009/02/why-captcha-sucks-and-what-to-do-about-it/
Thanks for the inspiration and I hope my comments help. Pete
on Friday 26, 2008
I am concerned about offscreen input fields. Anyone who is a bit inclusive in their designs will start thinking about braille readers; there are a few of them around that you might have to consider.
I’m not saying CAPTCHA is good, because I hate it for the same reasons, and most of the algorithms are just plainly unreadable and not exactly optimal.
on Friday 26, 2008
Braille readers don’t use JavaScript… The hidden input fields could be added with JavaScript then…
Any other ideas are welcome and together we can make this code better.
on Friday 26, 2008
Well most bots don’t use javascript either, so that would classify them as bots wouldn’t it?
on Friday 26, 2008
Hm that sounds good but I would like to know more details.
on Friday 26, 2008
Require the posting software to factor a large number (or perform some other difficult calculation), something that the JavaScript function you supply can do in a few seconds. Even if some spammer were dedicated enough to write the code, it would cut down on his ability to send more than a few thousand spams a day.
on Friday 26, 2008
That’s good man, keep it going.
on Friday 26, 2008
Have a text field hidden by CSS. Let JavaScript set the value of the text field to a special word; when the form is submitted, the PHP code checks for the special word. Bots don’t use JavaScript, and therefore the text field is not filled in with the required word. This may work even if Google Toolbar, for example, fills in the form automatically… maybe…
on Friday 26, 2008
The proposed 5 Layer spam filter is weak since a bot can be created to pass all described layers.