Drupal 8: Automated Spam Protection

Spam is a constant problem for any site on the internet that has a publicly available form, but automatically preventing spam can be tricky. The idea is to stop automated spam bots from submitting data to your site, but not to the detriment of your users. There is a careful balance between preventing spam and preventing real content from being submitted by real users. Manually moderating blog comments is usually a good idea, but many websites also have contact forms and user registration forms that are often targeted by spam bots.

Whilst Drupal does have a number of protections against cross-site submissions and denial of service attacks, and even has built-in user and comment moderation, it does need a little bit of help with preventing spam.

Drupal has a number of modules that deal with automated spam, and they fall mainly into a few different categories.

Form Alterations

Honey Pot

The Honey Pot module adds an extra hidden field into your forms. The idea is that if a user fills in this field then they are obviously a bot and the submission will be rejected. You can either add this to all forms across the site, or just a select few forms.
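As a rough illustration of the hidden field idea, here is a minimal sketch in plain Python. This is not the module's own PHP code; the field name `url`, the `Verdict` class and the `check_honeypot` function are all assumptions made up for this example.

```python
from dataclasses import dataclass

# Hypothetical name for the hidden trap field (an assumption for this sketch).
HONEYPOT_FIELD = "url"


@dataclass
class Verdict:
    accepted: bool
    reason: str


def check_honeypot(form_data: dict, field: str = HONEYPOT_FIELD) -> Verdict:
    """Reject a submission when the hidden trap field contains any text."""
    value = form_data.get(field, "")
    if value.strip():
        return Verdict(False, f"honeypot field '{field}' was filled in")
    return Verdict(True, "honeypot field left empty")


# A real user never sees the field, so it arrives empty.
human = {"name": "Alice", "message": "Hello", "url": ""}
# A naive bot fills in every field it finds.
bot = {"name": "Bot", "message": "Buy now", "url": "http://spam.example"}

print(check_honeypot(human))
print(check_honeypot(bot))
```

Note that a submission that leaves the trap field empty, or omits it entirely, is accepted by this check.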

In reality I have found mixed levels of success with this module. Some spam bots appear to be able to detect the hidden field and bypass the protection. Either that or they are just randomly submitting data to see what sticks.

Overall, this mechanism isn't always reliable in stopping spam but rarely produces false positives. Your users will never hit an anti-spam warning with this module in place.

Spamicide

This module is a variant of the Honey Pot approach, but it isn't as feature complete as Honey Pot and doesn't have a stable release.

CAPTCHA

A CAPTCHA (or Completely Automated Public Turing test to tell Computers and Humans Apart) is an automated way of checking that the user is a real human by showing them something that a computer shouldn't be able to process. This consists of things like images of text overlaid with noise, or text-based riddles. In reality, CAPTCHAs can be difficult for normal users to get past and are especially problematic for people with accessibility needs.

The CAPTCHA module is really a framework module that other modules can plug into. It comes with a default Image CAPTCHA module that presents the user with some text that has been peppered with random noise. I have found that although this module works pretty well to cut down spam, it also creates a bit of a barrier for your users.
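The text-based riddle style of challenge mentioned above is the easiest one to sketch. The following plain Python example shows a simple sum being generated and checked. It is only an illustration of the idea, not how the CAPTCHA module is implemented, and the function names `make_challenge` and `check_answer` are hypothetical.

```python
import random


def make_challenge(rng: random.Random) -> tuple:
    """Return a question string and the answer the server should remember."""
    a, b = rng.randint(1, 9), rng.randint(1, 9)
    return f"{a} + {b} =", a + b


def check_answer(expected: int, submitted: str) -> bool:
    """Compare the submitted text with the expected answer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        # Anything that is not a whole number fails the challenge.
        return False


question, answer = make_challenge(random.Random())
print(question)
print(check_answer(answer, str(answer)))
```

The expected answer has to be stored on the server (in the session, for example) and never sent to the browser, otherwise a bot can simply read it back.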

Antibot

This module employs a couple of neat little tricks that will present problems to all but the most dedicated spam bots. The idea is that it injects JavaScript onto all forms to detect things like mouse movements or mobile swipe movements in order to ensure that this is a real user. If the user does any of these actions then the Antibot module will allow the submission through. If not then the submission will go to the /antibot endpoint and will be rejected.
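One way to picture this is as a form that points at the rejection endpoint by default and is only pointed at its real target once a human interaction has been seen. The plain Python model below is an assumption about how such a mechanism could work, written for illustration only. It is not the module's actual code, and `RenderedForm`, `on_human_interaction` and `handle_submission` are hypothetical names.

```python
from dataclasses import dataclass

ANTIBOT_PATH = "/antibot"  # the rejection endpoint mentioned above


@dataclass
class RenderedForm:
    real_action: str
    # Assumption: the form starts out pointing at the rejection endpoint.
    action: str = ANTIBOT_PATH

    def on_human_interaction(self) -> None:
        """Stand-in for the injected JavaScript seeing a mouse move or swipe."""
        self.action = self.real_action


def handle_submission(form: RenderedForm) -> str:
    """Accept only submissions that reach the form's real target."""
    if form.action == ANTIBOT_PATH:
        return "rejected"
    return "accepted"


bot_form = RenderedForm(real_action="/contact")
human_form = RenderedForm(real_action="/contact")
human_form.on_human_interaction()

print(handle_submission(bot_form))
print(handle_submission(human_form))
```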

This is a simple module and employs some interesting tricks, but it does require JavaScript to be enabled before users can even view the forms.

Content Analysis

Protected Submissions

This module looks at the incoming data to the form and makes a decision on the submission based on a couple of factors. These include a check on the character set to make sure it fits in with a list of expected values. You can also set a number of strings that cause a submission to be rejected if they are seen in the content. The good thing about this module is that you can also enable logging to check that your rules aren't too restrictive.
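The two checks described above, plus the logging, can be sketched in a few lines of plain Python. This is a simplified illustration rather than the module's real logic. The allowed character range, the blocked strings and the function names are all assumptions chosen for the example.

```python
import logging
import re
from typing import Optional

logger = logging.getLogger("spam_filter")

# Hypothetical rules: printable ASCII plus whitespace, and a short block list.
ALLOWED_CHARS = re.compile(r"[\x20-\x7E\s]*")
BLOCKED_STRINGS = ["viagra", "casino", "seo services"]


def rejection_reason(text: str) -> Optional[str]:
    """Return why the text would be rejected, or None if it passes."""
    if not ALLOWED_CHARS.fullmatch(text):
        return "unexpected characters"
    lowered = text.lower()
    for needle in BLOCKED_STRINGS:
        if needle in lowered:
            return f"blocked string: {needle}"
    return None


def is_allowed(text: str) -> bool:
    """Check a submission and log every rejection so rules can be reviewed."""
    reason = rejection_reason(text)
    if reason is not None:
        logger.info("Rejected submission (%s): %r", reason, text[:60])
        return False
    return True


print(is_allowed("Hello, I have a question about your post."))
print(is_allowed("Cheap CASINO bonus, click here"))
```

Reading the log of rejections is what tells you whether a rule is catching real users, which is why the logging matters as much as the rules themselves.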

In reality, this module works very well, but it will take you a little while to tweak it to your needs. I have used it on this site for a number of months now and it's been a really good addition to my continuous battle against spam. It took me a few weeks of tweaking though, and required me to look at spam submissions and feed repeat offenders back into the configuration of the tool. Protected Submissions is a really good module, but you'll need to devote some time to refining the rules.

Services

reCAPTCHA

One of the more successful extensions to the CAPTCHA module is the reCAPTCHA module. This uses an API provided by Google to create a variety of different CAPTCHAs for users to pass. Most notably, if the user is already logged into a Google account then they will automatically be accepted and allowed to pass. This is one of the better CAPTCHA services that exist, and Google have put lots of effort into making the system as accessible as possible (https://www.google.com/recaptcha/intro/index.html).

There are modules like Simple Google reCAPTCHA and reCAPTCHA v3 that provide this feature as separate modules, but I tend to use this reCAPTCHA module plugin. That said, I have not seen much success with using reCAPTCHA and have found that spam bots are still able to get past it and submit multiple entries to a form that was apparently protected by this service.

Anti Spam by CleanTalk

CleanTalk is a SaaS-based solution that provides anti-spam protection based on a number of different factors. These range from simple things like checking the domain name of an email address to checking the request against a database of known spam bots. Key to all this is that it doesn't use a CAPTCHA to check the user submission, opting instead for an 'install and forget' mechanism.

The module looks pretty comprehensive and might be worth a look.

Human Presence Form Protector

This module integrates with the Human Presence SaaS platform, which uses multiple overlapping strategies to fight form spam. Interestingly, this module integrates with the CAPTCHA module and can be set to show a CAPTCHA as a fallback for when Human Presence fails to identify the user as a human.

Other

Mother May I

The idea is that users will not be able to submit forms on your site without first entering a string. This string might be given to the users outside of the site, or printed on the site itself, but without it there is no way to submit the registration forms on the site. This creates a situation where users who register on the site will almost certainly be humans and probably quite determined humans at that.
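The check itself amounts to comparing what the user typed against a known string. A minimal plain Python sketch is below. The secret phrase and the `may_register` function are hypothetical, and ignoring case and surrounding whitespace is my own design choice here rather than something the module is known to do.

```python
import hmac

# Hypothetical secret that is handed out to prospective users.
SECRET_WORD = "open sesame"


def may_register(submitted: str, secret: str = SECRET_WORD) -> bool:
    """Allow registration only when the submitted string matches the secret."""
    given = submitted.strip().casefold().encode("utf-8")
    expected = secret.casefold().encode("utf-8")
    # compare_digest avoids leaking how much of the string matched via timing.
    return hmac.compare_digest(given, expected)


print(may_register("Open Sesame "))
print(may_register("just guessing"))
```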

It's a bit of an odd module, but it might be useful in certain situations.

Conclusion

In terms of implementation I would suggest going for the simplest approach first. Install the Honey Pot module to start off with and keep an eye on the levels of spam getting through. If that isn't doing much then you can start looking at more aggressive or comprehensive solutions. If you are certain that you'll need to start off with some more advanced spam protection then install it, just be sure to test things to make sure that you aren't preventing normal users from adding content.

Finally, be sure to add spam protection to your default Drupal setup strategies so that it doesn't get left out. You can always install these modules after the fact, but by that point your users might already be fed up.
 
