Customising Django Simple Captcha
I used django-simple-captcha to protect the comments form from spam. Some time after the site went live, I noticed spam comments starting to appear.
The site does not generate much traffic, but the comment spam nonetheless became more and more prolific, with dozens of entries appearing each day. I took a look at the captcha to find out what was happening.
The default captcha uses a string of random characters, but django-simple-captcha provides good customisation options using Django settings.
First I tried switching to a maths-based captcha, in which the user is asked to solve a simple arithmetic problem.
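For reference, switching challenge type is a one-line settings change. This is a minimal sketch, assuming the `captcha.helpers.math_challenge` helper that ships with django-simple-captcha; check the path against the version you have installed.

```python
# settings.py
# Assumed helper path from django-simple-captcha; verify it against your installed version.
CAPTCHA_CHALLENGE_FUNCT = 'captcha.helpers.math_challenge'
```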
This also failed to protect the form, with the same number of spam comments appearing.
I wanted to check the contents of the POST request to see if the captcha was being circumvented somehow. nginx sits in front of the application and handles requests first, so I added this to my nginx config:
# log_format is declared at http level, outside the server block
log_format postdata $request_body;
server {
    location / {
        access_log /var/log/nginx/postdata.log postdata;
    }
}
This saves all POST data to a dedicated log file. Looking at the POST data, I could see the spam submissions were completing the captcha correctly (or nearly so) for both the random-character and maths challenges.
- csrfmiddlewaretoken=gTcnDmbR2Ve8IgbDUVOFMkFfcJFqMRA5fh6ewAbtz94BKLjrX82FqK3vES01RS1p&name=Robertthuse&email=robert771%40cialisrxl.com&comment_text=dosa+cialis+%0D%0A+%0D%0Ahttp%3A%2F%2Fcandiancialisuy.com%2F+-+generic+cialis+%0D%0A+%0D%0A<a+href%3D%22http%3A%2F%2Fcandiancialisuy.com%2F%22>cialis+cheap<%2Fa>viagra+generico+marcas+%0D%0A+%0D%0Ahttp%3A%2F%2Fviagralkas.com%2F+-+buy+viagra+%0D%0A+%0D%0A<a+href%3D%22http%3A%2F%2Fviagralkas.com%2F%22>viagra+online<%2Fa>soft+generic+online+viagra+%0D%0A+%0D%0Ahttp%3A%2F%2Fviagranelius.com%2F+-+generic+viagra+online+%0D%0A+%0D%0A<a+href%3D%22http%3A%2F%2Fviagranelius.com%2F%22>viagra+online<%2Fa>&captcha_0=9fb0cfcf30d6af14ae77e122251ec894a2a419bc&captcha_1=WFDM&entry=&entry=4&submit= -
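A logged body like the one above is easier to inspect once it is decoded. The sketch below uses Python's standard `urllib.parse.parse_qs` to pull out the two captcha fields; the helper name `captcha_fields` and the shortened sample string are illustrative only. As far as I can tell, `captcha_0` carries the hidden key identifying the challenge and `captcha_1` the answer that was typed in.

```python
from urllib.parse import parse_qs


def captcha_fields(post_body):
    """Return only the captcha-related fields from a URL-encoded POST body."""
    fields = parse_qs(post_body, keep_blank_values=True)
    return {name: values for name, values in fields.items()
            if name.startswith("captcha_")}


# Shortened version of the logged request above (comment text trimmed).
sample = (
    "name=Robertthuse&comment_text=dosa+cialis"
    "&captcha_0=9fb0cfcf30d6af14ae77e122251ec894a2a419bc"
    "&captcha_1=WFDM&entry=&entry=4&submit="
)

print(captcha_fields(sample))
```

Running this over the whole log makes it quick to compare the submitted answers against the challenges that were served.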
I decided not to obfuscate the captcha any further, as the machine recognition used by the spam bots seems likely to be at least as good as a human at reading it. Extra obfuscation also makes captchas even more unpleasant for a human to deal with.
My solution was to take advantage of django-simple-captcha's custom challenge feature. You supply a function that returns two strings: the challenge to display and the correct response. This gives the opportunity to add a human-comprehensible twist that a bot cannot easily parse and that sits outside the standard behaviour it expects.
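import random

def captcha_challenge():
    # Build a four-digit challenge; the expected response is each digit plus one
    challenge = u''
    response = u''
    for i in range(4):
        digit = random.randint(0, 9)
        challenge += str(digit)
        # Add one to the digit, wrapping 9 round to 0
        response += str((digit + 1) % 10)
    return challenge, response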
In this case I am using a random string of digits and requiring the user to add one to each (9 wraps round to 0). I adjusted the form to add this instruction for the user.
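To make the rule concrete, here is a small worked example. The helper `add_one_to_each` is hypothetical and not part of the site's code; it simply applies the same add-one-and-wrap rule to a given challenge string, which is handy for checking what a user is expected to type.

```python
def add_one_to_each(challenge):
    """Hypothetical helper: shift every digit up by one, with 9 wrapping to 0."""
    return "".join(str((int(ch) + 1) % 10) for ch in challenge)


print(add_one_to_each("1905"))  # prints 2016
```

So a user shown 1905 should enter 2016.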
As I know the captcha image will be parsed regardless of the noise settings, I removed the letter rotation and noise to make the captcha easier to use.
# Plain output with no obfuscation as we are using a custom challenge
CAPTCHA_LETTER_ROTATION = 0
CAPTCHA_NOISE_FUNCTIONS = []
CAPTCHA_CHALLENGE_FUNCT = captcha_challenge
Of course, this example is easy to circumvent automatically, but the spam bots search all sites looking for easy targets, and will just fail and move on from a small site like this.
Interesting Read!
Simple and effective solution. GJ!