
Customising Django Simple Captcha

I used django-simple-captcha to protect the comment form from spam. Some time after the site went live, I noticed spam comments starting to appear.

The site does not generate much traffic, but the comment spam nonetheless became more and more prolific, with dozens of entries appearing each day. I took a look at the captcha to find out what was happening.

Captcha image containing obfuscated letters

The default captcha uses a string of random characters, but django-simple-captcha provides good customisation options using Django settings.

First I tried switching to a maths-based captcha, in which the user is asked to complete a simple piece of arithmetic.

Captcha image containing an arithmetic question
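The switch is a single setting. A minimal sketch, assuming the built-in arithmetic helper described in the django-simple-captcha documentation is the one in use here:

```python
# settings.py
# Use django-simple-captcha's built-in arithmetic challenge
# instead of the default random-character one.
CAPTCHA_CHALLENGE_FUNCT = 'captcha.helpers.math_challenge'
```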

This also failed to protect the form, with the same number of spam comments appearing.

I wanted to check the contents of the POST request to see if the captcha was being circumvented somehow. nginx handles incoming requests first on this site, so I added this to my nginx config:

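# Custom log format that records only the request body
log_format postdata $request_body;

server {

    location / {
        # Write each request body to a dedicated log file
        access_log  /var/log/nginx/postdata.log postdata;
        # ... rest of the location block unchanged ...
    }
}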

This writes the body of every request, including all POST data, to a dedicated log file. Looking at the POST data, I could see that the spam submissions were answering the captcha correctly, or nearly so, for both the random-character and maths challenges.

 - csrfmiddlewaretoken=gTcnDmbR2Ve8IgbDUVOFMkFfcJFqMRA5fh6ewAbtz94BKLjrX82FqK3vES01RS1p&name=Robertthuse&<>cialis+cheap<%2Fa><>viagra+online<%2Fa><>viagra+online<%2Fa>&captcha_0=9fb0cfcf30d6af14ae77e122251ec894a2a419bc&captcha_1=WFDM&entry=&entry=4&submit= - 
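Reading these lines by eye gets tedious, so the captcha fields can be pulled out with a few lines of Python. This is only a sketch: `captcha_fields` is a helper name made up for illustration, the sample body is invented rather than taken from the log, and it assumes each log line holds one plain URL-encoded form body.

```python
from urllib.parse import parse_qs


def captcha_fields(request_body):
    """Return (hashkey, response) from a URL-encoded form body, or None."""
    # keep_blank_values keeps empty fields such as "submit=" in the result
    fields = parse_qs(request_body, keep_blank_values=True)
    if "captcha_0" not in fields or "captcha_1" not in fields:
        return None
    # captcha_0 is the hidden hash key, captcha_1 is what the sender typed
    return fields["captcha_0"][0], fields["captcha_1"][0]


# Invented sample body for illustration
sample = "name=Example&comment=hello+world&captcha_0=abc123&captcha_1=WFDM&submit="
print(captcha_fields(sample))  # ('abc123', 'WFDM')
```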

I decided not to obfuscate the captcha further: the machine recognition used by the spam bots is likely to be at least as good as a human at reading it, and extra obfuscation makes captchas even more unpleasant for people to deal with.

My solution was to take advantage of django-simple-captcha's custom challenge feature. You supply a function that returns two strings: the challenge to display and the correct response. This provides the opportunity to make a human-comprehensible adjustment that cannot be easily parsed by a bot and that sits outside the standard behaviour it expects.

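import random

def captcha_challenge():
    """Return (challenge, response): four random digits and each digit plus one."""
    challenge = ''
    response = ''
    for _ in range(4):
        digit = random.randint(0, 9)
        challenge += str(digit)
        # Add one to the digit, wrapping 9 round to 0
        response += str((digit + 1) % 10)
    return challenge, response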

In this case I am using a random string of digits and requiring the user to add one to each digit, with 9 wrapping round to 0. I adjusted the form to add this instruction for the user.
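To make the rule concrete, the expected answer for any given challenge can be worked out directly. A small sketch; `add_one_to_each_digit` is a helper name introduced here for illustration and is not part of django-simple-captcha:

```python
def add_one_to_each_digit(challenge):
    """Return the expected response for a string of digits."""
    # Each digit is incremented on its own, with 9 wrapping round to 0
    return "".join(str((int(ch) + 1) % 10) for ch in challenge)


print(add_one_to_each_digit("1909"))  # 2010
```

So a user shown 1909 should type 2010.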

Captcha image containing numbers and no obfuscation

Since I know the captcha image will be parsed regardless of the noise settings, I removed the noise to make the captcha easier for people to use.

# Plain output with no obfuscation as we are using a custom challenge
CAPTCHA_CHALLENGE_FUNCT = captcha_challenge
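The noise and letter rotation are controlled from the same settings file. A sketch of what that might look like, using settings documented by django-simple-captcha; the exact values used on this site are an assumption, and `myapp.captcha` is a hypothetical module path:

```python
# settings.py
# Dotted path to the custom challenge function (hypothetical module name).
CAPTCHA_CHALLENGE_FUNCT = 'myapp.captcha.captcha_challenge'
# Replace the default arc and dot noise with the library's no-op noise function.
CAPTCHA_NOISE_FUNCTIONS = ('captcha.helpers.noise_null',)
# Disable the random per-letter rotation.
CAPTCHA_LETTER_ROTATION = None
```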

Of course, this example is easy to circumvent automatically, but the spam bots search all sites looking for easy targets, and will just fail and move on from a small site like this.

Andrew Liu 2 years, 6 months ago

Interesting Read!

Simple and effective solution. GJ!
