Thesis

Breaking the Multi Colored Box: A Study of CAPTCHA

Abstract

Communication is faster than ever. Innovations in low-cost network computing have brought about an era in which people can effortlessly and instantaneously view and post opinions in collaboration with others across the world. An infrastructure of public message boards, chat rooms and instant messaging systems also creates a large potential for abuse by people who wish to capitalize on such open services by posting unsolicited advertisements.

An entire industry has been constructed around the prevention of unsolicited electronic advertisements (SPAM). This thesis examines various techniques for preventing SPAM, focusing on the Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA), a challenge/response technique in which the user must read heavily distorted text displayed in an image. It also examines the feasibility of breaking CAPTCHA programmatically, alternatives to CAPTCHA based on filtering, improvements to CAPTCHA using photo recognition, and ways of avoiding the need for CAPTCHA using naïve approaches.

Paper

My thesis is licensed under the Creative Commons Attribution-Noncommercial-Share Alike 3.0 Unported License. It’s free to read, modify and redistribute so long as proper attribution is given and it is never distributed commercially. See the Creative Commons by-nc-sa license for more details.

Breaking the Multi Colored Box: A Study of CAPTCHA (PDF)

Project

Source code for the engine used to perform the experiments described in the thesis can be found on the BMCB Project Page.