A captcha is a challenge-response test meant to prove the user is not a computer. It's valuable because it is (supposed to be) easy for humans to read and difficult for computers. But if you're like me, it can be harder to solve than a math problem. And although a captcha "captures" your attention because you have to stare at it to read it, captchas are probably going to be short-lived on the web for exactly that reason. Captchas typically come as distorted, randomized letters and/or numbers in an image. The idea? Computers can't read pictures directly; they have to rely on image recognition software, which the distortion is meant to throw off. One unintended consequence is obvious: blind users can't pass a captcha unless it has a sound recording built in.

See also: Optical Character Recognition, the mechanical or electronic translation of images of text into machine-readable text, which can be used in programs built to defeat captchas.

Captchas come in varying degrees of difficulty. A blog might use the same captcha (the exact same letters or numbers) every time, just to prevent spam. Difficulty goes up with the level of perturbation, or what I would call "waviness." Then there are degrees of noise, or background fuzz. Add in contrast, as well as a different font for each character, and now you understand why they're so freaking hard to read. And I'm sure there are other ways to make it harder.
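
Noise is the easiest of those knobs to show in code. What follows is a rough Python sketch of my own, not how any real captcha generator works: it sprinkles stray marks into the blank cells of an ASCII-art picture. The function name `add_noise`, the `density` parameter, and the choice of noise marks are all made up for illustration.

```python
import random


def add_noise(rows, density, rng, marks=".+>'"):
    """Return a copy of `rows` with roughly `density` of the blank cells
    replaced by a random noise mark. Non-blank cells are left alone.

    `rows` is a list of strings (one per line of ASCII art), `density` is a
    fraction between 0 and 1, and `rng` is a random.Random instance so the
    result can be reproduced from a seed. All names here are hypothetical.
    """
    noisy = []
    for row in rows:
        cells = [
            rng.choice(marks) if ch == " " and rng.random() < density else ch
            for ch in row
        ]
        noisy.append("".join(cells))
    return noisy


# Usage: fuzz up a tiny drawing of a zero with about 20% noise.
art = [
    " ___ ",
    "|   |",
    "|___|",
]
for line in add_noise(art, 0.2, random.Random()):
    print(line)
```

Turning `density` up makes the picture harder for a program to segment, and also harder for you to read, which is the whole trade-off.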

The term captcha is based upon the word "capture," and was coined in 2000 as an acronym for "Completely Automated Public Turing test to tell Computers and Humans Apart." Captchas are basically algorithms that increase the security of websites or programs through obscurity. They're a necessary evil, so to speak, just like the locks on your car or house doors. There has to be some mechanism to keep out unwanted intruders. While you might give the key to your house to family and specific friends, you wouldn't give it to a complete stranger. Similarly, a captcha gives the key to the website only to those it wants: hopefully, human users.

Example:

   ___    >      ___     _  . __       ______
 / ___ \       / ___ \       /  |     (_____ \
| | + | |     | |   | |     /_/ |       ____) )
| |   | |     | | \ | |       | |      /_____/
| |___| |  +  | |___| |       | |      |______
 \ ___ /       \ ___ /        |_|     (_______)

This number example doesn't have perturbation, and I wouldn't really call the random characters in it noise. It's still a captcha, however, because an image reader or other automated program could still be thrown off from time to time. For better examples, see the rodcorp link under Sources, which shows image examples outside the realm of ASCII art.

 

Filling out captchas, to get an account:
Anyone who has signed up for an online account at Facebook, MySpace, Yahoo, etc. has filled one out. Programmers have various reasons for circumventing captchas. If you wanted to start a business selling Facebook accounts, as absurd as that sounds, you'd need a program that can successfully fill out a captcha, or get around it. And trust me when I say there is in fact a business in just creating accounts on websites that use captchas. Even a 2004 Slashdot article describes spammers using porn websites to get captchas solved so they can generate email accounts: "Spammers are now usings a new technique to circumvent the 'captchas,' the distorted text in graphics, that users must input to receive the free email account. The spammers have cracked the system by displaying the 'captchas' on free porn sites in real time. Since there are always a large number of people signing up for free porn, they do the work of decripting the 'captchas' which is then replayed back into the spammers program to create a new email account. Who thought that porn could be a hacking technique!" (Slashdot)

"The CAPTCHA approach to securing forms is not new - it first appeared in the late 90's for domain name submissions to search engines and the like - but with the exponential growth of scripted exploits it's coming to the fore once again." (Art of Web)

Beating a captcha:
"Some people actually believe that spammers can now 'fairly easily' write scripts which use advanced optical character recognition to automatically defeat any online CAPTCHA form." (Coding Horror) That is still up for debate. Most seem to agree that small sites benefit greatly from captchas. Some, however, believe captchas on large sites only provide a false sense of security.

    Ways to beat a captcha
  • exploiting bugs in the implementation that allow the attacker to completely bypass the CAPTCHA,
  • improving optical character recognition (OCR) software - the most effective approach,
  • using cheap human labor to process the tests,
  • brute force - repeated sequential guesses instead of recognition software.
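
To put a rough number on the brute-force option, here is a small Python sketch. It assumes a captcha made of randomly chosen characters, a blind guesser, and a fresh random captcha on every attempt; the function names are made up for illustration, and real sites add rate limits that this ignores.

```python
def brute_force_odds(length, alphabet_size=10):
    """Return (number of possible codes, chance one blind guess is right)
    for a code of `length` characters drawn from `alphabet_size` symbols."""
    combos = alphabet_size ** length
    return combos, 1 / combos


def chance_within(attempts, length, alphabet_size=10):
    """Chance that at least one of `attempts` independent blind guesses
    succeeds, assuming a new random captcha is served each time."""
    _, p = brute_force_odds(length, alphabet_size)
    return 1 - (1 - p) ** attempts


# Usage: a five-digit numeric captcha has 10**5 possible codes.
combos, p = brute_force_odds(5)
print(combos, p)
print(chance_within(1000, 5))
```

For a five-digit code that works out to 100,000 possibilities, so blind guessing only pays off for someone who can make a very large number of attempts.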

Creating a CAPTCHA graphic using PHP
Those who want to code a captcha easily can follow these instructions. "The following code needs to be saved as a stand-alone PHP file (we call it captcha.php). This file creates a PNG image containing a series of five digits. It also stores these digits in a session variable so that other scripts can know what the correct code is and validate that it's been entered correctly.

<?PHP
  // Adapted for The Art of Web: www.the-art-of-web.com
  // Based on PHP code from: php.webmaster-kit.com
  // Please acknowledge use of this code by including this header.

  // initialise image with dimensions of 120 x 30 pixels
  $image = @imagecreatetruecolor(120, 30) or die("Cannot Initialize new GD image stream");

  // set background to white and allocate drawing colours
  $background = imagecolorallocate($image, 0xFF, 0xFF, 0xFF);
  imagefill($image, 0, 0, $background);
  $linecolor = imagecolorallocate($image, 0xCC, 0xCC, 0xCC);
  $textcolor = imagecolorallocate($image, 0x33, 0x33, 0x33);

  // draw random lines on canvas
  for($i=0; $i < 6; $i++) {
    imagesetthickness($image, rand(1,3));
    imageline($image, 0, rand(0,30), 120, rand(0,30), $linecolor);
  }

  session_start();

  // add random digits to canvas
  $digit = '';
  for($x = 15; $x <= 95; $x += 20) {
    $digit .= ($num = rand(0, 9));
    imagechar($image, rand(3, 5), $x, rand(2, 14), $num, $textcolor);
  }

  // record digits in session variable
  $_SESSION['digit'] = $digit;

  // display image and clean up
  header('Content-type: image/png');
  imagepng($image);
  imagedestroy($image);
?>" (Art of Web)
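
The quoted script only draws the image and remembers the digits; some other script still has to compare what the visitor typed against the stored value. Here is a minimal sketch of that check, written in Python rather than PHP and using a plain dict as a stand-in for the session. The function name `check_captcha` is made up, and throwing the code away after one try is my own assumption, not something the Art of Web article specifies.

```python
import hmac


def check_captcha(session, submitted):
    """Return True if `submitted` matches the digits stored under the
    session's "digit" key (the key used by the script above).

    The stored code is removed on every attempt, right or wrong, so each
    image can be guessed at only once (an assumed policy).
    """
    expected = session.pop("digit", None)
    if not expected:
        return False
    # compare_digest avoids leaking how many characters matched via timing.
    return hmac.compare_digest(expected.encode(), submitted.strip().encode())


# Usage with a dict standing in for the session store.
session = {"digit": "40213"}
print(check_captcha(session, "40213"))
print(check_captcha(session, "40213"))  # second try fails: code was used up
```

Discarding the code after each attempt means a guesser has to fetch a new image every time, which keeps brute force expensive.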


Sources:
http://rodcorp.typepad.com/photos/art_2003/captcha_stack.html
http://en.wikipedia.org/wiki/Captcha
http://www.the-art-of-web.com/php/captcha/
http://www.codinghorror.com/blog/archives/000712.html

Two more notes on CAPTCHAs

A bypass around CAPTCHAs

A simple way around CAPTCHAs has been discovered by the noble denizens of the internet. The essence of the method is to re-show the CAPTCHA to an unwitting human. In our example there are three participants in this shell game of a hack: the security-minded email server, the spammer, and an average internet user. The spammer wants to register an account at nobots.com but Oh No! they have a CAPTCHA, whatever will we do? In a feat of unusual intelligence the spammer comes up with a clever idea. Banner ads are purchased promising free pornography, electronics, or some other too-good-to-be-true offer. Our innocent average Joe clicks on the link and is prompted with a CAPTCHA to claim the offer. Little does he know that he is solving the very same CAPTCHA served up by the email site! He fills out the form, no free TV ever arrives, and the spambot registers one more email account at nobots.com. Because the whole process can now be automated, and nothing is as reliable as an internet user seeking pornography, it is possible to break any CAPTCHA that could be solved by a human.

reCAPTCHA

reCAPTCHA is a program recently created to put the drudgery of filling out a CAPTCHA to a useful purpose. Created by Luis von Ahn at Carnegie Mellon University, the reCAPTCHA program uses CAPTCHAs to help digitize aging books. Due to limitations in OCR software, scanning equipment, and the nature of older printing, it is quite difficult to reliably convert a physical book to digital text. The reCAPTCHA program shows two words to the user. The first is a word the software is already confident of, and it serves to verify that the user is human. The second is a new, unknown word, and here is where the utility of reCAPTCHA comes into play. After a few users have typed it in, the new word is reliably known and can be added to the pool of known words. reCAPTCHA is currently helping scan books from the Internet Archive and old editions of the New York Times.
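
The two-word trick can be sketched in a few lines of Python. This is only an illustration of the idea as described above, not reCAPTCHA's actual code: the class name `WordPool`, the word ids, and the threshold of three matching answers are all assumptions of mine.

```python
from collections import Counter

AGREEMENT_NEEDED = 3  # assumed: matching answers needed before a word counts as known


class WordPool:
    """Toy model of the two-word scheme: one control word, one unknown word."""

    def __init__(self):
        self.votes = {}  # unknown word id -> Counter of readings typed by users
        self.known = {}  # word id -> accepted reading

    def submit(self, control_answer, control_truth, unknown_id, unknown_answer):
        """Return True if the user passes as human.

        Only the control word decides pass or fail. If the user passes, their
        reading of the unknown word is recorded as one vote.
        """
        if control_answer.strip().lower() != control_truth.lower():
            return False
        if unknown_id not in self.known:
            tally = self.votes.setdefault(unknown_id, Counter())
            reading = unknown_answer.strip().lower()
            tally[reading] += 1
            if tally[reading] >= AGREEMENT_NEEDED:
                # Enough users agree: promote the word to the known pool.
                self.known[unknown_id] = reading
                del self.votes[unknown_id]
        return True
```

Because the user can't tell which word is the control, a lazy answer on either one risks failing the test, which is what keeps the votes on the unknown word mostly honest.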
