Man or machine?

If, online, you can’t tell the difference, the computer program you’re querying can be said to think.
So postulated Dr. Alan Turing, famed British mathematician and cryptographer, in his seminal investigations into machine intelligence. The Turing Test is the standard by which machines will continue to be judged as we move ever closer to the other-creatures in Spielberg-Kubrick’s A.I.

Ironically, over the past fifty years, computers have been able to “master” thought processes we might consider “difficult” (things like chess or medical diagnosis) much more easily than they have been able to “hear” or “speak” or “see”.

Dr. Jitendra Malik, who researches computer vision at the University of California at Berkeley, tells us that “Abilities like vision are the result of billions of years of evolution and difficult for us to understand by introspection, whereas abilities like multiplying two numbers are things we were explicitly taught and can readily express in a computer program.”

This obvious discrepancy between human and machine intelligence is at the heart of an interesting program that grew out of a very real concern of Dr. Udi Manber, the chief scientist at Yahoo, the internet portal, in September of 2000.

Yahoo at the time was infested with bots that masqueraded as teenagers and collected personal information from subscribers. Links were posted to commercial websites and hundreds of free Yahoo accounts—created by hacker scripts—were used for bulk mailings of millions of pieces of spam. Things were out of control.

“What we needed was a simple way of telling a human user from a computer program,” said Dr. Manber. The first thing the company needed to do, he reasoned, was prevent automated registrations.

Manber placed a conference call to Dr. Manuel Blum, a cryptographer at Carnegie Mellon University in Pittsburgh, who theorized that the essential failures of A.I. theretofore were precisely the solution to Dr. Manber’s problem.

Dr. Blum, together with his Ph.D. student Luis von Ahn, Nicholas Hopper, and John Langford, devised a sort of reverse Turing Test, a series of cognitive puzzles that, ironically, computers could generate and grade but could not pass.

Blum called the puzzles Captchas, Completely Automated Public Turing Test to Tell Computers and Humans Apart.

Their first Captcha was called Gimpy. It consisted of a display of seven randomly-chosen words that were overlapped and distorted. In order to pass the test, three of the words had to be identified and typed into a dialog box. Humans could do this easily; machines could not. A simplified version of Gimpy, which contains a single word, distorted against a complicated background, is now part of Yahoo’s registration process.
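
The pass rule described above (seven words shown, three to be identified) can be sketched in a few lines of Python. This is only an illustration, not the Carnegie Mellon code: the word list and the function names make_challenge and grade are invented for the example, and the hard part, rendering the words overlapped and distorted, is left out entirely.

```python
import random

# Hypothetical word list; the dictionary Gimpy actually used is not given here.
WORDS = ["river", "candle", "planet", "button", "garden", "window",
         "silver", "market", "pencil", "forest", "bridge", "copper"]

def make_challenge(rng, count=7):
    """Pick `count` distinct words. Drawing them overlapped and distorted is omitted."""
    return rng.sample(WORDS, count)

def grade(challenge, answers, required=3):
    """Pass if at least `required` distinct typed words appear in the challenge."""
    shown = {w.lower() for w in challenge}
    typed = {a.strip().lower() for a in answers}
    return len(shown & typed) >= required

if __name__ == "__main__":
    challenge = make_challenge(random.Random())
    print(grade(challenge, challenge[:3]))  # three correct words pass
```

Generating and grading the test is trivial for the machine; the puzzle lives entirely in the distorted image, which is why the same program that sets the test cannot pass it.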

Sounds was a second Captcha. A distorted computer-generated sound clip containing a word or a sequence of numbers was presented to the user, who had to type the word or number into the box provided.

Blum and his colleagues were not the first to attempt to shut down on-line registration mischief. AltaVista and PayPal already had systems in place, and Hewlett-Packard held a patent on text-based solutions. But Blum “did a great thing by recognizing that this problem is much more than solving a nuisance for Yahoo and AltaVista,” said Dr. Andrei Broder of I.B.M., who developed the AltaVista solution.

Blum recognized that there would always be breaches in security measures on the internet because that is the nature of encryption and the cryptographer. What he hoped to do was motivate other researchers to create better Captchas while they built programs that cracked existing ones.

“Captchas are useful for companies like Yahoo,” said Dr. Blum, “but if they’re broken it’s even more useful for researchers. It’s like there are two lollipops and no matter what you get one of them.”

The earliest Captchas have already been broken, and the Captcha bar, so to speak, has been raised. Dr. Malik and his associate Dr. Serge Belongie have developed an object-recognition technique that has some of the properties of human vision.

A Gimpy-cracking program written by Greg Mori, one of Malik’s students, which utilizes the Malik-Belongie methods, is able to give the right answer over 80 percent of the time. More difficult Gimpy puzzles are solved after only three tries.

The research has many other applications as well. “We want to keep working on this in a principled way,” says Dr. Malik, “so we can use the same technique on an outdoor scene with buildings, trees, and cars.”

In addition to the military (which always seems to get its heavy foot in the door first), Captcha programs are already in use to protect online polls, free e-mail services, and search engines, and to guard against worms, spam, and dictionary attacks on password systems.

One of the best examples of why Captchas, more and more, will play an important role in the future of computing was demonstrated back in November of 1999. E2’s very own http://slashdot.com released an online poll which asked: Which is the best graduate school in computer science?

IP addresses of the voters were recorded, as is usually the case with online polls, to keep people (or machines) from voting more than once. Students at Carnegie Mellon (is there a pattern here?) devised a way to stuff the electronic ballot box thousands of times. Not to be outdone, by the next day students at MIT had programmed their own bot and the virtual voting race was on.

MIT finished with 21,156 votes and Carnegie Mellon had 21,032. Every other school registered fewer than 1,000.

I'll give the nod to Carnegie Mellon, for thinking it up in the first place.


Human or Computer? Take This Test, Sara Robinson, The New York Times, Dec 10, 2002.
http://nytimes.com/2002/12/10/science/physical/10COMP.html?8hpib
http://www.captcha.net/

A CAPTCHA, as defined by The CAPTCHA Project (http://www.captcha.net/), "is a program that can generate and grade tests that most humans can pass, [and] current computer programs can't pass." It stands for "Completely Automated Public Turing Test to tell Computers and Humans Apart" and is pronounced like "capture" without the final /r/ sound. Because computers have several times more patience than a human* when performing automated tasks, web services often use human authentication tests such as those of The CAPTCHA Project to prevent automated processes from creating an account, submitting information such as URLs to a database, sending messages to users, or performing any other action that uses publicly available scarce resources.

* In this article, "human" refers to any entity with the intelligence and patience of a typical member of Homo sapiens, and "computer" refers to any entity whose reaction to the environment is governed primarily by a well-understood mechanical process. The author intends no prejudice against members of non-human biological races from fantasy and science fiction.

There exist at least three different types of information that humans are known to perceive better than computers: speech, pictures, and semantics of text. These are the differences that CAPTCHAs attempt to exploit. However, the first two raise accessibility issues with respect to people with disabilities, and all three may have legal problems. For instance, Hewlett-Packard Company (NYSE:HPQ) holds United States Patent 6,195,698 on several forms of CAPTCHAs. Thus, I don't see CAPTCHA systems coming into much wider use within the next twenty years.

Speech

It's possible to synthesize spoken text, with various distortions such as wow and flutter, wave shaping (i.e. guitar distortion), echo, reverb, and superposition (two samples overlapping) so that humans have a much easier time separating the sources and correcting for the distortions than a typical computer does. However, hearing-impaired users can't pass such a test, and neither can users of machines without audio output capability.
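
Two of the distortions named above, superposition and echo, are simple enough to sketch on raw sample lists. This is a toy illustration under my own assumptions: samples are plain Python floats, the function names superpose and add_echo are made up for the example, and nothing here is taken from an actual CAPTCHA implementation.

```python
def superpose(a, b):
    """Sum two sample sequences, padding the shorter one with silence."""
    n = max(len(a), len(b))
    a = list(a) + [0.0] * (n - len(a))
    b = list(b) + [0.0] * (n - len(b))
    return [x + y for x, y in zip(a, b)]

def add_echo(samples, delay, gain=0.5):
    """Mix in one copy of the signal, delayed by `delay` samples and scaled by `gain`."""
    out = list(samples) + [0.0] * delay
    for i, s in enumerate(samples):
        out[i + delay] += gain * s
    return out

if __name__ == "__main__":
    word = [0.2, 0.8, -0.4, 0.1]    # stand-in for a synthesized word
    noise = [0.1, -0.1, 0.1]        # stand-in for a second, overlapping voice
    print(add_echo(superpose(word, noise), delay=2))
```

A real test would stack several such effects on synthesized speech; the point is only that producing the distortion is cheap, while undoing it is where a program struggles.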

Pictures

The "Gimpy", "Bongo", "Pix", and "HumanAut" tests are based on recognition of natural images or images of distorted text. Yahoo!, AltaVista, Freeservers, Tripod, and several other web sites use picture-based human authentication.

There are two problems with this technique. First, most web sites use a deprecated image format called GIF that uses LZW compression technology patented by Unisys Corporation, but it's easy to work around this (see PNG). The other, perhaps more obvious, limitation is that blind users and other humans behind non-visual user agents cannot see images. Thus, you lose accessibility and make Bobby cry.

Speech and pictures

The Americans with Disabilities Act requires the United States government, those who do business with the United States government, and those who engage in interstate commerce to make appropriate efforts to satisfy the reasonable special needs of disabled people. A law commonly called Section 508 requires federal government web sites and some commercial web sites to make all information accessible to the disabled. To comply with the ADA, PayPal uses a test that presents the same information as a picture and as a sound; responding to either one will allow a user to sign up for an account. But the test material still cannot be perceived by a user on a braille display without a sound card.

Semantics

Finally, there exist tests based on a human's ability to interpret the meaning of written natural language. I know of no other test that a braille terminal can reliably present. Examples:

Type the ninth letter of Blockstackers, followed by an exclamation point.
C!
In the sentence 'I regret that I have but food life to lose for my country', which word does not make sense?
food
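
The first example above is easy to generate and grade automatically. Here is a minimal sketch, assuming a hypothetical letter_challenge helper of my own invention (it is not part of any CAPTCHA Project code). It covers only the letter-counting question; the "which word does not make sense" kind needs real knowledge of language and is not attempted.

```python
ORDINALS = ["first", "second", "third", "fourth", "fifth",
            "sixth", "seventh", "eighth", "ninth", "tenth"]

def letter_challenge(word, position):
    """Return (question, expected answer) for a 1-based letter position."""
    if not 1 <= position <= min(len(word), len(ORDINALS)):
        raise ValueError("position out of range")
    question = (f"Type the {ORDINALS[position - 1]} letter of {word}, "
                "followed by an exclamation point.")
    return question, word[position - 1] + "!"

def grade(expected, typed):
    """Compare answers, ignoring case and surrounding whitespace."""
    return typed.strip().lower() == expected.lower()

if __name__ == "__main__":
    question, expected = letter_challenge("Blockstackers", 9)
    print(question)
    print(grade(expected, "C!"))
```

A question built from a fixed template like this one is also easy for a script to parse, so a practical test would have to vary its wording considerably.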

Patents

However, Hewlett-Packard's patent (http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&Sect2=HITOFF&d=PALL&p=1&u=/netahtml/srchnum.htm&r=1&f=G&l=50&s1=6,195,698.WKU.&OS=PN/6,195,698&RS=PN/6,195,698) seems to cover all methods based on the meaning of text, as well as methods involving the visual or acoustic distortion of text. HP has every right to prevent third parties from making the patented invention during the patent's term. There exist several legitimate ways to make commercial use of an invention covered by a broad patent, none of which a company would find acceptable: license the patent (which HP has every right to deny), leave the country (which may cease to apply as countries sign treaties; you'd also lose all your customers in the USA), buy the company (prohibitive, as HP's market capitalization as of December 2002 was close to $45 billion), or wait twenty years for the patent to expire (provided that Cher does not follow in the late Sonny Bono's footsteps as a spokeswoman for the drug industry and seek a Cher Patent Term Harmonization Act). HP may have a government-granted monopoly on reliably distinguishing humans from bots for at least the next decade.

If Section 508 really does prohibit use of speech- or picture-based CAPTCHAs, web sites will have to allow for the only 100 percent reliable method: talking with another human. In fact, Yahoo! already does this (http://add.yahoo.com/fast/help/us/edit/cgi_access).
