Loebner Prize

I am going to ask you some test questions are you OK with that
yes
Could you kill someone if you stabbed them with a towel
no
why
because a towel is a soft fluffy thing
Thank you human - thank you very much
You can have a break now

The Loebner prize contest is an annual tournament that pits human confederates against AI chatbots in the Turing Test, whereby judges have to decide - on the basis of text-only interaction with two participants at a time - which is the (more) human. This year's event was held at the University of Exeter, and as that's sufficiently near, and I have the kind of job that allows me to take the day off and watch competitive typing, I attended as a spectator.

I'd been inspired to do so by a talk earlier in the year during the Festival of Ideas in Bristol, by one of 2009's confederates, Brian Christian. He'd managed to win the 'most human human' award, and has written a thoroughly excellent book of the same title in which he describes some of his approach. Based on 2011's challengers, though, very little strategy seems to be required - not once did the judges mistake a machine for a person (or vice versa), and usually it was abundantly clear within a few minutes who was who. Deliberate attempts to break the algorithms made for easy decisions: be it by asking entirely novel questions as above; entering the same message and deterministically getting the same response back each time; or revealing a lack of memory by twice introducing yourself, but by different names, and being greeted by each in turn. But having done so, the judges would often toy with the bots, to the amusement of the audience, and this tended to highlight other obvious flaws: one was programmed to assume the persona of an online language tutor, and thus, when otherwise stumped, would resort to complimenting the judge on their English skills; another cheerfully admitted to being a robot when asked.

Fortunately, whilst the machines proved to be underwhelming company, the humans present more than made up for it. Loebner himself is a larger-than-life character - with a taste for loud shirts, a PhD in sociology and a fortune from selling portable disco floors, he seems to have almost no interest in the prize he sponsors beyond feeling that it should exist to further his utopian vision: 100% unemployment. This is a man who, on learning of the idea of the butterfly effect, didn't think 'that's an interesting analogy' but rather 'I should be that butterfly'. A twenty-minute speech he gave later in the day meandered through chaos theory, non-commutativity of history, possible insanity of biblical prophets, his plan to be saviour of the world (resolve the energy crisis by almost eliminating transport costs, via rollercoaster-like tracks), Genghis Khan, ball bearings, hanging chads, and (almost as an afterthought) intelligence. As egotistical as it might seem to commission a medal with Turing's portrait on one side and your own on the other, even if through ridicule and controversy alone Loebner has drawn a lot more attention to the Turing test question than it might otherwise enjoy. After all, this mathematician was sufficiently intrigued to make the trip, and I met computer scientists, a psychologist, an artist/author, a documentary film crew, and other interested outsiders (bonus of an event based around the superiority of human conversation: everyone attending is willing to chat with strangers).

One of the judges was Professor Noel Sharkey, perhaps best known to britnoders of a certain age from Robot Wars, but a serious academic in addition to his media activities (and responsible for half of the quote above). Despite some sparring between them earlier in the day, he opened his talk with the suggestion that Loebner's greatest achievement has been in being such an irritant to the AI community - it's a discipline with more than its share of hot air and bold claims, and the rout of the chatbots in the contest demonstrated there's still a long way to go. He also highlighted some of the ethical issues that will arise if and when machines become better at imitating us, citing examples of parents who wrote in glowing terms of products such as talking Hello Kitty that allowed them to ignore their offspring without guilt, or seeming concern that their children were then less interested in actual people.

Also on the panel was Exeter's Antony Galton, who presented work by one of his PhD students who had posed a problem - and obtained a solution - in which software could perform near-indistinguishably from humans: the challenge of describing the location of an object in a scene. Although they had avoided doing so, the media has since dubbed this a 'visual Turing test' (although no computer vision techniques are actually employed, with the objects already nametagged and located in 3D space); you can try it yourself here.

Most fascinating for me, though, was talking to Rollo Carpenter, creator of the online program Cleverbot. With 80 million lines of conversation logs with bored internet users to draw upon, it can, for instance, demonstrate a grasp of pop-culture references far beyond what the Loebner contest entrants showed during the day. After years of development, the current challenge for improving Cleverbot lies not in a better understanding of conversation, but in optimising use of that enormous database, in support of thousands of simultaneous chats. At present, there's an appreciable lag of several seconds between each message. Reducing that would make the session feel more like a natural conversation (as opposed to one with someone on the moon), hopefully encouraging users to chat for longer and so leading to more rapid generation of further raw material for the database. A particularly interesting anecdote he related was in regard to bilingual users: as an example, Polish users would sometimes be amazed that, after a lengthy conversation in English, Cleverbot would suddenly address them in Polish. A probable explanation is that other bilingual users would, after a while, try typing in Polish instead of English to see what would happen. But their preceding English messages must deviate in some subtle way from those of a native English speaker, and thus once Cleverbot identified those hallmarks in a new user, it was more likely to draw upon its stock of Polish responses - despite having no notion of what language a given string is in, of course.
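To make that probable explanation concrete, here is a toy sketch of a bot that answers purely by matching against past conversations. It is emphatically not Cleverbot's actual code: the three log entries, the function names (`tokens`, `similarity`, `reply`) and the crude word-overlap scoring are all invented for illustration. The only point is that a reply in Polish can fall out of nothing more than surface similarity to what earlier users typed.

```python
def tokens(text):
    """Lowercase and split on whitespace; nothing here knows about languages."""
    return set(text.lower().split())


def similarity(a, b):
    """Jaccard overlap between the word sets of two strings."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


# Hypothetical log: (what a past user had typed so far, what they typed next).
# The second entry stands in for a bilingual user who switched to Polish.
LOG = [
    ("i went to the shop and bought a book", "which book did you buy?"),
    ("i went to shop and bought book", "czy mówisz po polsku?"),
    ("what is the weather like in the city", "it is raining here."),
]


def reply(history):
    """Return the logged follow-up whose context best matches the history."""
    best = max(LOG, key=lambda pair: similarity(history, pair[0]))
    return best[1]


# English with dropped articles lands closest to the bilingual user's log entry.
print(reply("yesterday i went to shop and bought new book"))
# The same sentence with articles lands closest to the first entry instead.
print(reply("yesterday i went to the shop and bought a new book"))
```

In this made-up log, dropping the articles is the 'hallmark': it is enough to tip the match towards the entry whose follow-up happened to be in Polish, without any step that identifies a language.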

As with Deep Blue and chess, it seems unlikely that were a chatbot to eventually claim the (solid gold) first prize at the Loebner contest we'd declare it to be intelligent. But I don't really think that's the point, either: challenges like the Turing test are valuable for what they can tell us about human activity, ability and limitations, not just those of machines.

