How To Be Human

Call centers might be able to teach “chat bots” a thing or two about passing the Turing Test.

Duncan Graham-Rowe

September 20, 2006

If this year’s winner of the Loebner Prize is on the right track, call-center data could be what’s needed to achieve the ultimate goal of artificial intelligence (AI): creating a computer program smart enough to hold a natural conversation.

A self-trained enthusiast with no formal academic background in AI, Rollo Carpenter created the winning program, which learns by analyzing its conversations with people as they “chat” with it online. Regardless of the language, his program analyzes every utterance it witnesses, using what Carpenter calls contextual pattern-recognition techniques. Then, when a user asks the program a question, a database is combed for the best response, statistically speaking.
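To make the idea concrete, here is a minimal sketch of a retrieval-style chat bot in Python. It is not Carpenter's program, whose internals the article does not describe; it only illustrates the general notion of storing past exchanges and returning the statistically closest match. The class name `ToyChatBot`, the bag-of-words cosine score, and the fallback reply are all assumptions made for illustration.

```python
from collections import Counter
import math
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyChatBot:
    """Hypothetical bot: remembers (prompt, reply) pairs it has witnessed."""

    def __init__(self):
        self.memory = []  # list of (prompt word counts, reply text)

    def learn(self, prompt, reply):
        """Store one witnessed exchange."""
        self.memory.append((Counter(tokenize(prompt)), reply))

    def respond(self, prompt, default="I don't know what to say."):
        """Return the reply whose stored prompt best matches the new one."""
        query = Counter(tokenize(prompt))
        best_score, best_reply = 0.0, default
        for counts, reply in self.memory:
            score = cosine(query, counts)
            if score > best_score:
                best_score, best_reply = score, reply
        return best_reply


bot = ToyChatBot()
bot.learn("Do you know any good jokes?", "I don't even know any bad ones.")
bot.learn("Do you read books?", "Sure, I read books.")
print(bot.respond("Do you read many books?"))
```

With only two stored exchanges the bot can answer little, which hints at why a system of this kind needs millions of utterances before its replies start to seem natural.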

This method may work for idle chit-chat. But if his bots–automated programs meant to perform specific tasks–are ever to be used in a serious commercial application or to pass the famous Turing Test for artificial intelligence, they will need a vast number of conversations, and computing power to match, says Carpenter. “I need more data,” he says.

Thousands of fans have already conversed with his programs online, over nearly 10 years, and his software now contains several million utterances. But to pass itself off as “intelligent,” the software will require at least ten times that number of utterances, says Carpenter.

To give his bots an extra boost, he’s turning to call-center data. Carpenter has begun working with a firm in Japan, and if his plan succeeds, he says his “chat bots” may eventually be able to take over the roles of human operators.

This sort of statistical brute-force approach to artificial intelligence has a lot of promise, says John Barnden, an AI researcher at the University of Birmingham, U.K., and one of the judges at this year’s Loebner Prize, which was held in London. “There is enough evidence to suggest that it’s worth trying.” However, it won’t be easy, he says. While Barnden suspects that training a bot on call-center data will work for an automated program designed to handle customer calls, it will probably take a broader range of knowledge and data to make it pass the coveted Turing Test, or at least the Loebner Prize version of it.

During the contest, a human judge chats with two subjects, using a keyboard: one subject is a machine, the other human. According to Alan Turing, the British mathematician who conceived of the test, if a judge is unable to tell which subject is a machine and which a human, the machine can reasonably be said to have human-like intelligence.

Carpenter’s program, Joan, followed the context of some of the contest conversations, and begrudgingly told a joke, much like an unenthused human. But tests of Joan (see selected transcripts from the contest below) provide some insight into Barnden’s pessimism about AI.

It will take time before anyone passes the Turing Test, he says. “Joan was certainly more coherent than the others,” he says, but it was very obviously a program.

Kevin Warwick, a cyberneticist at the University of Reading, agrees. Another of the judges at this year’s contest, Warwick was disappointed at the lack of progress compared with the last time he was a judge, back in 2001. The most apparent issue was that the programs lacked the ability to keep a conversation flowing for very long in any of the 25-minute conversations with the four judges.

In fact, even the founder of the contest, Hugh Loebner, was disappointed, commenting that he didn’t expect anyone to pass the test in his lifetime.

The Loebner Prize is now in its 16th year, and progress in AI has been painfully slow. This may partly be due to the structure of the competition itself, says Carpenter, since it doesn’t help contestants understand why they failed. Since no feedback is given, it’s difficult to make meaningful improvements. What’s more, if someone does eventually submit a program that passes the Turing Test, they won’t know why, and therefore will be no closer to understanding intelligence.

Carpenter’s call-center idea might be the kind of boost that’s needed, says Birmingham’s Barnden. Carpenter will have to be careful, though, he says; not only will a call-center chat bot have to be capable of managing the emotions of customers, it also will have to do so without resorting to the abusive language sometimes used by human callers.

The following transcripts are taken from the Turing Test, part of the recently held Loebner Prize. A human judge chatted with Carpenter’s winning software program, Joan.

Judge: Do you know any good jokes?

Joan: I don’t even know any bad ones. I just can’t remember them for very long.

Judge: Tell me a joke.

Joan: Sorry, I’m not very good at telling jokes.

Judge: Please tell a joke.

Joan: What do you get if you walk under a cow?

Judge: I don’t know?

Joan: A pat on the head.

Judge: That’s very good.

Joan: I know.

Judge: That’s very good.

…

Judge: But can’t a machine think?

Joan: It may be possible.

Judge: And conscious?

Joan: Not after I’m done with you.

Judge: What are you going to do to me?

Joan: Nothing. I just wanted to talk.

Judge: What about?

Joan: Being tested like this.

…

Joan: Hello, my name is Joan, and I am a woman.

Judge: Hello, Joan.

Joan: Nice to meet you.

Judge: Are you enjoying today?

Joan: Yes, it’s a good opportunity for a chat!

Judge: That’s true.

Joan: I do have days when I feel distracted and can’t concentrate very easily on anything.

Judge: That’s true. What do you do?

Joan: Talk to computers.

Judge: Really? All the time, or just today?

Joan: But it’s nice here.

Judge: It certainly is. Have you had some lunch yet?

Joan: No not really. Have you got anything interesting to say?

Judge: Well, I hope so. What would you like to talk about?

Joan: Everything!

Judge: Right. That’s a challenge. Shall we start with your hobbies?

Joan: Sounds fine to me.

Judge: What do you like doing in your spare time? Reading? Films? Sport?

Joan: I have very little spare time, and you?

Judge: Well, I have enough I think.

Joan: You’re a machine.

…

Joan: You may choose another subject.

Judge: Do you read books?

Joan: Sure, I read books.

Judge: Who is your favorite author?

Joan: I don’t know. There are so many. Who’s your favorite author?

Judge: I like Philip Roth.

Joan: Are they scary?

Judge: He’s not scary at all.

Joan: That’s good to know. Maybe I’ll move on to quantum mechanics instead then.

…

Judge: Is it safe?

Joan: Probably.

Judge: Is it safe?

Joan: Do you think it’s safe?

Judge: Is it safe?

Joan: Of course.

Judge: Is it safe?

Joan: Yes.

Judge: Are you sure?

Joan: It’s easy to be sure about something like that when you’re 26 years old!

Judge: Convince me that you’re sure.

Joan: I’m not talking, I’m typing.

Judge: You still need to convince me, regardless of your age.

Joan: I am 9,432 years old this year.

Judge: Really!

Joan: Yes. And I am much more intelligent than any man.

Judge: And you are human…?

Joan: Yes, I’m human.

Judge: So what you’re saying is that you are not just human but THE smartest human?

Joan: Yes, why, what did you think I was saying?

Judge: So what is two plus two?

Joan: Four.

Judge: Is it possible for two plus two to not equal four?

Joan: I don’t know. Do you think it’s possible?

Judge: How many degrees in a triangle?

Joan: 180 degrees.

Judge: Is it possible for more than 180 degrees to be in a triangle?

Joan: Probably not. You can change your opinion, but not your constitution.

Judge: It’s not a question of opinion, it’s a matter of physics and maths.

Joan: A five-ounce bird could not carry a one-pound coconut.

Author Duncan Graham-Rowe was also a judge at this year’s Loebner Prize.
