I’ll Take ‘Massively Parallel’ for $1,000, Alex

Watson, a computer program built by IBM, is about to compete on “Jeopardy.” But a bigger test will come when IBM tries to translate the machine intelligence into other domains.

Brian Bergstein

January 13, 2011

Two years after IBM announced that it was working on a computer program that would compete on TV’s Jeopardy, the event is set to air over three days next month. IBM’s program, Watson, will play two matches against the best players the game show has ever had, Ken Jennings and Brad Rutter, with the top performer getting $1 million (IBM would donate its winnings to charity). It should make for an entertaining spectacle, but it’s not clear how meaningful Watson’s performance will be.

IBM wants to show that it has dramatically advanced the state of machine intelligence since 1997, when its Deep Blue computer beat chess grandmaster Garry Kasparov. Unlike playing chess, competing on Jeopardy requires a machine to understand nuances of language, which is one of the hardest problems in computing. The Watson system won’t be connected to the Internet—it will have to analyze information it has already been fed and assess its level of confidence in whether it has come up with the right answer. (Watson also won’t listen to Alex Trebek read the “answers” for which Jeopardy players come up with questions; those will be entered into the program as text. For some insight into how Watson sorts through possible responses, you can try this interactive demonstration from the New York Times.)

Michael Littman, a machine-learning researcher at Rutgers University, says that even if Watson doesn’t win the match, merely coming close to human Jeopardy champions would be an impressive achievement, because until now computers have bested humans only at “closed-world” games that can be reduced to logical choices, like chess and backgammon. Machines have not come close to matching human skill at “open-world” games and puzzles such as crosswords, Littman says. IBM says the project will have real value, arguing that Watson possesses such linguistic and computational dexterity that it is likely to lead to useful tools in a wide variety of fields. For instance, the company says, it might become the basis of engines for answering questions in medicine, customer service, and travel applications. That has long been a goal of artificial intelligence research: to create systems that aren’t “brittle,” or good only in narrow fields.

Then again, answering Jeopardy questions well is in itself a specific field. It makes use of puns and other forms of word play in a way that other applications do not. “I’ve looked into this work a bit and it seems to be an interesting collection of special-purpose methods designed for Jeopardy questions, which tend to fall into one of a fairly small set of categories,” says Stuart Russell, an AI researcher at the University of California, Berkeley. “I don’t want to minimize the effort it took to get it working, but I am not sure it represents a significant advance for AI because the techniques do not appear to generalize easily to other forms of natural language understanding.” In other words, says Doug Lenat, who heads Cycorp—a company that has labored to infuse AI software with common-sense reasoning skills—even though Watson’s ability to understand grammar and subtleties of meaning is valuable, to succeed outside of Jeopardy it would have to be taught new sets of rules for each application. “You could do something similar for other domains, but just like in the past, the narrower the domain, the more success you’re going to have with that,” Lenat says. “This is more of a celebration of the current state of the art that natural language processing has achieved rather than an advance in that state of the art.”

Some researchers find it hard to assess Watson because IBM hasn’t revealed much of the computer science behind it. Marvin Minsky, an MIT professor and AI pioneer, says he doesn’t think it would be right to comment about the project “until IBM issues a technical report on the system, how it works, and what are its results.” Insight could come from watching the show, especially when Watson gets things wrong: its incorrect responses figure to offer clues about the program’s methods.

Lenat says he’ll be watching; he plans to have a viewing party at his company. Even if Watson does not necessarily represent a breakthrough for AI, the attention it gets will be good for the field, he says. “It is the 2011 analogue of what Deep Blue did in chess,” Lenat says. “It will capture human imagination.”
