What is Watson? IBM’s supercomputing system beat Ken Jennings and Brad Rutter, the top two (human) contestants Jeopardy! has ever had.
Watching the computer system known as Watson defeat the top two human Jeopardy! players of all time was fun in the short term. This demonstration of IBM’s software, however, was a bad idea in the longer term. It presented a misleading picture to the public of what is known about machine and human intelligence, and more seriously, it advanced a flawed approach to science that stands to benefit the enemies of science.
There’s a crucial distinction to make right away. My purpose is not to criticize the work done by the team that created Watson. Nor do I want to critique their professional publications or their interactions with colleagues in the field of computer science. Instead, I am concerned with the nature of the pop spectacle hatched by IBM.
Why was there a public spectacle at all? Certainly it’s worthwhile to share the joy and excitement of science with the public, as NASA often does. But when the NASA rovers landed on Mars, there were no other Mars rovers to compare them with, whereas there is a whole world of research related to artificial intelligence against which Watson could have been measured. By putting its system on TV and personifying that system with a name and a computer-generated voice, IBM separated it from its context, suggesting, falsely, the existence of a sui generis entity.
Contrast IBM’s theatrics with the introduction of Wolfram Alpha, a “knowledge engine” for the Web that physicist Stephen Wolfram released in 2009 (see “Search Me,” July/August 2009). Although the early rhetoric around Alpha was a touch extreme, sometimes exaggerating its natural-language competence, the method of introduction was vastly more honest. Wolfram Research didn’t resort to stage magic: Alpha was made available online for people to try. Stephen Wolfram encouraged people to use his technology and compare the results with those generated by search engines like Google. Alpha proved honestly that it was something fresh, different, and useful. Comparison with what came before is crucial to progress in science and technology.
But Watson was presented on TV as an entity instead of a technology, and people are inclined to treat entities charitably. You are more likely to give a “he” the benefit of the doubt, while you judge an “it” for what it can do as a tool. Watson avoided any such comparative judgment, and the public wasn’t given a window into what would happen in that kind of empirical process. Stephen Wolfram himself, however, went to the trouble of writing a blog post comparing Watson with everyday search engines. He entered the text of Jeopardy! clues into those search engines and found that in many cases, the first document they returned contained the answer. Identifying a page that contains the answer is not the same thing as being able to give the answer on Jeopardy!, but this little experiment does indicate that Watson’s abilities were less extraordinary than one might have gathered from watching the broadcast.
Wouldn’t it have been better to open the legitimate process of science to the public instead of staging a fake version? An example of how to do this was the DARPA-sponsored “Grand Challenge” to create self-driving cars. By pitting technologies against each other, DARPA informed the public well and offered a glimpse into the state of the art. The contest also made for great TV. Competitors were motivated. The process worked.
The Jeopardy! show in itself, by contrast, was not informative. There are a multitude of open questions about how human language works and how brains think. But when machines are pitted against people, an unstated assertion is inevitably propagated: that human thinking and machine “intelligence” are already known to be at least comparable. Of course, this is not true.
In the case of Jeopardy!, the game’s design isolates a specific skill: guessing words on the basis of hints. We know that being able to guess an unstated word from its context is part of language competency, but we don’t know how important that skill is in relation to the whole phenomenon of human language. We don’t fully know what would be required to re-create that phenomenon. Even if it had been stated (in fine print, as it were) that the task of competing at Jeopardy! shouldn’t be confused with complete mastery of human language, the extravaganza would have left the impression that scientists are on a rapid, inexorable march toward conquering language and meaning—as if a machine that can respond like a person in a particular context must be doing something similar to what the human brain does.
Much of what computer scientists were actually doing in this case, however, was teaching the software to identify statistical correlations in giant databases of text. For example, the terms “Massachusetts,” “university,” “technology,” and “magazine” will often be found in documents that also contain the term “Technology Review.” That correlation can be calculated on the fly to answer a Jeopardy! question; similar methods have proved useful for search engines and automated help lines. But beyond such applications, we don’t know where this particular line of research will lead, because recognizing correlations is not the same as understanding meaning; a sufficiently large statistical simulation of semantics is not the same thing as semantics. Similarly, you could use correlations and extrapolations to predict the next number in a given numeric sequence, but you need deeper analysis and mathematical proof to get it right every time. Goodstein sequences are sequences of numbers that seem to always go up—until eventually they revert and fall to zero. A prediction based on statistical analysis of the early phase of such a sequence would get the rest of the sequence wrong. Correlations can simulate understanding without really delivering it.
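To make the Goodstein example concrete, here is a minimal sketch in Python. The function names `bump` and `goodstein` are hypothetical, chosen only for this illustration, and the "predictor" in the comments (assume the next term repeats the pattern seen so far) is a deliberately naive stand-in for extrapolation from early data, not a description of anything Watson does.

```python
def bump(n, base):
    """Write n in hereditary base-`base` notation, then replace every
    occurrence of `base` (including inside exponents) with base + 1."""
    result = 0
    exponent = 0
    while n > 0:
        n, digit = divmod(n, base)
        if digit:
            # Exponents are themselves rewritten recursively.
            result += digit * (base + 1) ** bump(exponent, base)
        exponent += 1
    return result


def goodstein(start, max_terms):
    """Return up to max_terms terms of the Goodstein sequence from `start`,
    stopping early if the sequence reaches zero."""
    terms = []
    n, base = start, 2
    while len(terms) < max_terms:
        terms.append(n)
        if n == 0:
            break
        n = bump(n, base) - 1  # change the base, then subtract one
        base += 1
    return terms


# Starting at 3, the first three terms are all 3. A naive predictor that
# extrapolates from them would guess 3 again; the sequence drops instead.
print(goodstein(3, 10))  # [3, 3, 3, 2, 1, 0]

# Starting at 4, the early terms only climb.
print(goodstein(4, 6))   # [4, 26, 41, 60, 83, 109]
```

The sequence that starts at 4 keeps rising for an astronomically long time, yet Goodstein's theorem proves that it, too, eventually falls to zero. No amount of curve-fitting on the opening terms would reveal that; only the proof does.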
Ultimately, does the Watson show really matter? Why not let IBM’s PR people enjoy a day in the sun? Here’s why not: there is a special danger when science is presented to the public in a sloppy way. Technical communities must exhibit exemplary behavior, because we are losing public legitimacy in the United States. Denying global climate change remains respectable in politics; many high-school biology teachers still don’t fully accept evolution.
Unfortunately, the theatrics of the Jeopardy! contest play the same trick with neuroscience that “intelligent design” does with evolution. The facts are cast to make it seem as though they imply a metaphysical idea: in this case, that we are making machines come alive in our image.
Indeed, that is a quasi-religious idea for some technical people. There’s a great deal of talk about computers inheriting the earth, perhaps in a “singularity” event—and perhaps even granting humans everlasting life in a virtual world, if we are to believe Ray Kurzweil.
But even if we quarantine overtly techno-religious ideas, the Watson-on-Jeopardy! scheme projects an alchemical agenda. We say, “Look, an artificial intelligence is visible in the machine’s correlations.” A promoter of intelligent design says, “Look, a divine intelligence is visible in the correlations derived from sources like fossils and DNA.”
When we do it, how can we complain that others do it? If scientists desire respect from the public, we should expect to be emulated, and we should be careful about what methods we present for emulation.
Jaron Lanier is a computer scientist, writer, and musician. His most recent book is You Are Not a Gadget (Knopf, 2010). He is a partner architect at Microsoft Research and the innovator in residence at the Annenberg School of USC. His name has been used in a Jeopardy! clue.