As director of research at Google, Peter Norvig is intimately involved in the attempt to manage the world’s information. He’s a good match for the job, having spent much of his life thinking about how computers think and making them do it more efficiently. An expert on artificial intelligence, he has taught at universities, held research jobs in the corporate world and at NASA, and cowritten the influential textbook Artificial Intelligence: A Modern Approach.
Norvig came to Google in 2001 as the director of search quality; he assumed his current position four years later. In that role, he oversees about 100 computer scientists as they work on projects as diverse as medical records management and machine translation. An untold number of Google servers housing the searchable Web provide them with a test bed. He says Google is structured to ensure that researchers are not sequestered from the rest of the company. “The main allegiance they have is to the product they’re working on,” he says.
When Norvig arrived in Mountain View, Web search was simply about serving up the pages most relevant to a given query. But as the Web has grown, so has people’s need to filter information quickly. Norvig recently spoke with Technology Review’s information technology editor, Kate Greene, about what’s next for Web search.
TR: Google has many innovative products, but the look and feel of Web search hasn’t changed much in 10 years. Why?
Peter Norvig: We’ve hit on something that people mostly liked. We weren’t the first to do it. Go back to Excite and the search engines before: you have a box, and you get a list of 10 results, with a little bit of information accompanying each result. We’ve just stuck with that.
TR: What has changed?
PN: The scale. There’s probably a thousand times more information. It used to be just Web pages; now it’s video, pictures, blogs, and all sorts of media and formats. Also, the immediacy has changed. When I started, we were updating the index once a month. We thought of it as a library catalogue, a long-term thing. Now we’re seeing it more as up-to-the-minute media. When news breaks, you want to be able to read it in minutes, not in days, weeks, or months.
TR: You claim that Google’s accuracy is pretty good. How do you know how good it is, and how do you make it better?
PN: We test it in lots of ways. At the grossest level, we track what users are clicking on. If they click on the number-one result, and then they’re done, that probably means they got what they wanted. If they’re scrolling down, page after page, and reformulating the query, then we know the results aren’t what they wanted. Another way we do it is to randomly select specific queries and hire people to say how good our results are. These are just contractors that we hire who give their judgment. We train them on how to identify spam and other bad sites, and then we record their judgments and track against that. It’s more of a gold standard because it’s someone giving a real opinion, but of course, since there’s a human in the loop, we can’t afford to do as much of it. We also invite people into the labs, or sometimes we go into homes and observe them as they do searches. It provides insight into what people are having difficulty with.
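The click-based signal Norvig describes can be illustrated with a small sketch. This is not Google's system; the `SearchSession` record, its fields, and the thresholds are hypothetical names and rules chosen only to mirror the two cases he mentions: a user who clicks the top result and stops, and a user who pages through results and rewrites the query.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class SearchSession:
    """Hypothetical log record for one user's search session."""
    clicked_ranks: List[int]   # 1-based ranks of clicked results, in order
    pages_viewed: int = 1      # number of result pages the user looked at
    reformulations: int = 0    # number of times the query was rewritten


def looks_satisfied(session: SearchSession) -> bool:
    # Assumed rule: a single click on the top result, no paging, no rewrites.
    return (
        session.clicked_ranks == [1]
        and session.pages_viewed == 1
        and session.reformulations == 0
    )


def looks_dissatisfied(session: SearchSession) -> bool:
    # Assumed rule: the user paged past the first page and rewrote the query.
    return session.pages_viewed > 1 and session.reformulations > 0


def satisfaction_rate(sessions: Iterable[SearchSession]) -> float:
    # Fraction of sessions that match the "satisfied" rule; 0.0 if none given.
    sessions = list(sessions)
    if not sessions:
        return 0.0
    return sum(looks_satisfied(s) for s in sessions) / len(sessions)
```

A signal like this is cheap to gather at scale but only suggestive, which is consistent with Norvig's point that hired human raters serve as the stronger standard.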
TR: Companies such as Ask and Powerset are betting that the future is in natural-language search, which lets people use real, useful sentences instead of potentially ambiguous keywords. What is Google doing with natural language?
PN: We think what’s important about natural language is the mapping of words onto the concepts that users are looking for. But we don’t think it’s a big advance to be able to type something as a question as opposed to keywords. Typing “What is the capital of France?” won’t get you better results than typing “capital of France.” But understanding how words go together is important. To give some examples, “New York” is different from “York,” but “Vegas” is the same as “Las Vegas,” and “Jersey” may or may not be the same as “New Jersey.” That’s a natural-language aspect that we’re focusing on. Most of what we do is at the word and phrase level; we’re not concentrating on the sentence. We think it’s important to get the right results rather than change the interface.
TR: How much will Google search become personalized to individual users?
PN: We’re doing some of that in various places. One good example is in news personalization, where we give recommendations for news articles. There, it’s easier to do than in larger Web databases, because there’s a limited number of news stories. We track what news stories you look at, and we compare it to other people. And that seems to work out well. It’s harder to apply it to something as vast as the whole Web, but we’re starting with the easy parts.
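Comparing one reader's history with other people's is the basic idea behind collaborative filtering. The sketch below is an assumption about how such a comparison could work, not a description of Google News: it uses Jaccard overlap between sets of story IDs, and the function names and sample data are hypothetical.

```python
def jaccard(a, b):
    """Overlap between two sets of story IDs, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(user, histories, top_n=3):
    """Suggest unread stories, weighted by how similar other readers are.

    `histories` maps each reader to the set of story IDs they have read.
    """
    mine = histories[user]
    scores = {}
    for other, theirs in histories.items():
        if other == user:
            continue
        similarity = jaccard(mine, theirs)
        if similarity == 0.0:
            continue  # readers with nothing in common contribute nothing
        for story in theirs - mine:
            scores[story] = scores.get(story, 0.0) + similarity
    # Highest score first; ties broken alphabetically for a stable order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [story for story, _ in ranked[:top_n]]


histories = {
    "ann": {"a", "b"},
    "bob": {"a", "b", "c"},
    "cy": {"x", "y"},
    "dee": {"a", "d"},
}
print(recommend("ann", histories))  # ['c', 'd']
```

Every reader is compared with every other reader here, which is workable for a limited pool of news stories but illustrates why, as Norvig notes, the same approach is harder to apply to the whole Web.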
TR: Where do you see Google search in two to five years?
PN: You’ll see integration of various kinds of content. We’re getting into speech recognition and all the kinds of interfaces on phones, where you have a tiny screen and awkward keyboard. You’ll see that gaining in importance. You’ll see integration of our various properties. We used to put the onus on the user and ask them if they wanted Web search or image search or video search. Now we’re trying to solve that for them and serve up the results in a way that makes sense.