Peter Norvig, Google’s director of research, is an expert at building machines that answer tough questions. An authority in programming languages and artificial intelligence, he has written an oft-cited book on AI (Artificial Intelligence: A Modern Approach), has taught at the University of California, Berkeley, and the University of Southern California, and was the head of computational sciences at NASA. In 2001, Norvig came to Google to be the director of search quality. Four years later, he became Google’s director of research, overseeing about 100 researchers who investigate topics that range from networking to machine translation. Technology Review spoke with Norvig to get a hint of what we can expect from search technology in the years to come.
Technology Review: What does Google Research do?
Peter Norvig: The core of what we do is still search and advertising. A lot of researchers are working on that. They’re working to give better-quality search results and to match ads better. Another area of research is gathering more sources of information, such as text in books, still images, video, and now audio in terms of speech recognition. I think another focus is to understand how people interact with Google and interact with each other on the Web, in general. How do people operate in these social networks? Understanding that question can help us serve them better.
TR: Which research has the most people and funding?
PN: The two biggest projects are machine translation and the speech project. Translation and speech went all the way from one or two people working on them to, now, live systems.
TR: Like the Google Labs project called GOOG-411 [a free service that lets people search for local businesses by voice, over the phone]. Tell me more about it.
PN: I think it’s the only major [phone-based business-search] service of its kind that has no human fallback. It’s 100 percent automated, and there seems to be a good response to it. In general, it looks like things are moving more toward the mobile market, and we thought it was important to deal with the market where you might not have access to a keyboard or might not want to type in search queries.
TR: And speech recognition can also be important for video search, can’t it? Blinkx and Everyzing are two examples of startups that are using the technology to search inside video. Is Google working on something similar?
PN: Right now, people aren’t searching for video much. If they are, they have a very specific thing in mind like “Coke” and “Mentos.” People don’t search for things like “Show me the speech where so-and-so talks about this aspect of Middle East history.” But all of that information is there, and with speech recognition, we can access it.
We wanted speech technology that could serve as an interface for phones and also index audio text. After looking at the existing technology, we decided to build our own. We thought that, having the data and computational resources that we do, we could help advance the field. Currently, we are up to state-of-the-art with what we built on our own, and we have the computational infrastructure to improve further. As we get more data from more interaction with users and from uploaded videos, our systems will improve because the data trains the algorithms over time.
TR: While there’s a lot of research going on behind the scenes, on the surface, it looks as though search technology hasn’t changed much in more than 10 years. How is Google’s user interface changing?
PN: We’re in a situation in the main Web search where there’s a real imbalance. Users are giving us three words at a time and we’re able to give back a lot of info: 10 links with titles, snippets of text, and other information about the page. So we’re able to present a lot at once. If the user has a big screen, they can consume what we’re giving them quickly. So it’s a fast interaction but a very imbalanced one. One of the things we’re looking at is finding ways to get the user more involved, to have them tell us more of what they want. People type the query “map,” and then they get upset if it’s not the map they were thinking of. So, people may be willing to talk more than type. Or maybe they’re willing to take a suggestion if we offer something that they didn’t type a query for, but is related.
But there are search interactions other than main Web search. When you’re on cell phones, you can only see one link at a time. It really changes the game. There’s much more impetus for us to be correct, so we’re thinking about that kind of interaction there, and how you could use audio to present information.
TR: What are the outstanding problems in search?
PN: In general, we think there are two aspects of it. One is understanding users’ needs more. The other is understanding the contents of documents, whether they be Web pages or video. Mostly we look at what the user types in, treat the input as individual words, and count them up on pages and weigh those pages with different kinds of evidence. But we don’t look only at words they type in. We also look at spelling variants, and if a user types in a long query, we break it into pieces. Maybe a user meant some words, but didn’t really mean others.
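[The word-counting approach Norvig describes can be illustrated with a toy sketch. This is not Google's ranking code; the function names, the `extra_evidence` parameter, and the sample pages are all hypothetical, and real systems weigh far more signals than a raw count. It only shows the basic idea of treating a query as individual words, counting them on each page, and adding other evidence to the total.]

```python
from collections import Counter

PUNCTUATION = ".,!?\"'()"

def tokenize(text):
    """Lowercase the text and split it into words, dropping basic punctuation."""
    words = (w.strip(PUNCTUATION).lower() for w in text.split())
    return [w for w in words if w]

def score(query, page_text, extra_evidence=0.0):
    """Count occurrences of each query word on the page, then add other evidence.

    extra_evidence is a hypothetical stand-in for any non-textual signal.
    """
    counts = Counter(tokenize(page_text))
    term_score = sum(counts[word] for word in tokenize(query))
    return term_score + extra_evidence

def rank(query, pages):
    """pages maps a page name to (text, extra_evidence); return names, best first."""
    return sorted(pages, key=lambda name: score(query, *pages[name]), reverse=True)

# Hypothetical sample pages, for illustration only.
pages = {
    "a": ("map of Japan and population of Japan", 0.0),
    "b": ("recipes for dinner", 0.5),
}
ranking = rank("population of Japan", pages)
```

[In this sketch, page "a" ranks first because the query words appear on it, while page "b" scores only its extra evidence.]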
TR: That seems to have elements of natural-language search, in which people just type in a question, for instance, instead of a few keywords. How is Google advancing natural-language search?
PN: I think there’s a whole range of what you can mean as natural-language search. The first part of that range, we’ve been doing for a while. For instance, we understand synonyms and that the two words in San Francisco should go together. But then there’s Las Vegas and Vegas, which mean the same thing, and New York and York don’t mean the same thing. Those are the kinds of things we figure out. Another component of natural-language search is to parse a longer query into components. And the farthest along is typing in a full sentence in English and getting a full sentence as an answer. That sort of thing we’re not doing yet. We are answering some kinds of questions. You can query “population of Japan,” and we’ll pull that out. But for the majority of questions, that’s not what people want. They don’t want the burden of having to express it as a full sentence.
TR: Your expertise is in artificial intelligence. Isn’t Google, at its core, an artificial-intelligence company using machine-learning algorithms to search the Web, recognize speech, and match advertising with keywords?
PN: I think a lot of AI is trying to do a better job on a task for which there’s no definite answer. What are the best results for a given search query? There is no absolutely correct answer, it’s subjective, and that’s the question that Google’s answering. So I think it’s fair to say that this is an AI problem. But the way we address that problem is with lots of different approaches. There’s an AI algorithm involved, there’s software engineering, hardware, and networking to make it fast and efficient. I wouldn’t want to say that AI is everything, but it’s a big part of it.