Even after the search application loads, the voice-recognition system kicks in only when the user puts the phone to her ear, as detected by the phone’s built-in motion sensors. “If you’re listening all the time, then you trigger false positives,” Glass says. “The typical solution is to make you push a button,” but the motion-activated system is easier and more intuitive, he says.
The search application also uses the iPhone’s built-in location-awareness system to prioritize results. For instance, if you search for Bank of America, one of the results will be a map of local branches. This saves users from having to include location terms, which can be open to misinterpretation, in their queries.
While Google won’t disclose details about how its voice-recognition system works, it probably hasn’t done anything too radical, says Nelson Morgan, director of the International Computer Science Institute, in Berkeley, CA. “Nearly everybody who does speech recognition has a system that looks about the same,” he says. First, the system analyzes the frequency characteristics of the voice input. Then, using probabilities drawn from a huge number of real-world examples, it matches those characteristics to likely words. Finally, the candidate words are fed into a language model that uses common combinations or sequences of words to resolve ambiguities. For instance, if you say, “president of the United,” it’s likely that the next word is going to be “States.”
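The last step Morgan describes, the language model, can be illustrated with a toy sketch. This is not Google’s system, whose details are undisclosed. It is a minimal bigram model that assumes a tiny made-up training corpus and a hypothetical list of similar-sounding candidate words, and it simply picks the candidate that most often followed the previous word in training. The names `CORPUS`, `train_bigrams`, and `pick_next_word` are invented for illustration.

```python
from collections import Counter, defaultdict

# Tiny, made-up training corpus (an assumption for illustration only).
CORPUS = [
    "the president of the united states spoke today",
    "the united states is large",
    "she flew with united airlines",
    "president of the united states",
]


def train_bigrams(sentences):
    """Count how often each word follows each preceding word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts


def pick_next_word(counts, prev, candidates):
    """Choose the candidate that most often followed `prev` in training."""
    return max(candidates, key=lambda word: counts[prev][word])


counts = train_bigrams(CORPUS)

# Hypothetical acoustic candidates for the word heard after "united".
best = pick_next_word(counts, "united", ["states", "skates", "airlines"])
print(best)  # prints "states"
```

A production recognizer would combine these sequence probabilities with acoustic scores and would train on far more text, but the principle of letting common word sequences settle ambiguous sounds is the same.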
While Google isn’t announcing plans to use its voice-recognition technology for other services, the potential is easy to see. “Now we have tech to take spoken words and convert it to text,” says Gummi Hafsteinsson, a senior product manager at Google. “There are a lot of options.” Currently, there’s no way to use your voice to access Google’s calendar or e-mail applications or to write an e-mail or a text message. But that could change in the future. “I think this opens up a whole new dimension,” Hafsteinsson says.