The notion of asking a computer for information out loud is familiar to most of us only from science fiction. Google is trying to change that by adding speech recognition to its search engine, and releasing technology that would allow any browser, website, or app to use the feature.
But are you ready to give up your keyboard and talk to Google instead?
Over the last two weeks, speech input for Google has gradually been rolled out to every person using Google’s Chrome browser. A microphone icon appears at the right end of the iconic search box. If you have a microphone built in or attached to your computer, clicking that icon creates a direct audio connection to Google’s servers, which convert your spoken words into text.
It has been possible to speak Google search queries using a smartphone for almost three years; since last year, Android handsets have been able to take voice input in any situation where a keyboard would normally be used. “That was transformational, because people stopped worrying about when they could and couldn’t speak to the phone,” says Vincent Vanhoucke, who leads the voice search engineering team at Google. Over the last 12 months, the number of spoken inputs, search or otherwise, via Android devices has grown sixfold, and every day, tens of thousands of hours of speech are fed into Google’s servers. “On Android, a large fraction of the use is people dictating e-mail and SMS,” says Vanhoucke.
Vanhoucke’s team now wants voice input on the Web to be as easy as it is on Android. “It’s a big bet,” he says. “Voice search for desktop is the flagship for this, [but] we want to take speech everywhere.”
Voice recognition is more technically challenging on a desktop or laptop computer, says Vanhoucke, because it requires noise suppression algorithms that are not needed for mobile speech recognition. These algorithms filter out background sounds such as the hum of a computer’s fan or an air conditioner. “The quality of the audio is paramount for phone manufacturers, and you hold it close to your mouth,” says Vanhoucke. “On a PC, the microphone is an afterthought, and you are further away. You don’t get the best quality.”
Google asked thousands of people to read phrases aloud to their computers to gather data on the conditions its speech recognition technology would have to handle. As people use the service for real, it is trained further and becomes more accurate, says Vanhoucke, which should increase its popularity: data from users of mobile voice search shows that people are much more likely to use the feature again when it is accurate for them the first time.