Smartphones promise a lot of computing power and connectivity: We can search the Web and communicate from anywhere. But it can be hard to make full use of these capabilities on small screens with tiny buttons. Now comes a new wave of applications that combine speech recognition and artificial intelligence to help people carry out simple tasks on their mobile devices.
The latest such service, from Vlingo, a company that makes voice-recognition applications, tries to go beyond earlier apps by combining a user’s spoken commands with personal data and information stored online. Called “SuperDialer,” the service can, for example, let a user say “Call pizza” and then see a list of nearby pizza places drawn from both the user’s address book and the Web.
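To make the “Call pizza” example concrete, here is a minimal sketch of how a spoken command might be turned into a single list drawn from two sources. It is an illustration only, not Vlingo’s implementation: the `Listing` type, the `parse_call_command` and `merge_listings` functions, and the sample names and phone numbers are all hypothetical, and the speech recognition step is assumed to have already produced text.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class Listing:
    """A callable entry (hypothetical type, for illustration only)."""
    name: str
    phone: str
    source: str  # "contacts" or "web"


def parse_call_command(utterance: str) -> Optional[str]:
    """Return the search term from a 'call <term>' command, else None."""
    words = utterance.strip().split(maxsplit=1)
    if len(words) == 2 and words[0].lower() == "call":
        return words[1].lower()
    return None


def merge_listings(term: str,
                   contacts: Iterable[Listing],
                   web_results: Iterable[Listing]) -> List[Listing]:
    """Combine matches from both sources, address book first.

    An entry matches if the term appears in its name. Duplicates are
    dropped by phone number, so a place found in both sources appears once.
    """
    matches = []
    seen_phones = set()
    for listing in list(contacts) + list(web_results):
        if term in listing.name.lower() and listing.phone not in seen_phones:
            seen_phones.add(listing.phone)
            matches.append(listing)
    return matches


# Made-up sample data standing in for an address book and a Web lookup.
contacts = [
    Listing("Tony's Pizza", "555-0101", "contacts"),
    Listing("Mom", "555-0100", "contacts"),
]
web_results = [
    Listing("Pizza Palace", "555-0102", "web"),
    Listing("Tony's Pizza", "555-0101", "web"),
]

term = parse_call_command("Call pizza")
results = merge_listings(term, contacts, web_results)
for listing in results:
    print(listing.name, listing.phone, listing.source)
```

Listing the address-book matches first is a design choice made for this sketch; a real service could rank by distance, call history, or other signals.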
The SuperDialer is the first of a series of releases planned by Vlingo. All are intended to add a stronger artificial intelligence backbone to the company’s speech recognition software.
In August, Vlingo hopes to release a social networking application that would connect with a variety of the user’s accounts, including those on sites such as the location-based services Foursquare and Loopt. Users could, for instance, ask aloud where their friends are and retrieve answers.
A separate service in the works, Vlingo Answers, would respond if a user asked a specific question such as “How old is Kiefer Sutherland?” Vlingo would try to get the answers from standard Web search results and scans of specialty information sites such as Wolfram Alpha and True Knowledge.
On the surface, applications like these may seem simple, but CEO Dave Grannan says they involve sophisticated levels of technology. First, the application has to recognize what the user is saying. Then it has to work out what the user means, for example by deciding how to interpret words that could have multiple meanings, such as “vets.” Finally, it has to get the information the user needs and provide an easy interface for acting on it.
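The three stages Grannan describes can be pictured as a small pipeline. The sketch below is a toy under stated assumptions, not the company’s method: the `recognize`, `interpret`, and `act` functions and the `SENSES` table are hypothetical, recognition is replaced by a stand-in that accepts text already transcribed, and ambiguity is resolved by simply counting context words that overlap with each candidate meaning.

```python
# Hypothetical table of ambiguous words: each sense maps to context
# words that hint at it. Real systems would use far richer models.
SENSES = {
    "vets": {
        "veterinarian": {"dog", "cat", "pet", "animal"},
        "veteran": {"army", "military", "service", "war"},
    }
}


def recognize(transcript):
    """Stage 1 stand-in: assumes speech has already been transcribed."""
    return transcript.lower().replace("?", "").split()


def interpret(words):
    """Stage 2: replace each ambiguous word with its best-matching sense.

    The sense whose hint words overlap most with the rest of the
    utterance wins; on a tie, the first sense listed is kept.
    """
    context = set(words)
    resolved = []
    for word in words:
        senses = SENSES.get(word)
        if senses is None:
            resolved.append(word)
        else:
            best = max(senses, key=lambda s: len(senses[s] & context))
            resolved.append(best)
    return resolved


def act(words):
    """Stage 3 stand-in: build a query string instead of fetching results."""
    return "search: " + " ".join(words)


print(act(interpret(recognize("Find vets for my dog"))))
print(act(interpret(recognize("Find military vets groups"))))
```

With the table above, the first request resolves “vets” to “veterinarian” and the second to “veteran,” because of the words “dog” and “military” in each utterance.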
Grannan says Vlingo’s goal is to help users transform words into actions, so that people don’t have to think about what button to push or exactly how to say what they need a device to do.
This idea is similar to the virtual assistant for the iPhone offered by Siri, a company that was recently acquired by Apple for an undisclosed amount. Siri’s CEO, Dag Kittlaus, often referred to his company’s technology as a “doing engine,” and carefully distinguished its ability to accomplish tasks for the user from the Web’s familiar search functions.