In its time, Apple has popularized several revolutions in how we use personal computers: the graphical interface, the mouse, and the touch screen, for example. Next Tuesday, the company could add to that list of milestones in man-machine interaction by letting users control a computer by having a conversation with it.
Apple’s new boss, Tim Cook, will take the stage at the company’s California headquarters to announce the latest updates to the company’s products. Apple’s invite to the event says only “Let’s Talk iPhone,” but the Internet rumor mill has decided that Cook will announce two things: a fifth model of the iPhone, and a voice-activated “Assistant” for iPhone and iPad devices, based on an impressive app called Siri that was bought by Apple last year.
We may well get neither of these, but of the two, the second is the more interesting. I may regret saying this, but there are few significant hardware upgrades that Apple could add to the iPhone 5. Some things will get incrementally better, such as a higher-resolution camera, a faster processor, or a bigger screen, but there’s not a lot left to throw in that makes sense.
On the other hand, making it simple to set up calendar invites or find a nearby movie just by conversing with your iPhone or iPad would break new ground. It’s also the kind of achievable revolution that Apple is known for.
The formula is simple: take a bunch of neat technology that has never lived up to its promise, rethink what it’s for, do some secretive hard work, and then release a natural, retrospectively obvious experience that redefines what computers can do.
The iPad and iPhone interfaces are good examples of this. Touch screens, mobile browsers, and tablets existed already, but Apple rolled them together and altered the trajectory of personal computing.
The shabby history of speech recognition, voice control, and virtual assistants (remember Clippy?) is perfect feedstock for this approach. All have been around for decades and have the potential to be so much better than poking buttons or a screen. Never has anyone come close to realizing that potential.
When Siri debuted in 2009, it looked to be the best hope yet of changing that. It was the spawn of a DARPA-funded AI project and some smart thinking on integrating various tools such as maps, restaurant reviews, and movie ticket bookings, and we made it one of our 10 technologies to watch in 2009. A user could have back-and-forth conversations that begin with complex statements like, “I’d like a romantic place for Italian food near my office.”
Siri contained several smart technical ideas, but crucially, it condensed them into an easy-to-understand, working conversational interface that was actually useful. Apple could take this a significant step further by making the technology more robust and integrating it with the iPad and iPhone operating system. If it does that, the humble app Siri would be promoted to the role of Assistant, a personal aide that you talk to in normal language and that helps with most things you use your phone or tablet for. In essence, it would be your phone’s personality.
As will be pointed out in many a discussion thread if this does come to pass, Google was (sort of) there first. The company’s Android operating system has a “voice actions” feature that allows users to press and hold a button and request directions to a local business, or dictate a text message. Yet it lacks the power to take actions beyond your phone, such as booking a restaurant. More importantly, it doesn’t have a smart conversational interface.