In its time, Apple has popularized several revolutions in how we use personal computers: the graphical interface, the mouse, and the touch screen, for example. Next Tuesday could see the company add to that list of milestones in man-machine interaction by letting users control a computer by having a conversation with it.
Apple’s new boss, Tim Cook, will take the stage at the company’s California headquarters to announce the latest updates to the company’s products. Apple’s invite to the event says only “Let’s Talk iPhone,” but the Internet rumor mill has decided that Cook will announce two things: a fifth model of the iPhone, and a voice-activated “Assistant” for iPhone and iPad devices, based on an impressive app called Siri that was bought by Apple last year.
We may well get neither of these, but of the two, the second is the more interesting. I may regret saying this, but there are few significant hardware upgrades that Apple could add to the iPhone 5. Some things will get incrementally better: a higher-resolution camera, a faster processor, or a bigger screen. But there’s not a lot left to throw in that makes sense.
On the other hand, making it simple to set up calendar invites or find a nearby movie just by conversing with your iPhone or iPad would break new ground. It’s also the kind of achievable revolution that Apple is known for.
The formula is simple: take a bunch of neat technology that has never lived up to its promise, rethink what it’s for, do some secretive hard work, and then release a natural, retrospectively obvious experience that redefines what computers can do.
The iPad and iPhone interfaces are good examples of this. Touch screens, mobile browsers, and tablets existed already, but Apple rolled them together and altered the trajectory of personal computing.
The shabby history of speech recognition, voice control, and virtual assistants (remember Clippy?) is perfect feedstock for this approach. All have been around for decades and have the potential to be so much better than poking buttons or a screen. Never has anyone come close to realizing that potential.
When Siri debuted in 2009, it looked to be the best hope yet of changing that. It was the spawn of a DARPA-funded AI project and some smart thinking on integrating various tools such as maps, restaurant reviews, and movie ticket bookings, and we made it one of our 10 technologies to watch in 2009. A user could have back-and-forth conversations that begin with complex statements like, “I’d like a romantic place for Italian food near my office.”
Siri contained several smart technical ideas, but crucially, it condensed them into an easy-to-understand, working conversational interface that was actually useful. Apple could take this a significant step further by making the technology more robust and integrating it with the iPad and iPhone operating system. If it does that, the humble app Siri would be promoted to the role of Assistant, a personal aide that you talk to in normal language and that helps with most things you use your phone or tablet for. In essence, it would be your phone’s personality.
As will be pointed out in many a discussion thread if this does come to pass, Google was (sort of) there first. The company’s Android operating system has a “voice actions” feature that allows users to press and hold a button and request directions to a local business, or dictate a text message. Yet it lacks the power to take actions beyond your phone, such as booking a restaurant. More importantly, it doesn’t have a smart conversational interface.
Voice actions on Android feel like a techy side feature, not a new way of interacting with computers. Assistant could and should be a much more cohesive package. If it does arrive on Tuesday, it will likely condense a boatload of technology into one simple thing: a computer interface you converse with. If done well, that could see Apple once again shift what it means to use a computer.
Apple doesn’t perform such tricks for free, though, and it is notoriously controlling. Should Assistant appear, it will be available only on Apple devices, to drive sales. Any external services it connects with will be carefully approved. I wouldn’t be surprised to hear that Apple gets a cut of anything sold through Assistant, whether movie tickets or restaurant bookings. Still, as with the iPhone and Apple’s other disruptive ideas, it won’t be long before competitors launch imitations.
Questions remain in my mind, though, about the limits Apple will have placed on Assistant to have it live up to the company’s own high standards. Creating a voice-based interface is easy, but creating one that, in Steve Jobs’s words, “just works” is not.
The fact is that voice recognition has to cheat to be really accurate without extensive training on your voice. It needs some foreknowledge of what you are going to say. Google’s voice search mobile app, for example, is incredibly accurate because it draws on piles of data about phrases people search for. Apple Assistant should be fine when taking orders related to things it knows you might talk about, like your calendar, contacts, or music playlists. Transcribing free-form speech, such as an e-mail message, when you could say literally anything is another matter, though, and it will be interesting to see if Apple makes it part of its system. I’ve found Google’s voice actions to be infuriating to use for composing messages, and I can’t imagine Apple launching a product with such potential to annoy users.
Striking the balance between power and reliability could be the toughest design decision involved in building something like Assistant. It’s the type of judgement call that Steve Jobs excelled at, as when he put the iPad on hold and launched a smaller version in the form of a phone first. Come Tuesday, we may get a glimpse of how well Jobs’s successor negotiates the same trade-off between what could be launched and what meets Apple’s unique brand of experience-centric perfectionism.