Developers Expand Amazon Alexa’s Skills—Exposing Both Its Potential and Its Limitations
It’s amazing how much you can get done by talking to a 9.25-inch-tall cylinder.
Amazon’s Echo home speaker and the device’s built-in Alexa voice-activated assistant spring into action any time you call out, “Alexa.” You can cue up music, call an Uber, or play games. If you have Internet-connected home devices, you can turn on the lights with your arms full of groceries, or adjust the thermostat without lifting a finger. It’s incredibly handy.
Amazon has already sold an estimated four million Echo devices and inspired Google to create a me-too product called Google Home. The e-commerce giant is now trying to encourage other companies to expand Alexa’s capabilities by integrating it into their own services.
The idea is that just as apps made smartphones much more useful and popular, apps for Alexa, dubbed “skills,” can make the Echo more powerful and lucrative for Amazon. There are already more than 2,000 skills for the Echo. At the Code conference in May, Amazon CEO Jeff Bezos said the current state of the technology is “just the tip of the iceberg.”
The capabilities developers have added show the promise of the platform, but they also highlight its limitations.
Many of the most useful skills for Echo work with smart-home devices. The leading smart-home company Insteon, for example, has dozens of devices such as lightbulbs and thermostats that can be controlled via Echo by saying things like “Alexa, lower the master-bedroom thermostat by three degrees.”
Owners of the August smart lock can lock a door using voice commands. But they can’t unlock it, because August and Amazon worry about the security ramifications of that capability. Other skills make it possible to order pizza from Domino’s or pay your Capital One credit card bill.
In the business world, Alexa is literally getting a seat at the conference table. Customers of the business analytics company Sisense can ask it questions such as what total revenue was during the last quarter in Europe. Customers are now installing the Echo in conference rooms, says Amir Orad, CEO of Sisense.
The collaboration software company Citrix is also bringing Echo into the workplace. It has built ways for customers to reserve conference rooms, and control their lights and equipment, by voice command. IT administrators can call out to Alexa to check the health of Citrix programs.
Echo and Alexa still have plenty of limitations, though. One is that although you’re encouraged to talk to Alexa “naturally,” you have to use specific words and phrasing. For example, you can only ask one thing at a time, so when you go to bed you can’t call out, “Alexa, turn off the downstairs lights, turn the thermostat to 70 degrees, and set an alarm for 7 a.m.” You have to request each action individually and wait for a response.
Some companies are finding ways to work around the Echo’s shortcomings. Yonomi, a smart-home company, has built software that connects multiple home devices so that they can all be controlled by a single Alexa voice command. After a little setup, I can say, “Alexa, turn on Netflix and Chill” to have my TV turn on, Netflix open on the screen, and my lights dim.
Mark Rolston, founder of the design firm Argodesign and the former chief creative officer at Frog Design, thinks Alexa as it exists today has more fundamental problems. He argues that if it is ever to be more than a convenient novelty, it must communicate using more than just audio—for example, by showing you information on a screen.
“If Amazon continues to sell it in that vein, it’s ultimately limited,” he says. “It’s like the Apple iPhone before the App Store.” Steve Wilson, VP of core infrastructure at Citrix, agrees. “Via voice you’re going to get a very thin interaction; if you want to go in-depth you’ll have to look at a screen,” says Wilson, who worked on the company’s Alexa skill.
Amazon does show some responses to questions on its Alexa phone app, but in almost all cases you have to listen for a reply to your question. You can’t, for example, ask Alexa for the lunch menu of a nearby restaurant and see it on your phone. (A promotional video for the forthcoming Google Home shows the device pushing information like maps to people’s phones, but it’s unclear how that feature will appear in the final product.)
Another challenge of a voice interface is how to handle authentication. Currently, the Echo offers a PIN that an owner can set. The user must say the PIN aloud before making a purchase. However, a spoken PIN is only useful as a security tool if no one else hears it. Parents might not want their kids overhearing the code needed to ask Alexa to buy stuff on their credit card, for example.
The surprise success of the Echo gives Amazon a head start over its competitors in figuring out such challenges. Millions of users, and the many companies working to integrate with the platform, will give Amazon plenty of data that it can use to try to invent ways to make a talking cylinder an indispensable part of every home.