
Alexa Gives Amazon a Powerful Data Advantage

Millions of people talking with Alexa could help Amazon fight off Google in the home voice assistant market.
January 18, 2017
Amazon is estimated to have sold more than five million Echo devices, which have the voice-activated Alexa assistant inside.

“Hey, Alexa” is a phrase that millions of people call out at home just before telling Amazon their desires at that moment. All those people asking Alexa to order kitchen supplies, turn on the lights, or play music give Amazon a valuable stockpile of data that it could use to fend off competitors and make breakthroughs in what voice-operated assistants can do.

“There are millions of these in households, and they’re not collecting dust,” Nikko Strom, a speech-recognition expert and founding member of the team at Amazon that built Alexa and Echo, said at the AI Frontiers conference in Santa Clara, California, last week. “We get an insane amount of data coming in that we can work on.”

Strom said that data had already helped the company make progress on a longstanding challenge in speech recognition known as the cocktail party problem: picking out a single voice from a hubbub of many people talking.

Initially Alexa could easily tell that someone had called out its name, but—like other voice-recognition systems—it struggled to know which words being said around it were the request being issued. Then Strom’s team developed a system that notes characteristics of a voice that calls out “Alexa” and uses them to home in on the words of the person asking for help.
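To make the idea concrete, here is a toy sketch of anchoring on a wake-word speaker. It is not Amazon's system, whose details Strom did not give; it only illustrates the general approach of summarizing the voice that said “Alexa” and then keeping later audio frames that resemble it. The function names, the two-number "voice features," and the similarity threshold are all assumptions made up for this illustration.

```python
import math


def mean_vector(frames):
    """Average a list of equal-length feature vectors, element by element."""
    count = len(frames)
    return [sum(column) / count for column in zip(*frames)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def select_anchored_frames(wake_frames, later_frames, threshold=0.9):
    """Return indices of later frames that resemble the wake-word speaker.

    wake_frames: feature vectors from the audio where the wake word was heard
    later_frames: feature vectors from the audio that follows
    threshold: hypothetical similarity cutoff, chosen arbitrarily here
    """
    anchor = mean_vector(wake_frames)  # summary of the requester's voice
    return [
        index
        for index, frame in enumerate(later_frames)
        if cosine_similarity(anchor, frame) >= threshold
    ]


# Made-up features: frames 0 and 2 resemble the wake-word speaker, frame 1 does not.
wake = [[1.0, 0.1], [0.9, 0.2]]
later = [[1.0, 0.15], [0.1, 1.0], [0.8, 0.1]]
print(select_anchored_frames(wake, later))  # [0, 2]
```

A real system would work with learned acoustic features rather than hand-written numbers, and would likely feed the anchor into the recognizer itself instead of filtering frames with a fixed cutoff.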

The data Amazon is amassing to take on problems like that could be unique. Standard datasets available for training and testing speech recognition systems don’t usually include audio captured in home environments, or audio recorded with microphone arrays like the one the Echo uses to focus on speech from a particular direction, says Abeer Alwan, a professor at the University of California, Los Angeles, who works on speech recognition.

“People have been toying with microphone arrays for a long time but I don’t think there has been a deployment at the scale Amazon is talking about,” says Alwan. More data on a particular scenario or type of speech usually translates into better performance, she says.

Strom said he also hopes that his team’s data trove could eventually help upgrade Alexa so that it can follow two people speaking simultaneously. “It’s hard, but there’s been some progress,” he said. “It’s super interesting for us if we could solve that problem.”

Strom didn’t say what Alexa might be able to do once that problem is solved. But it might make it more natural for multiple people to interact with an Echo or other device at once, whether that’s kids peppering Alexa with questions or their parents rattling off a shopping list.

The data piling up from Alexa could also help Amazon fend off Google’s Echo competitor, Google Home, which launched late last year. Google can draw on years of work in Web search and voice search, and sizeable investments in artificial intelligence. But its previous products and businesses don’t naturally collect speech like that of a person calling out to a device in the home, or data on the types of requests people make of home assistants.

Amazon is probably hoping that this contest turns out like the Web search market. Research has suggested that one reason Google’s dominance couldn’t be shaken by startups or well-funded competitors such as Microsoft was that Google had piles more data on what people search for and click on.

Early reviews of Google Home have generally said that it and Amazon’s products are broadly similar, each with its own strong points. And Google is presumably working hard to learn all it can from the data coming in from its new product. But it will take some time for that flow of information to rival what Amazon is getting.

Analysts estimated last November that over five million Echo devices had been sold since the Echo’s launch two years earlier, and Amazon said last month that Echo devices were the top seller over the holiday season. Alexa is also set to start appearing in products from other companies, such as speakers, cars, and fridges.
