Artificial intelligence

What’s Next for AI Home Assistants

Phone calls, wider integration, and even screens could make our new domestic butlers more useful.
February 16, 2017

The world has already fallen in love with AI home assistants like Amazon’s Alexa and Google’s Assistant. Now what?

Anyone who’s used a device like Amazon’s Echo or Google’s Home smart speakers—the physical embodiments of the Alexa and Assistant software—knows that the experience is compelling. Asking for a specific song over dinner, switching off a smart bulb on the way to bed, or setting a timer while cooking all make life just that little bit more pleasant. Perhaps it’s no surprise, then, that Morgan Stanley estimates that Amazon has sold 11 million Alexa devices.

But anyone who’s spent months living with an AI voice assistant will also know that these devices have their limitations. So the tech giants are preparing to add extra features in an attempt to make them more useful.

The Wall Street Journal reports that with the landline phone dead in a ditch, both Amazon and Google think that their smart assistants could take its place. Both companies are reported to be working on voice-calling features for their devices, with the intention of tapping your digital contacts to make Skype-like calls an entirely hands-free affair. Currently, though, the companies are grappling with challenges including privacy concerns (since the whole point of these devices is that they’re always listening) and the difficulty of turning what will inevitably be a speakerphone conversation into an enjoyable experience.

Other software features will continue to appear from third parties, too. Amazon was quick to release a development kit that lets outside developers build on Alexa, and it’s working: the number of apps (or Skills, as Amazon calls them) available for the company’s Echo smart speaker has risen from 950 last May to over 8,000 today. Google followed suit with its own development kit in December.
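For a sense of what building one of these Skills involves: at its simplest, a Skill is a small web service (commonly an AWS Lambda function) that receives JSON from the Alexa Skills Kit describing what the user said and returns JSON containing the reply Alexa should speak. The sketch below is a minimal, hypothetical handler, not code from Amazon’s kit itself; the “HelloWorldIntent” name is invented for illustration.

```python
# Minimal sketch of an Alexa Skill backend, written as an AWS Lambda handler.
# The Alexa Skills Kit sends a JSON "event" describing the user's request;
# the handler returns a JSON response envelope that Alexa reads aloud.
# "HelloWorldIntent" is a hypothetical intent name used for illustration.

def lambda_handler(event, context):
    request = event["request"]

    if request["type"] == "LaunchRequest":
        # The user opened the skill without asking for anything specific.
        speech = "Welcome. Try asking me to say hello."
        end_session = False
    elif (request["type"] == "IntentRequest"
          and request["intent"]["name"] == "HelloWorldIntent"):
        speech = "Hello from a custom Skill."
        end_session = True
    else:
        # Covers session-ended and any unrecognized request types.
        speech = "Goodbye."
        end_session = True

    # Response format defined by the Alexa Skills Kit JSON interface.
    return {
        "version": "1.0",
        "response": {
            "outputSpeech": {"type": "PlainText", "text": speech},
            "shouldEndSession": end_session,
        },
    }
```

Roughly speaking, a developer pairs a handler like this with an interaction model in Amazon’s developer console, which maps the phrases a user might say to intent names like the one above.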

It’s still early days for developers, though, so we can expect more useful services, from grocery ordering to more comprehensive integration with smart home hardware, to dribble out over the coming months. There’s also scope for third parties to create systems that use multiple assistants at once: Sonos, for instance, which makes wireless audio systems, is working closely with both Amazon and Google and intends to integrate both Alexa and Assistant into its products.

There is, though, a nagging issue: while apps for Alexa in particular may be proliferating, it seems that people aren’t really using them. That’s because keeping users aware of software is tricky: push notifications on audio-only devices are obtrusive, and without any other cue it’s easy to forget that an “app for that” exists.

That’s one reason why companies are seriously considering adding screens to the next generation of home assistants. Speaking to MIT Technology Review, Andrew Ng, chief scientist at Baidu, also points out that, while a 2016 study by Stanford researchers and his own team showed that speech input is three times quicker than typing on mobile devices, “the fastest way for a machine to get information to you is via a screen.”

“Say you want to order takeout,” he said. “Imagine a voice that reads out: ‘Here are the top twenty restaurants in your area. Number one …’ This would be insanely slow!”

In fact, Baidu has already developed its own smart assistant device with a screen, called Little Fish. Ask it a question and, instead of reading out the answer, it simply shows you the results, using built-in cameras to make sure its screen is always turned toward you. Amazon, too, has been rumored to be developing a future iteration of its Echo device that includes a screen. The AI assistant revolution, it seems, may be televised.

(Read more: Wall Street Journal, Verge, “In 2016, AI Home Assistants Won Our Hearts,” “Google’s New Home Helper Flexes Powerful AI Muscles,” “AI Voice Assistant Apps are Proliferating, but People Don’t Use Them”)

