
One day your voice will control all your gadgets, and they will control you

January 11, 2019

Everything you own in the future will be controlled by your voice. That’s what this year’s CES, the world’s largest annual gadget bonanza, has made abundantly clear.

Google and Amazon have been in fierce competition to put their assistants into your TV, your car, and even your bathroom. It all came to a head this week in Las Vegas, where the full line-up of voice-enabled products underscored the scope of each company’s ambitions.

Maybe it seems like a wasteful side effect of capitalism that you can now ask Alexa to lift your toilet cover (or maybe not—you do you), but there’s more to the ubiquity of voice interfaces than a never-ending series of hardware companies jumping on the bandwagon.

It’s tied to an idea that leading AI expert Kai-Fu Lee calls OMO, online-merge-offline. OMO, as he describes it, refers to blending our digital and physical worlds so that every object in our environment becomes both an interaction point for the internet and a sensor that collects data about our lives. This will power what he dubs the “third wave” of AI: our algorithms, finally given a comprehensive view of all our behaviors, will be able to hyper-personalize our experiences, whether in the grocery store or the classroom.

But this vision requires everything to be connected. It requires your shopping cart to know what’s in your fridge so it can recommend the optimal shopping list. It requires your front door to know your online purchases and whether you’re waiting for an in-home delivery. That’s where voice interfaces come in: installing Alexa into your fridge, your door, and all your other disparate possessions neatly ties them to one software ecosystem. It’s quite the clever scheme: by selling you the powerful and seamless convenience of voice assistants, Google and Amazon have slowly inched their way into being the central platform for all your data and the core engine for algorithmically streamlining your life.

Whether or not you trust either company with that much control, such a grand undertaking will be limited by what voice assistants can understand. And compared with other subfields of AI, progress in natural-language processing and generation has lagged behind.

But that could be about to change. Last year, several research teams used new machine-learning techniques to make impressive breakthroughs in language comprehension. In June, for example, the research nonprofit OpenAI developed an unsupervised learning technique that trains systems on raw, unstructured text rather than on cleaned and labeled data. Because the approach sidesteps expensive annotation, it dramatically lowered the cost of acquiring training data, which in turn boosted the system’s performance. A few months later, Google released an even better unsupervised algorithm that performs as well as humans at picking the correct multiple-choice ending for a sentence.
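To make the underlying idea concrete, here is a minimal sketch of self-supervised pretraining, the family of techniques both labs drew on: a model learns language by predicting each next word in raw, unlabeled text. Everything here (the tiny model, the toy “corpus”) is illustrative, not either company’s actual system.

```python
# A toy sketch of self-supervised language-model pretraining (hypothetical
# code, not OpenAI's or Google's): the raw text supplies its own training
# signal, because each word is the prediction target for the words before it.
import torch
import torch.nn as nn

# Any unlabeled text works as training data -- no human annotation needed.
raw_text = "the cat sat on the mat and the dog sat on the rug".split()
vocab = {word: i for i, word in enumerate(sorted(set(raw_text)))}
ids = torch.tensor([vocab[w] for w in raw_text])

class TinyLM(nn.Module):
    """A deliberately small next-word predictor."""
    def __init__(self, vocab_size: int, dim: int = 32):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab_size)

    def forward(self, x):
        hidden, _ = self.rnn(self.embed(x))
        return self.head(hidden)  # one next-word distribution per position

model = TinyLM(len(vocab))
optimizer = torch.optim.Adam(model.parameters(), lr=1e-2)
loss_fn = nn.CrossEntropyLoss()

# Inputs are words 0..n-2; targets are words 1..n-1. The "labels" are just
# the text shifted by one -- this is what makes the setup unsupervised.
x, y = ids[:-1].unsqueeze(0), ids[1:].unsqueeze(0)
for step in range(200):
    optimizer.zero_grad()
    logits = model(x)
    loss = loss_fn(logits.reshape(-1, len(vocab)), y.reshape(-1))
    loss.backward()
    optimizer.step()
```

The key design point is that the prediction targets come for free from the text itself, which is why this style of training scales to huge, uncurated corpora.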

All these advancements are getting us closer to a day when machines that really understand what we mean could render physical and visual interfaces obsolete—and usher in the full potential of an OMO world. For better or worse.

This originally appeared in our AI newsletter The Algorithm. To have it directly delivered to your inbox, subscribe here for free.
