Voice Recognition for the Internet of Things

With natural-language processing aided by crowdsourced data, a startup aims to make smartphones, wearables, and drones heed your call.
October 24, 2014

It’s not unusual to find yourself talking to an uncoöperative appliance or gadget. Soon, though, it could be more common for those devices to actually pay attention.

A startup plans to make it easy for hardware makers and software developers to add custom voice controls to everything from smartphones and smart watches to Internet-connected thermostats and drones.

While big companies like Apple and Google have their own voice recognition technology, smaller companies and independent developers don’t have the deep pockets required to create voice software that continuously learns from mountains of data. The startup, based in Palo Alto, California, is taking aim at the swiftly growing number of devices with small displays, or no screen at all, and at activities like driving and cooking, where you may want the aid of technology but don’t want to look at or touch a display.

And to give all kinds of developers access to a simple-to-use, always-learning natural-language service, the company is offering it free to those who agree to share their user data with the community. Collecting this data should help improve the accuracy of the system over time.

“Everyone will benefit from that,” cofounder and CEO Alex Lebrun says.

Lebrun has been thinking about how to make something like this work for a while. He previously founded and led VirtuOz, a company that spent months building Siri-like voice-controlled software for clients like eBay and AT&T. (VirtuOz was bought by the speech recognition company Nuance in late 2012; these days it goes by the name Nina Web.)

With the service, developers type a handful of plain-English commands they want it to recognize, such as “Wake me up tomorrow at 6” or “Wake me up in 20 minutes,” and note what they want to accomplish through each command; in this case, setting the alarm on a hypothetical voice-controlled smart watch. The service uses what it knows about language to figure out the different ways a command might be expressed. Then, when a user wants to set the alarm for a specific time, that person’s utterances are sent to a server, which analyzes the audio and sends structured data back to the gadget: here, the instruction to set the alarm for the proper date and time. A demo on the company’s site gives an idea of how this can work. Already, about 4,600 developers are using the service with things like mobile apps, robots, home automation, and wearable devices.
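To make the last step concrete, here is a minimal Python sketch of how a gadget might act on the structured data the server sends back. The article does not describe the actual response format, so the JSON shape, the field names (`intent`, `entities`, `datetime`), and the helpers `set_alarm` and `handle_reply` are all assumptions made for illustration.

```python
import json
from datetime import datetime

# Hypothetical structured reply for "Wake me up tomorrow at 6".
# The field names ("intent", "entities", "datetime") are assumptions made
# for illustration, not the service's documented response format.
SAMPLE_REPLY = (
    '{"intent": "set_alarm", '
    '"entities": {"datetime": "2014-10-25T06:00:00"}}'
)


def set_alarm(entities):
    """Pretend to set the watch alarm and return a confirmation string."""
    when = datetime.strptime(entities["datetime"], "%Y-%m-%dT%H:%M:%S")
    return "Alarm set for " + when.strftime("%Y-%m-%d %H:%M")


# Map each intent name to the device function that carries it out.
HANDLERS = {"set_alarm": set_alarm}


def handle_reply(raw_reply):
    """Parse the server's reply and dispatch to the matching handler."""
    reply = json.loads(raw_reply)
    handler = HANDLERS.get(reply.get("intent"))
    if handler is None:
        return "Sorry, I did not understand that."
    return handler(reply.get("entities", {}))


if __name__ == "__main__":
    print(handle_reply(SAMPLE_REPLY))
```

The point of the design is that the gadget never has to parse language itself. It only needs a small table from intent names to actions, while the server handles the many ways a person might phrase the same request.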

Nick Mostowich, a student at the University of Waterloo in Ontario, is one of them. At a hackathon last month at his school, Mostowich and his team used the service to add voice control to a toaster and microwave. Mostowich says they quickly put together a set of commands and targets that could be mapped to a list of recipes on a remote server, so a user could say something like “Cook me some bacon” and the microwave would turn itself on, set to the right power level and time.
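A mapping like the one Mostowich describes could be as simple as a lookup table keyed by the recognized target. The Python sketch below is only an illustration: the table `RECIPES`, the function `plan_cooking`, and the power and time values are hypothetical placeholders, not details of the team’s project.

```python
# Hypothetical recipe table standing in for the list on the remote server.
# The power levels and times are placeholder values, not real cooking advice.
RECIPES = {
    "bacon": {"power_percent": 100, "seconds": 180},
    "popcorn": {"power_percent": 80, "seconds": 150},
}


def plan_cooking(target):
    """Return the microwave settings for a recognized target, or None."""
    key = target.strip().lower()
    recipe = RECIPES.get(key)
    if recipe is None:
        return None
    # Copy the recipe so callers cannot modify the shared table.
    return {"target": key, **recipe}


if __name__ == "__main__":
    print(plan_cooking("Bacon"))
```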

Voice-powered bacon-nuking aside, there are still plenty of obstacles for the company to overcome. Like many similar systems that rely on the cloud, such as Siri, its service is not as quick to respond as it could be, and it can’t work if you don’t have an Internet connection. And while Lebrun says the service can also be used to varying extents in Spanish, French, German, Italian, and Swedish, it’s still far better in English.

Lebrun believes that as more data is added to the system, the non-English languages will improve. And he hopes to enable developers to use the service online to build and train voice interactions and then download the result so it can be used on, say, a smartphone, without needing an Internet connection. Instead, it could just occasionally check in with the company’s servers to update its learning.
