Voice Recognition for the Internet of Things

With natural-language processing aided by crowdsourced data, Wit.ai aims to make smartphones, wearables, and drones heed your call.
October 24, 2014

It’s not unusual to find yourself talking to an uncoöperative appliance or gadget. Soon, though, it could be more common for those devices to actually pay attention.

A startup called Wit.ai plans to make it easy for hardware makers and software developers to add custom voice controls to everything from smartphones and smart watches to Internet-connected thermostats and drones.

While big companies like Apple and Google have their own voice recognition technology, smaller companies and independent developers don’t have the deep pockets required to create voice software that continuously learns from mountains of data.

Wit.ai, based in Palo Alto, California, is taking aim at the swiftly growing number of devices with small displays, or no screen at all, and at activities like driving and cooking, where you may want the aid of technology but don’t want to look at or touch a display.

And to give all kinds of developers access to a simple-to-use, always-learning natural-language service, the company is offering it free to those who agree to share their user data with the Wit.ai community. Collecting this data should help improve the accuracy of the system over time.

“Everyone will benefit from that,” cofounder and CEO Alex Lebrun says.

Lebrun has been thinking about how to make something like Wit.ai work for a while. He previously founded and led VirtuOz, a company that spent months building Siri-like voice-controlled software for clients like eBay and AT&T. (VirtuOz was bought by the speech recognition company Nuance in late 2012; these days it goes by the name Nina Web.)

With Wit.ai, developers type a handful of plain-English commands they want it to recognize, such as “Wake me up tomorrow at 6” or “Wake me up in 20 minutes,” and note what they want to accomplish through each command—in this case, set the alarm on a hypothetical voice-controlled smart watch. Wit.ai uses what it knows about language to figure out the different ways a command might be expressed. Then, when a user wants to set the alarm for a specific time, that person’s utterances are sent to a Wit.ai server, which analyzes the audio and sends structured data back to the gadget—here, the instruction to set the alarm for the proper date and time. A demo on the company’s site gives an idea of how this can work. Already, about 4,600 developers are using Wit.ai with things like mobile apps, robots, home automation, and wearable devices.
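The last step of that flow, where the gadget acts on the structured data it gets back, can be illustrated with a minimal Python sketch. This is not Wit.ai's actual API or response format: the field names ("intent", "entities", "datetime"), the handler table, and the example payload are all assumptions made up for illustration.

```python
from datetime import datetime

# Hypothetical structured data a gadget might receive after the server
# has analyzed the user's speech. The field names are assumptions for
# illustration, not Wit.ai's documented format.
EXAMPLE_RESPONSE = {
    "intent": "set_alarm",
    "entities": {"datetime": "2014-10-25T06:00:00"},
}


def set_alarm(entities):
    """Pretend to set the alarm on a hypothetical smart watch."""
    when = datetime.fromisoformat(entities["datetime"])
    return f"Alarm set for {when:%Y-%m-%d %H:%M}"


# Map each intent the developer defined to the code that carries it out.
HANDLERS = {"set_alarm": set_alarm}


def handle_response(response):
    """Dispatch a structured response to the matching handler."""
    handler = HANDLERS.get(response.get("intent"))
    if handler is None:
        return "Sorry, I did not understand that."
    return handler(response.get("entities", {}))


print(handle_response(EXAMPLE_RESPONSE))
```

In this sketch the device never interprets language itself; it only looks up the intent it was handed and runs the matching action, which is why a small gadget could rely on a remote service for the hard part.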

Nick Mostowich, a student at the University of Waterloo in Ontario, is one of them. At a hackathon last month at his school, Mostowich and his team used Wit.ai to add voice control to a toaster and microwave. Mostowich says they quickly put together a set of commands and targets that could be mapped to a list of recipes on a remote server, so a user could say something like “Cook me some bacon” and the microwave would turn itself on, set to the right power level and time.

Voice-powered bacon-nuking aside, there are still plenty of obstacles for Wit.ai to overcome. Like many similar systems that rely on the cloud, such as Siri, it’s not as quick to respond as it could be, and it can’t work if you don’t have an Internet connection. And while Lebrun says Wit.ai can also be used to varying extents in Spanish, French, German, Italian, and Swedish, it’s still far better in English.

Lebrun believes that as more data is added to the system, the non-English languages will improve. And he hopes to enable developers to use Wit.ai online to build and train voice interactions and then download them so they can be used on, say, a smartphone, without needing an Internet connection. Instead, the device could just occasionally check in with Wit.ai’s servers to update its learning.
