MIT Technology Review

In an effort to make speech the dominant way that people control technology, AT&T is opening up its speech-recognition technology for others to use. Starting in June, software engineers can tap into a cloud service offered by the company to make any device that can connect to the Internet respond to its master’s voice.

AT&T believes the technology could ultimately be used for everything from smart-phone apps and online games to cars and appliances. While the initial offering will only convert speech into text and corresponding commands, the company is considering broader offerings later, including services that translate English text into six other languages and vice versa, and that synthesize translated speech.

“We believe there are a lot of smart people out there who can create applications and services we have never dreamt of before,” says Mazin Gilbert, vice president for intelligent systems research at AT&T Labs in Florham Park, New Jersey.

To use the technology, developers write code into their software to take advantage of an API (application programming interface) specified by AT&T. That code causes an application to send speech to AT&T over the Internet, where it is converted to text and returned to the device. The new APIs were announced last week.

AT&T claims the technology is 95 percent accurate in taking English speech and rendering it as text. It says its accuracy at converting the meaning of English text to and from other languages ranges from 70 percent to 80 percent.

The underlying speech technology now being offered by AT&T is already used in many of its own applications, including the AT&T translator app for Android and iOS phones, and mobile voice directory search provided by Yellow Pages. “I want to be able to have a million apps riding on our platform, not hundreds, as we have today,” Gilbert says. “Whatever your wild idea is—we want to provide those APIs. I’ll be honest: I don’t know what people are going to use it for.”

The AT&T technology builds on decades of innovation at Bell Labs prior to the breakup of AT&T and the subsequent establishment of AT&T’s own service-centric labs. However, the company must compete with more established providers of speech-recognition technology, especially in the realm of smart phones.

For example, Nuance provides speech-recognition capabilities to many companies including, reportedly, Apple for its Siri personal assistant. Google’s speech-recognition technology is available throughout its Android smart-phone operating system and to any app written for an Android device. Microsoft also has speech-recognition technology, which appears in its Windows Phone operating system and in products from partners such as Ford, with its Sync system for in-car entertainment.

Credit: AT&T Labs

Tagged: Computing, Apple, AT&T, Nuance
