
MIT Technology Review



Sound and Vision

In the future, members of Project Oxygen say, computing power will cost next to nothing. That means that computation-heavy technologies, such as vision systems and software that understands spoken requests, will be able to replace standard mouse-and-keyboard interfaces. “We have to extend the modality beyond pointing and clicking,” says Victor Zue, ScD ’76, codirector of the lab and, along with Anant Agarwal and Rodney Brooks, one of the leaders of Project Oxygen. Instead of being tethered to desktops and other stand-alone devices, people should be able to interact with computers easily and naturally, from a distance, through conversation or gesture.

As a first step, principal research scientist James Glass, SM ’85, PhD ’88, is creating language-processing systems that go beyond simple speech recognition and “track some sort of meaning, to understand the content and context of the conversation,” he says. His group created a system that allows someone to inquire over the phone about restaurants in the Boston area. The system analyzes each sentence using grammatical rules to figure out what information the caller needs, then searches a database of local restaurants: their locations, phone numbers, types of cuisine, and price ranges. Since this database is constantly changing, Glass says, it’s difficult for the program to learn every restaurant’s name. So instead, it assumes that unknown words are probably restaurant names and searches the database for likely matches. Then the system reprocesses the question and finds the phone number in a matter of seconds.
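The unknown-word strategy can be illustrated with a minimal Python sketch. It is not the group's implementation: the vocabulary, the restaurant names, the phone numbers, and the function names `guess_restaurant` and `phone_number` are all hypothetical, and fuzzy string matching from Python's standard `difflib` module stands in for whatever matching the real system performs on recognized speech.

```python
import difflib

# Hypothetical data: the vocabulary, restaurant names, and phone numbers
# below are invented for illustration and do not come from the article.
KNOWN_WORDS = {
    "what", "is", "the", "phone", "number", "for", "of", "a",
    "where", "address", "give", "me", "i", "need",
}

RESTAURANTS = {
    "blue heron grill": {"phone": "555-0101", "cuisine": "seafood"},
    "casa verde": {"phone": "555-0102", "cuisine": "mexican"},
    "golden lantern": {"phone": "555-0103", "cuisine": "chinese"},
}


def guess_restaurant(question, cutoff=0.6):
    """Treat out-of-vocabulary words as a restaurant name and look it up."""
    # Only leading and trailing punctuation is stripped; this toy tokenizer
    # does not handle punctuation in the middle of a sentence.
    words = question.lower().strip("?.! ").split()
    unknown = [w for w in words if w not in KNOWN_WORDS]
    if not unknown:
        return None
    candidate = " ".join(unknown)
    # Return the closest database entry, if any is similar enough.
    matches = difflib.get_close_matches(
        candidate, list(RESTAURANTS), n=1, cutoff=cutoff
    )
    return matches[0] if matches else None


def phone_number(question):
    """Answer a phone-number question once the restaurant is identified."""
    name = guess_restaurant(question)
    return RESTAURANTS[name]["phone"] if name else None
```

Because the lookup tolerates near misses, a slightly misheard name such as “Casa Verda” can still resolve to the database entry “casa verde,” which is the practical benefit of not requiring the program to know every restaurant name in advance.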

But speech is just one mode of communication. “One of the things about Oxygen is that it’s not trying to develop [stand-alone] technologies in networking, speech, and vision,” says Zue. “Increasingly, it’s the integration of these technologies.” Glass’s group and the vision group led by associate professor Trevor Darrell, SM ’90, PhD ’96, are collaborating on a system that combines speech and vision technologies. The system allows someone standing in front of a projected wall display to create and manipulate geometric shapes by gesturing and giving spoken commands such as “add a yellow pyramid here” or “resize this.” The system tracks the person’s movements through a stereo camera and captures his or her voice through a nearby microphone array. Although the prototype is fairly simple, Darrell imagines that future systems may be used in physical-therapy programs or video games.

In some cases, people won’t need to give commands because computers embedded in their offices will anticipate their needs. The groups headed by Shrobe and Darrell have developed prototype offices that can learn their occupants’ patterns of behavior. Stereo cameras first track how a subject uses the space. Once the system understands how people’s locations correspond to their needs, computers, lights, and even radios can react to their movements. “A normal computer is blind to whether I’m sitting in front of it, sitting on the couch, or off in the kitchen making coffee,” says Darrell. But a vision-enabled room could direct a cell-phone call to voice mail if it recognized that the recipient was sitting at a table with three other people and, therefore, likely having a meeting.
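The call-routing example amounts to a simple rule applied to what the cameras report. The Python sketch below is only an illustration under stated assumptions: the `Occupant` record, the zone label `"table"`, the `route_call` function, and the threshold of four seated people (the recipient plus three others) are hypothetical, and the prototypes described here learn such patterns rather than relying on a hand-written rule.

```python
from dataclasses import dataclass


@dataclass
class Occupant:
    name: str
    zone: str     # hypothetical zone label reported by the tracker
    seated: bool


def route_call(recipient, occupants, meeting_size=4):
    """Send a call to voice mail if the recipient appears to be in a meeting."""
    me = next((o for o in occupants if o.name == recipient), None)
    # If the recipient is not seen, or is not seated at the table, let it ring.
    if me is None or not me.seated or me.zone != "table":
        return "ring"
    # Count everyone seated at the table, including the recipient.
    at_table = [o for o in occupants if o.zone == "table" and o.seated]
    return "voicemail" if len(at_table) >= meeting_size else "ring"
```

With four people seated at the table, `route_call` returns `"voicemail"`; if the recipient is alone at a desk, it returns `"ring"`.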


Tagged: Computing
