MIT Technology Review
Gesture Recognizer
Computer interface understands gestures and speech

Results: Researchers from MIT have developed a computer interface that enables a user to manipulate virtual shapes projected onto a screen using gestures, such as pointing, and spoken commands, such as “make a red cube in the middle of the screen.” Standing in front of cameras mounted above the screen, a user can create a virtual cube, rotate it, and change its color and size. In one experiment, the researchers found that their gesture recognition system had an error rate of 6 to 17 percent with some gestures, but a zero error rate when the gesture was coupled with a corresponding spoken command.

Why It Matters: Using gestures and speech to control computers can be easier and more natural than using a keyboard and mouse. Commercial gesture interfaces, such as those that TV meteorologists use to interact with digital maps during newscasts, respond to hand or head movements in two dimensions and require the user to be a fixed distance from the camera.

Other systems recognize full-body movements but typically require users to wear markers or special garments, which can be cumbersome. This system, designed by David Demirdjian and colleagues, recognizes head, torso, and arm movements in three dimensions. Users don't need to wear markers, and the system responds in real time. By combining gesture and voice inputs, the system follows commands more accurately.

Methods: The software runs on a PC connected to three cameras and a microphone array. The researchers asked 10 subjects to perform 50 gestures in front of the cameras. Half of this data was used to “train” the software to recognize specific gestures. The software works by first estimating the user’s body pose from the camera images and then stringing sequences of poses together to identify gestures. The researchers incorporated an existing speech recognition system into their setup and used the other half of the gesture data to test the overall accuracy of the system.
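The pipeline described above, classifying a sequence of estimated poses as a gesture and then disambiguating it with a recognized spoken command, can be sketched roughly as follows. This is an illustrative toy, not the researchers' actual system: the pose labels, gesture vocabulary, and fusion rule here are all invented for clarity, whereas the real software works from stereo camera images with a trained classifier.

```python
# Hypothetical sketch of the Methods pipeline: classify a pose sequence
# as a gesture, then fuse that guess with a recognized spoken command.
# All names and templates below are illustrative assumptions.

# Toy "gesture vocabulary": map pose sequences to gesture labels.
GESTURE_TEMPLATES = {
    ("arm_raised", "arm_extended"): "point",
    ("arm_raised", "arm_circling"): "rotate",
}

def classify_gesture(pose_sequence):
    """Return the gesture whose template matches the pose sequence."""
    return GESTURE_TEMPLATES.get(tuple(pose_sequence), "unknown")

def fuse(gesture, speech_command):
    """Prefer an interpretation supported by both modalities.

    If the spoken command names a known action, it resolves an
    ambiguous or misclassified gesture -- mirroring the reported drop
    from a 6-17 percent error rate to zero when gestures were coupled
    with corresponding spoken commands.
    """
    speech_action = speech_command.split()[0] if speech_command else None
    if speech_action in GESTURE_TEMPLATES.values():
        return speech_action  # speech disambiguates the gesture
    return gesture            # fall back to vision alone

# Vision and speech agree, so the command is unambiguous:
print(fuse(classify_gesture(["arm_raised", "arm_extended"]), "point at the cube"))
```

The fusion step is where the accuracy gain comes from: even when the vision-only classifier returns "unknown", a compatible spoken command still yields the right action.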

Next Step: The researchers would like to improve their software so that it recognizes more-natural gestures and handles conversational interactions. They would also like their system to be able to recognize gestures from multiple users at the same time. – By Corie Lok

Source: Demirdjian, D., T. Ko, and T. Darrell. 2005. Untethered gesture acquisition and recognition for virtual world manipulation. Virtual Reality. In press.

Tagged: Computing
