Computer interface understands gestures and speech
Results: Researchers from MIT have developed a computer interface that enables a user to manipulate virtual shapes projected onto a screen using gestures, such as pointing, and spoken commands, such as “make a red cube in the middle of the screen.” Standing in front of cameras mounted above the screen, a user can create a virtual cube, rotate it, and change its color and size. In one experiment, the researchers found that their gesture recognition system had an error rate of 6 to 17 percent with some gestures, but a zero error rate when the gesture was coupled with a corresponding spoken command.
Why It Matters: Using gestures and speech to control computers can be easier and more natural than using a keyboard and mouse. Commercial gesture interfaces, such as those that TV meteorologists use to interact with digital maps during newscasts, respond to hand or head movements in two dimensions and require the user to be a fixed distance from the camera.
Other systems recognize full-body movements but typically require users to wear markers or special garments, which can be cumbersome. This system, designed by David Demirdjian and colleagues, recognizes head, torso, and arm movements in three dimensions. Users don't need to wear markers, and the system responds in real time. By combining gesture and voice inputs, the system follows commands more accurately than it could with either input alone.
Methods: The software runs on a PC connected to three cameras and a microphone array. The researchers asked 10 subjects to perform 50 gestures in front of the cameras. Half of this data was used to “train” the software to recognize specific gestures. The software works by first estimating the user’s body position from the camera images and then assembling sequences of poses to identify gestures. The researchers incorporated an existing speech recognition system into their setup and used the other half of the gesture data to test the overall accuracy of the system.
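The pipeline described above — estimate a pose per frame, group poses into a sequence, match the sequence to a known gesture, then cross-check against speech — can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the template-matching classifier, the toy one-angle "poses," and all function names are assumptions made for clarity.

```python
# Hypothetical sketch of the gesture pipeline: per-frame pose estimation,
# pose-sequence classification, and gesture/speech fusion. Not the
# researchers' actual implementation.

GESTURE_TEMPLATES = {
    # gesture name -> representative sequence of (arm_angle,) poses
    "point": [(10,), (45,), (80,)],
    "rotate": [(80,), (45,), (10,)],
}

def estimate_pose(frame):
    """Stand-in for stereo pose estimation; here a 'frame' is already
    reduced to a single arm-angle measurement."""
    return (frame,)

def classify_gesture(pose_sequence):
    """Pick the template whose poses are closest to the observed sequence."""
    def distance(template):
        return sum(abs(t[0] - p[0]) for t, p in zip(template, pose_sequence))
    return min(GESTURE_TEMPLATES, key=lambda g: distance(GESTURE_TEMPLATES[g]))

def fuse(gesture, spoken_command):
    """Accept the gesture only when the spoken command is consistent with
    it, mirroring the finding that gesture plus speech was more reliable
    than gesture alone."""
    return gesture if gesture in spoken_command else None

frames = [12, 44, 78]                        # toy "camera" input
poses = [estimate_pose(f) for f in frames]   # per-frame pose estimates
gesture = classify_gesture(poses)            # -> "point"
command = fuse(gesture, "point at the red cube")
```

In this toy run the noisy angle sequence still matches the "point" template, and because the spoken command agrees with the gesture, the fused result is accepted — a crude analogue of the zero-error result the researchers reported when gesture and speech were combined.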
Next Step: The researchers would like to improve their software so that it recognizes more-natural gestures and handles conversational interactions. They would also like their system to be able to recognize gestures from multiple users at the same time. – By Corie Lok
Source: Demirdjian, D., T. Ko, and T. Darrell. 2005. Untethered gesture acquisition and recognition for virtual world manipulation. Virtual Reality. In press.