Gestures that Your TV Will Understand
The company behind Microsoft’s Kinect controller wants to kill the remote.
Thanks to Microsoft’s Kinect, millions are casting aside their controllers and using their bodies to play games. Now the company that created the motion-tracking hardware for the Kinect wants to make waving your arms an accepted way to control everything from your TV to your desktop computer.
PrimeSense, based in Tel Aviv, Israel, makes a package that combines one or two conventional cameras, an infrared depth sensor, and specialized computer chips. Together they collect and interpret a person’s movements in 3-D. The movements are calculated by projecting a grid of infrared light spots into a room, tracking how light bounces back, and correlating this with information from the stereo cameras. Certain motions can be translated into computer commands or, in the case of Kinect, used to control an on-screen avatar.
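To make the depth step concrete, here is a minimal sketch of textbook triangulation, the general principle behind projected-pattern depth sensors. It is not PrimeSense's actual algorithm, which the article does not describe. The function name, parameter names, and numbers are all illustrative assumptions: a projected spot that appears shifted sideways in the camera image by some number of pixels (the disparity) lies at a depth inversely proportional to that shift.

```python
def depth_from_disparity(focal_length_px: float,
                         baseline_m: float,
                         disparity_px: float) -> float:
    """Estimate depth in metres for one projected spot.

    Textbook triangulation: depth = focal_length * baseline / disparity.
    All three inputs are hypothetical calibration or measurement values,
    not figures from PrimeSense or the Kinect.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive to triangulate a depth")
    return focal_length_px * baseline_m / disparity_px


# Illustrative values only: a larger shift means the surface is closer.
near = depth_from_disparity(focal_length_px=600.0, baseline_m=0.1, disparity_px=60.0)
far = depth_from_disparity(focal_length_px=600.0, baseline_m=0.1, disparity_px=30.0)
```

With these made-up numbers, halving the disparity doubles the estimated depth, which is why distant objects are harder to resolve than nearby ones.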
While Microsoft focuses on gaming, PrimeSense is trying to establish other uses, for example TV control. In collaboration with PC manufacturer Asus, PrimeSense has developed a device called the WAVI Xtion. It looks a lot like the Kinect controller, but connects via a PC to the TV and lets the viewer use gestures to control what appears on the screen.
The WAVI Xtion camera is positioned next to the TV, while the control box connects to the computer. A user waves a palm in front of the TV to call up a simple menu that lets him choose between watching shows, playing games, or looking at photos. The user points to one of these options with his palm, which is tracked by the cameras and infrared sensor. To choose an option, the user holds a palm over a particular video, or he can flip through options by waving to the right or left. While a clip is playing, he can wave a palm at the screen to call up the controls to rewind the video or turn up the volume.
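The two interactions described above, holding a palm over an item to choose it and waving left or right to flip through options, could be sketched roughly as follows. This is an illustration only, not the WAVI Xtion software: the class, the function, and every threshold (a one-second hold, a 0.3-metre sweep within half a second) are assumptions made for the example, and the palm positions are assumed to come from some upstream tracker.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple


@dataclass
class DwellSelector:
    """Select a menu item once the palm has hovered over it long enough."""
    dwell_seconds: float = 1.0        # assumed hold time, not from the article
    _current: Optional[int] = None    # item currently under the palm
    _since: float = 0.0               # time the palm arrived over that item

    def update(self, item: Optional[int], now: float) -> Optional[int]:
        """Feed the item under the palm (or None); return it when selected."""
        if item != self._current:
            # Palm moved to another item or off the menu: restart the timer.
            self._current = item
            self._since = now
            return None
        if item is not None and now - self._since >= self.dwell_seconds:
            # Restart the timer so the selection does not fire again
            # on the very next frame.
            self._since = now
            return item
        return None


def detect_swipe(samples: Sequence[Tuple[float, float]],
                 min_distance: float = 0.3,
                 max_duration: float = 0.5) -> Optional[str]:
    """Classify recent palm motion as a 'left' or 'right' wave, or None.

    samples holds (timestamp_seconds, palm_x_metres) pairs, oldest first.
    """
    if len(samples) < 2:
        return None
    t_end, x_end = samples[-1]
    # Consider only the samples inside the most recent time window.
    recent = [(t, x) for t, x in samples if t_end - t <= max_duration]
    dx = x_end - recent[0][1]
    if dx >= min_distance:
        return "right"
    if dx <= -min_distance:
        return "left"
    return None
```

A caller would feed `update` once per tracked frame with whichever menu item lies under the palm, and pass `detect_swipe` a short history of palm positions. A hold-to-select timer is one common way to stand in for a button press when the hand has nothing to click.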
Adi Berenson, PrimeSense’s vice president of business development, says the hands-free approach eliminates a major sticking point with efforts to bring the Internet to televisions. “We believe that the industry is trying to force-fit the PC into the living room, and it won’t work,” he says. “It’s a more relaxed environment that needs a more natural way to interact.” Google TV—the search giant’s Internet TV effort—relies on a full QWERTY keyboard, a feature that many think is too unwieldy to be practical.
Asus and PrimeSense are also interested in adding gesture control to conventional PCs. Within weeks of the release of the Kinect controller, hobbyists had figured out a way to access it, leading to an explosion of new ideas about how gesture control could be used—everything from robots to air guitar. “We didn’t expect that to happen so fast,” says Berenson. “It is a validation of how many good ideas developers have, and we want to help them bring them to users.”
PrimeSense has accelerated the rollout of a software tool kit to aid experimentation with the controller and plans to offer a $200 hardware kit for developers.
“Not having to bring a controller is great for situations with multiple changing participants,” says Doug Fritz, part of a team of Kinect hackers at the MIT Media Lab that developed software for controlling the Chrome Web browser.
Stepping in front of the camera and taking control with gestures could make group work easier than having to take turns at a keyboard or mouse, Fritz says. However, he adds, much more would be possible if the device could track hand shapes. “The current technology is good for body gestures, not fine-grain control,” he explains. That makes things like text input a particular challenge.
Doug Bowman, a professor at Virginia Polytechnic Institute and State University who has developed 3-D gestural interfaces based on more expensive tracking technology, agrees. One of his students, Tao Ni, has developed a system that allows a user to navigate menus on a TV using simple movements such as pinching fingers together. “These freehand gestures could replace a remote or keyboard altogether in, say, an entertainment scenario,” says Bowman. However, Ni’s prototype requires a special glove to capture the precise orientation of a person’s hand and their finger movements. “The Kinect is not capable of that yet,” says Bowman. “But perhaps in the future it will be.”
Berenson says that improving the resolution of PrimeSense’s tracking is one area of active research. “We are thinking about tracking resolution and trying to follow fingers,” he says. Another future direction would have the system interpret subtle body language. “We want to make it less explicit and more implicit,” he says. “For example, it should be possible to have the volume go down on your TV when you pick up a newspaper and start reading it.”