
Hackers Take the Kinect to New Levels

But the Holy Grail—controlling a computer without touching it—proves hard to achieve.
December 2, 2010

Soon after Microsoft released the Kinect gaming device, hackers found a way to pull raw data out of the system, radically expanding its potential uses. Enthusiasts have used the hardware to draw 3-D doodles in the air with hand movements, to play with virtual onscreen characters, and to let a robot recognize gestures and map its surroundings.

Hand waving: A software plug-in called DepthJS makes it possible to control a Web browser using the Kinect.

But one of the biggest goals of Kinect hackers—controlling a computer with gestures—is proving difficult to achieve.

Researchers at MIT’s Media Lab have created a new Chrome Web browser extension that lets users interact with any Web page via the Kinect if the device is plugged into a computer. Their project is one test case for the promise and limitations of hacking Microsoft’s gaming peripheral for nongaming uses.

The extension, called DepthJS, uses JavaScript to translate a small number of hand gestures into commands that can be executed by the browser. For example, a rapid arm movement to the left switches between open browser windows. Opening and closing a hand quickly acts as a mouse click.
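The kind of translation described above can be sketched as a simple lookup from recognized gestures to browser commands. The gesture names, command strings, and dispatch function below are illustrative assumptions, not DepthJS's actual API:

```typescript
// Hypothetical sketch of gesture-to-command mapping in the spirit of DepthJS.
// Names and structure are assumptions for illustration only.

type Gesture = "swipeLeft" | "swipeRight" | "handClose" | "handOpen";

// Map each recognized gesture to a browser-side action description.
const gestureCommands: Record<Gesture, () => string> = {
  swipeLeft: () => "switch to previous browser window",
  swipeRight: () => "switch to next browser window",
  handClose: () => "mouse down at cursor",
  handOpen: () => "mouse up (completes a click)",
};

// Dispatch a gesture event to its command; unrecognized gestures are ignored.
function dispatchGesture(g: string): string | null {
  const handler = gestureCommands[g as Gesture];
  return handler ? handler() : null;
}

console.log(dispatchGesture("swipeLeft")); // a rapid arm movement to the left
console.log(dispatchGesture("handClose")); // closing the hand begins a click
```

The value of this shape is that Web developers can extend the mapping with plain JavaScript, without ever touching the depth-sensing layer.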

The goal isn’t really to use the Kinect as a practical means of browsing the Web. Instead, DepthJS is meant to act as the interface between a variety of Web applications and the gestures captured by Kinect.

“Getting Kinect’s events into the Web browser is all about lowering the cost of entry to exploring and creating applications using depth information,” says Doug Fritz of the Fluid Interfaces group at MIT, who worked on the project. Computer users spend most of their time in the Web browser, Fritz notes. And most computer programmers (especially Web developers) know how to use JavaScript. This makes it an easy point of entry for Kinect programming.

One trouble is that unlike using a mouse, keyboard, or touch screen, there is no widely recognized (or naturally intuitive) vocabulary for gestural computing. Microsoft has developed a small number of gestures to let Kinect users navigate menus and browse media on the Xbox.

“Most of us hadn’t even used a Kinect with the Xbox before we started working, so we weren’t really burdened by the gesture language Microsoft has developed,” says Fritz. The team was inspired by the iPhone’s multitouch gestures and work by 3-D computing pioneer John Underkoffler. Surprisingly, some of the gestures created for DepthJS are similar to those Microsoft came up with. “Right now we are in that state of rapid change where people are remixing familiar interaction techniques with what feels natural,” Fritz says.

Limor Fried and Phillip Torrone from Adafruit Industries, a company that supplies equipment to hardware hackers, helped kick off the race to hack the Kinect by putting out a bounty of $3,000 for software that could connect the device to a regular computer.

Both are excited about the future of the Kinect as an off-the-shelf sensor for everything from high-end robotics to art projects. Developers have created a steady stream of videos of different applications using the Kinect. “These videos are really just proof-of-concepts that show some of the possibilities for further development,” says Fried.

One of the most popular videos is of a 3-D interactive puppet. “It’s fun, it’s intuitive, and it’s something that would be really hard to do without this inexpensive, off-the-shelf component. As you bring down the barriers, people have room to get creative.”

MIT’s Fritz is quick to note that three-dimensional, natural user interface computing using gestural recognition and depth sensors has been in play in the research community for years. The Kinect is a breakthrough device in terms of packaging and implementing these technologies for consumers. The more familiar users become with it, the more likely they are to translate it to spheres beyond gaming.

“The keyboard and the mouse aren’t going anywhere, but there is a lot of space for something more, and I think people are ready for that,” Fritz says.

But any effort to translate gestures to the screen inevitably bumps into the fact that we’re still three-dimensional beings trying to interact with a two-dimensional world. Most Kinect games solve this problem by matching us with an onscreen avatar who imitates our movements. Whether we’re dancing, playing volleyball, or whitewater rafting, the characters on the screen perform a stylized version of our movements offscreen.

One solution could be to use light projectors to create virtual objects in real space that we can interact with. Microsoft Research has already taken steps in this direction with Mobile Surface, a projector-based multitouch environment.
