The Kinect projects a pattern of laser dots into a scene and uses an infrared camera to look for distortions in that pattern, a technique called structured light depth sensing. This generates a “point cloud” of measured distances from the camera that the Kinect uses to perceive and identify objects and gestures in real time.
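To make the idea of a point cloud concrete, here is a minimal sketch of how a grid of depth readings could be turned into 3-D points. It assumes a simple pinhole camera model; the function name `depth_to_point_cloud`, the intrinsic parameters `fx`, `fy`, `cx`, `cy`, and the toy depth values are all illustrative assumptions, not the Kinect's actual calibration or Microsoft's code.

```python
def depth_to_point_cloud(depth, fx, fy, cx, cy):
    """Back-project a depth image into a list of (x, y, z) points.

    depth  : list of rows, each a list of distances along the camera axis
    fx, fy : focal lengths in pixels (hypothetical values in this sketch)
    cx, cy : principal point in pixels (hypothetical values in this sketch)
    """
    points = []
    for v, row in enumerate(depth):        # v = pixel row
        for u, z in enumerate(row):        # u = pixel column
            if z <= 0:                     # treat non-positive depth as "no reading"
                continue
            x = (u - cx) * z / fx          # pinhole model: sideways offset grows with depth
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


# Toy 2x2 depth image in metres; the intrinsics are made up for illustration.
depth = [[1.0, 0.0],
         [2.0, 2.0]]
cloud = depth_to_point_cloud(depth, fx=1.0, fy=1.0, cx=0.5, cy=0.5)
```

Each pixel with a valid reading becomes one point, so a full depth frame yields a dense cloud that later steps can match against earlier frames.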
A KinectFusion user waves a Kinect around a scene or object. An algorithm called iterative closest point (ICP) is used to merge data from the snapshots being taken at 30 frames per second into an ever-more-detailed 3-D representation. ICP is also used to track the position and orientation of the camera by comparing new frame data with previous frames and the composite merged representation. The team describes the use of a standard computer graphics processing unit for both camera tracking and image generation as a major innovation.
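The matching step at the heart of ICP can be sketched in a few lines. The example below is a deliberately simplified 2-D, point-to-point version with brute-force nearest-neighbor search, written under those stated assumptions; it is not the KinectFusion team's GPU implementation, and the names `icp_2d`, `best_rigid_2d`, and the sample points are hypothetical.

```python
import math


def nearest(p, points):
    """Return the point in `points` closest to p (brute-force search)."""
    return min(points, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)


def best_rigid_2d(src, dst):
    """Least-squares rotation and translation mapping paired src points onto dst."""
    n = len(src)
    scx = sum(p[0] for p in src) / n
    scy = sum(p[1] for p in src) / n
    dcx = sum(q[0] for q in dst) / n
    dcy = sum(q[1] for q in dst) / n
    s_dot = s_cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - scx, py - scy        # centred source point
        bx, by = qx - dcx, qy - dcy        # centred matched point
        s_dot += ax * bx + ay * by
        s_cross += ax * by - ay * bx
    theta = math.atan2(s_cross, s_dot)     # optimal rotation angle in 2-D
    c, s = math.cos(theta), math.sin(theta)
    tx = dcx - (c * scx - s * scy)         # translation aligning the centroids
    ty = dcy - (s * scx + c * scy)
    return theta, tx, ty


def transform(points, theta, tx, ty):
    """Rotate each point by theta, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]


def icp_2d(source, target, iterations=10):
    """Repeatedly match each source point to its closest target point and realign."""
    current = list(source)
    for _ in range(iterations):
        matches = [nearest(p, target) for p in current]
        theta, tx, ty = best_rigid_2d(current, matches)
        current = transform(current, theta, tx, ty)
    return current


# A small asymmetric shape, and a copy of it seen from a slightly moved "camera".
target = [(0.0, 0.0), (2.0, 0.0), (2.0, 1.0), (0.0, 3.0), (-1.0, 1.0)]
source = transform(target, -0.1, -0.1, 0.05)
aligned = icp_2d(source, target)
max_error = max(math.hypot(ax - bx, ay - by)
                for (ax, ay), (bx, by) in zip(aligned, target))
```

Here the new "frame" differs from the reference only by a small rotation and shift, so the closest-point matches are correct and the alignment converges almost immediately. The rotation and translation recovered at each step are also what a tracker would accumulate to follow the camera's position and orientation.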
While KinectFusion is generating buzz, it remains an ongoing research project. Microsoft has not disclosed plans to release any products using the technology, or versions of the software that powers the system.
“It’s just stunning,” says Christian Holz, of the Hasso Plattner Institute at the University of Potsdam, in Germany, who previously worked on a project that used Kinect at Microsoft Research in Redmond, Washington. “It’s going to make 3-D creation available to a much wider range of people. The fact that it can not only model the real-world environment in mind-blowing fidelity, but also use the model to simulate realistic physics on top of that, opens up the possibility of a vast number of applications.”