Depth-Sensing Cameras Head to Mobile Devices

Adding 3-D sensors to existing and future mobile devices will enable augmented-reality games, handheld 3-D scanning, and better photography.

Tom Simonite

October 1, 2013

Just over a decade after cameras first appeared in cell phones, they remain one of the most used features of mobile devices, underpinning wildly popular and valuable companies such as Instagram and Snapchat. Now hardware that gives handheld computers 3-D vision may open up a new dimension to imaging apps, and enable new ways of using these devices. Early mobile apps that can scan the world in 3-D show potential for new forms of gaming, commerce, and photography.

**Copy and paste:** A tablet can be used to make 3-D scans of an object using the Structure Sensor and other new hardware coming to mobile devices.

The first mobile depth-sensing technology to hit the market is likely to be the Structure Sensor, an accessory for Apple’s iPad that gives the device capabilities similar to those of Microsoft’s Kinect gaming controller. Occipital, the San Francisco company behind the device, says it will start shipping its product in February 2014. A Kickstarter campaign for the device has raised almost $750,000, with more than a month to run.

Occipital has developed apps that allow people to scan objects in 3-D by walking around them, and to scan entire rooms. One shows how the sensor can enable augmented reality, where virtual imagery is overlaid onto the real world when seen through a viewfinder. In that app, a person plays fetch with a virtual cat by throwing a virtual ball that bounces realistically off real-world objects (see video).

Jeff Powers, Occipital’s CEO and cofounder, says he is currently focused on giving software developers the tools to come up with compelling apps that use 3-D sensing. Among people interested in buying the Structure Sensor, there is strong interest in gaming, and in using it to scan real-world objects to help copy or design objects to be 3-D printed. “We’re also getting a lot of people asking about using this for measurements of space, to replace the way people in the construction industry are doing that today,” says Powers.

The Structure Sensor works by projecting a dappled pattern of infrared light out onto the world so that its infrared camera can observe how that pattern is distorted by the objects it falls on. That information is used to reconstruct objects in 3-D. That process relies on a chip Occipital buys from PrimeSense, based in Israel, which makes hardware underpinning Microsoft’s Kinect and has its own effort to bring depth-sensing to mobile devices. This January, PrimeSense demonstrated a 3-D sensor called the Capri small enough for laptops or tablets (see “PC Makers Bet on Gaze, Gesture, Voice, and Touch”); this month mobile chipmaker Qualcomm used it to demonstrate augmented reality gaming on an Android tablet.
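The geometry behind that kind of sensing can be sketched with simple triangulation. The snippet below is a simplified illustration, not Occipital's or PrimeSense's actual method: it assumes an idealized projector and camera mounted side by side, and the function name, focal length, and baseline are hypothetical values chosen only for the example. The farther a projected dot appears to shift sideways in the camera image, the closer the surface it landed on.

```python
def depth_from_disparity(disparity_px, focal_length_px, baseline_m):
    """Triangulate depth in meters from the sideways shift of a projected dot.

    disparity_px: how far the dot appears shifted in the camera image (pixels)
    focal_length_px: camera focal length expressed in pixels (assumed value)
    baseline_m: distance between projector and camera in meters (assumed value)
    """
    if disparity_px <= 0:
        return None  # no measurable shift: dot not found, or surface too far away
    return focal_length_px * baseline_m / disparity_px


# Hypothetical sensor parameters, for illustration only.
FOCAL_LENGTH_PX = 580.0
BASELINE_M = 0.075

for shift in (87.0, 43.5, 21.75):
    depth = depth_from_disparity(shift, FOCAL_LENGTH_PX, BASELINE_M)
    print(f"shift of {shift} px -> about {depth:.2f} m away")
```

Halving the observed shift doubles the estimated distance, which is why sensors of this type tend to lose precision as objects get farther away.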

That game uses the Capri sensor to capture the 3-D shape of a table and any objects on it, and then builds a virtual world of warring orcs, rocky outcrops, and towers on top of them. “Game characters can navigate around, collide into, and jump over physical objects,” says Jay Wright, vice president of product management at Qualcomm. A person playing the app can tap on the tablet’s screen to interact with that world, and walk around the real world to get a different perspective on the virtual one (see video).

The app was enabled by adding support for 3-D sensing to Qualcomm’s Vuforia software that helps mobile software developers build augmented reality apps, a feature the company calls Smart Terrain. But Wright predicts that consumers will eventually use depth-capable apps for more than games. A person could use an app powered by Smart Terrain to see a virtual preview of a new piece of furniture in his living room at true scale, for example.

Qualcomm will make Smart Terrain available to app developers early next year, but Wright isn’t offering any guesses as to when devices with the depth sensing needed to use it might appear. However, Kartik Venkataraman, chief technology officer and cofounder of Pelican Imaging, a startup in Mountain View, California, says that it won’t be more than a few years.

Venkataraman’s company has developed a depth-sensing camera sensor that is the same size as most cameras used in mobile devices today. “We have a reference design and are talking to handset OEMs that have interest in taking this to market on a cell phone or mobile device of some kind,” says Venkataraman.

Pelican’s technology uses an array of 16 tiny cameras to record the “light field” in front of a device, logging not only the color and location of light beams as a 2-D camera does but also the direction they come from (see “Light-Field Photography”). A light field can be used to reconstruct stereo images of a scene to infer depth, and Pelican’s sensor can capture the shape of objects to high accuracy within about 50 centimeters, says Venkataraman.
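To make the stereo idea concrete, here is a minimal sketch of how depth can be inferred from two views of the same scene. It is not Pelican's algorithm: it assumes just two cameras side by side, works on a single row of brightness values, and uses a basic sum-of-absolute-differences match. The function name and all parameters are hypothetical.

```python
def best_disparity(left_row, right_row, x, window=2, max_disparity=8):
    """Find the shift d that best aligns a patch around left_row[x] with right_row[x - d].

    A larger shift means the feature is closer to the cameras.
    """
    best_d, best_cost = 0, float("inf")
    lo, hi = x - window, x + window + 1
    for d in range(max_disparity + 1):
        if lo - d < 0 or hi > len(left_row):
            continue  # patch would fall off the edge of the row
        # Sum of absolute brightness differences between the two patches.
        cost = sum(abs(left_row[i] - right_row[i - d]) for i in range(lo, hi))
        if cost < best_cost:
            best_d, best_cost = d, cost
    return best_d


# Toy scanlines: the same bright feature appears 3 pixels further left
# in the right-hand camera's view.
left = [0] * 20
right = [0] * 20
left[9:12] = [50, 200, 90]
right[6:9] = [50, 200, 90]

print(best_disparity(left, right, x=10))
```

Repeating this search for every pixel yields a disparity map, which can then be converted to distances by triangulation once the camera spacing and focal length are known.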

Pelican is researching the same kinds of augmented reality and object-scanning uses as Occipital and Qualcomm. However, with such ideas far removed from how people use mobile devices today, Venkataraman’s work on using depth-sensing to enhance existing mobile photography might take off first.

“Instagram applies filters to the whole scene, but this allows you to apply filters to different layers of the scene,” says Venkataraman. “That gives the potential for creating much more interesting filters.”

Pelican’s demonstration app can make everything in a photo a moody black and white except for the person in the image, for example. The same app can use depth-sensing data to very accurately cut out people in an image so they can be repositioned, or cut and pasted into another scene.
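A depth-aware filter of this kind can be sketched in a few lines. The example below is an illustration under stated assumptions, not Pelican's implementation: it assumes each pixel comes with a depth value in meters, and the function name and the 0.5-meter cutoff are hypothetical. Pixels nearer than the cutoff keep their color; everything behind them is converted to gray using the standard Rec. 601 luma weights.

```python
def selective_grayscale(pixels, depths_m, keep_within_m):
    """Desaturate pixels farther away than keep_within_m; keep nearer pixels in color.

    pixels: list of (r, g, b) tuples with values 0-255
    depths_m: per-pixel distance from the camera in meters, same length as pixels
    """
    out = []
    for (r, g, b), depth in zip(pixels, depths_m):
        if depth <= keep_within_m:
            out.append((r, g, b))  # foreground layer: leave untouched
        else:
            gray = round(0.299 * r + 0.587 * g + 0.114 * b)  # Rec. 601 luma
            out.append((gray, gray, gray))  # background layer: black and white
    return out


# Two toy pixels: a red one on a nearby subject and a blue one on a distant wall.
image = [(255, 0, 0), (0, 0, 255)]
depths = [0.4, 2.0]

print(selective_grayscale(image, depths, keep_within_m=0.5))
```

The same per-pixel depth test could drive a cutout instead of a filter, by copying only the foreground pixels into another image.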
