Compound-Eye Camera Analyzes Scenes

A new, ultrathin camera can extract 3-D information from scenes and recognize objects.

Kate Greene

May 23, 2007

Researchers at the University of Osaka have developed an ultrathin camera that can determine the distance between objects in a scene and pick out color and structural features. In effect, the team, led by Jun Tanida, has built an integrated hardware and software system for recognizing objects and recreating 3-D scenes.

**New perspective:** At top, an example of an image captured by the compound-eye camera. Although the nine images seem to be identical, they hold slightly different information about the objects. When the signals are combined, information about the objects, such as detailed structure and the location data, can be retrieved. The black box in the middle, which is about the size of a shirt button, contains the lenses and image sensors. The data is sent, via a USB connector (attached to the black box), to a PC, which processes the data and extracts 3-D information. An expanded view of the lenses and sensors is shown at bottom. The nine lenses are embedded in a case placed in front of the CMOS (Complementary Metal-Oxide Semiconductor) image sensor (blue). These components are affixed to circuitry, which outputs data to a PC via a USB port.

Tanida says that he looked at biological imaging systems, in particular the compound eyes of insects, for the design blueprint. The technology, called TOMBO (Thin Observation Module by Bound Optics), is a collection of nine small lenses plus software that analyzes the scene by mimicking the process that insects use to recognize the position, shape, and color of objects, Tanida says. The researchers have crammed TOMBO’s hardware into a tiny box the size of a shirt button. And in an age of ever smaller and thinner mobile gadgets, such a compound-eye camera could provide powerful image-taking and image-recognizing functions in, for example, a cell phone.

The general principle behind the camera isn’t new, says Frédo Durand, a professor of electrical engineering at MIT. For years, researchers have been experimenting with compound-eye lenses in cameras to increase resolution, for instance. Unlike other groups working in the field, Durand says, the Osaka researchers are focusing on making the device as thin as possible so that it can be useful in applications in which thickness is an issue. For example, Durand says, a thin image-recognizing system like the TOMBO camera could be secured to the wings of an aircraft for surveillance purposes without causing much drag.

The basic idea behind the technology is that multiple lenses capture information about a scene from slightly different angles, just as our eyes look at an object from two distinct points of view. The relative angle at which a person sees an object depends on how far away the object is from her eyes. Additionally, the color and shape of an object differ slightly based on which eye is looking at it and where a light source is. Essentially, our brains compare the input from our two eyes to determine distance, color, and shape, among other features.
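To make the two-eye comparison concrete, here is a minimal sketch of the standard pinhole-stereo relation, in which distance equals focal length times lens separation divided by the apparent shift (disparity) of the object between the two views. The article does not say that TOMBO uses this exact formula; the function name and the sample numbers are hypothetical and chosen only for illustration.

```python
def depth_from_disparity(focal_length_px, baseline_m, disparity_px):
    """Estimate distance to an object seen from two viewpoints.

    focal_length_px: lens focal length, expressed in pixels (assumed value)
    baseline_m:      separation between the two lenses, in meters (assumed value)
    disparity_px:    how far the object shifts between the two images, in pixels
    """
    if disparity_px <= 0:
        # Zero shift means the object is too far away to triangulate.
        raise ValueError("disparity must be positive")
    return focal_length_px * baseline_m / disparity_px


# Hypothetical numbers: nearby objects shift more than distant ones.
near = depth_from_disparity(500.0, 0.002, 4.0)
far = depth_from_disparity(500.0, 0.002, 1.0)
print(near, far)
```

With lenses only a couple of millimeters apart, as in a button-sized module, the shift shrinks quickly with distance, so a compact design of this kind would be expected to resolve depth best at close range.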

The same principle is applied to the image-recognition algorithms, says Tanida. The software separates the nine small images, removes shading, compensates for distortion in the images, and remaps the pixels into a single two-dimensional image. Tanida explains that the accumulated error in the remapping process, which is effectively the differences between the images from each lens, can be used to extract the object’s distance, color, and shape. That information allows the scene to be recreated in 3-D and can also be used for object recognition.
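The first and last of those steps can be sketched in a few lines. The example below is not Tanida's software, whose details the article does not give. It swaps in a generic technique, block matching by mean absolute difference, and assumes the sensor frame is a plain 2-D list of brightness values holding a 3-by-3 grid of equal-size sub-images. The names `split_grid` and `horizontal_shift` are hypothetical.

```python
def split_grid(image, rows=3, cols=3):
    """Cut a 2-D list of pixels into rows*cols equal tiles, in row-major order."""
    height, width = len(image), len(image[0])
    tile_h, tile_w = height // rows, width // cols
    tiles = []
    for r in range(rows):
        for c in range(cols):
            tile = [line[c * tile_w:(c + 1) * tile_w]
                    for line in image[r * tile_h:(r + 1) * tile_h]]
            tiles.append(tile)
    return tiles


def horizontal_shift(ref, other, max_shift=3):
    """Find the shift s that best aligns other[y][x + s] with ref[y][x].

    Tries every s in [-max_shift, max_shift] and keeps the one with the
    lowest mean absolute difference over the overlapping pixels.
    """
    height, width = len(ref), len(ref[0])
    best_shift, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        total, count = 0, 0
        for y in range(height):
            for x in range(width):
                xs = x + s
                if 0 <= xs < width:
                    total += abs(ref[y][x] - other[y][xs])
                    count += 1
        if count == 0:
            continue  # no overlap at this shift
        cost = total / count
        if cost < best_cost:
            best_cost, best_shift = cost, s
    return best_shift


# Synthetic 6x24 frame: a bright vertical line appears at x=2 in the
# first sub-image and at x=4 in the second, as if seen from two lenses.
frame = []
for _ in range(6):
    line = [0] * 24
    line[2] = 255
    line[8 + 4] = 255
    frame.append(line)

tiles = split_grid(frame)
print(horizontal_shift(tiles[0], tiles[1]))
```

The shift recovered for each pair of sub-images plays the role of the "differences between the images from each lens" described above. A real pipeline would also need the shading and distortion corrections, which this sketch omits.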

Dave Brady, a professor of electrical engineering at Duke University, says that he “thinks highly” of Tanida’s system. While much of the technology, including the optical setup and some of the algorithms for analyzing objects, isn’t completely new, Tanida’s group has integrated it into a novel, small package that could be useful in applications ranging from cell-phone scanners to automobile navigation systems. “The innovation is more in the optical design and integration,” Brady says.

Tanida admits that the image quality of the TOMBO camera, which is currently only about 1.1 megapixels, needs to improve. That improvement will come mainly from altering the image-processing algorithms and adding lenses. Also, he says, it’s important to consider the application for which the camera design will be optimized. For instance, a surveillance camera in a parking lot may need only low-resolution images, which would require a camera with a small number of lenses. However, military applications may need a much higher-resolution camera with a large array of lenses.
