By reimagining digital cameras, MIT scientists could help overhaul the art of photography.
On a summer day in 1826, at his country estate about 340 kilometers southeast of Paris, Joseph Nicéphore Niépce set up his camera obscura and projected the image of his courtyard onto a pewter plate coated with a light-sensitive material. For eight hours, the lens focused light from the sun, chemically fixing the areas where the light struck the plate to capture the view of a pigeon house, a pear tree, a barn roof, and an extended wing of his house. For this achievement, Niépce is credited with creating the world’s first photograph.
Pewter and other solid plates gave way to flexible rolls of film in 1889; color film followed in the mid-1930s. In the mid-1990s, the first mass-market color digital cameras were introduced, capturing images with light sensors on a chip. These advances have led to cheaper, smaller, more portable cameras that can produce vivid images. But at the most fundamental level, cameras haven’t been altered significantly, says Ramesh Raskar, associate professor and leader of the Camera Culture group at the MIT Media Lab. “The physical device itself has barely changed over the last 100 years,” he says. “You have a similar lens, a similar box that mimics the human eye. Other than the fact that it’s cheaper, faster, and more convenient, photography hasn’t changed that much.”
Raskar, however, is hoping that he and others at MIT and around the world can spark a revolution in photography. Researchers in a field called computational photography are rethinking digital cameras to take better advantage of the computers built into them. They envision a day when anyone can use a camera with a small, cheap lens to take the type of stunning pictures that today are achievable only by professional photographers using high-end equipment and software such as Adobe Photoshop. In fact, they think such cameras could exceed today’s most sophisticated technologies, overcoming what have seemed like fundamental limits.
Computational photography encompasses new designs for optical components and camera hardware as well as new algorithms for image analysis. The goal, says Raskar, is to build cameras that can record what the eye sees, not just what the lens and sensor are capable of capturing. “If you’re on a roller coaster, you can never get a good picture,” he says. “If you’re at a great dinner, you can never take pictures that make the food look appetizing.” But with computational techniques, cameras could eliminate blur from a snapshot taken on a bumpy amusement-park ride. Such cameras could also capture the subtle shapes and shadows of food and people’s smiles in the low light of a candlelit dinner–without a long exposure time, which invariably produces blurry pictures, or the use of a disruptive flash.
Sidebar: The Journey from Lab to Market
Moreover, computational photography could make it easy for amateur photographers to create pictures that today require specialized and time-consuming post-processing techniques. Even cell-phone cameras, which have inexpensive fixed lenses, could give amateurs the same kind of control over focusing that professionals have with a high-end single-lens reflex (SLR) camera.
All cameras operate in the same basic way: light enters through a focusing lens and passes through an aperture. In a traditional camera, the light hits photoreactive chemicals on film or plates. In a digital camera, the light passes through color-separating filters and lands on an array of photosensors, each of which represents a pixel. When light hits a photosensor, it produces an electrical current whose strength reflects the intensity of the light. The current is converted to digital 1s and 0s, which the camera’s processor (a computer chip) then converts into the image that shows up on the camera’s preview screen and is stored on a flash memory card or an internal hard drive.
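That final analog-to-digital step can be sketched in a few lines (a toy model only; real cameras also demosaic the color-filter pattern, subtract sensor noise, and apply tone curves):

```python
import numpy as np

# Toy model of the digitization step: each photosensor's current,
# proportional to the intensity of the light that hit it, is quantized
# to an 8-bit pixel value (one of 256 discrete levels).
light = np.array([0.0, 0.25, 0.5, 0.99, 1.0])    # normalized intensities
pixels = np.round(light * 255).astype(np.uint8)  # the camera's "1s and 0s"
print(pixels)  # [  0  64 128 252 255]
```

Once the scene exists as numbers like these, everything that follows in this article — deblurring, refocusing, relighting — becomes a computation on data rather than a change to chemistry or glass.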
When images are captured as digital bits of information, they can be improved by software, opening up a whole new world of possibilities. For the past 15 years or so, says Raskar, researchers have been working to take full advantage of those possibilities, especially through new image processing algorithms that borrow from the traditionally distinct fields of computer vision and computer graphics. Computer vision enables a camera to analyze objects in a picture, picking out features like the edge of a table. And the techniques of computer graphics offer numerous ways to manipulate a digital image. When these approaches are combined in a camera whose optical components are designed with such algorithms in mind, you can do some surprising things. For example, you can, in effect, adjust the light source after the photo has been taken, so that an object lit from one angle appears to be lit from another. And you can even adjust the focus on a photograph after the fact.
Battling Motion Blur
One of the most compelling examples of what computational photography can achieve is motion-invariant photography, a clever way of eliminating blur from pictures of moving objects.
“Blur is a process that scrambles information,” says Frédo Durand, an MIT associate professor of electrical engineering and computer science who has helped develop this idea. Pixels of a digital image behave just like squares on a checkerboard, he says. A rapidly moving black-and-white checkerboard pattern blurs to gray, an average of the black and white squares. But if you know precisely how the checkerboard was moved–say, by spinning it around a point in the center, or by shaking it up and down–then you can write a mathematical function to describe the motion-based blur. Once you know that function, you can invert it to remove the blur.
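The principle can be demonstrated in one dimension (a minimal sketch assuming the blur kernel is known exactly and there is no sensor noise; real deblurring must also handle noise, for example with Wiener filtering):

```python
import numpy as np

# "Invert the blur": if you know precisely how an image was blurred,
# the blur is a mathematical function you can undo.
rng = np.random.default_rng(0)
sharp = rng.random(64)               # stand-in for one row of a sharp image

kernel = np.zeros(64)
kernel[:3] = [0.5, 0.3, 0.2]         # a known, asymmetric blur kernel

# Blurring = circular convolution = pointwise product in the Fourier domain.
blurred = np.real(np.fft.ifft(np.fft.fft(sharp) * np.fft.fft(kernel)))

# Knowing the kernel, invert the blur by dividing in the Fourier domain.
recovered = np.real(np.fft.ifft(np.fft.fft(blurred) / np.fft.fft(kernel)))

print(np.allclose(recovered, sharp))  # True
```

The catch, as Durand's checkerboard example suggests, is the word "precisely": uniform motion blur averages detail away irrecoverably, which is why the researchers engineer the blur to be invertible in the first place.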
Durand and his colleagues–including Anat Levin, a postdoctoral fellow, and Bill Freeman, an MIT professor of electrical engineering and computer science–have designed a camera that can take advantage of this principle to remove blur from a picture of an object that’s traveling in a straight line, such as a car speeding down the road. The key is to do something counterintuitive, Durand says: “We create more blur by moving the camera during exposure.”
The researchers’ test camera has an optical system that moves back and forth along a straight line, blurring the entire image. Because of the way the sensor is moving back and forth, there will be at least one moment during the exposure when the camera is perfectly tracking the photographed object, allowing the camera to capture accurate information about the object’s visual structure, regardless of its velocity. This information enables the researchers to write an equation defining the motion-based blur–and then to eliminate the velocity from that equation. By inverting the equation, they can reconstruct an image without any blur at all (see “Eliminating Motion Blur,” p. M14).
In this camera, unlike a typical model, “the job of the optics isn’t to directly form the final image,” Durand says. Instead, in a sense, it’s to “shuffle the light rays so what’s recorded by the sensor gives us access to more information.”
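The geometry behind the moving sensor can be checked numerically (an illustrative model with invented parameters, not the researchers' actual system):

```python
import numpy as np

# A sensor swept with constant acceleration a has velocity 2*a*t, which
# covers the whole range [-2aT, 2aT] during the exposure; an object moving
# at any speed v in that range is perfectly tracked at t* = v / (2a).
a, T = 5.0, 1.0                       # sweep acceleration, half-exposure
t = np.linspace(-T, T, 200_001)       # time samples during the exposure

def blur_counts(v, bins):
    # The object's position relative to the sweeping sensor, measured from
    # the blur's peak, is u(t) = a * (t - t*)**2 -- the same parabola for
    # every object velocity v. That is why the blur is velocity-invariant.
    t_star = v / (2 * a)              # instant of perfect tracking
    u = a * (t - t_star) ** 2
    counts, _ = np.histogram(u, bins=bins)
    return counts

bins = np.linspace(0.0, 1.5, 31)      # central part of the blur kernel
k_static = blur_counts(0.0, bins)     # stationary object
k_moving = blur_counts(4.0, bins)     # fast-moving object
# The two kernels are nearly identical, so a single known deblurring
# step corrects objects at every speed at once.
print(np.abs(k_static - k_moving).max() / k_static.max())
```

Because the blur no longer depends on the unknown velocity, it can be inverted with the Fourier-division trick sketched earlier.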
One of the annoyances of cell-phone and compact cameras is that they lack the SLR’s focusing control. With an SLR camera, the lens can be moved to change what’s in focus. By adjusting the aperture, a photographer can get a shot in which a foreground subject is in clear focus, while the background is purposely blurred to deëmphasize distracting elements. SLRs are expensive, however, and they’re difficult for amateurs to use. Computational-photography researchers are trying to develop a simple, fixed-lens cell-phone camera that makes it easy for anyone to achieve such effects. They also hope to give photographers the ability to choose which objects they want in focus after a picture is taken.
Cameras are designed to focus on objects within a given range. When a camera is focused on a particular object, the lens concentrates the light reflecting off that object onto the sensor array. The light reflecting off objects that are not in focus still reaches the sensors, but it’s unconcentrated, resulting in a blurred image. “If a camera is not perfectly focused,” Durand says, “then the lens will project points from the scene onto the sensor as disks rather than points.”
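Under the simple thin-lens model, the diameter of that disk follows directly from the object's distance and the focus distance (the focal length and aperture below are illustrative values, not any particular camera):

```python
def blur_disk_mm(depth_m, focus_m, f_mm=50.0, aperture_mm=25.0):
    """Thin-lens circle-of-confusion diameter (mm) on the sensor for a
    point at depth_m metres when the lens is focused at focus_m metres.
    Illustrative model only; real lenses deviate from thin-lens optics."""
    f = f_mm / 1000.0                 # focal length in metres
    A = aperture_mm / 1000.0          # aperture diameter in metres
    c = A * f * abs(focus_m - depth_m) / (depth_m * (focus_m - f))
    return c * 1000.0                 # disk diameter back in millimetres

print(blur_disk_mm(2.0, 2.0))   # 0.0 -- a point on the focal plane stays a point
print(blur_disk_mm(1.0, 2.0))   # a nearer point lands as a disk
print(blur_disk_mm(10.0, 2.0))  # so does a farther one
```

The disk grows as the object moves away from the focal plane, which is exactly why blur carries depth information, if only the camera could read it.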
If the distances between the camera and objects in an image are known, then an algorithm can be applied to the image data to sharpen the out-of-focus parts of a picture, converting the blurred disks of light into focused points. Conventional cameras, however, can’t determine this depth information on their own.
To extract depth information from a photograph, Durand, Freeman, and other colleagues modified an existing lens with a mask inserted into the aperture. Essentially, the mask is a piece of cardboard that blocks part of the light to subtly change the look of the out-of-focus parts of the picture. Durand explains that the undifferentiated blur caused by an ordinary out-of-focus lens doesn’t provide enough clues that could be used to reconstruct a clear image. But their mask changes this uniform blur into what he calls a “weird but structured mess.” Streaks and other unusual features of the blurry image help the researchers recover depth information: thanks to the way the mask blocks light in the camera aperture, an object 10 feet from the camera will be blurred differently from an object five inches away. Because they know the shape of the mask, the researchers have been able to mathematically define the blur associated with each depth, enabling them to devise an algorithm that can undo it (see photographs of conventional and coded apertures and “Extracting Depth Information,” p. M15).
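A one-dimensional toy version conveys the idea (the mask pattern, scales, and scoring below are invented for illustration, not the researchers' method): because the kernel family is known in advance, one can test which depth's kernel actually explains an observed blurred signal.

```python
import numpy as np

# Coded-aperture depth recovery, 1-D sketch: each depth blurs the scene
# with a differently scaled copy of a known, structured mask pattern.
rng = np.random.default_rng(1)
N = 64
sharp = rng.random(N)                 # unknown sharp scene (for simulation)

def kernel_at_depth(scale):
    """Hypothetical coded blur kernel: the mask pattern stretched by a
    depth-dependent integer scale (farther from focus = wider blur)."""
    k = np.zeros(N)
    pattern = np.array([0.5, 0.3, 0.2])          # invented mask code
    k[: 3 * scale] = np.repeat(pattern, scale) / scale
    return k

true_scale = 1
blurred = np.real(np.fft.ifft(
    np.fft.fft(sharp) * np.fft.fft(kernel_at_depth(true_scale))))

def residual(scale, eps=1e-6):
    """How badly this depth's kernel explains the observation:
    regularized least-squares deblur, then re-blur and compare."""
    H = np.fft.fft(kernel_at_depth(scale))
    B = np.fft.fft(blurred)
    S_hat = B * np.conj(H) / (np.abs(H) ** 2 + eps)
    return np.linalg.norm(S_hat * H - B)

scores = {s: residual(s) for s in (1, 2, 3)}
print(min(scores, key=scores.get))    # the depth whose kernel fits best
```

The structure of the coded mask is what makes the wrong depths fail visibly; a plain circular aperture blurs so smoothly that different depths are hard to tell apart.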
Another strategy for improving focus, especially in a simple cell-phone camera, is similar to Durand’s technique for addressing motion blur. An SLR’s large aperture size gives it a shallow depth of field (the range of distances from the camera where objects appear sharply in focus), which makes it possible to focus on a specific subject and allow the background, the foreground, or even both to recede, explains Raskar. But pictures taken with ordinary cell-phone cameras, which have very small apertures, appear “flat” because everything looks as though it’s the same distance from the camera. At the first IEEE International Conference on Computational Photography, held in San Francisco in April, postdoc Ankit Mohan presented a paper he wrote with Raskar and others describing a technique for simulating a lens with a larger aperture size. They demonstrated how a fixed-lens camera can be designed so that both its lens and its sensor move slightly during exposure. By varying the velocity and range of the movement, they are able to, in effect, change the focal length and aperture size to control which part of the photo is in focus; the rest is purposely blurred (see “Focus Control for Fixed-Lens Cameras,” p. M15). Such technology could give a cheap cell-phone camera the focusing control of an SLR.
Adjusting Lighting, Perspective
Improved focusing is just the beginning; computational photography could also enable people to adjust the lighting of a scene, or even change the camera’s perspective, after a shot has been taken. This is the kind of trick that computer vision makes possible. “It’s difficult to do if your computer only has an understanding of the image at the level of pixels,” says Freeman. “But if you can give the computer an understanding of that image in terms of higher-level concepts, like lighting or shape, then you can let the user adjust the knobs controlling those quantities.”
This higher-level understanding comes from image analysis algorithms that let a computer “see” the components of a picture. For example, an algorithm can identify which components of the image are due to the coloring of an object’s surface and which are due to the modulation of light reflected by its shape. Once that is known, a user can adjust surface coloration and lighting effects independently. The goal is to build a system that can identify, say, where the edge of a dark piece of prime rib ends and the shadow it casts on a plate begins.
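The model behind this separation is often written as observed pixel = reflectance × shading. Recovering the two layers from a single photo is the hard research problem; the toy sketch below assumes that step is already done and shows why the decomposition is useful:

```python
import numpy as np

# Intrinsic-image model: each pixel is the product of the surface's own
# coloring (reflectance) and the light falling on it (shading).
reflectance = np.array([[0.9, 0.2],
                        [0.2, 0.9]])   # surface coloring
shading     = np.array([[1.0, 1.0],
                        [0.3, 0.3]])   # bottom row lies in shadow
photo = reflectance * shading          # what the camera records

# With the layers separated, relight the scene without touching the
# surface colors: brighten only the shadowed region.
new_shading = np.where(shading < 0.5, 0.8, shading)
relit = reflectance * new_shading
print(relit)
```

Turning the "lighting knob" is then just editing one factor of the product while leaving the other untouched.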
Such techniques, Raskar says, could reveal details–such as subtle facial expressions–that would previously have been obscured by shadows. In short, cameras will be able to more closely capture the essence of a scene. “When you’re walking down the street with a friend, you can be in any type of lighting and you can see how beautiful this person is,” he says. “Right now a photograph can’t do that.” But computational techniques are narrowing the gap that still separates the eye and brain from the camera. Ten years from now, he says, that may be exactly what a photograph can do.