Erase Obstructions from Photos with a Click

Researchers at Google and MIT came up with an algorithm that could make taking pictures of obstructed scenes like taking a panoramic photo.

Rachel Metz

August 4, 2015

If your attempts at mimicking Ansel Adams with your smartphone camera have been foiled by things like reflective windows or chain-link fences, researchers at Google and MIT may have a solution: an algorithm that can separate the foreground from the background and delete the obnoxious obstruction.

Researchers at Google and MIT came up with an algorithm that can separate visual obstructions in the foreground, such as this chain-link fence, from what you want to photograph in the background.

The algorithm works by detecting differences between the foreground and background within a sequence of photos you’d take while moving your smartphone slightly, kind of like you would while taking a panoramic picture.

Michael Rubinstein, a research scientist at Google who worked as a postdoctoral researcher at Microsoft Research while some of the work was conducted, says the basic principle behind the algorithm is motion parallax: objects that are closer to us seem to move faster than those that are farther away. Because the obstruction is closer to the camera than the subject you’re actually trying to photograph, the two move differently from frame to frame.
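To put rough numbers on motion parallax, here is a minimal sketch that assumes a simple pinhole camera sliding sideways. It is not taken from the researchers' paper. The function name `pixel_shift` and every value in it (the focal length in pixels, how far the phone moves, and the distances to the fence and the tiger) are hypothetical and chosen only for illustration.

```python
def pixel_shift(focal_px: float, camera_move_m: float, depth_m: float) -> float:
    """Apparent sideways shift, in pixels, of a point depth_m away
    when a pinhole camera translates sideways by camera_move_m.

    Assumed model: shift = focal length (pixels) * camera motion / depth.
    """
    return focal_px * camera_move_m / depth_m


# Hypothetical numbers: the phone moves 5 cm between the first and last frame.
FOCAL_PX = 3000.0      # assumed focal length expressed in pixels
CAMERA_MOVE_M = 0.05   # assumed sideways hand motion, in meters

fence_shift = pixel_shift(FOCAL_PX, CAMERA_MOVE_M, depth_m=0.5)   # fence half a meter away
tiger_shift = pixel_shift(FOCAL_PX, CAMERA_MOVE_M, depth_m=10.0)  # tiger ten meters away

print(f"fence moves about {fence_shift:.0f} px, tiger about {tiger_shift:.0f} px")
```

With these made-up values the fence slides across the image roughly 20 times as far as the tiger does, and that gap is the kind of difference the algorithm relies on.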

“Since they’re moving differently, we can use that information to figure out there are two layers we’re actually looking at, and we remove one of them,” he says.
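Why differing motion makes the two layers separable can be shown with a toy example. This is not the researchers' algorithm; it swaps in a much simpler stand-in, a per-pixel median over frames that are assumed to be already aligned to the background. The one-dimensional "image," the fence positions, and the helper names `render_frame` and `remove_obstruction` are all hypothetical.

```python
from statistics import median


def render_frame(background, fence_positions, fence_value=0):
    """Copy the background and paint fence pixels over it."""
    frame = list(background)
    for p in fence_positions:
        if 0 <= p < len(frame):
            frame[p] = fence_value
    return frame


def remove_obstruction(aligned_frames):
    """Per-pixel median across frames aligned to the background.

    Works in this toy case because each background pixel is covered
    by the fence in fewer than half of the frames.
    """
    return [median(column) for column in zip(*aligned_frames)]


# A made-up 1-D "scene" of brightness values.
background = [10, 20, 30, 40, 50, 60, 70, 80]

# Five frames: the background stays put (already aligned), while the
# nearer fence slides one pixel per frame because of parallax.
frames = [render_frame(background, [k, k + 4]) for k in range(5)]

recovered = remove_obstruction(frames)
print(recovered)
```

Each individual frame has fence pixels in it, but because the fence lands on a different spot every time, the median at each position falls back to the background value. The same reasoning hints at why a subject in motion would defeat this kind of approach: it would no longer stay put across frames.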

Other work has been done on removing the same kinds of occlusions in photos, Rubinstein says, but he believes the researchers’ algorithm is more multipurpose.

The work will be presented in a paper this month at the Siggraph computer graphics and interaction conference in Los Angeles. Images attached to the paper, most of which were taken with a couple of Android smartphones, show a marked difference between the initial and final shots. In one, a black chain-link fence in front of a tiger’s enclosure at a zoo appears to be completely removed; in another, the reflection of a checkered shirt in a window, overlaid on a scene of a distant building and foliage, is mostly deleted.

Tianfan Xue, lead author of the paper and a graduate student at MIT, says that in addition to reflections on windows and chain-link fences, the algorithm can correct for a number of different kinds of obstructions on windows like raindrops or dirt. It also works on other reflecting surfaces.

Xue, who conducted part of the work while he was an intern at Microsoft Research, says that as long as the obstruction doesn’t move, the algorithm can be used to remove it.

Rubinstein says that “there’s interest” at Google in the algorithm, and perhaps eventually the work could become another camera feature, as it would operate similarly to the panorama feature on many smartphones. For now, though, there’s no concrete plan to bring it to users’ pockets.

Before that could happen, several limitations would need to be addressed. The algorithm won’t work with images of subjects in motion, such as sporting events, and doesn’t work well in low light. It also can’t currently handle several obstructions at once, such as a rainy window in front of a chain-link fence through which you were trying to photograph a lion at the zoo.

While it’s currently better than a photo occluded by a smeary window, “it’s definitely not magic,” Rubinstein says.
