Real-World Video Game

Surveillance was never so much fun.

David Talbot

July 25, 2002

The producers of the 1999 sci-fi martial-arts extravaganza The Matrix used elaborate and costly movie camera technology to circle around characters frozen mid-kung-fu kick. Such vertiginous visual effects can now be generated on the cheap from live images provided by stationary video cameras. But even more pertinent in the post-September 11 era, the package of video-processing and 3-D modeling technologies delivering these tricks promises to set a new standard for surveillance and security systems, and to bring new meaning to the term “reality programming.”

The system, developed at Sarnoff in Princeton, NJ, allows users to joystick their way through a live, 3-D scene as if it were the latest video game. Indeed, by stitching together scenes captured by dozens or hundreds of networked cameras, the technology makes it possible to conduct a virtual patrol of an entire urban center, or every hallway in a building, in real time. “It’s difficult for eyeballs to make sense of what a collection of cameras sees. This is an integrated way to use hundreds of cameras to make one display,” says Rakesh Kumar, computer scientist and lead developer of the technology, dubbed “Video Flashlight.”

What makes the new system so unusual is that it melds 3-D models of a background scene (say, a cluster of buildings) with real-time camera views of the same area. Image-processing chips developed at Sarnoff detect new or moving objects, construct 3-D images of these objects, and integrate them into the model. New software for “tweening” (filling in the gaps between video frames) lets security personnel “fly” around a subject such as a pedestrian, getting a detailed look without jumping between widely separated views. “This is beyond anything else out there” in vision processing, says Mari Maeda, a physicist who shepherded Sarnoff’s project as a program manager at the U.S. Defense Advanced Research Projects Agency.

Uses of the technology could include marking and tracking individuals as they move, or setting off an alarm when an unusual pattern is detected, such as a large group of people entering a building. Sarnoff engineers say the system needs refinement; for example, they aim to make it work with cameras that zoom, pan and tilt, not just fixed ones. But early versions have already been installed at U.S. Army Intelligence headquarters and are under consideration for New York City’s three airports, perhaps bringing us all a step closer to living inside the Matrix.
