Panoramic Imaging May Enhance Online Mapping

Microsoft researchers have developed software that could create more sweeping – and useful – perspectives for city maps.

Kate Greene

May 4, 2006

Researchers in Microsoft’s Interactive Visual Media Group have unveiled a software application that creates a panoramic, four-gigapixel image out of hundreds of smaller pictures, a technology that could be integrated into Windows Live Local to provide more visually accurate and interactive navigation for online maps.

The panorama shot, demonstrated at the Microsoft Research Silicon Valley Road Show in Mountain View, CA, this week, was composed of about 750 smaller digital images, captured by an off-the-shelf digital camera mounted atop a building in Seattle. The component pictures were then stitched together almost seamlessly by automated software developed by Microsoft researchers.

This big picture illustrated technology that’s part of ongoing work by the software giant to expand its online mapping platform, Windows Live Local, currently in its beta (or testing) phase. Live Local already provides traditional street maps and top-down views culled from satellite images for almost all of the United States, as well as actual photos of a handful of cities that allow viewers to look at buildings from an angled perspective – what Microsoft refers to as “Bird’s Eye” images.

The company plans to take the image-processing techniques demonstrated in these panoramas and apply them to the Bird’s Eye images, which are taken from planes flying at low altitudes over cities. This would allow future versions of Live Local to offer images stitched together for easier navigation, says Matt Uyttendaele, a Microsoft researcher who worked on the panorama project. Currently, Live Local offers only the angled view for a small part of the city at a time; in order to look beyond that view, another picture must be loaded into the browser.

Within the past year, competition for the best online mapping application has increased among Microsoft, Google, and Yahoo (see “Mapmaking at Microsoft”). Many experts believe that the Bird’s Eye images distinguish Microsoft maps from others by providing a more natural view of a city. Now panoramic views, such as those taken by the Microsoft researchers, could be even more useful for mapping purposes because multiple cameras placed on multiple buildings could provide more views of a site, says Rick Bobbit, founder of GeoSpatial Experts, a Thornton, CO-based company. By placing cameras all around a city atop buildings, “you could get a whole bunch of different angles, and that would be useful,” he says.

That’s in stark contrast to Google’s online mapping tool, for instance, which offers street maps, top-down satellite images, and a view that combines the two. But satellite images mostly provide information about the tops of buildings and their geometries, while Microsoft’s angled view offers pictures of storefronts or geographic nuances that a satellite can miss. The close-up, angled view can be more useful for actually navigating in an unfamiliar area, Bobbit says.

To capture the images for the panoramic views, Microsoft researchers mount a digital camera on a motorized platform on a building’s roof. The camera then slowly pans the scene, taking pictures as its view snakes up and down. Cohen says that each picture takes two to three seconds to capture, adding up to about 90 minutes of picture taking. The images are then processed by Microsoft’s “stitching” software, which combines the hundreds of photos.
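The snaking capture order can be pictured with a short sketch. It assumes, purely for illustration, that the motorized platform steps through a regular grid of pan (column) and tilt (row) positions; the grid size and the function name `snake_scan` are hypothetical and are not taken from Microsoft’s system.

```python
def snake_scan(columns, rows):
    """Return (column, row) positions, sweeping up one column and down the next."""
    order = []
    for col in range(columns):
        # Even columns go bottom-to-top, odd columns top-to-bottom,
        # so the camera never has to jump back to the starting tilt.
        if col % 2 == 0:
            tilt_steps = range(rows)
        else:
            tilt_steps = range(rows - 1, -1, -1)
        order.extend((col, row) for row in tilt_steps)
    return order


# Hypothetical 3-column by 2-row grid.
path = snake_scan(columns=3, rows=2)
print(path)
```

Sweeping this way keeps each shot next to the one taken just before it, which is one plausible reason to pan in a snaking pattern rather than restarting every column from the same edge.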

Traditionally, large panoramas have been created by software that automatically pieces together images, but this often blurs images, says Cohen. Additionally, most software fails to account for the changes in natural light that occur over the hour-long photo shoots.

Microsoft’s software tackles these problems by using algorithms that scour each individual picture for signature features – lines at the top of the building or bright points such as sunlight reflected in windows – and align them. Then, Uyttendaele says, the software “cuts” the images, as opposed to “blurring” them, which is how most panoramic software inelegantly stitches images together. Cutting is often needed for images of roads; for example, if there are two pictures of a road, with a car in one picture but not the other, the software will “cut” the portion of the picture with the car, instead of averaging or “blurring” the two images together. “This avoids ‘ghosts’ when objects such as cars are moving,” Cohen says.
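The difference between averaging and cutting shows up even in a toy example. The sketch below uses made-up grayscale values for one strip of road photographed twice, with a dark car present only in the first shot. The data, the function names, and the hand-picked seam position are all assumptions for illustration; the article does not describe how Microsoft’s software decides where to cut.

```python
# Two already-aligned shots of the same strip of road (grayscale, 0-255).
# Hypothetical data: road pixels are 200; a dark car (40) appears only in shot_a.
shot_a = [200, 200, 200, 40, 40, 200, 200, 200]
shot_b = [200, 200, 200, 200, 200, 200, 200, 200]


def blend_by_averaging(a, b):
    """Average the two shots pixel by pixel (the "blurring" approach)."""
    return [(x + y) // 2 for x, y in zip(a, b)]


def blend_by_cutting(a, b, seam):
    """Use shot a left of the seam and shot b from the seam onward (the "cutting" approach)."""
    return a[:seam] + b[seam:]


averaged = blend_by_averaging(shot_a, shot_b)
cut = blend_by_cutting(shot_a, shot_b, seam=3)  # seam chosen by hand, just before the car

print(averaged)  # the car survives at half strength: a "ghost"
print(cut)       # every pixel comes from exactly one shot, so no ghost
```

In the averaged strip the car pixels come out at 120, neither road nor car, which is the ghosting Cohen describes. In the cut strip each pixel comes from a single photograph, so the car is either fully present or fully absent.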

Additionally, Uyttendaele says, the software compensates for lighting changes over an hour’s time by adjusting each picture’s brightness to match the preceding one’s brightness. This process keeps a daytime sky appearing light and shadows of buildings dark consistently throughout the panorama.
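A minimal sketch of that chained adjustment follows. It assumes each picture is scaled by a single gain so that its average brightness matches that of the already-adjusted picture before it. The article says only that each picture’s brightness is matched to the preceding one’s, so the whole-image average, the function names, and the sample values are all illustrative assumptions.

```python
def mean(pixels):
    """Average brightness of a list of grayscale pixel values."""
    return sum(pixels) / len(pixels)


def match_brightness(images):
    """Scale each image so its mean brightness matches the adjusted image before it."""
    adjusted = [list(images[0])]  # the first picture sets the reference brightness
    for img in images[1:]:
        target = mean(adjusted[-1])
        current = mean(img)
        gain = target / current if current > 0 else 1.0  # leave all-black images alone
        adjusted.append([min(255.0, p * gain) for p in img])  # clip to the valid range
    return adjusted


# Hypothetical shots of the same scene taken as daylight fades.
shots = [[100, 120, 140], [90, 108, 126], [80, 96, 112]]
balanced = match_brightness(shots)
print([[round(p) for p in img] for img in balanced])
```

Because every picture is matched to its adjusted predecessor, the correction carries through the whole sequence, so the last picture ends up as bright as the first even though the light kept changing between them.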

Being able to incorporate the Bird’s Eye and panoramic images into online navigation is changing the map-making experience, says Robert Dollison, project manager for the U.S. Geological Survey’s Geospatial One-Stop project. Projects such as Microsoft’s Live Local and Google Earth are “driving more development in the field,” he says.

The challenge in this new type of mapmaking, Bobbit says, remains in making the mapping interface easy to use – not in how many nice photos are available. Making Live Local more user friendly is one of the goals of this research, says Microsoft’s Uyttendaele. “People love the detail of the [Bird’s Eye] imagery,” he says. “This should allow them to easily pan across the images.”

At the Microsoft Research Silicon Valley Road Show, the company demonstrated a number of prototypes designed to work with Microsoft’s online mapping application, Windows Live Local. Next week, we’ll describe another research project that makes real-time information, from traffic jams to restaurant wait times, searchable via online maps.
