Technology Review - Published By MIT

Microsoft Demos Augmented Vision


By Kate Greene

Tuesday, February 24, 2009


Recognizing elements of a scene regardless of the angle or lighting is a significant challenge. Cohen and his colleague Simon Winder, a senior research engineer at Microsoft, have developed algorithms that perform this task frame by frame on a video feed in real time. The algorithm instantly matches each frame to previously analyzed images stored in a database. In developing the algorithm, the researchers determined the best parameters, or characteristics, to help the system match each scene. Cohen explains that they used machine learning to quickly test different parameters and determine the ones that provide the best matches.
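The article does not describe how the matching works internally, so the following is only an illustrative sketch of one common approach: each stored image is reduced to a set of numeric feature descriptors, and each descriptor from a new frame votes for the scene that holds its nearest stored descriptor. The function names, the three-number descriptors, the scene labels, and the max_distance threshold are all hypothetical and are not taken from the Microsoft system.

```python
import math
from collections import Counter


def nearest_scene(descriptor, database):
    """Return (scene, distance) for the stored descriptor closest to `descriptor`."""
    best_scene, best_dist = None, math.inf
    for scene, stored in database.items():
        for candidate in stored:
            dist = math.dist(descriptor, candidate)  # Euclidean distance
            if dist < best_dist:
                best_scene, best_dist = scene, dist
    return best_scene, best_dist


def match_frame(frame_descriptors, database, max_distance=0.5):
    """Pick the scene that receives the most votes from the frame's descriptors.

    `max_distance` is a hypothetical tuning parameter: descriptors whose
    nearest stored neighbor is farther away than this cast no vote.
    """
    votes = Counter()
    for descriptor in frame_descriptors:
        scene, dist = nearest_scene(descriptor, database)
        if scene is not None and dist <= max_distance:
            votes[scene] += 1
    if not votes:
        return None  # nothing in the database resembles this frame
    return votes.most_common(1)[0][0]


# Toy database: scene label -> descriptors extracted ahead of time (made-up values).
database = {
    "entrance": [(0.1, 0.9, 0.2), (0.8, 0.1, 0.3)],
    "stage": [(0.5, 0.5, 0.9), (0.9, 0.9, 0.1)],
}

# Descriptors from one incoming video frame (made-up values).
frame = [(0.12, 0.88, 0.21), (0.79, 0.12, 0.33), (0.52, 0.49, 0.91)]

print(match_frame(frame, database))  # prints "entrance"
```

A threshold such as max_distance is the kind of parameter that could be tuned automatically against labeled example frames; a real-time system would also replace the brute-force search here with an indexed lookup.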

For today's demo, Cohen's team took pictures of the conference hall in which TechFest is being held. The photos were analyzed using the computer-vision software, and the key features were stored in a database on a laptop computer that employs a built-in video camera to capture a scene.

"In about a tenth or a fifteenth of a second, the software is able to recognize a scene and look it up in a database," says Cohen. For the treasure-hunt game demoed during TechFest, the software displays a trail of bubbles that points in the direction the user should walk to find the prize.

Since it is just a research project, Cohen stresses that there is still plenty of room for improvement. For one thing, the parameters used to identify physical features of objects could be refined to make matching even more accurate, he says.

Another challenge to consider is how this kind of system would work in a less controlled environment, says Kari Pulli, a research fellow at Nokia. "The most common augmented-reality application is to use it as a museum guide," he says. "That's easy to do because the environment is fixed." The challenge is to make sure that such systems can work in an unfamiliar context, like a city street. But Pulli believes that this could become possible thanks to databases owned by Microsoft, Google, and Navteq that contain images of street views.

Cohen says he's optimistic that the computer-vision algorithms developed by his team could have myriad uses--from augmented-reality systems to gaming and robotics--but he doesn't foresee them being used in a specific Microsoft product anytime soon.

MIT Massachusetts Institute of Technology © 2010 Technology Review. All Rights Reserved.