MIT Technology Review



Recognizing elements of a scene regardless of the angle or lighting is a significant challenge. Cohen and his colleague Simon Winder, a senior research engineer at Microsoft, have developed algorithms that perform this task frame by frame on a live video feed. The algorithm matches each frame, in real time, to previously analyzed images stored in a database. In developing it, the researchers had to determine which parameters, or characteristics, best help the system match each scene. Cohen explains that they used machine learning to quickly test different parameters and identify the ones that provide the best matches.
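The article does not say which parameters were tuned or which learning method was used. As a rough illustration of the general idea only, the Python sketch below tries every on/off combination of feature weights and keeps the one that labels a small validation set most accurately. The function names, the toy two-number descriptors, and the exhaustive search are all assumptions made for this example, not the researchers' method.

```python
from itertools import product


def nearest_label(descriptor, database, weights):
    """Return the label of the stored descriptor closest under a weighted squared distance."""
    best_label, best_dist = None, float("inf")
    for label, stored in database:
        dist = sum(w * (x - y) ** 2 for w, x, y in zip(weights, descriptor, stored))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def accuracy(weights, database, validation):
    """Fraction of validation descriptors that receive the correct scene label."""
    correct = sum(
        1 for desc, label in validation
        if nearest_label(desc, database, weights) == label
    )
    return correct / len(validation)


def select_weights(database, validation, options=(0.0, 1.0)):
    """Try every combination of per-feature weights and keep the most accurate one."""
    n_features = len(database[0][1])
    candidates = [w for w in product(options, repeat=n_features) if any(w)]
    return max(candidates, key=lambda w: accuracy(w, database, validation))


# Hypothetical toy data: the first feature separates the scenes, the second misleads.
database = [("entrance", (0.1, 0.2)), ("stage", (0.9, 0.8))]
validation = [((0.2, 1.0), "entrance"), ((0.8, 0.0), "stage")]

best = select_weights(database, validation)
print(best, accuracy(best, database, validation))
```

In this toy setup the search learns to ignore the misleading second feature. A real system would search over far more parameters and would need a much larger set of labeled image pairs.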

For today’s demo, Cohen’s team took pictures of the conference hall in which TechFest is being held. The photos were analyzed using the computer-vision software, and the key features were stored in a database on a laptop, which uses its built-in video camera to capture the scene.
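The setup described above, in which reference photos are reduced to key features and live frames are looked up against them, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: descriptors are plain tuples of numbers, matching is a brute-force nearest-neighbor vote, and the names `build_database`, `match_frame`, and `max_distance` are hypothetical. The article does not describe the actual feature format or lookup structure.

```python
import math


def distance(a, b):
    """Euclidean distance between two equal-length descriptors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_database(labeled_photos):
    """Map each scene label to the descriptors extracted from its reference photos."""
    database = {}
    for label, descriptors in labeled_photos:
        database.setdefault(label, []).extend(descriptors)
    return database


def match_frame(frame_descriptors, database, max_distance=0.5):
    """Each frame descriptor votes for the scene that holds its nearest stored descriptor.

    Descriptors whose nearest neighbor is farther than max_distance cast no vote.
    Returns the scene with the most votes, or None if nothing matched.
    """
    votes = {}
    for d in frame_descriptors:
        best_label, best_dist = None, float("inf")
        for label, stored in database.items():
            for s in stored:
                dist = distance(d, s)
                if dist < best_dist:
                    best_label, best_dist = label, dist
        if best_label is not None and best_dist <= max_distance:
            votes[best_label] = votes.get(best_label, 0) + 1
    if not votes:
        return None
    return max(votes, key=votes.get)


# Hypothetical descriptors standing in for features extracted from hall photos.
photos = [
    ("entrance", [(0.1, 0.9, 0.2), (0.2, 0.8, 0.1)]),
    ("stage", [(0.9, 0.1, 0.7), (0.8, 0.2, 0.9)]),
]
db = build_database(photos)

frame = [(0.85, 0.15, 0.8), (0.88, 0.12, 0.72)]
result = match_frame(frame, db)
print(result)
```

A brute-force scan like this grows linearly with the size of the database, so a real-time system would presumably rely on an indexed or approximate nearest-neighbor search instead.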

“In about a tenth or a fifteenth of a second, the software is able to recognize a scene and look it up in a database,” says Cohen. For the treasure-hunt game demoed during TechFest, the software displays a trail of bubbles pointing in the direction the user should walk to find the prize.

Cohen stresses that because this is still a research project, there is plenty of room for improvement. For one thing, the parameters used to identify physical features of objects could be refined to make matching even more accurate, he says.

Another challenge to consider is how this kind of system would work in a less controlled environment, says Kari Pulli, a research fellow at Nokia. “The most common augmented-reality application is to use it as a museum guide,” he says. “That’s easy to do because the environment is fixed.” The challenge is to make sure that such systems can work in an unfamiliar context, like a city street. But Pulli believes that this could become possible thanks to databases owned by Microsoft, Google, and Navteq that contain images of street views.

Cohen says he’s optimistic that the computer-vision algorithms developed by his team could have myriad uses, from augmented-reality systems to gaming and robotics, but he doesn’t foresee them being used in a specific Microsoft product anytime soon.


Credit: Microsoft

Tagged: Computing, Microsoft, software, augmented reality, visual system
