Augmented Reality Meets Gesture Recognition

A new app superimposes imagery over your smart-phone view, and lets you interact with it via hand gestures.
September 15, 2011

To make its business software more effective, HP recently agreed to pay about $10 billion for Autonomy, a U.K. software company that specializes in machine learning. But it turns out that Autonomy has also developed image-processing techniques for gesture-recognizing augmented reality, a type of technology that could appeal more to consumers than to IT managers.

Enhancing reality: The Aurasma app overlays interactive content on the real world, such as a page in a magazine. The app can recognize gestures, too, letting a user interact with virtual objects.

Augmented reality involves layering computer-generated imagery on top of a view of the real world as seen through the camera of a smart phone or tablet computer. So someone looking at a city scene through a device could see tourist information on top of the view.

Autonomy’s new augmented reality technology, called Aurasma, goes a step further: it recognizes a user’s hand gestures. This means a person using the app can reach out in front of the device to interact with the virtual content. Previously, interacting with augmented reality content involved tapping the screen. One demonstration released by Autonomy creates a virtual air hockey game on top of an empty tabletop—users play by waving their hands.

Autonomy’s core technology lets businesses index and search data that conventional, text-based search engines struggle with. Examples are audio recordings of sales calls, or video from surveillance cameras. “We use the same core technology in Aurasma to identify images or scenes and retrieve the relevant content to put on top,” says Aurasma director Matt Mills, who presented the app at the DEMO technology conference in Santa Clara, California, this week.

Autonomy quietly launched Aurasma in May, and GQ magazine has already used it to make some of its pages interactive. But the company announced only recently that Aurasma can track and respond to gestures to make virtual objects interactive. “We’ve now added finger recognition,” says Mills, “so you get an experience a bit like using the Kinect. You reach out your hand and the content responds.”

The Aurasma app, available for iPhone, iPad, and Android smart phones, continuously creates a visual “fingerprint” of what’s in front of the camera and compares it with a set of fingerprints for the area where the app is being used. When it identifies a scene, perhaps a photo on a billboard, the Statue of Liberty, or a house on your street, interactive video or imagery is overlaid on the view. Users can also create their own content by assigning a photo or video to a particular real-world scene. The virtual content is carefully aligned with the visual features it was programmed for, so a massive dinosaur can rear its head behind the Golden Gate Bridge, as one demonstration video shows.
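The article does not describe how Aurasma computes or compares its fingerprints, so the following is only a rough sketch of the general idea of matching a camera frame against fingerprints stored for the current area. It assumes, purely for illustration, that a fingerprint is a 64-bit hash and that similarity is measured by Hamming distance; the names `StoredScene`, `hamming_distance`, `match_scene`, and the `max_distance` threshold are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class StoredScene:
    """A known scene for the current area (hypothetical structure)."""
    name: str         # label for the scene, e.g. a billboard
    fingerprint: int  # assumed: a 64-bit hash standing in for a visual fingerprint
    overlay: str      # content to draw on top of the camera view


def hamming_distance(a: int, b: int) -> int:
    """Count the bits that differ between two fingerprints."""
    return bin(a ^ b).count("1")


def match_scene(
    frame_fingerprint: int,
    nearby: Sequence[StoredScene],
    max_distance: int = 10,  # assumed tolerance, not a value from the article
) -> Optional[StoredScene]:
    """Return the closest nearby scene, or None if nothing is close enough."""
    best = min(
        nearby,
        key=lambda scene: hamming_distance(frame_fingerprint, scene.fingerprint),
        default=None,
    )
    if best is None:
        return None
    if hamming_distance(frame_fingerprint, best.fingerprint) > max_distance:
        return None
    return best
```

In a loop over camera frames, an app built this way would call `match_scene` on each new fingerprint and draw the returned scene's `overlay` only when a match is found. A production system would more likely compare many local image features per frame rather than a single hash.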

Aurasma’s closest competitor is Layar, a Dutch company whose augmented-reality platform lets others add content. However, Layar has so far relied largely on GPS location to position content, and only recently made it possible to position virtual objects more precisely using image recognition. Layar also does not recognize users’ gestures.

Mills says that Aurasma’s ability to track objects precisely means it can be used for more than just advertising. In another demonstration, a smart phone running the app, when pointed at the back of a broadband router, revealed graphics and text explaining what each port was for.

Although mobile phones and tablets are the best interfaces available for augmented reality today, the experience is still somewhat clunky, since a person must hold up a device with one hand at all times. Sci-fi writers and technologists have long forecast that the technology would eventually be delivered through glasses. Recognizing hand movements would be useful for such a design, since there wouldn’t be the option of using a touch screen or physical buttons.
