Skip to Content

A Google Glass App Knows What You’re Looking At

An app for Google’s wearable computer Glass can recognize objects in front of a person wearing the device.
September 30, 2013

Google has shown that the camera integrated into Google Glass, the company’s head-worn computer, can capture some striking video. Now machine learning company AlchemyAPI has built an app that uses that camera to recognize what a person is looking at. The app was built at an employee hack session held by the company this month to experiment with ways to demonstrate their new image recognition service.

Seeing Clearly: This machine vision app is 67 percent sure that it is looking at a desk (Credit: AlchemyAPI)

The app can either work on photos taken by a person wearing Glass, or constantly grab images from the device’s camera. Those are sent to the cloud or a nearby computer for processing by AlchemyAPI’s image recognition software. The software sends back its best guess at what it sees and then Glass will display, or speak, the verdict.

“There’s a slight delay and then you’ll hear it say ‘arm chair’ or ‘desktop computer,’” says AlchemyAPI’s CEO Elliot Turner. “It takes about 250ms to analyze a given frame.”

Here’s a video of the app in action:

You could say Turner’s app simply states the obvious, but doing that in (almost) real time is no mean feat for computer vision software. AlchemyAPI’s image recognition system is built on a system of complex simulated neural networks of the type known as “deep learning”, which can produce systems that learn faster and smarter than more established techniques. Google has been a pioneer in this area (see “Deep Learning”) and many other large companies including Microsoft (see “Microsoft Brings Star Trek’s Voice Translator to Life”) and Facebook (“Facebook Launches Advanced AI Research Group”) are also investing in the technology.

An online demo shows the capability of AlchemyAPI’s image recognition software. It shows the system responding to a constant train of images pulled from Google Image search and Flickr.

Although far from perfect, the software’s performance is impressive. The insight the demo gives into the certainty of each judgement it makes also suggests it could easily be made to appear more competent. Many of the system’s failures come when it tries to be very specific. Saying “This is an insect” would be better than “I’m not sure what this is, it could be a mantis or a cricket”. Turner says that early customers for the image recognition offering are mostly media companies that want to categorise and search large collections of unlabelled photographs.

Object recognition systems can be compared by testing them against the standard ImageNet database, which contains more than 50 million images labeled with 22,000 different categories. Elliot won’t share exact figures, but says his system performs on par with the best systems publicly tested against that, which typically get about 15-17 percent of their guesses wrong. One such system now powers the object recognition built into Google’s image search feature for its Google Plus social network, after Google bought a startup founded by deep learning pioneer Geoffrey Hinton of the University of Toronto year.

Keep Reading

Most Popular

open sourcing language models concept
open sourcing language models concept

Meta has built a massive new language AI—and it’s giving it away for free

Facebook’s parent company is inviting researchers to pore over and pick apart the flaws in its version of GPT-3

transplant surgery
transplant surgery

The gene-edited pig heart given to a dying patient was infected with a pig virus

The first transplant of a genetically-modified pig heart into a human may have ended prematurely because of a well-known—and avoidable—risk.

Muhammad bin Salman funds anti-aging research
Muhammad bin Salman funds anti-aging research

Saudi Arabia plans to spend $1 billion a year discovering treatments to slow aging

The oil kingdom fears that its population is aging at an accelerated rate and hopes to test drugs to reverse the problem. First up might be the diabetes drug metformin.

Yann LeCun
Yann LeCun

Yann LeCun has a bold new vision for the future of AI

One of the godfathers of deep learning pulls together old ideas to sketch out a fresh path for AI, but raises as many questions as he answers.

Stay connected

Illustration by Rose WongIllustration by Rose Wong

Get the latest updates from
MIT Technology Review

Discover special offers, top stories, upcoming events, and more.

Thank you for submitting your email!

Explore more newsletters

It looks like something went wrong.

We’re having trouble saving your preferences. Try refreshing this page and updating them one more time. If you continue to get this message, reach out to us at customer-service@technologyreview.com with a list of newsletters you’d like to receive.