
Powerful computer vision algorithms are now small enough to run on your phone

October 11, 2019
An image of hand gestures being recognized on a mobile phone. Hand illustrations: Noun Project / Ms. Tech

Researchers have shrunk state-of-the-art computer vision models to run on low-power devices.

Growing pains: Visual recognition is deep learning’s strongest skill. Computer vision algorithms are analyzing medical images, enabling self-driving cars, and powering face recognition. But training models to recognize actions in videos has grown increasingly expensive. This has fueled concerns about the technology’s carbon footprint and its increasing inaccessibility in low-resource environments.

The research: Researchers at the MIT-IBM Watson AI Lab have now developed a new technique for training video recognition models on a phone or other device with very limited processing capacity. Typically, an algorithm will process video by splitting it up into image frames and running recognition algorithms on each of them. It then pieces together the actions shown in the video by seeing how the objects change over subsequent frames. The method requires the algorithm to “remember” what it has seen in each frame and the order in which it has seen it. That bookkeeping makes the method inefficient.

In the new approach, the algorithm instead extracts basic sketches of the objects in each frame, and overlays them on top of one another. Rather than remember what happened when, the algorithm can get an impression of the passing of time by looking at how the objects shift through space in the sketches. In testing, the researchers found that the new approach trained video recognition models three times faster than the state of the art. It was also able to quickly classify hand gestures with a small computer and camera running only on enough energy to power a bike light.
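The overlay idea can be illustrated with a toy example. The sketch below is not the researchers' implementation; it is a minimal illustration that assumes tiny grayscale frames stored as nested lists, a crude neighbor-difference rule for the "sketch," and an overlay in which later frames write larger values. The names make_moving_dot, sketch, and overlay are hypothetical and chosen only for this example.

```python
from typing import List

Frame = List[List[float]]  # grayscale frame: rows of pixel intensities


def make_moving_dot(height: int, width: int, steps: int) -> List[Frame]:
    """Build toy frames: one bright dot in the middle row, moving right by one pixel per frame."""
    frames = []
    for t in range(steps):
        frame = [[0.0] * width for _ in range(height)]
        frame[height // 2][t + 1] = 1.0
        frames.append(frame)
    return frames


def sketch(frame: Frame, threshold: float = 0.5) -> List[List[int]]:
    """Crude outline: mark a pixel if it differs sharply from its right or lower neighbor."""
    h, w = len(frame), len(frame[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            here = frame[y][x]
            right = frame[y][x + 1] if x + 1 < w else here
            down = frame[y + 1][x] if y + 1 < h else here
            if abs(here - right) > threshold or abs(here - down) > threshold:
                out[y][x] = 1
    return out


def overlay(sketches: List[List[List[int]]]) -> List[List[int]]:
    """Stack the sketches into one image; a marked pixel stores the index of the latest frame that marked it."""
    h, w = len(sketches[0]), len(sketches[0][0])
    out = [[0] * w for _ in range(h)]
    for t, s in enumerate(sketches, start=1):
        for y in range(h):
            for x in range(w):
                if s[y][x]:
                    out[y][x] = t  # the most recent frame wins
    return out


if __name__ == "__main__":
    frames = make_moving_dot(height=3, width=5, steps=3)
    trail = overlay([sketch(f) for f in frames])
    for row in trail:
        print(row)
```

In the overlaid image, the values rise from left to right along the dot's path, so a single picture carries the direction of motion without a separate record of what appeared in which frame.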

Why it matters: The new technique could help reduce lag and computation costs in existing commercial applications of computer vision. It could, for example, make self-driving cars safer by speeding up their reaction to incoming visual information. The technique could also unlock applications that weren’t previously possible, such as enabling phones to help diagnose patients or analyze medical images.

Distributed AI: As more and more AI research gets translated into applications, the need for tinier models will increase. The MIT-IBM paper is part of a growing trend to shrink state-of-the-art models to a more manageable size.

