
A Supercharged System to Teach Robots New Tricks in Little Time

Startup Osaro’s system is designed to make robots learn faster with a combination of deep-reinforcement learning and human help.
December 2, 2015

A new artificial intelligence startup called Osaro aims to give industrial robots the same turbocharge that DeepMind Technologies gave Atari-playing computer programs.

In December 2013, DeepMind showcased a type of artificial intelligence that had mastered seven Atari 2600 games from scratch in a matter of hours, and could outperform some of the best human players. Google swiftly snapped up the London-based company, and the deep-reinforcement learning technology behind it, for a reported $400 million.

Now Osaro, with $3.3 million in investments from the likes of Peter Thiel and Jerry Yang, claims to have taken deep-reinforcement learning to the next level, delivering the same superhuman AI performance but over 100 times as fast.

Deep-reinforcement learning arose from deep learning, a method of using neural networks with many layers to efficiently process and organize mountains of raw data (see “10 Breakthrough Technologies 2013: Deep Learning”). Deep learning now underlies many of the best facial recognition, video classification, and text or speech recognition systems from Google, Microsoft, and IBM Watson.

Deep-reinforcement learning adds control to the mix, using deep learning’s ability to accurately classify inputs, such as video frames from a game of Breakout or Pong, to work toward a high score. Deep-reinforcement learning systems train themselves automatically by repeating a task over and over again until they reach their goal. “The power of deep reinforcement is that you can discover behaviors that a human would not have guessed or thought to hand code,” says Derik Pridmore, president and chief operating officer of Osaro.

Training a new AI system from a blank slate, however, can take a long time. DeepMind’s Atari demo required tens of millions of video frames, the equivalent of many thousands of games, to reach near-perfection. That is fine for digital tasks, where supercomputers can run the games far faster than real time and compress the training into hours or minutes, but it doesn’t translate well to real-world robotics.

“A robot is a physically embodied system that takes time to move through space,” says Pridmore. “If you want to use basic deep-reinforcement learning to teach a robot to pick up a cup from scratch, it would literally take a year or more.”

To accelerate that training process, Osaro took inspiration from the way people learn most activities—by watching other people. Osaro has built a games-playing program that starts by observing a human play several games; it then uses those behaviors as a jumping-off point for its own training efforts. “It doesn’t copy a human and you don’t have to play precisely or very well. You just give it a reasonable idea of what to do,” says Pridmore. He claims Osaro’s AI system can pick up a game 100 times as fast as DeepMind’s program, although the company has yet to publish its research.
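Since Osaro has not published its method, the following is only a toy sketch of the general idea described above: let a learner replay a human's imperfect demonstration before it starts practicing on its own. It uses plain tabular Q-learning on a made-up six-cell corridor, not deep networks and not Osaro's system; the environment and every name in it (`step`, `record_demo`, `pretrain`, `fine_tune`, `greedy_rollout`) are assumptions for illustration.

```python
import random

N_STATES = 6            # hypothetical corridor: cells 0..5, cell 5 is the goal
LEFT, RIGHT = 0, 1
GAMMA, ALPHA = 0.9, 0.5  # discount factor and learning rate (arbitrary choices)

def step(state, action):
    """Move one cell; the only reward is 1.0 for reaching the goal."""
    nxt = min(state + 1, N_STATES - 1) if action == RIGHT else max(state - 1, 0)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def q_update(q, s, a, r, s2, done):
    """Standard one-step Q-learning update."""
    target = r if done else r + GAMMA * max(q[s2])
    q[s][a] += ALPHA * (target - q[s][a])

def record_demo(actions, start=0):
    """Turn a human's action sequence into (s, a, r, s2, done) transitions."""
    transitions, s = [], start
    for a in actions:
        s2, r, done = step(s, a)
        transitions.append((s, a, r, s2, done))
        s = s2
        if done:
            break
    return transitions

def pretrain(q, demo, passes=20):
    """Replay the demonstration several times before any self-practice."""
    for _ in range(passes):
        for transition in demo:
            q_update(q, *transition)

def fine_tune(q, episodes=50, epsilon=0.1, max_steps=50, seed=0):
    """Continue with ordinary epsilon-greedy trial and error."""
    rng = random.Random(seed)
    for _ in range(episodes):
        s = 0
        for _ in range(max_steps):
            if rng.random() < epsilon or q[s][LEFT] == q[s][RIGHT]:
                a = rng.choice((LEFT, RIGHT))
            else:
                a = RIGHT if q[s][RIGHT] > q[s][LEFT] else LEFT
            s2, r, done = step(s, a)
            q_update(q, s, a, r, s2, done)
            s = s2
            if done:
                break

def greedy_rollout(q, max_steps=50):
    """Follow the learned values from cell 0 without exploring."""
    s, path = 0, [0]
    for _ in range(max_steps):
        a = RIGHT if q[s][RIGHT] > q[s][LEFT] else LEFT
        s, _, done = step(s, a)
        path.append(s)
        if done:
            break
    return path

# The "human" wanders back one cell before reaching the goal (7 moves, not 5).
demo = record_demo([RIGHT, RIGHT, LEFT, RIGHT, RIGHT, RIGHT, RIGHT])
q = [[0.0, 0.0] for _ in range(N_STATES)]
pretrain(q, demo)
path_after_demo = greedy_rollout(q)
fine_tune(q)
path_after_practice = greedy_rollout(q)
print(path_after_demo, path_after_practice)
```

In this toy setting the learner ends up walking straight to the goal even though the demonstration included a wasted step backward, which mirrors Pridmore's point that the system does not simply copy the human. How much a demonstration speeds things up on real games or robots is a separate question that this sketch does not measure.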

Osaro’s first application for its deep-reinforcement learning technology is likely to be high-volume manufacturing, where reprogramming assembly line robots can currently take weeks of effort from highly skilled (and highly paid) professionals. Pridmore says Osaro can reduce that time to around a week, with an added benefit of building efficient control systems that can cope with “noisy” conditions such as uneven components or changing lighting.

Eventually, says Pridmore, the training process should be almost effortless. “In the future, you will be able to give a robot three buckets of parts, show it a finished product, and simply say, ‘Make something like this.’” That day is still some ways off. Osaro’s next step is to run simulated robotic demos in a virtual environment called Gazebo before launching with industrial robot manufacturers and their customers in 2017.

Oren Etzioni, executive director of the Allen Institute for Artificial Intelligence, says the approach is “technically exciting” and “tantalizing.” Pieter Abbeel, a professor of computer science at the University of California, Berkeley, and organizer of a deep-reinforcement learning symposium, agrees. “Learning more directly from human demonstrations and advice in all kinds of formats is intuitively the way to get a system to learn more quickly,” he says. “However, developing a system that is able to leverage a wide range of learning modalities is challenging.”

And there is always the question of what DeepMind has been working on. If DeepMind’s AI system could master Atari games in a matter of hours, two years behind Google’s closed doors might leave even Osaro’s human-taught AI systems far behind.
