This robot dog just taught itself to walk

AI could help robots learn new skills and adapt to the real world quickly.

Melissa Heikkiläarchive page

July 18, 2022

Danijar.com

The robot dog is waving its legs in the air like an exasperated beetle. After 10 minutes of struggling, it manages to roll over to its front. Half an hour in, the robot is taking its first clumsy steps, like a newborn calf. But after one hour, the robot is strutting around the lab with confidence.

What makes this four-legged robot special is that it learned to do all this by itself, without being shown what to do in a computer simulation.

Danijar Hafner and colleagues at the University of California, Berkeley, used an AI technique called reinforcement learning, which trains algorithms by rewarding them for desired actions, to train the robot to walk from scratch in the real world. The team used the same algorithm to successfully train three other robots, such as one that was able to pick up balls and move them from one tray to another.

Traditionally, robots are trained in a computer simulator before they attempt to do anything in the real world. For example, a pair of robot legs called Cassie taught itself to walk using reinforcement learning, but only after it had done so in a simulation.

“The problem is your simulator will never be as accurate as the real world. There’ll always be aspects of the world you’re missing,” says Hafner, who worked with colleagues Alejandro Escontrela and Philipp Wu on the project and is now an intern at DeepMind. Adapting lessons from the simulator to the real world also requires extra engineering, he says.

The team’s algorithm, called Dreamer, uses past experiences to build up a model of the surrounding world. Dreamer also allows the robot to conduct trial-and-error calculations in a computer program as opposed to the real world, by predicting potential future outcomes of its potential actions. This allows it to learn faster than it could purely by doing. Once the robot had learned to walk, it kept learning to adapt to unexpected situations, such as resisting being toppled by a stick.

“Teaching robots through trial and error is a difficult problem, made even harder by the long training times such teaching requires,” says Lerrel Pinto, an assistant professor of computer science at New York University, who specializes in robotics and machine learning. Dreamer shows that deep reinforcement learning and world models are able to teach robots new skills in a really short amount of time, he says.

Jonathan Hurst, a professor of robotics at Oregon State University, says the findings, which have not yet been peer-reviewed, make it clear that “reinforcement learning will be a cornerstone tool in the future of robot control.”

Removing the simulator from robot training has many perks. The algorithm could be useful for teaching robots how to learn skills in the real world and adapt to situations like hardware failures, Hafner says–for example, a robot could learn to walk with a malfunctioning motor in one leg.

The approach could also have huge potential for more complicated things like autonomous driving, which require complex and expensive simulators, says Stefano Albrecht, an assistant professor of artificial intelligence at the University of Edinburgh. A new generation of reinforcement-learning algorithms could “super quickly pick up in the real world how the environment works,” Albrecht says.

But there are some big unsolved problems, Pinto says.

With reinforcement learning, engineers need to specify in their code which behaviors are good and are thus rewarded, and which behaviors are undesirable. In this case, turning over and walking is good, while not walking is bad. “A roboticist will need to do this for each and every task [or] problem they want the robot to solve,” says Pinto. That is incredibly time consuming, and it is difficult to program behaviors for unexpected situations.

And while simulators can be inaccurate, so can world models, Albrecht says. “World models start from nothing, so initially the predictions from the models will be completely all over the place,” he says. It takes time until they get enough data to make them accurate.

In the future, Hafner says, it would be nice to teach the robot to understand spoken commands. Hafner says the team also wants to connect cameras to the robot dog to give it vision. This would allow it to navigate in complex indoor situations, such as walking to a room, finding objects, and—yes!—playing fetch.

Deep Dive

Artificial intelligence

Large language models can do jaw-dropping things. But nobody knows exactly why.

And that's a problem. Figuring it out is one of the biggest scientific puzzles of our time and a crucial step towards controlling more powerful future models.

Will Douglas Heavenarchive page

Google DeepMind’s new generative model makes Super Mario–like games from scratch

Genie learns how to control games by watching hours and hours of video. It could help train next-gen robots too.

Will Douglas Heavenarchive page

What’s next for generative video

OpenAI's Sora has raised the bar for AI moviemaking. Here are four things to bear in mind as we wrap our heads around what's coming.

Will Douglas Heavenarchive page

The AI Act is done. Here’s what will (and won’t) change

The hard work starts now.

Melissa Heikkiläarchive page

Stay connected

Illustration by Rose Wong

Get the latest updates from
MIT Technology Review

Discover special offers, top stories, upcoming events, and more.

This robot dog just taught itself to walk

Deep Dive

Artificial intelligence

Large language models can do jaw-dropping things. But nobody knows exactly why.

Google DeepMind’s new generative model makes Super Mario–like games from scratch

What’s next for generative video

The AI Act is done. Here’s what will (and won’t) change

Stay connected

Get the latest updates from
MIT Technology Review

The latest iteration of a legacy

Advertise with MIT Technology Review

About

Help

Deep Dive

Artificial intelligence

Large language models can do jaw-dropping things. But nobody knows exactly why.

Google DeepMind’s new generative model makes Super Mario–like games from scratch

What’s next for generative video

The AI Act is done. Here’s what will (and won’t) change

Stay connected

Get the latest updates fromMIT Technology Review

Get the latest updates from
MIT Technology Review