Watch this robot dog scramble over tricky terrain just by using its camera

Unlike most existing robots, which rely heavily on internal maps to get around, this robot uses vision and reinforcement learning.

By Melissa Heikkilä

November 21, 2022

When Ananye Agarwal took his dog out for a walk up and down the steps in the local park near Carnegie Mellon University, other dogs stopped in their tracks.

That’s because Agarwal’s dog was a robot—and a special one at that. Unlike other robots, which tend to rely heavily on an internal map to get around, his robot uses a built-in camera. Agarwal, a PhD student at Carnegie Mellon, is one of a group of researchers that has developed a technique allowing robots to walk on tricky terrain using computer vision and reinforcement learning. The researchers hope their work will help make it easier for robots to be deployed in the real world.

Unlike existing robots on the market, such as Boston Dynamics’ Spot, which moves around using internal maps, this robot uses cameras alone to guide its movements in the wild, says Ashish Kumar, a graduate student at UC Berkeley and one of the authors of a paper describing the work. The paper is due to be presented at the Conference on Robot Learning next month. Other attempts to use cues from cameras to guide robot movement have been limited to flat terrain, but the team managed to get its robot to walk up stairs, climb on stones, and hop over gaps.

[Image: a grid of clips showing the robot dog walking on stairs]

The four-legged robot is first trained to move around different environments in a simulator, so it has a general idea of what walking in a park or up and down stairs is like. When it’s deployed in the real world, visuals from a single camera in the front of the robot guide its movement. The robot learns to adjust its gait to navigate things like stairs and uneven ground using reinforcement learning, an AI technique that allows systems to improve through trial and error.

Removing the need for an internal map makes the robot more robust, because it is no longer constrained by potential errors in a map, says Deepak Pathak, an assistant professor at Carnegie Mellon, who was part of the team.

It is extremely difficult for a robot to translate raw pixels from a camera into the kind of precise and balanced movement needed to navigate its surroundings, says Jie Tan, a research scientist at Google, who was not involved in the study. He says this is the first time he’s seen a small, low-cost robot demonstrate such impressive mobility.

The team has achieved a “breakthrough in robot learning and autonomy,” says Guanya Shi, a researcher at the University of Washington who studies machine learning and robotic control and who was also not involved in the research.

Akshara Rai, a research scientist at Facebook AI Research who works on machine learning and robotics and was not involved in this work, agrees.

“This work is a promising step toward building such perceptive legged robots and deploying them in the wild,” says Rai.

However, while the team’s work is helpful for improving how the robot walks, it won’t help the robot work out where to go in advance, Rai says. “Navigation is important for deploying robots in the real world,” she says.

More work is needed before the robot dog will be able to prance around parks or fetch things in the house. While the robot may understand depth through its front camera, it cannot cope with situations such as slippery ground or tall grass, Tan says; it could step into puddles or get stuck in mud.
