Watch this robot dog scramble over tricky terrain just by using its camera
Unlike most existing robots, which rely heavily on internal maps to get around, this robot uses vision and reinforcement learning.
When Ananye Agarwal took his dog out for a walk up and down the steps in the local park near Carnegie Mellon University, other dogs stopped in their tracks.
That’s because Agarwal’s dog was a robot—and a special one at that. Unlike other robots, which tend to rely heavily on an internal map to get around, his robot uses a built-in camera. Agarwal, a PhD student at Carnegie Mellon, is one of a group of researchers that has developed a technique allowing robots to walk on tricky terrain using computer vision and reinforcement learning. The researchers hope their work will help make it easier for robots to be deployed in the real world.
Unlike existing robots on the market, such as Boston Dynamics’ Spot, which moves around using internal maps, this robot uses cameras alone to guide its movements in the wild, says Ashish Kumar, a graduate student at UC Berkeley. Kumar is one of the authors of a paper describing the work, which is due to be presented at the Conference on Robot Learning next month. Other attempts to use cues from cameras to guide robot movement have been limited to flat terrain, but the team managed to get its robot to walk up stairs, climb on stones, and hop over gaps.
The four-legged robot is first trained to move around different environments in a simulator, so it has a general idea of what walking in a park or up and down stairs is like. When it’s deployed in the real world, visuals from a single camera in the front of the robot guide its movement. The robot learns to adjust its gait to navigate things like stairs and uneven ground using reinforcement learning, an AI technique that allows systems to improve through trial and error.
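To make "trial and error" concrete, here is a toy sketch in Python. It is not the researchers' method: their robot learns to control its legs from camera images in a simulator, while this example only shows the basic loop of trying an action, scoring it, and favoring what worked. The function name `train_gait`, the candidate foot lifts, and the reward values are all assumptions invented for illustration.

```python
import random


def train_gait(stair_height_cm=15, episodes=2000, epsilon=0.1, seed=0):
    """Toy trial-and-error learner (epsilon-greedy), not the paper's algorithm.

    The agent repeatedly picks how high to lift its foot, receives a reward,
    and keeps a running average of the reward for each choice.
    """
    rng = random.Random(seed)
    lifts = [5, 10, 15, 20, 25, 30]   # hypothetical foot lifts, in cm
    value = {a: 0.0 for a in lifts}   # estimated reward per lift
    count = {a: 0 for a in lifts}     # how often each lift was tried

    def reward(lift):
        # Assumed reward: tripping on the stair is penalized; clearing it
        # is rewarded, minus a small cost for lifting higher than needed.
        if lift < stair_height_cm:
            return -1.0
        wasted = lift - stair_height_cm
        return 1.0 - 0.05 * wasted + rng.gauss(0, 0.1)

    for _ in range(episodes):
        if rng.random() < epsilon:
            lift = rng.choice(lifts)                    # explore: try something
        else:
            lift = max(lifts, key=lambda a: value[a])   # exploit: best so far
        r = reward(lift)
        count[lift] += 1
        value[lift] += (r - value[lift]) / count[lift]  # running average

    best = max(lifts, key=lambda a: value[a])
    return best, value


if __name__ == "__main__":
    best, value = train_gait()
    print("preferred foot lift (cm):", best)
```

After enough attempts, the learner settles on the lowest lift that still clears the stair. The real system faces a far harder version of the same problem, with continuous leg movements and raw camera input in place of six fixed choices.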
Removing the need for an internal map makes the robot more robust, because it is no longer constrained by potential errors in a map, says Deepak Pathak, an assistant professor at Carnegie Mellon, who was part of the team.
It is extremely difficult for a robot to translate raw pixels from a camera into the kind of precise and balanced movement needed to navigate its surroundings, says Jie Tan, a research scientist at Google, who was not involved in the study. He says the work is the first time he’s seen a small and low-cost robot demonstrate such impressive mobility.
The team has achieved a “breakthrough in robot learning and autonomy,” says Guanya Shi, a researcher at the University of Washington who studies machine learning and robotic control and who also was not involved in the research.
Akshara Rai, a research scientist at Facebook AI Research who works on machine learning and robotics, and was not involved in this work, agrees.
“This work is a promising step toward building such perceptive legged robots and deploying them in the wild,” says Rai.
However, while the team’s work is helpful for improving how the robot walks, it won’t help the robot work out where to go in advance, Rai says. “Navigation is important for deploying robots in the real world,” she says.
More work is needed before the robot dog will be able to prance around parks or fetch things in the house. While the robot may understand depth through its front camera, it cannot cope with situations such as slippery ground or tall grass, Tan says; it could step into puddles or get stuck in mud.