Teaching Robots New Tricks
Robotic helicopters learn complex tricks by analyzing demos.
Programming robots by hand can be a time-consuming, labor-intensive task. Many roboticists believe that training robots by demonstrating new skills could speed up the process and enable the machines to perform more difficult tasks. Now researchers have created such a system for robotic helicopters. With their approach, the team can train a robotic helicopter to perform a complicated aerial maneuver in less than 30 minutes simply by analyzing footage of the trick. The work could one day be applied to a wide variety of robots on land and sea, as well as in the air.
For very basic aerial maneuvers, researchers can program specific commands based on the way a human operator would use the controls. But aerial acrobatics, such as flying upside down, require a more robust and adaptive approach. A gust of wind or a small variation in the helicopter’s starting position can send the vehicle completely off course if adjustments aren’t made immediately to the flight plan. “It’s not sufficient to just replay the same sequence of controls as a human pilot,” says Pieter Abbeel, who worked as a researcher on the project while completing his PhD at Stanford University. With the apprenticeship approach, the robot can make changes mid-flight because it’s not tied to a specific series of commands. This could help autonomous helicopters deal with real-world challenges, such as landing on slanted terrain or coping with sudden changes in weather conditions, ultimately resulting in more stable flight.
Training begins with a human expert demonstrating a new trick on a remote-controlled helicopter. As the expert repeats the maneuver, one of the researchers presses a button to mark the start and end of each attempt. The expert needs to perform each trick approximately 10 times so that the software can average out subtle deviations and calculate the ideal path. The software carefully warps the timing of each recorded attempt so that the attempts can be compared against one another. Small blips in the data, known as noise, are also filtered out. Ultimately, the software creates a highly accurate aerodynamic model of the trick that the autonomous helicopter uses as a flight guide.
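The time-warping step can be illustrated with classic dynamic time warping (DTW), which lines up demonstrations that were flown at slightly different speeds before averaging them. This is only a minimal sketch of the idea: the function names are invented for illustration, the trajectories are one-dimensional, and the team's actual system uses a more sophisticated probabilistic alignment over full sensor data.

```python
def dtw_path(a, b):
    """Classic dynamic time warping: find the lowest-cost alignment
    between two 1-D trajectories sampled at different speeds."""
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = minimal cumulative cost of aligning a[:i] with b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],
                                 cost[i][j - 1],
                                 cost[i - 1][j - 1])
    # Backtrack from the end to recover the alignment path
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path

def warp_onto_reference(ref, demo):
    """Warp one demonstration onto the reference timeline, averaging
    demo samples that map to the same reference step; repeating this
    over ~10 demos and averaging yields the ideal path."""
    totals = [0.0] * len(ref)
    counts = [0] * len(ref)
    for i, j in dtw_path(ref, demo):
        totals[i] += demo[j]
        counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]
```

A demo that lingers an extra timestep at the start still lands on the reference timeline: `warp_onto_reference([0.0, 1.0, 2.0, 3.0], [0.0, 0.0, 1.0, 2.0, 3.0])` recovers `[0.0, 1.0, 2.0, 3.0]`.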
Once in the air, the robotic helicopter wirelessly relays information from its onboard sensors to a computer on the ground. “We place a number of instruments on the helicopters (gyroscopes, accelerometers, and a magnetic compass) to figure out the position and orientation,” says Andrew Ng, an assistant professor of computer science at Stanford University, who also worked on the project. “We wirelessly send the instrument readings down to a desktop computer on the ground, which computes the appropriate control commands.” These commands are sent back to the helicopter 20 times per second. Video cameras on the ground also help to keep track of the helicopter.
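The ground-station cycle Ng describes (read telemetry, compute a command, uplink it, 20 times per second) can be sketched as follows. Everything here is hypothetical scaffolding: the toy proportional controller, the gain, and the function names stand in for the far richer optimal-control computation the real system performs.

```python
import time

CONTROL_RATE_HZ = 20              # commands uplinked 20 times per second
PERIOD = 1.0 / CONTROL_RATE_HZ    # 50 ms control period

def compute_command(state, target):
    """Toy proportional controller: steer each axis toward its target.
    GAIN is an invented placeholder, not a real system parameter."""
    GAIN = 0.5
    return [GAIN * (t - s) for s, t in zip(state, target)]

def control_loop(read_sensors, send_command, trajectory):
    """One ground-station pass over the learned trajectory: read the
    downlinked state estimate, compute a command, uplink it, then wait
    out the remainder of the 50 ms period to hold the 20 Hz rate."""
    for target in trajectory:
        start = time.monotonic()
        state = read_sensors()    # fused gyro/accelerometer/compass estimate
        send_command(compute_command(state, target))
        elapsed = time.monotonic() - start
        if elapsed < PERIOD:
            time.sleep(PERIOD - elapsed)
```

In practice `read_sensors` and `send_command` would wrap the wireless link; here they can be any callables, which also makes the loop easy to exercise with canned telemetry.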
With each attempt, the robot learns how to perfect the trick. “The first time, it might take a turn a bit too wide. It will then use its knowledge of its own dynamics to learn to adjust the way it takes a turn,” Ng says.
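The attempt-by-attempt refinement Ng describes resembles iterative learning control: keep a per-timestep correction and fold in a fraction of the tracking error observed on the last flight, so the next attempt turns a little tighter. The toy flight model, gains, and function names below are assumptions for illustration, not the team's actual model-based method.

```python
def fly_attempt(reference, corrections, plant_gain=0.8):
    """Toy flight model: the vehicle under-responds to commands by
    `plant_gain`, so an uncorrected attempt takes every turn too wide.
    Returns the achieved path and the per-timestep tracking error."""
    achieved = [plant_gain * (r + c) for r, c in zip(reference, corrections)]
    errors = [r - a for r, a in zip(reference, achieved)]
    return achieved, errors

def ilc_update(corrections, errors, learning_rate=0.5):
    """After each attempt, nudge the stored correction at each
    timestep by a fraction of the error observed there."""
    return [c + learning_rate * e for c, e in zip(corrections, errors)]

# Repeated attempts at the same maneuver: the tracking error shrinks
# with every flight as the corrections converge.
reference = [0.0, 1.0, 2.0, 1.0, 0.0]   # desired path for one turn
corrections = [0.0] * len(reference)
for attempt in range(10):
    _, errors = fly_attempt(reference, corrections)
    corrections = ilc_update(corrections, errors)
```

With these toy numbers the error contracts by a constant factor each attempt, so after ten flights the turn is tracked almost exactly.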
At a recent demonstration at Stanford, an autonomous helicopter used this approach to perform several complicated tricks, including loops with pirouettes and a backward funnel maneuver known as the hurricane. The team was even able to demonstrate a particularly difficult stunt called the tic toc, in which the helicopter hovers with its tail down while its nose swings back and forth like an inverted pendulum. Such a trick had been impossible to perform with hand-coded control programs, and it represented an impressive achievement for the team. “We can now trust our helicopter controls a lot more [and achieve] higher-performance flight,” says Abbeel, who now works as an assistant professor in the Department of Electrical Engineering and Computer Science at the University of California, Berkeley.
Eric Feron, a professor of aerospace engineering at the Georgia Institute of Technology, was not involved in the Stanford project but is impressed by the performance of the autonomous helicopters trained using the approach. He also appreciates the underlying methodology. “When I was involved in similar research back in the early 2000s,” he says, “there was definitely what I would call human intervention in figuring out what the online control systems should be doing in order to repeat the maneuvers. We had to program the computers ourselves.” Feron says the Stanford work represents a significant gain in efficiency, by cutting down the learning time to half an hour. “At the end of our research, we were able to maybe do a new maneuver in one day,” he says.
Abbeel notes that while the autonomous helicopters have achieved a new level of reliability, there is room for improvement, and safety will be a key concern if such robots are ever flown over populated areas. The machines have to be able to fly at least as well as an expert human pilot, even while doing complicated maneuvers, he says, and simple back-and-forth flight won’t be good enough for search-and-rescue missions. “I like to imagine a future in which someday, if there is an accident out on the ocean, a fleet of a dozen autonomous helicopters can be instantaneously deployed to search for survivors,” he says. This could help offset the lack of human pilots qualified to perform such a task and increase the chance of locating survivors.
The learning system could be used on other kinds of robots as well, Ng says, such as those that do housework or work in factories. “It could also allow for the very precise control of cars, motorcycles, fixed-wing aircraft, and even sea-based vehicles,” he says.
In the future, the team hopes to make their system more flexible. “When we as humans learn, there are many things that speed up the process besides demonstrations. An expert pilot might give advice in other forms,” says Abbeel, such as verbal or written tips. Ideally, the team hopes to design a system that can incorporate such advice.