An algorithm that evolved StarCraft bots is also training self-driving cars

A more efficient way to train neural nets could provide a crucial edge in the hyper-competitive world of automated driving, and elsewhere.

Will Knight

July 25, 2019

Image: a Waymo vehicle. Credit: Waymo.

Waymo’s self-driving cars now have something in common with the brains that guide regular vehicles: their intelligence comes partly from the power of evolution.

Engineers at Waymo, owned by Alphabet, teamed up with researchers at DeepMind, another Alphabet division dedicated to AI, to find a more efficient process to train and fine-tune the company’s self-driving algorithms.

They used a technique called population-based training (PBT), previously developed by DeepMind for honing video-game algorithms. PBT, which takes inspiration from biological evolution, speeds up the selection of machine-learning algorithms and parameters for a particular task by having candidate code draw from the “fittest” specimens (the ones that perform a given task most efficiently) in an algorithmic population.
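To make the idea concrete, here is a minimal sketch of a PBT-style loop on a toy problem. It is not Waymo's or DeepMind's code: the objective, the population size, the quartile cutoff, and the 0.8/1.2 perturbation factors are all assumptions chosen for illustration. Each member of the population trains for a few steps, the weakest members copy a stronger member's weights and learning rate ("exploit"), and then nudge the copied learning rate ("explore").

```python
import random


def pbt(population_size=8, generations=20, steps_per_generation=5, seed=0):
    """Toy population-based training that minimizes f(w) = (w - 3)^2.

    Every name and constant here is hypothetical; a real system would
    train neural networks instead of a single number.
    """
    rng = random.Random(seed)

    def loss(w):
        return (w - 3.0) ** 2

    # Each member holds a trained weight and a learning rate. The learning
    # rate is the hyperparameter that the population evolves.
    population = [
        {"w": rng.uniform(-10, 10), "lr": 10 ** rng.uniform(-4, -1)}
        for _ in range(population_size)
    ]

    for _ in range(generations):
        # Train: a few gradient-descent steps per member.
        for member in population:
            for _ in range(steps_per_generation):
                grad = 2.0 * (member["w"] - 3.0)
                member["w"] -= member["lr"] * grad

        # Rank the population, fittest (lowest loss) first.
        population.sort(key=lambda m: loss(m["w"]))
        cutoff = max(1, population_size // 4)
        top, bottom = population[:cutoff], population[-cutoff:]

        for member in bottom:
            parent = rng.choice(top)
            member["w"] = parent["w"]                             # exploit
            member["lr"] = parent["lr"] * rng.choice([0.8, 1.2])  # explore

    best = min(population, key=lambda m: loss(m["w"]))
    return best, loss(best["w"])


best_member, best_loss = pbt()
print(best_member, best_loss)
```

The point of the design is that hyperparameters change while training is under way, so no compute is spent finishing runs that are clearly falling behind.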

Refining AI algorithms in this way may also help give Waymo an edge. The algorithms that guide self-driving cars need to be retrained and recalibrated as the vehicles collect more data and are deployed in new locations. Dozens of companies are racing to demonstrate the best self-driving technology on real roads. Waymo is exploring various other ways of automating and accelerating the development of its machine-learning algorithms.

Indeed, more efficient methods for retraining machine-learning code should allow AI to be flexible and useful in different contexts.

“One of the key challenges for anyone doing machine learning in an industrial system is to be able to rebuild the system to take advantage of new code,” says Matthieu Devin, director of machine learning infrastructure at Waymo. “We need to constantly retrain the net and rewrite our code. And when you retrain, you may need to tweak your parameters.”

Modern self-driving cars are controlled by an almost Rube Goldberg combination of algorithms and techniques. Numerous machine-learning algorithms are used to spot road lines, signs, other vehicles, and pedestrians in sensor data. These work in concert with conventional, or hand-written, code to control the vehicle and respond to different eventualities. Each new iteration of a self-driving system has to be tested rigorously in simulation.

Today’s self-driving vehicles rely heavily upon deep learning, in particular. But configuring a deep neural network with the right properties and parameters (the values fixed before training begins, often called hyperparameters) is a tricky art. Candidate networks and parameters are mostly either selected manually, which is time-consuming, or tweaked at random by a computer, which requires lots of processing power.
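The random approach can be sketched in a few lines, which also shows where the processing power goes. This is an illustrative toy, not any company's tooling: `train_and_score`, the quadratic objective, and the sampling range are assumptions. Each trial is a complete training run from scratch, and only the best one is kept.

```python
import random


def train_and_score(lr, steps=100):
    """Hypothetical stand-in for a full training run.

    Minimizes f(w) = (w - 3)^2 by gradient descent and returns the final loss.
    """
    w = 10.0
    for _ in range(steps):
        w -= lr * 2.0 * (w - 3.0)
    return (w - 3.0) ** 2


def random_search(trials=16, seed=0):
    """Sample learning rates at random and keep the best-scoring trial."""
    rng = random.Random(seed)
    results = []
    for _ in range(trials):
        lr = 10 ** rng.uniform(-4, -1)  # log-uniform between 1e-4 and 1e-1
        results.append((train_and_score(lr), lr))  # one full run per trial
    return min(results)  # (lowest loss, its learning rate)


best_loss, best_lr = random_search()
print(best_loss, best_lr)
```

Every discarded trial here costs as much as the winning one, which is the inefficiency the article describes.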

“At Waymo we train tons of different neural nets, and researchers spend a lot of time figuring out how to best train these neural nets,” says Yu-hsin (Joyce) Chen, a machine-learning infrastructure engineer at Waymo. “We had a need for it and just jumped at the opportunity.”

Chen says her team is now using PBT to improve the development of deep-learning code used to detect lane markings, vehicles, and pedestrians, and to verify the accuracy of labeled data that is fed to other machine-learning algorithms. She says PBT has reduced the computing power required to retrain a neural net by about half and has doubled or tripled the speed of the development cycle.

Google is developing a range of techniques to help automate the process of training machine-learning models, and it already offers some of them to customers through a project known as Cloud AutoML. Making AI training more efficient and automated will undoubtedly prove crucial to efforts to commercialize, and profit from, the technology.

Oriol Vinyals, a principal research scientist at DeepMind and one of the inventors of PBT, says the idea for using PBT at Waymo came up when he was visiting Devin. DeepMind first developed the technique in 2017 as a way to speed up the training of neural networks, later using it to help a computer play StarCraft II, a combat video game that is especially challenging for machines (see “Innovators Under 35, 2016”). DeepMind’s collaboration with Waymo began before it published its StarCraft research in January 2019.

The evolution-like process employed in PBT also makes it easier to understand how a deep-learning algorithm has been tweaked and optimized, with something that resembles a genealogical tree. “One of the cool things is that you can visualize the evolution of parameters,” says Vinyals. “It’s a nice way to verify that what happens actually makes sense to you.”
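One way to picture that genealogy is to record, each time a member copies another, a link back to its parent. The sketch below is a guess at the general shape of such a record, not a description of DeepMind's tooling; the `Snapshot` class, its fields, and the hand-written history are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Snapshot:
    """One member of the population at one generation (hypothetical record)."""
    member_id: int
    generation: int
    learning_rate: float
    parent: Optional["Snapshot"] = None  # the member this one was copied from


def lineage(snapshot):
    """Follow parent links back to the founding member, oldest first."""
    chain = []
    node = snapshot
    while node is not None:
        chain.append(node)
        node = node.parent
    return list(reversed(chain))


# A hand-written, made-up history: member 5 copied member 2, then member 1
# copied member 5, with the learning rate perturbed at each copy.
founder = Snapshot(member_id=2, generation=0, learning_rate=0.01)
child = Snapshot(member_id=5, generation=1, learning_rate=0.012, parent=founder)
grandchild = Snapshot(member_id=1, generation=2, learning_rate=0.0096, parent=child)

for s in lineage(grandchild):
    print(f"gen {s.generation}: member {s.member_id}, lr={s.learning_rate:g}")
```

Reading the chain from oldest to newest shows how a hyperparameter drifted on its way to the final model, which is the kind of sanity check Vinyals describes.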

Updated July 29 to reflect the fact that PBT was developed before DeepMind began working on StarCraft II.
