Waymo is going to share its self-driving data—but it’s still not enough

Waymo, the self-driving spinoff of Alphabet, is the latest firm to offer up some of the information gleaned from its vehicles to the wider research community.
The news: Waymo says it will share some of the data it’s gathered from its vehicles for free so other researchers working on autonomous driving can use it. Waymo isn’t the first to do this: Lyft, Argo AI, and other firms have already open-sourced some data sets. But Waymo’s move is notable because its vehicles have covered millions of miles on roads already.
Why this matters: Unlike human drivers, autonomous vehicles don’t have an instinctive understanding of the world. Instead, they rely on training data to learn about conditions they are likely to encounter and how to react to them. The more high-quality data AI models have to train on, the better.
Waymo’s data set: It contains 1,000 segments, each capturing 20 seconds of continuous driving. The data comes from four locations: San Francisco and Mountain View in California; Phoenix in Arizona (where Waymo has launched a small-scale robotaxi service); and Kirkland in Washington. It also comes from multiple sensors, including cameras, radar, and lidar, the last of which bounces laser pulses off nearby objects to create 3D maps of a vehicle’s surroundings. Helpfully, the company has labelled things like pedestrians, bikes, and signals in the data set, which means other researchers won’t have to do this grunt work.
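For a rough sense of scale, the segment count and segment length quoted above can be turned into a total recording time. The short Python sketch below does only that arithmetic; the function and constant names are illustrative and are not part of any Waymo tooling.

```python
# Figures quoted in the article: 1,000 segments of 20 seconds each.
SEGMENTS = 1_000
SECONDS_PER_SEGMENT = 20


def total_hours(segments: int, seconds_per_segment: int) -> float:
    """Return the total recorded driving time in hours."""
    return segments * seconds_per_segment / 3600


hours = total_hours(SEGMENTS, SECONDS_PER_SEGMENT)
print(f"{hours:.1f} hours of driving")  # prints "5.6 hours of driving"
```

That works out to 20,000 seconds, or a little under six hours of driving in total.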
Data hoarders: While Waymo deserves some credit for its move, it’s sharing just a tiny sliver of the information it has gathered. Other companies are also hoarding data for competitive reasons, and they are especially reluctant to share information related to accidents and near-misses. But if the industry wants to overcome concerns about autonomous vehicles’ safety, the businesses in it will have to become far more transparent about what they’ve learned.