Missing code and data are making it difficult to compare machine-learning work—and that may be hurting progress.
The problem: Science reports that in a sample of 400 papers presented at top AI conferences in recent years, only 6 percent of presenters shared code. Just a third shared data, and a little over half shared summaries of their algorithms, known as pseudocode.
Why it matters: Without access to that information, it’s hard to reproduce a study’s findings. That makes it all but impossible to benchmark newly developed tools against existing ones, so it’s hard for researchers to know which direction to push future research.
How to solve it: Sometimes a lack of sharing may be understandable, for instance when the intellectual property is owned by a private firm. But there seems to be a more widespread culture of keeping details under wraps. Some meetings and journals are now encouraging sharing; perhaps more ought to follow.