
A leading AI conference is trying to fix the field’s reproducibility crisis

April 9, 2019

Last week, organizers of the Neural Information Processing Systems Conference (NeurIPS), one of the world’s largest annual AI research conferences, updated their paper-submission policy to require what they’re calling a reproducibility checklist. It’s a small shift in a larger fight to curb science’s growing “reproducibility crisis,” in which a disconcerting number of research findings cannot be successfully replicated by other researchers, casting doubt on the validity of the original results.

In February, a statistician at Rice University warned that machine-learning techniques are likely fueling that crisis because the results they produce are difficult to audit. This is worrying because machine learning is increasingly being applied in important areas such as health care and drug research.

NeurIPS’s reproducibility checklist tries to tackle the problem. Among other things, researchers must provide a clear description of their algorithm; a complete description of their data collection process; a link to any simulation environment they used during training; and a comprehensive account of what data they kept, what they discarded, and why. The idea is to create a new standard of transparency, so researchers show exactly how they arrived at their conclusions.
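
To make the idea concrete, here is a minimal Python sketch of the kind of bookkeeping such a checklist encourages: fixing every source of randomness so a run can be repeated, and recording the training configuration and software environment alongside the results. The set_seed and log_provenance helpers, and the specific fields they record, are assumptions invented for this illustration; they are not part of the actual NeurIPS checklist.

import json
import platform
import random

import numpy as np
import torch


def set_seed(seed: int) -> None:
    # Fix every common source of randomness so an experiment
    # can be rerun with identical results. (Illustrative helper,
    # not from the NeurIPS checklist.)
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)


def log_provenance(path: str, config: dict) -> None:
    # Save the hyperparameters and the software environment next to
    # the results, so a reviewer can audit how they were produced.
    record = {
        "config": config,                      # hyperparameters, data splits, etc.
        "python": platform.python_version(),
        "torch": torch.__version__,
        "numpy": np.__version__,
    }
    with open(path, "w") as f:
        json.dump(record, f, indent=2)


set_seed(42)
log_provenance("run_provenance.json", {"lr": 1e-3, "epochs": 10, "seed": 42})
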

As the “world’s most significant AI conference,” wrote Jack Clark, the policy director of the nonprofit OpenAI, in his weekly newsletter Import AI, “NeurIPS 2019 policy will have [a] knock-on effect across [the] wider AI ecosystem.”

