
New standards for AI clinical trials will help spot snake oil and hype

The guidelines ensure that medical AI research is subject to the same scrutiny as drug development and diagnostic tests.
September 11, 2020
CT scan. Kyle McDonald / Flickr

The news: An international consortium of medical experts has introduced the first official standards for clinical trials that involve artificial intelligence. The move comes at a time when hype around medical AI is at a peak, with inflated and unverified claims about the effectiveness of certain tools threatening to undermine people’s trust in AI overall. 

What it means: Announced in Nature Medicine, the British Medical Journal, and the Lancet, the new standards extend two sets of guidelines (SPIRIT and CONSORT) already used around the world to govern how clinical trials for drug development, diagnostic tests, and other medical interventions are conducted and reported. AI researchers will now have to describe the skills needed to use an AI tool, the setting in which the AI is evaluated, details about how humans interact with the AI, the analysis of error cases, and more.

Why it matters: Randomized controlled trials are the most trustworthy way to demonstrate the effectiveness and safety of a treatment or clinical technique. They underpin both medical practice and health policy. But their trustworthiness depends on whether researchers stick to strict guidelines in how their trials are carried out and reported. In the last few years, many new AI tools have been developed and described in medical journals, but their effectiveness has been hard to compare and assess because the quality of trial designs varies. In March, a study in the BMJ warned that poor research and exaggerated claims about how good AI was at analyzing medical images posed a risk to millions of patients. 

Peak hype: A lack of common standards has also allowed private companies to crow about the effectiveness of their AI without facing the scrutiny applied to other types of medical intervention or diagnosis. For example, the UK-based digital health company Babylon Health came under fire in 2018 for announcing that its diagnostic chatbot was “on par with human doctors,” on the basis of a test that critics argued was misleading. 

Babylon Health is far from alone. Developers have been claiming for some time that medical AIs match or outperform human ability, and the pandemic has sent this trend into overdrive as companies compete to get their tools noticed. In most cases, these AIs are evaluated in-house, under favorable conditions.

Future promise: That’s not to say AI can’t beat human doctors. In fact, the first independent evaluation of an AI diagnostic tool that outperformed humans in spotting cancer on mammograms was published only last month. The study found that a tool made by Lunit AI and used in certain hospitals in South Korea finished in the middle of the pack of the radiologists it was tested against, and it was even more accurate when paired with a human doctor. The new standards will make this kind of independent evaluation easier, helping to separate the good tools from the bad and ultimately leading to better and more trustworthy medical AI.

