Artificial intelligence

A new AI method can train on medical records without revealing patient data

December 11, 2018

When Google announced that it would absorb DeepMind’s health division, it sparked a major controversy over data privacy. Though DeepMind confirmed that the move wouldn’t actually hand raw patient data to Google, the mere idea of giving a tech giant intimate, identifying medical records made people queasy. Privacy concerns like these make it hard to amass the large quantities of high-quality data that machine learning needs, and that has become the biggest obstacle to applying the technology in medicine.

To get around the issue, AI researchers have been developing new techniques for training machine-learning models while keeping the underlying data confidential. The latest method, out of MIT, is called a split neural network: it allows one party to begin training a deep-learning model and another party to finish it.

The idea is that hospitals and other medical institutions would each train a model partway on their patients’ data locally, then send their half-trained models to a centralized location to complete the final stages of training together. The centralized location, whether that is Google’s cloud services or another company’s, would never see raw patient data; it would see only each half-trained model and its outputs. Yet every hospital would benefit from a final model trained on the combined data of all the participating institutions.
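To make the mechanics concrete, here is a minimal, single-process sketch of the split idea in PyTorch. Everything specific in it, the layer sizes, the cut point, the synthetic data, and the optimizer settings, is an illustrative assumption rather than a detail from the MIT paper: the hospital computes the first layers locally, ships only the intermediate activations, and receives gradients back at the cut to finish its share of backpropagation.

```python
# A hedged, single-process sketch of split training; sizes and data are
# illustrative assumptions, not details from the MIT paper.
import torch
import torch.nn as nn

# Hospital side: the first few layers, trained locally on private data.
client = nn.Sequential(nn.Linear(32, 64), nn.ReLU())
# Central side: the remaining layers, which never see raw patient records.
server = nn.Sequential(nn.Linear(64, 2))

opt_client = torch.optim.SGD(client.parameters(), lr=0.1)
opt_server = torch.optim.SGD(server.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(8, 32)          # private patient features (synthetic here)
y = torch.randint(0, 2, (8,))   # labels (in some variants these stay local too)

# 1) The hospital runs the forward pass up to the cut layer.
activations = client(x)

# 2) Only these intermediate activations cross the network, never x itself.
sent = activations.detach().requires_grad_()

# 3) The server finishes the forward pass and backpropagates to the cut.
loss = loss_fn(server(sent), y)
opt_server.zero_grad()
loss.backward()
opt_server.step()

# 4) The gradient at the cut layer travels back to the hospital, which
#    completes backpropagation through its local layers.
opt_client.zero_grad()
activations.backward(sent.grad)
opt_client.step()
```

In a real deployment the two halves would run on separate machines, with activations and gradients exchanged over the network; whether the labels also stay local depends on the variant of the setup.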

Ramesh Raskar, an associate professor at the MIT Media Lab and a coauthor of the paper, likens this process to data encryption. “Only because of encryption do I feel comfortable sending my credit card data to another entity,” he says. Obfuscating medical data through the first few stages of a neural network protects the data in the same way.

When the researchers tested this approach against other techniques designed to keep patient data safe, they found that split neural networks require significantly fewer computational resources to train and also produce models with much higher accuracy.

This post originally appeared in our AI newsletter The Algorithm. To have it delivered directly to your inbox, subscribe here for free.
