
Microsoft and Google Want to Let Artificial Intelligence Loose on Our Most Private Data

New ways to use machine learning without risking sensitive data could unlock new ideas in industries like health care and finance.
April 19, 2016

The recent emergence of a powerful machine-learning technique known as deep learning has made computing giants such as Google, Facebook, and Microsoft even hungrier for data. It’s what lets software learn to do things like recognize images or understand language.

Yet many problems where deep learning could be most valuable involve data that is hard to come by or is held by organizations that are unwilling to share it. And as Apple CEO Tim Cook puts it, some consumers are already concerned about companies “gobbling up” their personal information.

“A lot of people who hold sensitive data sets like medical images are just not going to share them for legal and regulatory concerns,” says Vitaly Shmatikov, a professor at Cornell Tech who studies privacy. “In some sense we’re depriving these people from the benefits of deep learning.”

Shmatikov and researchers at Microsoft and Google are all working on ways around that privacy problem. By finding ways to use and train the artificial neural networks behind deep learning without needing to gobble up everything, they hope to build smarter software and to persuade the guardians of sensitive data to make use of such systems.

Shmatikov and colleague Reza Shokri are testing what they call “privacy-preserving deep learning.” It provides a way to get the benefit of multiple organizations—say, different hospitals—combining their data to train deep-learning software without having to take the risk of actually sharing it.

Each organization trains deep-learning algorithms on its own data, and then shares only key parameters from the trained software. Those can be combined into a system that performs almost as well as if it were trained on all the data at once.
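The scheme can be sketched in toy form. In this illustration (the data, the one-weight linear model, and the simple parameter averaging are all stand-ins for the real deep-learning setup), each "hospital" fits a model on its own records and shares only the learned parameter:

```python
# Toy sketch of privacy-preserving collaborative training:
# each organization fits a model on its own data and shares
# only the trained parameter, never the raw records.
# (Illustrative only -- a one-weight linear model stands in
# for a deep neural network.)

def train_locally(data, epochs=200, lr=0.01):
    """Fit y = w * x by gradient descent on one org's private data."""
    w = 0.0
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w  # only this number leaves the organization

# Each hospital's private data stays on-site (here, roughly y = 3x).
hospital_a = [(1, 3.1), (2, 5.9), (3, 9.2)]
hospital_b = [(1, 2.8), (2, 6.1), (4, 11.8)]

w_a = train_locally(hospital_a)
w_b = train_locally(hospital_b)

# Only the shared parameters are combined -- here by simple averaging.
w_shared = (w_a + w_b) / 2
```

The combined model recovers roughly the same weight it would have learned on the pooled data, even though neither hospital's records ever left the building.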

The Cornell research was partly funded by Google, which has published a paper on similar experiments and is talking with Shmatikov about his ideas. The company’s researchers invented a way to train the company’s deep-learning algorithms using data such as images from smartphones without transferring that data into Google’s cloud.

That could make it easier for the company to leverage the very personal information held on our mobile devices, they wrote. Google declined to make someone available to discuss that research, but Shmatikov believes the company is still working on it.

Microsoft’s cryptography research group has developed its own solution to machine learning’s privacy problem. It invented a way to use trained deep-learning software on encrypted data and spit out encrypted answers. The idea is that a hospital, for example, could ask Microsoft to use one of these “CryptoNets” to flag medical scans containing possible problems, avoiding the usual need to expose the images to the company.

The Microsoft researchers pulled off that trick using a technique called homomorphic encryption, which makes it possible to perform mathematical operations on encrypted data and produce an encrypted result (see “10 Breakthrough Technologies 2011: Homomorphic Encryption”). They have tested the idea using deep-learning software that recognizes handwriting, and a system that estimates a patient’s risk of pneumonia from his vital signs.
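The homomorphic property itself can be shown with a much simpler scheme than the one CryptoNets uses. In this toy example, textbook RSA (which is multiplicatively homomorphic) demonstrates the principle: multiplying two ciphertexts yields a ciphertext of the product of the plaintexts, so a server can do arithmetic on data it cannot read. This is only an illustration of the idea; the primes are far too small to be secure, and CryptoNets relies on a different, lattice-based encryption scheme:

```python
# Toy demonstration of homomorphic encryption using textbook RSA,
# which is multiplicatively homomorphic: E(a) * E(b) decrypts to a * b.
# (Insecure small-prime example -- for illustration only.)

p, q = 61, 53                       # toy primes
n = p * q                           # public modulus (3233)
e = 17                              # public exponent
d = pow(e, -1, (p - 1) * (q - 1))   # private exponent

def encrypt(m):
    return pow(m, e, n)

def decrypt(c):
    return pow(c, d, n)

a, b = 7, 6
ca, cb = encrypt(a), encrypt(b)

# The server multiplies the ciphertexts without ever seeing a or b...
c_product = (ca * cb) % n

# ...and the data owner decrypts the result: 7 * 6 = 42.
assert decrypt(c_product) == a * b
```

A deep network needs additions as well as multiplications, which is why practical systems like CryptoNets use fully or leveled homomorphic schemes rather than plain RSA.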

A CryptoNet requires more computing power than conventional deep-learning software to do the same work. But Kristin Lauter, who leads Microsoft’s cryptography research, says the gap is small enough that CryptoNets could become practical for real-world use. “The health, financial, and pharmaceutical industries are where I think this is most likely to be used first,” she says.
