MIT Technology Review



Researchers at Vanderbilt University have created an algorithm designed to protect the privacy of patients while maintaining researchers’ ability to analyze vast amounts of genetic and clinical data to find links between diseases and specific genes or to understand why patients can respond so differently to treatments.

Medical records hold all kinds of information about patients, from age and gender to family medical history and current diagnoses. The increasing availability of electronic medical records makes it easier to group patient files into huge databases where they can be accessed by researchers trying to find associations between genes and medical conditions, an important step on the road to personalized medicine. While the patient records in these databases are “anonymized,” or stripped of identifiers such as name and address, they still contain the numerical codes, known as diagnosis codes or ICD codes, that represent every condition a doctor has detected.

The problem is that it is not difficult to trace a specific set of codes back to an individual, says Bradley Malin, an assistant professor of biomedical informatics at Vanderbilt University and one of the algorithm’s coauthors. In a paper published online today in the Proceedings of the National Academy of Sciences, Malin and his colleagues found that they could identify more than 96 percent of a group of patients based solely on their particular sets of diagnosis codes.

“When people are asked about privacy priorities, their health data is always right up there with information about their finances,” says Malin, and for good reason. In 2000, computer science researcher Latanya Sweeney cross-referenced voter-registration records with a limited amount of public record information from the Group Insurance Commission (birth date, gender, and zip code) to identify the full medical records of former Massachusetts governor William Weld and his family. In the wrong hands, medical information could lead to blackmail or employment discrimination, or, less critical but still immensely annoying, increases in medical spam. In addition, hospitals whose data were compromised could be liable for negligence, says Malin.
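The re-identification risk described above comes down to how many patients have a combination of diagnosis codes that nobody else in the database shares. The following is a minimal Python sketch of that measurement, not the method used in the paper. The function name, the patient identifiers, and the diagnosis labels are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def fraction_uniquely_identifiable(records):
    """Return the share of patients whose set of diagnosis codes is unique.

    `records` maps a patient id to an iterable of diagnosis codes.
    This is an illustrative measure, not the study's actual procedure.
    """
    code_sets = {pid: frozenset(codes) for pid, codes in records.items()}
    if not code_sets:
        return 0.0
    # Count how many patients share each exact combination of codes.
    counts = Counter(code_sets.values())
    unique = sum(1 for codes in code_sets.values() if counts[codes] == 1)
    return unique / len(code_sets)


# Hypothetical toy data: p1 and p2 share a combination, p3 and p4 do not.
toy_records = {
    "p1": ["postmenopausal osteoporosis", "type 2 diabetes"],
    "p2": ["postmenopausal osteoporosis", "type 2 diabetes"],
    "p3": ["postmenopausal osteoporosis", "hypertension"],
    "p4": ["asthma"],
}

print(fraction_uniquely_identifiable(toy_records))  # 0.5
```

In this toy data, two of the four patients have a combination that appears only once, so anyone who knows their diagnoses from another source could single out their records.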

To solve this problem, the Vanderbilt team designed an algorithm that searches a database for combinations of diagnosis codes that distinguish a patient. It then substitutes a more general version of the codes (for instance, postmenopausal osteoporosis could become osteoporosis) to ensure that each patient’s altered record is indistinguishable from the records of a certain number of other patients. Researchers could then access this parallel, de-identified database for gene-association studies.
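The general idea of replacing distinguishing codes with broader ones can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the Vanderbilt team's algorithm: the one-level `PARENT` hierarchy, the threshold `k`, the function names, and the toy records are all hypothetical.

```python
from collections import Counter

# Hypothetical one-level hierarchy: specific diagnosis -> more general one.
PARENT = {
    "postmenopausal osteoporosis": "osteoporosis",
    "senile osteoporosis": "osteoporosis",
    "type 1 diabetes": "diabetes",
    "type 2 diabetes": "diabetes",
}


def generalize(codes):
    """Replace each code with its more general version, if one is known."""
    return frozenset(PARENT.get(code, code) for code in codes)


def anonymize(records, k):
    """Generalize records until each matches at least k records in total.

    Returns the altered records and a sorted list of patient ids that
    still fall below the threshold after all generalization is exhausted.
    """
    current = {pid: frozenset(codes) for pid, codes in records.items()}
    while True:
        counts = Counter(current.values())
        rare = [pid for pid, codes in current.items() if counts[codes] < k]
        if not rare:
            return current, []
        changed = False
        for pid in rare:
            broader = generalize(current[pid])
            if broader != current[pid]:
                current[pid] = broader
                changed = True
        if not changed:
            # Nothing left to generalize; report the records still at risk.
            return current, sorted(rare)


toy_records = {
    "p1": ["postmenopausal osteoporosis", "type 2 diabetes"],
    "p2": ["senile osteoporosis", "type 1 diabetes"],
    "p3": ["osteoporosis", "diabetes"],
    "p4": ["asthma"],
}

altered, unresolved = anonymize(toy_records, k=3)
print(sorted(altered["p1"]))  # ['diabetes', 'osteoporosis']
print(unresolved)             # ['p4']
```

With `k=3`, the first three toy records collapse into the same generalized combination, while `p4` remains distinguishable and would need further handling, such as suppression. A higher `k` gives stronger protection but discards more clinical detail.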

To test their algorithm, the researchers applied it to the records of 2,762 patients, then went back and tried to reconnect the study participants to their diagnosis codes. They were unable to do so. The algorithm also allows researchers to explicitly balance the level of anonymization against the needs of their research. Ben Reis, an assistant professor at Harvard Medical School who studies personalized, predictive medical systems, says this is an important benefit of the Vanderbilt system.


Credit: Technology Review

Tagged: Biomedicine, security, privacy, healthcare IT, medical records
