Study Highlights the Risk of Handing Over Your Genome

Researchers found they could tie people’s identities to supposedly anonymous genetic data by cross-referencing it with information available online.

Genetic data is an important resource for biomedical research, but it can put donors’ privacy at risk.

If you contribute your genome sequence anonymously to a scientific study, that data might still be linked back to you, according to a study published today in the journal Science. The researchers behind the study found they could deanonymize genomic data using only publicly available Internet information and some clever detective work.

The study points to growing concerns about genetic privacy and the need for better legal protection against genetic discrimination, experts say, since such a technique could reveal a person’s propensity for a particular disease. The work also shows that study participants need to be better educated about the risks of joining genetic research efforts.

Open-access data sets of human genomic information are an important resource for researchers trying to uncover the genetic basis of human disease. The 1000 Genomes Project, for example, is a publicly available catalog of variation in humans that researchers can use to identify mutations that raise disease risk in certain populations (see “The Future of the Human Genome”). Researchers use this kind of open database much more often than controlled-access sources, the National Institutes of Health said in a response to today’s findings that was also published in Science.

“Our last intention is to push these resources behind some firewall,” says Yaniv Erlich, a geneticist at the Whitehead Institute for Biomedical Research and senior author on today’s study. “We are in favor of public data sharing, but we need to think about how it could be misused and describe that correctly to people.”

While the Genetic Information Nondiscrimination Act of 2008 offers people some protection against employers or health insurers discriminating against them based on their genetics, life insurers and disability insurers are not prevented from using such information in their decisions.  

“We have no comprehensive genetic privacy law,” says Jeremy Gruber, a lawyer and president of the Council for Responsible Genetics. “People need to be much better informed of the lack of privacy protections we have for genetic information.”

In the long run, says Erlich, it is better for these potential breaches to be demonstrated by a friendly investigator rather than someone who really wants to exploit the data. “That would really undermine the public trust,” he says.

This isn’t the first time privacy risks have been highlighted for public genome databases. Different groups have shown that with a second DNA sample, an individual’s genetic information could be pulled out of what was thought to be anonymous “pooled” genomic data or gene activity databases. But Erlich’s team used only knowledge of genetic markers and Internet detective work to identify nearly 50 people in public genomic data sets.

Erlich, a former computer security researcher, was once hired by banks and other businesses to test their computer systems. For the DNA sleuthing, Erlich and his team used free genealogical databases that link surnames with genetic markers, called short tandem repeats, on the Y chromosome. There is no known biological function for these repeats, but the length and number are commonly used in ancestry research because, like surnames, those patterns are typically passed from father to son.

Once the team found a link between the Y chromosome repeats in the genomic databases and potential surnames, they used other pieces of demographic information, such as date and place of birth, which are included in some of the genomic databases, and public records to identify donors.
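
The two-step logic described above (match Y chromosome repeats to candidate surnames, then narrow with demographic details) can be illustrated with a toy sketch. This is not the researchers' actual method or code. The marker values, surnames, records, and the function names `candidate_surnames` and `narrow_by_demographics` are all invented for illustration, and the exact-match rule is an assumption.

```python
from dataclasses import dataclass

# Hypothetical genealogy database: surname -> Y-STR profile
# (marker name -> repeat count). All values are invented.
GENEALOGY_DB = {
    "Smith": {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13},
    "Jones": {"DYS19": 15, "DYS390": 23, "DYS391": 10, "DYS393": 13},
    "Brown": {"DYS19": 14, "DYS390": 24, "DYS391": 10, "DYS393": 12},
}


@dataclass
class PublicRecord:
    """A made-up public record with the demographic fields used for narrowing."""
    surname: str
    birth_year: int
    state: str


PUBLIC_RECORDS = [
    PublicRecord("Smith", 1950, "UT"),
    PublicRecord("Smith", 1972, "CA"),
    PublicRecord("Jones", 1950, "UT"),
]


def candidate_surnames(profile, db, max_mismatches=0):
    """Step 1: return surnames whose stored profile differs from `profile`
    at no more than `max_mismatches` of the markers both have."""
    matches = []
    for surname, reference in db.items():
        shared = set(profile) & set(reference)
        if not shared:
            continue  # nothing to compare
        mismatches = sum(1 for m in shared if profile[m] != reference[m])
        if mismatches <= max_mismatches:
            matches.append(surname)
    return sorted(matches)


def narrow_by_demographics(surnames, birth_year, state, records):
    """Step 2: keep only records that match a candidate surname and the
    demographic details attached to the anonymous sample."""
    return [
        r for r in records
        if r.surname in surnames and r.birth_year == birth_year and r.state == state
    ]


# An "anonymous" sample carrying repeat counts plus birth year and state.
sample_profile = {"DYS19": 14, "DYS390": 24, "DYS391": 11, "DYS393": 13}
surnames = candidate_surnames(sample_profile, GENEALOGY_DB)
hits = narrow_by_demographics(surnames, 1950, "UT", PUBLIC_RECORDS)
```

In this toy data, the surname step alone leaves two records for the same family name, and it is the birth year and state that reduce them to one. That is why moving age information into controlled access, as the NIH response describes, weakens the second step.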

Eric Green, director of the National Human Genome Research Institute, and other employees of the NIH acknowledge that Erlich’s study highlights vulnerabilities in these research projects. To mitigate future risks, they write in the response published in Science, the NIH has decided to “shift age information, which had been available for some of the participants on the repository’s public Web site, into controlled-access portions of the resource.”

In addition to recruiting people who think the societal and medical research benefits of participating in genomic research outweigh the risks, better legal protection is key, says George Church, a geneticist at Harvard Medical School and founder of the Personal Genome Project, an open-access database of genomic and health data. While there may be ways to make the data more secure, “for every lock there is going to be a countermeasure, and I think that’s a game that’s just not worth playing,” says Church. “Much better is coming up with a protocol where you don’t need any locks,” he says, which would include better legal protection and education for study participants.

Today’s findings emphasize the need for public representation in the oversight of data collection, says Wylie Burke, a clinical geneticist at the University of Washington in Seattle. “Information should be readily available to the public concerning the oversight procedures in place, the research purposes for which data are being used, the outcomes of data uses, and, of course, how any misuses of data have been handled,” she says. “Without this kind of approach, we could see increasing mistrust of the research process.”
