Exponential: The number of human beings whose entire DNA sequence is known has increased dramatically.
If the Internet cloud were actually airborne, it would be crashing down right now under the sheer weight of a quintillion bytes of biological data.
This year, the world’s DNA-sequencing machines are expected to churn out 30,000 complete human genomes, according to estimates in the journal Nature. That is up from 2,700 last year and a few dozen in 2009. Recall that merely a decade ago, before the completion of the Human Genome Project, the number was zero. At this exponential pace, by 2020 it may be feasible, mathematically at least, to decode the DNA of every member of humanity in a single 12-month stretch.
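The arithmetic behind that projection can be checked with a rough sketch. It assumes that the jump from 2,700 to 30,000 genomes (about an elevenfold increase) repeats every year, and that "every member of humanity" means roughly 7 billion people. Both are illustrative assumptions rather than figures from the Nature estimates, and the function name is hypothetical.

```python
def years_until_target(current_per_year, previous_per_year, target_per_year):
    """Count the years of constant year-over-year growth needed before
    annual sequencing output reaches target_per_year.

    Assumption: the growth factor observed between the two most recent
    years simply repeats, which real technology curves rarely do for long.
    """
    growth = current_per_year / previous_per_year
    if growth <= 1:
        raise ValueError("output must be growing for the target to be reached")

    years = 0
    output = current_per_year
    while output < target_per_year:
        output *= growth  # apply one more year of the same growth
        years += 1
    return years


# Figures from the article: 30,000 genomes this year, 2,700 last year.
# Assumed world population: roughly 7 billion.
print(years_until_target(30_000, 2_700, 7_000_000_000))  # prints 6
```

Under those assumptions, annual output passes 7 billion genomes after six more years of the same growth, comfortably within a decade. The calculation says nothing about whether machines, money, or storage could actually keep up, which is why the claim holds only mathematically.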
The vast increase in DNA data is occurring because of dazzling advances in sequencing technology. What cost hundreds of millions of dollars a decade ago now costs a mere $10,000. In a few years, decoding a person’s DNA might cost $100 or even less.
But what’s missing, says a growing chorus of researchers, is a way to make sense of what these endless strings of As, Gs, Cs, and Ts mean to individuals and their health. “We are really good at sequencing people, but our ability to interpret all of this data is lagging behind,” says Eric Schadt, director of the Mount Sinai Institute for Genomics and Multiscale Biology and chief scientific officer at California-based Pacific Biosciences, which sells sequencing machines.
Scientists don’t yet know what all our DNA does—how each difference in genetic code might influence disease or the color of your hair. Nor have studies confirmed that all the genetic markers linked to, say, heart disease and most cancers actually increase a person’s risk for these illnesses. Just as significant, the thousands of genomes being cranked out right now can’t easily be compared. There is no standard format for storing DNA data and no consistent way to analyze or present it. Even nomenclature varies from lab to lab.
The industry is working to address these problems. Earlier this summer, at a meeting of geneticists and other experts that I attended in San Francisco, Clifford Reid, the CEO of Bay Area-based Complete Genomics, called for a consortium of gene companies to develop sorely needed standards for everything from consent procedures for DNA donors to methods of collecting, storing, and analyzing DNA specimens. Reid says the ultimate purpose is to “aggregate multiple data sets, providing broad access to data sets that are today in silos and largely unavailable to the broader scientific community.”