When I was in school at MIT and Harvard in the 1980s and 1990s, I was taught that there were 100,000 or so human genes, every one encoding a protein. The properties of those genes were unknown. Today, I teach that our genome contains only 21,000 protein-coding genes. To our surprise, there are thousands of additional genes that don’t encode proteins. All of these genes have been described in great detail.
I was taught that the parts of the genome not encoding proteins were “junk.” Today, we know that this junk makes up three-quarters of our functional DNA. Parts of it help exquisitely control where and when genes are active in the body.
I was taught that “genetic diseases,” such as cystic fibrosis, are caused by mutation of a single gene, with only a small handful of these mutations known. Today, precise causes are known for 2,800 of these rare single-gene disorders.
I was taught nothing about the more complex genetics of common diseases. Today, we are learning at dizzying speed about the interplay of genes and environment in diabetes, heart disease, and other common conditions. In the past three years alone, more than 1,000 genetic risk factors have been found (an increase of perhaps 50-fold), contributing to more than 100 common diseases.
Such advances would have come far later, if at all, without the Human Genome Project (see “The Human Genome, a Decade Later”). But a body of knowledge is not the project's only legacy. It also changed the way biological research is performed.
I was trained to view scientific data as the private property of each investigator. Human genetics research groups were locked in a “race” to discover each disease gene, and there were winners and losers. This often led to fragmentation of effort and yielded results irreproducible by others. Data was collected by hand and stored in paper notebooks.
The Human Genome Project held the revolutionary view that data collected should be freely available to all. Today this view prevails in genomics and many other fields of biology and medicine. Data is shared online by scientists the world over.
Today, thanks in no small part to the genome project’s example, investigators working on the same disease often publish together. Combining clinical and genetic data this way increases the statistical robustness of the claimed findings and makes for highly reproducible results.
Of course, knowledge of the human genome alone is not sufficient to cure disease. It will always be the case that creativity, hard work, and good fortune are needed to translate biological data into medical progress. But without the information, understanding, and cultural changes brought on by the genome project, the benefits to patients would be much further off.
David Altshuler is a founding member, the deputy director, and the chief academic officer of the Broad Institute of Harvard and MIT, and a professor of genetics and of medicine at Harvard Medical School.