When a sequenced human genome was announced in 2001, most people missed a crucial part of the news: the genome was a first draft, sketchy and incomplete. This fall, scientists completed a new draft. The more complete version reveals that humans have a few thousand fewer genes than was previously predicted. The new version will also allow researchers to analyze the genome on a larger scale.
Thousands of scientists affiliated with the National Human Genome Research Institute, including almost 200 from MIT, announced the completion of the new human genome draft in the October 21, 2004, issue of the journal Nature. Their higher-quality sequencing data, which gives improved coverage of the entire genome (especially the gene-rich regions), means all scientists will now be better equipped to search for the genetic causes of disease and investigate the genome's structure and evolution.
In the 2001 rough draft, significant chunks of the genome weren't sequenced. The gaps in the initial draft meant that researchers always had to be somewhat wary of the accuracy of the publicly available sequence, which slowed the progress of their projects. The final draft, however, accurately represents 99 percent of the genome's gene-rich regions.
The number of genes in the genome was once estimated at near 100,000 and reëstimated at about 30,000 in 2001, but the new draft suggests that it is in fact somewhere between 20,000 and 25,000. Mark Daly, a researcher at the Whitehead Institute for Biological Research, isn't concerned about this smaller number. "I don't think it's such an important number," he says. "It was an easy benchmark figure." Understanding how the genes function and interact with each other is much more valuable information, Daly says, which scientists are just starting to gather.
The more-complete sequence does allow scientists to draw some immediate conclusions about the genome, particularly about the mechanisms of gene evolution. For example, the researchers examined gene death, a natural process occurring over many generations in which genes acquire debilitating mutations. They found that, over time, humans have lost genes that make proteins for smelling. Today, humans have somewhere around 800 olfactory receptor genes, and only half of them appear to function. Mice, on the other hand, have around 1,000 such genes, and zebra fish have only about a hundred. "These differences in gene family number are really dramatic, indicating how fast these genes are evolving," says Chad Nusbaum, a coauthor of the Nature paper and codirector of the sequence and analysis program at the Eli and Edythe L. Broad Institute.
Though the genome is still technically incomplete, the remaining regions are very difficult to sequence using existing analytical methods. "It will be another project to close the gaps, requiring new techniques," says Nusbaum. "We will get as much as we can until the cost becomes unreasonable."
For now, Nusbaum says that he and his colleagues are relieved to have finally finished this most recent draft. "We were being held hostage by not having the data," he explains. "Now we can go back to understanding biology again!"