Hunting Disease Origins with Whole-Genome Sequencing
Two studies show that complete-genome sequencing can identify disease-causing genes.
James Lupski, a physician-scientist who suffers from a neurological disorder called Charcot-Marie-Tooth, has been searching for the genetic cause of his disease for more than 25 years. Late last year, he finally found it–by sequencing his entire genome. While a number of human genome sequences have been published to date, Lupski’s research is the first to show how whole-genome sequencing can be used to identify the genetic cause of an individual’s disease.
The project, published today in the New England Journal of Medicine, reflects a new approach to the hunt for disease-causing genes–an approach made possible by the plunging cost of DNA sequencing. Part of a growing trend in the field, the study incorporates both new technology and a more traditional method of gene-hunting that involves analyzing families with rare genetic diseases. A second study, the first to describe the genomes of an entire family of four, confirmed the genetic root of a rare disease, called Miller syndrome, afflicting both children. That study was published online yesterday in Science.
While the approach is currently limited to rare genetic diseases, researchers hope it will ultimately enable the discovery of rare genetic variants increasingly thought to lie at the root of even common diseases, such as diabetes and heart disease.
Lupski was diagnosed as a teenager with Charcot-Marie-Tooth, a disorder that strikes about one in 2,500 people, affects sensory and motor nerves, and leads to weakness of the foot and leg muscles. Three of his seven siblings also have the disease. While the disorder has a number of different forms and can be caused by a number of genetic mutations, it appears to be recessive in Lupski’s family, meaning that an individual must carry two copies of the defective gene to have it.
Decades later, in 1991, after Lupski had trained as both a molecular biologist and a medical geneticist, his lab at Baylor College of Medicine, in Houston, identified the first genetic mutation linked to Charcot-Marie-Tooth: a duplication of a gene on chromosome 17 that is involved in producing the fatty insulation that covers nerve fibers. The lab went on to identify several other genes tied to the disease. “Every time we discovered a new gene, we put my DNA in the group of samples to be sequenced,” says Lupski. But those studies failed to identify the mutation responsible for his case. To date, 29 genes and nine genetic regions have been linked to the disease.
In 2007, Lupski and colleague Richard Gibbs, director of the Human Genome Sequencing Center at Baylor, helped sequence James Watson’s genome, the first personal genome sequence to be published (aside from that of Craig Venter, who used his own DNA in the private arm of the Human Genome Project). The problem with that project was that, thanks to Watson’s good health, there was little clinical relevance to his genome–he had no diseases to try to match to a gene. So Gibbs offered to turn the genome center’s sequencing power on Lupski.
Using technology from Applied Biosystems, a sequencing company based in Foster City, CA, the researchers generated about 90 gigabases of raw sequence data, covering Lupski’s genome approximately 30 times. (Because of unavoidable errors in sequencing, a human genome must be analyzed a number of times to generate an accurate read.) They then identified spots where his genome differed from the reference sequence from the Human Genome Project, and narrowed that pool down to novel variations found in genes previously linked to Charcot-Marie-Tooth or other nerve disorders. Researchers found that Lupski’s genome carried two different mutations in a gene called SH3TC2, which had been previously tied to the disorder. The team then sequenced the gene in DNA from his siblings, parents, and deceased grandparents. (In preparation for this discovery, the scientist had collected his family’s DNA 25 years ago.) All of his affected siblings also carried both of the mutations, while the unaffected family members carried either one or neither, exactly the pattern expected for a recessive disease.
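The two steps described above, narrowing the variants to novel ones in genes already linked to the disorder and then checking how they track with disease in the family, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Baylor team's actual pipeline: the gene names (GENE_A and so on), variant labels, and family genotypes are made-up placeholders, and the functions `candidate_genes` and `fits_recessive_pattern` are hypothetical helpers written for this sketch.

```python
from collections import defaultdict

# Hypothetical variant calls: (gene, variant_id, already_in_public_databases)
variants = [
    ("GENE_A", "A:v1", False),
    ("GENE_A", "A:v2", False),
    ("GENE_B", "B:v1", True),
    ("GENE_C", "C:v1", False),
]

# Hypothetical list of genes previously linked to the disorder
known_disease_genes = {"GENE_A", "GENE_B"}


def candidate_genes(variants, known_genes):
    """Keep novel variants in known disease genes, grouped by gene."""
    by_gene = defaultdict(list)
    for gene, variant_id, is_known_variant in variants:
        if gene in known_genes and not is_known_variant:
            by_gene[gene].append(variant_id)
    # A recessive model needs two hits in the same gene
    return {g: v for g, v in by_gene.items() if len(v) >= 2}


def fits_recessive_pattern(family, mutations):
    """True if exactly the affected members carry all of the mutations."""
    for carried, affected in family.values():
        has_all = mutations <= carried  # subset test on sets
        if affected != has_all:
            return False
    return True


# Hypothetical family: name -> (variants carried, affected?)
family = {
    "proband": ({"A:v1", "A:v2"}, True),
    "affected_sibling": ({"A:v1", "A:v2"}, True),
    "carrier_sibling": ({"A:v1"}, False),
    "noncarrier_sibling": (set(), False),
    "mother": ({"A:v1"}, False),
    "father": ({"A:v2"}, False),
}

candidates = candidate_genes(variants, known_disease_genes)
for gene, hits in candidates.items():
    print(gene, fits_recessive_pattern(family, set(hits)))
```

In this toy data only GENE_A survives the filter, because it is the one known disease gene with two novel variants, and the family check passes because only the affected members carry both.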
“I think this is the wave of the future,” says Thomas Bird, director of the Neurogenetics Clinic at the University of Washington, in Seattle, who was not involved in the research. “Genetic testing is going to become more and more important in medicine as the technology becomes more extensive and less expensive.”
Understanding the genetic mutation that causes Lupski’s disorder can help scientists search for treatments. For example, animals genetically engineered to mimic the gene duplication responsible for about 70 percent of the human cases of Charcot-Marie-Tooth can be helped by an estrogen-blocking drug, which is now in clinical trials. (Other genetic variations, including some yet to be discovered, are responsible for the remaining 30 percent.) The same week that Lupski identified his disease mutation, he received a research paper to review that described the creation of a mouse lacking the same gene, SH3TC2. “Suddenly we’re starting to get insight into the disease process for the first time in 25 years,” says Lupski, who hopes to repeat his success by sequencing patients with other unexplained nerve disorders.
In the Science study, Leroy Hood and collaborators at the Institute for Systems Biology, in Seattle, sequenced the complete genomes of a nuclear family of four, the first published example of familial whole-genome sequencing. Both children in the family have Miller syndrome, a rare craniofacial disorder. By comparing the sequence of parents and offspring, researchers could calculate the rate of spontaneous mutations arising in the human genome from one generation to the next. The rate equates to about 30 mutations per child, lower than previous estimates.
One of the major problems with analyzing whole-genome data is isolating important genetic signals from noise: both sequencing errors and thousands of harmless genetic variations that have little or no impact on a person’s health. Comparing genomes across generations allowed scientists to filter out some of this noise. They homed in on the genetic changes that appeared from one generation to the next and then resequenced those regions to identify true changes. Hood estimates that errors are about 1,000 times more prevalent than true mutations. “In the future, when all of us have our genomes done, we’ll almost certainly have them done in families, because it increases the accuracy of the data,” says Hood.
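The idea of using parents to flag changes that appear from one generation to the next can be illustrated with a short Python sketch. It is a simplified illustration, not the method used by Hood's group: the positions and genotype calls are invented, each genotype is written as a two-letter string, and `could_inherit` and `de_novo_candidates` are hypothetical helper names. A site is flagged when the child's two alleles cannot be explained by taking one allele from each parent.

```python
def could_inherit(child, mother, father):
    """True if the child's alleles can come one from each parent."""
    a, b = child
    return (a in mother and b in father) or (b in mother and a in father)


def de_novo_candidates(calls):
    """Return positions where the child's genotype breaks Mendelian inheritance."""
    return [
        pos
        for pos, (child, mother, father) in calls.items()
        if not could_inherit(child, mother, father)
    ]


# Hypothetical genotype calls: position -> (child, mother, father)
calls = {
    "chr1:1000": ("AG", "AA", "AG"),  # A from mother, G from father
    "chr2:2000": ("CT", "CC", "CC"),  # T is in neither parent
    "chr3:3000": ("GG", "AG", "GG"),  # G from each parent
    "chr4:4000": ("AA", "GG", "AG"),  # mother has no A to pass on
}

flagged = de_novo_candidates(calls)
print(flagged)
```

Sites flagged this way are only candidates. Given Hood's estimate that errors outnumber true mutations by roughly 1,000 to one, most would be expected to be sequencing mistakes, which is why the flagged regions were resequenced before any were counted as real.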
By comparing the genomes of the unaffected parents to their affected children, researchers identified four candidate genes for Miller syndrome. One candidate overlapped a gene linked to the disease in a study published in January. That study sequenced just the gene-coding regions of these children and two others with Miller syndrome. (Lupski’s study, in contrast, focused on genes known to be related to Charcot-Marie-Tooth or other nerve disorders. But that approach would be ineffective in identifying unexpected genes or genes for diseases that are not well-studied.)
Thus far, whole-genome sequencing has been limited to identifying genes linked to so-called Mendelian disorders, in which mutations in a single gene cause the disease. Eventually, Lupski, Hood, and others aim to move on to more complex diseases, such as Alzheimer’s. “There are various ways to turn a common disease into a rare one–you start with families that have more severe forms of common disease, or earlier onsets,” says George Church, who leads the Personal Genome Project at Harvard. “I think almost every disease has rare [variants].” It was this type of approach that led to the identification in 1993 of the Alzheimer’s risk gene APOE4, still the strongest genetic risk factor known to date. Now, thanks to cheap sequencing, the ability to scan the genome in its entirety will allow a much broader and more thorough search.
Lupski’s family gives a potential example of how genes tied to rare disorders may shed light on more common ones. Two of the scientist’s siblings who carried one of the genetic mutations linked to Charcot-Marie-Tooth had signs of carpal tunnel syndrome, a common disorder often caused by repetitive movements. “That’s a very common disease and now we have insight into it,” says Church. “I think there will be lots of cases where you identify a gene in one person in a family, and then start asking questions about the phenotype of family members with only one copy.”
Church and others say the two studies signal a new trend in human genetics research. Over the last few years, microarrays designed to cheaply screen human genomes for common genetic variations linked to common, complex diseases have mostly picked up variants with only a mild effect on disease risk. The bulk of the genetic causes of these ailments remains a mystery, and a growing number of scientists believe this disease risk lies in rare variants only detectable with whole-genome sequencing.
Lupski, who as a medical geneticist still sees patients once a week, won’t be able to offer them whole-genome sequencing in the next year or two. “I don’t want people to think everyone can have this diagnosis, or that having a diagnosis means there is a cure,” says Lupski. “But we can start using the technology more and more for gene discovery.”
But cost-wise, personal genomes may not be far off. For comparison, Bird at the University of Washington says that a comprehensive genetic screen for inherited nerve diseases costs about $15,000. Researchers estimate that Lupski’s genome cost about $50,000. And Complete Genomics, a startup in California that sequenced the family in Hood’s study, will soon offer bulk sequencing services for about $20,000 a genome, with a $5,000 price tag not far behind.