Scientists hope that the ability to identify novel SNPs will be a boon in the hunt for the genomic basis of disease. Most large genomic studies to date have focused on common genetic variations (those with a frequency of at least 5 percent) because they were the easiest to find. But research suggests that these variations account for only a fraction of the genetic contribution to common diseases. The ability to sequence many human genomes will allow scientists to find more of the rare variants and to characterize the potentially large role that they play in human health.
Such studies are already under way. The Yoruba genome is part of an international collaboration known as the 1,000 Genomes project, which will serve as a technological test bed for high-volume human sequencing. “You couldn’t do 1,000 genomes with old technologies, but the new technologies are making it possible,” says Lisa Brooks, director of the genetic-variation program at the National Human Genome Research Institute, in Bethesda, MD. Scientists involved in the project aim to catalogue all human variations that appear at a frequency of about 0.1 percent.
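A rough calculation suggests why a sample on the order of 1,000 genomes pairs naturally with a frequency target near 0.1 percent. The sketch below assumes that each person carries two copies of the genome, that copies are sampled independently, and that a variant counts as found if it shows up at least once. These are simplifying assumptions for illustration, not the project's actual study design, and the function name is made up for this example.

```python
def chance_of_seeing_variant(frequency, n_genomes, copies_per_genome=2):
    """Probability that a variant appears at least once in the sample.

    Assumes independent sampling of genome copies (a simplification).
    """
    n_copies = n_genomes * copies_per_genome
    return 1.0 - (1.0 - frequency) ** n_copies


# Compare a common variant (5 percent) with a rare one (0.1 percent).
for freq in (0.05, 0.001):
    p = chance_of_seeing_variant(freq, n_genomes=1000)
    print(f"frequency {freq:.1%}: chance of appearing at least once = {p:.3f}")
```

Under these assumptions, a variant at 0.1 percent frequency has roughly an 86 percent chance of turning up somewhere among 1,000 genomes, but only about a 0.2 percent chance of turning up in any single genome.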
Illumina is not alone in its quest to cheaply sequence human genomes. Applied Biosystems, the company that supplied many of the sequencing machines for the Human Genome Project, has also sequenced the Yoruba genome and is likely to publish its results soon. Two startups, Pacific Biosciences and Complete Genomics, are also hot on the trail. Complete Genomics, for example, promises a $5,000 genome by next year. The company’s scientists have not yet published their results in peer-reviewed journals, however, so the completeness and accuracy of their method have yet to be independently validated. “With this and other data from the 1,000 Genomes project, we will be in good position to properly calibrate these different technologies,” says Richard Gibbs, director of the Human Genome Sequencing Center at Baylor College of Medicine, in Houston, TX.
The two new genomes are also the first non-Caucasian ones to be added to the public database. “They provide a stepping stone to understanding genetic differences between ethnicities,” says Levy, who wrote a commentary accompanying the publication of the two papers.
In the same issue of Nature, scientists from Washington University School of Medicine, in St. Louis, describe using Illumina’s technology to sequence the first complete cancer genome. They found eight previously unidentified mutations, which may shed light on the disease.
In the Illumina sequencing approach, DNA is fragmented into small pieces and molecularly attached to a specially designed slide known as a flow cell. About 50 million fragments fit on a single flow cell. Each fragment is copied 1,000 times while still stuck to the flow cell. Fluorescently labeled bases, representing the four letters that make up DNA and colored red, green, blue, and yellow, are then added to the flow cell. The base that corresponds to the letter at the first position in a fragment of DNA will attach to that fragment. A camera then snaps a picture of the fluorescent bases at each of the 50 million locations on the flow cell. The base is then clipped off, and the cycle is repeated for each letter of the DNA fragment. The resulting images are computationally stitched together to generate a sequence.
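The cycle described above can be sketched as a toy simulation. This is an illustration of the idea only, not Illumina's software: the pairing of colors with letters is a made-up assumption (the four colors are named above, but not which base gets which), all fragments are assumed to be the same length, and the function names are hypothetical. It also follows the simplified description above, in which the labeled base reports the letter at each position directly.

```python
# Hypothetical dye assignment: which color marks which letter is assumed.
BASE_TO_COLOR = {"A": "red", "C": "green", "G": "blue", "T": "yellow"}
COLOR_TO_BASE = {color: base for base, color in BASE_TO_COLOR.items()}


def image_cycle(fragments, position):
    """Simulate one camera image: the color seen at every flow-cell location."""
    return [BASE_TO_COLOR[fragment[position]] for fragment in fragments]


def sequence_flow_cell(fragments, read_length):
    """Run one imaging cycle per letter, then stitch the images into reads."""
    # One image per cycle; each image holds one color per location.
    images = [image_cycle(fragments, pos) for pos in range(read_length)]

    # Stitching: for each location, read its color across successive images.
    reads = []
    for location in range(len(fragments)):
        letters = [COLOR_TO_BASE[image[location]] for image in images]
        reads.append("".join(letters))
    return reads


# Three locations stand in for the roughly 50 million on a real flow cell.
fragments = ["ACGT", "TTGA", "GGCC"]
reads = sequence_flow_cell(fragments, read_length=4)
print(reads)
```

The point of the sketch is that each image captures one letter position for every fragment at once, so the number of imaging cycles depends on the fragment length rather than on how many fragments sit on the flow cell.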