The $100 Genome
Forget the $1,000 genome. Some companies are looking far past that goal to create a really inexpensive sequencing technology.
It currently costs roughly $60,000 to sequence a human genome, and a handful of research groups are hoping to achieve a $1,000 genome within the next three years. But two companies, Complete Genomics and BioNanomatrix, are collaborating to create a novel approach that would sequence your genome for less than the price of a nice pair of jeans–and the technology could read the complete genome in a single workday. “It would have been absolutely impossible to think about this project 10 years ago,” says Radoje Drmanac, chief scientific officer at Complete Genomics, which is based in Mountain View, CA.
The most recent figures for sequencing a human genome are $60,000 and about six weeks, as reported by Applied Biosystems last month. (That’s down from $3 billion for the Human Genome Project, which used traditional sequencing methods and finished in 2003, and about $1 million for James Watson’s genome, sequenced using a newer, high-throughput approach and released last year.) But scientists are still racing to develop methods that are fast and cheap enough to allow everyone to get their genomes sequenced, thus truly ushering in the era of personalized medicine.
Most existing technologies detect the sequence of DNA a single letter at a time. But Complete Genomics aims to speed the process by detecting entire “words,” each composed of five DNA letters. Drmanac likens the technology to Google searches, which query a database of text with keywords. Further speeding up the process with novel chemistry and advances in nanofabrication, the companies will develop a device that can simultaneously read the sequence of multiple genomes on a single chip.
To accomplish the new sequencing, scientists first generate all possible combinations of five-letter DNA segments, given the four letters, or bases, that make up all DNA. These segments are labeled with different types of fluorescent markers and added in groups to a single-stranded molecule of DNA. When a particular segment matches a sequence on the strand of DNA to be read, it binds to that part of the molecule. A specialized camera then snaps a picture–the different fluorescent signals indicate the sequence at specific points along the strand of DNA. The process is repeated with different five-letter DNA combinations, until the entire chromosome is sequenced. The approach is feasible because of the recent availability of cheap DNA synthesis, making it much more efficient to generate libraries of these DNA segments.
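As a rough illustration of the probe library described above: four bases taken five at a time give 4^5 = 1,024 possible segments. The short Python sketch below generates that library and finds where each probe matches a toy strand. The function names and the example strand are hypothetical, and plain string matching stands in for the actual binding chemistry and fluorescent detection, which the article does not detail.

```python
from itertools import product

BASES = "ACGT"      # the four DNA letters
PROBE_LENGTH = 5    # five-letter "words", as described in the article


def all_probes(length=PROBE_LENGTH):
    """Return every possible DNA segment of the given length."""
    return ["".join(letters) for letters in product(BASES, repeat=length)]


def probe_hits(strand, probe):
    """Return every 0-based position where the probe matches the strand.

    Assumption: exact string matching is a stand-in for a probe
    binding to the strand; real binding chemistry is more involved.
    """
    hits = []
    start = strand.find(probe)
    while start != -1:
        hits.append(start)
        start = strand.find(probe, start + 1)
    return hits


if __name__ == "__main__":
    probes = all_probes()
    print(len(probes))  # 4**5 possible five-letter probes

    strand = "ACGTACGTTAGC"  # hypothetical toy strand
    for probe in probes:
        positions = probe_hits(strand, probe)
        if positions:
            print(probe, positions)
```

Only a handful of the 1,024 probes match a strand this short; on a full chromosome, most probes would be expected to match at many positions.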
Each DNA molecule will be threaded into a nanofluidics device, made by Philadelphia-based BioNanomatrix, lined with rows of tiny channels. The narrow width of the channels–about 100 nanometers–forces the normally tangled DNA to unwind, lining up like a train in a long tunnel and giving researchers a clear view of the molecule. “Since we can stretch out DNA, we can get a huge amount of information from each piece of DNA we look at,” says Mike Boyce-Jacino, chief executive officer of BioNanomatrix. “The big difference from any other approach is that we are looking at physical location at the same time we are looking at sequence information.” Sequencing methods currently in use sequence small fragments of DNA and then piece together the location of each fragment computationally, which is more time consuming and requires repetitive sequencing.
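To see why pairing location with sequence matters, consider a toy Python sketch in which every probe match arrives with its position along the stretched molecule. The sequence can then be filled in directly, with no computational guessing about where each piece belongs. The `reconstruct` function and its inputs are hypothetical, and the sketch assumes positions are known to the exact letter, which is an idealization rather than a claim about the device’s resolution.

```python
def reconstruct(located_hits, length):
    """Overlay (position, probe) pairs onto a strand of known length.

    Letters not covered by any probe stay as "N" (unknown).
    Assumption: each position is exact to the single letter.
    """
    sequence = ["N"] * length
    for position, probe in located_hits:
        for offset, base in enumerate(probe):
            index = position + offset
            if sequence[index] not in ("N", base):
                raise ValueError(f"conflicting letters at position {index}")
            sequence[index] = base
    return "".join(sequence)


if __name__ == "__main__":
    # Hypothetical probe matches, each tagged with where it was seen.
    hits = [(0, "ACGTA"), (5, "CGTTA"), (7, "TTAGC")]
    print(reconstruct(hits, 12))
```

Short-read methods, by contrast, must infer each fragment’s position from overlaps with other fragments, which is the computational step the quoted approach aims to avoid.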
The companies still have a long road to the $100 genome. BioNanomatrix has already shown that long pieces of DNA–two million letters in length–can be threaded into the channels of existing chips. But now researchers need to develop chips with many more channels, so that multiple genomes’ worth of DNA can be sequenced simultaneously.
The main hurdle for Complete Genomics will be to generate fluorescent labels that can be easily and accurately detected. Most current methods get around this problem by making many copies of the same DNA molecule and sequencing them simultaneously, thus boosting the signal-to-noise ratio. But that approach limits the length of DNA that can be sequenced, and it raises costs by increasing the amount of chemicals needed for the reaction.
The project is part of the Advanced Technology Program, funded by the National Institute of Standards and Technology to spur development of novel, high-risk technologies. This year, Complete Genomics is releasing a commercial product based on similar chemistry, but the company has declined to give details on its status.
The technology necessary to achieve a $100 genome is still at least five years away, says George Church, a geneticist at Harvard Medical School, in Boston, and a member of Complete Genomics’ scientific advisory board. “But [it’s] coming from a company that has an almost-as-good technology coming out this year.”
Both Drmanac and Boyce-Jacino say that one of the biggest advantages of their technology will be the ability to sequence very long strands of DNA. The newest sequencing technologies in use today read DNA in fairly short spurts, from about 30 to 200 letters, which are then stitched together by a computer. This approach works well for some applications, such as resequencing a known genome. But a growing number of studies suggest that small structural changes in DNA, such as deletions or inversions of short sequences, play a significant role in human variability, says Jeff Schloss, program director for technology development at the National Human Genome Research Institute, in Bethesda, MD. “Those are much harder to pick up with short reads.”
Longer reads will also allow scientists to look at collections of genetic variations that have been inherited together, known as haplotypes. This kind of analysis can determine whether a particular genetic variation has been passed down from the individual’s mother or father. Recent research suggests that in some cases, maternal or paternal inheritance can affect the severity of a disease. With new tools to better track inheritance patterns, scientists may discover that this phenomenon is more common than previously thought. “That’s one reason we’re hoping that several of the emerging methods will allow long reads,” says Schloss.