Sequencing a Single Molecule of DNA
Helicos BioSciences’ novel machine could speed up sequencing and unearth new disease-linked genetic variations.
In the corner of Helicos BioSciences’ offices in Cambridge, MA, a screen on the face of what looks like a giant refrigerator flashes a countdown: 10 days, five hours, and 51 minutes until it finishes reading the sequence of all the DNA that has been fed into it. The high-throughput machine, a complex configuration of tubes, lasers, and chemicals, contains two plates, each with 25 microfluidic channels etched into it. Each channel is capable of holding and sequencing a separate DNA sample. Sequencing the samples in parallel, the machine takes just one hour to read 1.3 billion of the chemical “bases”–known as A, C, T, and G–that make up a strand of DNA.
Called the HeliScope, it is the first commercial instrument that can directly read the sequence of a single such strand, a capability that gives it the potential for unprecedented speed. In fact, says Stephen Quake, a bioengineer at Stanford University who cofounded the company in 2003, Helicos has “basically built the world’s fastest DNA sequencer.” Though it’s not clear whether the machine will produce a complete sequence more rapidly than competing systems do (the data generated by a sequencing machine still has to be analyzed and stitched together, a computationally intensive task), Quake says it is “opening entire new areas of research.”
The HeliScope, introduced earlier this year, is joining an intense race for faster and cheaper sequencing technologies. The price of sequencing a human genome has dropped in recent years, from the $300 million the Human Genome Project spent on its first draft to less than $100,000. The applications of cheap sequencing are almost limitless, from disease diagnostics to research that could yield microbes engineered to produce biofuels or medicines.
In other advanced sequencing technologies currently in use, including those from Illumina, Applied Biosystems, and 454 Life Sciences (which was acquired by Roche last year), the DNA to be sequenced must be amplified, or copied many times; the copies are then read simultaneously to make it easier to detect fluorescent signals that indicate the position of each DNA letter. Single-molecule sequencing skips the copying step, meaning that many more unique samples can be packed into a single sequencing experiment.
In addition, single-molecule sequencing may be able to generate a more complete picture of the genome. That’s because when DNA is amplified, some strings are likelier than others to be copied successfully, so they’re more likely to be represented in the final sequence. Likewise, rare genetic mutations may go unrepresented because they don’t get copied. “If at the end of the day you can just put a single strand of DNA onto a platform and sequence it directly, it’s a huge advantage,” says Elaine R. Mardis, codirector of the Genome Center at Washington University in St. Louis.
Awake at Night
With the Helicos technology, the DNA to be sequenced is first chopped into short pieces about 200 bases long and injected into a flow cell, a specialized glass slide. The flow cell is coated with tiny snippets of DNA that are designed to snag the fragments as they float by, anchoring them in place. The immobilized pieces of DNA are fluorescently labeled so that their position under a fluorescence microscope can be recorded by a camera. Nearly a billion pieces of DNA can be analyzed in a single sequencing experiment, compared with about 400,000 to 50 million for other technologies.
The flow cell is then nestled into the HeliScope, where the microscope sits ensconced in 400 pounds of Vermont granite. The added weight stops any vibrations from interfering with the signals the device must detect. A complex optical system and a tangle of tubing surround the microscope, connecting it to what looks like a miniature fridge filled with bottles of specially made chemicals.
When a scientist activates the machine, a precisely choreographed dance of fluids begins. An enzyme called DNA polymerase and a single type of fluorescently labeled base–say, A–flow into the cell. The enzyme causes those As to take their places in growing strands of DNA that complement the strands in the samples. (Each of the four bases can pair with only one other base, so an added A must line up opposite an existing T, and a C against a G.) Once the fluorescently tagged base is incorporated into the new strand, the HeliScope’s camera can spot the light it emits. “The imager detects a plume–a 200-nanometer cone of light–from the integration of a single [base] onto a single strand of DNA,” says Steve Lombardi, president of Helicos.
Other advanced sequencing methods use a similar approach, known as sequencing by synthesis. But unlike those technologies, the HeliScope can distinguish the unamplified fluorescent signal of a single base taking its place on a growing DNA strand. One key to that ability is a nonstick material that the company developed, which coats the surface of the flow cell and allows it to be washed clean between reactions. Without it, residual fluorescent bases would make it more difficult to accurately detect individual sequencing reactions. “You need to make sure no extra base molecules are sticking to the surface,” says Patrice Milos, chief scientific officer at Helicos. “This was one of the biggest early challenges.” After each cycle, the fluorescent markers are clipped from the newly incorporated bases, and remaining chemicals are washed away. The process is repeated sequentially with each of the four bases.
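For readers who think in code, the cycle described above can be sketched as a toy simulation in Python. This is an illustration only, not Helicos software: the function names, the A-C-G-T flow order, and the rule that a strand accepts at most one base per flow are all assumptions made for the example, and strand orientation is ignored.

```python
# Toy model of cyclic sequencing by synthesis. Illustrative only.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
FLOW_ORDER = "ACGT"  # assumed order; the article says only that the four bases take turns


def sequence_by_synthesis(template, cycles):
    """Return the bases added to the growing strand, in order of incorporation.

    Assumption: a strand accepts at most one base per flow.
    """
    position = 0  # index of the next unpaired base on the template strand
    read = []     # bases whose flash of light the camera would have recorded
    for _ in range(cycles):
        for base in FLOW_ORDER:
            if position == len(template):
                return "".join(read)  # template fully copied
            # A flowed base is incorporated only if it pairs with the template.
            if COMPLEMENT[base] == template[position]:
                read.append(base)
                position += 1
            # In the real instrument, labels are clipped and leftovers washed away here.
    return "".join(read)


def template_from_read(read):
    """Infer the sample strand from the bases that were incorporated."""
    return "".join(COMPLEMENT[base] for base in read)
```

In this sketch, a template of "TTA" yields only "A" after one full cycle and "AAT" after two, which shows why many rounds of the four flows are needed to read even a short fragment.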
The HeliScope generates a massive amount of raw data every second. It takes five to 10 days to read all the DNA that can be loaded into two flow cells; in a sequencing run, that’s 400 million strands of DNA per cell, which can generate 20 billion bases’ worth of usable sequence. Scientists load the machine, press a button on its face, and leave. But the sequencing-obsessed can use the Internet to check the machine’s progress in the middle of the night, a common occurrence at Helicos.
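A quick back-of-the-envelope check of those figures, written as a short Python sketch. The numbers come from the article; the one assumption, flagged in the code, is that the 20 billion usable bases refers to a full two-cell run rather than to a single flow cell, which the text does not make explicit.

```python
# Figures quoted in the article.
STRANDS_PER_FLOW_CELL = 400_000_000
FLOW_CELLS_PER_RUN = 2
# Assumption: the 20 billion usable bases is the total for a full two-cell run.
USABLE_BASES_PER_RUN = 20_000_000_000


def strands_per_run():
    """Total DNA strands loaded across both flow cells."""
    return STRANDS_PER_FLOW_CELL * FLOW_CELLS_PER_RUN


def usable_bases_per_strand():
    """Average usable sequence contributed by each strand."""
    return USABLE_BASES_PER_RUN / strands_per_run()
```

Under that assumption, a run holds 800 million strands, in line with the "nearly a billion" figure, and each strand contributes about 25 usable bases on average.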
Once the HeliScope creates its series of fluorescence photographs, an accompanying data-processing center converts them into strings of letters. Software pastes these pieces together to form a longer sequence.
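The article does not say how that software works. As a rough illustration of what pasting pieces together can mean, here is a minimal Python sketch of greedy overlap merging, one textbook approach. The function names and the minimum overlap of three bases are choices made for this example; real reads contain errors and come from both strands, which this toy ignores.

```python
def overlap(left, right, min_len=3):
    """Length of the longest suffix of `left` that equals a prefix of `right`."""
    for size in range(min(len(left), len(right)), min_len - 1, -1):
        if left.endswith(right[:size]):
            return size
    return 0


def greedy_assemble(reads, min_len=3):
    """Repeatedly merge the pair of reads with the longest overlap."""
    reads = list(dict.fromkeys(reads))  # drop exact duplicates, keep order
    while len(reads) > 1:
        best_size, best_i, best_j = 0, None, None
        for i, a in enumerate(reads):
            for j, b in enumerate(reads):
                if i != j:
                    size = overlap(a, b, min_len)
                    if size > best_size:
                        best_size, best_i, best_j = size, i, j
        if best_size == 0:
            break  # nothing left that overlaps enough to merge
        merged = reads[best_i] + reads[best_j][best_size:]
        reads = [r for k, r in enumerate(reads) if k not in (best_i, best_j)]
        reads.append(merged)
    return reads
```

Given the three fragments "ATGGCGT", "GCGTACC", and "TACCTGA", the sketch stitches them into the single sequence "ATGGCGTACCTGA"; fragments with no shared ends are left as separate pieces.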
In a paper published in Science earlier this year, scientists reported on their use of the HeliScope to sequence the genome of the M13 virus, important proof that single-molecule sequencing could be used to read and assemble the sequence of a complete genome. (Approximately 7,000 base pairs long, the M13 virus’s genome is tiny–about a millionth the size of a human’s.) The technology is so new that it’s not yet clear what applications it will be best suited to. But some scientists believe that single-molecule sequencing could be particularly important in understanding how genetic variations contribute to disease. After all, some rare mutations linked to disease may have been missed in previous genomic studies because they weren’t copied during the amplification process.
Helicos is still tinkering with the technology, developing chemistry that could boost the speed of the sequencing reactions and allow more pieces of DNA to be anchored to a flow cell. Along with the other major players in the field, the company hopes to deliver a complete genome sequence for $1,000, an accomplishment that would mark the beginning of something totally new in medicine: individuals’ ability to access their own genomic information.
Emily Singer is TR’s biotechnology and life sciences editor.