Poring over the raw genetic data, Mark Daly noticed a startling pattern. An expert in statistical genetics and a fellow at MIT’s Whitehead Institute for Biomedical Research, Daly was scouring a region of human chromosome 5, a place that colleagues strongly suspected contained a gene that puts people at risk for a devastating digestive condition called Crohn’s disease.
The sequence spelled out in the DNA letters A, T, G, and C was almost identical in all the samples Daly examined, each from a different person. As Daly expected, sprinkled every thousand letters or so were spots where a single letter tended to vary from one person to another. Then came the surprise. Many of these single-letter variations seemed to occur together, as if they were tightly linked across long stretches of the DNA. In other words, whenever Daly looked at an individual copy of one of the sections of DNA and found an A at one of these positions, he would find a G at the next one, about a thousand letters away, a C in a third position still further down the line, and so on. After tens of thousands of letters, another pattern began; the long stretches of linked variants, it seemed, divided the chromosome into neatly defined blocks. What’s more, for any given stretch of the chromosome, there were only four or five versions of these blocks that kept showing up in the different individuals Daly studied. Daly realized he was staring at evidence of an underlying structure to the human genome. He was also looking at the beginnings of biology’s next big project, and its next big controversy.
At about the same time, in the fall of 2001, several other genetic researchers reported similar findings. Much of the human genome, it soon appeared, consists of what researchers began to refer to as haplotype blocks. And as Daly had seen on chromosome 5, the blocks tend to come in a limited number of common varieties, which suggests that the genetic variants that put people at risk for common diseases might also be widely shared. Overall, the findings suggested a far simpler structure for the human genome than had previously been supposed. “It is a fundamental change in how we view genetic variations,” says Daly. “And for once, the genetics are very favorable toward performing disease studies.”
Indeed, the finding has immense implications for understanding and treating diseases such as diabetes, schizophrenia, and hypertension. Though people share roughly 99.9 percent of their DNA, it is precisely that other one-tenth of a percent that plays a role in determining why one person gets schizophrenia or diabetes while another doesn’t, why one person responds well to a drug while another can’t tolerate it. If, in fact, the variable DNA letters occur in a limited number of easy-to-identify, blocklike patterns, it would give geneticists a practical way to quickly and cheaply search for the complex genetic variations related to common diseases and different drug responses. Instead of identifying all 10 million of a person’s specific single-letter variants (a time-consuming and prohibitively expensive task), researchers could simply pinpoint a telltale letter for each block and then know the other variants around it.
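For readers who think in code, the shortcut can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method Daly or any other research group used: the four block versions are invented, each string lists only the variable letters of one hypothetical block, and the function names are made up for the example. The sketch looks for the fewest positions whose letters tell every common version apart, then uses those letters to fill in the rest of the block.

```python
from itertools import combinations

# Invented data: four common versions of one hypothetical block, showing
# only the positions where people differ (each site has two possible letters).
COMMON_HAPLOTYPES = ["AGCT", "TGAT", "ACCG", "TCAG"]


def find_tag_positions(haplotypes):
    """Return the smallest set of positions that tells all versions apart."""
    n = len(haplotypes[0])
    for size in range(1, n + 1):
        for positions in combinations(range(n), size):
            # Letters seen at the candidate positions, one signature per version.
            signatures = {tuple(h[p] for p in positions) for h in haplotypes}
            if len(signatures) == len(haplotypes):
                return positions
    return tuple(range(n))  # fallback: every position is needed


def infer_haplotype(tag_letters, positions, haplotypes):
    """Look up the full block version that matches the observed tag letters."""
    for h in haplotypes:
        if tuple(h[p] for p in positions) == tuple(tag_letters):
            return h
    return None  # not one of the common versions


tags = find_tag_positions(COMMON_HAPLOTYPES)
full_block = infer_haplotype("TC", tags, COMMON_HAPLOTYPES)
```

In this toy data no single position separates all four versions, because each site takes only two letters, so the search settles on two telltale positions; reading just those two letters is enough to recover all four.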
But first they would need a map, one that identifies the boundaries of blocks and the different versions of each block found in populations around the world. Last October, a year after Daly’s discovery, the world’s top genetic researchers (including scientists from the Whitehead, the National Institutes of Health, Johns Hopkins University in Baltimore, the University of Tokyo, the Beijing Genomics Institute, and Cambridge, England’s Wellcome Trust Sanger Institute) launched a $100 million, three-year plan to chart just such a map. It’s called the International HapMap Project, and beginning with several hundred blood samples collected from Nigeria, Japan, China, and the United States, it will use highly automated genomics tools to parse out the common haplotype patterns among a number of the world’s population groups (see “Shining Light Variations” infographic).
“This is really a natural outcome of having the sequence of the human genome,” says Aravinda Chakravarti, director of the Institute of Genetic Medicine at Johns Hopkins and a leading participant in the international consortium. “Now we want to know what part of the genome varies. Knowing the variations that enhance or retard specific diseases will be a tremendous value” for medicine, he says. “Having a catalogue of the variations will be very helpful. And the more global the catalogue is, the more helpful it will be.”