A Quicker Map for Disease

DNA segments passed down from common ancestors can be surprisingly large, which could accelerate the hunt for common disorders.

Erika Jonietz

May 11, 2001

Mapping common genetic diseases such as heart disease and asthma may turn out to be much easier than believed, according to a report published this week in Nature by researchers at the Whitehead Institute Center for Genome Research and the University of Oxford.

The reason? It turns out that segments of DNA shared by people with common ancestors can be much larger than previously thought, significantly decreasing the number of starting places researchers use to map genetic disorders.

SNPing Out

Researchers believe that many common diseases are caused by single-letter variations in the DNA code called single nucleotide polymorphisms, or SNPs (pronounced “snips”). Locating these variants is considered the key to hunting down genetic disorders, but testing each SNP for a correlation with a common disease is practically impossible because of the sheer number of variables involved.

Researchers have already identified over one million SNPs in the human genome, and at each location there could be four possible variants. (The entire genetic code is made up of sequences of only four letters, corresponding to the four nucleotides, or molecules, that form the building blocks of DNA.)

Fortunately, SNPs travel together, meaning that variations in DNA are linked. One variation in the series of letters making up the DNA code consistently corresponds to another variant at another location in the sequence. Like bookends, the pair of corresponding SNPs defines a segment of DNA.
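
As a rough illustration of what it means for two SNPs to "travel together," the sketch below scores how tightly the letters at two positions track each other across a handful of made-up DNA strings. The statistic (the squared correlation, often written r²), the function name `r_squared`, and the sample sequences are all assumptions chosen for illustration; the article does not say which measure the researchers used.

```python
def r_squared(haplotypes, i, j):
    """Squared correlation between the letters at positions i and j."""
    n = len(haplotypes)
    # Use the first sequence's letters as the reference variant at each site.
    a, b = haplotypes[0][i], haplotypes[0][j]
    p_a = sum(h[i] == a for h in haplotypes) / n
    p_b = sum(h[j] == b for h in haplotypes) / n
    p_ab = sum(h[i] == a and h[j] == b for h in haplotypes) / n
    # d measures how far the pair's frequency departs from independence.
    d = p_ab - p_a * p_b
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    return d * d / denom if denom else 0.0


# Hypothetical sequences: position 0 (A/G) always pairs with position 2 (G/T),
# while position 3 (T/A) varies independently of position 0.
haplotypes = ["ACGT", "ACGT", "GCTT", "GCTT", "ACGA", "GCTA"]

linked = r_squared(haplotypes, 0, 2)
unlinked = r_squared(haplotypes, 0, 3)
print(linked, unlinked)
```

A score near 1 means that knowing the letter at one position tells you the letter at the other, which is the "bookend" behavior described above; a score near 0 means the two positions vary independently.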

The number of intervening letters, or bases, in the sequence determines the size of the segment. The larger the segments, the larger the needle in the haystack, and the simpler the task of zeroing in on genes implicated in a certain disease.

Until now, researchers have given widely varying estimates of the size of these segments. But it’s known that segment size is larger in populations that have a relatively small number of common ancestors, because as the number of founding fathers (and mothers) increases, the number of possible segments that could be inherited increases.

This explains why companies such as deCODE genetics and Newfound Genomics are looking for disease genes in regions where populations have been isolated for centuries, such as Iceland and Newfoundland.

Bigger Building Blocks

Until now, geneticists have believed that finding disease-causing SNPs in more diverse populations would be too difficult. Estimates of segment size in other groups have ranged from just 3,000 letters to more than 100,000.

But geneticists at the Whitehead Institute’s Center for Genome Research and at Oxford have found that the segment size in people of northern European descent is much larger than previous estimates suggest.

“This is good news for mapping disease genes in North Americans and certainly northern Europeans,” says David Reich, a post-doctoral fellow at the Whitehead and the study’s first author.

Studying 19 randomly chosen regions of the genome in 44 unrelated Americans, the researchers found that the segments extend an average of 60,000 letters. Because the genome consists of 3 billion letters, this means that the location of a disease-related SNP can now be sought among 50,000 DNA segments instead of a possible one million segments consisting of 3,000 letters each.
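
The arithmetic behind those two figures can be checked in a few lines. This is only a back-of-the-envelope sketch: the constant and function names are made up for illustration, and it assumes the genome divides evenly into equal-sized segments, which real DNA does not.

```python
GENOME_LETTERS = 3_000_000_000  # approximate genome size cited above


def segment_count(segment_letters):
    """Number of equal-sized segments needed to cover the genome."""
    return GENOME_LETTERS // segment_letters


print(segment_count(60_000))  # average segment size found in the study
print(segment_count(3_000))   # low end of the earlier estimates
```

Dividing 3 billion letters by 60,000 gives 50,000 segments, while dividing by 3,000 gives one million, a twentyfold difference in the number of places to search.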

The group conducted the first systematic study of this kind, examining about ten times as many SNPs and different regions of DNA as any previous study. This “wouldn’t have been possible three years ago,” notes Reich, because the researchers needed large, adjoining chunks of genome sequence to use as a starting point. These have only recently become available through the Human Genome Project.

In a statement accompanying the release of the report, Eric Lander, director of the Center for Genome Research, said, “The large blocks in northern European populations will help us to easily map to first approximation the location of disease genes.” Then, analyzing smaller segments in other populations “will allow us to hone in on the specific single-letter difference responsible for a disease.”

Glimpses into the Past

Besides the boost to disease mapping, the study also gives insight into the history of human migrations. As populations grow over time, DNA is reshuffled and segments get shorter.

The researchers compared the findings for the U.S. group with those for 96 individuals from the Yoruba tribe in Nigeria. The ancestral segments for this group extended about 5,000 letters, pointing to a much older common ancestor. This matches previous archaeological and genetic data suggesting that the split between African and European populations occurred about 100,000 years ago.

The Whitehead group’s models show that the long segments of DNA in northern Europeans stem from a severe population bottleneck between 27,000 and 53,000 years ago, when as few as 50 people may have resettled in northern Europe following the last ice age.
