Solving the Lack of Diversity in Genomic Research
An overwhelming majority of the data collected is from people of European ancestry. But researchers are trying to change that.
When geneticist Kent Taylor of the Los Angeles Biomedical Research Institute set out two years ago to write a grant to study genetic risk factors for type 2 diabetes in Hispanics, he couldn’t find much research. He consulted an international collection of all published genome-wide association studies, called the GWAS Catalog, and found 19 studies on type 2 diabetes in Europeans, 14 in Asians, three in Hispanics, one in Mexicans, and one in an American Indian population in Arizona.
These genome-wide association studies scan the DNA of many people to spot genetic variants associated with drivers of disease. But large population groups—including Africans, Latin Americans, and native or indigenous people—are hugely underrepresented in these studies and genomics research in general. Without this information, researchers may be missing key genetic factors that play a role in disease susceptibility and drug response among different population groups.
Taylor says the GWAS Catalog is the most important tool in modern genetics, but we’re “not even close” to completing it. He says more genomic diversity is needed to establish a “basic genome infrastructure” that researchers can use to study disease across different populations.
A recent analysis published in the journal Nature revealed that 81 percent of participants in these genome-wide association studies were of European descent. Together, individuals of African, Latin American, and native or indigenous ancestry represent less than 4 percent of all genomic samples analyzed. While the overall diversity in genome-wide association studies has increased since 2009, when 96 percent of data was from people of European descent, much of the rise in diversity is due to large gains in Asian data and only marginal increases from other population groups.
Adebowale Adeyemo, deputy director of the Center for Research on Genomics and Global Health at the National Human Genome Research Institute (part of the National Institutes of Health), says the research community has tended to assume that poorer countries, like many in Africa, Asia, and Central and South America, don’t need genomic studies because the biggest killers there are infectious diseases. But chronic diseases, like diabetes and heart disease, are now on the rise in lower-income countries too, and Adeyemo says more genome studies are needed to understand different populations’ risk factors for these conditions.
An international group of researchers known as the Human Heredity and Health in Africa Initiative, or H3Africa, is trying to increase what researchers know about genetics in African populations. The project, which consists of 14 studies looking at different diseases, is collecting data from genome-wide association studies as well as genome sequencing data—the entire readout of a person’s DNA—from thousands of participants.
Another effort to increase diversity in genomics research is the Hispanic Community Health Study/Study of Latinos, launched by the National Institutes of Health (NIH) in 2013. The study, which will include 16,000 participants, is meant to establish the risk factors for cardiovascular disease, pulmonary disease, and other chronic diseases in Latin Americans. But Taylor says there is so much genetic diversity among Hispanics and people from Latin America that additional studies will be needed.
NIH is also in the midst of a 30-year-long study on American Indian men and women and their genetic risk factors for cardiovascular disease.
Companies like Illumina and 23andMe are developing new tools to help researchers like these explore genetic variations in different population groups. Illumina is partnering with H3Africa to develop a microarray—a laboratory test chip used to determine differences in genetic makeup within a population—that contains information from more people of African descent than any other commercially available array.
The array will include 2.5 million genetic variations that appear in African populations. The information comes from genomic samples of 3,000 individuals collected by H3Africa. That represents much more genetic diversity than Illumina’s current tests, which include genetic information from fewer than 700 Africans.
Last year, Illumina made available a new array for populations of diverse ancestries. Julie Collens, Illumina’s senior manager for market development, says a chip for Africans was needed because the region is so genetically diverse, and previous arrays included only a small sampling of genetic information from Africa. Illumina has 28 chips commercially available to use for human genetic studies, including those for specific diseases like immune disorders and psychiatric diseases, as well as a Chinese population array.
In addition, 23andMe has announced that it is building a reference database containing the whole-genome sequences of the company’s African-American customers who have consented to participate in research. Adam Auton, a 23andMe senior scientist and statistical geneticist, says the company is aiming to include sequences from more than 900 people in the database, which will eventually be shared with NIH and made available to researchers. To date, most whole-genome sequencing has been done in people of European descent.
While Adeyemo says these new tools will certainly be helpful, their introduction doesn’t represent as big a shift in genomics research as he would like. Researchers have to want to do research in non-European populations, he says—and scientific organizations around the world have to provide funding to support those kinds of studies.