Massive Gene Database Planned in California

The data will be compared against electronic health records and patients’ personal information.
October 21, 2009

Plans for genetic analyses of 100,000 older Californians–the first time genetic data will be generated for such a large and diverse group–will accelerate research into environmental and genetic causes of disease, researchers say.

Spit kit: Genetic data from a diverse group of 100,000 California patients will be gleaned from samples of saliva, captured in kits like this one.

“This is a force multiplier with respect to genome-wide association studies,” says Cathy Schaefer, a research scientist at Kaiser Permanente, a health-care provider based in Oakland, CA, whose patients will be involved. Researchers will be able to study the data and seek insights into the interplay between genes, the environment, and disease, thanks to access to detailed electronic health records, patient surveys, and even records of environmental conditions where the patients live and work.

“The importance of this project is that it will, almost overnight–well, in two years–produce a very large amount of genetic and phenotypic data that a large number of investigators and scientists can begin asking questions of, rather than having to gather data first,” Schaefer says.

The effort will make use of existing saliva samples taken from California patients, whose average age is 65. Their DNA will be analyzed for 700,000 genetic variations called single-nucleotide polymorphisms, or SNPs, using array analysis technology from Affymetrix in Santa Clara, CA. Through the National Institutes of Health (NIH), the resulting information will be available to other researchers, along with a trove of patient data including patients’ Kaiser Permanente electronic health records, information about the air and water quality in their neighborhoods, and surveys about their lifestyles.

The result will be the largest genetic health research platform of its kind, says Schaefer, who directs Kaiser Permanente’s research program on genes, the environment, and health. The study is being undertaken together with the University of California, San Francisco (UCSF), with a $25 million, two-year NIH grant that tapped federal stimulus funds allocated earlier this year.

The potential for study is nearly limitless. Researchers will likely seek the genetic influences that determine why some people suffering from, say, cardiovascular disease and type 2 diabetes deteriorate more rapidly than others, and tease out which genetic factors reduce the effectiveness of various drugs or, indeed, make them hazardous, Schaefer says. Such insights will allow doctors to tailor drug regimens and focus resources on higher-risk patients.

Given the high average age of the group, the platform will also be a boon to studying diseases of aging. “One might want to ask,” Schaefer says, “what are the genetic influences on changes in blood pressure as people age, and how are those changes in blood pressure related to diseases of aging, like stroke and Alzheimer’s and other cardiovascular diseases?”

UCSF will perform separate procedures on the samples to determine the length of telomeres–sections of DNA at the ends of chromosomes that protect against damage. The length of telomeres is associated with cell division and aging. One of the coinvestigators on the project is Elizabeth Blackburn, a biologist at UCSF who shared the 2009 Nobel Prize in Medicine for her work on telomeres.

Other so-called biobanks may be larger–for example, the U.K. Biobank is in the process of collecting samples from 500,000 people. But in that effort, the actual genetic analysis won’t be done until researchers design studies of various subcategories of patients and perform the genetic analyses on the relevant subset.

Many other institutions are assembling smaller biobanks and genetic-information databases. The Mayo Clinic, for example, this year launched an effort to build its own biobank of genetic information collected from 20,000 patients for purposes of general genomic and clinical research. It is also amassing smaller banks focusing on specific diseases, including bipolar disorder, a spokesman said yesterday.

John Glaser, vice president and chief information officer at Partners Healthcare in Boston, says the Kaiser Permanente platform will make it far easier to conduct research. “The payoffs could be very significant reductions in the costs and time–something on the order of a factor of five–to detect problematic medications and other medical interventions, assess the comparative effectiveness of treatments, and conduct clinical research,” he says.

Glaser adds that the long-term vision is to connect the various genetic databases to amplify their benefits. “One can imagine dozens of databases that are linked that have technical and governance means to conduct parallel analyses,” Glaser says. But, he notes, “there are challenges to making this happen that have only begun to be explored.”

Meanwhile, Kaiser Permanente is trying to expand its collection of biological samples to 500,000 by 2013.