Combing Medical Records for Research

The vast data housed in electronic records and genomics databases could reveal new insights.

When the stimulus bill passed last year, allocating $20 billion to help doctors and hospitals adopt electronic medical records (EMRs), many scientists were excited about the possibilities for medical research. EMRs provide vast amounts of medical information that can be combed automatically and used to ask questions that are too expensive or perhaps unethical to study in traditional clinical trials, such as whether newer, more expensive treatments are more effective than older ones.

“There is a lot of federal funding right now supporting the development of the infrastructure to do that kind of work, as well as to look at comparative effectiveness research using databases,” says Richard Tannen, a physician at the University of Pennsylvania, in Philadelphia. “But it’s a complex and difficult problem, in some ways more difficult than people appreciate.”

While the idea of using electronic medical records for research has been around for more than a decade, it’s only recently started to take off. Scientists and physicians are now scouring the growing number of electronic medical records and genomic databases to figure out how to use this vast medical resource to answer a number of questions in medicine, such as why patients can respond so variably to treatment, and how genetics or other factors might contribute to this.

Researchers have had to invent new analysis methods to glean useful data from often disparate databases, and to make sure that the results produced aren’t biased. Studies based on data from EMRs are subject to the same concerns as observational studies, in which scientists look for links between an individual’s natural behavior and their health. It was observational research that suggested that hormone replacement in postmenopausal women reduced the risk of heart attack, while subsequent clinical trials found that the treatment increased the risk of heart disease and stroke.

Dan Roden, a clinical pharmacologist at Vanderbilt University, in Nashville, TN, is beginning to address some of those challenges in a pilot project linking EMRs to genomics databases. While he ultimately wants to use EMRs to better understand why different patients can react so differently to the same drug, the project is starting with the most basic questions. “We wanted to ask what genetic information would you want to access to take care of someone, what are the informatics challenges, and what are the ethical challenges in storing people’s information?” says Roden.

His team began by building a DNA database in 2007, extracting DNA from clinical samples collected for other research projects. (Thanks to the way the Vanderbilt medical system is organized, researchers can use such samples for multiple purposes and link that information to the patient’s medical record, while the patient’s identity remains hidden.) The team analyzed DNA from 10,000 people, searching for 21 specific single-letter variations that had been previously linked to different diseases. Using natural language processing, a set of computational techniques for analyzing free text, researchers developed a method to reliably identify patients with specific diseases solely from their medical records. The task is more challenging than one might expect; for example, someone may see a rheumatologist for evaluation without actually having rheumatoid arthritis.
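To see why a bare keyword match falls short, consider a minimal sketch in Python. It is not the Vanderbilt team's method, which the article does not describe in detail. The cue phrases, the two-note threshold, and the function names are all assumptions chosen for illustration; a real system would rely on a far richer language-processing pipeline.

```python
import re
from typing import List

# Hypothetical cue phrases that signal negation or mere suspicion.
# These are illustrative assumptions, not a validated clinical vocabulary.
UNCERTAINTY_CUES = (
    "no evidence of",
    "ruled out",
    "rule out",
    "negative for",
    "evaluate for",
)
DISEASE_PATTERN = re.compile(r"rheumatoid arthritis", re.IGNORECASE)


def mentions_disease_affirmatively(note: str) -> bool:
    """Return True if some sentence names the disease without an uncertainty cue."""
    for sentence in re.split(r"[.;\n]+", note):
        if not DISEASE_PATTERN.search(sentence):
            continue
        lowered = sentence.lower()
        if not any(cue in lowered for cue in UNCERTAINTY_CUES):
            return True
    return False


def classify_patient(notes: List[str], min_affirmative_notes: int = 2) -> bool:
    """Flag a patient only if enough separate notes affirm the diagnosis.

    The threshold of two notes is an arbitrary assumption for this sketch.
    """
    affirmative = sum(mentions_disease_affirmatively(n) for n in notes)
    return affirmative >= min_affirmative_notes


# Invented example notes, not real patient data.
referred_only = [
    "Referred to rheumatology to evaluate for rheumatoid arthritis.",
    "Workup negative for rheumatoid arthritis.",
]
diagnosed = [
    "Rheumatoid arthritis, on methotrexate.",
    "Follow-up for rheumatoid arthritis; joint swelling improved.",
]
```

Both invented patients mention rheumatoid arthritis in every note, so a plain keyword search would flag both. Under the assumed rules, `classify_patient(referred_only)` returns `False` and `classify_patient(diagnosed)` returns `True`.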

By searching for genetic variations that are more common in people with specific diseases, the team confirmed a number of previously identified gene-disease links. The findings, published last week in the American Journal of Human Genetics, show that this type of research can yield useful results.
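One common way to quantify whether a variant is "more common" among people with a disease is an odds ratio computed from a two-by-two table of carriers and non-carriers. The sketch below is a generic illustration, not the analysis used in the published study. The function name is hypothetical, the counts are invented, and the confidence interval uses the standard log (Woolf) approximation.

```python
import math


def odds_ratio_with_ci(case_carriers, case_noncarriers,
                       control_carriers, control_noncarriers, z=1.96):
    """Odds ratio and approximate 95% confidence interval from a 2x2 table.

    Uses the log (Woolf) approximation; all four counts must be nonzero.
    """
    odds_ratio = (case_carriers * control_noncarriers) / (
        case_noncarriers * control_carriers
    )
    # Standard error of log(odds ratio) under the Woolf approximation.
    standard_error = math.sqrt(
        1 / case_carriers
        + 1 / case_noncarriers
        + 1 / control_carriers
        + 1 / control_noncarriers
    )
    lower = math.exp(math.log(odds_ratio) - z * standard_error)
    upper = math.exp(math.log(odds_ratio) + z * standard_error)
    return odds_ratio, lower, upper


# Invented counts for illustration only: 30 of 100 patients with the disease
# carry the variant, versus 20 of 100 patients without it.
estimate, ci_lower, ci_upper = odds_ratio_with_ci(30, 70, 20, 80)
```

An odds ratio above 1 whose interval excludes 1 would suggest the variant is associated with the disease. With small invented counts like these the interval is wide, which is one reason such studies need large databases.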

The team has now expanded the database to 81,000 samples and plans to use it to ask more complex questions. Roden will try to find genetic predictors of drug response: specific variations that predict whether a patient is unlikely to respond to a specific drug, or more likely to suffer a dangerous or debilitating side effect. “The outcome will be a set of genetic variants that we think will be important to incorporate into medical record,” says Roden. “We want to be able to say, ‘Here’s a person who won’t respond to beta blocker, so they should get a diuretic.’”

According to Penn’s Tannen, it will likely take years to build up the databases needed to conduct broader clinical research. He estimates that a database of about 50 million people is necessary to ask the types of questions he is most interested in, such as whether patients older than 75 react the same way to a particular therapy as do those who are in their 40s. “That’s the potential great power of database studies,” he says.
