Business Report

Big Data Mining

A Q&A with one of the leading inventors of tools for medical data analytics.

Over the past decade, health-care providers have spent tens of billions of dollars to digitize their patients’ medical records. In theory this should be providing researchers with a treasure trove of data to dig through for evidence of the effectiveness and efficiency of care. In practice, it’s more complicated.

Data from these records can be hard to access and difficult to make sense of once it is in hand. Patient privacy issues and data security are of increasing concern and have yet to be fully addressed.

Isaac Kohane, co-director of the Center for Biomedical Informatics at Harvard Medical School, has spent the last 20 years working to pull meaning out of large sets of health data. A pediatrician with a PhD in computer science, Kohane mined medical data to discover the risk of heart attack for patients on one widely prescribed diabetes medicine, Avandia. After his study the drug was pulled off the market. His other research has identified early warning signs of domestic abuse and revealed the variations and patterns among patients with disorders such as autism. He spoke with senior editor Nanette Byrnes.

This story is part of our September/October 2014 Issue

Has this multibillion-dollar investment in electronic health records led to better health care, or at least a better understanding of the quality of care?

You can’t have accountable care if you can’t count. But you’d be dismayed. If you ask any large health-care system, “How many patients do you have with this characteristic? How many patients of this kind did your doctors see? What was their average length of stay?” they will not know.

I do not believe it’s overly cynical to note that many electronic-health-record vendors have touted the ability to bill more effectively for care using electronic records than paper records. [Records that doctors submit to insurance companies] for reimbursement are obviously biased to maximize the income of the health-care system. They may not necessarily reflect the on-the-ground biological or clinical truth.

One oft-cited goal of medical analytics is to combine a patient’s health records with information from his or her genome to create a very precise kind of personalized medical care. But that also seems far off.

In addition to all the challenges of genomic data by virtue of its volume and complexity, no major electronic-health-record vendor supports it. A lot of electronic health records, if you look under the hood, are fairly antiquated. Even though they have a modern skin, they are really state-of-the-art 1980s technology, so integrating them with all the existing genomic tools is a very high bar. Even perhaps more important sometimes than the genome is knowing family history and knowing it in a structured way. But that is not done in most electronic health records, either. The bulk of our health-care data comes from [insurance] claims data and electronic health records, period. And maybe a little bit of public health data.

You and your colleagues have created two platforms with the idea that developers would write apps that could unlock what is in electronic health records.

I do not believe the answer is to tear down all these [electronic-health-record] dinosaurs. Work has gone into them, a lot of thought has gone into them, and you don’t want to rebuild all the back-end stuff. The apps give you modern functionality.

What’s an example?

A detailed family history is on average the most informative [information] for understanding inherited disease risk. Yet very few electronic health records provide the capability to easily enter a family history and link it to the broader genealogy of the family. There are several highly successful Web apps that are low-cost and yet allow entry of highly detailed family history—and by virtue of their market success allow linking of a small family’s history to a much larger genealogy. With a platform like ours you can adapt these modern Web apps to provide a legacy electronic health record with a state-of-the-art family history record.

Even with the technology to mine these records, you say doing that accurately can be tricky.

I am concerned that it’s all too easy to see the data and say, “I’ve been doing big-data analysis for Target and now I can do it for medicine.” That turns out not to be true. You really have to know something about medicine. If statistics lie, then big data can lie in a very, very big way.

When you are looking for adverse events in drugs given for diabetes, for example, it’s pretty tricky if one of the adverse events you are looking for is heart attacks, because heart attacks are also a result of poor diabetes care—the same reason for which the drug is being given. So consequently, if you just willy-nilly said “Just give me all the drugs with a high rate of heart attack,” of course all the diabetes drugs would light up. Instead what we did was say, “Let’s compare the different drugs that are used in the same way and belong to the same class of drug and see if we can see different rates of heart attack if we control for all the other aspects.” And sure enough, we found one such drug. It was called Avandia, and compared to another similar drug, it had a much higher heart attack rate.
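The comparison Kohane describes can be sketched in a few lines of Python. This is only an illustration of the idea, not the method used in the study: the drug names, class labels, and patient counts are invented, and the sketch compares raw heart attack rates within a drug class without controlling for any other patient characteristics, which a real analysis would need to do.

```python
from collections import defaultdict

# Hypothetical toy records: (drug, drug_class, had_heart_attack).
# All names and counts are invented for illustration; they are not study data.
records = (
    [("drug_a", "diabetes_class_x", True)] * 30
    + [("drug_a", "diabetes_class_x", False)] * 970
    + [("drug_b", "diabetes_class_x", True)] * 15
    + [("drug_b", "diabetes_class_x", False)] * 985
    + [("drug_c", "allergy_class_y", True)] * 5
    + [("drug_c", "allergy_class_y", False)] * 995
)


def event_rates(rows):
    """Return the heart attack rate for each (drug_class, drug) pair."""
    counts = defaultdict(lambda: [0, 0])  # key -> [events, patients]
    for drug, drug_class, event in rows:
        counts[(drug_class, drug)][0] += int(event)
        counts[(drug_class, drug)][1] += 1
    return {key: events / total for key, (events, total) in counts.items()}


def within_class_ratios(rates):
    """Compare each drug only with the lowest-rate drug in its own class."""
    by_class = defaultdict(dict)
    for (drug_class, drug), rate in rates.items():
        by_class[drug_class][drug] = rate
    ratios = {}
    for drugs in by_class.values():
        if len(drugs) < 2:
            continue  # nothing comparable in this class
        baseline = min(drugs.values())
        for drug, rate in drugs.items():
            ratios[drug] = rate / baseline if baseline > 0 else float("inf")
    return ratios


rates = event_rates(records)
ratios = within_class_ratios(rates)
```

Ranking every drug by raw rate would flag both invented diabetes drugs ahead of the unrelated one, simply because their patients are at higher risk to begin with. Restricting the comparison to drugs in the same class is what lets one of them stand out from the other.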

There is a lot of concern that compiling databases of health records could result in personal information becoming public. Does that worry you?

The more I know about someone, the more I can do useful things for them, and the more I know about them the more I can discover. And the more you blind me to things, the less useful I’ll be. The only real protection is that the people who have the authorized use of the data have to understand what is the right code of conduct.
