Natural-Language Processing Makes Sense of Doctors’ Notes

The technique could ultimately offer a way to make electronic medical records more useful.

Emily Singer

August 24, 2011

Despite billions of dollars in incentives to support the adoption of electronic medical records, evidence that these systems improve the efficiency or quality of care has been scarce. But a new study shows that natural-language processing—a branch of computer science that employs linguistics to analyze regular speech—may greatly increase the utility of these records in improving care.

Researchers used this approach to sift through physicians’ notes, the richest and most complicated aspect of electronic medical records, for postsurgical complications such as pneumonia and sepsis. The method proved considerably more accurate than other automated systems. They say similar approaches could be used for a variety of applications, including predicting which patients are at risk, and developing automated tools that help doctors choose treatments.

“You can finally see how clinical data can be used to measure patient safety more systematically, and that we will really be able to use these things to manage care,” says Ashish Jha, a physician at Harvard Medical School who wrote an editorial accompanying the paper. The paper and editorial were published this week in the Journal of the American Medical Association.

One of the most anticipated benefits of electronic medical records is computerized tracking of patients and institutions—to detect whether a particular patient is at risk for a specific complication, for example, or a specific department or hospital is performing more poorly than others.

Automated tracking is already in use in prescribing, for example to detect when two medicines interact. Because prescription information is a highly structured part of the medical record, it has been fairly easy to analyze with software. However, harnessing the vast information available in less structured parts of the medical record is much harder. These parts include clinicians’ notes, which contain free-form entries about the patient’s history and status, including postsurgical complications.

“If we can’t access that information, we will have a hard time monitoring records to improve care,” says Jha. “This paper is so powerful because it shows you can do this.”

Harvey Murff, a physician at Vanderbilt University, and collaborators tackled the problem using natural-language processing algorithms that incorporate certain rules of speech and language into analysis. For example, a keyword search could retrieve all instances of the word “pneumonia,” but natural-language processing could also take into account modifiers, such as “no signs of” pneumonia, that would yield a more accurate count.
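The difference between a plain keyword search and a negation-aware count can be sketched in a few lines of Python. This is a toy illustration only, not the algorithm used in the study: the cue list, the five-word window, the function names, and the sample notes are all assumptions made up for the example.

```python
import re

# Hypothetical negation cues; a real clinical system would use a far larger, validated list.
NEGATION_CUES = ("no signs of", "no evidence of", "negative for", "denies", "without")

# Invented sample notes, not taken from any real record.
NOTES = [
    "Chest X-ray shows no signs of pneumonia.",
    "Patient developed pneumonia on postoperative day 3.",
    "Negative for sepsis. Pneumonia suspected in the left lower lobe.",
    "Incision healing well; patient denies chest pain.",
]


def keyword_hits(notes, term):
    """Count notes that mention the term at all (plain keyword search)."""
    return sum(1 for note in notes if term in note.lower())


def _is_affirmed(sentence, term, window):
    """True if some mention of the term has no negation cue in the preceding words."""
    for match in re.finditer(re.escape(term), sentence):
        preceding = " ".join(sentence[:match.start()].split()[-window:])
        if not any(cue in preceding for cue in NEGATION_CUES):
            return True
    return False


def affirmed_hits(notes, term, window=5):
    """Count notes with at least one non-negated mention of the term."""
    count = 0
    for note in notes:
        # Split into rough sentences so a cue in one sentence does not negate the next.
        sentences = re.split(r"[.;\n]", note.lower())
        if any(_is_affirmed(s, term, window) for s in sentences):
            count += 1
    return count


print(keyword_hits(NOTES, "pneumonia"))   # 3
print(affirmed_hits(NOTES, "pneumonia"))  # 2
```

The keyword search counts the note that says “no signs of pneumonia” as a case, while the negation-aware version does not. Real clinical language processing also has to handle abbreviations, misspellings, uncertainty, and family history, none of which this sketch attempts.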

Researchers analyzed nearly 3,000 medical records from patients who had surgery at six Veterans Health Administration medical centers, looking for signs of pneumonia, sepsis, deep vein thrombosis, pulmonary embolism, and myocardial infarction. Tracking adverse incidents after surgery can help hospitals and health-care systems monitor how well an institution is following safety guidelines. But current methods either require extensive manpower, with staff manually going through records to identify complications, or lack accuracy. “We wanted to try to replicate what a human would do, but in a way that would be scalable to a larger health-care setting and more cost-effective,” says Murff.

While developing the algorithms involved some trial and error, the end result was highly sensitive: the algorithms identified between 80 percent and 90 percent of the complications previously noted in a manual review by trained nurses. The natural-language processing approach was more sensitive than another automated method that used billing codes to identify postsurgical complications. For example, Murff’s approach detected 82 percent of acute renal failure cases, compared with 38 percent for the billing-codes approach.

However, the new approach was less specific in many cases, detecting more false positives. “I think with more iterations, we could improve that,” says Murff. His team is now working on using the data in clinicians’ notes to predict patients’ risks of complications or other safety hazards.
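Sensitivity and specificity measure different things, which is why a method can gain on one and lose on the other. Sensitivity is the share of real complications the method catches; specificity is the share of complication-free records it correctly leaves alone. The short Python sketch below shows the arithmetic. The counts are invented for illustration and are not figures from the study.

```python
def sensitivity(true_positives, false_negatives):
    """Share of actual complications that the method flagged."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Share of complication-free records that the method correctly did not flag."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
tp, fn = 82, 18    # 100 records with a complication; 82 flagged, 18 missed
tn, fp = 850, 50   # 900 records without one; 50 flagged in error

print(f"sensitivity: {sensitivity(tp, fn):.2f}")
print(f"specificity: {specificity(tn, fp):.2f}")
```

Flagging more records tends to raise sensitivity, because fewer real cases are missed, but each extra false positive lowers specificity.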

One of the benefits of natural-language processing is its flexibility. Jha says the approach could be used for a number of applications. Most notable, he says, are clinical decision support tools, “where you give physicians ideas for how to treat patients better. Giving physicians suggestions that take information in clinical notes into account would be very powerful.”

Nuance, a leading maker of voice-recognition software, is already developing commercial systems that use natural-language processing to analyze medical information. The company is collaborating with the IBM team that developed Watson, the computer system made famous by beating human contestants on the television game show Jeopardy!, to apply the system’s natural-language processing tools to medicine.
