The US has no idea how to manage all the testing data it’s collecting

In the US, each state decides how it reports findings from covid-19 tests. The result is a chaotic system that’s hurting our response to the pandemic.
As the US expands coronavirus testing, it needs to find a better way to organize and report the data moving forward. Photo: David Dee Delgado/Getty Images

Imagine you’re an epidemiologist or public health expert in the US during the current crisis. Senior elected officials have just contacted you to ask your advice on whether it’s safe to ease some lockdown restrictions. To prepare your answer, you will need to take a closer look at what the covid-19 testing data says. 

Getting this data means going to the health department website of each jurisdiction in question (and the neighboring ones), pulling up the information separately, and then trying to collate it all. You’ll have to pray the information is up to date, since there’s no guarantee. And you might even have to contact the departments directly and make a special request if you’re looking for numbers and information not readily available on their websites. The entire process will be a long, drawn-out, frustrating affair. And you might not even get what you want. 

Why? Because public health is a decentralized system in the US. In the case of covid-19, there’s no consistent standard for how states should collect and report the data. Individual states and their own health departments decide how they want to handle testing—including how to collect, organize, and report the results. And that can be a problem.

“As you can guess, some health departments are better than others,” says Neal Goldstein, an epidemiologist at Drexel University’s Dornsife School of Public Health. 

Every state health department reports the number of positive and negative tests. But then disparities arise. States can choose whether or not to slice the numbers up geographically (like by zip code); tally recovered cases and deaths (confirmed and probable); show hospitalizations and factors like ventilator or ICU usage; or include demographic information like patients’ ethnicity, age, sex, and preexisting conditions. 

In some places, like New York City, the raw data is freely available and regularly updated for any interested party. But this kind of transparency is rare, and more often you need to make time-consuming and cumbersome special requests for deeper testing data. 

This mishmash of approaches and standards is causing delays in the US response to the pandemic. Without a uniform, efficient pipeline for aggregating and reporting covid-19 testing data, we lack the up-to-date information that would help focus our efforts, and we must spend unnecessary resources and time reconciling irregularities and disparities in the numbers. Things like contact tracing, surveillance, and resource management for hospitals depend on real-time testing information, but that is hard to get when no one is reporting it in the same way. Delays in wrangling it into shape can affect decision-making. Policymakers don’t want to make decisions based on data that is 72 hours old while they wait for experts to smooth out incoming information. 
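To make the reconciliation work concrete, here is a minimal sketch of a normalization step that maps each jurisdiction's export into one common record. Everything named here is an assumption for illustration: the `DailyReport` shape, the `state_a` and `state_b` feeds, and their field names are hypothetical, not any real health department's format.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class DailyReport:
    """Common shape for one jurisdiction's daily testing numbers."""
    jurisdiction: str
    report_date: str                     # ISO 8601 date, e.g. "2020-04-20"
    positive: int
    negative: int
    hospitalized: Optional[int] = None   # None means "not reported"
    deaths: Optional[int] = None         # None means "not reported"


# Hypothetical per-jurisdiction field names; real feeds will differ.
FIELD_MAPS: Dict[str, Dict[str, str]] = {
    "state_a": {
        "report_date": "date",
        "positive": "pos_tests",
        "negative": "neg_tests",
        "hospitalized": "hosp",
        "deaths": "deaths_confirmed",
    },
    "state_b": {
        "report_date": "Date",
        "positive": "PositiveCount",
        "negative": "NegativeCount",
    },
}

REQUIRED = ("report_date", "positive", "negative")


def normalize(jurisdiction: str, raw: Dict[str, Any]) -> DailyReport:
    """Convert one raw record into the common shape, or raise ValueError."""
    mapping = FIELD_MAPS[jurisdiction]
    for field in REQUIRED:
        if field not in mapping or mapping[field] not in raw:
            raise ValueError(f"{jurisdiction}: missing required field {field!r}")

    def optional_count(field: str) -> Optional[int]:
        # Fields a jurisdiction does not publish stay None instead of 0,
        # so "not reported" is never mistaken for "zero cases".
        source = mapping.get(field)
        if source is None or raw.get(source) is None:
            return None
        return int(raw[source])

    return DailyReport(
        jurisdiction=jurisdiction,
        report_date=str(raw[mapping["report_date"]]),
        positive=int(raw[mapping["positive"]]),
        negative=int(raw[mapping["negative"]]),
        hospitalized=optional_count("hospitalized"),
        deaths=optional_count("deaths"),
    )
```

The main design choice in this sketch is keeping unreported fields as `None`. A state that publishes no hospitalization figures is different from a state that reports zero hospitalizations, and the common record should preserve that difference.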

“It makes it quite difficult for us as epidemiologists to draw population-level conclusions when there's inconsistency in the data,” says Goldstein. “It really impacts our ability to assess the true scope of the pandemic as it is impacting the US.”

Some states and municipalities simply haven’t joined the 21st century. Emily Travanty, the scientific director of the Colorado State Public Health Laboratory, notes that her state allows results to be reported by fax, since some smaller clinics don’t have electronic means to do so. Aaron Miri, the chief information officer for Dell Medical School at the University of Texas at Austin, says it’s not all that uncommon to see some institutions using Excel instead of more modern software for electronic health records. Other states are simply dragging their feet.

Meanwhile, epidemiologists want to understand the prevalence of a disease in a community. Part of the work means adjusting numbers to account for erroneous results—moving false negatives into the positive column. But that depends on the health department to report what the measures of accuracy are so the scientists can make corrections. And those just aren’t always uniform across state lines, or even county lines—assuming they are reported at all. 
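The article does not say which correction epidemiologists apply. One standard textbook approach is the Rogan-Gladen estimator, which adjusts the observed share of positive tests using the test's sensitivity and specificity. The sketch below assumes those two accuracy figures are known and reported, which, as noted above, is not always the case. The function name and the numbers in the usage note are illustrative only.

```python
def adjusted_prevalence(positives: int, total: int,
                        sensitivity: float, specificity: float) -> float:
    """Rogan-Gladen estimate of true prevalence from raw test counts.

    sensitivity: share of truly infected people the test flags as positive.
    specificity: share of truly uninfected people the test flags as negative.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    denominator = sensitivity + specificity - 1.0
    if denominator <= 0:
        # A test this inaccurate is no better than chance; no correction exists.
        raise ValueError("sensitivity + specificity must exceed 1")
    apparent = positives / total
    estimate = (apparent + specificity - 1.0) / denominator
    # Sampling noise can push the raw estimate outside [0, 1]; clamp it.
    return min(1.0, max(0.0, estimate))
```

For example, with 100 positives out of 1,000 tests, a sensitivity of 0.8, and a specificity of 1.0, the apparent prevalence of 10 percent becomes an estimate of about 12.5 percent once missed infections are accounted for. If two counties use tests with different accuracy and only one of them publishes those figures, the same correction cannot be applied to both, which is the inconsistency described above.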

“You can interface with a single health department and work with their data and work with the experts there, but then just even going from one health department to another, it could be a very different scenario,” says Goldstein. Trying to understand trends across state lines can be a nightmare—and the virus does not discriminate on the basis of borders. 

New testing platforms will exacerbate these problems. There’s no guarantee at-home test results will be reported to the state unless people who take those tests go out of their way to engage with the health-care system. The varying accuracy of these tests means not all results carry the same weight—and ought not to be counted in the same way.

So how do we solve this mess? The good news is we don’t have to invent something new—we just have to stretch out the solutions we have. Many health-care systems already use a standard for managing and presenting health record data, called Fast Healthcare Interoperability Resources (FHIR). The CDC has a new FHIR-based tool that can automatically generate covid-19 case reports using new test results and electronic patient records. The tool could send those reports to health departments on its own, without human oversight. An initial version is already available.
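For readers unfamiliar with FHIR, the sketch below shows roughly what a single lab result looks like when expressed as a FHIR `Observation` resource in JSON. It is not the CDC tool's output, and a real case report bundles many more resources. The helper name `build_observation`, the patient ID, and the timestamp are hypothetical. The LOINC and SNOMED CT codes are given as examples to the best of our knowledge and should be verified against the current code systems before any real use.

```python
import json
from typing import Any, Dict


def build_observation(patient_id: str, detected: bool,
                      collected_at: str) -> Dict[str, Any]:
    """Build a minimal FHIR Observation for one SARS-CoV-2 test result."""
    # SNOMED CT qualifier values for a qualitative lab result (verify before use).
    result_code, result_text = (
        ("260373001", "Detected") if detected else ("260415000", "Not detected")
    )
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [{
                "system": "http://loinc.org",
                # Example LOINC code for a SARS-CoV-2 RNA test (verify before use).
                "code": "94500-6",
            }],
            "text": "SARS-CoV-2 RNA, respiratory specimen",
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        "effectiveDateTime": collected_at,   # ISO 8601 timestamp
        "valueCodeableConcept": {
            "coding": [{
                "system": "http://snomed.info/sct",
                "code": result_code,
                "display": result_text,
            }],
        },
    }


if __name__ == "__main__":
    observation = build_observation("example-patient", True, "2020-04-20T10:00:00Z")
    print(json.dumps(observation, indent=2))
```

Because every sender uses the same field names and code systems, a health department receiving records like this can count and compare results without the manual reconciliation described earlier.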

Erich Huang, the assistant dean for biomedical informatics at the Duke University School of Medicine, says it’s vital for different health-care computer systems to be able to work together. The nonprofit Logica has put out an open-source interoperability platform designed to make it easier to collate and organize all covid-related clinical data for all health systems. “I think people should embrace that, and start mapping their data into that format,” he says.

What is important, Miri emphasizes, is that these platforms be flexible enough to handle regular updates and changes: “As we continue to learn more about covid-19, we’re learning more about what critical data ought to be reported.” The CDC, for instance, just added 15 new elements to its covid-19 case reporting form. “We need to be ready to build the plane in flight, so to speak,” he says.

Other solutions are simpler. Donald Thea, a professor of global health at Boston University, suggests you could theoretically use a smart device that tracks PCR tests as they happen and sends results straight to the local or state health department as soon as they are in, without the need for a human to read that information and enter it manually. Any clinics that haven’t already adopted cloud-based electronic health records ought to do so—and if they’re nervous about such a drastic move at such a chaotic time, they could limit it to covid-19.

Eric Perakslis, a Rubenstein Fellow at Duke, led efforts in West Africa during the Ebola outbreak in 2014 and 2015. He says the biggest lesson he learned was that “getting on the ground sooner with something smaller was more valuable than waiting for something bigger later.” Even if you’re left trying to standardize testing data with pen and paper and just a few people, that’s better than holding out for a “killer app” to hit the market in several months. 

Most experts agree that in an ideal situation, the federal government would lead the management of public health data for a crisis like this—but that is highly unlikely to happen. Given that some models suggest we’ll need to implement response measures against the pandemic into 2022, though, Miri doesn’t think it’s too late to push forward a national initiative. It’s just a matter of funding such measures, and persuading states to accept more oversight. That’s not always an easy sell, but the pandemic’s effects may have softened state officials up a bit. 

“It may take us 24 to 36 months,” says Miri. “But if you don't start today, when the next thing hits we're still going to be sitting here like we did during Ebola—in which a portion of people have reasonable processes, and half are using Excel. We can't do this anymore. We have to get our act together.”
