On Tuesday, search giant Google released an experimental tool that tracks the intensity and movement of the influenza virus across the United States by monitoring the number of times that people search the Web using terms related to the disease.
The tool, known as Google Flu Trends, makes use of the fact that, before they go to the doctor’s office, many people will search for information about what ails them. Using aggregate data on the number of searches for terms such as “flu” and “flu symptoms,” software engineers from Google and researchers from the Centers for Disease Control and Prevention (CDC) found a strong link between these searches and reports from doctors of flu outbreaks a week to 10 days later.
“We found that there’s a very close relationship between the frequency of these search queries and the number of people who are experiencing flu-like symptoms each week,” Jeremy Ginsberg and Matt Mohebbi, both software engineers at Google, write in a blog post describing the work. “As a result, if we tally each day’s flu-related search queries, we can estimate how many people have a flu-like illness.”
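The relationship the engineers describe, in which search volume rises ahead of doctors’ reports, can be illustrated with a lagged correlation. The sketch below is not Google’s model. It assumes weekly totals, uses invented numbers, and the function names (`pearson`, `lagged_correlation`, `best_lag`) are hypothetical helpers chosen for illustration.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def lagged_correlation(searches, reports, lag):
    """Correlate searches in week t with doctor reports in week t + lag."""
    if lag < 0 or lag >= len(reports):
        raise ValueError("lag must be between 0 and len(reports) - 1")
    # Number of week pairs that overlap once reports are shifted by `lag`.
    n = min(len(searches), len(reports) - lag)
    return pearson(searches[:n], reports[lag:lag + n])


def best_lag(searches, reports, max_lag):
    """Return the lag (in weeks) at which the correlation is highest."""
    return max(range(max_lag + 1),
               key=lambda lag: lagged_correlation(searches, reports, lag))


# Invented weekly counts (assumption): reports trail searches by one week.
weekly_searches = [10, 15, 20, 30, 55, 80, 60, 35, 20, 15]
weekly_reports = [3, 4, 6, 8, 12, 22, 32, 24, 14, 8]

print(best_lag(weekly_searches, weekly_reports, max_lag=3))
```

If real data showed its highest correlation at a lag of about one week, that would match the lead time of a week to 10 days described above. A strong correlation on historical data does not by itself guarantee accurate estimates in future seasons.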
The result could allow the CDC to respond to outbreaks of influenza more than a week earlier than it could using intelligence based on trends in reports from doctors, the federal agency’s top flu tracker says. “Influenza has a very short incubation period,” says Lyn Finelli, lead for influenza surveillance at the CDC. “You can have a lot of transmission in a very short time, so the more warning you have, the more you can do to prevent an outbreak of the disease.”
The Google researchers worked closely with the CDC for more than a year to improve their model. The tool reveals a close match between historical increases in searches related to the flu and increases in reports from doctors, Finelli says.
The idea of mining data that people leave behind through their electronic existence is not new, says Nathan Eagle, a research scientist with the Media Lab at MIT. “This isn’t the first example of it, by any means,” Eagle says. “Virtually any type of trace that you, as a user or as a citizen, leave could be mined for all sorts of purposes.”
Eagle and other researchers have published studies that track students’ movement patterns using GPS to better understand how a disease can spread through a population. Another system, called HealthMap, plots news and blog reports related to different diseases on Google Maps. And Philip Polgreen, an assistant professor at the University of Iowa, recently published a paper that used Yahoo search data to correlate searches for flu symptoms with the incidence of reports of the disease. “Every day, people search for information on health on the Internet, and we thought that the pattern of searches could give some information about future expectation or current incidents,” Polgreen says.
Gunther Eysenbach, a health-policy professor with the University of Toronto, says that three years ago, he proposed to Google the idea of analyzing search data in the same way that Flu Trends does. However, he claims that he did not receive a response. Without direct access to Google’s data, Eysenbach decided to take out an AdWords advertisement on the site. The ad not only gave him access to search data for the terms “flu” and “flu symptoms,” but it also showed the number of people who clicked on the advertisements.
“If you count the number of clicks, then you can get a better prediction,” Eysenbach says. “You can weed out the number of people who have flu symptoms from the number of people who just hear about the disease from media reports.”
In a paper published in 2006, Eysenbach showed a strong positive link between Internet users’ searches and the incidence of influenza in a certain region, in this case Canada. While Eysenbach’s work is briefly mentioned in Google’s paper, the researcher says that he was “in a mild state of shock” that Google had decided not to collaborate with him.
In its announcement, Google tried to head off any concerns that the tool would impact users’ privacy, stressing that, while the research “aggregates hundreds of billions of individual searches,” the information is anonymized and therefore cannot be employed to track an individual user.
MIT’s Eagle, however, says that people should expect that their data will increasingly be used in the future. “At the end of the day, this type of data is a fact of life in the 21st century,” he says. “We could ignore it and pretend it doesn’t exist, or we could use this data, without compromising privacy, in ways that can help people.”