However, at the time, the CDC had a delay of up to two weeks in releasing public reports of flu caseloads. The agency is now rolling out new technology that will reduce that delay to one week. If the new technology works, Web-search flu predictions may not be any better than the CDC’s figures.
Philip Polgreen, an assistant professor of medicine at the University of Iowa, published a paper in 2008 that showed a correlation between Yahoo’s search data and official reports of the flu. Polgreen says a user’s intent is often difficult to figure out. For example, a search for an illness or a symptom doesn’t necessarily mean someone is sick; the searcher could be writing a research report on the topic.
An analysis released this spring by Justin Ortiz, a clinical fellow at the University of Washington, suggests that Google Flu Trends can overestimate the number of people getting sick from flu when there is heightened press coverage of the flu, such as during the 2009 H1N1 pandemic.
However, as more data become available, some researchers believe better predictions will be possible. “Over the next five to 10 years, I see more and more companies using this kind of nanodata, fine-grain data with hundreds of billions of observations, in their forecasting,” says Erik Brynjolfsson, director of the MIT Center for Digital Business.
Brynjolfsson says that Web queries provide the most accurate predictions in cases where people do research before they make a purchase. His research has shown that a rise in home sales can be predicted from Web search queries. Each percentage-point rise in the housing search index predicts sales of 121,400 additional houses in the next quarter.
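As a rough illustration of that arithmetic, the quoted figure can be read as a simple rule of thumb. The sketch below is not Brynjolfsson's model: the function name is hypothetical, and it assumes the relationship stays linear for changes other than one point, which the reported figure alone does not establish.

```python
# Figure quoted in the article: additional next-quarter home sales
# associated with a one-percentage-point rise in the housing search index.
HOUSES_PER_INDEX_POINT = 121_400


def extra_home_sales(index_change_points: float) -> float:
    """Estimate additional next-quarter home sales for a change in the search index.

    Assumption (not from the article): the effect scales linearly with the
    size of the index change.
    """
    return HOUSES_PER_INDEX_POINT * index_change_points


if __name__ == "__main__":
    # Print the implied estimate for a few hypothetical index changes.
    for change in (0.5, 1.0, 2.0):
        estimate = extra_home_sales(change)
        print(f"{change:+.1f} points: about {estimate:,.0f} additional houses")
```

Under that linear assumption, a two-point rise in the index would imply roughly twice the quoted figure, and a half-point rise about half of it.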
The Yahoo researchers say that the search data may be particularly useful when a small improvement in prediction accuracy could have a big impact, for example in the financial world.
Web searches may also be helpful for spotting sudden changes. For example, existing statistical models have difficulty telling when the popularity of a song rising up the Billboard charts will wane. But search queries can quickly spot this shift. These turning points can also be important in health, economics, and consumer research.