Future Gazing with Search Data

Search queries aren’t always better than traditional trend-spotting methods.
September 28, 2010

For the past few years, computer scientists have touted Web-search data as a way to spot emerging trends–from changes in housing prices and unemployment numbers to the next box-office hit or the location of the next flu epidemic. But research released today gives a more nuanced view of what these data are good at predicting, and why.

A group of researchers at Yahoo analyzed queries fed into the company’s search engine and found that such searches aren’t always the best way to spot a trend. They studied the volume of search queries related to particular movies, songs, and video games for up to six weeks before each was released. The total number of searches was highly correlated with the revenue a movie made on its opening weekend, the first-month sales of a video game, and the rank of a song on the Billboard chart.
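One common way to put a number on "highly correlated" is the Pearson correlation coefficient, which ranges from -1 to 1. The sketch below is only an illustration: the article does not say which statistic the Yahoo researchers used, and the function name `pearson_correlation` and its inputs (paired lists of pre-release search counts and later sales figures) are assumptions made for this example.

```python
from math import sqrt


def pearson_correlation(search_volumes, outcomes):
    """Pearson correlation between two equal-length lists of numbers.

    Hypothetical helper: `search_volumes` might hold pre-release query
    counts and `outcomes` the matching opening-weekend revenues.
    """
    if len(search_volumes) != len(outcomes) or len(search_volumes) < 2:
        raise ValueError("need two lists of equal length with at least 2 items")

    n = len(search_volumes)
    mean_x = sum(search_volumes) / n
    mean_y = sum(outcomes) / n

    # Sum of products of deviations from the means, and the two sums of
    # squared deviations.
    covariation = sum(
        (x - mean_x) * (y - mean_y) for x, y in zip(search_volumes, outcomes)
    )
    spread_x = sum((x - mean_x) ** 2 for x in search_volumes)
    spread_y = sum((y - mean_y) ** 2 for y in outcomes)

    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one list is constant")

    return covariation / sqrt(spread_x * spread_y)
```

A value near 1 would mean that titles with more searches tended to earn more. It would not, by itself, show that search data beat other predictors.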

The researchers then compared these results with those produced using traditional methods. For movies, they looked at the Hollywood Stock Exchange, a futures market for trading on the box-office revenue of upcoming titles, or at figures showing the number of theaters in which a movie would be screened. For games, the researchers examined ratings from critics. For songs, they looked at reviews as well as an artist’s current and previous ranks on the Billboard chart.

Search-based predictions fared only a little better than these methods, and were sometimes worse. The research is published today in the Proceedings of the National Academy of Sciences.

Search-based predictions were most accurate for new video games. This may be because of a lack of other data, says Jake Hofman, one of the Yahoo researchers involved. “The only early indicators of the quality of a nonsequel video game are reviews from critics,” says Hofman. In other words, search data work well there because traditional data are scarce. For both films and songs, search-based predictions offered no improvement over traditional methods.

In recent years, search queries have been promoted as a tool for trend-spotting. In 2008, Google researchers released a tool called Google Flu Trends for predicting how many people were getting sick with the flu in different places around the world, based on search queries for “flu,” “influenza,” and similar terms. They found that the tool could predict the likely number of cases in parts of the United States 10 days before the Centers for Disease Control and Prevention (CDC) could.

However, at the time, the CDC had a delay of up to two weeks in releasing public reports of flu caseloads. The agency is now rolling out new technology that will reduce that delay to one week. If the new technology works, Web-search flu predictions may not be any better than the CDC’s figures.

Philip Polgreen, an assistant professor of medicine at the University of Iowa, published a paper in 2008 that showed a correlation between Yahoo’s search data and official reports of the flu. Polgreen says a user’s intent is often difficult to figure out. For example, a search for an illness or a symptom doesn’t necessarily mean someone is sick–it could be that the searcher is writing a research report on the topic.

An analysis released this spring by Justin Ortiz, a clinical fellow at the University of Washington, suggests that Google Flu Trends can overestimate the number of people getting sick from flu when there is heightened press coverage of the flu, such as during the 2009 H1N1 pandemic.

However, as more data become available, some researchers believe better predictions will be possible. “Over the next five to 10 years, I see more and more companies using this kind of nanodata–fine-grain data with hundreds of billions of observations–in their forecasting,” says Erik Brynjolfsson, director of the MIT Center for Digital Business.

Brynjolfsson says that Web queries provide the most accurate predictions in cases where people do research before they make a purchase. His research has shown that a rise in home sales can be predicted from Web search queries. Each percentage-point rise in the housing search index predicts sales of 121,400 additional houses in the next quarter.
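The arithmetic behind that last figure can be sketched in a few lines. This is a simplification, not Brynjolfsson's model: it assumes the relationship is strictly linear, so the per-point figure scales to any change in the index. The function name `predicted_additional_home_sales` is made up for this illustration.

```python
# Figure quoted in the article: additional houses sold in the next quarter
# per one-percentage-point rise in the housing search index.
ADDITIONAL_SALES_PER_POINT = 121_400


def predicted_additional_home_sales(index_rise_in_points):
    """Scale the quoted per-point figure by the rise in the search index.

    Assumption (not from the article): the effect is linear, so a rise of
    k points predicts k times the per-point figure.
    """
    return ADDITIONAL_SALES_PER_POINT * index_rise_in_points


# Under that assumption, a two-point rise would predict 242,800 additional sales.
print(predicted_additional_home_sales(2))
```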

The Yahoo researchers say that the search data may be particularly useful when a small improvement in prediction accuracy could have a big impact–for example, in the financial world.

Web searches may also be helpful for spotting sudden changes. For example, existing statistical models have difficulty telling when the popularity of a song rising up the Billboard charts will wane. But search queries can quickly spot this shift. These turning points can also be important in health, economics, and consumer research.
