Yahoo Predicts America's Political Winners
The effort combines a variety of data-driven approaches.
Data scientists at Yahoo are using prediction markets—along with polls, sentiment analysis on Twitter, and trends in search queries—to create the mother of all political prediction engines. The project involves Web-based prediction markets like Intrade, in which large numbers of people bet on the outcomes of elections.
The researchers behind the project, David Rothschild, an economist at Yahoo Research, and Dave Pennock, a computer scientist at Yahoo Research, call it the Signal. They plan to produce data visualizations that best convey probability to a lay audience, and to publish research on machine learning and fundamental economic models based on the work.
They’ll get the public involved in all this political and mathematical wonkiness with fun and games. Drawing on Yahoo’s success with fantasy sports leagues, where the company hosts the biggest community on the planet, Rothschild and Pennock have created “Fantasy Politics,” in which users can bet on the outcomes of pretty much anything.
“We’re going to let people [bet on simple predictions] like ‘the Democrats will win California,’” says Pennock. “But if they want to geek out, they could bet on the odds that ‘the Democrats will win both Ohio and Florida,’ or ‘the Republicans will win Florida but lose the election.’”
Such bets will take Yahoo’s political prediction markets, which will roll out in spring, to a level of complexity and predictive power not seen elsewhere, says Pennock.
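To make the idea of a compound bet concrete, here is a minimal sketch of how the odds of such a bet could be read from a table of joint probabilities. The probabilities and the helper `bet_probability` are invented for illustration; they are not the Signal's data or Yahoo's market mechanism.

```python
# Hypothetical joint probabilities for the winner of (Ohio, Florida).
# These numbers are illustrative assumptions, not output from the Signal.
JOINT = {
    ("D", "D"): 0.35,
    ("D", "R"): 0.15,
    ("R", "D"): 0.10,
    ("R", "R"): 0.40,
}

def bet_probability(joint, predicate):
    """Sum the probability of every outcome in which the bet pays off."""
    return sum(p for outcome, p in joint.items() if predicate(outcome))

# Compound bet: "the Democrats will win both Ohio and Florida".
both_dem = bet_probability(JOINT, lambda o: o == ("D", "D"))

# Simple single-state bets, for comparison.
ohio_dem = bet_probability(JOINT, lambda o: o[0] == "D")
florida_dem = bet_probability(JOINT, lambda o: o[1] == "D")

print(f"Democrats win both Ohio and Florida: {both_dem:.3f}")
print(f"Product of the two single-state odds: {ohio_dem * florida_dem:.3f}")
```

In this made-up table the two states tend to move together, so the compound bet (0.35) is more likely than multiplying the single-state odds (0.225) would suggest. That gap is the kind of extra information a market in compound bets could, in principle, reveal.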
The Signal will use these markets and other real-time data streams. The prediction markets run by the Signal will be polled constantly, as will the results of analysis of Yahoo search queries and Twitter.
Sentiment analysis, or the effort to automatically determine how people feel about something based on how they communicate about it, is “at an infant stage,” says Rothschild, but it can provide insight that no poll can match.
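As a rough illustration of the general idea, and not of the Signal's actual method, the sketch below scores short messages by counting words from two small hand-made lists. The word lists, the sample messages, and the function `sentiment_score` are all hypothetical.

```python
import re

# Tiny hand-made word lists; hypothetical and far too small for real use.
POSITIVE = {"great", "strong", "win", "love", "support"}
NEGATIVE = {"weak", "lose", "hate", "bad", "oppose"}

def sentiment_score(text):
    """Return a score in [-1, 1]: (positive - negative) / matched words."""
    words = re.findall(r"[a-z']+", text.lower())
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    # Messages with no listed words are treated as neutral.
    return 0.0 if total == 0 else (pos - neg) / total

# Invented sample messages.
tweets = [
    "Strong debate tonight, I support him",
    "Bad answer on the economy, he will lose",
    "Watching the debate now",
]
scores = [sentiment_score(t) for t in tweets]
print(scores)
```

A word-count scorer like this misses sarcasm, negation, and context, which is one reason the field can fairly be described as being at an early stage.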
One limitation of most polls is that they’re binary—they ask whether a vote will go one way or the other. Sentiment data, on the other hand, can tell politics watchers precisely why a candidate’s numbers are up or down. One example Rothschild cites is Rick Santorum’s fluctuating poll numbers. By tracking Twitter sentiment and search data, Rothschild and Pennock found evidence that this reflected a shift in focus: the public was at first interested in the candidate’s stand on homosexuality and race. Later on, voters were more likely to search for information about his economic policies.
“This can happen in hours or days, on a time scale you can’t see in polling,” says Rothschild. It’s possible that sentiment analysis will even yield insights into whether a candidate or an issue has staying power or is merely a flash in the pan. Prediction markets already seem better than polls at taking the longevity of a trend into account: while Cain, Perry, and Trump all took turns at the top of polls, Intrade and other prediction markets monitored by the Signal consistently had Romney coming out ahead.
So what do the brains behind the Signal predict for Saturday’s primary in South Carolina?
“The race is pretty much over,” says Rothschild. “At this point, Romney has over a 90 percent likelihood of winning the nomination, and of winning the South Carolina primary.”
And what about the general election?
“[The odds of Obama winning] have been consistently inching up slowly as the Republican primary has hit a crescendo over the past two months,” says Rothschild. Pennock, who crunches the numbers for the team, says their latest result puts the chances for Obama’s victory at 52.9 percent.