Emerging Technology from the arXiv

Now Wikipedia Used to Predict Movie Box Office Revenues

It’s not only Twitter that can predict opening weekend box office revenues for movies. Wikipedia data has a similar prescience, say computational social scientists.

  • November 7, 2012

Social media services such as Twitter provide a fascinating insight into the collective mind and mood.  Computational social scientists say they can see evidence in the Twitter stream of all kinds of events in real time. Among these phenomena are traffic jams, rainbows and even earthquakes. It’s not hard to imagine a service that updates people about these things as they happen.

Much harder is the ability to make forecasts. However, various groups say they can use the Twitter stream to predict the outcome of elections, stock market prices and future box office revenues.


Today, Marton Mestyan from the Budapest University of Technology and Economics in Hungary, and a couple of pals, say that the patterns of behaviour on Wikipedia can lead to similar predictions. In particular, they use this behaviour to predict box office revenue of movies a month before they are released. 

These guys examined the Wikipedia entries for 312 movies released in 2010. They looked in particular at the number of viewers, the number of human editors, the number of edits and another factor related to editing called the collaborative rigor.

They then measured the correlation between this data and the success of each movie, as measured by box office revenue over the release weekend.
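For readers who want to see what measuring such a correlation can look like, here is a minimal sketch using the Pearson correlation coefficient. This is an assumption for illustration: the article does not say which statistic Mestyan and co used, the `pearson` helper is hypothetical, and the numbers are invented rather than taken from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations from the means (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Invented numbers for illustration only; NOT data from the paper.
page_views = [1200, 5400, 300, 9800, 2500]      # hypothetical Wikipedia page views
opening_revenue = [2.1, 9.5, 0.4, 20.3, 3.8]    # hypothetical revenue, $ millions

r = pearson(page_views, opening_revenue)
print(f"Pearson r = {r:.3f}")
```

A value of r near 1 means the two quantities rise together, a value near 0 means there is no linear relationship, and a value near -1 means one falls as the other rises.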

Their results show that the combined activity measures on Wikipedia are highly correlated with box office revenue when movies are particularly successful.

“We show that the popularity of a movie could be predicted well in advance by measuring and analyzing the activity level of editors and viewers of the corresponding entry to the movie in Wikipedia,” say Mestyan and co.

There is a caveat, however. Wikipedia is not so good at predicting box office revenues for less successful movies. 

There’s an interesting factor related to this. One feature of the box office data is that it is bimodal: it has two peaks. So lots of movies are successful and lots of movies are moderately successful. In between there is a trough.

None of the Wikipedia activity measures have this kind of twin-peaked behaviour. So it’s not really a surprise that they correlate with only some of the box office data.

It may be that earlier Twitter-based studies show the same limitation. Mestyan and co say that Twitter only works for the most successful movies as well, and only after the opening night.

By contrast, Wikipedia data seems to correlate with some box office data up to a month before release.

That’s interesting because it extends the sphere of social media predictions beyond Twitter and into Wikipedia.

But this work lacks a resounding slam dunk. Various researchers have thrown cold water on the idea that social media can make useful predictions about the future.

Indeed, it’s all very well using historical data to make ‘predictions’ about the past. What would be far more impressive is to use current data to make predictions about the future.

And until Mestyan and co can do that in a repeatable, unambiguous way, it’ll be hard to think of this work, and others like it, as much more than a curious correlation. 

Ref: arxiv.org/abs/1211.0970: Early Prediction of Movie Box Office Success based on Wikipedia Activity Big Data
