MIT Technology Review

 


Some things—fog in San Francisco or traffic in New York City—are easy to predict. Others, such as the way a stock market will react to big trades, or the progression of an HIV patient’s illness, are far more complicated. That’s where a startup called Kaggle comes in. It organizes contests in which participants attempt to make seemingly impossible predictions by analyzing mountains of data.

Kaggle corrals thousands of people with backgrounds in data science, including PhDs, graduate students, professors, and people who work at companies such as IBM and Google, offering them the chance to compete to solve companies’ big-data conundrums in exchange for cash. Users take data provided by contest sponsors and compete using custom-made algorithms to find patterns and make the most accurate predictions. You might think of it as a predictive-modeling death match.

Created by Australian economist Anthony Goldbloom, Kaggle was inspired partly by a competition Netflix held from 2006 to 2009. The company offered $1 million to the team that could improve the accuracy of its movie-recommendation software by 10 percent.

The popularity of the Netflix competition showed Goldbloom how many people were interested in working on companies’ data-related conundrums. His 2008 internship at The Economist had exposed him to plenty of companies that held data worth mining for valuable insights but lacked the right people to study it.

He bet there was room for a company that would bring these two sides together, and figured that giving it a competitive twist would provide better results.

He was on to something. Since launching in April 2010 with a prize of $1,000 for the team that could most accurately predict how countries would vote in the annual Eurovision Song Contest, Kaggle has run 30 different competitions, five of which are still in progress.

And Kaggle’s community, which has grown to about 27,000 people, is getting results. In one early challenge, a Drexel University academic provided anonymized HIV patient records containing genetic marker data that he hoped could be used to predict the progression of the virus. Within a week and a half, Kaggle users could predict that progression with 70 percent accuracy when their predictions were compared with known data, a milestone that academic research had reached only after four years of effort. By the end of the three-month competition, site users had created a model that reduced the previous error rate by about a third and increased the accuracy of predictions to 77 percent.
