In 2006, Netflix launched an unusual, and highly successful, competition designed to improve its recommendation system. It released a database of 100 million movie and TV show ratings from nearly 500,000 users and, in 2009, awarded the $1 million jackpot to the first team to increase the accuracy of its own movie recommendation algorithm by more than 10 percent.
Those three years seem like ages ago today. The rapid rise of streaming content—with Netflix customers alone viewing one billion hours of content this June—has exploded the amount and types of data available to the company’s data science team. Whether Netflix uses that data wisely could be crucial to its future business, as it faces growing competition from streaming sites like Amazon and Hulu.
Netflix is already reinventing its recommendation formulas and personalized home pages to suit its streaming business, says Netflix vice president of product innovation Todd Yellin, who spoke on an IBM Research panel on Friday. He estimates that, since Netflix began streaming content, the amount of data Netflix holds about the average user has increased tenfold, especially as people view more hours of content than they did when DVDs primarily arrived in the mail.
Real-time data also gives new, detailed insights into people’s actual movie-watching habits. For example, a person might never get around to rating a movie, but Netflix might guess it wasn’t a winner if the person never made it to the end. Families often share Netflix accounts, so Netflix might recommend different movies, or change its personalized home page design, based on whether it guesses a husband, a wife, or their child is watching at that time of day.
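To make the first of those ideas concrete, here is a minimal Python sketch of turning how much of a title someone watched into a rough preference guess. It is an illustration only, not Netflix's method: the `ViewingEvent` type, the `implicit_signal` function, and the 30 percent and 90 percent cutoffs are all hypothetical choices made for this example.

```python
from dataclasses import dataclass


@dataclass
class ViewingEvent:
    """One streaming session for one title (hypothetical structure)."""
    title: str
    minutes_watched: float
    runtime_minutes: float


def implicit_signal(event: ViewingEvent,
                    abandon_below: float = 0.3,
                    finish_above: float = 0.9) -> str:
    """Guess a preference label from the fraction of the title watched.

    The thresholds are assumptions for illustration, not published values.
    """
    if event.runtime_minutes <= 0:
        raise ValueError("runtime_minutes must be positive")
    # Cap at 1.0 so rewatched scenes cannot push the ratio above 100 percent.
    ratio = min(event.minutes_watched / event.runtime_minutes, 1.0)
    if ratio >= finish_above:
        return "likely_liked"
    if ratio < abandon_below:
        return "likely_disliked"
    return "unclear"


if __name__ == "__main__":
    # A viewer who quit 20 minutes into a two-hour movie.
    print(implicit_signal(ViewingEvent("Example Movie", 20, 120)))
```

A real system would presumably treat such a signal as one noisy input among many rather than as a verdict, since people also stop watching for reasons that have nothing to do with the movie.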
Netflix is also adding a social element. Yellin says that Netflix is now integrating with Facebook, though only outside the U.S., so that, as on the popular music service Spotify, people can share what they are watching. That could help Netflix tap data related to one common reason people watch movies or TV shows: for “watercooler currency” to discuss with colleagues and friends.
All of this means the improvements yielded by the original Netflix Prize aren’t as relevant today as they were in 2009. “We still use some of the learning…but it’s not at the forefront,” Yellin says. “The most relevant data is what people are actually watching.”
There are fresh challenges with streaming data, such as deciding what new factors to emphasize when presenting recommendations. For example, should Netflix weigh time of day more or less than the type of viewing device?
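The weighting question can be pictured with a small Python sketch. This is a hypothetical illustration, not a description of Netflix's system: the `context_score` function, the factor names, and the numbers are assumptions chosen only to show how shifting weight between time of day and device changes a title's score.

```python
def context_score(matches: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-factor match values, each between 0 and 1.

    `matches` says how well a title fits each contextual factor;
    `weights` says how much each factor counts. Both are hypothetical.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    weighted = sum(w * matches.get(factor, 0.0) for factor, w in weights.items())
    return weighted / total_weight


if __name__ == "__main__":
    # A title that fits the current time of day but not the current device.
    matches = {"time_of_day": 1.0, "device": 0.0}

    # Emphasizing time of day ranks the title higher...
    print(context_score(matches, {"time_of_day": 2.0, "device": 1.0}))
    # ...than emphasizing the device does.
    print(context_score(matches, {"time_of_day": 1.0, "device": 2.0}))
```

In practice such weights would more plausibly be learned from viewing data than set by hand, which is part of what makes the choice a research problem.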
But a new Netflix Prize is unlikely to happen.
“We were a smaller company then,” Yellin says, and there are certain privacy risks to sharing user data with the public research community, even if it’s anonymized. “We are open to the idea, but there are no plans in the near future.”