Top scientists from companies such as Google and Yahoo are gathered alongside leading academics at the 17th Association for Computing Machinery (ACM) conference on Knowledge Discovery and Data Mining (KDD) in San Diego this week. They will present the latest techniques for wresting insights from the deluge of data now being produced, and for making sense of information that comes in a wider variety of forms than ever before.
Twenty years ago, the only people who cared about so-called “big data”—the only ones who had enormous data sets and the motivation to try to process them—were members of the scientific community, says Usama Fayyad, executive chair of ACM’s Special Interest Group on Knowledge Discovery and Data Mining and former chief data officer at Yahoo. Even then, the results of data mining were impressive. “We were able to solve significant scientific problems that were standing in the field for 30-plus years,” Fayyad says.
The explosive growth of the Internet, however, changed everything. Whether they liked it or not, businesses found themselves operating online and amassing enormous volumes of data about customers and their behavior. As the power of data mining became clear, Fayyad says, so did economic motivations to invest in the field.
Netflix, for example, offered a $1 million prize to any team that could mine its information about users and build a more accurate recommendation system than the one it already had. High-profile examples like this only scratch the surface of the applications for data mining.
“Businesses and industry are increasingly interested in leveraging the data they capture through business processes,” says Chid Apte, director of analytics research at IBM and chair of the conference. In particular, he points to health care, social media, and anything that takes place on the Web.
These days, Internet giants make their money from the information they collect about users and the insights they gain from mining it. Retailers can analyze complex patterns of shopper behavior to help them stock their stores more profitably. Industry researchers can predict automobile traffic patterns based on congestion, weather, and time of year, and suggest the best routes.