Digg, a popular social bookmarking website, began rolling out a recommendation engine late last week. The design of this recommendation engine, however, is quite different from that of the engines used by companies such as Amazon. While e-commerce sites tend to derive recommendations from a mix of information about users’ browsing and purchasing habits and information about the items for sale, Digg’s system, much like the site itself, places its trust in the wisdom of crowds.
Digg has built up a reputation for helping users find interesting stories among the flood of new information that’s constantly posted on the Internet. Users submit interesting items to the website, and then other users “digg” the stories they like and “bury” those they don’t. The most popular stories make their way to Digg’s front page.
Digg has grown considerably since its launch in 2004, and that growth has created a serious problem for the site and its users. It’s nearly impossible for an interested user to sort through the more than 15,000 stories that are now submitted every day, and it’s therefore difficult for many users to participate in voting on which stories should make the front page. Anton Kast, Digg’s chief scientist, hopes that the recommendation engine will solve this problem. By highlighting the new stories that a user might like, he says, it makes it easier for that user to manage the flow of submitted stories. “You get to see stuff you might be interested in, and you get to contribute in a way that’s more effective than it would have been otherwise,” Kast points out.
But Digg’s character, he says, calls for the design of an unorthodox recommendation engine. “It’s not a magic oracle,” says Kast. “It’s not that we’re saying that the computer is smarter than you, or that we know what you want, or we know who you are.” Instead of using the characteristics of articles to run its recommendation engine’s algorithms, Digg’s system is based entirely on calculating connections between users.
Every time a user diggs a story, the system compares that action with the actions of everyone else in the system and finds which users have the most diggs in common. To keep recommendations from being all over the map, the system calculates connections for each topic separately, so that two users who share an interest in video games won’t necessarily be assumed to have similar opinions on, say, political stories. To keep recommendations diverse, the system shows only a certain number of stories from each compatible user and, each time the user requests recommendations, fills out the rest of its quota of suggestions with stories selected by less compatible users. The recommendation engine also limits the effect that a single digg can have, so that someone who diggs a very popular story won’t suddenly become connected to thousands of other users. Because the system calculates correlations in real time, using separate servers devoted to performing the computations, Kast says that a new digg will affect the recommendation system within one or two minutes.
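To make the mechanics concrete, here is a minimal sketch in Python of the kind of user-to-user scheme described above. It is an illustration, not Digg's actual code. The function names, the sample data, the per-user cap of two stories, and the logarithmic damping of popular stories are all assumptions; Digg has not published how it limits the weight of a single digg, and the sketch ignores the real-time, multi-server side of the system.

```python
from collections import defaultdict
from math import log


def user_similarity(diggs, topic_of, user):
    """Score other users by diggs shared with `user`, one table per topic.

    diggs:    dict of user name -> set of story ids that user dugg
    topic_of: dict of story id -> topic name
    Returns {topic: {other_user: score}}.
    """
    # Invert the data: which users dugg each story?
    fans = defaultdict(set)
    for u, stories in diggs.items():
        for s in stories:
            fans[s].add(u)

    scores = defaultdict(lambda: defaultdict(float))
    for s in diggs.get(user, ()):
        # Assumed damping: the more users dugg a story, the less a shared
        # digg on it counts toward any one connection.
        weight = 1.0 / log(1 + len(fans[s]))
        for other in fans[s]:
            if other != user:
                scores[topic_of[s]][other] += weight
    return scores


def recommend(diggs, topic_of, user, topic, per_user=2, limit=5):
    """Pick up to `limit` unseen stories in `topic` for `user`.

    At most `per_user` stories come from any one compatible user, so the
    remaining slots are filled by progressively less compatible users.
    """
    scores = user_similarity(diggs, topic_of, user).get(topic, {})
    ranked = sorted(scores, key=lambda u: (-scores[u], u))
    seen = diggs.get(user, set())
    picks = []
    for other in ranked:  # most compatible first
        candidates = sorted(
            s for s in diggs[other]
            if topic_of[s] == topic and s not in seen and s not in picks
        )
        for s in candidates[:per_user]:
            picks.append(s)
            if len(picks) == limit:
                return picks
    return picks


# Hypothetical sample data: three users, two topics.
diggs = {
    "ann": {"g1", "g2", "p1"},
    "bob": {"g1", "g2", "g3", "g4", "g5"},
    "cat": {"g1", "g6", "p1", "p2"},
}
topic_of = {s: "gaming" for s in ("g1", "g2", "g3", "g4", "g5", "g6")}
topic_of.update({"p1": "politics", "p2": "politics"})

print(recommend(diggs, topic_of, "ann", "gaming", per_user=2, limit=3))
# prints ['g3', 'g4', 'g6']
```

In this toy data, bob is ann's closest match on gaming stories but has no connection to her on politics, which mirrors the per-topic separation described above. Only two of bob's three unseen stories are used before the remaining slot goes to cat, the less compatible user.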