When Google launched Buzz, a microblogging social network, several months ago, the company boasted that the network had been generated automatically, by algorithms that could connect users to each other based on communications revealed through Gmail and other services.
However, many users balked at having what they perceived as mischaracterized social connections, forcing the company to frantically backpedal and make the Buzz service less automated and more under users’ control.
This incident notwithstanding, many companies are increasingly interested in automatically determining users’ social ties through e-mail and social network communications. For example, IBM’s Lotus division offers a product called Atlas that constructs social data from corporate communications, and Microsoft has investigated using such data to automatically prioritize the e-mails that workers receive. But researchers say many problems remain unsolved in generating and analyzing social networks based on patterns of communication.
In a paper presented recently at the WWW2010 conference in Raleigh, NC, a group of researchers from Yahoo pointed out that before it’s possible to construct an accurate picture of a social network, researchers have to do a better job of defining what it takes for two people to be connected. Should two people be considered friends if they’ve exchanged e-mails once? Or should it take 10 exchanges before their connection counts?
“You don’t get to directly observe relationships, you get to observe communication events,” says Jake Hofman, a researcher in Yahoo Research’s social dynamics group, who was involved with the work. Algorithms will infer dramatically different social network structures depending on how they interpret these communication events, and different structures may suit different purposes. For example, a network based on relatively infrequent communications might turn out to work well for sharing tagged news items. More frequent communications might work better for networks designed for sharing more intimate information.
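To make the effect of the cutoff concrete, here is a minimal sketch in Python. It is not the Yahoo researchers' method; the event log, the names, and the `build_network` helper are all hypothetical, and the only assumption is that each communication event is recorded as a (sender, recipient) pair.

```python
from collections import Counter


def build_network(events, min_exchanges):
    """Return the undirected ties backed by at least min_exchanges messages.

    `events` is a list of (sender, recipient) pairs. Direction is ignored,
    so a message from ann to bob and a reply from bob to ann both count
    toward the same tie. Messages sent to oneself are skipped.
    """
    counts = Counter(
        frozenset(pair) for pair in events if pair[0] != pair[1]
    )
    return {tie for tie, n in counts.items() if n >= min_exchanges}


# Hypothetical message log: ann and bob exchange three messages,
# bob and dan exchange two, and ann writes to cat once.
events = [
    ("ann", "bob"), ("bob", "ann"), ("ann", "bob"),
    ("ann", "cat"),
    ("bob", "dan"), ("dan", "bob"),
]

loose = build_network(events, min_exchanges=1)   # any contact counts
strict = build_network(events, min_exchanges=3)  # repeated contact only

print(len(loose), "ties at threshold 1;", len(strict), "at threshold 3")
```

The same six messages yield three ties under the loose rule and only one under the strict rule, which is the kind of divergence Hofman describes.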
“For the most part, the thresholds we set [for automatically generating social networks] are arbitrary,” says Lada Adamic, an assistant professor in the School of Information and the Center for the Study of Complex Systems at the University of Michigan. Adamic notes that there are questions beyond those raised by the Yahoo paper. For instance, she says, most algorithms define networks simplistically: people are either connected or not, with no way to indicate the gray areas common in real life.
She says it’s possible to keep refining the algorithms, but there will always be errors because the data available won’t capture the whole pattern. For example, two people might not e-mail each other, but they may talk regularly over the phone or in person.
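One way to represent the gray areas Adamic mentions is to give each tie a strength instead of a yes-or-no value. The sketch below is only an illustration under assumed inputs: the `tie_strengths` helper, the sample log, and the choice to scale message counts by the busiest pair are hypothetical, not taken from any of the systems described here.

```python
from collections import Counter


def tie_strengths(events):
    """Map each undirected tie to a strength between 0 and 1.

    Strength is the pair's message count divided by the count of the
    most active pair. This scaling is an arbitrary illustrative choice.
    """
    counts = Counter(
        frozenset(pair) for pair in events if pair[0] != pair[1]
    )
    if not counts:
        return {}
    busiest = max(counts.values())
    return {tie: n / busiest for tie, n in counts.items()}


# Hypothetical message log of (sender, recipient) pairs.
log = [
    ("ann", "bob"), ("bob", "ann"), ("ann", "bob"),
    ("ann", "cat"),
    ("bob", "dan"), ("dan", "bob"),
]

strengths = tie_strengths(log)
for tie, strength in sorted(strengths.items(), key=lambda item: -item[1]):
    print(" - ".join(sorted(tie)), round(strength, 2))
```

A weighted network like this still sees only the channels that were logged, so two people who talk mainly by phone or in person would appear weakly tied or not tied at all.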