To give the algorithm a starting point, the researchers also need to identify a few users in the anonymized social-network graph. But they say that this is easy to do on many social networks. Some Facebook users, for example, choose to make their profiles public, and an attacker could use those profiles as the starting point. In their experiments, the researchers found that they needed to identify as few as 30 individuals in order to be able to run their algorithms on networks of 100,000 users or more.
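To make the role of those starting points concrete, here is a rough sketch of how a handful of known identities could spread through a graph. It is a simplified illustration and not the researchers' actual algorithm: the function names `adjacency` and `propagate`, the toy graphs, and the scoring rule (count how many already-identified neighbors two nodes share, and accept a match only when there is a single clear winner) are all assumptions made for this example.

```python
def adjacency(edges):
    """Build an undirected graph as {node: set_of_neighbors} from edge pairs."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph


def propagate(anon, aux, seeds):
    """Extend a seed mapping from anonymized nodes to known identities.

    anon:  anonymized graph, {node: set_of_neighbors}
    aux:   graph with known identities (e.g. public profiles), same shape
    seeds: {anon_node: aux_node} for the few users identified up front
    Hypothetical, simplified scoring; not the published method.
    """
    mapping = dict(seeds)
    changed = True
    while changed:  # keep sweeping until a full pass adds nothing new
        changed = False
        used = set(mapping.values())
        for node, neighbors in anon.items():
            if node in mapping:
                continue
            # Identities of this node's neighbors that are already known.
            known = {mapping[n] for n in neighbors if n in mapping}
            if not known:
                continue
            # Score each unused candidate by how many of those identities
            # are also its neighbors in the auxiliary graph.
            scores = {}
            for cand, cand_neighbors in aux.items():
                if cand in used:
                    continue
                overlap = len(known & cand_neighbors)
                if overlap:
                    scores[cand] = overlap
            if not scores:
                continue
            ranked = sorted(scores.values(), reverse=True)
            # Accept only an unambiguous best match; ties are skipped
            # and may be resolved on a later sweep.
            if len(ranked) == 1 or ranked[0] > ranked[1]:
                best = max(scores, key=scores.get)
                mapping[node] = best
                used.add(best)
                changed = True
    return mapping


# Toy data: the same five-person network, once with numeric IDs
# (the "anonymized" release) and once with names (the attacker's view).
anon_graph = adjacency([(1, 2), (1, 3), (2, 3), (2, 4), (3, 5)])
named_graph = adjacency([
    ("alice", "bob"), ("alice", "carol"), ("bob", "carol"),
    ("bob", "dave"), ("carol", "erin"),
])
result = propagate(anon_graph, named_graph, {1: "alice", 2: "bob"})
```

In this toy case, two seeds are enough to label the remaining three nodes, because each newly identified user gives the attacker more known neighbors to match against. Real networks are noisier, and the two graphs would not line up this neatly.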
The researchers add that the algorithm uses the smallest amount of information feasible and that, in practice, a determined snoop would be able to find much more. “This attack would have been much, much stronger if we’d actually used information that is typically left after [names and addresses] have been removed,” says Shmatikov. “So we’re really showing how the bare minimum is enough.”
“It’s important research,” says Alessandro Acquisti, an associate professor of information technology and public policy at Carnegie Mellon University and an expert on privacy online. The research highlights how data that might not seem important can actually provide an attacker with the means to uncover truly sensitive information, Acquisti says. For example, the algorithm could theoretically employ the names of a user’s favorite bands and concert-going friends to decode sensitive details such as sexual orientation from supposedly anonymized data. Acquisti believes that the result paints a bleak picture for the future of online privacy. “There is no such thing as complete anonymity,” he says. “It’s impossible.”
Shmatikov does not think that there is a technical solution to the problem. He suggests that privacy laws and corporate practices may need to be changed to recognize that there’s no way to anonymize social-network data. Users should also be able to decide whether to allow their data to be shared in the first place, Shmatikov says.