The Cambridge Analytica affair reveals Facebook’s “Transparency Paradox”

Sinan Aral, a professor at MIT, fears the fallout from the scandal could limit researchers’ access to social networks’ data.

Martin Giles

March 20, 2018

Meissa Hampton

The scandal surrounding Cambridge Analytica, the data-mining company that Facebook has banned from its platform for violating data use policies, gets bigger by the day. On Monday, a UK television program aired undercover footage of Alexander Nix, the company’s CEO, boasting about how it could use spies, entrapment techniques, and fake news to influence elections. Britain’s information commissioner, Elizabeth Denham, said she would seek a court order to look at Cambridge Analytica’s servers and databases.

The same day, Facebook said it had hired forensic auditors to examine whether Cambridge Analytica had held on to data allegedly passed to it by an academic at Cambridge University, Aleksandr Kogan, who gained access to Facebook’s platform for research purposes several years ago. Kogan, who used that access to glean information about 50 million Facebook users, was also blocked from the platform, as was Christopher Wylie, a former Cambridge Analytica employee turned whistleblower.

The post-mortem into the affair, and how the data were used in Cambridge Analytica’s efforts to help Donald Trump win the 2016 US presidential election, is in full swing. It will add to the turmoil at Facebook, which lost $37 billion in market cap during the day’s trading and may also lose Alex Stamos, its chief security officer, who has reportedly clashed with the network’s leadership over how transparent Facebook should be about Russia’s use of the platform to spread disinformation.

An important question is whether all this could harm researchers’ efforts to shed more light on the immense influence that social networks now have over our lives. To explore the issue, we spoke with Sinan Aral, a social-media expert who is a professor at MIT’s Sloan School of Management.

Were you surprised that a researcher could access so much data and allegedly pass it to a third party in violation of Facebook’s data use rules?

I wasn’t surprised they could access that much data. Facebook’s been pursuing research questions with qualified researchers for some time. What is surprising is that an academic researcher could so flagrantly violate the spirit and the terms of the data-sharing policies Facebook has in place by taking that data and giving it to a firm that was never authorized to have it in the first place for the purposes of political targeting.

Are you very concerned that this episode could have a chilling effect on social networks’ willingness to share data with researchers?

Yes, I am. Facebook is facing what I call a “transparency paradox.” On the one hand, it’s under tremendous pressure to be more transparent, to reveal more about how targeted advertising works; how its News Feed algorithms work; how its trending algorithms work; and how Russia or anyone else can spread propaganda and false news on the network. So there’s this very strong pressure to be more transparent and to share data with trusted third parties. But on the other hand, there’s really strong pressure to increase the security of the data that they do reveal to make sure that it doesn’t get into the wrong hands and to protect users’ privacy.

This transparency paradox is at the core of Facebook’s existential crisis today, and there’s a real risk that the Cambridge Analytica story will make it more conservative in what it shares, which would affect the research of hundreds of good scientists who are working with the social network every day without breaching its terms of service in order to understand how Facebook is affecting our society.

Is the current ad hoc process for giving researchers access to social-network data, which is governed by nondisclosure agreements, working well, or could it be improved?

I think each request needs to be individually vetted by Facebook and other networks. I wouldn’t want them to give blanket access to just certain narrow kinds of data. There are a lot of different aspects of Facebook that are important to understand, and it needs to be able to consider the costs and benefits of each proposed data release.

One suggestion that’s been floated is to create a portal where registered researchers could query anonymized data from social-network users who have consented to share their information. What do you think of that idea?

That kind of portal could be useful, but it’s only one very small solution to the transparency paradox. There needs to be much more openness than just a single set of data. There need to be lots of different ways that Facebook and other networks work with the scientific community to help us all understand how social media is affecting our democracy, our businesses, and our public health.

Is the Cambridge Analytica story a signal that we should be more broadly concerned about the power of these platforms and what they are doing with our data?

I think the story is about a researcher who flagrantly violated the likely terms of any data-sharing agreement he had with Facebook for research purposes and the company, Cambridge Analytica, that either knowingly or unknowingly used the data for potentially nefarious purposes without vetting the source of that data and any restrictions associated with it. That’s the real story here.

We need to better understand the threat of bad actors who may use access to data to help them spread fake news or propaganda on social platforms. And the only way we’re going to get a handle on that is if Facebook can find a way to resolve its transparency paradox effectively by becoming more open and more secure at the same time.
