Scrutinizing Facebook Spam

Researchers downloaded 3.5 million profiles to see how accounts are used to send out spam.

Tom Simonite

October 22, 2010

A study that involved downloading more than three million Facebook profiles has provided the largest-ever snapshot of the methods used by spammers on the world’s biggest online social network.

The study, led by researchers at Northwestern University, turned up hundreds of thousands of spam messages, most of which were sent by compromised user accounts in coordinated campaigns similar to those carried out by e-mail spammers.

“For normal users, it mostly remains a myth,” says Yan Chen of Northwestern, whose team led the study, “but spam has been a big problem to Facebook.”

Reports of user credentials being sold online also motivated the researchers, says Ben Zhao of the University of California, Santa Barbara, who contributed to the study along with a colleague. The work will be presented at the Internet Measurement Conference in Melbourne, Australia, next month.

Zhao’s group had previously collected a dataset of around 11 million Facebook profiles by exploiting a now-discontinued Facebook feature that caused people belonging to regional “networks” to share their profile information with other users by default. Three months’ worth of data, collected in mid-2009 and representing around 3.5 million people, were used in the study.

The researchers searched for spam in 190 million wall posts–messages posted on one user’s profile page by another user–by hunting for Web addresses, even if those addresses were deliberately obscured. Wall posts were grouped into clusters containing the same Web addresses before the malicious clusters were separated from those not sharing spam links by screening the addresses using Web security services.
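The search-and-cluster step described above can be illustrated with a minimal Python sketch. It is not the researchers' code: the regular expression, the single "dot" de-obfuscation rule, the function names, and the blocklist that stands in for the Web security services are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Hypothetical pattern: an optional scheme and "www.", a host name, an optional path.
URL_RE = re.compile(
    r"(?:https?://)?(?:www\.)?([a-z0-9-]+(?:\.[a-z0-9-]+)*\.[a-z]{2,})(/\S*)?",
    re.I,
)

def normalize(text):
    # Undo one simple obfuscation trick: "example dot com" -> "example.com".
    return re.sub(r"\s+dot\s+", ".", text, flags=re.I)

def extract_urls(text):
    # Return the set of host+path strings found in a wall post.
    urls = set()
    for match in URL_RE.finditer(normalize(text)):
        host = match.group(1).lower()
        path = (match.group(2) or "").rstrip(".,!?")
        urls.add(host + path)
    return urls

def cluster_by_url(posts):
    # Group post IDs by the Web address they contain.
    clusters = defaultdict(list)
    for post_id, text in posts:
        for url in extract_urls(text):
            clusters[url].append(post_id)
    return dict(clusters)

def flag_malicious(clusters, blocklist):
    # Keep only clusters whose host is on the blocklist
    # (a stand-in for screening by Web security services).
    return {url: ids for url, ids in clusters.items()
            if url.split("/")[0] in blocklist}

# Made-up sample posts, for illustration only.
posts = [
    (1, "Free ringtones at http://spam.example.com/win"),
    (2, "Someone has a crush on you! Find out who: spam dot example dot com/win"),
    (3, "Photos from the trip: photos.example.org/album"),
    (4, "Happy birthday!"),
]
clusters = cluster_by_url(posts)
suspicious = flag_malicious(clusters, {"spam.example.com"})
```

Here posts 1 and 2 land in the same cluster even though one address is obscured, and only that cluster is flagged. A real system would need far more de-obfuscation rules than the single one shown.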

Altogether 200,000 spam posts from 57,000 different user accounts were picked out from 2.08 million posts containing Web links. These spam posts were generated by 23 million users in total. The study is the first to examine spam activity and features at scale, says Zhao, and it shows Facebook is now a major platform for such activity. “The results are quite surprising to me – that even last year there was so much activity,” he says. “I think this is the harbinger of things to come, as Facebook attracts more of the wrong kind of attention.”

Many messages tempted users with offers of free swag such as ringtones, or used a social trap like announcing that someone had a “crush” on them. Around 70 percent of the messages were “phishing attacks,” meaning they directed users to websites that attempt to trick them into divulging personal information. Most of these were attempts to gain Facebook account details, which could then be used to send out more spam.

“We expected that attackers would mostly create new accounts to send spam attacks, but in fact, most are sent via compromised accounts,” says Chen. “That may be harder than creating new accounts, but it is more effective to send spams to real friends.”

Different accounts often sent the same spam, sometimes in simultaneous bursts of activity. “These are coordinated spam campaigns, as we see in e-mail spam,” says Zhao.

A Facebook spokesperson noted that only about 0.1 percent of Facebook wall posts analyzed in the study were spam, “which is striking when compared to similar reports that have been done on e-mail.” Multiple studies have reported that more than 90 percent of e-mails sent globally are spam.

“Overall fewer than 1 percent of all people who use Facebook have ever experienced a security issue, and that’s since Facebook’s founding more than six years ago,” the spokesperson added. “This study appears to confirm our success at stopping spam and helping people stay in control of their accounts.”

The patterns of activity discovered could, however, be used to fine-tune algorithms designed to automatically identify when an account has been taken over by spammers. One telling characteristic was the fact that compromised accounts generally sent most spam in the early hours of the morning (for that user’s time zone), presumably to reduce the chance of someone noticing that his account had been compromised. Another was that compromised accounts showed sudden bursts of high activity, something that could be used to accurately identify more than 90 percent of compromised accounts, the researchers showed.
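The two signals described above, early-morning posting and sudden bursts, can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the researchers' detector: the function names, the midnight-to-6 a.m. window, the 10-minute burst window, and both thresholds are hypothetical, and the timestamps are assumed to be already converted to the account owner's local time.

```python
from datetime import datetime, timedelta

def early_morning_share(timestamps, start_hour=0, end_hour=6):
    # Fraction of posts sent between start_hour and end_hour, local time.
    if not timestamps:
        return 0.0
    early = sum(1 for t in timestamps if start_hour <= t.hour < end_hour)
    return early / len(timestamps)

def max_burst(timestamps, window=timedelta(minutes=10)):
    # Largest number of posts falling inside any single time window.
    ts = sorted(timestamps)
    best = 0
    left = 0
    for right in range(len(ts)):
        # Shrink the window from the left until it spans at most `window`.
        while ts[right] - ts[left] > window:
            left += 1
        best = max(best, right - left + 1)
    return best

def looks_compromised(timestamps, burst_threshold=20, early_threshold=0.5):
    # Hypothetical rule: flag the account if either signal is strong.
    return (max_burst(timestamps) >= burst_threshold
            or early_morning_share(timestamps) >= early_threshold)

# Made-up activity: one evening post a day versus 30 posts in ten minutes at 3 a.m.
normal = [datetime(2009, 6, day, 19, 0) for day in range(1, 11)]
spammy = [datetime(2009, 6, 1, 3, 0) + timedelta(seconds=20 * i) for i in range(30)]
```

With these sample inputs, `looks_compromised(normal)` is false and `looks_compromised(spammy)` is true. The thresholds would need tuning against real data before any accuracy figure could be attached to them.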

While people have become relatively skilled at spotting spam e-mail, few expect it on Facebook, potentially making it more successful. The researchers couldn’t show this to be the case, but researchers at antivirus software vendor BitDefender believe it is. “We did some experiments to see how much trust people place in others on Facebook,” says Catalin Cosoi, a researcher with the company. One experiment, in March, found that around a third of people who were sent a friend request from an account created by BitDefender for the purpose would accept it, and that a quarter of those would click on a link sent by their new contact.

Another trial, in August, involved sending friend requests from profiles with photos of young women to 1,000 men and 1,000 women ranging from 17 to 65 years old. There was negligible difference between the two groups, but 92 percent of requests were accepted. “When it comes to social media, people feel that [Facebook] is a company that looks after security for them and that they are in a safe place where other users have good intentions,” says Cosoi.

The Northwestern study provides a valuable large-scale look at how spam functions inside Facebook, Cosoi says, although the spam detected still makes up a relatively small proportion of posts, and the company has since stepped up its security efforts. “Last year saw the appearance and spread of the Koobface worm, which was very successful,” he says. “I know that things have changed at Facebook since then.”

Outside researchers may not be able to repeat the study to see how spam strategies evolve, though. “With the removal of regional networks, it becomes harder to get crawled data in Facebook,” says Chen.

Cosoi is certain, however, that the network will attract more spammers, for the same reason it attracts advertisers. “They see an opportunity to do things that marketers have long dreamed of–being able to see the interests of people and target messages based on that.” That means users may have to become more suspicious of messages on the site.
