MIT Technology Review

Back in 2011, a team from Texas A&M University carried out a cyber sting to trap nonhuman Twitter users that were polluting the Twittersphere with spam. Their approach was to set up “honeypot” accounts which posted nonsensical content that no human user would ever be interested in. Any account that retweeted this content, or friended the owner, must surely be a nonhuman user known as a social bot.
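The honeypot rule described above can be sketched in a few lines. This is only an illustration of the idea, not the Texas A&M team's code: the `Interaction` record, its field names, and the `flag_suspects` helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Interaction:
    """One observed event (hypothetical record layout)."""
    account: str  # screen name of the account that acted
    kind: str     # e.g. "retweet", "follow", "reply"
    target: str   # screen name of the account acted upon


def flag_suspects(interactions, honeypots):
    """Return accounts that retweeted or followed any honeypot account."""
    trigger_kinds = {"retweet", "follow"}
    return {
        i.account
        for i in interactions
        if i.target in honeypots
        and i.kind in trigger_kinds
        and i.account not in honeypots  # honeypots never flag each other
    }


# Made-up sample log for illustration only.
honeypots = {"honey1"}
log = [
    Interaction("a", "retweet", "honey1"),
    Interaction("b", "reply", "honey1"),    # replies do not trigger the rule
    Interaction("c", "follow", "someone"),  # target is not a honeypot
    Interaction("d", "follow", "honey1"),
]
suspects = flag_suspects(log, honeypots)
```

The rule is deliberately blunt: it assumes no human would engage with nonsense content, so any engagement counts as evidence.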

The team set up 60 honeypots and harvested some 36,000 potential social bot accounts. The result surprised many observers because of the sheer number of nonhuman accounts that were active. These bots were generally unsophisticated and simply retweeted more or less any content they came across.

Since then, social bots have become significantly more advanced. They search social networks for popular and influential people, follow them, and capture their attention by sending them messages. These bots can identify keywords and find matching content, and some can even answer inquiries using natural language algorithms.

That makes identifying social bots much more difficult. But today, Emilio Ferrara and pals at Indiana University in Bloomington say they have developed a way to spot sophisticated social bots and distinguish them from ordinary human users.

The technique is relatively straightforward. They started by gathering a set of social bots from the original group outed in 2011. They chose 15,000 of these and collected each account's 200 most recent tweets as well as the 100 most recent tweets mentioning it. That produced a dataset of some 2.6 million tweets. The team then gathered a similar dataset for 16,000 human users consisting of more than 3 million tweets.
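A quick back-of-the-envelope check puts these figures in context. The sketch below uses only the numbers quoted above; the variable names are ours, and the human figure is a lower bound because the article says "more than 3 million."

```python
# Figures quoted in the article.
BOT_ACCOUNTS = 15_000
HUMAN_ACCOUNTS = 16_000
OWN_TWEETS_CAP = 200      # most recent tweets collected per account
MENTIONS_CAP = 100        # most recent mentioning tweets per account
REPORTED_BOT_TWEETS = 2_600_000
REPORTED_HUMAN_TWEETS = 3_000_000  # "more than", so treat as a floor

# Upper bound if every bot account had hit both caps.
max_bot_tweets = BOT_ACCOUNTS * (OWN_TWEETS_CAP + MENTIONS_CAP)

# Average number of tweets actually gathered per account.
avg_per_bot = REPORTED_BOT_TWEETS / BOT_ACCOUNTS
avg_per_human = REPORTED_HUMAN_TWEETS / HUMAN_ACCOUNTS

print(f"ceiling for bots: {max_bot_tweets:,}")
print(f"average per bot: {avg_per_bot:.1f}")
print(f"average per human (at least): {avg_per_human:.1f}")
```

The bot dataset comes in well under the 4.5 million ceiling, which suggests many accounts had fewer than 200 tweets of their own, fewer than 100 mentions, or both.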

Finally, the researchers created an algorithm called Bot or Not? to mine this data looking for significant differences between the properties of human users and social bots. The algorithm looked at over 1,000 features associated with these accounts, such as the number of tweets and retweets each user posted, the number of replies, mentions and retweets each received, the username length, and even the age of the account.
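To make the idea of per-account features concrete, here is a minimal sketch that computes four of the features named above. It is not the Bot or Not? implementation: the `account_features` function and the `is_retweet` field are hypothetical, and the real system reportedly uses over 1,000 features.

```python
from datetime import date


def account_features(screen_name, created, tweets, today):
    """Compute a few simple features for one account.

    `tweets` is a list of dicts with a boolean "is_retweet" key
    (a hypothetical layout chosen for this sketch).
    """
    total = len(tweets)
    retweets = sum(1 for t in tweets if t["is_retweet"])
    return {
        "username_length": len(screen_name),
        "account_age_days": (today - created).days,
        "tweet_count": total,
        # Guard against division by zero for accounts with no tweets.
        "retweet_ratio": retweets / total if total else 0.0,
    }


# Made-up account for illustration only.
sample = account_features(
    screen_name="bot_account_12345",
    created=date(2014, 1, 1),
    tweets=[{"is_retweet": True}, {"is_retweet": True},
            {"is_retweet": True}, {"is_retweet": False}],
    today=date(2014, 7, 1),
)
```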

It turns out that there are significant differences between human accounts and bot accounts. Bots tend to retweet far more often than humans and they also have longer usernames and younger accounts. By contrast, humans receive more replies, mentions, and retweets.

Together these factors create a kind of fingerprint that can be used to detect bots. “Bot or Not? achieves very promising detection accuracy,” say Ferrara and pals.
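The article does not say which classifier turns this fingerprint into a verdict, so the sketch below swaps in a deliberately simple stand-in, a nearest-centroid classifier, to show how several features can be combined. The two features, the toy training rows, and the function names are all assumptions made for illustration.

```python
import math


def centroid(rows):
    """Mean feature vector of a list of equal-length rows."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def classify(sample, bot_centroid, human_centroid):
    """Label a sample by whichever class centroid is closer."""
    d_bot = math.dist(sample, bot_centroid)
    d_human = math.dist(sample, human_centroid)
    return "bot" if d_bot < d_human else "human"


# Toy rows: [retweet_ratio, replies_received_per_tweet]. Made-up values
# that mirror the reported trend: bots retweet more, humans get more replies.
bots = [[0.9, 0.0], [0.8, 0.1]]
humans = [[0.2, 0.6], [0.1, 0.8]]

bot_c = centroid(bots)
human_c = centroid(humans)

label_a = classify([0.7, 0.1], bot_c, human_c)  # heavy retweeter, few replies
label_b = classify([0.3, 0.5], bot_c, human_c)  # light retweeter, many replies
```

Distance-based methods like this one only behave sensibly when features share a comparable scale, so raw counts such as account age in days would need normalizing first.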

There are some limitations, however. First, the team used social bots originally identified in 2011, so it's quite possible that there are now more advanced bots that are harder to detect.

And there are also borderline cases that contain posts from both humans and social bots, for example when humans lend their accounts to bots or when accounts have been hacked by bots. “Detecting these anomalies is currently impossible,” admit Ferrara and co.

Nevertheless, this is an interesting start in the process of identifying social bots. But it is a task that is likely to become more difficult as time goes on. With only 140 characters, Twitter places significant constraints on the type of communication that is possible. It is therefore much easier for a computer to recreate the very limited behavior that humans demonstrate in this space.

For those interested, Ferrara and co have made their Bot or Not? algorithm available at this website. Simply enter the screen name of a Twitter user and it will analyze that account's features and most recent posts to estimate the likelihood that it is a social bot.

It wasn’t working at the time of writing, perhaps the victim of an aggrieved social bot. But if it is working now, give it a try and post your thoughts in the comments section below.

Ref: http://arxiv.org/abs/1407.5225 : The Rise of Social Bots
