
How a Troll-Spotting Algorithm Learned Its Anti-antisocial Trade

Antisocial behavior online can make people’s lives miserable. So an algorithm that can spot trolls more quickly should be a boon, say the computer scientists who developed it.

Trolls are the scourge of many an Internet site. These are people who deliberately engage in antisocial behavior by posting inflammatory or off-topic messages. At best, they are a frustrating annoyance; at worst, they can make other people’s lives a misery.

So a way of spotting trolls early in their online careers and preventing their worst excesses would be a valuable tool.

Today, Justin Cheng at Stanford University in California and a few pals say they have created just such a tool by analyzing the behavior of trolls on several well-known websites and creating an algorithm that can accurately spot them after as few as 10 posts. They say their technique should be of high practical importance to the people who maintain online communities.

Cheng and co study three online news communities: the general news site CNN.com, the political news site Breitbart.com, and the computer gaming site IGN.com.

On each of these sites, they have a list of users who have been banned for antisocial behavior, over 10,000 of them in total. They also have all of the messages posted by these users throughout their period of online activity. “Such individuals are clear instances of antisocial users, and constitute ‘ground truth’ in our analyses,” say Cheng and co.

These guys set out to answer three different questions about antisocial users. First, whether they are antisocial throughout their community life or only towards the end. Second, whether the community’s reaction causes their behavior to become worse. And lastly, whether antisocial users can be accurately identified early on.

By comparing the messages posted by users who are ultimately banned against messages posted by users who are never banned, Cheng and co discover some clear differences. One measure they use is the readability of posts, as judged by a metric called the Automated Readability Index.

This clearly shows that users who are later banned tend to write poorer-quality posts from the outset. Not only that, the quality of their posts declines over time.
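The Automated Readability Index itself is a standard formula that combines characters per word and words per sentence into a single grade-level score. The Python sketch below is only a minimal illustration of that formula; it is not the authors’ text-processing pipeline.

```python
import re

def automated_readability_index(text: str) -> float:
    """Standard ARI formula: 4.71*(chars/words) + 0.5*(words/sentences) - 21.43."""
    words = re.findall(r"[A-Za-z0-9']+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not words or not sentences:
        return 0.0
    chars = sum(len(w) for w in words)
    return 4.71 * (chars / len(words)) + 0.5 * (len(words) / len(sentences)) - 21.43

print(automated_readability_index("This is a short, simple sentence."))
```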

And while communities initially appear forgiving and are therefore slow to ban antisocial users, they become less tolerant over time. “This results in an increased rate at which [posts from antisocial users] are deleted,” they say.

Interestingly, Cheng and co say that the differences between messages posted by people who are later banned and those who are not are so clear that it is relatively straightforward to spot them using a machine-learning algorithm. “In fact, we only need to observe five to 10 user posts before a classifier is able to make a reliable prediction,” they boast.
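The article does not spell out which features the classifier uses, so the sketch below only illustrates the general recipe: summarize each user’s first handful of posts with a few numeric features, then train a standard classifier to predict an eventual ban. The feature names and the choice of logistic regression here are assumptions made for the example, not the authors’ exact setup.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def user_features(posts):
    """Summarize a user's first ~10 posts as a small feature vector.
    The three features below are illustrative assumptions, not the paper's feature set."""
    return np.array([
        np.mean([p["readability"] for p in posts]),   # average readability score of posts
        np.mean([p["was_deleted"] for p in posts]),   # fraction of posts deleted by moderators
        np.mean([p["num_replies"] for p in posts]),   # average number of replies received
    ])

def train_early_warning_classifier(users, labels):
    """users: list of per-user post lists (first ~10 posts each);
    labels: 1 if the user was eventually banned, 0 otherwise."""
    X = np.vstack([user_features(posts) for posts in users])
    y = np.array(labels)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)
    clf = LogisticRegression().fit(X_train, y_train)
    print("held-out accuracy:", clf.score(X_test, y_test))
    return clf
```

The point of the sketch is that only a small window of early activity per user is needed as input, which is what makes the five-to-10-post claim practically interesting.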

That could turn out to be useful. Antisocial behavior is an increasingly severe problem that requires significant human input to detect and tackle. This process often means that antisocial users are allowed to operate for much longer than necessary. “Our methods can effectively identify antisocial users early in their community lives and alleviate some of this burden,” say Cheng and co.

Of course, care must be taken with any automated approach. One potential danger is needlessly banning users who are not antisocial but have been identified as such by the algorithm. This false positive rate needs to be studied more carefully.
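That risk is straightforward to quantify: the false positive rate is the fraction of never-banned users that the classifier nonetheless flags. A minimal sketch, using made-up labels purely for illustration:

```python
from sklearn.metrics import confusion_matrix

# y_true: who was actually banned; y_pred: who the classifier would flag.
# These values are invented for the example.
y_true = [1, 0, 0, 1, 0, 0, 0, 1]
y_pred = [1, 0, 1, 1, 0, 0, 0, 0]
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
print("false positive rate:", fp / (fp + tn))  # share of innocent users wrongly flagged
```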

Nevertheless, the work of moderators on sites that allow messages could soon be made significantly easier thanks to Cheng and co’s approach.

Ref: arxiv.org/abs/1504.00680: Antisocial Behavior in Online Discussion Communities
