Yahoo Has a Tool that Can Catch Online Abuse Surprisingly Well

Computers are getting better at spotting trolls, but they often can’t understand the meaning of messages.

Trolls seem to lurk in every corner of the Internet, and they delight in ruining your day. But if our e-mail inboxes can be kept relatively spam-free, why can’t machines automatically purge abusive messages from tweets or comments?

It’s a question that seems relevant to the very fabric of Internet culture today. Last week, Twitter banned a journalist it accused of orchestrating a campaign of abuse aimed at one of the stars of the all-female Ghostbusters reboot. Twitter said it would introduce new guidelines and tools for reporting abuse through its service. But countless other incidents on Twitter and elsewhere surely go unnoticed every day.

Researchers are, in fact, making some progress toward technology that can help stop the abuse. A team at Yahoo recently developed an algorithm capable of catching abusive messages better than any other automated system to date. The researchers created a data set of abuse by collecting messages on Yahoo articles that were flagged as offensive by the company’s own comment editors.

The Yahoo team used a number of conventional techniques, including looking for abusive keywords, punctuation that often seemed to accompany abusive messages, and syntactic clues as to the meaning of a sentence.
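To make those conventional signals concrete, here is a minimal sketch of how keyword, punctuation, and crude syntactic cues might be turned into numbers a classifier can use. The keyword list, the feature names, and the function `surface_features` are all hypothetical and chosen for illustration; they are not taken from the Yahoo system, and the second-person count is only a rough stand-in for real syntactic analysis.

```python
import re

# Hypothetical keyword list, for illustration only (not Yahoo's lexicon).
ABUSIVE_KEYWORDS = {"idiot", "moron", "loser"}
SECOND_PERSON = {"you", "your", "you're"}


def surface_features(comment: str) -> dict:
    """Extract simple keyword, punctuation, and pronoun cues from a comment."""
    tokens = re.findall(r"[a-z']+", comment.lower())
    letters = [c for c in comment if c.isalpha()]
    return {
        # Keyword cue: how many tokens appear in the blocklist.
        "keyword_hits": sum(t in ABUSIVE_KEYWORDS for t in tokens),
        # Punctuation cues: exclamation marks and runs such as "!!!" or "?!".
        "exclamations": comment.count("!"),
        "repeated_punct": len(re.findall(r"[!?]{2,}", comment)),
        # Share of letters that are uppercase ("shouting").
        "caps_ratio": sum(c.isupper() for c in letters) / max(len(letters), 1),
        # Crude syntactic cue: share of tokens addressing the reader directly.
        "second_person": sum(t in SECOND_PERSON for t in tokens) / max(len(tokens), 1),
    }


features = surface_features("You are an IDIOT!!!")
```

Features like these are cheap to compute, but they miss any insult that avoids the listed words, which is the gap the embedding approach described next is meant to close.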

But the researchers also applied a more advanced approach to automated language understanding, using a way of representing the meaning of words as vectors with many dimensions. This approach, known as “word embedding,” allows semantics to be processed in a sophisticated way. For instance, even if a comment contains a string of words that have not been identified as abusive, the representations of that string in vector space may be enough to identify it as such.
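The idea can be sketched with toy numbers. In the example below, every vector is invented by hand and has only three dimensions; real embeddings are learned from large text collections and have far more. Averaging word vectors and comparing them with cosine similarity is one common, simple way to use embeddings, assumed here for illustration rather than drawn from the Yahoo work.

```python
import math

# Toy 3-dimensional vectors, invented for illustration only.
TOY_VECTORS = {
    "idiot":   [0.9, 0.1, 0.0],
    "fool":    [0.8, 0.2, 0.1],
    "clown":   [0.7, 0.3, 0.1],
    "great":   [0.0, 0.9, 0.2],
    "article": [0.1, 0.2, 0.9],
    "writer":  [0.2, 0.1, 0.9],
}


def comment_vector(tokens):
    """Average the vectors of the known tokens; None if no token is known."""
    known = [TOY_VECTORS[t] for t in tokens if t in TOY_VECTORS]
    if not known:
        return None
    dims = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dims)]


def cosine(a, b):
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


flagged = comment_vector(["idiot", "fool"])    # words already known to be abusive
unseen = comment_vector(["clown", "writer"])   # "clown" is on no keyword list
benign = comment_vector(["great", "article"])

# The unseen insult lands closer to the flagged words than the benign comment does.
print(cosine(flagged, unseen) > cosine(flagged, benign))
```

Because "clown" sits near "idiot" and "fool" in this toy space, a comment using it scores as similar to known abuse even though no keyword matched.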

When everything was combined, the team was able to identify abusive messages (from its own data set) with roughly 90 percent accuracy.
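For readers unfamiliar with the metric, accuracy here means the share of comments on which the system's verdict matches the human label. A minimal sketch, using ten made-up labels rather than any real data from the study:

```python
def accuracy(predictions, labels):
    """Fraction of items where the prediction matches the human label."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("need two non-empty lists of equal length")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


# Invented example: True means "abusive". The lists differ in one position.
editor_flags = [True, False, True, False, False, True, False, False, True, False]
model_flags = [True, False, True, False, True, True, False, False, True, False]

score = accuracy(model_flags, editor_flags)
print(score)  # 0.9
```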

Catching the remaining 10 percent may prove tricky. Although AI researchers are making significant progress in training machines to parse language, artificial intelligence has yet to equip computers with the brainpower needed to untangle meaning. As a contest held at a recent AI conference shows, computers cannot resolve even the simplest ambiguities in sentences.

Many tech companies, including Twitter, have AI researchers dedicated to advancing the state of the art in areas such as image recognition and text comprehension. But so far surprisingly little effort seems to have been put into catching abuse or harassment systematically. Twitter declined to say if its AI team is actively working on the problem (although it seems likely). But it is unlikely that the company will introduce a magic bullet for filtering out malicious messages. The problem with automated hate filtering is that words are packed with meaning that can only be unpacked with real intelligence.

“Automatically identifying abuse is surprisingly difficult,” says Alex Krasodomski-Jones, who tracks online abuse as a researcher with the U.K.-based Centre for Analysis of Social Media. “The language of abuse is amorphous—changing frequently and often used in ways that do not connote abuse, such as when racially or sexually charged terms are appropriated by the groups they once denigrated. Given 10 tweets, a group of humans will rarely all agree on which ones should be classed as abusive, so you can imagine how difficult it would be for a computer.”

Until machines gain real intelligence, reliably filtering out every hateful message will remain out of reach. But Krasodomski-Jones offers another, more human, reason why we might not want an automated solution: “In a world where what we read is increasingly dictated by algorithms and filters, we ought to be careful about demanding more computer interference.”

 

 
