Data mining has revealed previously unknown Russian Twitter troll campaigns

Trolls left forensic fingerprints that cybersecurity experts used to find other disinformation campaigns both in the US and elsewhere.

Human activity leaves all kinds of traces, some more obvious than others. For example, messages posted to services such as Twitter are obviously visible. But the pattern of tweets from a user over time is not as self-evident.

Various researchers have begun to study these patterns and found that they can identify certain types of accounts, particularly those that post in high volume. For example, accounts that post continuously, 24 hours a day, are unlikely to be operated by humans. Instead, this is a clear signal that a bot of some kind is at work.
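As a rough illustration of the round-the-clock signal described above, the following sketch measures how often an account posts in nearly every hour of a day. It is not the method of any particular study. The function name, the 20-hour threshold, and the synthetic timestamps are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def round_the_clock_share(timestamps, min_hours_per_day=20):
    """Fraction of active days on which posts fall in at least
    min_hours_per_day distinct hours (threshold is an assumed value)."""
    hours_by_day = defaultdict(set)
    for ts in timestamps:
        hours_by_day[ts.date()].add(ts.hour)
    if not hours_by_day:
        return 0.0
    busy_days = sum(
        1 for hours in hours_by_day.values() if len(hours) >= min_hours_per_day
    )
    return busy_days / len(hours_by_day)

# Synthetic data, for illustration only.
start = datetime(2016, 11, 1)
# One post every hour for three full days: no pause for sleep.
bot_like = [start + timedelta(hours=h) for h in range(72)]
# Four posts a day at waking hours.
human_like = [
    start + timedelta(days=d, hours=h) for d in range(3) for h in (8, 12, 19, 22)
]

print(round_the_clock_share(bot_like))
print(round_the_clock_share(human_like))
```

A share close to 1 suggests automation, while a human account with ordinary sleep gaps should score close to 0. Real timestamps would also need a consistent time zone before a comparison like this means anything.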

Humans also generate specific patterns, albeit less obvious ones than bots. In particular, accounts that post high volumes of tweets often do so in a pattern with a unique signature that forensic analysis can identify.

One corpus of interesting tweets encompasses the messages posted by Russian trolls attempting to influence the 2016 US presidential election. Now researchers have analyzed these to search for any unique fingerprints they might contain. The idea is to use these fingerprints to identify other disinformation campaigns by the same trolls that have gone unnoticed. But is this possible?

Today we get an answer thanks to the work of Christopher Griffin and Brady Bickel at Pennsylvania State University. Their forensic analysis has identified a unique signature in these tweets and used it to find evidence of other disinformation campaigns. “We identify an operation that includes not only the 2016 US election, but also the French national and both local and national German elections,” say Griffin and Bickel.

Unique behavioral fingerprints are hard to identify because of the sheer volume of data on Twitter. A vast number of human users share similar behavioral characteristics and so cannot be easily distinguished. However, the behavioral signature becomes more distinctive as the volume of messages increases.

That’s why the Russian trolls are identifiable in this way. Griffin and Bickel downloaded a database of 200,000 Russian troll tweets gathered by Twitter and obtained by NBC News. They then analyzed the tweets by the most prolific users—those who posted more than 500 times during the election period.
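The filtering step described above can be sketched in a few lines. This is a minimal example, assuming each tweet is available as a (user, text) pair. The actual layout of the NBC News database is not described here, so the record format, the function name, and the sample data are hypothetical.

```python
from collections import Counter

def prolific_users(tweets, threshold=500):
    """Return {user: tweet_count} for users with more than `threshold` tweets.

    `tweets` is an iterable of (user, text) pairs; this record layout is an
    assumption for the example, not the real database schema.
    """
    counts = Counter(user for user, _ in tweets)
    return {user: n for user, n in counts.items() if n > threshold}

# Synthetic data, for illustration only.
sample = [("account_a", "text")] * 501 + [("account_b", "text")] * 500
print(prolific_users(sample))
```

Only `account_a` passes, because the cut-off is "more than 500" rather than "500 or more".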

The researchers examined the way these users tweeted over time and how they differed from other Twitter users. They also looked for communities within the database and then created word clouds of their tweets showing the most commonly used words.
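A word cloud is essentially a picture of word frequencies, so the underlying counts are easy to sketch. The example below assumes the tweets have already been grouped into communities by some other method. It does not attempt the community detection itself, and the grouping, function name, and sample texts are hypothetical.

```python
import re
from collections import Counter

def top_words(texts, n=3, min_length=3):
    """Return the n most frequent words of at least min_length characters."""
    words = Counter()
    for text in texts:
        # \w+ matches Unicode word characters, so Cyrillic and umlauts work.
        for word in re.findall(r"\w+", text.lower()):
            if len(word) >= min_length:
                words[word] += 1
    return [word for word, _ in words.most_common(n)]

# Hypothetical communities with made-up tweets, for illustration only.
communities = {
    "community_1": ["Wahl in Berlin", "Berlin wählt", "Merkel in Berlin"],
    "community_2": ["выборы сегодня", "выборы завтра"],
}
for name, texts in communities.items():
    print(name, top_words(texts, n=1))
```

Even this crude count makes the language of a community obvious at a glance, which is the kind of contrast the word clouds brought out.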

This threw up a surprise. The analysis revealed seven communities, each with a distinct word cloud. Four of these communities were clearly focused on topics such as the US Tea Party movement and African-Americans.

But two of these word clouds consisted entirely of words in Russian and German. Griffin and Bickel analyzed these further to show that the timing of the tweets spiked in the run-up to the German national election in 2017 and the local Berlin election in 2016. “The Berlin state election was significant because Chancellor Merkel’s party was beaten by right-wing populists,” say the researchers.
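One simple way to check for a spike of this kind is to compare the posting rate just before an election with the rate over the rest of the period. The sketch below is an assumption about how such a check could look, not the researchers' procedure. The 14-day window, the function name, and the synthetic dates are chosen only for illustration.

```python
from datetime import date, timedelta

def pre_event_ratio(post_dates, event_day, window_days=14):
    """Mean daily posts in the window before event_day, divided by the
    mean daily posts over the rest of the observed span."""
    if not post_dates:
        return 0.0
    window_start = event_day - timedelta(days=window_days)
    in_window = sum(1 for d in post_dates if window_start <= d < event_day)
    span_days = (max(post_dates) - min(post_dates)).days + 1
    baseline_days = max(1, span_days - window_days)
    window_mean = in_window / window_days
    baseline_mean = (len(post_dates) - in_window) / baseline_days
    if baseline_mean == 0:
        return float("inf") if window_mean > 0 else 0.0
    return window_mean / baseline_mean

# Synthetic data: 1 post a day for 86 days, then 10 a day for the final 14.
election = date(2017, 9, 24)  # date of the 2017 German federal election
first_day = election - timedelta(days=100)
posts = []
for offset in range(100):
    per_day = 10 if offset >= 86 else 1
    posts.extend([first_day + timedelta(days=offset)] * per_day)

print(pre_event_ratio(posts, election))
```

A ratio well above 1 marks a pre-election surge. With only a few hundred messages, as in the French case mentioned below, a ratio like this would be too noisy to interpret.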

The team also found a similar spike in activity in the build-up to the French national election in 2017, although this involved only 588 messages. That’s too small for detailed analysis, but Griffin and Bickel speculate that it points to the existence of another group of trolls, as yet unidentified, who targeted France. 

That’s interesting work suggesting that Russian troll activity was significantly more ambitious on an international scale than previously thought. It also suggests a way of spotting this kind of meddling as it is happening by looking for the kind of forensic fingerprint the team identified.

Of course, finding trolls is a cat-and-mouse game. For the organizations responsible for Russian troll activity, it ought to be a straightforward matter to change the pattern of activity in a way that does not create the same signature.

And yet, if this malicious activity is to be significant and effective, it will inevitably take place on a relatively large scale and so generate a different signature. The question is how to spot it in time to take action. And so the game continues.

Ref: Unsupervised Machine Learning of Open Source Russian Twitter Data Reveals Global Scope and Operational Characteristics
