Software Extracts Your Location on Twitter Even When It’s Secret

34 percent of Twitter users don’t fill out the “location” field with accurate information, but a new algorithm can infer it from their tweets

Christopher Mims

June 17, 2011

It used to be that only stalkers and private investigators could track you down through subtle hints in your communications with others, but now an algorithm developed by a pair of computer scientists from Northwestern University and PARC can automate the process.

This effort to pull back the curtain on users’ day-to-day lives began with the observation that a significant proportion of users of online social networks do not post accurate information about their own location.

A survey of thousands of Twitter users discovered that more than a third did not disclose their location, making it difficult to study their tweets with reference to geography – or to accurately target them with advertising. Many users did not merely leave the field blank, but filled it with obfuscating or even threatening information, clearly indicating that they wished to retain some degree of digital privacy.

Well, so much for that: part two of the researchers’ efforts involved applying a machine learning algorithm to a corpus of recent tweets from 10,000 active Twitter users. While it wasn’t able to pull out their address or even their zip code, it was able to determine what country and state users inhabited.

Analyzing the data after the fact, the researchers even discovered that some terms were highly predictive of location. It should surprise no one that Twitter users who often used the word “Colorado” were in that state. Other results were less intuitive: people who mentioned “elk” also tended to be in Colorado, “biggbi” was highly predictive of residence in Michigan, and people who mentioned “gamecock” were likely to be in South Carolina. People outside of Louisiana were unlikely to mention “crawfish,” and “redsox” fans were disproportionately from Massachusetts.
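
To make the idea concrete, here is a minimal sketch of how word usage alone can point to a location. It is not the researchers’ method, which this article does not specify; it assumes a simple word-count classifier (multinomial naive Bayes with add-one smoothing) as a stand-in. The tweets are invented for illustration, and the names `TOY_TWEETS`, `train`, and `predict` are hypothetical.

```python
import math
from collections import Counter, defaultdict

# Invented toy data, NOT from the study: (tweet text, state) pairs.
TOY_TWEETS = [
    ("saw an elk on the trail today", "Colorado"),
    ("skiing this weekend elk everywhere", "Colorado"),
    ("crawfish boil tonight", "Louisiana"),
    ("best crawfish in town", "Louisiana"),
    ("redsox game at fenway tonight", "Massachusetts"),
    ("go redsox", "Massachusetts"),
]


def train(examples):
    """Count how often each word appears in tweets from each state."""
    word_counts = defaultdict(Counter)  # state -> word -> count
    label_counts = Counter()            # state -> number of tweets
    vocab = set()
    for text, label in examples:
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        label_counts[label] += 1
        vocab.update(tokens)
    return word_counts, label_counts, vocab


def predict(model, text):
    """Return the state whose word statistics best match the text."""
    word_counts, label_counts, vocab = model
    total_tweets = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        # Start from the log prior: how common this state is in the data.
        score = math.log(label_counts[label] / total_tweets)
        total_words = sum(word_counts[label].values())
        for token in text.lower().split():
            if token not in vocab:
                continue  # words never seen in training carry no signal
            # Add-one smoothing avoids log(0) for words unseen in this state.
            score += math.log(
                (word_counts[label][token] + 1) / (total_words + len(vocab))
            )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train(TOY_TWEETS)
print(predict(model, "another elk in the yard"))
print(predict(model, "crawfish for dinner"))
```

On this toy data a single distinctive word such as “elk” or “crawfish” is enough to tip the guess toward one state, which is the same effect the researchers observed at much larger scale.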

This research demonstrates that to be online is at all times to reveal a great deal more than we realize. It’s not exactly the post-privacy society that Zuckerberg has envisioned for all of us, but it does mean that simply participating in social networks involves a great deal more trust than most of us thought.
