A View from Emerging Technology from the arXiv
Twitter "Exhaust" Reveals Patterns of Unemployment
Twitter data mining reveals surprising detail about socioeconomic indicators at a fraction of the cost of traditional data-gathering methods, say computational sociologists.
Human behaviour is closely linked to social and economic status. For example, the way an individual travels round a city is influenced by their job, their income and their lifestyle.
So it shouldn’t come as a surprise that economic status might also be reflected in patterns of social media behaviour. Indeed, that’s exactly what Alejandro Llorente at the Autonomous University of Madrid in Spain and a few pals have found. Today, these guys show that the broad pattern of tweets across cities and counties in Spain reveals fascinating detail about unemployment rates in these areas.
They began with a database of 19.6 million geolocated tweets in Spain published between November 2012 and June 2013. Llorente and co wanted to correlate these tweets with regions of economic activity, but these regions are not easy to determine. That’s because they do not correspond well to the administrative boundaries in Spain, which reflect historical and political divisions rather than economic ones.
So the team analysed the rate at which messages were exchanged between regions using a standard community detection algorithm. This revealed 340 independent areas of economic activity, which largely coincide with other measures of geographic and economic distribution. “This result shows that the mobility detected from geolocated tweets and the communities obtained are a good description of economical areas,” they say.
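The article does not say which community detection algorithm the team used, so the following is only a minimal sketch of the general idea, with label propagation swapped in as a simple stand-in. The area names, the flow counts and the helper functions `build_flow_graph` and `label_propagation` are all invented for illustration and do not come from the paper.

```python
from collections import defaultdict


def build_flow_graph(flows):
    """Turn (area_a, area_b, count) records into a symmetric weighted graph."""
    graph = defaultdict(lambda: defaultdict(float))
    for a, b, count in flows:
        if a == b:
            continue  # ignore traffic that stays inside a single area
        graph[a][b] += count
        graph[b][a] += count
    return graph


def label_propagation(graph, max_rounds=100):
    """Group nodes by letting each adopt its neighbours' heaviest label.

    This is a stand-in technique, not the method used in the paper.
    Ties are broken by picking the smallest label, so the result is
    deterministic for a given input.
    """
    labels = {node: node for node in graph}  # every area starts alone
    for _ in range(max_rounds):
        changed = False
        for node in sorted(graph):
            weight_by_label = defaultdict(float)
            for neighbour, weight in graph[node].items():
                weight_by_label[labels[neighbour]] += weight
            best_weight = max(weight_by_label.values())
            best = {l for l, w in weight_by_label.items() if w == best_weight}
            if labels[node] in best:
                continue  # current label is already a best choice
            labels[node] = min(best)
            changed = True
        if not changed:
            break
    return labels


# Invented toy data: two tightly linked groups joined by one weak link.
flows = [
    ("A", "B", 50), ("B", "C", 40), ("A", "C", 45),
    ("C", "D", 2),
    ("D", "E", 60), ("E", "F", 55), ("D", "F", 50),
]
labels = label_propagation(build_flow_graph(flows))

communities = defaultdict(list)
for area, label in sorted(labels.items()):
    communities[label].append(area)
print(list(communities.values()))
```

On this toy input the weak link between C and D is not enough to merge the two groups, which is the behaviour one would hope for when administrative units are regrouped into areas of shared activity.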
Finally, they looked at the unemployment figures in each of these regions and then mined their database for correlations with Twitter activity.
The results show clear differences between regions with high and low unemployment. For example, the rate of tweeting between 9am and midday on weekdays is significantly higher in areas of high unemployment. These tweets are more likely to contain words such as “job” or “unemployment”. And the messages are also more likely to contain spelling mistakes, perhaps reflecting a lower level of education among the unemployed.
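To make the weekday-morning signal concrete, here is a rough sketch of how such a feature might be computed and compared with unemployment figures. It assumes the share of an area's tweets posted on weekdays between 09:00 and 12:00 is a reasonable proxy for the "rate of tweeting" described above, and it uses a plain Pearson correlation. The timestamps, the unemployment percentages and the function names are all made up for the example and are not the paper's data or code.

```python
from datetime import datetime
from math import sqrt


def morning_share(timestamps):
    """Fraction of tweets posted Monday to Friday between 09:00 and 12:00."""
    if not timestamps:
        return 0.0
    morning = sum(
        1 for t in timestamps
        if t.weekday() < 5 and 9 <= t.hour < 12  # weekday() is 0 for Monday
    )
    return morning / len(timestamps)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Invented tweets per area (4 March 2013 was a Monday, 2 March a Saturday).
tweets = {
    "area_1": [datetime(2013, 3, 4, 10, 15), datetime(2013, 3, 4, 11, 0),
               datetime(2013, 3, 4, 20, 30), datetime(2013, 3, 2, 10, 0)],
    "area_2": [datetime(2013, 3, 4, 9, 30), datetime(2013, 3, 5, 18, 0),
               datetime(2013, 3, 5, 21, 0), datetime(2013, 3, 6, 22, 0)],
    "area_3": [datetime(2013, 3, 4, 19, 0), datetime(2013, 3, 5, 20, 0),
               datetime(2013, 3, 2, 11, 0), datetime(2013, 3, 6, 13, 0)],
}
# Invented unemployment rates in percent, for illustration only.
unemployment = {"area_1": 30.0, "area_2": 22.0, "area_3": 15.0}

areas = sorted(tweets)
shares = [morning_share(tweets[a]) for a in areas]
rates = [unemployment[a] for a in areas]
r = pearson(shares, rates)
print(shares, round(r, 3))
```

With real data, a single correlation like this would only be a starting point. Differences in how heavily each area uses Twitter would also have to be taken into account.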
“We demonstrate that behavioural features related to unemployment can be recovered from the digital exhaust left by the microblogging network Twitter,” say Llorente and co.
That’s important because this kind of analysis is quick and easy compared to traditional methods of data collection, such as surveys. These are expensive, so much so that some countries have considered abandoning them in times of economic hardship to save money.
Twitter data, which could provide a quick and cheap overview of unemployment, is therefore an interesting alternative. What’s more, it would allow governments and policy makers to monitor changes in the population more or less in real time.
“The immediacy of social media may also allow governments to better measure and understand the effect of policies, social changes, natural or man-made disasters in the economical status of cities in almost real-time,” say Llorente and co, adding that their techniques should be applicable anywhere in the world.
Work like this shows how the nature of economic data gathering is changing. It’ll be interesting to see how quickly governments and other organisations adapt.
Ref: arxiv.org/abs/1411.3140 : Social Media Fingerprints Of Unemployment