A View from Emerging Technology from the arXiv
Data Fusion Heralds City Attractiveness Ranking
Big data mining allows cities to be ranked according to their “attractiveness,” say researchers developing a new science of cities.
The ability of any city to attract visitors is an important metric for town planners, businesses based on tourism, traffic planners, residents, and so on. And there are increasingly varied ways of measuring this thanks to the growing volumes of city-related data generated by social media and location-based services.
So it’s only natural that researchers would like to draw these data sets together to see what kind of insight they can get from this form of data fusion.
And so it has turned out thanks to the work of Stanislav Sobolevsky at MIT and a few buddies. These guys have fused three wildly different data sets related to the attractiveness of cities in a way that allows them to rank these places and to understand why people visit them and what they do when they get there.
The work focuses exclusively on cities in Spain using data that is relatively straightforward to gather. The first data set consists of the number of credit and debit card transactions carried out by visitors to cities throughout Spain during 2011. This includes each card’s country of origin, which allows Sobolevsky and co to count only those transactions made by foreign visitors—a total of 17 million anonymized transactions from 8.6 million foreign visitors from 175 different countries.
The second data set consists of over 3.5 million photos and videos taken in Spain and posted to Flickr by people living in other countries. These pictures were taken between 2005 and 2014 by 16,000 visitors from 112 countries.
The last data set consists of around 700,000 geotagged tweets posted in Spain during 2012. These were posted by 16,000 foreign visitors from 112 countries.
Finally, the team defined a city’s attractiveness, at least for the purposes of this study, as the total number of pictures, tweets and card transactions that took place within it.
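To make that definition concrete, here is a minimal sketch of how such a tally might be computed. The records, the two-letter country codes, and the function names are all invented for illustration; they are not the team's data or code.

```python
from collections import Counter

# Hypothetical records, one (city, visitor_home_country) pair per activity.
# These are made-up examples, not data from the study.
photos = [("Barcelona", "FR"), ("Barcelona", "US"), ("Barcelona", "ES"), ("Madrid", "DE")]
tweets = [("Madrid", "GB"), ("Barcelona", "IT")]
transactions = [("Malaga", "GB"), ("Malaga", "SE"), ("Madrid", "ES")]

def foreign_activity(records, home="ES"):
    """Count records per city, skipping those made by residents of `home`."""
    return Counter(city for city, country in records if country != home)

def attractiveness(*sources):
    """Sum foreign activity across all data sources, city by city."""
    total = Counter()
    for records in sources:
        total += foreign_activity(records)
    return total

scores = attractiveness(photos, tweets, transactions)
```

A raw total like this is dominated by whichever data set is largest, which is one reason a ranking built on it needs further normalization.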
This level of visitor activity leads to a simple ranking but one that obviously needs to be normalized by city size. And that leads to an interesting question—how does the level of visitor activity scale with city size?
At first glance, it’s easy to imagine that visitor activity ought to scale linearly with city size. Bigger cities attract more visitors who together are more active.
But anthropologists have long known that different aspects of city life scale in different ways. For example, the number of jobs, the number of houses, and water consumption all scale linearly with city size. Other things, such as road surface area, scale less quickly, or sublinearly, with city size. And still other things scale more quickly, or superlinearly, such as income and productivity as well as pollution and poverty.
So how does visitor activity scale with city size? Superlinearly, say Sobolevsky and co. Bigger cities not only attract more visitors but more active ones as well—they take more pictures, tweet more often, and use their credit and debit cards more frequently. “City attractivity was found to demonstrate a strong superlinear scaling with the city size,” say the researchers.
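Scaling of this kind is usually described by a power law, activity = c × population^beta, where beta above 1 means superlinear, beta equal to 1 linear, and beta below 1 sublinear. The sketch below shows one common way to estimate beta, an ordinary least-squares fit on the logarithms of both quantities. It is an illustration only: the function name is made up, the data is synthetic, and the exponent of 1.5 is chosen for the example rather than taken from the paper.

```python
import math

def fit_scaling_exponent(populations, activity):
    """Least-squares fit of log(activity) = log(c) + beta * log(population)."""
    xs = [math.log(p) for p in populations]
    ys = [math.log(a) for a in activity]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    beta = sxy / sxx                       # slope in log-log space
    c = math.exp(mean_y - beta * mean_x)   # prefactor from the intercept
    return c, beta

# Synthetic cities generated from activity = 0.01 * population ** 1.5
# (illustrative values, not the study's data).
populations = [50_000, 200_000, 800_000, 3_200_000]
activity = [0.01 * p ** 1.5 for p in populations]

c, beta = fit_scaling_exponent(populations, activity)
is_superlinear = beta > 1
```

With an exponent of 1.5, a city four times larger would see eight times the visitor activity, rather than the fourfold increase a linear relationship would give.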
Having identified this trend, the team then look for exceptions and attempt to explain why certain cities deviate. For example, the cities of Malaga and Alicante have the highest relative levels of foreign transactions but among the lowest Flickr activity.
Sobolevsky and co say these cities are well known as popular retirement locations for older people from across Europe. “It seems quite likely that visitors from this category might typically be much less active users of Flickr,” they say.
The island cities of Santa Cruz de Tenerife and Las Palmas are in a similar situation. They enjoy plenty of financial transaction activity but significantly lower-than-expected Twitter and Flickr activity. Apparently, visitors to these cities are happy to spend but not to record their activity with pictures and tweets.
Another curious exception is the city of Badajoz, an inland city close to the Portuguese border, which has the lowest relative levels of Twitter and Flickr activity of all the major cities in this study. Sobolevsky and co say this is probably because Badajoz is not a tourist city at all but a shopping center. So most foreign visitors tend to have crossed the border from Portugal to go shopping and spend little time tweeting or taking photos during their visits.
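One way to spot exceptions like these is to compare each city's observed activity with the level that the overall scaling trend predicts for a city of its size, separately for each data set. The sketch below does that with a log ratio. It is an assumption about how such a comparison could be made, not the team's actual method, and the fit parameters and city figures are hypothetical.

```python
import math

def log_residual(observed, population, c, beta):
    """Log of observed activity over the level predicted by c * population ** beta.

    Positive values mean more activity than the trend predicts for a city
    of this size; negative values mean less.
    """
    expected = c * population ** beta
    return math.log(observed / expected)

# Hypothetical fit parameters and a hypothetical city of 500,000 people.
c, beta = 0.01, 1.5
population = 500_000
expected = c * population ** beta

# A made-up city with twice the expected card activity
# but half the expected photo activity.
card_residual = log_residual(2 * expected, population, c, beta)
photo_residual = log_residual(0.5 * expected, population, c, beta)
```

A city with a strongly positive residual for card transactions and a negative one for photos would show the Malaga-style pattern of plenty of spending with little picture taking.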
That’s interesting work that shows how the fusion of big data sets can provide insights into the way people use cities. It has its limitations of course. The study does not address the reasons why people find cities attractive and what draws them there in the first place. For example, are they there for tourism, for business, or for some other reason? That would require more specialized data.
But it does provide a general picture of attractiveness that could be a start for more detailed analyses. As such, this work is just a small part of a new science of cities based on big data, but one that shows how much is becoming possible with just a little number crunching.
Ref: arxiv.org/abs/1504.06003 : Scaling of city attractiveness for foreign visitors through big data of human economic and social media activity