The way information percolates through networks is of ongoing fascination to increasingly diverse groups of people.
That’s why physicists have developed a new science of networks to better understand what’s going on. The models they have developed can accurately describe how disease spreads through society, how gossip spreads through social networks and how malware spreads over the internet.
In other words, given the location of the source and the structure of the network, they can accurately predict how information will spread.
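To make the forward problem concrete, here is a minimal sketch. It assumes the simplest possible model, in which information takes exactly one time step to cross each link. The toy network and the name `arrival_times` are hypothetical and exist only for illustration; real epidemic models use random, uneven delays.

```python
from collections import deque

def arrival_times(graph, source):
    """Breadth-first search: the hop count at which each node is first reached.

    Assumes every link takes exactly one time step to cross (an illustrative
    simplification, not a realistic epidemic model).
    """
    times = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in times:
                times[neighbour] = times[node] + 1
                queue.append(neighbour)
    return times

# Hypothetical toy network, stored as undirected adjacency lists.
toy_graph = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A", "D"],
    "D": ["B", "C", "E"],
    "E": ["D"],
}

print(arrival_times(toy_graph, "A"))
```

Given the source and the wiring, every arrival time follows directly. The hard part is running this logic backwards.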
However, the reverse task of determining the source of information after it has already spread is much harder. That’s because most interesting networks are so big that it’s impossible to measure the state of every node.
Today, Pedro Pinto and pals at the École Polytechnique Fédérale de Lausanne in Switzerland show that it can be done, even if you have information from only a few nodes. “We show that it is fundamentally possible to estimate the location of the source from measurements collected by sparsely-placed observers,” they say.
They start with a theoretical description of the problem and how it can be solved. They go on to demonstrate the effectiveness of their method using data about a cholera outbreak in the KwaZulu-Natal province in South Africa in 2000. This data includes a detailed map of the network of waterways in this area through which the disease would have spread, and the number of cholera victims in various communities in the network.
“By monitoring only 20% of the communities, we achieve an average error of less than 4 hops between the estimated source and the first infected community,” they say.
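The idea of working backwards from a handful of observers can be sketched in a few lines. What follows is not the estimator from the paper. It is a deliberately simplified stand-in that assumes an undirected network, one time step per link and an unknown start time. For each candidate source, it checks how well the differences in hop distance to the observers match the differences in their observed arrival times. The network, the observer readings and the function names are all hypothetical.

```python
from collections import deque

def hop_distances(graph, start):
    """Hop distance from `start` to every reachable node (breadth-first search)."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist

def estimate_source(graph, observed):
    """Pick the candidate whose hop distances best explain the observed delays.

    `observed` maps observer node -> arrival time. The start time is unknown,
    so only differences relative to a reference observer are compared.
    Returns (best_node, squared_error).
    """
    observers = sorted(observed)
    ref = observers[0]
    best_node, best_error = None, float("inf")
    for candidate in sorted(graph):
        dist = hop_distances(graph, candidate)
        if any(o not in dist for o in observers):
            continue  # candidate cannot reach every observer
        error = sum(
            ((observed[o] - observed[ref]) - (dist[o] - dist[ref])) ** 2
            for o in observers
        )
        if error < best_error:
            best_node, best_error = candidate, error
    return best_node, best_error

# Hypothetical toy network: a small tree, stored as undirected adjacency lists.
toy_graph = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": ["D", "F", "G"],
    "F": ["E"],
    "G": ["E"],
}

# Made-up readings from three of the seven nodes (the true source is E,
# and the spread began at an unknown time, here 10).
readings = {"A": 13, "F": 11, "G": 11}

print(estimate_source(toy_graph, readings))
```

In this toy case three observers are enough to single out the source. On real networks, noisy delays and imperfect maps mean the estimate will usually land a few hops away, as the cholera result suggests.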
That’s an impressive result. However, it comes with a number of caveats. One problem is that the method assumes a good understanding of the structure of the network, something that is not always easy to get for large, real-world networks.
In the case of cholera, for example, the disease spreads downstream through rivers, which can be mapped reasonably accurately. But it also spreads when infected victims move from one geographical location to another, and this is much harder to take into account.
Another problem is that nodes can have different levels of importance in a network, so the choice of which ones to use as observers matters. However, nobody knows what the optimal choice should be.
Nevertheless, the new approach should have wide application. The same technique that can spot the first victim in a cholera outbreak should also work with other network-based phenomena such as the spread of a computer virus or news or malicious gossip.
That’s something that more than a few people ought to be interested in.
Ref: arxiv.org/abs/1208.2534: Locating the Source of Diffusion in Large-Scale Networks