The study of networks has exploded in recent years. One of the more important discoveries is that many networks share common growth patterns. So if researchers grow a model network at random using these rules, the model will have the same topological structure as the network under study
For example, the world wide web tends to grow according to a process known as preferential attachment in which new links to a page depend on the number it already has (otherwise known as the rich get richer effect).
By contrast, the growth of networks associated with protein interactions in cells is best described by another process known as “duplication-mutation with complementarity”. Here new nodes become copies of old ones by connecting to all their neighbours, then a process of mutation occurs in which connections can be removed.
And social networks tend to grow according to the same model that describes the way forest fires spread.
That has given network specialists all kinds of insights into network dynamics, how they evolve, the relative importance of specific nodes, how communities change over time and how information propagates through them.
But this information is entirely generic rather than specific to a real network. So a snapshot of the Last.fm social network will tell you who the central players are now and the forest fire model will give you an idea of how this structure evolved. But ask who the central players were three years ago, given the structure that exists today, and network scientists will scratch their feet and stare at the floor.
Now that looks set to change thanks to a new approach from Saket Navlakha and Carl Kingsford at the University of Maryland at College Park. Instead of using these growth patterns to study how networks evolve, their idea is to look at the process in reverse.
“Instead of growing a random network forward according to an evolutionary model, we decompose the actual observed network backwards in time, as dictated by the model,” they say. “The resulting sequence of networks constitute a model-inferred history of the present-day network.” This is network archaeology.
That’s significant because the result depends specifically on the network under investigation, rather than solely on the growth model used to generate it.
They go on to show the power of this idea by inferring the history of several networks. For example, they are able to accurately estimate the time at which users of last.fm joined the network simply by looking at the structure today.
Navlakha and Kingsford quite rightly point out that the possibility of inferring past behaviour from current network structures raises certain privacy issues. These will now need to be addressed by the owners this data.
Perhaps the most important application of network archaeology will be in biology. Over the last few years, molecular biologists have been painstakingly mapping the networks formed by protein interactions in cells. These give a snapshot of the way these networks have evolved.
In principle, comparing the networks of closely related species should give some insight into how they evolved. For example, a search for common structures should reveal ancestral subnetworks. That’s useful but it does not show how the process of evolution.
Network archaeology should provide new insight. Navlakha and Kingsford demonstrate it by taking the network of protein-protein interactions in baker’s yeast and essentially “playing the tape backwards” according to the rules of “duplication-mutation with complementarity”.
They discovered they were able to determine the age of proteins, ie how long it has been since they first entered the network. They were also able to work out how the proteins were duplicated and mutated in the past, or, in other words, how they evolved.
That’s a powerful tool that is likely to generate some interesting new insights into the process of evolution, not to mention the history of the networks. Expect to hear a lot more about it.