The increased use of peer-to-peer communications could improve the overall capacity of the Internet and make it run much more smoothly. That’s the conclusion of a novel study mapping the structure of the Internet.
It’s the first study to look at how the Internet is organized in terms of function, as well as how it’s connected, says Shai Carmi, a physicist who took part in the research at Bar-Ilan University, in Israel. “This gives the most complete picture of the Internet available today,” he says.
Previous efforts have plotted the Internet’s topological structure in terms of the connections between its nodes (the computer networks or Internet Service Providers that act as relay stations for carrying information around the Net), but none have taken into account the role that these connections play. “Some nodes may not be as important as other nodes,” says Carmi.
The researchers’ results depict the Internet as consisting of a dense core of 80 or so critical nodes surrounded by an outer shell of 5,000 sparsely connected, isolated nodes that are very much dependent upon this core. Separating the core from the outer shell are approximately 15,000 peer-connected and self-sufficient nodes.
Take away the core, and an interesting thing happens: about 30 percent of the nodes from the outer shell become completely cut off. But the remaining 70 percent can continue communicating because the middle region has enough peer-connected nodes to bypass the core.
With the core connected, any node is able to communicate with any other node within about four links. “If the core is removed, it takes about seven or eight links,” says Carmi. It’s a slower trip, but the data still gets there. Carmi believes we should take advantage of these alternate pathways to try to stop the core of the Internet from clogging up. “It can improve the efficiency of the Internet because the core would be less congested,” he says.
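The effect of removing the core can be illustrated on a toy network. The sketch below is not the researchers’ code or data: the node names (C1 and C2 for the core, P1 to P4 for peer-connected nodes, O1 for an outer node) and the helper functions are hypothetical, chosen only to show how a breadth-first search counts links with and without the core.

```python
from collections import deque

def build_adjacency(edges):
    """Turn a list of undirected links into {node: set of neighbors}."""
    neighbors = {}
    for a, b in edges:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    return neighbors

def hop_counts(neighbors, start, removed=frozenset()):
    """Breadth-first search: fewest links from `start` to every node
    still reachable when the nodes in `removed` are taken away."""
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for other in neighbors.get(node, ()):
            if other not in removed and other not in hops:
                hops[other] = hops[node] + 1
                queue.append(other)
    return hops

# Hypothetical toy network (an assumption, not measured data).
toy_edges = [
    ("C1", "C2"),                              # the dense core
    ("P1", "C1"), ("P4", "C1"), ("P2", "C2"),  # peers that also reach the core
    ("P1", "P2"), ("P2", "P3"), ("P3", "P4"),  # peer-to-peer links
    ("O1", "C2"),                              # outer node that depends on the core
]
toy_network = build_adjacency(toy_edges)
core = {"C1", "C2"}

with_core = hop_counts(toy_network, "P1")
without_core = hop_counts(toy_network, "P1", removed=core)

print(with_core["P4"], without_core["P4"])  # 2 links through the core, 3 around it
print("O1" in without_core)                 # False: O1 is cut off without the core
```

In this toy case the peer-to-peer route is longer but still works, while the node that hangs off the core alone is stranded, which mirrors the pattern the researchers describe.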
To build their map of the Internet, published in the latest issue of the Proceedings of the National Academy of Sciences, the researchers enlisted the assistance of 5,000 online volunteers who downloaded a program to help identify the connections between the 20,000 known nodes.
The distributed program sends information requests, or pings, to other parts of the Internet and records the route of the information on each journey.
Previous efforts had relied upon only a few dozen large computers to carry out this task, says Carmi. The distributed approach collected up to six million measurements a day over a period of two years from thousands of observation points around the world, which made it possible to reveal more connections, says Scott Kirkpatrick, a professor of computer science and engineering at the Hebrew University of Jerusalem, who also took part in the study. In fact, the project has already identified about 20 percent more of the interconnections between Internet nodes than had been found before.
The researchers then used a novel hierarchical approach to map the connectivity data, taking into account how the nodes are connected. Each node was assessed according to how well connected it is to other nodes that are themselves well connected.
Most previous research efforts considered only the number of connections as an indicator of the importance of a node, without factoring in where those connections lead, says Carmi. The new approach, known as a k-shell model, allows dead-end connections to be discounted, since they play a lesser role in the connectivity of the Internet.
Seth Bullock, a computer scientist at the University of Southampton who studies network complexity and natural systems, finds it encouraging to see people taking a more sophisticated approach to modeling network structures, since such models are often quite crude.
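Why more observation points reveal more connections can be sketched in a few lines. The example assumes each recorded route has already been reduced to an ordered list of node identifiers; the function name and the routes themselves are hypothetical and are not taken from the project’s software.

```python
def links_from_routes(routes):
    """Collect the undirected links implied by consecutive hops in each route."""
    links = set()
    for route in routes:
        for a, b in zip(route, route[1:]):
            if a != b:  # ignore repeated hops at the same node
                links.add(frozenset((a, b)))
    return links

# Hypothetical routes: one observation point versus several.
routes_from_one_point = [["A", "B", "C"]]
routes_from_many_points = routes_from_one_point + [
    ["D", "B", "C"],  # a second vantage point enters through B
    ["D", "C"],       # and also has a direct link to C
]

few = links_from_routes(routes_from_one_point)
many = links_from_routes(routes_from_many_points)
print(len(few), len(many))  # 2 links seen from one point, 4 from several
```

Links such as the hypothetical D to C connection never appear on routes that start at A, so they only show up once another vantage point is added.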
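A k-shell decomposition is commonly computed by pruning: remove every node with at most one connection, repeat until none remain, then do the same for two connections, and so on. The sketch below follows that textbook procedure on a hypothetical seven-node network; it is an illustration under those assumptions, not the researchers’ implementation.

```python
from collections import defaultdict

def k_shell_decomposition(edges):
    """Return {node: shell index} by repeatedly pruning low-degree nodes."""
    neighbors = defaultdict(set)
    for a, b in edges:
        if a != b:  # self-links carry no connectivity information
            neighbors[a].add(b)
            neighbors[b].add(a)
    degree = {node: len(adj) for node, adj in neighbors.items()}
    remaining = set(neighbors)
    shell = {}
    k = 1
    while remaining:
        to_prune = [node for node in remaining if degree[node] <= k]
        while to_prune:
            node = to_prune.pop()
            if node not in remaining:
                continue  # already pruned via another neighbor
            remaining.discard(node)
            shell[node] = k
            for other in neighbors[node]:
                if other in remaining:
                    degree[other] -= 1
                    if degree[other] <= k:
                        to_prune.append(other)
        k += 1
    return shell

# Hypothetical network: A, B, C, D are fully interconnected;
# E links to A and B; F hangs off E; G hangs off A.
toy_edges = [
    ("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("C", "D"),
    ("E", "A"), ("E", "B"), ("F", "E"), ("G", "A"),
]
shells = k_shell_decomposition(toy_edges)
print(sorted(shells.items()))
# [('A', 3), ('B', 3), ('C', 3), ('D', 3), ('E', 2), ('F', 1), ('G', 1)]
```

Nodes C and E each have three connections, yet C lands in the innermost shell and E does not, because one of E’s links is the dead end to F. That is the distinction a simple count of connections misses.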
But, Bullock warns, although there are potential benefits to improving the efficiency of the Internet using peer-to-peer networks, letting peer-to-peer networks grow in an unconstrained way could just as easily result in the creation of more congestion. For example, there would be nothing to prevent them from channeling data through the same nodes, thereby creating congestion elsewhere. Even so, there is currently a lot of interest in trying to figure out how to improve the Internet in the future; revealing its structure should help this process, says Kirkpatrick.