Emerging Technology from the arXiv

PageRank Algorithm Reveals Soccer Teams' Strategies

Using network theory to analyse the performance of soccer teams and players produces unique insights into the strategy of the world’s best team

  • July 3, 2012

Many readers will have watched the final of the Euro 2012 soccer championships on Sunday in which Spain demolished a tired Italian team by 4 goals to nil. The result, Spain’s third major championship in a row, confirms the team as the best in the world and one of the greatest in history.

So what makes Spain so good? Fans, pundits and sports journalists all point to Spain’s famous strategy of accurate quick-fire passing, known as the tiki-taka style. It’s easy to spot and fabulous to watch, as the game on Sunday proved. But it’s much harder to describe and define.  

That looks set to change. Today, Javier Lopez Pena at University College London and Hugo Touchette at Queen Mary University of London reveal an entirely new way to analyse and characterise the performance of soccer teams and players using network theory. 

They say their approach produces a quantifiable representation of a team’s style, identifies key individuals and highlights potential weaknesses.

Their idea is to think of each player as a node in a network and each pass as an edge that connects nodes. They then distribute the nodes in a way that reflects the playing position of each player on the pitch.  
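To make the idea concrete, here is a minimal sketch of how such a pass network might be assembled from a log of passes. The pass log, the function name `build_pass_network` and the nested-dictionary layout are all assumptions made for illustration; they are not taken from Pena and Touchette's paper.

```python
from collections import Counter

# Hypothetical pass log: (passer shirt number, receiver shirt number).
# These values are invented for illustration, not real match data.
passes = [(16, 8), (8, 16), (16, 8), (8, 6), (6, 7), (16, 14), (14, 8)]


def build_pass_network(pass_log):
    """Return {passer: {receiver: pass_count}} from (passer, receiver) pairs."""
    counts = Counter(pass_log)
    network = {}
    for (passer, receiver), n in counts.items():
        network.setdefault(passer, {})[receiver] = n
        # Make sure players who only receive the ball still appear as nodes.
        network.setdefault(receiver, {})
    return network


network = build_pass_network(passes)
```

Each key is a node, and each inner entry is a directed edge whose weight is the number of passes, which is what the thickness of an arrow would represent in a drawing of the network.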

The image above shows the resulting networks for the Netherlands (left) and Spain using data from the knockout stages of the 2010 World Cup in South Africa. These teams contested the final, which Spain won.

A visual inspection of these networks immediately reveals some interesting insights into the match. The thickness of the arrows represents the number of passes between nodes and it is immediately clear that the Spanish team pass more often. This image captures 417 passes by the Spanish team versus 266 for the Netherlands.   

Key players also stand out by the number of passes they make and receive, such as 16 (Sergio Busquets) and 8 (Xavi).

However, this representation also allows a much more sophisticated analysis using the standard tools of network science. 

For example, closeness centrality measures how easy it is to reach a given node in the network. In footballing terms, it measures how well connected a player is in the team. 
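As a rough illustration, the sketch below computes the textbook, unweighted form of closeness centrality: the number of players who can reach a given player, divided by the total number of passes those players need to do so. It ignores how many passes travel along each link, so it is a simplification and may well differ from the weighted measure used in the paper. The four-player network `toy` is hypothetical.

```python
from collections import deque


def closeness_centrality(network):
    """Closeness of each node, based on hop counts *to* that node."""
    # Reverse the edges so a breadth-first search from a player
    # finds how many passes others need to reach him.
    incoming = {node: [] for node in network}
    for passer, receivers in network.items():
        for receiver in receivers:
            incoming[receiver].append(passer)

    scores = {}
    for target in network:
        dist = {target: 0}
        queue = deque([target])
        while queue:
            node = queue.popleft()
            for prev in incoming[node]:
                if prev not in dist:
                    dist[prev] = dist[node] + 1
                    queue.append(prev)
        total = sum(dist.values())
        # len(dist) - 1 is the number of other players who can reach target.
        scores[target] = (len(dist) - 1) / total if total > 0 else 0.0
    return scores


# Hypothetical network: each player passes to the players listed.
toy = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}
scores = closeness_centrality(toy)
```

In this toy team every player is one pass away from B, so B gets the maximum score of 1.0, while A, C and D, who can only be reached through B, score 0.6.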

Busquets and Xavi have the highest scores in the Spanish team. Both are better connected than the best-connected Dutch player, 1 (Stekelenburg), the goalkeeper. That the goalkeeper was the Netherlands’ best-connected player itself speaks volumes.

Another notion is betweenness centrality, which measures the extent to which a node lies on the paths between other nodes. In footballing terms, it measures how much the flow of the ball between players depends on a particular player. Players with a high betweenness centrality are crucial for keeping the momentum of the game going.

These players are important because removing them has a huge impact on the structure of the network.  So a single player with a high betweenness centrality is also a weakness, since the entire team is vulnerable to an injury to this player or a red card. 

Spain’s number 11, Joan Capdevila, is the player with by far the highest betweenness centrality in this match. He is clearly a target for passes from many players, which he feeds mainly to 14 (Xabi Alonso).
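For readers who want to experiment, the following sketch computes plain, unnormalised betweenness centrality with Brandes' algorithm, treating every link as equal regardless of how many passes it carries. That is an assumption made for simplicity; the paper's version of the measure may take pass counts into account. The network `toy` is hypothetical.

```python
from collections import deque


def betweenness_centrality(network):
    """Unnormalised betweenness for a directed, unweighted graph."""
    score = {v: 0.0 for v in network}
    for source in network:
        # Breadth-first search from source, counting shortest paths
        # (sigma) and recording each node's predecessors on those paths.
        order = []
        preds = {v: [] for v in network}
        sigma = {v: 0 for v in network}
        sigma[source] = 1
        dist = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in network[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        # Walk back from the farthest nodes, accumulating dependencies.
        delta = {v: 0.0 for v in network}
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score


# Hypothetical network in which every pass goes through player B.
toy = {"A": ["B"], "B": ["A", "C", "D"], "C": ["B"], "D": ["B"]}
score = betweenness_centrality(toy)
```

Here B sits on the shortest route between all six ordered pairs of the other three players, so his score is 6.0 and everyone else's is 0.0. Deleting B from `toy` would leave no links at all, which is the vulnerability described above.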

Then there is the famous PageRank algorithm, which measures a player’s popularity, as judged by the number of passes he receives from other popular players. It gives a rough idea of who is most likely to end up with the ball after a suitably large number of passes. In this game it is Xavi.
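The "who ends up with the ball" interpretation can be sketched with a simple power iteration over pass counts. The damping value of 0.85 is the default commonly quoted for web search and is an assumption here, as are the pass counts in `toy`; the parameters chosen in the paper may differ.

```python
def pagerank(network, damping=0.85, iterations=100):
    """Power-iteration PageRank on {passer: {receiver: pass_count}}."""
    players = list(network)
    n = len(players)
    rank = {p: 1.0 / n for p in players}
    for _ in range(iterations):
        # With probability (1 - damping) the ball lands on a random player.
        new_rank = {p: (1.0 - damping) / n for p in players}
        for passer, receivers in network.items():
            total = sum(receivers.values())
            if total == 0:
                # A player with no recorded passes spreads his rank evenly.
                for p in players:
                    new_rank[p] += damping * rank[passer] / n
            else:
                # Otherwise rank follows his passes in proportion to their number.
                for receiver, count in receivers.items():
                    new_rank[receiver] += damping * rank[passer] * count / total
        rank = new_rank
    return rank


# Hypothetical pass counts, invented for illustration.
toy = {
    "A": {"B": 5, "C": 1},
    "B": {"A": 2, "C": 4},
    "C": {"B": 6},
}
rank = pagerank(toy)
```

The scores sum to one and can be read as the long-run share of time each player has the ball under this simple model. In the toy data B comes out on top, because he receives most of A's passes and all of C's.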

Seven members of the starting team that won the 2010 World Cup also started the Euro 2012 final. It’ll be interesting to see Pena and Touchette’s analysis of this tournament and how it varied from the earlier one.

There are clearly limitations to this approach. The data is an average over several games so it fails to capture the dynamics of specific games. And the positions of the nodes are also a vast generalisation and taken only from a player’s nominal starting position. 

Pena and Touchette say there are various ways in which this approach could be improved. They suggest adding another node to represent the opponents’ goal and recording the number of shots. They also imagine using a similar approach to measure the accuracy of passes by taking into account the probability of a pass from one player to another being successful.

“The defensive strength of a team could also be incorporated in the model by tracking passing interceptions and recovered balls,” say Pena and Touchette.

Perhaps more fascinating would be a way of collecting and analysing the data in real time to produce a network-based analysis of a game as it happens. 

In terms of data analysis, football has always lagged behind more statistically-friendly games such as American football, baseball and cricket, because it lacks the long pauses during which data can be gathered and analysed. That looks set to change.

Ref: arxiv.org/abs/1206.6904: A Network Theory Analysis Of Football Strategies 
