PageRank Algorithm Reveals Soccer Teams’ Strategies

Using network theory to analyse the performance of soccer teams and players produces unique insights into the strategy of the world’s best team

Many readers will have watched the final of the Euro 2012 soccer championships on Sunday in which Spain demolished a tired Italian team by 4 goals to nil. The result, Spain’s third major championship in a row, confirms the team as the best in the world and one of the greatest in history.

So what makes Spain so good? Fans, pundits and sports journalists all point to Spain’s famous strategy of accurate quick-fire passing, known as the tiki-taka style. It’s easy to spot and fabulous to watch, as the game on Sunday proved. But it’s much harder to describe and define.  

That looks set to change. Today, Javier Lopez Pena at University College London and Hugo Touchette at Queen Mary University of London reveal an entirely new way to analyse and characterise the performance of soccer teams and players using network theory. 

They say their approach produces a quantifiable representation of a team’s style, identifies key individuals and highlights potential weaknesses.

Their idea is to think of each player as a node in a network and each pass as an edge that connects nodes. They then distribute the nodes in a way that reflects the playing position of each player on the pitch.  
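As a rough illustration of that construction, here is a minimal Python sketch that turns a pass log into a weighted, directed network. The shirt numbers and pass counts are invented for the example and are not taken from Pena and Touchette's data; `build_pass_network` is a hypothetical helper name.

```python
from collections import Counter

# Hypothetical pass log: (passer shirt number, receiver shirt number).
# These values are invented for illustration, not taken from the paper.
passes = [(16, 8), (8, 16), (8, 6), (16, 8), (6, 8), (8, 7), (3, 16)]

def build_pass_network(pass_log):
    """Return {(passer, receiver): count}: one weighted, directed edge per pair."""
    return Counter(pass_log)

network = build_pass_network(passes)

# The nodes are simply every player who appears at either end of a pass.
players = sorted({player for edge in network for player in edge})

for (passer, receiver), count in sorted(network.items()):
    print(f"{passer} -> {receiver}: {count} passes")
```

The edge weight here plays the role of the arrow thickness in the drawn network: the more passes between two players, the heavier the edge.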

The image above shows the resulting networks for the Netherlands (left) and Spain using data from the knockout stages of the 2010 World Cup in South Africa.  These teams contested the final which Spain won. 

A visual inspection of these networks immediately reveals some interesting insights into the match. The thickness of the arrows represents the number of passes between nodes and it is immediately clear that the Spanish team pass more often. This image captures 417 passes by the Spanish team versus 266 for the Netherlands.   

Key players also stand out by the number of passes they make and receive, such as 16 (Sergio Busquets) and 8 (Xavi).

However, this representation also allows a much more sophisticated analysis using the standard tools of network science. 

For example, closeness centrality measures how easy it is to reach a given node in the network. In footballing terms, it measures how well connected a player is in the team. 
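One simple way to compute such a score is sketched below. It assumes that an edge carrying n passes has a "length" of 1/n, so that frequent passing partners sit close together, and it scores each player by how short the paths leading to him are. That weighting is an assumption made for this sketch and may differ from the exact definition used in the paper; the players and pass counts are invented.

```python
import heapq

# Hypothetical pass counts {passer: {receiver: passes}}; invented numbers.
# Every receiver must also appear as a key for this sketch to work.
passes = {
    "A": {"B": 10, "C": 2},
    "B": {"A": 8, "C": 6},
    "C": {"A": 1, "B": 5},
}

def distances_to(target, graph):
    """Shortest distance from every player to `target` (Dijkstra on reversed edges).

    Assumption: an edge with n passes has length 1/n.
    """
    reverse = {node: {} for node in graph}
    for src, outs in graph.items():
        for dst, n in outs.items():
            reverse[dst][src] = 1.0 / n
    dist = {target: 0.0}
    heap = [(0.0, target)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, w in reverse[node].items():
            nd = d + w
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist

def closeness(target, graph):
    """Number of players who can reach `target`, divided by their total distance."""
    dist = distances_to(target, graph)
    others = [d for node, d in dist.items() if node != target]
    total = sum(others)
    return len(others) / total if total > 0 else 0.0

scores = {player: closeness(player, passes) for player in passes}
best_connected = max(scores, key=scores.get)
```

In this toy network player B comes out best connected, because both teammates find him with many direct passes.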

Busquets and Xavi have the highest scores in the Spanish team. Both are better connected than the best connected Dutch player, 1 (Stekelenburg), the goalkeeper. That the goalkeeper was the Netherlands’ best connected player itself speaks volumes.

Another notion is betweenness centrality, which measures the extent to which a node lies on the paths between other nodes. In footballing terms, betweenness centrality measures how much the flow of the ball between players depends on one particular player. Players with a high betweenness centrality are crucial for keeping the momentum of the game going. 

These players are important because removing them has a huge impact on the structure of the network.  So a single player with a high betweenness centrality is also a weakness, since the entire team is vulnerable to an injury to this player or a red card. 
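The idea can be sketched in a few lines of Python. This version is a simplification: it ignores how many passes travel along each link and only asks, for every pair of players, what share of the shortest passing routes between them runs through a third player. The four-player line-up is invented for illustration.

```python
from collections import deque

# Hypothetical "who passes to whom" links; invented for illustration.
links = {
    "GK": ["DF"],
    "DF": ["MF", "GK"],
    "MF": ["FW", "DF"],
    "FW": ["MF"],
}

def bfs_counts(source, graph):
    """Return (dist, sigma): hop distances and shortest-path counts from source."""
    dist = {source: 0}
    sigma = {source: 1}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in dist:
                dist[nxt] = dist[node] + 1
                sigma[nxt] = 0
                queue.append(nxt)
            if dist[nxt] == dist[node] + 1:
                sigma[nxt] += sigma[node]
    return dist, sigma

def betweenness(graph):
    """For each player v, sum over ordered pairs (s, t) the share of
    shortest s-to-t routes that pass through v."""
    info = {s: bfs_counts(s, graph) for s in graph}
    score = {v: 0.0 for v in graph}
    for s in graph:
        dist_s, sigma_s = info[s]
        for t in graph:
            if t == s or t not in dist_s:
                continue
            for v in graph:
                if v in (s, t) or v not in dist_s:
                    continue
                dist_v, sigma_v = info[v]
                if t in dist_v and dist_s[v] + dist_v[t] == dist_s[t]:
                    score[v] += sigma_s[v] * sigma_v[t] / sigma_s[t]
    return score

scores = betweenness(links)
```

Here the defender and midfielder each score 4 while the goalkeeper and forward score 0: every route from one end of the team to the other has to go through the middle two, which is exactly the kind of dependence that makes a team vulnerable.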

Spain’s number 11, Joan Capdevila, is the player with by far the highest betweenness centrality in this match. He is clearly a target for passes from many players, which he feeds mainly to 14 (Xabi Alonso).

Then there is the famous PageRank algorithm, which measures a player’s popularity, as judged by the number of passes he receives from other popular players. It gives a rough idea of who is most likely to end up with the ball after a suitably large number of passes. In this game it is Xavi.
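To make the "who ends up with the ball" reading concrete, here is a minimal power-iteration sketch of the standard damped PageRank on a pass network. The damping value of 0.85 is the conventional default rather than a figure from the paper, the pass counts are invented, and the sketch assumes every player makes at least one pass and appears as a key in the dictionary.

```python
# Hypothetical pass counts {passer: {receiver: passes}}; invented numbers.
passes = {
    "A": {"B": 5, "C": 1},
    "B": {"A": 3, "C": 3},
    "C": {"B": 6, "D": 1},
    "D": {"B": 2},
}

def pagerank(graph, damping=0.85, iterations=100):
    """Standard damped PageRank by power iteration.

    Assumptions: damping=0.85 is the usual default, not a value from the
    paper; every player has at least one outgoing pass.
    """
    players = list(graph)
    n = len(players)
    rank = {p: 1.0 / n for p in players}
    for _ in range(iterations):
        # With probability (1 - damping) the ball "restarts" with a random player.
        new_rank = {p: (1.0 - damping) / n for p in players}
        for passer, receivers in graph.items():
            total = sum(receivers.values())
            for receiver, count in receivers.items():
                # The passer hands on his rank in proportion to his pass counts.
                new_rank[receiver] += damping * rank[passer] * count / total
        rank = new_rank
    return rank

ranks = pagerank(passes)
most_popular = max(ranks, key=ranks.get)
```

In this toy team player B ranks highest: he is the main target of A, C and D, so the ball keeps coming back to him however the sequence of passes starts.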

Seven members of the starting team that won the 2010 World Cup also started the Euro 2012 final. It’ll be interesting to see Pena and Touchette’s analysis of this tournament and how it varied from the earlier one.

There are clearly limitations to this approach. The data is an average over several games so it fails to capture the dynamics of specific games. And the positions of the nodes are also a vast generalisation and taken only from a player’s nominal starting position. 

Pena and Touchette say there are various ways in which this approach could be improved. They suggest adding another node to represent the opponent’s goal and recording the number of shots as edges to it. They also imagine using a similar approach to measure the accuracy of passes by taking into account the probability of a pass from one player to another being successful. 
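Neither extension is worked out in detail here, but a rough sketch of the bookkeeping might look like the following. The event logs, the `GOAL` node name and the `success_probability` helper are all hypothetical and invented for this example.

```python
from collections import Counter

GOAL = "GOAL"  # hypothetical extra node standing for the opponent's goal

# Hypothetical logs, invented for illustration.
# Each pass event: (passer, intended receiver, whether the pass was completed).
pass_events = [(8, 6, True), (8, 6, True), (8, 6, False), (6, 7, True)]
# Each shot event: the shirt number of the player who took the shot.
shot_events = [7, 6, 7]

attempted = Counter((src, dst) for src, dst, _ in pass_events)
completed = Counter((src, dst) for src, dst, ok in pass_events if ok)

def success_probability(src, dst):
    """Observed share of completed passes from src to dst, or None if never tried."""
    tried = attempted[(src, dst)]
    return completed[(src, dst)] / tried if tried else None

# Shots become ordinary weighted edges pointing at the extra GOAL node.
shots = Counter((player, GOAL) for player in shot_events)
```

With these numbers, player 8 completes two of his three passes to player 6, and player 7 has an edge of weight 2 into the goal node.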

“The defensive strength of a team could also be incorporated in the model by tracking passing interceptions and recovered balls,” say Pena and Touchette.

Perhaps more fascinating would be a way of collecting and analysing the data in real time to produce a network-based analysis of a game as it happens. 

In terms of data analysis, football has always lagged behind more statistically-friendly games such as American football, baseball and cricket, because it lacks the long pauses during which data can be gathered and analysed. That looks set to change.

Ref: arxiv.org/abs/1206.6904: A Network Theory Analysis Of Football Strategies 
