Emerging Technology from the arXiv

PageRank Algorithm Reveals Soccer Teams' Strategies

Using network theory to analyse the performance of soccer teams and players produces unique insights into the strategy of the world’s best team

July 3, 2012

Many readers will have watched the final of the Euro 2012 soccer championships on Sunday in which Spain demolished a tired Italian team by 4 goals to nil. The result, Spain’s third major championship in a row, confirms the team as the best in the world and one of the greatest in history.

So what makes Spain so good? Fans, pundits and sports journalists all point to Spain’s famous strategy of accurate quick-fire passing, known as the tiki-taka style. It’s easy to spot and fabulous to watch, as the game on Sunday proved. But it’s much harder to describe and define.  

That looks set to change. Today, Javier Lopez Pena at University College London and Hugo Touchette at Queen Mary University of London reveal an entirely new way to analyse and characterise the performance of soccer teams and players using network theory. 

They say their approach produces a quantifiable representation of a team’s style, identifies key individuals and highlights potential weaknesses.

Their idea is to think of each player as a node in a network and each pass as an edge that connects nodes. They then distribute the nodes in a way that reflects the playing position of each player on the pitch.  
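The construction described above can be sketched in a few lines of Python. This is only an illustration, not Pena and Touchette's code: the pass log, the shirt numbers in it and the helper name build_pass_network are all made up for the example.

```python
from collections import Counter

# Hypothetical pass log: each entry is (passer shirt number, receiver shirt number).
pass_log = [(16, 8), (8, 6), (16, 8), (6, 8), (8, 16), (16, 8)]


def build_pass_network(log):
    """Return a directed, weighted network as {(passer, receiver): pass count}."""
    return Counter(log)


network = build_pass_network(pass_log)

# The nodes are simply every player who made or received a pass.
players = sorted({player for edge in network for player in edge})

for (passer, receiver), count in sorted(network.items()):
    print(f"{passer} -> {receiver}: {count} passes")
```

Each key is a directed edge and each value is the arrow thickness. Placing the nodes on a pitch diagram would need position data, which this sketch leaves out.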

The image above shows the resulting networks for the Netherlands (left) and Spain, using data from the knockout stages of the 2010 World Cup in South Africa. These teams contested the final, which Spain won.

A visual inspection of these networks reveals some interesting insights into the match. The thickness of each arrow represents the number of passes between two players, and it is clear at a glance that the Spanish team passes more often. This image captures 417 passes by the Spanish team versus 266 for the Netherlands.

Key players also stand out by the number of passes they make and receive, such as 16 (Sergio Busquets) and 8 (Xavi).

However, this representation also allows a much more sophisticated analysis using the standard tools of network science. 

For example, closeness centrality measures how easy it is to reach a given node in the network. In footballing terms, it measures how well connected a player is in the team. 

Busquets and Xavi have the highest scores in the Spanish team. Both are better connected than the best connected Dutch player, 1 (Stekelenburg), the goalkeeper. That the goalkeeper was the Netherlands’ best connected player itself speaks volumes.
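One common way to compute a closeness score on a weighted pass network is sketched below. It assumes that two players who exchange many passes are "close", so each edge gets a length of 1 divided by its pass count; the paper may define closeness differently. The three players and their pass counts are invented.

```python
import heapq

# Hypothetical pass counts: passes[a][b] = number of passes from a to b.
# Every player is assumed to appear as a key.
passes = {
    "A": {"B": 10, "C": 2},
    "B": {"A": 8, "C": 6},
    "C": {"A": 1, "B": 5},
}


def shortest_distances(graph, source):
    """Dijkstra's algorithm, with edge length 1 / pass count (an assumption)."""
    dist = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbour, count in graph.get(node, {}).items():
            new_d = d + 1.0 / count
            if new_d < dist.get(neighbour, float("inf")):
                dist[neighbour] = new_d
                heapq.heappush(queue, (new_d, neighbour))
    return dist


def closeness(graph, node):
    """Number of other players divided by the total distance to reach them."""
    dist = shortest_distances(graph, node)
    others = [d for player, d in dist.items() if player != node]
    if len(others) < len(graph) - 1:
        return 0.0  # some team-mate cannot be reached at all
    return len(others) / sum(others)


scores = {player: closeness(passes, player) for player in passes}
best_connected = max(scores, key=scores.get)
print(best_connected, round(scores[best_connected], 3))
```

In this toy data player B comes out on top, because B's two outgoing links both carry plenty of passes.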

Another notion is betweenness centrality, which measures the extent to which a node lies on the paths between other nodes. In footballing terms, it measures how much the flow of the ball between other players depends on a given player. Players with a high betweenness centrality are crucial for keeping the momentum of the game going.

These players are important because removing them has a huge impact on the structure of the network.  So a single player with a high betweenness centrality is also a weakness, since the entire team is vulnerable to an injury to this player or a red card. 
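To make the idea concrete, here is a minimal sketch of betweenness centrality using Brandes' algorithm on an unweighted, directed graph. It ignores pass counts for simplicity, so it is not the weighted measure a full analysis would need, and the five-player team with a single "hub" midfielder is hypothetical.

```python
from collections import deque

# Hypothetical team: two defenders can only reach the forwards through one hub.
graph = {
    "D1": ["hub"],
    "D2": ["hub"],
    "hub": ["F1", "F2"],
    "F1": [],
    "F2": [],
}


def betweenness(graph):
    """Brandes' algorithm: count shortest paths that pass through each node."""
    score = {v: 0.0 for v in graph}
    for source in graph:
        order = []                          # nodes in order of discovery
        preds = {v: [] for v in graph}      # predecessors on shortest paths
        sigma = {v: 0 for v in graph}       # number of shortest paths from source
        dist = {v: -1 for v in graph}
        sigma[source], dist[source] = 1, 0
        queue = deque([source])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in graph[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in graph}
        for w in reversed(order):           # accumulate from the farthest nodes back
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != source:
                score[w] += delta[w]
    return score


scores = betweenness(graph)
print(scores)
```

All four defender-to-forward routes run through the hub, so it is the only player with a non-zero score. Lose that player and the defenders have no path to the forwards at all.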

Spain’s number 11, Joan Capdevila, is the player with by far the highest betweenness centrality in this match. He is clearly a target for passes from many players, which he feeds mainly to 14 (Xabi Alonso).

Then there is the famous PageRank algorithm, which measures a player’s popularity, as judged by the number of passes he receives from other popular players. It gives a rough idea of who is most likely to end up with the ball after a suitably large number of passes. In this game it is Xavi.
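A rough sketch of that calculation follows. It uses the textbook form of PageRank with a damping factor of 0.85, which is an assumption; the variant and parameters used in the paper may differ. The pass counts are invented.

```python
# Hypothetical pass counts: passes[a][b] = number of passes from a to b.
# Every player is assumed to appear as a key.
passes = {
    "A": {"B": 10, "C": 2},
    "B": {"A": 8, "C": 6},
    "C": {"A": 1, "B": 5},
}


def pagerank(passes, damping=0.85, iterations=100):
    """Power iteration: repeatedly pass each player's score on to his receivers."""
    players = sorted(passes)
    n = len(players)
    rank = {p: 1.0 / n for p in players}
    out_total = {p: sum(passes[p].values()) for p in players}
    for _ in range(iterations):
        # With probability (1 - damping) the ball turns up with a random player.
        new_rank = {p: (1.0 - damping) / n for p in players}
        for passer in players:
            if out_total[passer] == 0:
                # A player who never passes shares his score evenly.
                for p in players:
                    new_rank[p] += damping * rank[passer] / n
                continue
            for receiver, count in passes[passer].items():
                share = count / out_total[passer]
                new_rank[receiver] += damping * rank[passer] * share
        rank = new_rank
    return rank


rank = pagerank(passes)
most_popular = max(rank, key=rank.get)
print(most_popular, {p: round(r, 3) for p, r in rank.items()})
```

The scores sum to one and can be read as the long-run chance of each player having the ball. In this toy data B ranks highest, since both team-mates send B most of their passes.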

Seven members of the starting team that won the 2010 World Cup also started the Euro 2012 final. It’ll be interesting to see Pena and Touchette’s analysis of this tournament and how it varied from the earlier one.

There are clearly limitations to this approach. The data is an average over several games so it fails to capture the dynamics of specific games. And the positions of the nodes are also a vast generalisation and taken only from a player’s nominal starting position. 

Pena and Touchette say there are various ways in which this approach could be improved. They suggest adding another node to represent the opponent’s goal and recording the number of shots each player takes at it. They also imagine using a similar approach to measure the accuracy of passes by taking into account the probability of a pass from one player to another being successful.
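As a loose illustration of how those two extensions might be represented, the sketch below adds a "goal" node fed by shots and stores a completion rate on every passing edge. The record layout, the field names and all the numbers are hypothetical; the paper does not prescribe a data format.

```python
# Hypothetical records: (passer, receiver, attempted passes, completed passes).
pass_records = [("A", "B", 12, 10), ("B", "C", 8, 6), ("C", "B", 5, 5)]

# Hypothetical shot counts, modelled as edges into an extra "goal" node.
shots = {"B": 1, "C": 3}


def build_extended_network(pass_records, shots):
    """Return {(source, target): {"count": ..., "accuracy": ...}}."""
    network = {}
    for passer, receiver, attempted, completed in pass_records:
        network[(passer, receiver)] = {
            "count": completed,
            # Estimated probability that a pass along this edge succeeds.
            "accuracy": completed / attempted if attempted else 0.0,
        }
    for player, count in shots.items():
        # Shots have no completion rate in this sketch.
        network[(player, "goal")] = {"count": count, "accuracy": None}
    return network


network = build_extended_network(pass_records, shots)
for edge, data in sorted(network.items()):
    print(edge, data)
```

With the goal included as a node, the same centrality measures could in principle be run on the extended network to see which players are best connected to it.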

“The defensive strength of a team could also be incorporated in the model by tracking passing interceptions and recovered balls,” say Pena and Touchette.

Perhaps more fascinating would be a way of collecting and analysing the data in real time to produce a network-based analysis of a game as it happens. 

In terms of data analysis, football has always lagged behind more statistically-friendly games such as American football, baseball and cricket, because it lacks the long pauses during which data can be gathered and analysed. That looks set to change.

Ref: arxiv.org/abs/1206.6904: A Network Theory Analysis Of Football Strategies 
