

Expected threat - a tool for evaluating performance


In football, we often talk about goals and shots, but what if the most important actions happen long before the ball reaches the net? With expected threat we can quantify how teams and players create dangerous situations – and perhaps even predict who will win. During EURO last summer, 45,630 passes and 44,139 carries were completed. This article explores which of these actions add the most value and identifies the players and teams excelling at them. Using position-based expected threat (xT), I will explore questions such as:

  • Is there a connection between xT and winning?
  • Which teams and players created the most xT per game during EURO 2024?
  • Which player positions are most efficient at creating xT?

    Position-based expected threat

    Position-based expected threat assigns a value to each zone of the pitch. Each value is a probability: the chance of scoring from that zone, either directly from a shot or after moving the ball elsewhere, before the ball is lost or goes out of play. These probabilities are used to calculate expected threat, or xT. The grid below is an example split into 12x8 bins, with one probability per bin. The probabilities are based on multiple seasons of football data and are derived with Markov chains, a mathematical tool for modelling the probabilities of sequences of events. The idea was first applied to football by Sarah Rudd; Karun Singh later built on it and named it “expected threat” in a 2018 article (linked under Sources).
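To make the Markov chain idea concrete, here is a minimal sketch of the iterative calculation described in Singh's article: the value of a zone is the chance of shooting and scoring from it, plus the chance of moving the ball multiplied by the value of wherever it ends up. The three-zone pitch and every probability below are invented for illustration; a real grid would estimate them from event data for each of the 12x8 bins.

```python
def solve_xt(shoot_prob, goal_prob, move_prob, transition, iterations=200):
    """Iterate xT(z) = s_z * g_z + m_z * sum_k T[z][k] * xT(k), starting from zero."""
    n_zones = len(shoot_prob)
    xt = [0.0] * n_zones
    for _ in range(iterations):
        xt = [
            shoot_prob[z] * goal_prob[z]
            + move_prob[z] * sum(transition[z][k] * xt[k] for k in range(n_zones))
            for z in range(n_zones)
        ]
    return xt


# Hypothetical toy pitch: zone 0 = own half, 1 = midfield, 2 = penalty area.
shoot_prob = [0.00, 0.05, 0.40]  # chance that the next action is a shot
goal_prob = [0.00, 0.03, 0.15]   # chance that a shot from the zone scores
move_prob = [0.90, 0.85, 0.50]   # chance of a move; the remainder is a lost ball
transition = [                   # row z: where a move from zone z ends up
    [0.60, 0.35, 0.05],
    [0.20, 0.50, 0.30],
    [0.10, 0.40, 0.50],
]

xt_values = solve_xt(shoot_prob, goal_prob, move_prob, transition)
for zone, value in enumerate(xt_values):
    print(f"zone {zone}: xT = {value:.4f}")
```

Because some possessions are lost at every step, the values settle after enough iterations, and zones closer to goal end up with higher values.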


Figure 1. Grid with probabilities
A value of 1 would mean that every sequence of play from that location results in a goal, either directly from a shot or indirectly after moving the ball elsewhere. The values generally increase the closer we get to the opposition goal. The difference in probability between the end and start locations of an action gives its xT value. Let’s use an example to illustrate.


Location (A) has a probability of 0.019. This means a 1.9% chance of scoring, either directly from (A) or after moving the ball to another location. If the ball is moved from (A) to (B), we calculate xT for that movement by subtracting the start probability from the end probability: xT = p(B) - p(A). The player passing the ball (Calhanoglu in this case) is credited with 0.238 xT, having increased his team’s chance of scoring by 23.8 percentage points. The numbers in the grid above are from McKay Johns’ GitHub (see link at bottom). Credit to him for sharing his work. There are multiple versions of this type of grid. Tom Worville covers the topic in more depth in an article linked under Sources. With a foundational understanding of xT, let’s move into real-world applications.
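The lookup-and-subtract step from the example can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the code behind the figures: the 120x80 pitch dimensions, the coordinates chosen for (A) and (B), and the near-empty grid are all placeholders. Only p(A) = 0.019 and the 0.238 result come from the example above, which implies p(B) = 0.257.

```python
N_COLS, N_ROWS = 12, 8                   # 12x8 bins, as in the grid above
PITCH_LENGTH, PITCH_WIDTH = 120.0, 80.0  # assumed coordinate system


def bin_index(x, y):
    """Map pitch coordinates to a (row, col) bin, clamping to the grid edges."""
    col = min(max(int(x / PITCH_LENGTH * N_COLS), 0), N_COLS - 1)
    row = min(max(int(y / PITCH_WIDTH * N_ROWS), 0), N_ROWS - 1)
    return row, col


def action_xt(grid, start, end):
    """xT of a pass or carry: probability at the end minus probability at the start."""
    start_row, start_col = bin_index(*start)
    end_row, end_col = bin_index(*end)
    return grid[end_row][end_col] - grid[start_row][start_col]


# Placeholder grid: zeros everywhere except the two bins used in the example.
grid = [[0.0] * N_COLS for _ in range(N_ROWS)]
location_a = (65.0, 25.0)    # hypothetical coordinates for (A)
location_b = (112.0, 42.0)   # hypothetical coordinates for (B)
row_a, col_a = bin_index(*location_a)
row_b, col_b = bin_index(*location_b)
grid[row_a][col_a] = 0.019   # p(A), from the example
grid[row_b][col_b] = 0.257   # p(B), implied by 0.019 + 0.238

pass_xt = action_xt(grid, location_a, location_b)
print(f"xT of the pass: {pass_xt:.3f}")
```

Reversing the start and end locations gives a negative value, which is how a backwards pass into a less dangerous zone shows up in the numbers.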

xT and Winning: Is There a Connection?

Using data from the 2017/2018 Premier League season, I investigated whether teams with higher xT consistently performed better, measured by league rank. Here I only measure passes, as carries were not part of the StatsBomb data at that point.


The results indicate that top teams generate considerably more xT. The old “top six” teams stand apart from the rest, which suggests a link between xT and league outcomes. A correlation test gives more information about the strength of the connection between the two.


Figure 3. Pearson correlation for xT and league rank, PL 2017/2018
The correlation coefficient measures the linear relationship between xT and league rank. Pearson correlation ranges from -1 to 1, where 1 means a perfect positive linear relationship, 0 means no linear relationship, and -1 means a perfect negative linear relationship. Here, league rank is flipped to avoid a negative correlation (otherwise 1st place, the best finish, would be the lowest number). There is a strong relationship (0.785) between xT and league rank: teams that accumulated more xT tended to finish higher in the table. There are definitely more factors affecting the outcome, but the high correlation suggests xT is a relevant metric. Having established xT’s relevance for 2017/2018, let’s apply it to the present and EURO 2024.
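As a rough sketch of the correlation test and the rank flip, the snippet below computes Pearson's r by hand on a made-up six-team table. The xT totals are invented and will not reproduce the 0.785 reported above; the point is only to show that flipping the rank changes the sign of the coefficient, not its size.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical data: final league position and season xT for six teams.
league_rank = [1, 2, 3, 4, 5, 6]
season_xt = [30.0, 27.5, 28.0, 22.0, 20.5, 18.0]

# Flip the rank so that a better finish is a higher number.
flipped_rank = [len(league_rank) + 1 - rank for rank in league_rank]

r_raw = pearson(season_xt, league_rank)
r_flipped = pearson(season_xt, flipped_rank)
print(f"raw rank: r = {r_raw:.3f}, flipped rank: r = {r_flipped:.3f}")
```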

EURO 2024

Here are three examples of different passes and their xT values during EURO 2024. These passes had some of the highest xT values in the tournament.



Francisco Conceição (Portugal) leads the xT chart. The minimum threshold is set to 150 minutes played, and he was highly effective in his 202 minutes. I split xT into separate charts for passes and carries, which gives a clearer picture of the two types of action.
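The minutes threshold can be sketched as a filter applied before ranking. The player names and xT totals below are invented, and scaling to 90 minutes is my assumption about how a per-game figure could be computed; only the 150-minute cut-off and the 202 minutes come from the text above.

```python
MIN_MINUTES = 150  # threshold used for the chart


def rank_players(players, min_minutes=MIN_MINUTES):
    """Drop players below the minutes threshold and rank the rest by xT per 90."""
    eligible = [p for p in players if p["minutes"] >= min_minutes]
    ranked = [{**p, "xt_per_90": p["xt"] / p["minutes"] * 90} for p in eligible]
    return sorted(ranked, key=lambda p: p["xt_per_90"], reverse=True)


# Hypothetical totals, for illustration only.
players = [
    {"name": "Player A", "minutes": 202, "xt": 1.2},
    {"name": "Player B", "minutes": 540, "xt": 2.4},
    {"name": "Player C", "minutes": 45, "xt": 0.9},  # below the threshold
]

for player in rank_players(players):
    print(f'{player["name"]}: {player["xt_per_90"]:.2f} xT per 90')
```

Without the threshold, a player with one strong cameo (Player C here) would top the chart on a very small sample.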

Many experienced players top the passing chart. One observation from this figure is the much higher amount of xT created by passes compared with carries. Passing is an easier and normally quicker way to get the ball into dangerous areas, unless your name is Jeremy Doku. Doku is a clear example of a player tasked with carrying the ball and creating chances. Comparing players makes more sense when they play in similar positions and roles (original positions are taken from the Hudl/StatsBomb lineups). From those positions, I made six groups.

The figure above shows a high amount of xT from carries for wingers, as well as from both passes and carries for defenders and central midfielders. The low average for forwards could perhaps be explained by low involvement. The wingers and CMs stand out for carries. Accumulating xT might be easier for a winger or a central midfielder, who can receive the ball wider or deeper, than for a forward, who may find it more difficult to get on the ball. After splitting into positions, we can explore who performed best within specific position groups.

Wingers were top for carries among the position groups, and high for passes as well. This visual confirms Doku's special ability to run with the ball. It also shows top passers like Eriksen and Tadic, as well as players who excel at both, like Conceição and Williams.

In central midfield we see many players scoring high for both passes and carries. Baumgartner (Austria) stands out with extraordinarily high xT from passes, and Bruno for carries. The numbers are even more impressive for players who reached the final stages of the tournament (like Pedri, Olmo and Simons), who managed to deliver high xT numbers across more games.

### Summing up: Advantages and limitations

As I have shown in this article, xT is all about assigning a value to ball movement based on a difference in probabilities. That difference is the expected threat, which tells us whether the action increased or decreased the chance of scoring a goal. This way, even a pass in your own half can be measured as valuable, which in my opinion is one of the great properties of xT: it measures ball movements that are not shots or goals (shots are fewer than 1% of the events). Another advantage is that xT is quite simple both to implement and to understand, and it can provide useful information about teams and players.

One limitation of xT, however, is that it only uses the location of the ball and does not take other factors into account (e.g. position of opponents and teammates, type of pass, pressure, passage of play). That is where an action-based model like on-ball value, or other possession value models, comes into play and can add valuable context. More on that in a later article.

### Sources

  • [StatsBomb Open Data Specification (PDF)](https://raw.githubusercontent.com/statsbomb/open-data/master/doc/StatsBomb%20Open%20Data%20Specification%20v1.1.pdf)
  • [Worville, T. (2021) - Expected Threat (The Athletic)](https://www.nytimes.com/athletic/2751525/2021/08/06/introducing-expected-threat-or-xt-the-new-metric-on-the-block/)
  • [Sumpter, D. (2022) - Expected Threat Analysis](https://soccermatics.readthedocs.io/en/latest/lesson4/xTPos.html)
  • [Singh, K. (2018) - Understanding Expected Threat (xT)](https://karun.in/blog/expected-threat.html)
  • [McKay Johns xT Tutorial (Jupyter Notebook)](https://nbviewer.jupyter.org/github/mckayjohns/youtube-videos/blob/main/code/xT%20Tutorial.ipynb)