Goal Distribution: Is It Important?

In the previous post, I referred to an article by Jonathan Wilson which noted that Rooney’s prolific goal scoring may not be benefitting Manchester United. The argument was that the more goals Rooney scores, the more dependent the team becomes on him, and hence the easier it is to defend against: if Rooney is the only goal threat, just take Rooney out of the game. I wanted to investigate this claim more broadly. If relying on one focal goal scorer makes the offense one-dimensional and predictable, do teams with more diverse scoring talent have better goal scoring records? And if two teams score a similar number of goals, does the one with more scorers win more points?

To investigate this claim, I first looked at the correlation between final rankings and goals scored.

Table 1. EPL 11/12 Season team rankings and goals

Plot 1. EPL 11/12 Goals vs Rankings. Notice the quadratic relationship.

There are two points to pick up from this data:

  1. The two title contenders scored significantly more goals than the other 18 teams.
  2. There isn’t a clear linear relationship between more goals scored and a higher place finish at the end of the league.

Simply put, in order to win the league, you need to score a lot more than anyone else, but for the other positions, the number of goals you score seems to have a lesser effect. It’s especially interesting to note that the relegation zone teams of last season (QPR, Bolton, Blackburn, Wolverhampton) scored more goals than the low mid-table teams (Sunderland, Stoke, Wigan, Aston Villa).

So then, what about their offense differentiates these two groups of teams? The factor I wanted to look into today is how the distribution of goals within each team affects its performance. In order to do this, I came up with a metric, the Goal Distribution Index (GDI), defined below:

Here N is the total number of goal scorers in the team, gi is the number of goals scored by player i, and gt is the number of goals scored by the team (discounting own goals). The scaling factor of 1000 is there just to make the GDI a value greater than 1. The GDI is designed so that the lower it is, the more evenly the goal scoring is distributed within the team. It also takes into account the tactical focus of goal scoring routes. A key assumption is that the fewer goals a scorer has, the more likely it is that the team was not focusing on that route in offense and that the goal came about by chance. To emphasize this, the fraction gi⁄gt is squared, which shrinks the contribution of low-scoring players. See the following table for additional information.
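The formula itself is not reproduced in the text, so here is a minimal sketch of one form that fits the description: the squared goal shares are summed, divided by the number of scorers N, and scaled by 1000. The division by N is my assumption, based on the remark that a large number of scorers brings the GDI down. The function name and both example teams are hypothetical.

```python
def goal_distribution_index(goals_by_scorer, scale=1000):
    """Assumed form of the GDI: (scale / N) * sum((g_i / g_t) ** 2).

    goals_by_scorer: goals per player, own goals excluded.
    The division by N is an assumption, not a confirmed definition.
    """
    scorers = [g for g in goals_by_scorer if g > 0]  # only players who scored
    if not scorers:
        raise ValueError("team has no goal scorers")
    n = len(scorers)        # N, the number of goal scorers
    total = sum(scorers)    # g_t, team goals excluding own goals
    return scale / n * sum((g / total) ** 2 for g in scorers)


# Two hypothetical teams, each with 40 goals from 10 scorers.
spread_out = [4] * 10          # every scorer has four goals
star_led = [22] + [2] * 9      # one 22-goal striker, nine minor scorers

print(goal_distribution_index(spread_out))  # about 10.0
print(goal_distribution_index(star_led))    # about 32.5
```

With the same goal total and the same number of scorers, the team built around one striker ends up with a GDI more than three times higher, which is the behavior described above.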

Table 2. EPL 11/12 Season teams’ goal data

The No column refers to the number of goal scorers in each team. In order to really investigate the effects of goal distribution on goals and rankings, the following graphs were constructed.

Plot 2. EPL 11/12 Season Goal vs GDI plot. No clear correlation can be found.

Plot 3. EPL 11/12 Season GDI vs Rankings. Correlation is virtually nonexistent.

As we can see, there is no clear correlation of any sort between these variables. From the GDI vs Goals plot, we see that the two league title contenders both have GDIs below the average (although Rooney scored a lot of goals in comparison to his teammates, Manchester United had a total of 16 goal scorers, which brought down their GDI significantly), but the next two teams have higher GDIs than the league average. However, when looking at the league as a whole, there seems to be no dependence whatsoever between GDI and goals. The number of goals a team scores over the season doesn’t, in general, seem to be affected by the distribution of the goals. Between GDI and rankings, the relationship is even weaker. Pearson’s correlation coefficient r, which measures the strength of the linear relationship between two variables, is -0.125 for Goals vs GDI and 0.038 for GDI vs Rankings. Considering that a coefficient of 1 or -1 is a perfect linear correlation and 0 is none, we can say that these variables show essentially no linear correlation with each other.
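For anyone who wants to repeat the correlation check, here is a minimal sketch of Pearson’s r in plain Python. The function name is my own, and the two lists are made-up placeholder values, not the figures from Table 2.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length, at least 2 items")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance up to a constant factor).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Hypothetical placeholder values, NOT the real 11/12 data.
gdi_values = [6.0, 9.5, 12.0, 17.0, 8.0, 11.5]
goals_scored = [50, 45, 74, 56, 39, 48]

print(pearson_r(gdi_values, goals_scored))
```

A value near 0, like the -0.125 and 0.038 reported above, only rules out a linear relationship. A curved relationship, such as the one in Plot 1, can still produce a small r.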

The simple conclusion to draw from this study is that Goals are Goals. It doesn’t matter who scores them or how they are scored; a goal is worth as much as any other goal. Maybe this isn’t a surprising conclusion, but it has certain implications for teams’ transfer activities. Since the acquisition of Robin van Persie isn’t going to mean that Manchester United will score roughly 120 goals this season (89 last season + 30 from van Persie last season), the 24M price tag for van Persie must be justified by more than just additional goals. Developing or buying cheaper players who offer varied attacking options and tactical maneuvers may have been a more economical decision (for example, the acquisition of Javier Hernandez in his first season). More importantly, this may be a better lesson for smaller teams with smaller budgets. If the team is suffering from a lack of goals, it doesn’t have to buy a high-profile goal scorer. It works just as well to have more players score goals.

However, this does not mean that a superstar goal scorer is unnecessary. In addition to the GDI, the Goal Contribution Factor (GCF) was designed for this study. The GCF is defined as:

It was designed simply to capture an individual player’s contribution, both in terms of the sheer number of goals he scores and in terms of his share of the team’s goals. Below are the players with a GCF of 0.8 or higher in the 2011/12 EPL season. According to the metric, these are the most dangerous goal scorers in the league, and in their respective teams by a significant margin.

Sergio Aguero – Manchester City (1.47)
Wayne Rooney – Manchester United (2.60)
Robin van Persie – Arsenal (5.07)
Emmanuel Adebayor – Tottenham (1.16)
Demba Ba – Newcastle (1.57)
Papiss Cisse – Newcastle (0.84)
Clint Dempsey – Fulham (2.66)
Danny Graham – Swansea (0.93)
Grant Holt – Norwich (1.25)
Peter Crouch – Stoke (0.82)
Yakubu Aiyegbeni – Blackburn (1.59)
Steven Fletcher – Wolverhampton (1.08)
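The GCF formula is not reproduced in the text either. One form that fits the description (sheer number of goals, combined with a squared share of the team’s goals) is gi × (gi⁄gt)², sketched below. This is an assumption on my part, and the function name is hypothetical. As a sanity check, van Persie’s 30 goals with an assumed Arsenal total of 73 goals excluding own goals (a figure not given in this post) come out at about 5.07, which matches the value listed above.

```python
def goal_contribution_factor(player_goals, team_goals):
    """Assumed form of the GCF: g_i * (g_i / g_t) ** 2.

    team_goals should exclude own goals. This formula is a reconstruction
    from the description, not a confirmed definition.
    """
    if team_goals <= 0:
        raise ValueError("team_goals must be positive")
    if not 0 <= player_goals <= team_goals:
        raise ValueError("player_goals must be between 0 and team_goals")
    share = player_goals / team_goals   # fraction of the team's goals
    return player_goals * share ** 2


# 30 goals is from the post; the team total of 73 is an assumption.
print(round(goal_contribution_factor(30, 73), 2))  # 5.07

# Hypothetical players on a 50-goal team: doubling the goal tally
# multiplies the assumed GCF by eight.
print(goal_contribution_factor(10, 50))  # about 0.4
print(goal_contribution_factor(20, 50))  # about 3.2
```

Because the goal tally enters three times, the factor falls away very quickly for anyone other than a team’s main scorer.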

Notice that the top 5 sides of the league all had a player who accounted for a significant portion of the team’s goals. Among the teams in the second quarter of the table, there is only Clint Dempsey. The rest of the players are spread evenly across the lower two quarters. The relationship probably reflects the presence of attacking talent in the top teams: they have players on whom they primarily focus, and despite the opposition’s preparation, these players are good enough to score often. On the other hand, for teams at the lower end of the table, these players are probably the only ones with enough attacking talent to score consistently in the league.

It is also interesting to note that Newcastle is the only team with two players on the list. Since the index squares the fraction of goals scored, the GCF diminishes rapidly as the number of goals scored decreases. The fact that both Demba Ba and Papiss Cisse managed to record a GCF higher than 0.8 shows both that the popular claim that the two are one of the most potent strike partnerships in the league is true, and that Newcastle’s goals are heavily concentrated in these two players. Unsurprisingly, Newcastle had the highest GDI in the league at 17.4, followed by Fulham (and Dempsey) at 17.1.

Demba Ba and Papiss Cisse, the most potent strike partnership in the EPL last season

At the other end of the spectrum is Everton, with a GDI of 4.8. Everton had 18 players score in the league last year, the most of any EPL team in 11/12. Compare the GDI of Everton to that of Arsenal. At 17, Arsenal had the second-highest number of players score in the league, but had a GDI of 12.1 (thanks to RvP). In contrast, Everton’s best goal scorer, Jelavic, scored 9 goals, just one more than Arsenal’s second-best goal scorer last season, Theo Walcott.

More data and more analysis are definitely necessary to establish and support the claims made here. Data from previous seasons and from other leagues would be equally helpful (the La Liga data will likely destroy the statement that goal distribution has very little correlation with goals scored and team ranking). There may be more definite trends in more specific portions of the data. For example, maybe there is a trend among league contenders or teams in the Champions League spots. Or the quadratic trend between goals and rankings could be explained by looking into whether relegation zone teams become more offensive near the end of the season in order to escape relegation. As usual, this study is limited in its scope, and any suggestions and comments are welcome.

Stats for Ronaldo and Messi will surely throw some, if not most, analysis in this post out the window.

