Tuesday, 25 April 2017

Profiling Open Play Shooters in the 2016/17 Championship

As a follow-up to the recent post looking at the average distance from goal that Premier League strikers tend to shoot from and the variability of their preferences, here's the same for the EFL Championship.

The players are arranged in decreasing order of shot volume, and only shots from open play are considered. All headers are also omitted.

I've simply ranked the players in terms of distance and variability.

Average distance to goal from the origin of all shots has been used to determine those who take their efforts closest to goal. Currently this honour falls to Tammy Abraham among the top 100 individual shooters by volume.

Tammy also barely strays from this area of preference, bagging him the description of an archetypal goal hanger, along with Scott Hogan.

Variability in the range of shots a player takes has been determined by calculating the amount each shot taken by a player varies from that player's average shooting position.

In contrast, Jacob Butterfield only appears to try his luck from distance.
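The two measures described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the coordinate system (goal centre at the origin, units in yards), the function name `shooting_profile`, and the choice of "mean distance from the average shot location" as the variability measure are mine, not necessarily the exact calculation behind the tables.

```python
import math

def shooting_profile(shots):
    """Return (average distance to goal, variability) for one player.

    shots: list of (x, y) shot origins in yards, with the centre of
    the goal assumed to sit at (0, 0). This layout is hypothetical.
    """
    n = len(shots)
    # Average distance from each shot origin to the centre of the goal.
    avg_distance = sum(math.hypot(x, y) for x, y in shots) / n
    # Average shooting position (centroid of all shot origins).
    mean_x = sum(x for x, _ in shots) / n
    mean_y = sum(y for _, y in shots) / n
    # Variability: how far, on average, each shot strays from that position.
    variability = sum(math.hypot(x - mean_x, y - mean_y) for x, y in shots) / n
    return avg_distance, variability

# Two made-up shots, three yards either side of centre, four yards out.
print(shooting_profile([(3.0, 4.0), (-3.0, 4.0)]))  # (5.0, 3.0)
```

A goal hanger would show a small value on both numbers, while a player who only shoots from range would pair a large average distance with a small variability.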

As a more general guide to the likely shooting distance and either single mindedness or willingness to vary their approach, you can refer to the crib sheet below.

Data from Infogol

Saturday, 22 April 2017

Profiling Open Play Shooters in the 2016/17 Premier League.

Among the many attributes Charlie Adam has brought to the Premier League is his willingness to give it a go from distance.

Whereas most cultured midfielders, picking the ball up in their own half, would only consider their passing options, Charlie can often be seen checking out that the opposing goalie is actually paying attention.

Success from such extreme distance is invariably fleeting, if glorious, but Adam's opportunism has extended his average open play shooting distance in 2016/17 to nearly 29 yards from goal.

This places him top (or if you prefer, bottom) in the list of 75 greatest volume shooters when sorted by average distance from goal per attempt.

Andros Townsend, hang your head in shame (a mere 68th).

"Now if I could just tame this damn thing......"

Sadio Mane is currently the league's anti Adam.

The average distance to the centre of the goal for all open play shots attempted by Mane is around 16 yards, nearly half the average racked up by Adam.

These are useful figures, but it is also helpful to know whether a player sticks close to his average or ranges widely around it. Is Mane a habitual penalty area shooter? Is anyone safe from Charlie's optimism? It is, after all, relatively rare to see Adam threatening to get on the end of a tap in from the six yard box.

We can attempt to answer this additional question about a player's shooting profile by measuring how far each of his attempts strays from his average shooting position for a particular season.

In Mane's case, the answer is not very far.

Out of the top 75 volume shooters in 2016/17, he's ranked 10th for sticking close to his average shooting position of 16 yards from the centre of the goal-line.

Adam, by contrast, is ranked 73rd out of 75 for sticking close to his average shooting position. He pretty much shoots from anywhere and everywhere.
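The two rankings can be produced with a simple sort on each measure. The sketch below assumes each player already has an (average distance, variability) pair. The player labels and numbers are invented, and `rank_players` is a hypothetical helper rather than the code used for the list.

```python
def rank_players(profiles):
    """Rank players on distance and variability.

    profiles: dict of name -> (avg_distance, variability).
    Returns dict of name -> (distance_rank, variability_rank), where
    rank 1 means closest to goal / sticks closest to his average spot.
    """
    by_distance = sorted(profiles, key=lambda name: profiles[name][0])
    by_variability = sorted(profiles, key=lambda name: profiles[name][1])
    distance_rank = {name: i for i, name in enumerate(by_distance, start=1)}
    variability_rank = {name: i for i, name in enumerate(by_variability, start=1)}
    return {name: (distance_rank[name], variability_rank[name]) for name in profiles}

# Invented figures for three hypothetical players.
example = {
    "Player A": (16.0, 5.0),
    "Player B": (29.0, 12.0),
    "Player C": (20.0, 4.0),
}
for name, ranks in rank_players(example).items():
    print(name, ranks)
```

Ties are broken by the order in which players appear in the input, which may or may not match how the original list handles them.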

Here's the full list.

Aguero shakes out as the league's premier goal hanger. His average attempt is ranked the 7th closest to goal and he sticks vigorously to his hunting ground, with England new boy Defoe running him close.

Mata is the second nearest average shooter, but his 15th ranked variability suggests that he's had a few tap ins from virtually the goal line and he's far from a classic six yard box poacher.

Pogba's a long distance shooter, and a relatively high variability rank (low variability) suggests that long distance shots are his gig.

But the prince of long distance shooting with little desire to get into the six yard box is Spurs' Eriksen.

He has a very similar profile to Swansea's Sigurdsson, an ideal transfer target should Spurs require depth in this niche shooting role.

Data from Infogol

Saturday, 18 February 2017

Expected Saves Ageing Curve.

Everyone is probably familiar with the concept of expected goals, assists and saves by now.

A modelled prediction of the likelihood that a player will score, based mainly on the location and type of attempt, is summed over a number of attempts and then compared to his or her actual output.

A player who scores say ten goals against a cumulative expected goals tally of eight is therefore considered to have over performed against their expectation.
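As a small worked illustration of that arithmetic, the sketch below sums per-attempt scoring probabilities and compares the total with actual goals. The probabilities are made up, and `over_performance` is a hypothetical helper name.

```python
def over_performance(shot_probabilities, goals_scored):
    """Return (expected goals, goals scored minus expected goals)."""
    expected_goals = sum(shot_probabilities)
    return expected_goals, goals_scored - expected_goals

# Forty invented attempts, each given a 20% chance of being scored.
shots = [0.2] * 40
expected, difference = over_performance(shots, goals_scored=10)
print(round(expected, 2), round(difference, 2))
```

With these invented inputs the player has an expectation of about eight goals and is about two goals ahead of it, matching the example in the text.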

The reasons for, and the sustainability of, this over achievement can be many and varied, ranging from the presumption that they are a persistently skilled finisher, to their having had a hot finishing run, to the model being inadequate to fully describe the nuances of real football life. (Although the latter may be mitigated by running goodness of fit tests on out of sample data).

Instead of merely presenting expected and actual goal numbers ranked by over and under achievement, the same information can be presented in a more graphical form.

Rather than quoting cumulative figures, the granular nature of attempts is respected by using a Monte Carlo simulation for all shots and headers to produce a range and frequency of potential actual goals scored based on all attempts. These distributions are then compared to reality.
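A minimal sketch of this kind of simulation is shown below, assuming each attempt carries an independent scoring probability from some expected goals model. The function name, the number of simulations and the fixed seed are my own choices for illustration, not details of the model used for the plots.

```python
import random
from collections import Counter

def simulate_goal_totals(shot_probabilities, n_sims=10_000, seed=0):
    """Simulate every attempt n_sims times and return the share of
    simulations that finished on each possible goal total."""
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    totals = Counter()
    for _ in range(n_sims):
        # Each attempt is scored if a uniform draw falls below its probability.
        goals = sum(rng.random() < p for p in shot_probabilities)
        totals[goals] += 1
    return {goals: totals[goals] / n_sims for goals in sorted(totals)}

# Invented attempts: a few good chances and a lot of speculative ones.
attempts = [0.35, 0.30, 0.25] + [0.05] * 20
distribution = simulate_goal_totals(attempts)
for goals, share in distribution.items():
    print(goals, round(share, 3))
```

The resulting frequencies describe the spread of plausible goal tallies, which can then be set against the number of goals actually scored.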

Here's a recent example showing Chelsea and, to a lesser degree, Spurs and Arsenal outstripping, in a possibly unsustainable manner, the simulated range of potential goal difference tallies based on the number and quality of chances they have each created and allowed.

The same approach may be used to describe, if not fully predict, the performances of goalkeepers.

In defining the difficulty of the task faced by a keeper it is legitimate to include post shot information, such as placement, strength and whether or not a shot took a deflection. These are additions that may not be repeatable from the shooter's point of view, but do better describe the reality of the keeper's task.

Here's a distribution plot for a number of Premier League goalies in 2016/17. Hull's Jakupovic is most likely to have conceded 15 goals, rather than the nine he actually has, and there is around a 1% chance that the average keeper described by the model would have performed as well or better.

By contrast, Bravo is having a well documented torrid time at Manchester City, conceding nine more goals than the most likely peak of the simulated distribution of the attempts he has been asked to save.
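Both of the quantities quoted for the keepers, the most likely number of goals conceded and the chance that an average keeper would have done as well or better, can be read off the distribution of goals conceded. The sketch below computes that distribution exactly rather than by simulation, which is a substitution on my part for the sake of a repeatable example. It assumes independent attempts, and the input probabilities and function names are hypothetical.

```python
def goals_conceded_distribution(goal_probabilities):
    """Exact distribution of goals conceded by an average keeper.

    goal_probabilities: for each attempt faced, the modelled chance
    that it ends up as a goal. Returns a list where entry k is the
    probability of conceding exactly k goals.
    """
    dist = [1.0]  # before any attempt, zero goals conceded is certain
    for p in goal_probabilities:
        updated = [0.0] * (len(dist) + 1)
        for k, prob in enumerate(dist):
            updated[k] += prob * (1 - p)   # attempt saved or missed
            updated[k + 1] += prob * p     # attempt conceded
        dist = updated
    return dist

def chance_as_good_or_better(goal_probabilities, goals_conceded):
    """Probability an average keeper concedes this many goals or fewer."""
    dist = goals_conceded_distribution(goal_probabilities)
    return sum(dist[: goals_conceded + 1])

# Invented workload: 60 attempts, each a 25% chance of a goal.
faced = [0.25] * 60
dist = goals_conceded_distribution(faced)
most_likely = max(range(len(dist)), key=dist.__getitem__)
print(most_likely, chance_as_good_or_better(faced, goals_conceded=9))
```

A keeper well to the left of the peak is over performing the model, and one well to the right is under performing it.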

However, the question remains as to whether these snapshots of "form" represent a longer term up or down tick in the keeper's potential future performance in his current environment or if they will regress towards less extreme levels going forward.

David de Gea is a couple of goals in credit against the model's expectation in 2016/17 and, while this is not uncommon for United's keeper, it is possible to find runs of 50 attempts when he would have been classed as under performing.

Notably in May of 2015 and February 2016.
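Runs like these can be found by sliding a fixed window along a keeper's attempts in date order and comparing expected with actual goals conceded in each window. The sketch below assumes the data arrives as (goal probability, conceded) pairs. The function name and the tiny example are invented.

```python
def rolling_over_performance(attempts, window=50):
    """Expected minus actual goals conceded for every run of `window`
    consecutive attempts.

    attempts: chronological list of (goal_probability, conceded),
    where conceded is 1 for a goal and 0 otherwise.
    Positive values mean fewer goals conceded than the model expected.
    """
    results = []
    for start in range(len(attempts) - window + 1):
        run = attempts[start:start + window]
        expected = sum(p for p, _ in run)
        actual = sum(conceded for _, conceded in run)
        results.append(expected - actual)
    return results

# Tiny invented example with a window of two attempts.
print(rolling_over_performance([(0.5, 0), (0.5, 1), (0.5, 1)], window=2))  # [0.0, -1.0]
```

Windows with a negative value would be the stretches in which the keeper was classed as under performing.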

Perhaps most usefully, this simulation approach may open up another way to look at the age at which players in a given position generally reach the peak of a particular attribute.

A variety of methods and curves have been used (see here, here and here). Grouping keepers by their rounded age when they did or didn't make a particular save and then seeing if these pooled age groups show a tendency to over or under perform may be another route.
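The grouping step might look something like the sketch below. It assumes each attempt records the keeper's age in years at the time, the modelled chance of a goal and whether a goal was conceded. The field layout, the half-up rounding and the function name are all assumptions for illustration.

```python
import math
from collections import defaultdict

def over_performance_by_age(attempts):
    """Pool attempts by the keeper's rounded age and total up
    expected minus actual goals conceded for each age.

    attempts: iterable of (keeper_age_years, goal_probability, conceded).
    Positive totals mean keepers of that age conceded fewer goals
    than the model expected.
    """
    totals = defaultdict(float)
    for age, goal_probability, conceded in attempts:
        rounded_age = math.floor(age + 0.5)  # round halves up
        totals[rounded_age] += goal_probability - conceded
    return dict(sorted(totals.items()))

# Three invented attempts faced by keepers of different ages.
sample = [(27.6, 0.5, 0), (28.2, 0.5, 1), (21.4, 0.25, 0)]
print(over_performance_by_age(sample))  # {21: 0.25, 28: 0.0}
```

Pooling every keeper's attempts at a given age gives larger samples per age than looking at individuals, although older ages will still be dominated by the keepers good enough to keep playing.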

Here's the under (red) and over (green) shot stopping performance of Premier League goal keepers, sorted by age over multiple seasons.

Notwithstanding the problems of survivor bias for older keepers in this type of traditional plot, there does appear to be a tendency for keepers to over perform an attempt based model in their mid to late 20s, peaking at around 28 (which is consistent with other approaches listed above).

Under performance in their formative years relative to their older selves, and in the advanced stages of their careers relative to their younger selves, is also typical of ageing curves in general.

This approach of course may be used for other performance related indicators across other playing positions.

Modelled data via InfoGolApp