Predicting the Premier League results
Here is the spreadsheet showing the way in which my predictions were made. I hope it is comprehensible, at least for enthusiasts! I discussed this on the Today programme the day before the matches.
The statistical method used is basically the same as the one we used last year, when we did well and got 9 win/draw/lose results and 2 exact scores correct. For each team we work out an expected number of goals that they will score: this is based on the average for a home or away side (1.69 and 1.09 respectively this season, a strong home advantage of over 50%), adjusted by the 'attack strength' of the team and the 'defence weakness' of their opponents. The expected numbers of goals for the two teams are then taken as the means of two independent Poisson distributions, and the probability of each goal combination is calculated. Adding up the relevant probabilities then gives the assessed chances of a home win, draw, or away win.
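For enthusiasts, here is a minimal Python sketch of that calculation. It is not the spreadsheet itself: the function names are invented for illustration, the attack and defence figures in the example are made up, and I am assuming that 'adjusted by' means multiplying the league average by the two strengths.

```python
import math

HOME_AVG = 1.69  # average goals for a home side this season (from the text)
AWAY_AVG = 1.09  # average goals for an away side this season (from the text)


def poisson_pmf(k, mean):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def match_probabilities(home_mean, away_mean, max_goals=15):
    """Sum the score grid of two independent Poissons into result chances."""
    home_win = draw = away_win = 0.0
    best_score, best_prob = None, 0.0
    for h in range(max_goals + 1):
        for a in range(max_goals + 1):
            p = poisson_pmf(h, home_mean) * poisson_pmf(a, away_mean)
            if h > a:
                home_win += p
            elif h == a:
                draw += p
            else:
                away_win += p
            if p > best_prob:
                best_score, best_prob = (h, a), p
    return {
        "home_win": home_win,
        "draw": draw,
        "away_win": away_win,
        "most_likely_score": best_score,
        "score_probability": best_prob,
    }


# Hypothetical strengths, chosen only to illustrate the arithmetic.
# ASSUMPTION: the adjustment is multiplicative.
home_mean = HOME_AVG * 1.2 * 1.1  # home attack strength x away defence weakness
away_mean = AWAY_AVG * 0.9 * 0.8  # away attack strength x home defence weakness
example = match_probabilities(home_mean, away_mean)
print(example)
```

The grid is cut off at 15 goals per side, which loses a negligible amount of probability for means of this size.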
Last year we used a very simple model for attack strength and defence weakness, based on the total goals scored and conceded during the season. This year we have allowed the attack strength to depend on whether the team is playing at home or away. The easiest way to do this is to consider goals scored at home and away entirely independently, but we have 'smoothed' the resulting estimates by giving some weight to away goals when estimating home attack strength. (In formal statistical terms, we are fitting an approximate Poisson regression model with main effects for home/away, team and opposing team, and a mixed effect interaction term.)
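One simple way to picture this smoothing is as a weighted average of the separate home and away estimates. The sketch below is only an illustration of that idea, not the calculation in the spreadsheet: the function name, the weight of 0.25 and the goal counts are all assumptions.

```python
HOME_AVG = 1.69  # league average goals for a home side (from the text)
AWAY_AVG = 1.09  # league average goals for an away side (from the text)


def smoothed_attack_strengths(home_goals, home_games,
                              away_goals, away_games, weight=0.25):
    """Return (home, away) attack strengths relative to the league average.

    ASSUMPTION: smoothing is a plain weighted average, with `weight` being
    the share given to the 'other' venue. weight=0 treats home and away
    entirely independently; weight=0.5 gives a single pooled strength.
    """
    raw_home = (home_goals / home_games) / HOME_AVG
    raw_away = (away_goals / away_games) / AWAY_AVG
    home = (1 - weight) * raw_home + weight * raw_away
    away = (1 - weight) * raw_away + weight * raw_home
    return home, away


# Hypothetical team: 38 goals in 19 home games, 19 goals in 19 away games.
home_strength, away_strength = smoothed_attack_strengths(38, 19, 19, 19)
print(round(home_strength, 3), round(away_strength, 3))
```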
A real problem occurs when the 'most likely' exact score is a draw, but the most likely overall result is a win for one team. In this case we have gone for the most likely score consistent with the most likely overall result, although this means we have not predicted any draws.
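The rule can be sketched as follows, assuming the independent Poisson model described earlier. The function name and the expected goals of 1.4 and 1.1 are hypothetical, picked so that the single most likely score is a draw while a home win is the most likely result.

```python
import math


def poisson_pmf(k, mean):
    """Probability of exactly k goals under a Poisson distribution."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def predict(home_mean, away_mean, max_goals=10):
    """Pick the most likely result, then the likeliest score giving it."""
    grid = {
        (h, a): poisson_pmf(h, home_mean) * poisson_pmf(a, away_mean)
        for h in range(max_goals + 1)
        for a in range(max_goals + 1)
    }

    def result_of(score):
        h, a = score
        return "home win" if h > a else "draw" if h == a else "away win"

    # Add up the score probabilities within each of the three results.
    totals = {"home win": 0.0, "draw": 0.0, "away win": 0.0}
    for score, p in grid.items():
        totals[result_of(score)] += p

    result = max(totals, key=totals.get)
    overall_mode = max(grid, key=grid.get)
    consistent = [s for s in grid if result_of(s) == result]
    score = max(consistent, key=grid.get)
    return result, score, overall_mode


# Hypothetical expected goals for the home and away sides.
result, score, overall_mode = predict(1.4, 1.1)
print(result, score, overall_mode)
```

With these made-up means the likeliest single score is 1-1, but the prediction reported would be a 1-0 home win.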
The final predictions are as follows:
Home | Away | Most likely score | Probability of result (win/draw/lose) | Probability of exact score | Actual result |
---|---|---|---|---|---|
Arsenal | Fulham | 2-0 | 73% | 15% | 4-0 |
Aston Villa | Blackburn | 1-0 | 69% | 16% | 0-1 |
Bolton | Birmingham | 1-0 | 39% | 11% | 2-1 |
Burnley | Tottenham | 0-2 | 64% | 11% | 4-2 |
Chelsea | Wigan | 4-0 | 96% | 11% | 8-0 |
Everton | Portsmouth | 2-0 | 75% | 14% | 1-0 |
Hull | Liverpool | 0-1 | 61% | 14% | 0-0 |
Man U | Stoke | 2-0 | 80% | 18% | 4-0 |
West Ham | Man C | 1-2 | 55% | 10% | 1-1 |
Wolves | Sunderland | 0-1 | 37% | 15% | 2-1 |
Importantly, by adding up the probabilities we can work out how many predictions we expect to get right: 6.5 results and 1.3 exact scores. Anything more than this is luck!
By multiplying the probabilities we can assess the chance that all the predictions will be correct: this comes to a chance of around 1 in 100 that all the results will be right, and around 1 in 700 million that all the exact scores will be correct. That's why I don't bet on these predictions.
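This arithmetic is easy to check from the table. The sketch below uses the rounded percentages as printed, so the answers will differ a little from the figures quoted, which presumably come from unrounded spreadsheet values. Multiplying also assumes the ten matches are independent.

```python
import math

# Probabilities as printed in the table, in match order (rounded values).
result_probs = [0.73, 0.69, 0.39, 0.64, 0.96, 0.75, 0.61, 0.80, 0.55, 0.37]
score_probs = [0.15, 0.16, 0.11, 0.11, 0.11, 0.14, 0.14, 0.18, 0.10, 0.15]

# The expected number of correct predictions is the sum of the probabilities.
expected_results = sum(result_probs)
expected_scores = sum(score_probs)

# The chance of getting everything right is the product of the
# probabilities, assuming the matches are independent.
all_results = math.prod(result_probs)
all_scores = math.prod(score_probs)

print(f"Expected correct results: {expected_results:.2f}")
print(f"Expected correct exact scores: {expected_scores:.2f}")
print(f"All results right: about 1 in {1 / all_results:,.0f}")
print(f"All exact scores right: about 1 in {1 / all_scores:,.0f}")
```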
Added at 6pm on Sunday 9th May
A fairly pathetic result. Only 5 results right and no correct exact scores: fewer than expected, but not incompatible with the probabilities given. Just goes to show that uncertainty does not always play out as desired. Mark Lawrenson for the BBC did better, with 6 results and 2 exact scores, so he gets his own back for last year, when he only got 7 results and 1 exact score. Oh well, back to the day job.
