March 24, 2005

2005--Setting the Stage

Randomness in Team Standings Predictions

Spring training is in full bloom, and besides the furtive fantasy league planning going on in offices throughout the country (should the GDP's seasonal adjustments take this into account?), it is also the time for predictions to be made about the upcoming season. Most baseball coverage includes player or team projections of some kind, and Baseball Prospectus is no exception. However, I've often wondered just how accurate even the best predictions can be. We've written about predicting player performance before, so I am going to focus on predictions at the team level, namely the order of finish within a division.

An interesting aspect of the question is whether even perfect predictions of team quality can result in reliable predictions of standings. Is a 162-game season long enough for competitive teams to differentiate themselves conclusively?

One way to approach this is to use a past season's actual results to create a "perfect" predictor of a team's ability to win a game over a span of time, and then simulate a season's worth of games to see whether the simulation recreates the actual observed standings. For example, knowing that a team won 55% of its games in a season (89 wins out of 162 games), simulate a season in which it has a 55% chance of winning each game. This is the probability that maximizes the chance of observing 89 wins in the simulated season. Repeating this process for each team in turn creates a full season of predictions, each tuned to maximize the chance that the simulated outcome will match the known outcome. This represents the best that a perfect estimate of each team's ability to win games over the course of a season can do, with randomness over 162 games as the only source of discrepancy in this artificially constrained experiment.

And this is exactly the approach I took to the question. I simulated 1000 seasons.
For each team, I used its actual 2004 winning percentage as the probability of winning a game, and used a normal distribution to approximate the distribution of that team's win total over 162 games.
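
The procedure described above can be sketched roughly as follows. This is a minimal illustration, not the code used for the article: the function names, the random seed, and the placeholder division (teams "A" through "D" with made-up winning percentages, not 2004 data) are all assumptions. Each team's win total is drawn independently from a normal distribution with mean 162p and standard deviation sqrt(162p(1-p)), the usual normal approximation to a binomial count, and the sketch reports how often the simulated order of finish matches the order implied by the input percentages.

```python
import math
import random

GAMES = 162


def simulate_wins(win_pct, rng, games=GAMES):
    """Draw one season win total from a normal approximation
    to Binomial(games, win_pct), rounded and clamped to [0, games]."""
    mean = games * win_pct
    sd = math.sqrt(games * win_pct * (1.0 - win_pct))
    wins = round(rng.gauss(mean, sd))
    return max(0, min(games, wins))


def simulate_division(win_pcts, seasons=1000, seed=0):
    """Return the fraction of simulated seasons whose order of finish
    matches the order implied by the input winning percentages.

    Assumption: teams are simulated independently of one another, so
    head-to-head results and schedules are ignored. Ties in simulated
    wins are broken by the input order of the dictionary."""
    rng = random.Random(seed)
    true_order = sorted(win_pcts, key=win_pcts.get, reverse=True)
    matches = 0
    for _ in range(seasons):
        wins = {team: simulate_wins(p, rng) for team, p in win_pcts.items()}
        sim_order = sorted(wins, key=wins.get, reverse=True)
        if sim_order == true_order:
            matches += 1
    return matches / seasons


if __name__ == "__main__":
    # Hypothetical division; these percentages are placeholders only.
    division = {"A": 0.60, "B": 0.55, "C": 0.50, "D": 0.45}
    share = simulate_division(division, seasons=1000, seed=2005)
    print(f"Simulated order matched the input order in {share:.1%} of seasons")
```

Substituting a real division's winning percentages for the placeholder dictionary gives a rough sense of how often 162 games of pure chance reshuffle the standings.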