March 8, 2012
House of Cards
Without further ado, we present to you the PECOTA cards. Debuting on the cards are the 10-year forecasts and the percentile forecasts.
The 10-year projection process breaks down like so:
Peak ages will vary based upon a player’s comparables and his skill set (as each component gets its own aging curve). But typically for hitters, we see peak ages in the 10-year forecasts at around age 28, a year later than the conventional wisdom. This is somewhat offset by a decline in a player’s defensive value, which doesn’t really peak at all but starts to decline almost immediately upon his debut in the majors. (You can see the effects in this aggregation of various 10-year forecast components by age, weighted by DC playing time, here.)
Pitchers are more interesting. In terms of ERA, there seems to be an early peak at age 26. But pitchers who survive beyond that age seem to have a much later peak: those who pitch past ages 27–29 tend to peak around age 30, and some peak even later. So pitcher aging seems to be essentially bimodal, with some pitchers peaking early and others peaking late. This holds even if we restrict the analysis to pitchers who work primarily as starters for their entire careers. (Again, a breakdown is available here.)
Playing time in the 10-year forecasts reflects what we expect a player’s playing time to be at his peak production, rather than starting from his expected 2012 playing time. This results in more sensible long-term forecasts for young prospects who aren’t quite ready for MLB yet but are expected to have productive careers once they arrive. Playing time forecasts for off-peak years are then adjusted from the peak-year playing time forecast.
Now, projecting the future is more difficult the further out it goes. So how reliable are the 10-year forecasts? Looking at the root mean square error of projected 10-year True Averages for “backcasts” of historic players:
This is about what we would expect; a player’s performance 10 years down the road is substantially more difficult to project than his performance one year down the road. But especially through the first several seasons, the reliability of the forecasts is not substantially different.
Similarly, for pitchers:
Again, results in the first several seasons are very close, with results becoming harder to project the further out you get. (Bear in mind that these are forecasts for a neutral park and league context and thus will exhibit higher RMSEs than regular PECOTA forecasts.)
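For readers who want to run a similar check on their own data, root mean square error by forecast horizon can be computed roughly as follows. This is a minimal sketch, not the code behind PECOTA: the function name, the record layout, and the sample numbers are all made up for illustration.

```python
import math
from collections import defaultdict


def rmse_by_horizon(records):
    """Return {years_ahead: RMSE} for (years_ahead, projected, actual) records.

    The record layout is a hypothetical one chosen for this example.
    """
    squared_error = defaultdict(float)
    count = defaultdict(int)
    for years_ahead, projected, actual in records:
        squared_error[years_ahead] += (projected - actual) ** 2
        count[years_ahead] += 1
    # RMSE = square root of the mean squared error within each horizon.
    return {h: math.sqrt(squared_error[h] / count[h]) for h in sorted(count)}


# Invented TAv-style values, purely illustrative.
sample_records = [
    (1, 0.270, 0.280),
    (1, 0.260, 0.250),
    (10, 0.270, 0.240),
    (10, 0.260, 0.300),
]
rmse = rmse_by_horizon(sample_records)
```

With these invented numbers, the 10-year horizon produces a larger error than the one-year horizon, which is the general pattern described above.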
The percentiles are based on three primary variables:
Again, playing time is based upon a player’s expected performance: the better the performance, the more playing time we expect for that player.
Keep in mind that the percentiles key off of the primary value component: TAv for batters and ERA for pitchers (although the ERA is a component ERA with less variance than actual ERA, as it does not account for random variance in sequencing around the component lines). Component stats are meant to illustrate the key value stats only. A hitter’s 90th-percentile home run forecast, for instance, is not his maximum home run potential but the most likely home run total to accompany his 90th-percentile TAv.
How well do the percentiles do on historic data? Looking at back-forecasts from 1950 on, we see that 79 percent of observed TAvs fall between the 10th and 90th percentiles, and 60 percent fall between the 20th and 80th percentiles. That is almost exactly what we should expect (80 percent and 60 percent, respectively).
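The calibration check described here amounts to counting how often the observed value lands inside a forecast band. A minimal sketch of that count, assuming each back-forecast is stored as an observed value plus its percentile forecasts (the function name, tuple layout, and numbers below are hypothetical and not taken from PECOTA):

```python
def band_coverage(rows, low_index, high_index):
    """Share of rows whose observed value falls inside the forecast band.

    Each row is (observed, p10, p20, p80, p90), a layout assumed
    for this example only.
    """
    inside = sum(1 for row in rows if row[low_index] <= row[0] <= row[high_index])
    return inside / len(rows)


# Invented TAv-style rows: (observed, p10, p20, p80, p90).
sample_rows = [
    (0.265, 0.240, 0.250, 0.285, 0.295),
    (0.245, 0.240, 0.250, 0.285, 0.295),
    (0.300, 0.240, 0.250, 0.285, 0.295),
    (0.270, 0.240, 0.250, 0.285, 0.295),
    (0.280, 0.240, 0.250, 0.285, 0.295),
]
coverage_10_90 = band_coverage(sample_rows, 1, 4)  # ideal: about 0.80
coverage_20_80 = band_coverage(sample_rows, 2, 3)  # ideal: about 0.60
```

If the percentiles are well calibrated, the first figure should come out near 0.80 and the second near 0.60 on a large enough sample.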
As a reminder, PECOTA on the cards is restricted to subscribers only. To those of you who already subscribe, thank you for your patronage, and I hope you find the PECOTA cards useful, informative, and (employers of America, forgive me) a veritable time sink. Enjoy.
Additionally, you can look at progressions for different age groups over time, weighted by projected 2012 playing time according to the depth charts. Pitchers are located here. Hitters can be found here.
UPDATE: The 10-year forecast shows seasons two through ten for a player expected to play all ten seasons. For players whose forecast falls below the attrition rate, no forecast is displayed.