One of baseball’s enduring charms is its ability to defy prediction. Each time we think we’re absolutely sure of something (say, that the 2008 Tigers will score a bajillion runs, or that Juan Pierre will be a disaster filling in for Manny Ramirez), our forecasts are confounded by baseball’s eternally fickle nature. Sophisticated projection tools, such as Nate Silver’s PECOTA, are designed to help take some of the guesswork out of predicting how teams and players will perform during a given season, and often produce surprisingly accurate forecasts on the whole. But even PECOTA is prone to big misses, especially in individual player projections, which help to preserve the game’s air of mystery.
Projection models generally use each new season of data to better aim next year’s forecasts. A good model should improve its accuracy over time, and while a swing and a miss on a given player in a given season is inevitable, the addition of new data should make it less likely to be repeated. So how quick a study is PECOTA? After missing once, can it use this new information to get back in the box and make solid contact, or do certain players continue to perplex the system, year after year?
To find this out, I looked at PECOTA Equivalent Average (EqA) projections for all hitters with 300+ plate appearances during the 2006-2008 seasons, and compared them to the actual offensive production of those hitters. If the projection was at least 10 points lower than the actual EqA, I rated that projection as an “underestimation”; a projection that was at least 10 points higher is considered an “overestimation.” The results are shown in the following charts, starting with the players from whom PECOTA expected the worst:
Players with 300+ PA During Season: PECOTA Underestimations

       Sample                       Sample   Proj. EqA
Year   Description                  Size     10 Pts. Low     %
2006   All Players                   260        124         48%
2007   Underestimated in 2006        108         22         20%
2008   Underestimated in 2006-07      17          7         41%
During the 2006 season, there were 260 players with at least 300 plate appearances. Of those, 124 players saw their actual EqA surpass their projection by at least 10 points, a surprisingly high 48 percent. Presumably, PECOTA should be able to absorb this new information and adjust those players’ 2007 projections accordingly, and the numbers seem to bear this out. Of the 108 “overachieving” players from 2006 who again met the 300 PA threshold in 2007, only 22 (20 percent) again exceeded PECOTA’s projection by 10 or more points. Of the 17 players whom PECOTA had twice underestimated and who met the 300 PA threshold in 2008, seven were underestimated yet again. The percentage goes up to 41 percent, but with such a small sample that’s probably just noise. There were 154 players who met the PA threshold in all three seasons; of those, only seven (4.5 percent) were underestimated by PECOTA in all three seasons.
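For readers who want to check the arithmetic, here is a minimal Python sketch that recomputes the repeat-miss rates from the counts in the underestimation table. The rough error bar is my own addition, not part of the original analysis: it assumes each player's hit-or-miss outcome is an independent coin flip (a simple binomial approximation), and the function names are purely illustrative.

```python
import math


def rate(misses, sample):
    """Share of the sample that PECOTA missed by 10+ points."""
    return misses / sample


def standard_error(misses, sample):
    """Rough binomial standard error of the miss rate.

    Assumption: every player is an independent trial with the same
    chance of being missed. That is a simplification for illustration.
    """
    p = misses / sample
    return math.sqrt(p * (1 - p) / sample)


# (label, players missed low by 10+ points, sample size), from the table above
underestimations = [
    ("2006, all players", 124, 260),
    ("2007, underestimated in 2006", 22, 108),
    ("2008, underestimated in 2006-07", 7, 17),
]

for label, misses, sample in underestimations:
    p = rate(misses, sample)
    se = standard_error(misses, sample)
    print(f"{label}: {p:.0%} of {sample} players (rough error +/- {se:.0%})")
```

Under that assumption, the error bar on the 17-player group comes out to roughly 12 percentage points, about three times the size of the one on the 108-player group, which is why the jump back up to 41 percent is hard to take at face value.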
Players with 300+ PA During Season: PECOTA Overestimations

       Sample                       Sample   Proj. EqA
Year   Description                  Size     10 Pts. High    %
2006   All Players                   260         65         25%
2007   Overestimated in 2006          39         20         51%
2008   Overestimated in 2006-07       12          4         33%
Here we see that PECOTA, as a stern evaluator, was about half as likely in 2006 to overestimate a player (25 percent) as to underestimate one (48 percent). Not surprisingly, those that PECOTA overestimated (and who thus had a disappointing season) were less likely to meet the 300 PA threshold in the following season, so our sample shrinks at a faster rate. But interestingly, in 2007 PECOTA didn’t seem to learn as much about the underachievers as it did about the overachievers. While PECOTA had only a 1-in-5 chance of repeating its underestimation in 2007, more than half the players it overestimated in 2006 (who met the PA threshold in 2007) were again overestimated in 2007.
The list of players who were twice overestimated is peppered with names like Jim Edmonds, Richie Sexson, Trot Nixon, Craig Biggio, and the Giles brothers, players who had been highly productive but whose numbers suddenly cratered (often due to age or injury). For PECOTA, as with managers and fans, it took a while to see that these players truly had become shadows of their former selves. By the third season, most of these players were either no longer full-time major leaguers, or PECOTA finally stopped squinting and came up with a more realistic projection: only four players (2.6 percent of the 154 who met the PA threshold in all three seasons) were overestimated a third time.
Who were these masked men, the players who managed to turn PECOTA into Pollyanna, continually predicting performance far beyond that which they produced?
                 2006    2006   |  2007    2007   |  2008    2008
                 Actual  PECOTA |  Actual  PECOTA |  Actual  PECOTA
Player           EqA     EqA    |  EqA     EqA    |  EqA     EqA
Bobby Crosby     .231    .276   |  .225    .265   |  .234    .255
Juan Uribe       .234    .253   |  .231    .263   |  .236    .250
Austin Kearns    .282    .292   |  .271    .290   |  .223    .280
Jason Varitek    .248    .284   |  .272    .282   |  .237    .274
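The 10-point rule used throughout this piece is simple enough to sketch in a few lines of Python. This is only an illustration of the rule as described, not PECOTA's code or the script behind the tables; the function names and the "close" label for projections within 10 points are my own. The EqA values are copied from the table above, and the comparison is done in whole points so that a gap of exactly 10 (Kearns in 2006, Varitek in 2007) is not lost to floating-point rounding.

```python
def eqa_points(eqa):
    """Convert an EqA such as .276 into whole points (276)."""
    return round(eqa * 1000)


def classify(projected, actual, threshold=10):
    """Label a projection using the 10-point rule.

    "over"  = projection at least `threshold` points above actual EqA
    "under" = projection at least `threshold` points below actual EqA
    "close" = anything in between (label chosen here for illustration)
    """
    diff = eqa_points(projected) - eqa_points(actual)
    if diff >= threshold:
        return "over"
    if diff <= -threshold:
        return "under"
    return "close"


# (actual EqA, PECOTA EqA) for 2006, 2007, 2008, copied from the table above
three_time_overestimates = {
    "Bobby Crosby": [(0.231, 0.276), (0.225, 0.265), (0.234, 0.255)],
    "Juan Uribe": [(0.234, 0.253), (0.231, 0.263), (0.236, 0.250)],
    "Austin Kearns": [(0.282, 0.292), (0.271, 0.290), (0.223, 0.280)],
    "Jason Varitek": [(0.248, 0.284), (0.272, 0.282), (0.237, 0.274)],
}

for player, seasons in three_time_overestimates.items():
    labels = [classify(projected, actual) for actual, projected in seasons]
    print(player, labels)
```

Every season for all four players should come back as "over", matching their place on this list.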
If there’s a pattern to discern here, it’s early promise followed very quickly by injury and/or disappointment, or what we might call the Ben Grieve career path. Crosby has long been either injured or lackluster, with his career shape looking ever more like a Pet Rock: instant, inexplicable, short-lived success that quickly becomes a metaphor for fleeting value. Kearns has never exactly been bad (until recently); neither has he become the consistent, multi-talented outfielder most thought he would grow into. PECOTA seems to have focused on what Ooh Ooh Uribe could do (hit 20-plus home runs in his mid-20s) while ignoring what he couldn’t do (get his OBP much above the mid-.200s). Only Varitek stands out in this crowd, and it looks as if PECOTA felt his leadership and moxie would exempt him from the standard catcher aging curve.
Can any member of this rogues’ gallery make PECOTA whiff yet again this year? Varitek’s bounce-back season (.278 EqA) and Uribe’s surprising competence (.263) in San Francisco have them both far exceeding PECOTA’s sudden and deserved pessimism. On the other hand, Crosby (.235 actual/.243 projected) and Kearns (.237 actual/.275 projected) continue to be poster children for unrealized potential and may well achieve the four-peat.
The list of three-time overachievers is a little more complex:
                    2006    2006   |  2007    2007   |  2008    2008
                    Actual  PECOTA |  Actual  PECOTA |  Actual  PECOTA
Player              EqA     EqA    |  EqA     EqA    |  EqA     EqA
Chipper Jones       .331    .303   |  .339    .308   |  .360    .321
Hanley Ramirez      .288    .241   |  .318    .277   |  .320    .298
Matt Holliday       .304    .271   |  .317    .296   |  .316    .295
Dan Uggla           .278    .233   |  .275    .262   |  .296    .273
Ichiro Suzuki       .288    .266   |  .302    .277   |  .283    .271
Aaron Miles         .234    .222   |  .240    .206   |  .265    .221
Mark Grudzielanek   .257    .247   |  .268    .255   |  .265    .245
A PECOTA Similarity Score below 20 indicates a player who is especially distinctive and difficult to compare to other players; Chipper Jones (Sim Score: 4) and Ichiro Suzuki (Sim Score: 17) fall into this category. Look at Chipper’s Equivalent Averages: it’s not like PECOTA expected the oft-injured star to become unproductive, it’s just that he’s been virtually superhuman (when healthy) during his late-career drive towards Cooperstown. Ichiro is a unique story, so it’s not surprising PECOTA has never been sure what to make of him. Uggla and Ramirez both achieved such immediate success that it’s taken time for PECOTA to believe what our eyes have already seen, especially for Hanley, whose minor league numbers were no match for his scouting reports. I’m not sure exactly what ancient grudge PECOTA has held against Matt Holliday, but it looks like 2009 might see them meeting halfway (.296 actual/.305 projected EqA). Miles has only managed to be not as awful as you might think, while Grudz has been a useful player far later into his 30s than most would have thought possible, and with his recent history of exceeding expectations, the Twins may have bought themselves a useful insurance policy.
For 2009, PECOTA has finally come around on this group; in fact, all but two players are currently well below their forecast. Hanley’s .324 projection is pretty much spot-on. The only player currently in great danger of yet again being underestimated by 10 or more points: the inscrutable Ichiro (.308 actual/.258 projected EqA), continually mistaken by PECOTA like some latter-day Rodney Dangerfield, someone who just can’t get any respect.
With the percentage chance of missing three times in a row comfortably in the low single digits, it looks as if PECOTA rarely goes into extended slumps when projecting any given player. There may be a few specific types of players (early career busts, players who maintain high productivity late into their 30s, holders of the single-season hit record) that tend to be venerated or demonized longer than they should. But overall, a given PECOTA projection is at least as accurate as your local weather forecast: good enough to know whether you’ll need a coat, but with enough short-term variation to occasionally leave you out in the cold.