October 11, 2006
Lies, Damned Lies
Every year since PECOTA made its debut, we've used the system to project the season's final standings. Some of those years have been better than others from a prediction standpoint--2003 and 2005 were strong, while 2004 was pretty much an unmitigated disaster with crazy park effects, breakouts, and collapses across the board. But we've never taken the time to go back and reflect on PECOTA's projections.
In the tables below, you'll see a number of comparisons between PECOTA's projections and the final major league standings. "Vegas" is not actually Vegas, but the over/unders from a prominent online gambling site that I was able to dig up after extensive googling. "Result" is what would have happened if you'd used PECOTA to bet those over/under lines. The RS (runs scored) and RA (runs allowed) categories represent the number of runs by which PECOTA missed after accounting for changes in overall run scoring; a positive number means the team scored or allowed more runs than projected. This adjustment is important, because the average National League team scored about 40 more runs than it did in 2005. (PECOTA does not try to project changes to overall run scoring--mainly because my research indicates that these changes are entirely unpredictable--but instead assumes that run scoring levels will be the same as they were in previous seasons.)
AL East      Actual  PECOTA  Vegas  Result  RS    RA
Yankees      97      94      100    Win     +19   -11
Blue Jays    87      79      86     Loss    +20   -56
Red Sox      86      93      90.5   Loss    -94   +41
Orioles      70      77      76.5   Loss    -10   +81
D-Rays       61      69      67.5   Loss    -59   -14
The East was not one of PECOTA's stronger divisions, but it's fairly obvious where the problems lie. PECOTA vastly overestimated the Red Sox' offense, failing to account for the injury to Jason Varitek or the disappearance of Coco Crisp; it did urge caution on Mark Loretta at least. It also underestimated the depth of the Blue Jays' bullpen, while failing to account for the collapses that Rodrigo Lopez and Bruce Chen experienced at the tail end of the Orioles' rotation. Of these issues, the Blue Jays' bullpen is the one that troubles me the most. PECOTA had some awfully conservative projections for a number of bullpens this year, and while that turned out to be prudent in the case of the White Sox and the Braves, the system seems to have trouble distinguishing the better relievers from the inferior ones once you get past the Trevor Hoffmans of the league. Then again, so do a lot of baseball teams.
AL Central   Actual  PECOTA  Vegas  Result  RS    RA
Twins        96      84      80     Win     +43   -45
Tigers       95      83      77     Win     +21   -102
White Sox    90      82      91.5   Win     +82   +21
Indians      78      88      90     Win     +44   +24
Royals       62      61      64     Win     +61   +69
Whether PECOTA deserves the "win" for the White Sox prediction is debatable: the under would have won you money, but Vegas came far closer to projecting the White Sox' win total than PECOTA did. Most of that is the result of projecting too little playing time for Jim Thome and too little production from Jermaine Dye, though players with a history of catastrophic injuries are going to give any projection system trouble. Otherwise, however, PECOTA deserves a lot of credit for perceiving that this would be a competitive four-team race, rather than the White Sox and Indians running away with things. The eerie parallels between the '05 White Sox and the '06 Tigers continue with the latter's dismantling of an AL East powerhouse over the weekend.
AL West      Actual  PECOTA  Vegas  Result  RS    RA
A's          93      93      88.5   Win     -53   +21
Angels       89      81      88.5   Loss    +42   -1
Rangers      80      80      80.5   Win     -40   -94
Mariners     78      77      75.5   Win     -6    -15
Another fairly good division for PECOTA, though under the surface there are a couple of issues. Most of the individual projections for the Angels' offense were close to the mark--Juan Rivera is the obvious exception--but the Angels got 1,000 pretty good at-bats from Mike Napoli, Howie Kendrick, Robb Quinlan, and Tim Salmon that the system hadn't anticipated. The bigger problem was underestimating the Rangers' run prevention, though the Rangers had a number of pitchers, like Rick Bauer, whose performance outpaced their peripherals.
NL East      Actual  PECOTA  Vegas  Result  RS    RA
Mets         97      88      91     Loss    +25   -9
Phillies     85      86      81.5   Win     +37   +31
Braves       79      85      89     Win     +20   +22
Marlins      78      71      67     Win     +60   -17
Nationals    71      70      77     Win     +72   +92
During spring training, Kevin Goldstein asked for predictions on our internal mailing list about the Marlins' win total. I don't have the e-mail chain handy, but I was the lone voice in the wilderness predicting at least 70 wins for the Fish. My argument--supplemented by PECOTA--was that the pitching was going to be pretty decent, and that the offense would be better than it looked on Opening Day as the Marlins sorted through the available options. Of course, PECOTA was still way off the mark on Hanley Ramirez and Dan Uggla (though it was right on Jeremy Hermida). The other problem here is with the Nationals, where PECOTA nailed the win total, but got the offense/defense balance completely wrong. You might think that the park factor was the issue there, but PECOTA's projected park factor of 92 wasn't far from the actual 96. Perhaps we have to introduce a "contract year" variable into the system after seeing what Alfonso Soriano has done.
NL Central   Actual  PECOTA  Vegas  Result  RS    RA
Cardinals    83      86      94     Win     -5    +21
Astros       82      81      83.5   Win     -20   -35
Reds         80      78      74     Win     -35   -15
Brewers      75      84      80.5   Loss    -50   +82
Pirates      67      79      75.5   Loss    -78   +1
Cubs         66      85      85     Push    -50   +106
We were among the few people on the planet pointing out that the Cardinals' roster was extraordinarily vulnerable after the Pujols/Rolen/Carpenter core. Fortunately for Tony LaRussa, the two teams that we pegged as having the best chance to mount a challenge to their hegemony flatlined. I don't know that I can recall a more disappointing sequence of seasons than the Cubs' 2004-2005-2006. The case of the Brewers is a bit strange, in that the pitchers allowed 50 more runs than they usually would have based on their peripherals; couple that with Ben Sheets missing half his starts, and the projection is closer to the mark.
NL West      Actual  PECOTA  Vegas  Result  RS    RA
Padres       88      78      77.5   Win     -5    -83
Dodgers      88      87      85.5   Win     +34   +19
Giants       76      80      81.5   Win     -30   +10
D'Backs      76      77      74     Win     -27   -48
Rockies      76      74      68.5   Win     -31   -106
A clean sweep in the over-under department, though the "wins" on the Padres and Giants are cheapies. It's a shame that the Padres held the tiebreaker over the Dodgers, because otherwise PECOTA would have guessed five of the six divisions correctly. Speaking of the Padres, it's amazing that PECOTA underestimated their pitching staff so badly while being overoptimistic on Jake Peavy, but that's what happens when Cla Meredith and Scott Cassidy combine for 93 innings of 1.74 ERA baseball. The miss on the Rockies' pitching was even worse, but I'll take my mulligan there until we have the chance to figure out just what in the hell is going on with Coors Field park effects.
There are a couple of standard metrics that we can use to compare PECOTA's overall results to those of other systems: the correlation coefficient between projected and actual wins (higher is better) and the root mean square error (lower is better). In addition to comparing PECOTA to the Vegas lines, we'll also evaluate Diamond Mind's (DMB) projections and ESPN's consensus picks. It's clear that PECOTA had a pretty good year:
Correlation Coefficient
PECOTA   .669
ESPN     .636
DMB      .634
Vegas    .617

Root Mean Square Error
PECOTA   7.40
DMB      7.96
Vegas    8.18
ESPN     8.20
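For readers who want to check the arithmetic, here is a minimal Python sketch of the two metrics, applied to the win totals from the division tables in this article. The names `TEAMS`, `pearson`, and `rmse` are illustrative, not part of PECOTA, and the ESPN and Diamond Mind projections are left out because their team-by-team numbers aren't listed here.

```python
import math

# (actual wins, PECOTA projection, betting line) for all 30 teams,
# copied from the division tables in this article.
TEAMS = {
    "Yankees": (97, 94, 100), "Blue Jays": (87, 79, 86),
    "Red Sox": (86, 93, 90.5), "Orioles": (70, 77, 76.5),
    "D-Rays": (61, 69, 67.5),
    "Twins": (96, 84, 80), "Tigers": (95, 83, 77),
    "White Sox": (90, 82, 91.5), "Indians": (78, 88, 90),
    "Royals": (62, 61, 64),
    "A's": (93, 93, 88.5), "Angels": (89, 81, 88.5),
    "Rangers": (80, 80, 80.5), "Mariners": (78, 77, 75.5),
    "Mets": (97, 88, 91), "Phillies": (85, 86, 81.5),
    "Braves": (79, 85, 89), "Marlins": (78, 71, 67),
    "Nationals": (71, 70, 77),
    "Cardinals": (83, 86, 94), "Astros": (82, 81, 83.5),
    "Reds": (80, 78, 74), "Brewers": (75, 84, 80.5),
    "Pirates": (67, 79, 75.5), "Cubs": (66, 85, 85),
    "Padres": (88, 78, 77.5), "Dodgers": (88, 87, 85.5),
    "Giants": (76, 80, 81.5), "D'Backs": (76, 77, 74),
    "Rockies": (76, 74, 68.5),
}


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def rmse(projected, actual):
    """Root mean square error of the projections, in wins."""
    squared = [(p - a) ** 2 for p, a in zip(projected, actual)]
    return math.sqrt(sum(squared) / len(squared))


actual = [t[0] for t in TEAMS.values()]
pecota = [t[1] for t in TEAMS.values()]
vegas = [t[2] for t in TEAMS.values()]

for name, proj in (("PECOTA", pecota), ("Vegas", vegas)):
    print(f"{name}: r = {pearson(proj, actual):.3f}, RMSE = {rmse(proj, actual):.2f}")
```

Run against the table values, this should reproduce PECOTA's .669 and 7.40 and the Vegas RMSE of 8.18; the Vegas correlation may land within about .001 of the figure listed above, depending on rounding.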
It's also worth noting PECOTA's performance against the over-under lines. If you had used the PECOTA projected standings to bet on MLB team over/unders at the start of the season, betting only in cases where the projection and the line differed by at least three wins, you would have gone 14-5. Counting the marginal cases, the record would be 21-8, with no action on the Cubs.
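The betting record can be tallied the same way. The sketch below assumes the simplest possible rule: bet the over when PECOTA is above the line, the under when it is below, and skip the team when the two agree exactly. The function names and the `min_edge` parameter are my own labels for illustration.

```python
# (actual wins, PECOTA projection, betting line), copied from the
# division tables in this article.
TEAMS = {
    "Yankees": (97, 94, 100), "Blue Jays": (87, 79, 86),
    "Red Sox": (86, 93, 90.5), "Orioles": (70, 77, 76.5),
    "D-Rays": (61, 69, 67.5),
    "Twins": (96, 84, 80), "Tigers": (95, 83, 77),
    "White Sox": (90, 82, 91.5), "Indians": (78, 88, 90),
    "Royals": (62, 61, 64),
    "A's": (93, 93, 88.5), "Angels": (89, 81, 88.5),
    "Rangers": (80, 80, 80.5), "Mariners": (78, 77, 75.5),
    "Mets": (97, 88, 91), "Phillies": (85, 86, 81.5),
    "Braves": (79, 85, 89), "Marlins": (78, 71, 67),
    "Nationals": (71, 70, 77),
    "Cardinals": (83, 86, 94), "Astros": (82, 81, 83.5),
    "Reds": (80, 78, 74), "Brewers": (75, 84, 80.5),
    "Pirates": (67, 79, 75.5), "Cubs": (66, 85, 85),
    "Padres": (88, 78, 77.5), "Dodgers": (88, 87, 85.5),
    "Giants": (76, 80, 81.5), "D'Backs": (76, 77, 74),
    "Rockies": (76, 74, 68.5),
}


def bet_result(actual, pecota, line):
    """Return "Win", "Loss", or None when there is no bet or a push."""
    if pecota == line or actual == line:
        return None
    bet_over = pecota > line      # side PECOTA tells us to take
    went_over = actual > line     # side that actually cashed
    return "Win" if bet_over == went_over else "Loss"


def record(min_edge=0.0):
    """Win-loss record, betting only when |PECOTA - line| >= min_edge."""
    wins = losses = 0
    for actual, pecota, line in TEAMS.values():
        if abs(pecota - line) < min_edge:
            continue
        result = bet_result(actual, pecota, line)
        if result == "Win":
            wins += 1
        elif result == "Loss":
            losses += 1
    return wins, losses


print(record(min_edge=3))  # (14, 5)
print(record())            # (21, 8)
```

Note that a threshold of exactly three wins counts the Royals, Mets, and D'Backs as bets, which is what it takes to arrive at 14-5.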
This is for informational purposes only, of course, particularly in a world in which the Republican leadership has decided that gambling over the Internet is less acceptable than having cybersex with underage pages. A cheap shot? Of course, but so was Majority Leader Frist's technique of attaching gambling language to an essential port security bill that the White House specifically ordained should be unencumbered by "goulash" provisions. PECOTA projects that Mr. Frist's presidential aspirations will go the way of Sammy Sosa's late career.