March 31, 2009
As baseball writers, analysts, and fans get set to lay out their predictions for the next six months, I find it useful to look back at last year and see what I can learn from the picks I made a season ago. I've been doing this exercise for a few years, and sometimes there are valuable lessons to be had. Sometimes, to be honest, there aren't. Baseball's funny that way.
The first thing to do is check the run environment. In my predictions, I had 23,358 runs being scored last season, which was high by about 3.4 percent; the actual total was 22,584. That's the global error, so to make comparisons, I've gone ahead and adjusted all of my RS and RA predictions downward by that amount. Call it park effects for prognosticators.
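For readers who want to check the arithmetic, here is a minimal Python sketch of a global run-environment correction. It assumes the adjustment is a simple proportional rescaling by actual total over predicted total; the column does not spell out the exact method, so that is an assumption, as are the function names and the raw 800-run prediction used for illustration.

```python
# Totals quoted in the column.
PREDICTED_TOTAL = 23_358  # league runs implied by the predictions
ACTUAL_TOTAL = 22_584     # league runs actually scored

def environment_factor(predicted_total: int, actual_total: int) -> float:
    """Scale factor that maps the predicted run environment onto the actual one."""
    return actual_total / predicted_total

def adjust(raw_runs: float, factor: float) -> int:
    """Rescale one raw RS or RA prediction (assumed proportional adjustment)."""
    return round(raw_runs * factor)

factor = environment_factor(PREDICTED_TOTAL, ACTUAL_TOTAL)
overshoot_pct = (PREDICTED_TOTAL / ACTUAL_TOTAL - 1) * 100

print(f"scale factor: {factor:.4f}")
print(f"predictions ran high by {overshoot_pct:.1f}%")

# Hypothetical raw prediction of 800 runs, not a figure from the column.
print(adjust(800, factor))
```

Because the factor is below 1, every raw prediction shrinks slightly before being compared with the actual totals.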
Let's make something clear: I'm predicting runs scored and allowed, and the records are simply a function of that. I make some minor adjustments to correct for rounding errors and the possible impact of a particularly strong or weak bullpen, but for the most part, I'm concerned with runs. We simply don't have much evidence that, outside of the effect of a bullpen, teams can distribute their runs in a way that gives them a leg up on the Pythagorean formula. So my predictions, and my evaluations of them, focus on runs. Nailing a team's record but being off by 70 runs of differential is a bug, not a feature.
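As a reminder of how runs turn into records, here is a short sketch of the Pythagorean formula. It assumes the classic exponent of 2 and a 162-game schedule; the column does not say which exponent is used, and the bullpen and rounding adjustments mentioned above are not modeled.

```python
def pythagorean_pct(runs_scored: float, runs_allowed: float, exponent: float = 2.0) -> float:
    """Expected winning percentage from runs scored and allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

def expected_wins(runs_scored: float, runs_allowed: float,
                  games: int = 162, exponent: float = 2.0) -> float:
    """Expected wins over a season of `games` games."""
    return games * pythagorean_pct(runs_scored, runs_allowed, exponent)

# Example input: 845 runs scored, 694 allowed.
print(round(expected_wins(845, 694), 1))
```

A team that scores exactly as many runs as it allows comes out at .500, which is why the run differential carries most of the information in a prediction.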
My Net Error Score (or NES), which is the total number of runs I missed by on all 60 predictions, was 3,517. That's a tick worse than last year's figure, and middle of the pack for the years in which I've done this. Here are the best and worst projections by that metric. "Pred" is the adjusted figure:
             Actual      Pred*
Team          RS    RA    RS    RA   NES
Royals       691   781   698   762    26
Red Sox      845   694   819   678    42
A's          646   690   661   718    44
Angels       765   697   781   727    46
Indians      805   761   805   710    52

Rangers      901   967   812   821   235
Pirates      735   884   638   755   226
Twins        829   744   650   713   211
Phillies     799   680   821   841   183
Cardinals    779   725   704   816   166

*Figures may not all match due to rounding.
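The per-team score in the table can be reproduced as the absolute miss on runs scored plus the absolute miss on runs allowed. The sketch below uses a handful of rows from the table; the `TeamLine` type and `net_error` function are illustrative names, not anything from the column. Because the printed predictions are rounded, a recomputed score can differ from the published one by a run for some teams.

```python
from typing import NamedTuple

class TeamLine(NamedTuple):
    actual_rs: int
    actual_ra: int
    pred_rs: int
    pred_ra: int

def net_error(line: TeamLine) -> int:
    """Runs missed on offense plus runs missed on defense."""
    return abs(line.actual_rs - line.pred_rs) + abs(line.actual_ra - line.pred_ra)

# A subset of the rows above (actual RS, actual RA, adjusted pred RS, adjusted pred RA).
TABLE = {
    "Royals": TeamLine(691, 781, 698, 762),
    "Red Sox": TeamLine(845, 694, 819, 678),
    "Rangers": TeamLine(901, 967, 812, 821),
    "Pirates": TeamLine(735, 884, 638, 755),
    "Phillies": TeamLine(799, 680, 821, 841),
}

# Print teams from best projection to worst.
for team, line in sorted(TABLE.items(), key=lambda kv: net_error(kv[1])):
    print(f"{team:<10}{net_error(line):>5}")
```

Summing `net_error` over all 30 teams would give the season total, which covers the 60 individual RS and RA predictions.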
The presence of the A's and Indians up top isn't much consolation. The A's were actually a bit better than that for three months before trading away two of their top starting pitchers; by rights, that probably should have been a bigger miss. The Indians, of course, were so bad in close games that they traded away CC Sabathia while still having a positive run differential, and all things considered they might have won the division had they not done so.
The misses are something of a type. I wasn't that far off on the differentials for the Rangers and Pirates, but completely blew their run environments, even after adjustments. Both teams' staffs were horrible, bad enough to waste what ended up being better offenses than expected. The Twins famously had an amazing year hitting with runners in scoring position, which is why I missed their runs scored by more than one per game. The Phillies' bullpen, which I expected to be below average, was excellent, and their defense was much improved as well. That's how you miss runs allowed by one per game.
I think the above approach is the best way to evaluate predictions, but you could also give someone a pass for gaps that are of a kind, like the Pirates and Rangers above, and argue that nailing the run differential is what you're really trying to do. In that case, the top and bottom look like this:
             Actual      Pred*
Team          RS    RA    RS    RA  Diff Err
Brewers      750   689   823   768         6
Dodgers      700   648   758   700         6
Red Sox      845   694   819   678        10
A's          646   690   661   718        13
Angels       765   697   781   727        14

Cardinals    779   725   704   816       166
Tigers       821   857   891   770       157
Orioles      782   869   648   885       150
Twins        829   744   650   713       148
Rays         774   671   785   822       140

*Figures may not all match due to rounding.
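The last column of this second table is consistent with the absolute gap between actual and predicted run differential, rather than the sum of the two separate misses. A minimal sketch under that reading, with illustrative function names and rows taken from the table:

```python
def differential_error(actual_rs: int, actual_ra: int, pred_rs: int, pred_ra: int) -> int:
    """Absolute miss on run differential (RS minus RA)."""
    return abs((actual_rs - actual_ra) - (pred_rs - pred_ra))

def net_error(actual_rs: int, actual_ra: int, pred_rs: int, pred_ra: int) -> int:
    """Sum of the separate misses on RS and RA, for comparison."""
    return abs(actual_rs - pred_rs) + abs(actual_ra - pred_ra)

ROWS = {
    "Brewers": (750, 689, 823, 768),
    "Dodgers": (700, 648, 758, 700),
    "Cardinals": (779, 725, 704, 816),
    "Tigers": (821, 857, 891, 770),
    "Rays": (774, 671, 785, 822),
}

for team, row in ROWS.items():
    print(f"{team:<10} diff error {differential_error(*row):>4}   net error {net_error(*row):>4}")
```

The Brewers row shows why the two measures can disagree: misses of 73 runs on offense and 79 on defense in the same direction nearly cancel in the differential.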
Even here, the accurate predictions don't really say much about any skills I might possess. To hit those numbers, the Brewers had to trade for a guy at midseason who pitched at a Cy Young-caliber level for three months. The Dodgers dealt for someone who hit .396 with power for two months. Without Sabathia and Manny Ramirez getting involved (and no, I didn't see those deals coming in March), those picks would have been much further off. See also the A's, who had to dump at midseason to get that close to my prediction.
On the flip side, you have the well-covered aging of the Detroit Tigers and the magical RISP talent of the Twins. There's also the Rays, who had one of the greatest single-season turnarounds in run prevention in baseball history.
I'm seeing a lot of mistakes in evaluation, but what I'm not seeing is an actionable pattern. When I golf, I sometimes go an entire round pushing shots left, or leaving putts short, or fluffing chips. (Actually, I sometimes do all of these things at once.) This isn't so bad, because if you're making one mistake over and over, you can identify the error, correct it, and move on. But when you're making different mistakes each time (alternating slices and hooks, hitting irons fat and thin, spraying putts all over the green), then you're in trouble, because you can't even figure out what to fix.
That's what I'm seeing here. I missed a lot of stuff last year, none of which necessarily shows a blind spot, and some of which is just stuff that, no excuse implied, everyone missed. What the Rays did, what the Twins did, isn't the kind of thing that you can see coming, which is why (say it with me now) they play the games.
One last thing I want to look at, however, is the very basic task of getting the arrows pointed in the right direction. As analysts, we should be able to classify teams as "going to be outscored" and "going to outscore their opponents" as a very basic measure of competence. Two years ago, I got just 18 teams right. Last year, I jumped all the way up to 19. This bugs me, because I have to think that a dart-throwing fourth-grader could get at least half of the teams right. It's a little like picking the NCAA field; you're not really predicting 65 or even 34 slots. You're picking the last four or so, because most of the field is set for you. If you get two or three wrong, you've just wasted weeks of your life.
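The "arrows pointed in the right direction" test can be sketched as a sign check on run differential. The rows below are a sample from the tables above, not all 30 teams, and the function name is illustrative; a team with exactly equal runs scored and allowed is treated as not outscoring its opponents, which is an arbitrary tie-breaking assumption.

```python
def same_side(actual_rs: int, actual_ra: int, pred_rs: int, pred_ra: int) -> bool:
    """True when prediction and result agree on whether the team outscored its opponents."""
    return (actual_rs > actual_ra) == (pred_rs > pred_ra)

# (actual RS, actual RA, adjusted pred RS, adjusted pred RA)
SAMPLE = {
    "Brewers": (750, 689, 823, 768),
    "Dodgers": (700, 648, 758, 700),
    "A's": (646, 690, 661, 718),
    "Angels": (765, 697, 781, 727),
    "Cardinals": (779, 725, 704, 816),
    "Tigers": (821, 857, 891, 770),
    "Orioles": (782, 869, 648, 885),
    "Twins": (829, 744, 650, 713),
    "Rays": (774, 671, 785, 822),
}

correct = [team for team, row in SAMPLE.items() if same_side(*row)]
print(f"{len(correct)} of {len(SAMPLE)} arrows pointed the right way: {', '.join(correct)}")
```

Note that the Orioles count as a hit here despite a 150-run miss on the differential, so this check is a floor for competence rather than a measure of accuracy.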
In baseball, everyone agrees that the Yankees, Red Sox, and Cubs are going to outscore their opponents, and that the Padres, Pirates, and Nationals will be outscored. You can probably even extend those categories by two or three each. The questions come in the middle. You come to BP to find out about the middle.
I'll take my stab at that middle this week. Hopefully I'll do better this time, but as you read my prediction sets and those of others this week, keep in mind that despite my breakdown of the numbers above, you want to worry less about the numbers and more about the analysis. It's the words, the thought process, that matter, and I'll say that even after I run my Net Error Score into double digits this year.