December 19, 2006
Last spring, I took the process one step further by evaluating the predictions, to see if there were any lessons to be learned, perhaps biases to be caught or blind spots to be narrowed. It turned out I'd had a better season than I thought, although there was something to be said for keeping an eye on how teams align their talent (White Sox), improve their defense (also the White Sox), or lean on one signature skill that pushes them forward (like the Astros' rotation).
I ran the numbers on the 2006 predictions last night, and what I found was that I shouldn't have. Once again, I missed on the league run level by a significant amount, 4.4%, predicting 22,601 runs in a season that saw 23,599. Offense was up over '05 in MLB, and might have been up more if not for four months of odd results out of Denver. I didn't see that coming, nor did many others, in the first year of Sooper Serious Steroid Sternness. For purposes of this exercise, we'll adjust everyone's predicted runs scored and allowed upward by 4.4%; that corrects for the global error from missing the run context of the league.
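As a sketch, the adjustment step looks like this (the constants are the league totals quoted above; the helper function and its name are mine, not anything from the original spreadsheet):

```python
# Hypothetical sketch of the league-level adjustment described above.

PREDICTED_LEAGUE_RUNS = 22_601  # the preseason total
ACTUAL_LEAGUE_RUNS = 23_599     # what MLB actually scored in 2006

# ~1.044: every prediction was light by about 4.4%
ADJUSTMENT = ACTUAL_LEAGUE_RUNS / PREDICTED_LEAGUE_RUNS

def adjust(predicted_runs):
    """Rescale one team's predicted RS or RA to the actual run context."""
    return predicted_runs * ADJUSTMENT

print(round((ADJUSTMENT - 1) * 100, 1))  # prints 4.4
```

Every team gets the same multiplier, so the adjustment changes the run levels without changing any team's predicted rank on either side of the ball.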
Once you do that, you find that I didn't have as good a year as I did in 2005. The Net Error Score-the total of all the misses between my adjusted RS and RA predictions and the actual figures-was 3,606, 803 runs worse than in 2005. In other words, if my average prediction missed by about 47 runs in '05, that jumped to about 60 in 2006, roughly 28% higher. I was off across the board: my best NES in '06 was 25, on the Mariners, whereas in 2005 I had a microscopic NES of seven on the Mets. At the high end, I might as well have been pulling numbers out of a hat. The worst NES in 2005 was 220, for the Devil Rays; I missed four teams by more than that in '06, including three by at least 300 net runs. Let's look at those three monster misses:
I'm not seeing any magic bullets here the way I did last year, although hopefully analyzing the data this way imprints some lessons that I can integrate, even if I'm not entirely sure what they are yet.
Some would argue that analyzing won-loss records is a better way to do this. I disagree. Consider those Reds. Because they underperformed their runs scored and allowed, I only missed their won-loss record by three games (80-82 projected, 83-79 actual), and there's no way I think that accurately reflects my evaluation. These two things generally move in step, but not always, and I'm inclined to get as close to the predictable element-the runs-as I can.
Looking at records, we find that I nailed the Pirates, Angels and White Sox, and was off by two or fewer games on seven others. I don't buy it, though; I had a 141 NES on the Sox, and nailed the record because I got the differential mostly right. All that means is that I missed badly on both sides of the equation.
What else can we learn…well, I was closest (by NES) on the Mariners, Astros, Blue Jays and Phillies, all under 30 NES. I don't see a common thread there…wait, I do. All four teams occupy the middle ground of the game, and all are largely veteran teams. It's easier to predict the inside part of any curve, to be sure. I'm thinking older teams should be easier to predict not because all the players have reliable track records to go on, but because figuring out the distribution of playing time is much easier with a veteran team. I dare say that the four teams above were among the most stable last year in terms of who played and how often, save for trades that altered the rosters in season.
If you just look at differentials, you find that I was within two runs on the Dodgers, within three on the Mariners and within nine on the Nationals. This figure, the error score (Err), actually came down between '05 and '06 by more than 200 runs. The closing gap indicates that while I'm not doing a great job of projecting overall RS and RA, I am getting better at sorting out which teams will be ahead in the runs department and which will not. This, to me, is a very basic skill; if I can't reliably say which teams will be outscored and which ones will not, it's time to write a resume.
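Err isn't defined explicitly either; the natural reading is the absolute miss on the run differential, where misses on the two sides of the ball can cancel. A sketch under that assumption (the function name is mine):

```python
def differential_error(pred_rs, pred_ra, actual_rs, actual_ra):
    """Assumed definition of Err: the miss on the run differential.
    RS and RA misses in the same direction cancel here, which is why
    Err can be far smaller than NES for the same team."""
    return abs((pred_rs - pred_ra) - (actual_rs - actual_ra))

# A team projected at -28 that finishes at +147 misses by 175 runs
# of differential, no matter how the runs split between RS and RA.
print(abs(-28 - 147))  # prints 175
```

This is the cancellation the White Sox example illustrates: a big NES with a small Err means the RS and RA misses pointed the same way.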
I was disappointed by my work in this area in '05, when I missed the direction of a team's run differential in 10 of 30 cases. In 2006, I got that down to eight, and of those, five were near misses, involving teams that were within 10 runs of the line. I still don't think that's good enough, because it's not 22 of 30, it's more like nine of 17 once you consider that a third to a half of MLB teams are locks in this category.
You'll notice that I haven't mentioned the Tigers yet. I missed them pretty badly, projecting a 78-84 mark for a team that went 95-67, and a -28 run differential for a team that ended up at +147. They were the second-biggest miss by Err, behind only the Rockies, who I'm taking a mulligan on for 2006. In the Tigers' case, I had the run environment right-1,536 projected runs scored in Tigers games, 1,497 actual-but the distribution wrong. That's why the NES of 175 doesn't crack the top five.
The two scores each provide valuable information, and I'm not sure I'll be dropping either any time soon.
Let's see, anything else notable…after adjusting for the offensive level, I nailed the runs allowed by the Cards, and was within two runs of both the Astros' and Royals' marks in that category. I'd said last year that predicting offense was easier than predicting defense, but that didn't hold in '06: I was off by 1,825 runs in the aggregate on offense, and just 1,781 on defense. Neither figure is impressive after last year's 1,312 and 1,491.
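Those offense/defense splits can be cross-checked against the season totals quoted earlier; a quick sanity check, using the article's own figures:

```python
# The offense and defense aggregates should sum to each season's total
# Net Error Score, and the year-over-year gap should match the 803-run slip.
offense_2006, defense_2006 = 1825, 1781
offense_2005, defense_2005 = 1312, 1491

assert offense_2006 + defense_2006 == 3606  # 2006 NES total
assert offense_2005 + defense_2005 == 2803  # 2005 NES total (3,606 - 803)
assert 3606 - 2803 == 803                   # the year-over-year decline
print("splits check out")
```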
There may not be a great lesson in this, but I think the exercise is valuable. Predictions don't have to just be interesting late-March content; they can be a guide to how we evaluate teams. To make them valuable, we have to go back and analyze them-analyze our performance-the same way we would analyze the performance of players. Nate Silver does this with PECOTA, which is how it gets better and better each year. That's my goal as well.