Before I kick off the 2008 season-preview series, with records and runs scored/runs allowed predictions for all 30 teams, I want to take a look back at last year’s set to see if I can learn anything. This is sometimes an interesting exercise, and sometimes an excuse for navel-gazing, and I’m never sure which one it will be until I fire up the spreadsheet.
The first thing I noticed is that I absolutely nailed the run environment in 2007. My preseason predictions had 23,193 runs being scored in the major leagues last year. In reality, 23,305 runs were scored, a difference of just 112, or less than half a percent. For purposes of this exercise, I went ahead and adjusted the runs scored and runs allowed totals of every prediction by that half-percent (the “global error” in the predictions) to be consistent with past years, but that’s splitting hairs. This is mostly luck, although I suppose it’s evidence that I have some sense of where the game is at the moment. Nah, it’s luck.
Note that the 2007 total is just 294 runs fewer than the 2006 total, which itself was up over 2005. If you’re looking for a connection between “penalties for PED use” and run scoring, you’ll have to look elsewhere, because the data don’t support the hypothesis.
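For anyone who wants to check the arithmetic, here is a minimal sketch of that global adjustment. The two totals come from the paragraph above; the function and variable names are my own, and scaling every prediction by the ratio of the totals is an assumption about how the half-percent was applied.

```python
# Totals quoted in the text above.
PREDICTED_TOTAL = 23193  # runs predicted across MLB for 2007
ACTUAL_TOTAL = 23305     # runs actually scored in 2007


def global_adjustment(predicted_total: int, actual_total: int) -> float:
    """Scaling factor that makes the predicted league total match the actual one.

    Assumption: each team's predicted RS and RA are multiplied by this factor.
    """
    return actual_total / predicted_total


gap = ACTUAL_TOTAL - PREDICTED_TOTAL          # raw miss in runs
pct = 100 * gap / ACTUAL_TOTAL                # miss as a percentage of the actual total
factor = global_adjustment(PREDICTED_TOTAL, ACTUAL_TOTAL)

print(f"gap = {gap} runs ({pct:.2f}%), adjustment factor = {factor:.5f}")
```

Under that assumption, a prediction of 806 runs scales to roughly 810, which lines up with the adjusted figures quoted later for individual teams.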
I wish I could say I was so accurate in predicting team performance. Keep in mind that my metric here is runs; while we measure success in wins and losses, those vary from runs scored and allowed in unpredictable ways during a season. We can’t predict which teams will, like the Mariners and Diamondbacks, outperform their runs scored and allowed. So with that in mind, I measure my performance by the RS and RA columns, not W and L.
The Net Error Score (that’s the total gap between my estimates of runs scored and allowed and the actual figures) was 3,345, more or less splitting the difference between my 2005 and 2006 marks. I was off by an average of 55.8 runs per estimate, with two estimates for each team, which seems pretty lousy to me. My best and worst projections:
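| Team | Actual RS | Actual RA | Pred RS | Pred RA | NES |
|---|---|---|---|---|---|
| Twins | 718 | 725 | 727 | 723 | 14 |
| Mets | 804 | 750 | 806 | 758 | 18 |
| Yankees | 968 | 777 | 942 | 788 | 36 |
| Padres | 733 | 657 | 701 | 671 | 46 |
| Dodgers | 735 | 727 | 777 | 731 | 53 |
| Royals | 706 | 778 | 825 | 901 | 250 |
| Tigers | 887 | 797 | 751 | 721 | 205 |
| Marlins | 790 | 891 | 701 | 791 | 182 |
| White Sox | 693 | 839 | 806 | 775 | 177 |
| Diamondbacks | 712 | 732 | 843 | 765 | 172 |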
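The NES column can be sketched in a few lines. This is a reconstruction, not a copy of the spreadsheet: it assumes NES is the absolute miss on runs scored plus the absolute miss on runs allowed, with each prediction first scaled by the league-wide adjustment (23,305 / 23,193). The function name and the choice of sample teams are mine.

```python
# League-wide scaling factor: actual total runs / predicted total runs.
ADJUST = 23305 / 23193


def net_error_score(actual_rs, actual_ra, pred_rs, pred_ra, adjust=ADJUST):
    """Absolute RS miss plus absolute RA miss, after scaling the predictions.

    Assumption: this is how the NES column in the table was built.
    """
    rs_miss = abs(actual_rs - pred_rs * adjust)
    ra_miss = abs(actual_ra - pred_ra * adjust)
    return rs_miss + ra_miss


# (actual RS, actual RA, predicted RS, predicted RA), taken from the table above.
teams = {
    "Twins": (718, 725, 727, 723),
    "Royals": (706, 778, 825, 901),
    "Diamondbacks": (712, 732, 843, 765),
}

for name, row in teams.items():
    print(f"{name}: NES = {net_error_score(*row):.0f}")
```

Run on those three rows, the sketch gives 14, 250, and 172, matching the table, which suggests the assumption is at least close.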
The Diamondbacks’ listing as one of my worst predictions points out the difference between evaluating on runs versus record. I had picked the D’backs to go 89-73 for the best mark in the NL. They ended up 90-72 for the best mark in the NL. However, they failed to outscore their opponents, and the disappointing seasons by Stephen Drew, Carlos Quentin, and others helped me miss their runs scored by nearly a run per game, which is an enormous miss. The Marlins finished with exactly the 71-91 I predicted they would, only they did so by scoring 90 more runs and allowing 100 more. I shouldn’t get credit for this.
If you wanted to, though, you could look at it this way. Instead of using RS and RA figures, consider that what you’re really trying to get at is the difference between the two, the margin. Measure the gap between the predicted margin and the actual one, and you have the Error Score. By this measure, I was third-closest on the Marlins (-90 predicted, -101 actual), with the Twins and Mets retaining their spots as the teams I came closest on. The Royals, however, jump from worst by NES to best by Error Score, with a mark of 4.
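Here is a rough sketch of the Error Score, for comparison with NES. It assumes the score is the absolute gap between the actual run margin and the predicted margin, with the predicted margin scaled by the same league-wide adjustment; whether that adjustment was applied here is my guess, and the function names are hypothetical.

```python
# League-wide scaling factor: actual total runs / predicted total runs.
ADJUST = 23305 / 23193


def margin(rs, ra):
    """Run differential: runs scored minus runs allowed."""
    return rs - ra


def error_score(actual_rs, actual_ra, pred_rs, pred_ra, adjust=ADJUST):
    """Absolute gap between actual margin and (scaled) predicted margin.

    Assumption: the predicted margin is scaled by the global adjustment.
    """
    actual_margin = margin(actual_rs, actual_ra)
    pred_margin = margin(pred_rs, pred_ra) * adjust
    return abs(actual_margin - pred_margin)


# Royals: big misses on both RS and RA, but in the same direction.
royals = error_score(706, 778, 825, 901)
# Marlins: raw margins of -90 predicted and -101 actual.
marlins_pred = margin(701, 791)
marlins_actual = margin(790, 891)

print(f"Royals Error Score: {royals:.0f}")
print(f"Marlins margins: {marlins_pred} predicted, {marlins_actual} actual")
```

The Royals come out at 4 with or without the scaling, because the misses on runs scored and runs allowed were nearly identical and cancel in the margin. That cancellation is exactly why the two measures can rank the same prediction so differently.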
I’ve said in the past that I’m not sure which of these methods is best for evaluating a prediction, and until I can figure it out, I’ll continue presenting both.
One area in which I need to improve is something simple: identifying which teams will outscore their opponents, and which ones will be outscored. It’s one thing to miss on one side of the line, but if you can’t accurately estimate which teams will be (and it is this simple) “good” or “bad”, it’s time to write a resume. Well, I missed on 40 percent of the field. Some of my errors were small and either meaningless or explainable; the Twins, for example, were -7 last year, whereas I had them as +4. That’s insignificant. On the other side of the ledger, though, there were some massive mistakes that take some explaining:
- Chicago White Sox, 810 AdjRS, 779 AdjRA predicted; 693 RS and 839 RA, actual. The second figures are easily laid at the feet of a declining defense and a wretched bullpen (outside of Bobby Jenks). The performance at the plate, however, was the real culprit, well over 100 runs below expectations. The ’07 White Sox had one of the worst batting averages on balls in play you’ll find: .281. They hit homers and they drew a fair number of walks, but they were the worst team in baseball at hitting singles and doubles, and that killed them. It is almost certain that the offense will bounce back.
- Colorado Rockies, 845 AdjRS, 902 AdjRA; 851 RS, 750 RA. Not that anyone was looking, but here is another place you find the difference between the ’07 Rockies and their predecessors. Their defense made plays that previous Rockies defenses didn’t make, taking hits off the board and turning them into outs. That’s how you save nearly 150 runs. I don’t know that I could have foreseen this, given that for 14 years, the Rockies had been unable to overcome the physics effects to play the kind of defense they did last year. Still, it’s a reminder that gauging changes in defensive capability across an offseason is a challenge.
- St. Louis Cardinals, 771 AdjRS, 759 AdjRA; 725 RS, 829 RA. Lots of things going wrong at once ruined these estimates, starting with injuries to Chris Carpenter, Jim Edmonds, Scott Rolen, and Juan Encarnacion, but the estimates were affected as well by underperformance from Adam Kennedy, Rolen, and a host of starting pitchers. I wasn’t optimistic about their chances to begin with, but they had even more go wrong than I foresaw.
- Texas Rangers, 872 AdjRS, 792 AdjRA; 816 RS, 844 RA. I’ve been trying to remember why I predicted that the Rangers would win the AL West and post the second-best record in baseball. I cannot. The internet, however, can:
This year, the pitching should still be fine again, highlighted by what should be a terrific bullpen if Eric Gagne stays healthy, and a good one if he doesn’t. Where the improvement lies is at the plate; the Rangers are a bit the opposite of the Twins and Blue Jays, in that you can project gains at many offensive slots. The defense, now featuring Kenny Lofton, will not be an asset, so watch the strikeout rates as a key indicator of run prevention.
Just about everything up there is true, as it turns out. The team still stunk. The Rangers didn’t improve at catcher (Gerald Laird was lousy), third base (Hank Blalock missed 65 percent of the season), or in the outfield corners, save for when Marlon Byrd played in them. The rotation was simply awful, with 15 guys making starts, and just one, Kevin Millwood, making more than 23. The Rangers were last in the AL in strikeouts and second-to-last in walks allowed. Bad play for four months cost them the services of Mark Teixeira for the last two. That hurt as well.
The lesson is to play smarter hunches.
We’ll kick off the 2008 previews later this week. As always, consider the final numbers interesting, but pay more attention to the analysis around them. That, rather than the number, is what has the most value.