This is Part 2 of 2 of the projection roundup; the first piece, covering position players, ran here on Wednesday. The methodology is as close to that of the hitter evaluations as possible. I used 50 IP as my cut-off point. Pitchers were excluded from consideration if three or more of the eight systems had no forecast for them. Otherwise, I ran with the data I had, filling in a 4.75 ERA forecast (slightly worse than league average) wherever a system was missing a pitcher.
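For readers who want to reproduce the sample selection, here is a minimal sketch of the filtering rules described above. The data layout, the system labels `S1` through `S8`, and the pitcher records are all hypothetical; the sketch also assumes that "50 IP" means at least 50 innings and that a pitcher is dropped once three or more systems lack a forecast for him.

```python
FILL_ERA = 4.75   # fill-in forecast for a missing pitcher, per the text
MIN_IP = 50       # innings cut-off, per the text (assumed inclusive)
MAX_MISSING = 2   # assumption: three or more missing forecasts excludes a pitcher

# Hypothetical system labels; the real names are not needed for the sketch.
SYSTEMS = ["S1", "S2", "S3", "S4", "S5", "S6", "S7", "S8"]


def prepare_sample(pitchers, systems=SYSTEMS):
    """Return {name: {system: era}} for pitchers who pass both filters."""
    sample = {}
    for p in pitchers:
        if p["ip"] < MIN_IP:
            continue  # not enough innings to evaluate
        forecasts = p["forecasts"]
        missing = [s for s in systems if forecasts.get(s) is None]
        if len(missing) > MAX_MISSING:
            continue  # too few systems projected this pitcher
        # Keep the pitcher, filling any remaining gaps with the default ERA.
        sample[p["name"]] = {
            s: forecasts[s] if forecasts.get(s) is not None else FILL_ERA
            for s in systems
        }
    return sample


def full(era):
    """Helper: the same hypothetical forecast from every system."""
    return {s: era for s in SYSTEMS}


# Made-up records, purely for illustration.
pitchers = [
    {"name": "Ace", "ip": 200, "forecasts": full(3.40)},
    {"name": "Swingman", "ip": 80,
     "forecasts": {**full(4.50), "S7": None, "S8": None}},
    {"name": "Rookie", "ip": 120,
     "forecasts": {**full(4.90), "S1": None, "S2": None, "S3": None}},
    {"name": "Reliever", "ip": 30, "forecasts": full(3.90)},
]

sample = prepare_sample(pitchers)
```

In this toy data, "Rookie" is dropped for missing three forecasts and "Reliever" for falling short of 50 innings, while "Swingman" stays in with 4.75 filled in for the two systems that skipped him.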
First, summary statistics for the eight projection systems:
We have a range of about three-tenths (0.30) of a run with respect to leaguewide offensive levels. The important thing, though, is that most of these systems were internally consistent; those that had the highest ERAs for pitchers also had the highest OPS figures for hitters. One exception was the Hardball Times, which had both the lowest projected OPS figures and the highest projected ERAs; possibly too much regression to the mean there. RotoWire, on the other hand, had high projected OPS figures and low projected ERAs; possibly not enough regression to the mean.
As measured by standard deviation, Marcel and CHONE are again the most conservative forecasts. ESPN is the most aggressive.
None of the forecasting systems stood apart from the pack except Hardball Times, which was quite distinctive. I remember noticing when I downloaded those projections in March that they were pretty different from the other systems.
Next, the first of our evaluators, correlation coefficient.
There are two versions here, the latter of which was included based on a discussion at Tom Tango’s blog. This “adjusted” version recalibrates each system such that it correctly predicted league average ERA, the idea being that all value in baseball is relative. So all the PECOTA forecasts, for example, had 11 points of ERA (0.11 runs) subtracted from them, because PECOTA overestimated ERAs for our sample group of pitchers by that margin.
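The recalibration step can be sketched in a few lines. This is only an illustration under stated assumptions: the ERA values below are invented, the function name `recalibrate` is hypothetical, and the bias is taken as a simple unweighted difference of means (the text does not say whether innings were used as weights).

```python
from statistics import mean


def recalibrate(forecasts, actuals):
    """Shift every forecast by the system's average miss.

    Returns the adjusted forecasts and the bias that was removed.
    Assumption: an unweighted mean is used for both series.
    """
    bias = mean(forecasts) - mean(actuals)
    adjusted = [f - bias for f in forecasts]
    return adjusted, bias


# Invented numbers for one hypothetical system and four pitchers.
forecast_era = [3.60, 4.20, 4.85, 5.10]
actual_era = [3.40, 4.30, 4.60, 5.01]

adjusted_era, bias = recalibrate(forecast_era, actual_era)
```

With these made-up inputs the system runs about 0.11 runs high on average, so every forecast is lowered by that amount and the adjusted forecasts average out to the actual ERA of the group.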
Either way, the ordering is the same. CHONE and PECOTA are the top two systems, with CHONE a little bit out in front. Then there’s a gap, then Marcel, then another gap to the other systems.
The best you could have done last year would have been to bundle PECOTA and CHONE in about a 4:3 ratio. This would have increased your correlation coefficient from .451 using PECOTA alone to .461 with the hybrid version. The other systems wouldn’t really have contributed positively to your results. Taking an average of all eight systems, for example, leaves you with a correlation of .429, which is worse than either PECOTA or CHONE taken alone.
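To make the bundling idea concrete, here is a rough sketch of how a 4:3 blend of two systems could be scored against actual results. The ERA lists are invented toy data, the names `system_a` and `system_b` are placeholders rather than real PECOTA or CHONE forecasts, and the correlation is a plain unweighted Pearson coefficient, which is an assumption about how the figures above were computed.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain (unweighted) Pearson correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def blend(a, b, weight_a=4, weight_b=3):
    """Weighted average of two forecast lists, 4:3 by default."""
    total = weight_a + weight_b
    return [(weight_a * x + weight_b * y) / total for x, y in zip(a, b)]


# Toy data only; these are not real forecasts or results.
system_a = [3.50, 4.10, 4.60, 5.00, 3.90]
system_b = [3.80, 3.90, 4.90, 4.70, 4.20]
actual = [3.20, 4.40, 4.80, 5.30, 3.70]

hybrid = blend(system_a, system_b)

r_a = pearson(system_a, actual)
r_b = pearson(system_b, actual)
r_hybrid = pearson(hybrid, actual)
```

Comparing `r_hybrid` with `r_a` and `r_b` is the same check described above; averaging all eight systems would simply be a blend with equal weights across eight lists.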