March 12, 2010
2010 PECOTA Projection Analysis
As promised, here are the results of our preliminary analysis of the 2010 PECOTA projections. We took the methodology from multiple builds of the projections (the builds will be easier to identify once we start versioning them, which we will do when we've got everything else cleaned up) and ran all the inputs for the 2009 version of PECOTA through each one. Then we compared those projections to the actual results of the 2009 season.
In all cases, lower numbers are better.
RMS Error results
Bias-adjusted RMS Error results
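For readers who want to see what these two measures mean, here is a minimal sketch. It assumes that "bias-adjusted" means removing the average miss (the mean error) before taking the RMS, which is our reading of the term rather than a statement of the exact method used for the results above. The function names and the sample numbers are hypothetical.

```python
import math

def rmse(projected, actual):
    """Root-mean-square error between projected and actual values."""
    errors = [p - a for p, a in zip(projected, actual)]
    return math.sqrt(sum(e * e for e in errors) / len(errors))

def bias_adjusted_rmse(projected, actual):
    """RMS error after subtracting the mean error (assumed definition of 'bias-adjusted')."""
    errors = [p - a for p, a in zip(projected, actual)]
    bias = sum(errors) / len(errors)  # average amount the projections ran high or low
    return math.sqrt(sum((e - bias) ** 2 for e in errors) / len(errors))

# Hypothetical batting averages: every projection is high by the same amount.
projected = [0.280, 0.265, 0.300, 0.250]
actual = [0.270, 0.255, 0.290, 0.240]

plain = rmse(projected, actual)
adjusted = bias_adjusted_rmse(projected, actual)
```

In this made-up case every projection misses by the same amount, so the plain RMS error equals that miss while the bias-adjusted figure is essentially zero. A system that is uniformly too optimistic but otherwise ranks players well looks much better on the second measure.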
BP used the 2010 PECOTA projections as the basis of our LABR draft strategy this weekend.
PECOTA Ten-Year Forecasts and Hitter Cards
We have diagnosed the main problem with our ten-year forecasts as reported on the PECOTA beta hitter cards. Nate Silver generated one set of comps in the original PECOTA process and used those comps to generate the long-term projections. We were trying to re-generate new comps in year n+1 based on the player's career thus far with his projections for year n included, and repeating. In addition to introducing considerable extra complication, this process generally created much less favorable long-term projections, as many readers noted.
We've adjusted the long-term projection process to work the way Nate originally designed it. Once we've got everything stabilized and released, we'll be revisiting this topic.
Another problem we had was that a player projected to be out of baseball entirely had a value of zero, while a player good enough to be projected for some playing time but bad enough to be below replacement level had a negative value. For example, at his tenth percentile a player might be out of baseball entirely, returning 0 WARP, while at his fiftieth he performs well enough to stay in baseball and rates -1.0 WARP in the playing time he is projected to get. Because we didn't distinguish between a 0 WARP in baseball and a 0 WARP from being out of baseball, this produced some very weird results for such players. We've changed the process to differentiate between the two, and the values should now be much more reasonable.
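To illustrate the distinction, here is a small sketch that marks "out of baseball" with None instead of a numeric zero. This is only one way to represent the idea, not a description of the actual PECOTA code. The PercentileLine class, the helper names, and the 90th percentile figure are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class PercentileLine:
    percentile: int
    # None means "projected out of baseball"; 0.0 means "played at exactly 0 WARP".
    warp: Optional[float]

def naive_value(line: PercentileLine) -> float:
    """The old behavior: out of baseball collapses to a plain 0."""
    return 0.0 if line.warp is None else line.warp

def is_out_of_baseball(line: PercentileLine) -> bool:
    return line.warp is None

lines = [
    PercentileLine(10, None),   # out of baseball at the tenth percentile
    PercentileLine(50, -1.0),   # stays in baseball, below replacement level
    PercentileLine(90, 1.5),    # hypothetical upside
]

# Conflating the two cases makes the tenth percentile look better than the fiftieth.
naive = [naive_value(line) for line in lines]

# Keeping them separate lets later steps handle the out-of-baseball lines explicitly.
in_baseball = [line for line in lines if not is_out_of_baseball(line)]
```

With the naive values, the tenth percentile (0) outranks the fiftieth (-1.0), which is the kind of weird result described above. Once the out-of-baseball case carries its own marker, downstream code can decide how to treat it rather than reading it as a performance of zero.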
The K bar in the player profile graph has been reversed, and now works as it did previously.
We still have a problem with some players having their higher percentile projections zeroed out, while the lower are filled--we've identified this internally as "the Koyie Hill problem". This is a similar issue to the reversed projections problem above, and it'll be fixed this weekend.
PFM Settings Update
Levels 2 and 3 are disabled. We are looking into bringing those levels back into play--let us know how much you used them and whether you miss them.
The first update was sent out late on March 9. With this update, we attempted to address some of the issues people had noted with Depth Chart team statistics that Clay mentions in this post. While this might have made the team projections more satisfactory, it quickly became apparent that it did so at the expense of the individual player projections, which are what the vast majority of subscribers are using PECOTA for at this time of year. If you downloaded a spreadsheet or ran PFM between the evening of March 9 and now, please do it again and use the individual player stats you get from the current data.
As an aside, the modifications that were made to the March 9 data were more-or-less applied on a league-wide basis, so draft order and dollar value from the PFM probably won't change very much. The raw statistics that were predicted will be fairly different, though.
This morning, we pushed out another update which puts things largely back to their previous state. The RMSE analysis above is run on this version of PECOTA.
Depth charts will be updated at least every other weekday through the start of the season.
Weighted Means versus Fiftieth Percentile
Traditionally, PECOTA has used weighted means projections for its default projections--the Depth Charts, PFM, Weighted Means Spreadsheet (obviously), and player cards have all used or highlighted the weighted means stat line.
This year, we've been using the fiftieth percentile for these applications instead. Until recently, we haven't had the weighted means at all for PECOTA in 2010. We now have the weighted means in the cards--see the bottom row of the 2010 projections table--and the weighted means are being used for the ten-year projections in the cards. Everywhere else, though, we're sticking with the fiftieth percentile projections for now, so you'll see the projections in the fiftieth percentile line of the cards match the Depth Charts, PFM, and Weighted Means Spreadsheet (which I realize means the spreadsheet is now misnamed).
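For anyone unsure how the two lines differ, here is a rough sketch. It assumes the weighted mean is an average of the percentile lines weighted by projected playing time; the post does not spell out the weighting, so treat that as an assumption. All of the numbers and names below are hypothetical.

```python
def weighted_mean(values, weights):
    """Average of values, with each value counted in proportion to its weight."""
    total = sum(weights)
    if total == 0:
        raise ValueError("weights must not sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total

# Hypothetical percentile lines for one hitter: percentile -> (batting average, plate appearances).
percentile_lines = {
    10: (0.230, 250),
    25: (0.250, 350),
    50: (0.270, 450),
    75: (0.290, 550),
    90: (0.310, 600),
}

averages = [avg for avg, _ in percentile_lines.values()]
plate_appearances = [pa for _, pa in percentile_lines.values()]

# Weighted mean: blends every percentile line, weighted here by playing time (an assumption).
wm_average = weighted_mean(averages, plate_appearances)

# Fiftieth percentile: simply the middle line, ignoring the others.
fiftieth_average = percentile_lines[50][0]
```

Under this assumed weighting, the better outcomes come with more playing time and so pull the weighted mean above the fiftieth percentile line. That is one reason the two stat lines for the same player need not match.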
We're working on the pitcher cards now. They'll be available next week.