Got another round of updates done and sent out for PFM, depth charts, and the weighted means spreadsheet late last night. What’s changed?
* SS/Sim is in. Thanks to Mike for the help with those.
* Upside - the Upside is calculated from a series of player forecasts; it is essentially the player’s runs above average over a six-year period. Nate calculated the upside only by looking at the current set of comparable players; I’ve calculated it by iteratively running the player’s forecast into the future. The first time I ran that, the forecast was definitely too ‘hot’ - virtually every 20-year-old eventually turned into a .350 hitter, as optimism built up like a runaway resonance effect. I’ve put a quick damper on that for now (reading from a lower forecast level will reduce the influence of age, which was the primary variable being over-emphasized); a larger fix, with the forecast starting out highly optimistic but regressing towards a median level over time, will take more time to run. I’ll also revisit the “classic Nate” method - I changed it in the first place because the iterative method worked better, but I’ve made enough other changes since then that I can’t be sure that’s still the case. I’ll run some tests starting from the early 2000s - whichever version makes the best projections for the late 2000s will enter the program.
* Steals - I know there was some discussion of stolen bases looking too optimistic, and there was a good bad reason for that - a piece of code that was regressing stolen base percentage towards the league average was actually regressing it towards 1, making a ~10-point gain for the typical player. Not noticeable on a guy with 5 steals - but very much so on a guy with 30.
* Strikeouts - hitter strikeouts were an area where the initial version was performing noticeably worse than last year’s PECOTA, and I made some changes that wipe out about 75% of that difference (which does mean that the current one is still doing worse than last year for some reason, but with an average error about 0.3 worse instead of 1.3 worse). The change had to do with how the stats are weighted to determine a player’s baseline rate of performance. For both the tested player and his comparables, we build a weighted mean of his prior three years of performance - this establishes a baseline that is then tested against the fourth year, and those differences are what drive PECOTA. How that weighted mean is built mainly depends on the age of the player - for very young and old players, the most recent year is the strongest driver, while for mid-career guys you tend more towards a simple three-year average. Among all age groups, though, strikeouts need to be strongly driven by the most recent year - and that change, allowing different stats to be weighted differently for the same player, is new. I wouldn’t be surprised to find that other stats will also benefit.
* Depth charts - John Perrotto collected a series of depth charts from the beat writers for each team, with their opinions about how the lineups would look come April. I’ve allowed those to influence, and in some cases strongly influence, my thinking on various players; I’ve also rejected them in places where the result simply made no sense. The beat writers were focused primarily on an Opening Day lineup, while I’m trying to establish patterns for the entire season - I can readily believe that teams would be stupid enough to start Sucky Player A in April, but I don’t think that they’ll continue sticking with him in July.
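The stolen base bug in the list above is easy to picture with a generic regression-to-the-mean formula. What follows is only a sketch, not the actual PECOTA code: the league success rate, the regression weight, and the function names are all made-up values for illustration.

```python
# Illustrative sketch only: every constant here is an assumption,
# not a value taken from PECOTA.

LEAGUE_SB_PCT = 0.72   # assumed league-average stolen base success rate
WEIGHT = 0.30          # assumed share of the estimate that comes from the target


def regress(observed, target, weight):
    """Pull an observed rate toward a target rate.

    weight=0 returns the observation unchanged; weight=1 returns the target.
    """
    return (1 - weight) * observed + weight * target


observed_pct = 0.70

# The bug: the target was 1 (a perfect success rate) instead of the league rate.
buggy_pct = regress(observed_pct, 1.0, WEIGHT)

# The fix: regress toward the league average.
fixed_pct = regress(observed_pct, LEAGUE_SB_PCT, WEIGHT)


def extra_steals(attempts):
    """Successful steals added by the bug for a given number of attempts."""
    return attempts * (buggy_pct - fixed_pct)


for attempts in (7, 40):
    print(attempts, "attempts ->", round(extra_steals(attempts), 2), "extra steals")
```

The inflation is a fixed bump in success rate, so the number of phantom steals scales with attempts, which is why it hides on a low-volume runner and stands out on a high-volume one.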
Another slew of words about the depth charts in general. PECOTA is a system geared for the projection of individual players. It is not run for teams - the depth chart takes playing time estimates from a person, looks up the PECOTA projection for each player, and adds those up for every player on the team to generate “team totals”. That is not the way to optimize the projection for a team. The sum of the individual projections is going to be greater than a proper team projection, and the sum of those team totals is going to be higher than a proper league projection. The reason the league doesn’t end up as high as the projections is not that the individual players projected will all do worse across the board - it is that teams will go deeper on their depth charts than we can reliably predict (and beyond the top two, it’s generally a crapshoot which minor leaguer gets called up). Some players are going to get hurt and fall dramatically short of their projected playing time, and there are likely to be more high estimates of PT than low ones. We’re listing 2-4 players per position, probably about 30 ‘slots’ per team. The Diamondbacks, to pick a team more or less at random, last year used 3 players at shortstop, 4 each at catcher, second, and third, 5 each in center and right, 6 in left, and 8 at first, plus 13 used as a DH or PH at least 5 times, which is somewhere between 39 and 52 ‘slots’ depending on how you count the PHs. We list 17 or 18 pitchers per team - the average team in 2009 used more than 24. Attempting to constrain PECOTA to the depth charts - by changing the numbers to match the expected league total - will damage the forecast. There were elements in the depth charts that were doing just that - I’ve been removing them as I find them, but we’re still doing it for playing time.
I may change that soon as well; since the new PECOTA does make a specific major league playing time estimate (the “Major” column on the spreadsheets is the expected percentage of a player’s playing time that comes in the majors), it doesn’t need to be nearly so reliant on the depth charts.
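To make the depth chart arithmetic above concrete, here is a rough sketch of how “team totals” fall out of playing time multiplied by per-player rates, and why a chart that lists too few players tends to run high. The player names, rates, and plate appearance figures are invented, and the idea that the unlisted call-up hits worse than the listed players is an assumption made for the example.

```python
# Hypothetical per-plate-appearance run rates; none of these are real projections.
runs_per_pa = {
    "Starter A": 0.14,
    "Backup B": 0.11,
    "Callup C": 0.09,   # assumed: a player too deep to appear on the chart
}


def team_runs(playing_time, rates):
    """Sum each player's projected rate times his plate appearances."""
    return sum(pa * rates[name] for name, pa in playing_time.items())


# What the depth chart lists: two players covering 700 PA at one position.
depth_chart = {"Starter A": 600, "Backup B": 100}

# What might actually happen: the starter gets hurt and the team goes deeper.
actual_usage = {"Starter A": 400, "Backup B": 200, "Callup C": 100}

chart_total = team_runs(depth_chart, runs_per_pa)
actual_total = team_runs(actual_usage, runs_per_pa)

print("depth chart total:", round(chart_total, 1))
print("actual usage total:", round(actual_total, 1))
```

No individual rate changed between the two totals; only the distribution of playing time did. That is the sense in which the individual projections can all be fine while their depth chart sum still overshoots.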
But since we also use those depth charts, rightly or wrongly, to assess a team’s expected wins, we have to find a way to reconcile the individual projections (which tend to produce too many runs for the offense, and allow too few for the defense). The runs scored and allowed totals that show up for the team have been balanced - total runs scored and allowed are made equal across the league, allowing Pythagorean win estimates to create a balanced won/loss record for the league. However, the batting line that goes with it is still just the sum of the individual player projections. So yes, there is a disconnect between the team slash line and the runs scored, and there is a disconnect between the sum of the players’ runs scored and the team runs scored. I haven’t figured out any way around that without compromising the quality of the individual projections.
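For anyone who wants to see the balancing step in miniature, here is one way it could be done. This is a sketch under stated assumptions, not the method actually used for the depth charts: it scales both sides to the midpoint of the two league totals, uses the classic Pythagorean exponent of 2, and runs on an invented three-team league.

```python
def balance_runs(teams):
    """Scale runs so league runs scored equals league runs allowed.

    teams maps a team name to (runs_scored, runs_allowed). Both sides are
    scaled to the midpoint of the two league totals (an assumed choice).
    """
    total_rs = sum(rs for rs, _ in teams.values())
    total_ra = sum(ra for _, ra in teams.values())
    target = (total_rs + total_ra) / 2
    return {
        name: (rs * target / total_rs, ra * target / total_ra)
        for name, (rs, ra) in teams.items()
    }


def pythagorean_win_pct(rs, ra, exponent=2.0):
    """Classic Pythagorean expectation; the exponent of 2 is an assumption."""
    return rs ** exponent / (rs ** exponent + ra ** exponent)


# Invented raw totals in which every offense looks too good.
raw = {
    "Team A": (800, 700),
    "Team B": (750, 720),
    "Team C": (780, 690),
}

balanced = balance_runs(raw)
win_pcts = {name: pythagorean_win_pct(rs, ra) for name, (rs, ra) in balanced.items()}
league_avg = sum(win_pcts.values()) / len(win_pcts)

for name, pct in win_pcts.items():
    print(name, round(pct, 3))
print("league average:", round(league_avg, 3))
```

Before balancing, all three invented teams outscore their opponents, which cannot happen in a closed league. After balancing, the league average winning percentage lands very near .500, but each team’s runs no longer match the sum of its players’ runs, which is the disconnect described above.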