

August 28, 2014
Moonshot
On Regressing Defense

We heard the first blows in the nascent MVP debate of 2014 unfold just last week. At the time, Alex Gordon led all players in fWAR (by a narrow margin), largely on the basis of his extraordinary defense in left field (15 fielding runs above average, fifth highest in MLB). In response, Jeff Passan wrote that the idea of Alex Gordon as the best player in baseball was absurd. Much wailing and gnashing of teeth ensued. To some of the doubters of sabermetrics, Gordon’s triumph on the leaderboards was yet more proof of the uselessness of WAR(P). To others, arguments against Gordon may have seemed ill-formed. Fortunately, Gordon no longer leads baseball players in any of the flavors of WAR(P) (whew, argument defused).

Even so, Alex Gordon brought to the surface a recurring theme in criticisms of the WAR framework: the weighting of defensive metrics. In theory, a run saved is a run scored. But whereas the relationship between singles, doubles (etc.), and runs produced is easily parsed with linear weights, defense is more difficult to measure. The steps between the events on the field and the runs being saved require more estimation, and that potentially injects more error into the final result.

A natural response to the additional error implicit in defensive measurements is to deem them unreliable and regress them accordingly. ‘Regression’ exists in the sabermetric lexicon as both an abstract concept and a concrete, mathematical transformation. In the abstract sense of the word, to regress a player’s defensive WAR(P), for example, is to mentally adjust his contribution back toward the mean, accounting for the uncertainty in the estimate, which is exactly what we’d like to do with defense.
We can formalize that mathematically by simply multiplying each player’s defensive value, calculated hereabouts as FRAA, by some constant which I’ll call a “regression factor,” r:

FRAA_{i} × r = regressed FRAA of player i

When r is less than one, a player’s defensive contribution is being pushed back towards zero, which is to say the average. For example, at a regression factor of .5, a defensive standout like last year’s Manny Machado loses half of his value, i.e., about 14 runs. Meanwhile, a defensively mediocre player that year, Matt Carpenter, also sheds half of his value, but this only amounts to a single run subtracted. The penalty is thus much stiffer at the extremes, both good and bad.

Consider the following graph, which shows the density of players at different FRAA values, with and without a regression factor of .5 applied.
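As a small numeric illustration of the regression factor described above, here is a minimal Python sketch. The function name `regress_fraa` is made up for this example, and the two FRAA inputs (28 and 2) are only the rough values implied by the "about 14 runs" and "a single run" figures in the text, not official numbers.

```python
def regress_fraa(fraa: float, r: float) -> float:
    """Scale a fielding-runs value toward zero (league average) by factor r."""
    return fraa * r


# Approximate 2013 FRAA values implied by the text (assumptions, not official data).
players = {"Manny Machado": 28.0, "Matt Carpenter": 2.0}
r = 0.5  # the regression factor used in the article's example

for name, fraa in players.items():
    regressed = regress_fraa(fraa, r)
    # The change grows with distance from zero, so extreme fielders move the most.
    print(f"{name}: {fraa:+.1f} -> {regressed:+.1f} (change {regressed - fraa:+.1f})")
```

Because the adjustment is a plain multiplication, a below-average fielder is pulled up toward zero by the same proportion that an above-average fielder is pulled down.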
12 comments have been left for this article.

Dan Brooks mentioned that he wasn't clear how the model was built and tested, so to remedy that, the model was: Aug 28, 2014 09:07 AM

Paul Clarke (71864) Thanks, I was going to ask that, as I wasn't clear on it either. Aug 28, 2014 15:13 PM

newsense (5112) Why did you use ERA instead of RA? Doesn't the removal of unearned runs confound the results? Aug 28, 2014 09:38 AM

I figured that errors are the easy to understand and calculate part of defense, and I wanted to show that even after we subtract them from the problem, defensive metrics still have value. Aug 28, 2014 09:58 AM

bisanders (329) Impressive work. Thanks very much. I do understand the desire to tease out errors for your purposes. Aug 28, 2014 16:39 PM

Thank you. You are quite right. Like you said, though, the impact should be minimal. Aug 28, 2014 16:53 PM

newsense (5112) Also, wouldn't it make sense to predict a counting stat such as total Runs allowed (better than replacement level) when you use a cumulative stat such as WARP to predict? Aug 28, 2014 09:46 AM

Since all teams play the same number of games (and roughly the same number of innings) over a full season, the difference between a rate stat (ERA) and a counting stat (earned runs allowed over the year) at the team level will be a simple multiplier which should be similar for all teams. In other words, it shouldn't make a difference. Aug 28, 2014 10:02 AM

1. We need a regression factor only when we are assessing how good a player is, not how good he has been. If he has saved X runs, that has happened, but it doesn't mean he is really that good. Probably at that point in time Alex Gordon had saved and produced the most runs, but it doesn't mean he is a better player or even necessarily a better defensive outfielder than Trout.
2. Shouldn't the regression factor be individualized for each player? For example, it makes sense to me that rookies should have a very large regression factor towards the league mean, or towards the historical league mean for rookies, which might be more generous than the league mean because players lose range as they age. We don't know if a rookie's defense is reliably as good as the runs he has saved so far in the season. However, the later in the season, the less we need to regress the rookie, as he has a larger sample size to establish a defensive ability. Instead of regressing towards a league mean, a veteran should be regressed towards his own normal defensive prowess, with an aging factor to it, as we do with batting projections.
3. Come to think of it, this is how we should consider players when voting for all-star teams, postseason awards, or single-season fantasy teams: something that is a mix of what they've done, both offensively and defensively, and what we would project from them, in order to get the most accurate view of how good they really are.
To be clear, I'm only using regression to account for the greater uncertainty of defensive metrics. The use case you are talking about is perhaps to estimate a player's true talent level, for instance for making a projection. But, as you note, for the purposes of MVP discussions we do not care so much about a player's true talent level as what they actually did on the field. The problem is that "what they actually did on the field" is possibly more uncertain for a player's defensive contributions than for their offensive contributions; hence the attempted use of a regression factor on their defensive contributions.
Unless you think that measurement uncertainty varies by player (which is entirely possible, but beyond the scope of this article), it would not be appropriate to apply a different regression factor to each player. You would apply a different factor when trying to estimate true talent levels, since they clearly do differ by player, but as I mentioned, that is a slightly different problem.
OK, we'll save individual regressions for true talent tests.
However, I don't see a big difference between how we measure hitters beyond the three true outcomes and using zone factors that cover the whole field for fielders. There is probably just as much luck involved. If anything, there is more subjectivity in what is a hit, as it could be judged as an error. A regression factor is therefore equally applicable.