I've written before about why I dislike the error and its cousins, due to their subjective nature. But how much does it matter, especially over a long period of time? Is there a practical consequence to the subjectiveness of the error?
So here's what I did. I looked at the rate of errors per ball fielded by infielders in all of a team's home games, including by the visitors, from 2002 to 2009, and did the same for each team's road games. Each of those rates was then regressed to the mean, and the regressed home rate was divided by the regressed road rate to produce a park factor. Then I averaged all the one-year regressed park factors over the full time span, and here's what I got:
Team | Park Factor |
---|---|
Anaheim/LA | 97 |
Arizona | 96 |
Atlanta | 106 |
Baltimore | 95 |
Boston | 100 |
Chicago (AL) | 103 |
Chicago (NL) | 102 |
Cincinnati | 96 |
Cleveland | 105 |
Colorado | 102 |
Detroit | 103 |
Florida | 99 |
Houston | 98 |
Kansas City | 97 |
Los Angeles | 98 |
Milwaukee | 105 |
Minnesota | 89 |
Montreal | 96 |
New York (AL) | 105 |
New York (NL) | 103 |
Oakland | 96 |
Philadelphia | 98 |
Pittsburgh | 98 |
St. Louis | 102 |
San Diego | 92 |
San Francisco | 103 |
Seattle | 99 |
Tampa Bay | 103 |
Texas | 100 |
Toronto | 102 |
Washington | 101 |
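For those who want to follow along, the method described above can be sketched in Python. To be clear, the numbers here are invented for illustration: the league error rate, the regression ballast of 400 balls fielded, and the team totals are all placeholders, not the actual inputs or regression amount used in the study.

```python
# Sketch of the park-factor method: regress home and road error rates
# toward the league rate, then take the ratio, scaled to 100.

LEAGUE_RATE = 0.018  # hypothetical league error rate per ball fielded
BALLAST = 400        # hypothetical regression amount (balls fielded)

def regress(errors, chances, league_rate=LEAGUE_RATE, ballast=BALLAST):
    """Shrink an observed error rate toward the league rate by adding
    'ballast' chances of league-average performance."""
    return (errors + league_rate * ballast) / (chances + ballast)

def park_factor(home_errors, home_chances, road_errors, road_chances):
    """Regressed home rate over regressed road rate, scaled so 100 is neutral."""
    home = regress(home_errors, home_chances)
    road = regress(road_errors, road_chances)
    return 100 * home / road

# One invented team-season; the real study averages one-year factors
# like this over 2002-2009 for each park.
pf = park_factor(home_errors=55, home_chances=2600,
                 road_errors=48, road_chances=2550)
```

Identical home and road rates come out to exactly 100, and the regression keeps a single extreme season from producing a wild factor.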
The standard deviation of the one-year park factors over the eight-year time span was about four percent in either direction; in other words, teams had an error park factor between 96 and 104 about 68% of the time.
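As a rough sanity check, the rounded eight-year averages in the table above show about the same spread (the figure in the article was computed from the one-year factors, so this is only a proxy):

```python
# Rounded eight-year average park factors from the table above.
factors = [97, 96, 106, 95, 100, 103, 102, 96, 105, 102, 103, 99,
           98, 97, 98, 105, 89, 96, 105, 103, 96, 98, 98, 102, 92,
           103, 99, 103, 100, 102, 101]

mean = sum(factors) / len(factors)
sd = (sum((f - mean) ** 2 for f in factors) / len(factors)) ** 0.5
# sd comes out just under 4, consistent with the ~4% spread cited.
```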
Of course, this raises the question: how much of the difference between parks is the scorers, and how much is an actual change in the way balls are hit at infielders? Bear in mind that for something to count as an opportunity here, a fielder has to reach the ball, so any real park effect would have to influence how cleanly a fielder can field a ball once he gets to it.
Now, I have my own suspicions (and familiarity with my work would probably tell you what those suspicions are). But it's just a suspicion – I'm not sure we really know, and I'm not sure that we will ever know for certain.
But it does give one pause – or at least it should – when using errors to attribute value to players, which is of course something we frequently do for hitters, pitchers, and fielders right now. (For anyone who thinks we don't use errors for evaluating hitters, or that we shouldn't: the very act of counting an error as an at-bat but not as a time reached safely implicitly, if not explicitly, folds the error into an evaluation of a hitter's prowess.)
Thank you for reading
This is a free article. If you enjoyed it, consider subscribing to Baseball Prospectus. Subscriptions support ongoing public baseball research and analysis in an increasingly proprietary environment.
Doesn't this fall into an 'immaterial' category when doing estimates?
Why just infielders? Am I missing something?
So outfield errors (which, as you note, are much, much rarer - and probably a bit more cut-and-dried as far as the scoring is concerned) are a bit more difficult to tease out. You'd want to look at batted ball data to sort out the relationship, but then you're left puzzling out what bias is in the batted ball data versus the judgement of the official scorer.
Not to get all nit-picky on you, but the standard deviation you quoted applies to data that is normally distributed. The data you've posted here is a small set, but actually looks bi-modal. Anyone want to take a stab at what that might mean?
And I don't see any reason to think that it's bimodal, looking at a histogram of the data. That isn't to say it's strictly normal, though - there's a pretty clear right skew to the data.