April 30, 2010
Checking the Numbers
A League Average Article
Talking to a friend last week, the topic of league averages came up, and I realized that my tone in the conversation relative to the term “average” did not match my actual feelings on the subject. When discussing the attributes of Phillies starter J.A. Happ, I countered the argument that the lefty was a front-of-the-rotation pitcher by opining that he was nothing more than a league-average hurler. Regardless of whether or not my sentiment rings true, I was alarmed at how I responded. My tone suggested that league-average players were not worth much, that such players lacked value; it was as if “league average” were a derogatory term. I immediately qualified the assessment, extolling the virtues of league-average production to the friend (who is real, I swear, and not a literary device), but consider this refresher on league averages and baselines my formal apology for the temporary slip.
What Is An Average?
Otherwise known as the mean of a dataset, the average broadly represents the value each data point would take if every point were identical. This measure of central tendency is designed to convey information about an entire dataset that is too vast to describe point by point. For instance, it would not be prudent to list every player’s batting average when explaining what the American League “hit” in 2008; it’s much easier to say that the league hit .267.
The results of the entire league cannot be processed at once, so the average stands in and lets someone draw conclusions about an entire group from one number. As long as it is a good average (one not skewed to one side or built from wildly divergent inputs, like a 5.00 ERA averaged between a 1.00 ERA pitcher and a 9.00 ERA pitcher), it also makes comparisons easier. It is much easier to compare the AL’s .267 BA in 2008 to the National League’s .260 mark, and the comparison is valid because the data tends to hover close to the center. This is the purpose of central tendency measures: to represent a group by where the points are concentrated.
If the Braves have 12 batters and 108 home runs, the mean for their hitters is nine dingers; that is, if everyone were to hit the same number of homers in order to amass the same 108 blasts, each batter would have hit nine. It doesn’t mean that every batter actually hit that mark, only that the group can be represented by this number. Coupled with the standard deviation or variance in a dataset, we can learn an awful lot about a robust set of numbers through just a few pieces of information.
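To make the Braves example concrete, here is a minimal Python sketch. The two lists of home-run totals are invented for illustration (they are not real Braves data); both sum to 108 across 12 batters, so they share a mean of nine while differing sharply in spread.

```python
from statistics import mean, pstdev

# Hypothetical home-run totals for 12 batters; each list sums to 108.
balanced = [9, 8, 10, 9, 7, 11, 9, 10, 8, 9, 9, 9]
top_heavy = [35, 28, 15, 10, 6, 5, 3, 2, 2, 1, 1, 0]

for label, homers in (("balanced", balanced), ("top-heavy", top_heavy)):
    # The mean is identical for both rosters; the (population) standard
    # deviation shows how tightly the totals cluster around that mean.
    print(f"{label}: total={sum(homers)}, mean={mean(homers)}, "
          f"std dev={pstdev(homers):.2f}")
```

The mean alone cannot tell the two rosters apart; the standard deviation is what reveals that the second one leans on a couple of sluggers.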
Averages In Baseball
In baseball, the league average in any metric is a fundamental concept and, for a very long time, served as the comparative baseline of value. Before the average came into use, baselines were uncommon because success was largely considered static: a .300 batting average was deemed great regardless of the league, even though it is less impressive in a league where the data points concentrate around an average of .292 than in a .265 league. This still occurs today, especially with batting average, but there has been a larger shift recently towards baselines as barometers. With the advent of metrics like OPS+ and ERA+, fans are now learning that a 4.10 ERA is better when the league is around 4.60 than when the league is at 3.90, which really helps in cross-era comparisons.
Unfortunately, anything serving as the comparative baseline takes on the appearance of “0,” which carries a negative connotation. How can anything be valuable if, from a valuation standpoint, it represents no added value? That negative implication would be well founded were it not for the fact that teams do not base all of their decisions on whether a player surpasses this average baseline. They want players who are better than the fungible AAAA or minor-league Timo Perez types, who make the league minimum salary and would be expected to produce very little in the major leagues. If teams seek players above this level, it doesn’t make much sense to use the league-average player, who is markedly better than Perez, as the means of gauging the value of their players. This doesn’t mean that knowing how far above or below the league a player performs is worthless, but rather that it isn’t gospel.
Muddying the waters is the recent shift towards fielding metrics as a means of developing a more well-rounded picture of a player’s true value. Suddenly, value has gone from how a player hits to a combination of his hitting, fielding and running, as it should.
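The league-relative idea can be sketched in a few lines of Python. The function below is a simplified, hypothetical index (100 times league ERA divided by pitcher ERA); it is not the published ERA+ formula, which also adjusts for ballpark, and the name `era_index` is my own.

```python
def era_index(pitcher_era: float, league_era: float) -> float:
    """Simplified league-relative ERA index: 100 is league average,
    higher is better. Ignores park adjustments (an assumption)."""
    return 100 * league_era / pitcher_era

# The same 4.10 ERA measured against the two league contexts from the text.
for league_era in (4.60, 3.90):
    print(f"4.10 ERA in a {league_era:.2f} league: "
          f"index {era_index(4.10, league_era):.0f}")
```

Under this sketch the identical 4.10 ERA lands above 100 in the 4.60 league and below 100 in the 3.90 league, which is the whole point of measuring against a baseline rather than a fixed standard.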
The Replacement Level
Keith Woolner’s seminal work on the concept of the replacement player—the term for that $410,000 player hanging around in Triple-A, available at the snap of a finger—established that league average production was terribly underrated, especially given the value of league average playing time. A superstar for 250 plate appearances added to a replacement-level player for the other 350 trips to the dish could be equal in value to 600 plate appearances from an average player. Of course, the ideal is a durable superstar, but when durability fades in a superstar markedly above the average, suddenly that league-average production and given range of performance looks mighty attractive.
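The playing-time arithmetic can be sketched as follows. All rates here are assumptions chosen for illustration: replacement level is zero by definition, the average player is set at 2.0 wins above replacement per 600 plate appearances (the low end of the range cited in this article), and the hypothetical superstar is set at a 4.8-win pace.

```python
FULL_SEASON_PA = 600

def wins_above_replacement(rate_per_600: float, plate_appearances: int) -> float:
    """Scale a per-600-PA win rate to the playing time actually logged."""
    return rate_per_600 * plate_appearances / FULL_SEASON_PA

# Assumed rates (wins above replacement per 600 PA); illustrative only.
SUPERSTAR, AVERAGE, REPLACEMENT = 4.8, 2.0, 0.0

# One lineup slot split between a fragile superstar and his fill-in.
split_slot = (wins_above_replacement(SUPERSTAR, 250)
              + wins_above_replacement(REPLACEMENT, 350))
# The same slot held all year by a league-average player.
everyday = wins_above_replacement(AVERAGE, 600)

print(f"superstar (250 PA) + replacement (350 PA): {split_slot:.1f} wins")
print(f"average player (600 PA): {everyday:.1f} wins")
```

With these assumed rates the two lineup slots come out to about the same value, so the durable average player matches the superstar who misses more than half the season.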
League-average players are not wasting away in the minors, either. The bell curve of talent, per Woolner’s original studies, features a small number of superstars, more players hovering close to the center, and a practically unlimited number of players on the replacement-level end. Those replacement-level players can provide a guaranteed .230/.275/.330. So the appropriate measurement of value involves comparing a player to this type of player, not the league average. Usually, the league-average player is around two or two-and-a-half wins above replacement, with the replacement level set to zero, a figure that connotes negatively because, well, you really don’t want to employ replacement-level players in the big leagues.
Personifying Average: The Oddibe Awards
One of the toughest aspects of understanding baselines is the lack of an ability to personify the terms. It is easy to understand performance volatility through Joel Pineiro and consistency through Jon Garland, but who embodies the average? Who embodies the replacement level, for that matter? Ironically, the latter is easier to personify through AAAA players like Chris Shelton and Mike Hessman, but what is the league average? To accomplish this personification, I have employed a method derived from a conversation with friend RJ Anderson a couple of years back.
Anderson found that the career slash line of Oddibe McDowell—former journeyman and Arizona State teammate of Barry Bonds—most closely resembled the all-time major league average slash line. Therefore, each year I hand out The Oddibe Awards, designed to honor those players who came the closest to personifying their league’s average line. Now, this is purely a hitting award, because even though fielding and baserunning are inputs to the average output, they are much more difficult to envision. Here are the awards for the last decade:
Looking at the names on this list, we see a combination of players who went on to become replacement-level, all-stars at the end of their careers, and players like Graffanino, who tend to come to mind when thinking about average hitters. Entering Thursday night’s action, the NL was hitting .258/.333/.407 while the AL was hitting a surprising .256/.332/.406. Therefore, the current frontrunners for this year’s trophy are, respectively, Dexter Fowler (.253/.337/.418) and Curtis Granderson (.243/.329/.414).
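One way to score closeness to the league line is sketched below in Python. The article does not spell out the exact method, so the metric here (the sum of absolute differences across the three slash components) and the names `slash_distance` and `closest_to_league` are assumptions, and the second candidate is an invented placeholder.

```python
from typing import Dict, Tuple

Slash = Tuple[float, float, float]  # (AVG, OBP, SLG)

def slash_distance(player: Slash, league: Slash) -> float:
    """Sum of absolute gaps between a player's line and the league line."""
    return sum(abs(p - l) for p, l in zip(player, league))

def closest_to_league(candidates: Dict[str, Slash], league: Slash) -> str:
    """Return the candidate whose slash line sits nearest the league line."""
    return min(candidates,
               key=lambda name: slash_distance(candidates[name], league))

nl_line: Slash = (0.258, 0.333, 0.407)
candidates = {
    "Dexter Fowler": (0.253, 0.337, 0.418),        # line quoted in the text
    "Hypothetical Player": (0.290, 0.360, 0.480),  # invented for contrast
}

winner = closest_to_league(candidates, nl_line)
print(winner, round(slash_distance(candidates[winner], nl_line), 3))
```

A different distance measure, such as one that weights on-base percentage more heavily, could crown a different winner, so the choice of metric is itself a judgment call.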
Conclusion Above Replacement
For many of you, evaluating players above replacement is second nature, and metrics that use the league average appear outdated and obsolete, but I hope that this served as a worthwhile refresher to the rest. The concept of an average is an example of something we know, but might have trouble explaining. If asked to describe an average, it’s tough to do so without using the word in its own definition; we intuitively know what it is but not what it tells us.
In this great sport, averages help to condense a multitude of data points into a central number that can be compared to other averages. However, unlike in other industries, these averages should not be treated as the baselines, because below-average players still have value: they are above the truer baseline, the replacement level. So next time you see a player hitting .265/.335/.422 and feel unimpressed by the numbers, just think: that player could very easily be a .232/.278/.341 hitter, at which point the difference would become much, much more evident.