

April 30, 2010
Checking the Numbers
A League Average Article
While I was talking to a friend last week, the topic of league averages came up, and I realized that my tone in the conversation relative to the term “average” did not match my actual feelings on the subject. When discussing the attributes of Phillies starter J.A. Happ, I countered the argument that the lefty was a front-of-the-rotation pitcher by opining that he was nothing more than a league-average hurler. Regardless of whether or not my sentiment rings true, I was alarmed at how I responded. My tone suggested that league-average players were not worth much, that such players lacked value; it was as if league average was a derogatory term. I immediately qualified the assessment, extolling the virtues of league-average production to the friend (who is real, I swear, and not a literary device), but consider this refresher on league averages and baselines my formal apology for the temporary slip.

What Is An Average?

Otherwise known as the mean of a dataset, the average broadly represents the value each data point would take if every point were identical. This measure of central tendency is designed to convey information about an entire dataset that is otherwise too vast to describe point by point. For instance, it would not be prudent to list every player’s batting average when explaining what the American League “hit” in 2008; it’s much easier to say that the league hit .267. The results of the entire league cannot be processed at once, so the average stands in and lets someone summarize an entire group with one specific number. As long as it is a good average (as in, one not skewed to a certain side or built from wildly divergent inputs, like a 5.00 ERA averaged between a 1.00 ERA pitcher and a 9.00 ERA pitcher), an additional benefit is ease of comparison.
It is much easier to compare the AL’s .267 BA in 2008 to the National League’s .260 mark, and the comparison is valid because the data tends to hover close to the center. This is the purpose of central tendency measures: to represent a group by where the points are concentrated. If the Braves have 12 batters and 108 home runs, the mean for their hitters is nine dingers; that is, if everyone were to hit the same number of homers in order to amass the same 108 blasts, each batter would hit nine. It doesn’t mean that every batter actually hit that mark, only that the group can be represented by this number. Coupled with the standard deviation or variance in a dataset, we can learn an awful lot about a robust set of numbers through just a few pieces of information.

Averages In Baseball

Relating to baseball, the league average in any metric is a fundamental concept and, for a very long time, the comparative baseline of value. Before the average came into use, baselines were not common, as success was largely considered static: a .300 batting average was deemed great regardless of the league, even though it would be less impressive in a league wherein the data points concentrate around an average of .292 than it would in a .265 league. This still occurs today, especially with batting average, but there seems to be a larger shift recently towards baselines as barometers. With the advent of metrics like OPS+ and ERA+, fans are now learning that a 4.10 ERA is better when the league is around 4.60 than if the league were at 3.90, which really helps out in cross-era comparisons. Unfortunately, anything serving as the comparative baseline takes on the appearance of “0,” which connotes quite negatively. How can anything be valuable if, from a valuation standpoint, it represents no added value?
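As a rough illustration of the two ideas above (the mean as a stand-in for a group, and a league baseline as context for a raw number), here is a minimal Python sketch. It uses only figures from the text. The `league_relative_era` helper is a hypothetical simplification in the spirit of ERA+ that ignores park adjustments; it is not the published formula.

```python
from statistics import mean, pstdev

# Braves example from the text: 108 home runs spread over 12 batters.
braves_hr_mean = 108 / 12  # nine homers per batter

# The "bad average" example: two very different pitchers share a 5.00 mean,
# and the large spread is the warning sign.
eras = [1.00, 9.00]
era_mean = mean(eras)
era_spread = pstdev(eras)  # population standard deviation


def league_relative_era(era, league_era):
    """Hypothetical simplified index: 100 is league average, higher is better.

    Assumption: no park or era adjustments, unlike a real ERA+.
    """
    return 100 * league_era / era


in_high_scoring_league = league_relative_era(4.10, 4.60)
in_low_scoring_league = league_relative_era(4.10, 3.90)

print(braves_hr_mean, era_mean, era_spread)
print(round(in_high_scoring_league), round(in_low_scoring_league))
```

The same 4.10 ERA lands above 100 against a 4.60 league and below 100 against a 3.90 league, which is the comparison the league-relative metrics are built to make.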
The negative implications would be well founded were it not for the fact that teams do not base all of their decisions on whether or not a player surpasses this average baseline. They yearn for players to be better than the fungible AAAA or minor-league Timo Perez types, who make the league minimum salary and would be expected to produce very little in the major leagues. If teams seek players above this level, it doesn’t make much sense to use the league-average player, who is markedly better than Perez, as a means of gauging the value of their players. This doesn’t mean that knowing how much above or below the league a player performs is worthless, but rather that it isn’t the gospel. Muddying the waters is the recent shift towards fielding metrics as a means of developing a more well-rounded picture of a player’s true value. Suddenly, value has gone from how a player hits to a combination of his hitting, fielding and running, as it should.

The Replacement Level

Keith Woolner’s seminal work on the concept of the replacement player (the term for that $410,000 player hanging around in Triple-A, available at the snap of a finger) established that league-average production was terribly underrated, especially given the value of league-average playing time. A superstar for 250 plate appearances added to a replacement-level player for the other 350 trips to the dish could be equal in value to 600 plate appearances from an average player. Of course, the ideal is a durable superstar, but when durability fades in a superstar markedly above the average, suddenly that league-average production and given range of performance looks mighty attractive. League-average players are not wasting away in the minors, either, as the bell curve of talent, per Woolner’s original studies, boasts a small number of superstars, more players hovering close to the center, and an unlimited number of players on the replacement-level end. These players can provide a guaranteed .230/.275/.330.
So the appropriate measurement of value involves comparing a player to this type of player, not the league average. Usually, the league-average player tends to be around two or two-and-a-half wins above replacement, with the replacement level obviously set to zero, connoting negatively because, well, you really don’t want to employ replacement-level players in the big leagues.

Personifying Average: The Oddibe Awards

One of the toughest aspects of understanding baselines is the difficulty of personifying the terms. It is easy to understand performance volatility through Joel Pineiro and consistency through Jon Garland, but who embodies the average? Who embodies the replacement level, for that matter? Ironically, the latter is easier to personify through AAAA players like Chris Shelton and Mike Hessman, but what is the league average? To accomplish this personification, I have employed a method derived from a conversation with friend RJ Anderson a couple of years back. Anderson found that the career slash line of Oddibe McDowell, former journeyman and Arizona State teammate of Barry Bonds, most closely resembled the all-time major league average slash line. Therefore, each year I hand out The Oddibe Awards, designed to honor those players who came the closest to personifying their league’s average line. Now, this is purely a hitting award, because even though fielding and baserunning are inputs to the average output, they are much more difficult to envision. Here are the awards for the last decade:
Looking at the names on this list, we see a combination of players who went on to become replacement-level, All-Stars at the end of their careers, and players like Graffanino, who tend to come to mind when thinking about average hitters. Entering Thursday night’s action, the NL was hitting .258/.333/.407 while the AL was hitting a surprising .256/.332/.406. Therefore, the current frontrunners for this year’s trophy are, respectively, Dexter Fowler (.253/.337/.418) and Curtis Granderson (.243/.329/.414).

Conclusion Above Replacement

For many of you, evaluating players above replacement is second nature, and metrics that use the league average appear outdated and obsolete, but I hope that this served as a worthwhile refresher to the rest. The concept of an average is an example of something we know, but might have trouble explaining. If asked to describe an average, it’s tough to do so without using the word in its own definition; we intuitively know what it is but not what it tells us. In this great sport, averages help to condense a multitude of data points into a central number capable of being used in comparisons to other averages. However, unlike in other industries, these averages should not be treated as the baselines, because below-average players still have value, since they are above the truer baseline, the replacement level. So next time you see a player hitting .265/.335/.422 and find yourself unimpressed by the numbers, just think: that player could very easily be a .232/.278/.341 hitter, and then the differences would become much, much more evident.
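For readers who want to try the Oddibe idea themselves, here is a minimal Python sketch of picking the slash line closest to a league average. The article does not say how "closest" is measured, so the Euclidean distance used here is an assumption, as are the function names. The candidate lines are all quoted in this article; the last two are the illustrative lines from the conclusion, not real players.

```python
import math

# Slash lines (AVG, OBP, SLG) quoted in the article.
NL_AVERAGE = (.258, .333, .407)
CANDIDATES = {
    "Dexter Fowler": (.253, .337, .418),
    "the .265/.335/.422 hitter": (.265, .335, .422),
    "the .232/.278/.341 hitter": (.232, .278, .341),
}


def distance(line, league):
    """Euclidean distance between two slash lines (an assumed metric)."""
    return math.dist(line, league)


def closest_to_average(candidates, league):
    """Return the candidate name whose slash line is nearest the league line."""
    return min(candidates, key=lambda name: distance(candidates[name], league))


winner = closest_to_average(CANDIDATES, NL_AVERAGE)
print(winner)
```

A different metric (for example, weighting on-base percentage more heavily) could change the ordering among close candidates, so the choice of distance is part of the award's definition.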
Eric Seidman is an author of Baseball Prospectus.

11 comments have been left for this article.

harderj (32137)
Thanks for this article, though now I'm bummed I drafted Dexter Fowler in my Strat league :).
Apr 30, 2010 08:37 AM

Baseball statistics are generally normally distributed (or close to it) with regards to opportunities (plate appearances/batters faced). In other words, the number of PAs given to hitters one standard deviation above the mean is typically symmetrical with the number of PAs given to hitters one SD below the mean.
Apr 30, 2010 08:54 AM

Except that I wrote it :).
Apr 30, 2010 09:07 AM

Dr. Dave (1652)
Actually, your first example is misleading. If the league hit .267, that's almost certainly not the average of the batting averages of the players in the league, and thus not a measure of central tendency at the *player* level.
Apr 30, 2010 15:51 PM
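Dr. Dave's distinction can be seen in a small Python sketch. The two hitters below are hypothetical, with numbers invented only for illustration: a regular with many at-bats and a part-timer with few. Pooling hits and at-bats gives the league figure, while averaging the individual batting averages gives something else.

```python
# Hypothetical hitters as (hits, at_bats); these numbers are made up.
hitters = [(150, 500), (10, 50)]

# League batting average pools everything: total hits over total at-bats.
league_ba = sum(h for h, _ in hitters) / sum(ab for _, ab in hitters)

# Mean of the individual batting averages treats every hitter equally.
mean_of_player_bas = sum(h / ab for h, ab in hitters) / len(hitters)

print(f"{league_ba:.3f} vs {mean_of_player_bas:.3f}")
```

The pooled figure is effectively weighted by playing time, so it sits closer to the regular's average than the player-level mean does.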

Any idea on how many games an all average player team would win in a season?
My first thought is that we usually think of average as being in the middle, so an all average team would win 81 games.
Then I considered that an all replacement player team is expected to win about 40 games (I think, I don't recall exactly). So each player would contribute 40/25=1.6 wins. If an average player is worth 2.25 wins above replacement (splitting the range given in the article of 2 to 2.5), then an all average player team would win (1.6+2.25)*25=96.25 games. I don't know, that seems high to me.
An average player given a full season's worth of playing time (say about 150 games for a position player, something like 120 innings for a pitcher - this is all off the top of my head, so those figures may be off a bit) is worth somewhere around 2-2.5 wins above replacement.
The thing is you run into a marginal return point, where you can't get that much playing time for every player. Even if your bench players are average for hitting/defense, they won't accumulate 2-2.5 wins because they won't see that much time on the field.
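The arithmetic in this thread can be laid out as a short Python sketch. The inputs are the commenters' own rough figures (a 40-win replacement team, 2.25 wins above replacement for a full-time average player, a 25-man roster). The 162-game schedule and the "full-time equivalent" framing are assumptions added here for illustration.

```python
REPLACEMENT_TEAM_WINS = 40   # commenter's recollection, not a precise figure
AVG_PLAYER_WAR = 2.25        # midpoint of the article's 2 to 2.5 range
ROSTER_SIZE = 25
SEASON_GAMES = 162           # assumption: standard MLB schedule

# Naive version: every roster spot earns a full season of average value.
naive_wins = REPLACEMENT_TEAM_WINS + ROSTER_SIZE * AVG_PLAYER_WAR

# Working backwards: if an all-average team finishes at .500, how many
# full-time-equivalent average players does the playing time support?
wins_above_replacement = SEASON_GAMES / 2 - REPLACEMENT_TEAM_WINS
full_time_equivalents = wins_above_replacement / AVG_PLAYER_WAR

print(naive_wins)                       # 96.25, matching the comment above
print(round(full_time_equivalents, 1))  # 18.2
```

Under these assumptions, roughly 18 full seasons of playing time, not 25, are available to share across the roster, which is the marginal-return point described in the reply.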