Welcome to the first and possibly last article in the Baseball Prospectus NerdFight series, the place for nerd-on-nerd action that’s too raw for prime time, and too rough for a regular column. Today’s subject is league difficulty adjustments.
David Gassko of the Hardball Times is one of the more readable and reliable statheads out there, so when he published an article last week suggesting that we had vastly overestimated the improvement in league quality over time in Baseball Between the Numbers, I paid attention. David’s contention is that our method for estimating changes in league difficulty failed to account for regression to the mean, and that if you do account for regression to the mean, the league difficulty curve winds up being much less steep than it would be otherwise. I encourage you to read David’s article and his follow-up piece to digest his argument.
I’m actually less qualified to be addressing this subject than you’d think. Although I wrote the “Is Barry Bonds Better than Babe Ruth?” chapter in Baseball Between the Numbers, which addresses the league quality question in some detail, the underlying work on league quality was done by Clay Davenport. My contribution to that chapter was in framing Clay’s work; I have not done any hardcore statistical analysis on this subject myself. In addition, it is difficult to examine the assumptions behind David’s work in much detail, because his methodology is not terribly well spelled out. He asserts that regression to the mean is hugely important to take into account, which might well be the case, but I don’t know how exactly he’s accounting for regression to the mean.
That lack of information is problematic, because this is a topic that is particularly sensitive to assumptions. Because of the way that the league quality adjustment chains itself forward from season to season, any small problems with methodology will tend to feed on themselves, resulting in vastly different conclusions when you’re examining performances that are many seasons removed from one another. Clay’s work, for example, implies that Honus Wagner would be a mediocre player if he were playing baseball today, while David’s concludes that he would be an All-Star.
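To make the compounding problem concrete, here is a minimal sketch. The per-season percentages below are purely illustrative assumptions of mine, not Clay’s or David’s actual estimates; the point is only that two annual improvement rates that look nearly identical in any one season diverge enormously once chained across a century.

```python
# Illustrative only: a small difference in the estimated per-season
# change in league quality compounds when the adjustment is chained
# forward from season to season.

def compounded(rate_per_season: float, seasons: int) -> float:
    """Cumulative quality multiplier after chaining `seasons` adjustments."""
    return (1.0 + rate_per_season) ** seasons

# A hypothetical 1.0% vs 0.5% annual improvement estimate is nearly
# indistinguishable over a single season...
one_year_gap = compounded(0.010, 1) / compounded(0.005, 1)    # ~1.005

# ...but across the ~100 seasons separating Wagner's era from ours,
# the two estimates disagree by roughly 64%.
century_gap = compounded(0.010, 100) / compounded(0.005, 100)  # ~1.64
```

That gap is roughly the difference between projecting Wagner as a mediocre modern player and projecting him as an All-Star.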
The other problematic aspect of this topic is that it is difficult to find a frame of reference. The idea that Honus Wagner would be a modern incarnation of Adam Everett strikes me as inherently plausible, but so too does the idea that he would be an All-Star caliber performer. We just don’t know, and unless someone figures out how to construct a time machine, we are probably never going to find out.
What we can do, however, is to apply our baseball horse sense to phenomena that are a bit more self-contained, and that occupy briefer intervals in time. For example:
David’s work implies that the American League of 1945 — when a great number of players were serving the country overseas — was about 5% less difficult than the league was in 1941. Clay’s work implies that the difference is more like 15%.
David’s work suggests that the Union Association of 1884 was about 10% less difficult than the National League in that season; Clay’s work puts the difference at 30-35%.
David concludes that the National League got about 5% easier between 1900 and 1902, as baseball went from having 8 major league teams to 16 major league teams. Clay’s work pegs the difference at closer to 12%.
David’s work suggests that there was no perceptible difference in quality between the American and National Leagues in 2006; Clay’s numbers argue that the AL was 3-5% more difficult(**).
You can probably guess where I’m going with this.
I did a great deal of work on baseball during World War II for an upcoming chapter in It Ain’t Over ’til It’s Over: The Baseball Prospectus Pennant Race Book. What I found was that the quality of competition in baseball was absolutely blown to smithereens by the war. By 1945, more than half of the major league player pool, and three-quarters of the minor league player pool, was unable to participate because of the war. I was able to estimate through some fairly exhaustive methods that the average team in the 1944 American League had lost about 17 wins worth of talent to the war. That is an absolutely enormous difference; I don’t buy for a second that the shortfall in league quality was just 5%, not when major league teams were employing one-armed outfielders and 15-year-old pitchers.
Similarly, I don’t buy that the Union Association was only a little bit worse than the National League in 1884. Take a look at the Union Association for yourself. You’ll see a team that went 94-19, and another team that went 16-63. You’ll see that several teams folded after playing but a handful of games. Those things simply don’t happen in a competitive, well-organized baseball league. Moreover, if you examine the league’s rosters, you’ll find that the vast majority of players either never played outside the UA at all, or had only a cup of coffee in the “real” major leagues. The only regular exceptions were on the St. Louis Maroons, which was the team that went 94-19. A 10% difference in league quality is less than the difference between the major leagues and Triple-A, which generally works out to about 15%. Would a decent major league team go 94-19 against Triple-A competition? No way. The Union Association was a major league in name only, and as a minor league, it was probably playing at a level much closer to Single-A.
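You can check the “no way” with a quick binomial calculation. The mapping below, where a 15% quality edge translates into roughly a .650 true winning percentage, is my own rough illustrative assumption, not anyone’s published method; but even granting it, a 94-19 record is wildly improbable.

```python
from math import comb

def prob_at_least(wins: int, games: int, p: float) -> float:
    """Exact binomial probability of at least `wins` wins in `games`,
    given a per-game true winning probability `p`."""
    return sum(comb(games, k) * p**k * (1 - p)**(games - k)
               for k in range(wins, games + 1))

# Hypothetical mapping: suppose a 15% quality edge over Triple-A were
# worth a .650 true winning percentage. How often would such a team
# go 94-19 (or better) over 113 games?
p_94_wins = prob_at_least(94, 113, 0.65)
# The answer is a tiny fraction of a percent -- effectively never.
```

If a genuinely major league team almost never posts an .832 winning percentage against Triple-A-caliber opposition, the 94-19 Maroons were facing something much weaker than a 10% quality gap implies.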
I am not terribly familiar with the turn-of-the-century period in baseball, but it seems implausible that you could go from having 8 major league teams to 16 major league teams, and only experience a 5% decline in league quality.
I am quite familiar with the 2006 season, and it seems implausible that the difference in competition between the American and National Leagues was barely perceptible, when the AL had a 154-98 edge in interleague play.
If David’s method is wrong about these little things, then chances are that it is also wrong about the big thing, which is how much the quality of competition has improved in baseball over time.
NerdFight over and out.
** Edit/Caveat: I may be misreading David’s graphs, as it is not clear to me whether his analysis runs through 2005 or 2006. Either way, his analysis suggests that there have rarely if ever been any perceptible differences in quality between the American and National Leagues, which strikes me as implausible. Even as crude a measure as All-Star game results implies some cyclical and systematic differences in the quality of competition between the two major leagues; the All-Star game results happen to dovetail exceptionally well with Clay’s numbers. In short, I’m quite certain that David is regressing to the mean too much, and throwing the baby out with the bathwater.