July 13, 2001
Doctoring The Numbers
The Hitters' League?
It may seem that the Junior Circuit has always been the league of inflated offense, as a result of--take your pick--weaker pitching, cozier ballparks, weaker pitching, smaller strike zones, and weaker pitching. That's not the case. Prior to the installation of the DH in 1973, the two leagues had virtually identical offensive levels. If anything, the NL was the more offensive of the two.
The league with the most runs scored per game (since 1900) is the 1930 NL (5.68), which also holds the records for OPS (808) and batting average (.303). A sampling of the leagues over the years will give you a sense of how evenly matched they were before the DH:
Year AL OPS NL OPS Diff AL ERA NL ERA Diff
The most significant difference between the two leagues occurred in 1920, with Babe Ruth leading the revolution (he increased the league's OPS by nearly seven points all by himself) towards a more powerful game. It only took the NL two years to catch on; by 1922, the NL had the higher ERA, and Rogers Hornsby became the only hitter in history to hit .400 with 40 homers. World War II hit AL hitters hard, as the league ERA dipped by more than a full point between 1940 and 1945, but strangely, NL hitters were hardly affected at all.
The AL had the lower OPS and ERA in every year but one between 1965 and 1972, which goes a long way towards explaining why they were the league willing to experiment with the DH in the first place.
And the DH certainly had its intended effect. In every year since 1973, the AL has had the higher ERA of the two leagues:
Year AL ERA NL ERA Diff Year AL ERA NL ERA Diff
In the early years of the DH, the difference between the two leagues was very slim (invisible in the case of the 1974 season), which only highlights the fact that the AL endorsed the DH because they were desperate to inject offense into their games; without it, their offensive totals would have lagged far behind the NL's. But since 1979, the AL has had consistently higher offensive totals than the NL. With the exception of 1981 (a strike season) and, for some reason, 1990, the AL has had an ERA more than a quarter-run higher than the NL in every season.
But does the AL truly have better hitters, or simply more hitters? If you eliminate DHs and pitchers from the equation entirely, do the leagues even out, or does the AL still have the higher offensive totals?
Unfortunately, I don't have the game-by-game box scores needed to eliminate DHs from league totals, but it is fairly easy to eliminate the performance of pitchers. The following chart lists the OPS for each league since 1973, as well as the OPS of each league once pitchers' hitting totals are removed (pitchers' hitting data is not available after 1998):
Year AL OPS NL OPS Diff AL OPS (no P) NL OPS (no P) Diff (no P)
The chart shows that once pitchers are removed from the equation, NL hitters were actually more productive up through 1978, and though the AL has been the more productive league since, the difference is fairly small. On raw totals alone, the AL has had a 30-35 point edge in OPS over the last 20 years, but approximately 23 of those points are purely the result of pitchers batting in the NL. (Interleague play has had a measurable, but small, impact on these numbers. AL pitchers now pick up a few hundred at-bats a season, lowering the league's OPS by about two points.)
The remaining 10-point edge is still exaggerated; because DHs have no defensive responsibilities, it is much easier to find a DH who can hit than, say, a shortstop who can, so the "average" DH is a considerably better hitter than the "average" AL position player. If we were able to eliminate DHs as well as pitchers from the equation, the 10-point edge in OPS that the AL possesses would shrink almost to nothing.
This is an absolutely crucial point to understand: the fact that the AL has had consistently higher offensive totals than the NL over the last 25 years does not mean that AL hitters are better, or that AL pitchers are worse. The difference is almost entirely an illusion created by replacing pitchers with designated hitters for nearly 5,000 at-bats over the course of a season.
What this means is that if, say, two hitters bat .270/.340/.450, one in the AL and one in the NL, the NL hitter will look better relative to his league than the AL hitter does. That is an inaccurate (and therefore irrelevant) comparison, because the two league averages are built from different components. Comparing a hitter to league averages can only be done after pitchers' hitting totals have been removed from the equation. Better still is to compare players to the performance of hitters only at their position.
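Removing pitchers from a league average is a matter of subtracting their counting stats from the league totals before computing OPS. The sketch below shows the idea; the `BattingLine` class and its method names are hypothetical, the OBP formula is the standard one, and the totals are illustrative placeholders, not real league data.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """Season counting stats needed to compute OPS."""
    ab: int    # at-bats
    h: int     # hits
    bb: int    # walks
    hbp: int   # hit by pitch
    sf: int    # sacrifice flies
    tb: int    # total bases

    def minus(self, other: "BattingLine") -> "BattingLine":
        """Subtract one batting line (e.g. pitchers) from another (e.g. the league)."""
        return BattingLine(
            self.ab - other.ab, self.h - other.h, self.bb - other.bb,
            self.hbp - other.hbp, self.sf - other.sf, self.tb - other.tb,
        )

    def ops_points(self) -> float:
        """OPS scaled by 1000, matching the '808'-style notation used here."""
        obp = (self.h + self.bb + self.hbp) / (self.ab + self.bb + self.hbp + self.sf)
        slg = self.tb / self.ab
        return 1000 * (obp + slg)


# Illustrative placeholder totals, NOT real league data.
league = BattingLine(ab=88000, h=23500, bb=8800, hbp=800, sf=700, tb=37500)
pitchers = BattingLine(ab=5000, h=700, bb=200, hbp=10, sf=10, tb=900)

hitters_only = league.minus(pitchers)
print(round(league.ops_points()), round(hitters_only.ops_points()))
```

A hitter's OPS would then be compared against `hitters_only.ops_points()` rather than the raw league figure.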
The point of all this?
As of the All-Star break, the AL ERA is 4.52, which is a Rockies homestand away from the NL figure of 4.47. The AL OPS (766) is also just barely ahead of the NL total (760).
The five-point difference in ERA and the six-point gap in OPS are both the smallest by far since 1976. This, after the 1999 and 2000 seasons showed the smallest gap between the two leagues since the early 1990s. And while I don't have an easy way of removing pitcher hitting totals in-season, we can infer from previous seasons that, pitcher hitting aside, the average NL hitter has an OPS 15 to 20 points higher than the average AL hitter. This is a significant difference, especially when you consider that this is a trend: since 1998, NL hitters have been gaining steadily on their AL counterparts.
The possible reasons for this are too numerous to mention, but I prefer to keep things simple and look for a single, obvious, measurable cause for this shift. Fortunately, one such cause does in fact exist: the ballparks.
The following list includes every new stadium that opened after the 1998 season. Park factors are calculated from the STATS Major League Handbook, and are based on a score of 1000 for a neutral park, with higher numbers indicating more run scoring (e.g. a factor of 1050 indicates the stadium increases scoring by 5%). Note that these park factors are not the same as the ones Clay Davenport uses for Equivalent Average rankings; those park factors incorporate a team's road games as well as home games. All park factors are based on three-year data, where available.
Year Team Old Stadium Park Factor New Stadium Park Factor
This season, the Brewers moved out of County Stadium (park factor: 985) and into Miller Park, which through the All-Star break has a park factor of 1027. The Pirates have moved from Three Rivers (park factor: 1004) to PNC Park (park factor: 1109).
Break it down, and you have two AL teams moving from parks which were slightly favorable to hitters into the two best pitchers' parks in the league. Of the four NL teams to change parks, only the Giants have moved into a pitchers' park, and Pac Bell is only slightly tougher on hitters than their old digs at Candlestick Point.
The Astros have gone from one of the worst parks (for hitters) in major-league history into one of the best, and the Brewers and Pirates have also made their hitters quite happy with their moves.
We can do a rough estimate of how each of these ballpark shifts has affected the overall rate of scoring in its league. For example, the Astros moved from a park which depressed scoring by 6.9% into a park that increases scoring by 19.4%, an increase (relative to the league average) of 26.3 percentage points. Since 1/16th of all NL games are played at Enron Field, that should correspond to a league increase in scoring of about 1.64%. Running the numbers for all six teams:
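The Astros calculation can be sketched in a few lines. This follows the additive shortcut described above (the change in park factor, spread over that park's share of league games); the function name is hypothetical, and the 931 and 1194 inputs are simply the 6.9% and 19.4% figures restated on the 1000-is-neutral scale. The Brewers and Pirates inputs are the park factors quoted earlier for their old and new stadiums.

```python
def league_scoring_change(old_pf: int, new_pf: int, teams_in_league: int) -> float:
    """Estimated % change in league scoring from one team's park move.

    Park factors use the 1000-is-neutral scale. The change in park factor
    (in percentage points) is divided by the number of teams, since each
    park hosts that share of the league's games.
    """
    park_change_pct = (new_pf - old_pf) / 10.0  # 1000-scale -> percent
    return park_change_pct / teams_in_league


NL_TEAMS = 16  # the 1/16th share used in the text

astros = league_scoring_change(old_pf=931, new_pf=1194, teams_in_league=NL_TEAMS)
brewers = league_scoring_change(old_pf=985, new_pf=1027, teams_in_league=NL_TEAMS)
pirates = league_scoring_change(old_pf=1004, new_pf=1109, teams_in_league=NL_TEAMS)

print(f"Astros: {astros:.2f}%  Brewers: {brewers:.2f}%  Pirates: {pirates:.2f}%")
```

Summing the per-team results within a league gives that league's overall estimate.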
Team Old PF New PF % Change % Change (League)
Combine the totals for each league, and we can estimate that the overall drop in AL runs scored due solely to the change in ballparks in two AL cities is approximately 2.13%. The estimated increase in NL runs scored as a result of the four new ballparks is roughly 2.31%.
The ratio of AL runs per game to NL runs per game should, therefore, have declined by about 4.5%, once you do the math properly. And in fact, in 1998 the ratio of AL runs per game to NL runs per game was 1.089. This season, that ratio is just 1.027, an overall decline of 6.0%. That leaves 1.5% that we can't explain with our ballpark theory. That could be due to any number of possibilities, the most likely of which is simply natural sample-size variation, also known as a fluke.
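The comparison can be reproduced roughly as follows. This is a sketch under two assumptions not spelled out in the text: the two league-level changes are compounded rather than added, and the rounded figures quoted above are used as inputs. The results therefore land near, not exactly on, the 4.5% and 6.0% cited. The function name is hypothetical.

```python
def pct_change_in_ratio(al_change_pct: float, nl_change_pct: float) -> float:
    """Expected % change in the AL/NL runs-per-game ratio, compounding both shifts."""
    return ((1 + al_change_pct / 100) / (1 + nl_change_pct / 100) - 1) * 100


# Ballpark-driven estimates from the text: AL down 2.13%, NL up 2.31%.
expected = pct_change_in_ratio(al_change_pct=-2.13, nl_change_pct=2.31)

# Observed AL/NL ratio: 1.089 in 1998, 1.027 at the 2001 All-Star break.
observed = (1.027 / 1.089 - 1) * 100

share_explained = expected / observed

print(f"expected: {expected:.1f}%  observed: {observed:.1f}%")
print(f"share explained by parks: {share_explained:.0%}")
```

Whichever way the rounding goes, the share attributable to the new parks comes out at roughly three-quarters.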
But that still means that most of this paradigm shift--a full 75% of it--is not only explainable, but expected. The AL may boast of the DH, but it needs the DH to make up for parks like Comerica and Safeco, especially since the NL counters with parks like Coors, Enron, and PNC. The era of two different leagues playing at two markedly different scoring levels appears to be over.
At least until the next new ballpark opens in a city near you.
Rany Jazayerli is an author of Baseball Prospectus.