
Many baseball fans know that in 1927, Babe Ruth hit 60 home runs. Lou Gehrig hit 47. The Cubs’ Hack Wilson hit 30, as did the Phillies’ Cy Williams. The Giants’ Rogers Hornsby hit 26, and his teammate Bill Terry had 20. That’s it—nobody else hit 20 or more round-trippers.

In 1927, there were 16 major-league teams. Let’s call the top 16 home run hitters in the majors (as many players as there were teams) the "elite" sluggers that year: Ruth, Gehrig, Wilson, Williams, Hornsby, Terry, and 10 guys who hit 14-19 bombs. Those 16 players hit 372 home runs that year. Across the majors, there were 922 home runs. So the elite 16 players accounted for over 40 percent of all home runs that year. Since 1920, the year Ruth hit 54 home runs, 1927 is the only season in which the elite home run hitters (defined as the top n in the majors, where n equals the number of teams) hit over 40 percent of all homers.
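For anyone who wants to reproduce the "elite share" idea, here is a minimal Python sketch. The function name `elite_share` and the toy list of home run totals are my own illustrations, not data from the article; only the 372 and 922 totals are quoted from above. Ties at the cutoff are simply broken by sort order here, which may differ from how the original numbers were compiled.

```python
def elite_share(player_hr, n_teams):
    """Share of all home runs hit by the top n_teams individual hitters."""
    top = sorted(player_hr, reverse=True)[:n_teams]
    total = sum(player_hr)
    return sum(top) / total if total else 0.0

# Toy league (hypothetical values): the top two hitters own most of the homers.
toy_share = elite_share([60, 47, 30, 30, 5, 3, 0], n_teams=2)

# Totals quoted in the article for 1927.
elite_total_1927 = 372
league_total_1927 = 922
share_1927 = elite_total_1927 / league_total_1927

print(f"{share_1927:.1%}")  # 40.3%
```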

Fifteen years earlier, in 1912—two years before Ruth’s first appearance in the majors, not that that’s in any way relevant—the Italian statistician Corrado Gini created a measure, now known as the Gini Coefficient. The Gini Coefficient measures inequality among values in a distribution. It ranges from 0 (perfect equality) to 1 (perfect inequality). Gini proposed it as a measure of income or wealth inequality, and that’s its most common usage, though it can be used to measure the inequality within any dataset.
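The article doesn’t say which formulation of the coefficient is used, so take this as an assumed textbook form rather than the exact calculation behind the numbers here. For n non-negative values x_1, ..., x_n with mean x̄, one common definition is half the relative mean absolute difference:

$$G = \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} \lvert x_i - x_j \rvert}{2 n^{2} \bar{x}}$$

If every value is identical, every difference is zero and G = 0. If a single value holds the entire total, G = (n - 1)/n, which approaches 1 as n grows.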

Here, for example, is the Gini Coefficient for disposable income inequality among the 35 (mostly wealthy) countries in the Organization for Economic Cooperation and Development, or OECD:

(source: OECD)

The most egalitarian countries in the OECD are Iceland, Norway, and Denmark. The most unequal are Chile, Mexico, and the United States. Further, several countries, including the U.S., have become more unequal since the Great Recession; those countries are represented by the blue bars with orange lines within them rather than over them. By and large, the Gini Coefficient is consistent with public perception: Scandinavian countries are notably egalitarian, while the U.S. is pretty stratified. (Though what’s up with Chile?)

This isn’t an attempt to swerve from a discussion of home runs to one of income inequality. Rather, it’s an attempt to use Gini’s model to look at home run distributions. Baseball in the 1920s was very unequal when it came to home run production. A few players generated a large proportion of homers, and many hit very few, if any.

Contrast that to last season. There were 30 teams, so let’s call the 30 players who hit the most home runs "elite." Mark Trumbo, Nelson Cruz, Khris Davis, Brian Dozier, Edwin Encarnacion, Nolan Arenado, Chris Carter, and Todd Frazier all hit 40 or more. The 30th-highest individual total last season was 31, a mark Mookie Betts, Yoenis Cespedes, Albert Pujols, Yasmany Tomas, and Justin Upton all met. The top 30 home run hitters, combined, went deep 1,096 times, 49 percent more than the elite home run hitters in 1927 after adjusting for the difference in league size and games played. Nonetheless, the elite 30 accounted for only 19.5 percent of the 5,610 home runs hit last year. That’s the lowest proportion of all time.
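The adjustment method isn’t spelled out, but one plausible reading is elite home runs per team per scheduled game. The sketch below assumes 154-game schedules in 1927 and 162-game schedules in 2016 (neither figure is stated in the article) and uses a hypothetical helper, `hr_per_team_game`. Under those assumptions it lands on the same 49 percent.

```python
def hr_per_team_game(elite_hr, n_teams, games_per_team):
    """Elite home runs per team per scheduled game."""
    return elite_hr / (n_teams * games_per_team)

# Home run totals are quoted in the article; the schedule lengths
# (154 and 162 games) are assumptions, not figures from the article.
rate_1927 = hr_per_team_game(372, 16, 154)
rate_2016 = hr_per_team_game(1096, 30, 162)

increase = rate_2016 / rate_1927 - 1
elite_share_2016 = 1096 / 5610

print(f"{increase:.0%}")          # 49%
print(f"{elite_share_2016:.1%}")  # 19.5%
```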

This trend shouldn’t be a surprise. Home runs were relatively rare back in the 1920s. In 1927, Ruth out-homered every team but the Cubs, Giants, and Cardinals (and the Yankees, of course). Even Wilson and Williams, with 30 each, out-homered four teams. By contrast, there hasn’t been a team with fewer than 60 homers since the 1986 Cardinals, and the last time teams didn’t hit 30 was during World War II.

Let me show this graphically. This graph shows the percentage of all major-league home runs hit by the elite sluggers, as defined above, for every season from 1920 to 2016:

As you can see, the top 16 home run hitters routinely accounted for a third of all homers through the 1930s. The proportion fell steadily from the start of World War II through the late 1950s, when it plateaued around 24 percent. It bounced around there for several years, then began another decline in the 1980s. The three years with the fewest home runs hit by the top sluggers are 2016, 2013, and 2012.

Let’s see whether the Gini Coefficient confirms this. I took every batter with 50 or more plate appearances in a season since 1920 and, for each year, calculated a Gini Coefficient based on the number of home runs each batter hit. (Note: You can really test the limits of Excel by having it calculate array formulae on a table with over 40,000 rows.)
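The calculation above was done in Excel. As an alternative, here is a rough Python sketch of the same idea, not the original workbook. The row layout (season, plate appearances, home runs), the function names, and the toy data are all hypothetical. The `gini` function uses the standard sorted-rank shortcut, which gives the same result as averaging every pairwise difference.

```python
from collections import defaultdict

def gini(values):
    """Gini coefficient of non-negative values via the sorted-rank formula."""
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0  # no players or no homers: treat as perfectly equal
    weighted = sum(rank * x for rank, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n

def gini_by_season(rows, min_pa=50):
    """Group (season, pa, hr) rows by season, keep batters with at least
    min_pa plate appearances, and compute one Gini coefficient per season."""
    by_season = defaultdict(list)
    for season, pa, hr in rows:
        if pa >= min_pa:
            by_season[season].append(hr)
    return {season: gini(hrs) for season, hrs in sorted(by_season.items())}

# Toy data (made up, not real statistics).
rows = [
    ("toy-A", 600, 10), ("toy-A", 550, 10), ("toy-A", 500, 10),
    ("toy-A", 20, 0),   # dropped: fewer than 50 plate appearances
    ("toy-B", 600, 30), ("toy-B", 500, 0), ("toy-B", 400, 0), ("toy-B", 300, 0),
]
result = gini_by_season(rows)
print(result)
```

In the toy data, season "toy-A" has three qualifying batters with identical totals, so its coefficient is 0. In "toy-B" one batter hits every home run among four, so its coefficient is 0.75, the maximum (n - 1)/n for four players.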

This confirms the results in the earlier graph showing the percentage of home runs hit by elite sluggers. The distribution of home runs is a lot more equal now than it was 90 years ago. The five most equal non-strike seasons, per the Gini Coefficient, are, in order, 2015, 2013, 2016, 2006, and 2005. The five most unequal are, in order, 1927, 1920, 1931, 1929, and 1926.

And that’s evident by looking at the distribution of home runs. Last year there were a record number of homers. And, as noted, eight players hit 40 or more. But that’s nowhere near the record. There were 17 players with 40 or more homers in 1996, 16 in 2000, 13 in 1998 and 1999, 12 in 1997 and 2001 … you get the idea. Eight players with 40 or more home runs is tied for only 12th-most all time.

But last year there were 30 batters with 30-39 home runs. Only 1999, with 32, and 2000, with 31, had more. And players with 20-29 homers? There were 73 of them in 2016, crushing the old record of 64 set in 2008.

See the pattern? We got a record number of home runs last year, not by a few players hitting a ton of them, but from many, many players getting 20 or more. There were 111 players with 20 or more home runs in 2016. That’s what made the record—a lot of players getting a decent number of home runs, not a Ruth and a Gehrig vastly outperforming their peers.
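The tier counts quoted above (8, 30, and 73) do add up to the 111 figure. Here is a small Python sketch of how such a tally could be produced from a list of individual totals. The `tier` and `tier_counts` names and the sample list are hypothetical; only the three 2016 counts come from the article.

```python
from collections import Counter

def tier(hr):
    """Label a season home run total with the article's tiers."""
    if hr >= 40:
        return "40+"
    if hr >= 30:
        return "30-39"
    if hr >= 20:
        return "20-29"
    return "under 20"

def tier_counts(player_hr):
    """Count how many players fall into each tier."""
    return Counter(tier(hr) for hr in player_hr)

# Hypothetical sample of individual totals, just to exercise the function.
sample = tier_counts([47, 41, 31, 31, 22, 5])

# Counts quoted in the article for 2016.
counts_2016 = {"40+": 8, "30-39": 30, "20-29": 73}
players_with_20_plus = sum(counts_2016.values())
print(players_with_20_plus)  # 111
```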

There’s a sense that early baseball was dominated by a handful of incredible athletes, while the contemporary game is brimming with them. From the perspective of home runs, at least, this seems to be the case. Home runs aren’t as equally distributed as disposable income—Chile’s 0.47 Gini Coefficient for income is lower than 2016’s 0.54 Gini Coefficient for homers—but I think we can also agree that home run distribution doesn’t rise to the level of a societal problem, either. Whatever the underlying cause, the rising tide of home runs seems to be lifting many, many boats.

leites
5/11
Interesting, thanks for doing this!
mainsr
5/11
Thanks for reading it, leites.
eas9898
5/11
So, what does it tell us when it's not just the stars hitting the homers, but everyone and their little brother? Seems to me that it indicates something more fundamental is involved, whether it's that the balls are constructed differently, or the ever-smaller ballparks are finally affecting the aggregate numbers, or higher-velocity fastballs are being turned around with more power, or something. All of the above?
mainsr
5/11
Spitballing here, but...
1. If it's the balls (and I'm not convinced it's not, despite my link at the end to Ben Lindbergh's excellent Ringer article), wouldn't we be seeing more players hitting 40+, and some hitting 50+, dingers?
2. Ballparks haven't changed much since the scoring nadir in 2014.
3. Alan Nathan has shown that there is a very small contribution of pitch velocity to exit velocity. It's almost all generated by the hitter.
4. I think there has to be a development component. Last year AL second basemen had a higher TAv than left fielders and center fielders, and NL 2B had a higher TAv than 3B, LF, CF, and RF. I've got to think that reflects a preference for players who can hit the ball out of the park, regardless of position.
5. Here's my off the wall, probably wrong thought: Shifts. If we can place infielders optimally, it reduces the need for skinny guys who can range all over the place but can't hit for power.
jfranco77
5/12
Democratization is right. Shifts have allowed bigger players to play 2B, SS, and CF. Those players no longer need to be small and fast. Shifts help cover their lack of range. As players have gotten bigger, more of them have the power to drive the ball out of the park. Almost every player has legit 20 HR power these days. That wasn't always the case. But since they've got the power, and there's no stigma to striking out, why not swing for the fences? I wonder if the reduction in astroturf also plays a role. Easier to get to most ground balls on grass. Less emphasis on speed. Where have you gone, Willie Wilson? (Jerry Dipoto would totally sign Willie Wilson in an instant.) Also, it's Cy Williams, not Williamson. Cy is one of the legendary sim league players, for just this reason: hitting a ton of bombs when almost nobody did. Ned Williamson was the guy from the 19th century who held the single-season HR record before Ruth.
mainsr
5/12
Good god how did I get Cy Williams wrong. I know baseball history. No excuse for that one. At last year's SABR Analytics Conference, Manny Acta said that he thought the biggest change arising from shifts was at second base, as they now need to have stronger arms to handle longer throws, suggesting a bigger if less mobile player. That sure came true. Re turf, the Jays aren't exactly burners, and the Rays are only middle of the pack base stealers, but I think you have a good point. When there was a lot of turf, it made more sense to pursue a speed-related strategy.