June 23, 2011
Checks and Balances
Background: You’ve got to admit they’re getting better
“When the 100-meter freestyle is held today in high school girls’ regional swimming meets, it is generally won by a girl who swims the distance in just under 60 seconds. That time would have won the men’s Olympic competition in 1920, or any year before it.”—Baseball Between The Numbers
Common sense should tell us that baseball players are better now than they have ever been before. Other sports, in which achievements are standardized and enumerated, permit little debate. In baseball, there is no question that players are bigger, recover better from injuries, have greater incentives to succeed, and come from a larger and more diverse pool than in years past. But given that in every season only one team wins the World Series, the average team totals 81 wins over 162 games, and hitters score exactly as many runs as pitchers allow, it’s difficult to put a number on changes in player talent.
The trick is in the control. It wouldn’t be too hard to evaluate players over time if their competition and environment remained constant. There is one skill among one group of players that has likely held constant over time: pitcher hitting. Because pitchers are selected independently of their hitting ability, any league-wide variation in how pitchers perform as hitters can be attributed to the quality of the pitching they face. Therefore, the league’s quality of pitching can be, and has been, measured by using pitchers as hitters as the control. (I can imagine a time when pitchers’ hitting abilities will be valued more highly, which would cause their hitting averages to go up even as they continued to get better at pitching.)
In Baseball Between the Numbers, Nate Silver introduced a method of quantifying league difficulty over time by using any hitter who played in two consecutive years as his control. By seeing how those players performed in consecutive years, variation in performance could be attributed to changes in the competition. Mitchel Lichtman similarly studied the drop in run-scoring in 2010 by selecting a control group of average major-league players and seeing how they performed from one year to the next.
My methodology will be most similar to that of Tom Tango, who searched for the reason behind changes in home run rates over time by controlling for hitter, pitcher, and park. My question is: How much better have hitters and pitchers gotten during the Retrosheet Era? To answer, I will focus on the Three True Outcomes, and although I probably should, I will not control for things like park or league and will instead consider all other changes to be “environmental” effects.
How to interpret this chart?
Basically, I compared how players who faced each other in consecutive years fared in those confrontations and in their matchups against everyone else. If they did better against the same players, that means their environment was different.* If they did worse against everyone else than they did against the same players, that means everyone else got better. That’s about it. Feel free to skip to the results now if you want.
*This entire analysis hinges upon this assumption. If, say, over a period of time, batters began to get better than pitchers by aging in an uncharacteristic way, that era might throw things off.
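The reasoning above boils down to a subtraction, sketched here in Python. This is only an illustration of the logic as described, not the code behind the analysis; the function name `decompose` and the sample numbers in the usage note are made up.

```python
def decompose(control_change, experimental_change):
    """Split a year-over-year rate change into two parts.

    control_change: change in a rate (e.g. HR per PA) for batters and
        pitchers who faced each other in both years. Under the stated
        assumption, this is read as the environmental effect.
    experimental_change: change in the same rate for those players
        against everyone else.
    Returns (environment, everyone_else), where everyone_else is the
    part of the change not explained by the environment.
    """
    environment = control_change
    everyone_else = experimental_change - control_change
    return environment, everyone_else
```

With hypothetical inputs, `decompose(-0.0025, -0.0040)` says a batter group's home run rate fell 0.25 percentage points because of the environment and a further 0.15 points against everyone else, which would suggest the other pitchers got better at preventing homers.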
Details: In which I take the harmonic mean of harmonic means and in so doing lose sensation in my face.
I established two bins. Bin one was my control group. It contained the number of plate appearances, homers, walks, and strikeouts for year n and year n+1 for all batters and pitchers who faced each other in consecutive years. Bin two, my experimental group, contained the number of plate appearances, homers, walks, and strikeouts for year n and year n+1 for all batters and pitchers against everyone else.
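As a rough sketch of how the two bins might be built, assume each plate appearance is available as a `(season, batter, pitcher, outcome)` tuple. That layout, the outcome labels, and the function name `split_bins` are assumptions for illustration, not Retrosheet's actual format. For brevity the sketch tallies batters only; pitchers would be handled the same way.

```python
from collections import defaultdict

def split_bins(events, year):
    """Tally PA, HR, BB, and K for `year` and `year + 1` in two bins.

    events: iterable of (season, batter, pitcher, outcome) tuples, where
        outcome is "HR", "BB", "K", or anything else (hypothetical layout).
    Returns {"control": ..., "experimental": ...}; each value maps
    (batter, season) to a dict of counts.
    """
    events = list(events)  # we need two passes over the data
    seasons = (year, year + 1)

    # First pass: which batter-pitcher pairs met in each season?
    seen = {s: set() for s in seasons}
    for season, batter, pitcher, _ in events:
        if season in seen:
            seen[season].add((batter, pitcher))
    repeat_pairs = seen[year] & seen[year + 1]

    # Second pass: pairs that met in both years go to the control bin,
    # every other plate appearance goes to the experimental bin.
    bins = {
        "control": defaultdict(lambda: defaultdict(int)),
        "experimental": defaultdict(lambda: defaultdict(int)),
    }
    for season, batter, pitcher, outcome in events:
        if season not in seen:
            continue
        name = "control" if (batter, pitcher) in repeat_pairs else "experimental"
        line = bins[name][(batter, season)]
        line["PA"] += 1
        if outcome in ("HR", "BB", "K"):
            line[outcome] += 1
    return bins
```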
Within each bin, I paired up year n and year n+1’s HR rates, BB rates, and K rates. I found the average difference contained in each pair for each player in each bin, weighting by the harmonic mean of plate appearances in both years.
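One way to read this step, sketched in Python under assumptions: each player in a bin contributes the change in his rate from year n to year n+1, weighted by the harmonic mean of his plate appearances in the two years. The function names and the input layout are hypothetical, and the original calculation may differ in its details.

```python
def harmonic_mean(a, b):
    """Harmonic mean of two plate-appearance totals (0 if either is 0)."""
    return 2 * a * b / (a + b) if a > 0 and b > 0 else 0.0

def weighted_rate_change(players):
    """Weighted average year-over-year change in a rate within one bin.

    players: list of (count_n, pa_n, count_n1, pa_n1) tuples, one per
        player, where count is HR, BB, or K (hypothetical layout).
    Returns (weighted mean change in the rate, total weight).
    """
    weighted_sum = 0.0
    total_weight = 0.0
    for count_n, pa_n, count_n1, pa_n1 in players:
        weight = harmonic_mean(pa_n, pa_n1)
        if weight == 0:
            continue  # no plate appearances in one of the two years
        change = count_n1 / pa_n1 - count_n / pa_n
        weighted_sum += weight * change
        total_weight += weight
    if total_weight == 0:
        return 0.0, 0.0
    return weighted_sum / total_weight, total_weight
```

The harmonic mean is pulled toward the smaller of the two samples, so a player with 600 plate appearances one year and 10 the next gets a weight of about 20 rather than 305, which keeps thin samples from dominating the average.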
I then averaged all of those averages over every pair of year n and year n+1, this time weighting by the harmonic mean of the two groups’ harmonic means. It was also at this point that I realized I would not be able to clearly explain what I was doing.
The following graph shows the difference in each rate stat from the previous year among players who faced each other in both years. For example, in 2009, our control group homered in 3 percent of PAs, while in 2010, that number fell to 2.75 percent. The difference is -0.25 percentage points. I have also plotted the lines of best fit.
It appears that over time, the difficulty of recording a strikeout has increased, and it’s become a bit easier to draw a base on balls. I assume this is a function of the strike zone’s ever-shrinking nature.
We can point to specific changes in MLB, such as rule changes or new ballparks, to see how much they changed our control group. This also serves as a sanity check, as we already know that the lower mound in 1969 raised run-scoring. But how does that bear out in the numbers?
1968 was the original Year of the Pitcher. In 1969, the mound was lowered. This resulted in the second-highest increase in homers, highest increase in walks, and biggest decrease in strikeouts in the control group, which dates back to 1954.
The biggest jump in home runs in the control group came in 1977, the year Rawlings succeeded Spalding as the official supplier of major-league baseballs, per Wikipedia.
In 1963, when the strike zone was changed from “between the batter’s armpits and the top of his knees” to “the top of the batter’s shoulders and his knees” (per Baseball Almanac), home run rates and walk rates fell 0.8 percent and 0.24 percent, respectively, as strikeout rates rose 0.34 percent. In 1969, the zone was changed back to the armpits, which served as another reason for the bump in offensive output that year, in addition to the mound being lowered.
The Colorado Rockies’ debut in Denver in 1993 (they played at Mile High Stadium until Coors Field opened in 1995) coincided with a 0.43 percent increase in home runs per plate appearance in our control group.
Pitchers are continually getting better at striking batters out. Every single year.
We already know pitchers are throwing harder than they did just a couple of years ago. The number of Pinedas and Strasburgs entering the league exceeds the number of Mike Mussinas and Andy Pettittes departing. Furthermore, new pitches have been introduced, such as the splitter in the 1980s, making pitchers more difficult to hit.
One notable year was 1984, when the experimental pitcher group added 2 percent to its strikeout rate. Most of that can be attributed to rookies recording a strikeout rate 2 percent higher than non-rookie pitchers. It’s a rarity for rookies to perform as well as veterans in anything, but 1984 was the year of possibly the greatest rookie pitcher ever, Doc Gooden, who entered the league as baseball’s best pitcher.
My interpretation of this chart is that home runs are the driving force. Batters are accepting more strikeouts and fewer walks in exchange for more homers. Just as pitchers are throwing faster, I assume batters are swinging harder.
I believe the data show three trends. The league has discouraged strikeouts over time, presumably through a smaller and smaller strike zone. Pitchers have more than counteracted that effect, and I think that’s because they can throw harder than they used to. Batters are also better at hitting home runs than they used to be, which I believe to be a result of increased strength, similar to that of pitchers.
The cause of fluctuations in MLB run-scoring is important. Consider: a change in MLB pitching quality one year does not necessitate a change in replacement-level pitching quality that year. Were a drop in run-scoring the result of environmental differences or inferior batters, that would mean nothing in evaluating pitchers. However, if all pitchers in MLB improved as replacement level remained constant (a possibility, albeit a reach), that would impact their value. We like to say that a drop in run-scoring is caused by better pitching or worse hitting, but without performing this type of analysis, it’s very difficult to know for sure.