January 3, 2011
Between The Numbers
Ground-ball Rates in the Minors and Majors
The idea behind SIERA, as opposed to, say, FIP, is that the key factors in a pitcher's profile are his walk rate, strikeout rate, and ground-ball rate. Those factors become the prime shapers of the other statistics (hits, home runs, and ultimately runs) that go into a pitcher's value.
I've been adjusting the translation process (as applied to pitchers) to reflect those ideas. There are a couple of key points to make about ground-ball rates and pitcher development.
1. Ground-ball rates decline as you rise through the minors.
A simple look at ground-ball rates compiled for leagues makes this plain. In 2010, the ground-ball out percentage—ground outs divided by the sum of ground outs and air outs—was around .55-.56 for the short-season leagues. In the full-season A leagues, it was more like .53-.54, dropping to .52 in Double-A, to .51 in Triple-A, and finally to .50 in the majors.
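As a minimal sketch of the definition above, the ground-ball out percentage and the league-wide decline can be computed as follows. The function names and the sample pitcher are illustrative assumptions; the league figures are the approximate 2010 values quoted in the text, with ranges stored as low and high ends.

```python
def ground_ball_out_pct(ground_outs, air_outs):
    """Ground outs divided by the sum of ground outs and air outs."""
    total_outs = ground_outs + air_outs
    if total_outs == 0:
        raise ValueError("no batted-ball outs recorded")
    return ground_outs / total_outs


# Approximate 2010 league rates from the text, stored as (low, high).
LEAGUE_GB_PCT = {
    "short-season": (0.55, 0.56),
    "full-season A": (0.53, 0.54),
    "Double-A": (0.52, 0.52),
    "Triple-A": (0.51, 0.51),
    "majors": (0.50, 0.50),
}


def drop_in_points(from_level, to_level):
    """Smallest and largest league-wide decline, in percentage points."""
    lo_from, hi_from = LEAGUE_GB_PCT[from_level]
    lo_to, hi_to = LEAGUE_GB_PCT[to_level]
    return (round((lo_from - hi_to) * 100), round((hi_from - lo_to) * 100))


# Hypothetical pitcher with 110 ground outs and 90 air outs.
print(ground_ball_out_pct(110, 90))
print(drop_in_points("short-season", "majors"))
```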
The cause isn't a mystery. Ascent through the minors is also an ascent through age brackets, with each step of the minors averaging roughly a year older than the next level down. These older players have “filled out” their bodies more (not the way mine has filled out through my 40s, but by gaining upper-body strength). More strength leads to more ability to hit home runs, which leads hitters to alter their swings to take advantage of that increased ability, so there are more uppercuts and more fly balls. A pitcher who does nothing but keep pace with his league will lose 5-6 percentage points off his ground-ball rate as he climbs the minor-league ladder.
2. Even relative to their leagues, ground-ball rates decline as you rise through the minors.
One of my favorite tools is a script that compares the stats of every player who meets a condition: say, someone who played in Triple-A one year and in the majors the next. The program pulls all the stats for all the players who meet the criterion, scales each player's two stat lines to the lesser of his two plate-appearance totals, and then sums them. This gives me a dataset in which each pitcher carries the same weight on both sides of the comparison and the total plate appearances are identical, which allows a direct comparison of the change between the leagues. I can run this with real stats, translated stats, or what I call nodif stats: translations in which I make no adjustment for league difficulty and only reset the offensive context to a standard value.
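The matching step can be sketched roughly as below. This is not the actual script, only an illustration under assumed names: each stat line is a dictionary with a `pa` key plus counting stats, and the two sample pitchers are made up.

```python
from collections import defaultdict


def scale_line(line, target_pa):
    """Scale every stat in a line so its plate appearances equal target_pa."""
    return {stat: value * target_pa / line["pa"] for stat, value in line.items()}


def matched_totals(pairs):
    """Sum paired stat lines after scaling each pair to its smaller PA total.

    `pairs` is an iterable of (line_a, line_b) tuples, one per pitcher,
    for example his Triple-A line and his major-league line.
    """
    total_a, total_b = defaultdict(float), defaultdict(float)
    for line_a, line_b in pairs:
        target_pa = min(line_a["pa"], line_b["pa"])
        if target_pa <= 0:
            continue  # nothing to compare for this pitcher
        for stat, value in scale_line(line_a, target_pa).items():
            total_a[stat] += value
        for stat, value in scale_line(line_b, target_pa).items():
            total_b[stat] += value
    return dict(total_a), dict(total_b)


# Hypothetical data: (Triple-A line, major-league line) for two pitchers.
pairs = [
    ({"pa": 400, "go": 120, "ao": 100}, {"pa": 200, "go": 50, "ao": 60}),
    ({"pa": 100, "go": 30, "ao": 20}, {"pa": 300, "go": 90, "ao": 90}),
]
aaa, mlb = matched_totals(pairs)
print(aaa["pa"], mlb["pa"])  # the two totals cover the same number of PA
print(aaa["go"] / (aaa["go"] + aaa["ao"]), mlb["go"] / (mlb["go"] + mlb["ao"]))
```

Scaling to the smaller of the two totals keeps a pitcher with a long stint at one level and a brief one at the other from dominating either side of the comparison.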
If I were to run a nodif set for players who go from Triple-A to the majors in the same year, I'd get results like this:
Note: All pitchers from 2005-2010; we don't have good play-by-play info for the minor leagues before 2005.
The stats are calibrated to a standard context in which the league average is 6.0 for strikeout rate, 3.0 for walk rate, 9.0 for hits, 1.0 for home runs, 4.50 for ERA, and 50 percent for ground-ball rate. It should come as no surprise that strikeout rates fall and walk rates rise as pitchers move to the majors, or that those numbers move from above average for the league (or else why would the pitcher get promoted?) to below average in the majors. It is not so obvious that ground-ball rates would behave the same way. Remember, these have been normalized to the league, so this drop in ground balls is over and above the drop that comes from the overall league average falling. Pitchers who get promoted from Triple-A tend to have ground-ball rates above 50 percent in Triple-A and become below average after promotion to the majors.
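One simple way to reset a stat line to a standard context is to rescale each rate by the ratio of the standard value to the league average. The sketch below assumes that proportional approach, which is not necessarily how the translations here are actually computed. The stat keys and the sample league are hypothetical; the standard values are the ones listed above.

```python
# Standard league-average values from the text.
STANDARD = {
    "k_rate": 6.0,
    "bb_rate": 3.0,
    "h_rate": 9.0,
    "hr_rate": 1.0,
    "era": 4.50,
    "gb_pct": 0.50,
}


def to_standard_context(pitcher, league):
    """Rescale each rate so that the league average maps to the standard value.

    ASSUMPTION: simple proportional scaling; the real translation
    process may treat each statistic differently.
    """
    return {
        stat: pitcher[stat] / league[stat] * STANDARD[stat]
        for stat in STANDARD
        if stat in pitcher
    }


# Hypothetical league averages and a hypothetical pitcher in that league.
league = {"k_rate": 7.5, "bb_rate": 3.4, "gb_pct": 0.52}
pitcher = {"k_rate": 8.2, "bb_rate": 3.1, "gb_pct": 0.55}
print(to_standard_context(pitcher, league))
```

Under this scheme a pitcher who is exactly league average comes out at 6.0, 3.0, and 50 percent, so anything above or below those marks reads directly as above or below the league.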
This pattern asserts itself repeatedly, across different levels and across differences in years:
*i.e., Triple-A in 2008 and majors in 2009
**i.e., Triple-A in 2008 and majors in 2010
The difference is always present, and it always takes a score that is above league average to one that is below league average (and so cannot be written off as simple regression to the mean, which would pull a score toward the average but not past it). For equally timed transitions, the effect gets larger as the difficulty gap between leagues increases: Double-A pitchers take a bigger hit going to the majors than Triple-A pitchers do, A-ball pitchers a bigger hit still, and so on. This is exactly the way strikeout and walk rates behave, so it needs to be accounted for in the translation process in a manner similar to the way strikeouts are handled.
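The regression argument can be made concrete with a small sketch. Under a simple linear model of regression to the mean (an assumption used only for illustration), the expected later value sits between the earlier value and the league average, so it never lands on the other side of average. The numbers below are hypothetical, not taken from the tables.

```python
def expected_after_regression(before, mean, r):
    """Expected later value under pure regression, with 0 <= r <= 1.

    r is the share of the deviation from the mean that is retained.
    """
    if not 0.0 <= r <= 1.0:
        raise ValueError("r must be between 0 and 1")
    return mean + r * (before - mean)


def crosses_mean(before, after, mean):
    """True when the two values fall on opposite sides of the mean."""
    return (before - mean) * (after - mean) < 0


MEAN = 0.50      # standardized league-average ground-ball rate
before = 0.52    # hypothetical above-average rate at the lower level

# Pure regression never moves the expectation past the mean.
for r in (0.0, 0.5, 0.9, 1.0):
    assert not crosses_mean(before, expected_after_regression(before, MEAN, r), MEAN)

# A move from above average to below average does cross it.
print(crosses_mean(before, 0.49, MEAN))
```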
Just to emphasize those points, it also works in reverse:
Players who are demoted from the majors to Triple-A in the next year tend to be pitchers whose ground-ball rates were below average, and they improve those numbers in Triple-A.
If we look at transitions from the majors from one year to the next, what do we see?
Those are for the same 2005-2010 period as the minor-league data already shown; here are the results for the much broader 1954-2010 time frame we have available for the major leagues:
These numbers do look like regression to the mean, albeit with a clear preference for starting with pitchers with above-average ground-ball rates. There is also an apparent trend toward longevity, with longer "survival" times in the majors equaling higher ground-ball rates. That is a preference that will also show up if I look at pitchers who, for whatever reason, did not pitch in the majors in the following years:
Pitchers who are headed out of the league have below-average rates, and the lower they are the sooner they'll be gone.
So the reason to prefer minor-league pitchers with high ground-ball rates, first noted by Nate Silver several years ago, is getting a little clearer. Teams have a preference for pitchers with high ground-ball rates at the major-league level, for good reasons that I'll examine at a later time. That isn't to say that fly-ball pitchers don't exist; there are also plenty of pitchers in the majors with consistently low strikeout rates or high walk rates, and like them, fly-ball pitchers have to be able to make up for it in other areas. But ground-ball rates, while highly correlated from one year to the next, are not conserved as pitchers face increasingly better hitters while rising through the minors; the rates go down. The only way to end up as an average major-league pitcher by this measure is to start from a level high enough to weather that erosion.
Note: When I refer to ground-ball rates throughout this article, I am actually referring to ground-ball outs: times when a man batted, hit a ground ball, and either he or another runner was put out before reaching the next base. Air outs refer to any batted ball, be it a liner, fly, or pop, that is caught. Like Colin, I am deeply suspicious and distrustful of the distinctions between flies and line drives in particular in the stats we have. There is a large middle ground between them, and scorers, for whatever reason, show clear, consistent, and persistent variations among themselves as to which side they call it. There is also a large amount of the Retrosheet play-by-play data where that information is not known for hits. The ground-out and air-out data has the advantage of being precisely known for a much wider set of data and of being almost completely free of scorer biases. Where the full ground/fly data is known, the correlation between it and the ground-/air-out data is on the order of 90 percent.