February 25, 2010
Ahead in the Count
DIPS, BABIP and Common Sense
When Eric Seidman and I unveiled SIERA, a little Googling showed that there were three big debates that broke out on the internet. Firstly, sabermetricians debated its validity and value. Secondly, readers debated whether they wanted to see how the sausage was made or just see the end result, which was how the statistic would be used. Thirdly, baseball fans with a sabermetric bent once again debated the validity of Defense Independent Pitching Statistics (DIPS) Theory.
My introduction to true baseball analysis came during my first year of graduate school in 2003, when my roommate interrupted my studying and asked me to read a chapter of this neat book called Moneyball. He explained that it was an inside look at how the A’s were able to field such good teams without having much money. He also said there was one chapter that really shocked him, the one that explained the theory of how pitchers do not control their Batting Average on Balls in Play (BABIP), but only walks, strikeouts, and home runs.
"There’s no way that’s true," I told him, horrified that he knew so little about baseball and thinking that he would understand if he had played the game. He said he believed the book's hypothesis, and I asked him the question that we all ask when we first hear of DIPS Theory:
"Do you think if I went took a major-league mound that hitters would hit me no better than Pedro Martinez?"
He asked me why the math showed that the other three statistics were the only things that mattered, and I, up to my neck in Econometrics homework at the time, told him, "The skill is probably weaker and correlated with other things, and he's probably just running a regression. I'd bet this BABIP thing is correlated with walks, strikeouts, and home runs, and so they are picking up the effect in a regression, making it look useless in isolation." Up until this point, I had probably answered with the same arguments most baseball fans use, but I don't know if most non-baseball fans would use that argument. That dismissive insight partly led to the birth of SIERA more than six years later, as Eric and I have picked up on and utilized some of the small correlation between strikeout rate and BABIP (that J.C. Bradbury found in 2005), and allowed SIERA to be a sort of hybrid DIPS statistic that picks up on some control that pitchers have over BABIP, too.
I didn't think about sabermetrics again for a couple more years after the conversation with my roommate, until I eventually picked up Moneyball and read it cover to cover. At that point, I was a convert and haven't turned back. I’m now a sabermetric preacher in day-to-day life, willing to talk baseball with anyone. If you tell intelligent people about sabermetrics, they seem to believe most things. When you tell people about on-base percentage being more important than batting average, most people say that should have been obvious to them. When you talk about RBI being team-dependent, people believe you. When you talk about fielding percentage undervaluing players with good range, they believe you, too, and also have no problem throwing win-loss record out the window (at least until they need it to describe someone as a 20-game winner).
However, the second you tell anybody about DIPS Theory and that pitchers can't control their hit rate on balls in play, they say you're nuts and "if you played baseball, you would understand." I did play. I was a bad high school pitcher who had a ridiculously high hit rate on balls in play. I would never be able to hold major-league hitters to a league-average .300 BABIP, but this is where decision theory and DIPS need to be friends.
The simple reason I do not get to demonstrate this counterargument in real life is that I would not be allowed to pitch in a major-league game in the first place. To be able to pitch in the majors, one needs to at least be able to strike hitters out now and then. No one in the majors who pitched at least 100 innings last year struck out fewer than 9.4 percent of the hitters they faced, as Jeremy Sowers did, which is still a feat most people could never dream of achieving. On the other hand, no one struck out more than Tim Lincecum's 28.8 percent. The reality is that everybody who can get hitters to whiff enough to hold a roster spot on a major-league team has similar skills at preventing hits on balls in play. Even Sowers got hitters to whiff on 13 percent of their swings. Pitchers are not all the same at preventing hits on balls in play, but the discrepancies are so small that there is not much meaningful statistical difference between major-league pitchers as far as hit prevention upon contact.
One reason that people often suspect there should be a difference is that ground balls in play are more likely to be hits than fly balls in play. Although about 24 percent of ground balls are hits, just 14 percent of fly balls and pop-ups are hits (and 16 percent of non-home run outfield fly balls, specifically). Since pitchers are certainly prone to either be of the ground-ball or fly-ball variety—GB/FB ratio has as much persistence as walk and strikeout rates—people expect that there should be some difference between pitchers in this regard. The reason that this is such a small difference in aggregate is that the batted-ball type that really falls for hits more than the others is line drives, which drop about 73 percent of the time. Thus, the most important question in asking whether pitchers control their hit rates on balls in play is whether they control their line-drive rate on balls in play.
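To see how the batted-ball mix feeds into BABIP, the hit rates quoted above can be combined into a back-of-the-envelope weighted average. The sketch below is only illustrative: the function name and the three batted-ball mixes are hypothetical (they are not real pitchers), and it lumps fly balls and pop-ups together at the 14 percent figure.

```python
# Hit rates by batted-ball type, as quoted in the text above.
HIT_RATE = {"ground_ball": 0.24, "fly_ball": 0.14, "line_drive": 0.73}


def expected_babip(mix):
    """Weighted-average hit rate for a batted-ball mix.

    `mix` maps each batted-ball type to its share of balls in play;
    the shares must sum to 1.
    """
    total = sum(mix.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("batted-ball shares must sum to 1")
    return sum(HIT_RATE[kind] * share for kind, share in mix.items())


# Hypothetical mixes (assumptions for illustration only).
ground_baller = {"ground_ball": 0.50, "fly_ball": 0.31, "line_drive": 0.19}
fly_baller = {"ground_ball": 0.35, "fly_ball": 0.46, "line_drive": 0.19}
# Same ground-ball share as the first mix, with two more points of line drives.
more_liners = {"ground_ball": 0.50, "fly_ball": 0.29, "line_drive": 0.21}

for name, mix in [
    ("ground_baller", ground_baller),
    ("fly_baller", fly_baller),
    ("more_liners", more_liners),
]:
    print(f"{name}: {expected_babip(mix):.3f}")
```

With these rates, moving one point of balls in play from fly balls to ground balls adds 0.10 points of expected BABIP, while moving one point from fly balls to line drives adds 0.59, roughly six times as much. That is why the line-drive rate dominates the calculation.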
The answer to that question is no. Although we are perfectly aware that game charters are biased in evaluating what constitutes a line drive—Colin Wyers showed that very well a few months ago—when you look at a pitcher's line drive rate, net of his team’s pitching staff's line drive rate, the intra-class correlation Eric and I found was 0.007. In other words, pitchers who give up a lot of line drives on balls in play one year are no more or less likely to allow a lot of line drives on balls in play the next year. Line drives are not a pitcher skill, but they are the primary determinant in BABIP. That is why researchers have continually found that pitchers do not have significant control over BABIP.
That is not to say Lincecum will surrender the same number of line drives in his next start as Sowers. Lincecum will strike out more hitters he faces, and so he will allow fewer balls in play overall. But Lincecum's line-drive rate on the balls hitters put in play last year was 19.1 percent and Sowers' was 17.1 percent. And we know from the intra-class correlation that both will probably have line-drive rates around the league average of 19 percent this year.
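The arithmetic behind that expectation can be sketched as simple regression to the mean. This is a minimal sketch, assuming the 0.007 intra-class correlation can be used directly as the weight on a pitcher's own line-drive rate; that shortcut, the function name, and the constant names are assumptions for illustration, not a description of how SIERA is computed.

```python
LEAGUE_LD_RATE = 0.19  # league-average line-drive rate cited in the text
PERSISTENCE = 0.007    # intra-class correlation cited in the text


def projected_ld_rate(observed, league=LEAGUE_LD_RATE, r=PERSISTENCE):
    """Shrink an observed line-drive rate toward the league average.

    Assumption: the year-to-year correlation `r` is used as the weight
    on the pitcher's own rate, with the rest going to the league rate.
    """
    return league + r * (observed - league)


# Last year's line-drive rates on balls in play, from the text.
last_year = {"Lincecum": 0.191, "Sowers": 0.171}

for pitcher, rate in last_year.items():
    print(f"{pitcher}: {rate:.3f} -> {projected_ld_rate(rate):.4f}")
```

Under that assumption, the two-point gap between the pitchers shrinks to about a hundredth of a percentage point, and both projections sit within two hundredths of a percentage point of 19 percent.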
This is the reason that tRA did so poorly at predicting ERA the following year compared to FIP, despite having all of the same information and batted-ball rates mixed in. Since tRA asked the question, "What would the average pitcher's ERA be, given his strikeout, walk, home run, pop-up, ground-ball, non-HR outfield fly ball, and line-drive rate?" it was given an answer that highly correlated with line drives. There is a -0.23 correlation between line-drive rate in a given season and ERA for pitchers who pitched at least 40 innings, but line-drive rate does not carry over to the following season. Thus, any DIPS statistic that relies on line-drive rate will unravel the following season if it tries to predict ERA. That is why when tRA was compared to FIP in predicting the following year's ERA, it did worse. It uses all the same information, and a bunch of extra information to confuse itself. Basically, tRA is FIP having a nightmare.
There simply isn't much of a difference between pitchers in their ability to control what percent of balls hit the bat and what percent hit the bat squarely on the center. That makes a lot of sense. The pitcher can control how often the batter misses, and whether the ball is more likely to hit the top of the bat (fly ball) or bottom of the bat (ground ball), based on the trajectory of his pitches. I think that if I went out to the mound in a major-league game, hitters would be able to time my slow offerings right in the center of their bats. They might be a little better at putting their bats on Sidney Ponson's pitches than CC Sabathia's pitches, but when they do, they square it up more often based on whether they are Michael Young or Eric Bruntlett (or whether they happened to guess right), rather than who the pitcher was. Of course, if pitchers were predictable, then their line-drive rate would spike, but it does prove to be true statistically that their line-drive rate shows no persistence when they are trying to avoid tipping their pitches. The fact is that if you can miss enough bats to get 10 percent of hitters to strike out, then the other 90 percent will get their bats on the ball right in the center of the bat as often against Sabathia (20.4 percent) as Ponson (19.3 percent).
This seems to be the missing puzzle piece in DIPS Theory, and I'm afraid that Eric and I buried it in our SIERA series, but it should be highlighted. The primary reason that pitchers do not control their BABIP is that among those sufficiently capable of missing bats, they all seem to have a similar lack of skill in keeping the ball from hitting directly on the barrel rather than just above or just below. Since they control the trajectory of their pitches, some show more of a tendency to make the hitter miss half an inch high rather than half an inch low when they do fail to hit it squarely (based on which part of the ball enters the strike zone first). Yet it's up to the hitter to guess right and center the ball on the sweet spot. The pitchers who can miss enough bats to keep their jobs simply do not differ in how often the ball hits the sweet spot.