One of my favorite things to do with baseball statistics is to pick two of them and see what kind of relationship they have. Many pitchers have changed locations this off-season and will have to get accustomed to the new team defense behind them. Some have made the move to strong defensive teams while others have moved into situations that are a step down from what they have been accustomed to having around them.
Not only are pitchers changing places but so are good and bad defenders, which makes it tough to predict how a team defense could affect a pitcher in 2013. That does not prevent us from looking back at how the relationship has worked. Specifically, we can look at the relationship between a pitcher’s batting average on balls in play and team Defensive Efficiency ratings to see how one impacts the other.
Our crack stats team pulled together a report for me showing all 3,105 pitchers that threw at least 160 innings for a single team in a season from 1974 to 2012. In comparing the two metrics, we see that Defensive Efficiency (DEF_EFF) and batting average on balls in play (BABIP) have a negative correlation of r=-0.57.
Generally, correlations higher than 0.7 or lower than -0.7 are considered strong, while anything between -0.3 and 0.3 is considered weak. The correlation between BABIP and DEF_EFF is not strong, but it’s not one that can be completely dismissed either.
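For readers who want to run the same kind of check on their own data, here is a minimal sketch of a Pearson correlation and the rule of thumb described above. The function names and the "moderate" label for the middle band are assumptions for illustration; the article only names the "strong" and "weak" bands.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def strength(r):
    """Label a correlation using the 0.7 / 0.3 rule of thumb.

    "moderate" is an assumed label for everything between the two bands.
    """
    magnitude = abs(r)
    if magnitude > 0.7:
        return "strong"
    if magnitude < 0.3:
        return "weak"
    return "moderate"


print(strength(-0.57))  # the article's r falls in the middle band
```

Passing paired lists of DEF_EFF and BABIP values to `pearson_r` would give the kind of figure quoted above, though the exact result depends on the sample used.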
A prime example of why the correlation is not stronger is the recently traded James Shields. If you look at the far left of the image above, you’ll see a small dot near the .28 mark on the y-axis. That represents the .282 BABIP Shields posted during the 2007 season, when the then-Devil Rays defense turned in the worst team defensive efficiency in the entire sample: .670. It was a historical low for a team score, and the improvement from 2007 to 2008 was one of the many reasons the Rays won the American League that season. While Shields had a .282 BABIP in 2007, Scott Kazmir was at .332 and Edwin Jackson was at .341. Jump ahead to the 2010 season, when the Rays had a .722 team defensive efficiency, and Shields had a .341 BABIP. Shields owns both the best BABIP against the worst defensive efficiency and the third-highest BABIP for any pitcher on a team with a team defensive efficiency score of at least .720.
The table below shows how the two metrics break down at .025 intervals:
DEF_EFF RANGE | COUNT | MEAN BABIP | BABIP RANGE | HIGHEST | LOWEST
<.700 | 435 | .302 | .119 | .355 | .236
.700-.724 | 2067 | .284 | .122 | .348 | .226
>.724 | 603 | .264 | .093 | .322 | .229
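A breakdown like the one in the table can be sketched in a few lines. This is only an illustration under stated assumptions: the function names are hypothetical, and the sample rows are the handful of pitcher seasons quoted in this article and its comments, not the full 3,105-season data set, so the output will not match the table.

```python
def bucket(def_eff):
    """Assign a team Defensive Efficiency score to one of the table's ranges."""
    if def_eff < 0.700:
        return "<.700"
    if def_eff < 0.725:
        return ".700-.724"
    return ">.724"


def summarize(rows):
    """Group (def_eff, babip) pairs by bucket and compute the table columns."""
    groups = {}
    for def_eff, babip in rows:
        groups.setdefault(bucket(def_eff), []).append(babip)

    summary = {}
    for label, babips in groups.items():
        highest, lowest = max(babips), min(babips)
        summary[label] = {
            "count": len(babips),
            "mean": round(sum(babips) / len(babips), 3),
            "range": round(highest - lowest, 3),
            "highest": highest,
            "lowest": lowest,
        }
    return summary


# (team DEF_EFF, pitcher BABIP) pairs quoted in the article and comments.
rows = [
    (0.670, 0.282),  # Shields, 2007
    (0.670, 0.332),  # Kazmir, 2007
    (0.670, 0.341),  # Jackson, 2007
    (0.722, 0.341),  # Shields, 2010
    (0.693, 0.273),  # Verlander, 2012
    (0.708, 0.236),  # Verlander, 2011
]

for label, stats in summarize(rows).items():
    print(label, stats)
```

The bucket edges assume DEF_EFF is reported to three decimals, so `.700-.724` covers everything from .700 up to but not including .725.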
If you are one to avoid risk, the obvious plan of attack would be to target pitchers on teams with the highest defensive efficiency scores. Unfortunately, only two teams have eclipsed the .725 threshold in the past three seasons: the 2011 Rays (.735) and the 2010 Athletics (.726). The American League does, however, dominate the leaderboard over the last three seasons; it owns each of the top 11 spots and 12 of the top 20. The Rays own three of the top 10 scores while playing in a pitchers’ park, Oakland has two of the top 10 in its pitchers’ park, and Texas has two top-10 finishes in its hitters’ park.
Getting back to Shields, he moves from a team that has averaged a .723 defensive efficiency score over the past five seasons to one that has averaged a .696 score over that same time period. Shields had that rough 2010 season despite the strong team defense mainly due to pitch sequencing issues and giving his cutter usage a baptism by fire. He made some slight mechanical tweaks that off-season and came into 2011 with a different sequencing strategy, and the results since then have been excellent.
Shields has converted himself into a groundball pitcher in recent years. He has both succeeded in spite of poor defensive play and squandered excellent defensive support in the past. While this may be attributable to simple luck, it’s also possible that those worrying about how he’ll fare away from the comforts of Tropicana Field and the Rays defense may be making much ado about nothing.
Shields has been rather consistent the past 2 seasons whereas 2010 was a mess due in part to predictability and not having complete command of his cutter.
A high DE & low BABIP have a good relationship, but it's also not impossible for a pitcher to put up a strong BABIP with a very weak DE behind him. In the sample sizes, there is a trend for the highest BABIP to be lower as the DE goes higher, but the low end remained rather static.
BABIP = (H - HR) / (AB - SO - HR + SF)
Defensive Efficiency = 1- (H-HR)/(AB-SO-HR+SH+SF)
So, for all intents and purposes, these two numbers are complements of each other (the strange omission of SH from the BABIP formula notwithstanding). If you compute the two numbers within the same set of plate appearances, BABIP = 1 - DE.
Each data point of your analysis includes a number for BABIP and a number for DE. But DE does not appear to be 1-BABIP (that is, the points don't all lie along y=1-x). But I don't understand why. Are you computing DE over a larger sample of PA's than the sample used to compute BABIP? Maybe one is computed for just the PA's against the pitcher and the other is computed for all of the defense's PA's? That would make sense. You can ignore this if you think I'm getting too technical but it seems like an ambiguity that obscures the whole focus of your analysis--to me. So maybe I'm not the only one wondering this. No worries either way.
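The two formulas quoted above can be checked numerically. The sketch below is an illustration only: the function names are hypothetical and every count is made up, not taken from any real team or pitcher. It shows that over the same plate appearances the two numbers sum to exactly 1 when SH is zero and drift slightly above 1 otherwise, and that a single pitcher's BABIP need not equal 1 minus the team figure because it covers only his share of the plate appearances.

```python
def babip(h, hr, ab, so, sf):
    """BABIP = (H - HR) / (AB - SO - HR + SF), as quoted above."""
    return (h - hr) / (ab - so - hr + sf)


def defensive_efficiency(h, hr, ab, so, sh, sf):
    """DE = 1 - (H - HR) / (AB - SO - HR + SH + SF), as quoted above."""
    return 1 - (h - hr) / (ab - so - hr + sh + sf)


# Hypothetical team totals allowed over a full season (made-up numbers).
team = dict(h=1400, hr=150, ab=5500, so=1100, sf=45)

# Same plate appearances, no sacrifice hits: the two stats are complements.
no_sh = babip(**team) + defensive_efficiency(sh=0, **team)

# Same plate appearances, 40 sacrifice hits: the sum is slightly above 1.
with_sh = babip(**team) + defensive_efficiency(sh=40, **team)

# Hypothetical single pitcher on that team (made-up numbers): his BABIP is
# computed over his own plate appearances only, so it can sit well away from
# 1 minus the team's Defensive Efficiency.
pitcher_babip = babip(h=200, hr=20, ab=850, so=180, sf=5)
team_de = defensive_efficiency(sh=40, **team)

print(round(no_sh, 4), round(with_sh, 4))
print(round(pitcher_babip, 3), round(1 - team_de, 3))
```

Under these assumptions the pitcher's point would land off the y=1-x line, which is consistent with the question raised above about the two stats being computed over different samples.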
So, your observation is correct: it looks like the correlation described in this article is tautological.
The conclusion of this piece is we really don't know what the change of location will do to him, or any pitcher that moves around. There is some correlation there, but with r=-0.57, it isn't terribly strong. The "duh" observation is that your risk of a high BABIP is reduced when pitching on a team with a strong DE.
Hopefully we are getting beyond the "BABIP = luck" paradigm for pitchers (many thanks to the excellent work by Mike Fast), and there is much to be learned from outlier players.
I think that everyone can agree that Justin Verlander is an outlier, in every sense of the word, and his putting up a .273 BABIP last season in front of the 27th-ranked defense in baseball by Def-Eff (.693) tells us something about the nature of the stat. In 2011, he put up a ridiculous .236 BABIP in front of a .708 Def-Eff Tiger defense (18th in baseball). It is similar with Clayton Kershaw (though Kersh is not so extreme) - these guys live in the lower-left quadrant of the correlation chart.
The middle of the graph is still muddled, either because we lack the tools sensitive enough to measure them with precision (my contention), or because the actual impact is quite small. But it's not a coincidence that the pitchers who are generally regarded as the best in the game also happen to have very low hit rates.
Great work, as always, Jason.
Anyway, thanks all. I understand the article's argument much better now as being somewhat about the lack of steepness in the line's slope and somewhat about the high variance of the team-conditional distribution.
DE >=.720 - r=-0.34
DE >=.700 - r=-0.50
DE >=.680 - r=-0.56
If I tweak the table to ranges of .025 for DE, I get this mean BABIP for each range:
.675-.699 = .302 (14% of the overall sample)
.700-.724 = .284 (67% of the overall sample)
.725-.749 = .264 (19% of the overall sample)