March 23, 2010
Ahead in the Count
Predicting BABIP, Part 1
If you don’t put your bat on the ball, you’re not going to get a hit, and if you don’t hit the ball over the wall, someone might catch it. This series begins with what happens the rest of the time, as I develop a model to predict a hitter’s Batting Average on Balls in Play (BABIP). In Part 2, I will explain some of the current BABIP superstars, and Part 3 will cover some of the players where my system differs from PECOTA.
Hitters vary wildly with respect to their abilities to put the bat on the ball. The best are guys like Placido Polanco and Dustin Pedroia, hitters who strike out in about seven percent of their plate appearances, while the worst are guys like Mark Reynolds and Jack Cust, who strike out about 30-35 percent of the time. Hitters vary wildly with respect to their abilities to hit the ball over the wall, too: Prince Fielder and Ryan Howard hit home runs in about eight percent of their at-bats while Jason Kendall and Luis Castillo are well under one percent.
It’s very clear who is good at making contact, and who is good at hitting home runs, but it is harder to know who is good at getting hits on balls in play. That’s because the difference between the best and worst hitters at BABIP is much smaller, which explains why the year-to-year correlation of a batter’s HR/AB is about .74, and the year-to-year correlation of a batter’s K/PA is about .84, while the year-to-year correlation of a hitter’s BABIP is only about .37. Guys like Derek Jeter and Ichiro Suzuki can reliably get hits on 35 percent of their balls in play, while guys like Edwin Encarnacion and Rod Barajas can only muster hits on 26 percent of their balls in play. Because the gap between hitters in BABIP skill is so much smaller, it can be tricky to separate skill from noise. However, it is very important to do so if you want to assess a hitter’s skills, because 69 percent of all PA result in a ball being hit into the field of play.
BABIP came into popular usage when Voros McCracken initially observed how little control pitchers have over it. Recently, I explained that pitchers seem to control about 12 percent of their BABIP skill, but I have also found that hitters control about three times that much in their at-bats (36 percent). Furthermore, unlike pitchers, whose BABIP skills are correlated with and explained by each pitcher's strikeout, walk, and home run skills, any hitter's BABIP skills are often uncorrelated with his Three True Outcomes skills. The actual standard deviation of hitter BABIP among players with at least 300 PA in a given season is about .031, but after removing the fraction that could be attributed to luck, using the same methods as the article I just referenced, true hitter skill in BABIP should have a standard deviation of about .019. My model predicts about .018, indicating that I have isolated the most important aspects of BABIP.
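One common way to separate skill from luck is to subtract the binomial sampling variance from the observed variance. This may or may not match the exact method in the referenced article. The sketch below assumes a league BABIP of about .300 and roughly 350 balls in play per qualifying hitter; both inputs are illustrative assumptions, not figures from this article.

```python
import math

def true_skill_sd(observed_sd: float, mean_babip: float, balls_in_play: float) -> float:
    """Estimate the spread of true talent by removing binomial luck variance."""
    # Variance of a hitter's observed BABIP that comes from chance alone.
    luck_variance = mean_babip * (1.0 - mean_babip) / balls_in_play
    # Whatever variance is left over is attributed to skill (floored at zero).
    skill_variance = max(observed_sd ** 2 - luck_variance, 0.0)
    return math.sqrt(skill_variance)

# Assumed inputs: the .300 league BABIP and 350 balls in play are hypothetical.
estimate = true_skill_sd(observed_sd=0.031, mean_babip=0.300, balls_in_play=350)
print(round(estimate, 3))
```

Under those assumed inputs, the estimate lands near the .019 figure quoted above, but the result is sensitive to the number of balls in play chosen.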
As I explained when I wrote my BP Idol article "You Can Beat PECOTA without a Computer Model," PECOTA uses nearly a century of data to predict hitters’ statistics with accuracy, although not every statistic has been recorded throughout history. Thus, PECOTA cannot use information like rates for ground balls, line drives, pop-ups, and fly balls, nor can it examine BABIP on each of those batted-ball types. My model can see some things about BABIP that PECOTA can’t, and it has therefore been more successful at projecting BABIP than PECOTA and other systems. My projected BABIP numbers, published at the now defunct StatSpeak.net blog last spring, correlated with actual 2009 BABIP at a higher rate than PECOTA and CHONE, and had a lower RMSE. They also were closer to actual BABIP than PECOTA was 57 percent of the time, and closer to actual BABIP than CHONE was 60 percent of the time. That might not seem like much, but those fractions are significant at the 95-percent and 99-percent level, respectively. In other words, there is less than a five-percent chance that there would have been that large of a difference between my BABIP model and PECOTA if they were equally good, and less than a one-percent chance that the model would have beaten CHONE as badly as it did just by chance.
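To see how a "closer 57 percent of the time" result can be tested, one option is an exact sign test: if two systems were equally good, each would be closer about half the time. The sketch below is only an illustration. The article does not give the number of hitters compared, so the sample size of 200 is a hypothetical value, and the article may have used a different test.

```python
from math import comb

def sign_test_p_value(wins: int, comparisons: int) -> float:
    """One-sided exact p-value: the chance of at least `wins` successes
    in `comparisons` fair coin flips."""
    tail = sum(comb(comparisons, k) for k in range(wins, comparisons + 1))
    return tail / 2 ** comparisons

# Hypothetical sample: 200 hitters, with one system closer on 57 percent of them.
n_hitters = 200
p_value = sign_test_p_value(wins=round(0.57 * n_hitters), comparisons=n_hitters)
print(f"{p_value:.4f}")
```

The same win percentage becomes more convincing as the number of hitters grows, which is why a seemingly modest 57 percent can still clear a significance threshold.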
There is a large difference in BABIP on different batted balls. Line drives have a BABIP of about .730, while ground balls have a BABIP of about .240. Outfield fly balls have a BABIP of about .170, while infield pop-ups have a BABIP of only .020.
Different hitters have different swings that generate these batted balls at very different rates. The year-to-year correlation for ground-ball rate is .78, for outfield fly-ball rate is .72, for line-drive rate is .37, and for pop-up rate is .68. You can get pretty far already just by knowing the rate at which a hitter has hit these batted-ball types in the past.
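As a rough illustration of how far batted-ball rates alone can take you, the sketch below multiplies a hitter's batted-ball mix by the league BABIP for each type quoted above. The two batted-ball profiles are hypothetical, and the calculation assumes every hitter gets league-average results on each type.

```python
# League BABIP by batted-ball type, taken from the figures quoted above.
LEAGUE_BABIP = {
    "line_drive": 0.730,
    "ground_ball": 0.240,
    "outfield_fly": 0.170,
    "pop_up": 0.020,
}

def babip_from_mix(rates: dict) -> float:
    """Blend league BABIP by type using a hitter's share of each batted-ball type."""
    if abs(sum(rates.values()) - 1.0) > 1e-9:
        raise ValueError("batted-ball rates must sum to 1")
    return sum(rates[kind] * LEAGUE_BABIP[kind] for kind in LEAGUE_BABIP)

# Hypothetical profiles, expressed as fractions of balls in play.
line_drive_hitter = {"line_drive": 0.22, "ground_ball": 0.45,
                     "outfield_fly": 0.28, "pop_up": 0.05}
pop_up_hitter = {"line_drive": 0.17, "ground_ball": 0.40,
                 "outfield_fly": 0.33, "pop_up": 0.10}

print(round(babip_from_mix(line_drive_hitter), 3))
print(round(babip_from_mix(pop_up_hitter), 3))
```

With these made-up profiles, the mix alone separates the two hitters by roughly 40 points of BABIP (about .317 versus .278), before any difference in how well each hitter does on a given batted-ball type.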
Hitters also show some difference in their BABIP on each batted-ball type, with year-to-year correlations of .30 for ground balls, .22 for outfield fly balls, just .12 for line drives, and .17 for pop-ups. Therefore, if we know that a hitter had a high BABIP because he had a high BABIP on line drives, then we should expect him to regress back to the mean, while if a hitter had a high BABIP because he had a high BABIP on ground balls, then he will be more likely to maintain that.
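A back-of-the-envelope way to picture this is to treat the year-to-year correlation as the share of a hitter's deviation from the league mean that carries over to the next season. This is a simplification that assumes similar spread in both seasons, and it is not the regression model developed later in the article. The hitter below is hypothetical; the league means of .730 and .240 are the figures quoted earlier.

```python
def regress_to_mean(observed: float, league_mean: float, correlation: float) -> float:
    """Shrink an observed rate toward the league mean by the year-to-year correlation."""
    return league_mean + correlation * (observed - league_mean)

# Hypothetical hitter who was 50 points above league average on both types.
next_ld = regress_to_mean(observed=0.780, league_mean=0.730, correlation=0.12)
next_gb = regress_to_mean(observed=0.290, league_mean=0.240, correlation=0.30)

print(round(next_ld, 3))  # line drives keep only 6 of the 50 points
print(round(next_gb, 3))  # ground balls keep 15 of the 50 points
```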
While hitters seem to have more control over their BABIP on ground balls, this is primarily because faster players can reach on ground balls in the infield at a much higher rate than slower players. In the past, I have found that using ground-ball errors and infield hits together is better than looking at infield hits alone, so I have incorporated a variable called "Infield Reach Percentage," which is the percentage of ground balls staying in the infield on which a hitter reaches first base safely (excluding fielder’s choices). Infield Reach Percentage has a .55 year-to-year correlation, while "Outfield Ground-Ball Rate," my statistic representing the percentage of ground balls that reach the outfield, has only a .25 year-to-year correlation; it certainly represents a skill, but not one where there is much difference between major-league hitters.
The batted-ball type with the least persistent BABIP was line drives, with only a .12 year-to-year correlation. Most of hitting line drives away from fielders' gloves is a matter of luck, but hitters who hit the ball harder are better at getting them to fall in. In fact, home-run rate (or, actually, the natural log of home-run rate) correlates more highly with next year’s line-drive BABIP than this year’s line-drive BABIP does. In other words, you can most likely expect Andre Ethier (.618 LD-BABIP, 31 homers in 2009) to have a better BABIP on line drives than Erick Aybar (.861 LD-BABIP, five homers in 2009) this year.
Another statistic that has a high correlation with a lot of relevant aspects of BABIP is contact rate (as shown on FanGraphs), defined as the percentage of a hitter's swings on which he makes contact, whether the result is a foul or a fair ball (although I use the natural log of contact rate). Being able to make contact when a hitter swings is a good proxy for his ability to square up the ball, and hitters are more likely to improve their BABIP if they make more contact in general. Triples per at-bat were also used in some regressions as an additional proxy for speed.
Following my methods in previous articles, I developed simple OLS regressions to check BABIP on each batted-ball type (except pop-ups) before developing an overall BABIP model. I did make one important change for the sake of accuracy, which was to use weighted averages of each statistic over multiple seasons. Only including seasons with at least 300 PA, I created a weighted average of the three previous seasons by weighting three years ago by three, two years ago by four, and one year ago by five, and averaging those. For instance, the weighted average of ground-ball BABIP over the last three years was equal to:
(3*(Ground-ball Hits in 2007) + 4*(Ground-ball Hits in 2008) + 5*(Ground-ball Hits in 2009)) / (3*(Ground Balls in 2007) + 4*(Ground Balls in 2008) + 5*(Ground Balls in 2009))
Similarly, when I only had two previous seasons with at least 300 PA, the years were weighted by just four and five.
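The weighting scheme can be sketched as a small helper. This assumes the weighted average is weighted hits divided by weighted opportunities, which is one reasonable reading of the description; the function name and the sample counts are hypothetical.

```python
WEIGHTS = (3, 4, 5)  # three years ago, two years ago, one year ago

def weighted_rate(numerators, denominators):
    """Weighted multi-season rate; seasons are ordered oldest to most recent.

    With two seasons the weights are 4 and 5; with one season the rate is unchanged.
    """
    if len(numerators) != len(denominators) or not 1 <= len(numerators) <= 3:
        raise ValueError("expected one to three seasons of matching length")
    weights = WEIGHTS[-len(numerators):]
    top = sum(w * x for w, x in zip(weights, numerators))
    bottom = sum(w * x for w, x in zip(weights, denominators))
    return top / bottom

# Hypothetical ground-ball hits and ground balls for 2007, 2008, and 2009.
gb_hits = [50, 55, 60]
ground_balls = [200, 210, 220]

print(round(weighted_rate(gb_hits, ground_balls), 3))          # three seasons
print(round(weighted_rate(gb_hits[1:], ground_balls[1:]), 3))  # two seasons
```

Only seasons with at least 300 PA would be passed in, so the caller is responsible for filtering before calling the helper.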
Using only data from 2003-09, and only hitters with at least 300 PA for four straight years, I used regression analysis to predict ground-ball BABIP (GB-BABIP) in the fourth year using data from the previous three years. The regression had an R^2 of .20; here it is:
Variable       Coeff.   P-Stat
GB-BABIP       0.412    .000
INF Reach%     0.179    .016
LN(Contact%)   0.082    .008
LN(HR/AB)      0.010    .015
TR/AB          1.248    .004
Constant       0.162    .000
Translated, this table says that:
Expected GB-BABIP = .412*(GB-BABIP) + .179*(INF Reach%) + .082*LN(Contact%) + .010*LN(HR/AB) + 1.248*(TR/AB) + .162
Keep in mind that all of these statistics are weighted averages of the previous three years as described above.
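The ground-ball equation can be turned into a small function. The coefficients come from the regression table above; the sketch assumes every rate is expressed as a fraction (for example, .80 rather than 80 for contact rate), which the article does not state explicitly, and the sample inputs are hypothetical.

```python
import math

def expected_gb_babip(gb_babip: float, inf_reach_pct: float, contact_pct: float,
                      hr_per_ab: float, triples_per_ab: float) -> float:
    """Expected ground-ball BABIP from weighted three-year averages.

    All inputs are assumed to be fractions between 0 and 1.
    """
    return (0.412 * gb_babip
            + 0.179 * inf_reach_pct
            + 0.082 * math.log(contact_pct)   # natural log of contact rate
            + 0.010 * math.log(hr_per_ab)     # natural log of home runs per at-bat
            + 1.248 * triples_per_ab
            + 0.162)

# Hypothetical hitter: every value here is made up for illustration.
projection = expected_gb_babip(gb_babip=0.240, inf_reach_pct=0.10,
                               contact_pct=0.80, hr_per_ab=0.03,
                               triples_per_ab=0.005)
print(round(projection, 3))
```

Because the home-run and contact terms use natural logs of fractions, they contribute negative amounts that the constant offsets, so a hitter with zero home runs would need special handling.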
A similar regression for outfield fly-ball BABIP (OF-BABIP) gave the following:

Variable    Coeff.   P-Stat
OF-BABIP    .307     .000
PU%         -.265    .002
LD%         -.240    .014
Constant    .186     .000
This equation had an R^2 of .09.
Projecting line-drive BABIP in the fourth year was not even helped by looking at previous years’ line-drive BABIP, but instead by looking at previous years’ line-drive rate and the natural log of home runs per at bat. This only had an R^2 of .04.
Using these results, I confirmed that using the same statistics would help me develop my BABIP model. Again, using all players with 300 PA in four straight seasons, I developed the following regression equation for BABIP the fourth season as a function of weighted averages from the three previous seasons. This regression had an R^2 of .31:
Variable                             Coeff.   P-Stat
Line Drives/Balls in Play            .277     .000
Ground Balls/Balls in Play           .091     .010
Popups/Balls in Play                 -.378    .000
Ground-ball BABIP                    .177     .003
IF Reaches/IF Reach Ground Balls     .109     .040
Outfield Fly-ball BABIP              .181     .000
Ln(Home Runs/At-Bats)                .011     .000
Ln(Contact Made/Pitches Swung At)    .054     .028
Constant                             .200     .000
Using only players with three straight years of 300 PA, I developed the following regression equation for predicting BABIP in the third year as a function of weighted averages from the two previous seasons; this had an R^2 of .26:
Variable                              Coeff.   P-Stat
Line Drives/Balls in Play             .226     .000
Ground Balls/Balls in Play            .040     .127
Popups/Balls in Play                  -.387    .000
Ground-ball BABIP                     .138     .001
INF Reaches/INF Reach Ground Balls    .104     .005
Outfield Fly-ball BABIP               .124     .000
Ln(Home Runs/At-Bats)                 .007     .002
Ln(Contact Made/Pitches Swung At)     .023     .180
Constant                              .232     .000
Using the same statistics to project BABIP one year using only the previous year’s data, I got a lot of insignificant variables and needed to change up the variables a little bit. With only one year of data, there is a lot of noise, and the best strategy is to use some other statistics to add information that would not have been valuable with more years to work with. I adjusted the pop-up variable to pop-ups per fly ball overall, which provided a more accurate assessment of how frequently the hitter makes a bad swing. I eliminated overall GB-BABIP, which had too much noise in it, and used only infield reach rate. I also used triples to add in some information about speed not contained in one year of infield reach rate. Finally, I used outfield fly-ball hits per all balls in play rather than just per outfield fly balls in play since that provided a slightly better prediction. This regression had an R^2 of .21.
Variable                              Coeff.   P-Stat
Line Drives/Balls in Play             .236     .000
Ground Balls/Balls in Play            .077     .000
Popups/(Popups + Fly Balls)           -.116    .000
IF Reaches/IF Reach Ground Balls      .132     .000
Ln(Home Runs/At-Bats)                 .003     .019
OF Fly-ball Hits/All Balls in Play    .218     .000
Triples/At-Bats                       .605     .000
Constant                              .223     .000
It’s always better to use extra data, so to develop my model of BABIP, which I call Expected BABIP (or E-BABIP), I used the first of these three regressions for all hitters with 300 PA or more in each of the previous three years, the second regression if the hitter had 300 PA in each of the previous two years but not three years ago, and the last of the three equations only for hitters who had 300 PA in just the previous year.
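The selection rule and the three regressions can be sketched together as follows. The coefficients are copied from the tables above; the dictionary keys, function names, and sample hitter are hypothetical, and all rates are assumed to be fractions with the log terms computed by the caller.

```python
import math

# Coefficients from the three-year, two-year, and one-year regressions above.
THREE_YEAR = {"ld_rate": 0.277, "gb_rate": 0.091, "pu_rate": -0.378,
              "gb_babip": 0.177, "inf_reach": 0.109, "of_babip": 0.181,
              "ln_hr_per_ab": 0.011, "ln_contact": 0.054, "constant": 0.200}
TWO_YEAR = {"ld_rate": 0.226, "gb_rate": 0.040, "pu_rate": -0.387,
            "gb_babip": 0.138, "inf_reach": 0.104, "of_babip": 0.124,
            "ln_hr_per_ab": 0.007, "ln_contact": 0.023, "constant": 0.232}
ONE_YEAR = {"ld_rate": 0.236, "gb_rate": 0.077, "pu_per_fly": -0.116,
            "inf_reach": 0.132, "ln_hr_per_ab": 0.003,
            "of_hits_per_bip": 0.218, "triples_per_ab": 0.605,
            "constant": 0.223}

def pick_model(qualifying_seasons: int) -> dict:
    """Choose a regression from the count of consecutive 300-PA seasons,
    counting back from the most recent one."""
    if qualifying_seasons >= 3:
        return THREE_YEAR
    if qualifying_seasons == 2:
        return TWO_YEAR
    if qualifying_seasons == 1:
        return ONE_YEAR
    raise ValueError("need at least one prior season with 300 PA")

def e_babip(stats: dict, qualifying_seasons: int) -> float:
    """Apply the chosen regression to a hitter's (weighted) statistics."""
    coeffs = pick_model(qualifying_seasons)
    return coeffs["constant"] + sum(
        value * stats[name] for name, value in coeffs.items() if name != "constant")

# Hypothetical hitter with three qualifying seasons; every value is made up.
hitter = {"ld_rate": 0.20, "gb_rate": 0.44, "pu_rate": 0.04,
          "gb_babip": 0.240, "inf_reach": 0.10, "of_babip": 0.170,
          "ln_hr_per_ab": math.log(0.03), "ln_contact": math.log(0.80)}
print(round(e_babip(hitter, qualifying_seasons=3), 3))
```

Keeping the coefficients in plain dictionaries makes it easy to compare the three regressions side by side, at the cost of requiring the caller to supply the right inputs for whichever model is selected.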
Actually incorporating this model into a projection system would be a tricky endeavor, and with all the improvements going into PECOTA in 2010, this was not included in this year’s projections. However, it can be used by general managers and fantasy managers alike to better assess who is likely to outperform their projections. As other projection systems use similar amounts of data, this model serves as another approach to evaluate a few aspects of hitting skill.
Part 2 of this series will highlight the current BABIP Superstars, as I’ve called them before, showing the 10 highest BABIP projections for 2010 and the reasons why they are so high, along with the five lowest BABIP projections for 2010 and the reasons why they are so low. Part 3 will show the hitters where E-BABIP and PECOTA-BABIP differ the most, and the circumstances in which you should rely on each.