If you don’t put your bat on the ball, you’re not going to get a hit, and if you don’t hit the ball over the wall, someone might catch it. This series begins with what happens the rest of the time, as I develop a model to predict a hitter’s Batting Average on Balls in Play (BABIP). In Part 2, I will profile some of the current BABIP superstars, and Part 3 will cover some of the players on whom my system differs from PECOTA.

Hitters vary wildly with respect to their abilities to put the bat on the ball. The best are guys like Placido Polanco and Dustin Pedroia, hitters who strike out in about seven percent of their plate appearances, while the worst are guys like Mark Reynolds and Jack Cust, who strike out about 30-35 percent of the time. Hitters vary wildly with respect to their abilities to hit the ball over the wall, too: Prince Fielder and Ryan Howard hit home runs in about eight percent of their at-bats while Jason Kendall and Luis Castillo are well under one percent.

It’s very clear who is good at making contact, and who is good at hitting home runs, but it is harder to know who is good at getting hits on balls in play. That’s because the difference between the best and worst hitters at BABIP is much smaller, which explains why the year-to-year correlation of a batter’s HR/AB is about .74, and the year-to-year correlation of a batter’s K/PA is about .84, while the year-to-year correlation of a hitter’s BABIP is only about .37. Guys like Derek Jeter and Ichiro Suzuki can reliably get hits on 35 percent of their balls in play, while guys like Edwin Encarnacion and Rod Barajas can only muster hits on 26 percent of their balls in play. With such a small gap between hitters, it can be tricky to separate skill from noise. However, separating them is very important if you want to assess a hitter’s skills, because 69 percent of all PA result in a ball being hit into the field of play.

BABIP came into popular usage when Voros McCracken initially observed how little control pitchers have over it. Recently, I explained that pitchers seem to control about 12 percent of their BABIP skill, but I have also found that hitters control about three times that much in their at-bats (36 percent). Furthermore, unlike pitchers, whose BABIP skills are correlated with and explained by each pitcher's strikeout, walk, and home run skills, any hitter's BABIP skills are often uncorrelated with his Three True Outcomes skills. The actual standard deviation of hitter BABIP among players with at least 300 PA in a given season is about .031, but after removing the fraction that could be attributed to luck, using the same methods as the article I just referenced, true hitter skill in BABIP should have a standard deviation of about .019. My model predicts about .018, indicating that I have isolated the most important aspects of BABIP.
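The relationship between the observed spread (.031) and the skill spread (.019) can be sketched with a simple variance split. This is only an illustration under the assumption that skill and luck are independent, so their variances add; the referenced article's method may differ in detail, and the function name `split_variance` is made up for this example.

```python
import math

def split_variance(observed_sd, skill_sd):
    """Split observed spread into skill and luck components.

    Assumes skill and luck are independent, so
    observed variance = skill variance + luck variance.
    """
    observed_var = observed_sd ** 2
    skill_var = skill_sd ** 2
    if skill_var > observed_var:
        raise ValueError("skill SD cannot exceed observed SD")
    luck_sd = math.sqrt(observed_var - skill_var)
    skill_share = skill_var / observed_var
    return luck_sd, skill_share

# Standard deviations quoted in the article (hitters with 300+ PA).
luck_sd, skill_share = split_variance(0.031, 0.019)
print(f"luck SD ~ {luck_sd:.3f}, skill share of variance ~ {skill_share:.2f}")
```

Under that assumption the luck component has a standard deviation of roughly .024, and skill accounts for a bit more than a third of the variance, which is in the neighborhood of the 36 percent figure quoted above.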

As I explained when I wrote my BP Idol article "You Can Beat PECOTA without a Computer Model," PECOTA uses nearly a century of data to predict hitters’ statistics with accuracy, although not every statistic has been recorded throughout history. Thus, PECOTA cannot use information like rates for ground balls, line drives, pop-ups, and fly balls, nor can it examine BABIP on each of those batted-ball types. My model can see some things that PECOTA can’t about BABIP, and therefore has been more successful at projecting BABIP than PECOTA and other systems. My projected BABIP numbers, published at the now defunct blog last spring, correlated with actual 2009 BABIP at a higher rate than PECOTA and CHONE, and had a lower RMSE. They also were closer to actual BABIP than PECOTA was 57 percent of the time, and closer to actual BABIP than CHONE was 60 percent of the time. That might not seem like much, but those fractions are significant at the 95-percent and 99-percent level, respectively. In other words, there is less than a five-percent chance that there would have been that large a difference between my BABIP model and PECOTA if they were equally good, and less than a one-percent chance that the model would have beaten CHONE as badly as it did just by chance.

There is a large difference in BABIP on different batted balls. Line drives have a BABIP of about .730, while ground balls have a BABIP of about .240. Outfield fly balls have a BABIP of about .170, while infield pop-ups have a BABIP of only .020.

Different hitters have different swings that generate these batted balls at very different rates. The year-to-year correlation for ground-ball rate is .78, for outfield fly-ball rate is .72, for line-drive rate is .37, and for pop-up rate is .68. You can get pretty far already just by knowing the rate at which a hitter has hit these batted-ball types in the past.
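To see how far batted-ball rates alone can take you, here is a minimal sketch that blends the league BABIP figures quoted above by a hitter's batted-ball mix. The two mixes are hypothetical, chosen only for illustration, and the function name `blended_babip` is not from the article.

```python
# League BABIP by batted-ball type, as quoted in the article.
LEAGUE_BABIP = {
    "line_drive": 0.730,
    "ground_ball": 0.240,
    "outfield_fly": 0.170,
    "pop_up": 0.020,
}

def blended_babip(rates, babip_by_type=LEAGUE_BABIP):
    """Average the per-type BABIP values, weighted by share of balls in play."""
    if abs(sum(rates.values()) - 1.0) > 1e-6:
        raise ValueError("batted-ball rates must sum to 1")
    return sum(share * babip_by_type[kind] for kind, share in rates.items())

# Hypothetical batted-ball mixes (shares of balls in play).
typical = {"line_drive": 0.19, "ground_ball": 0.45,
           "outfield_fly": 0.28, "pop_up": 0.08}
more_liners = {"line_drive": 0.24, "ground_ball": 0.42,
               "outfield_fly": 0.26, "pop_up": 0.08}

print(round(blended_babip(typical), 3))
print(round(blended_babip(more_liners), 3))
```

The first mix lands a little under .300 and the second a little over .320, purely from the change in batted-ball shares; nothing here accounts for how persistent those shares are from year to year.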

Hitters also show some difference in their BABIP on each batted-ball type, with year-to-year correlations of .30 for ground balls, .22 for outfield fly balls, just .12 for line drives, and .17 for pop-ups. Therefore, if we know that a hitter had a high BABIP because he had a high BABIP on line drives, then we should expect him to regress back to the mean, while if a hitter had a high BABIP because he had a high BABIP on ground balls, then he will be more likely to maintain that.
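A rough way to picture that difference is to treat each year-to-year correlation as a shrinkage factor toward the league average. This is a textbook one-year approximation, not the regression model developed later in the article; the 60-point gap used in the example is hypothetical.

```python
# Year-to-year correlations and league BABIP by type, as quoted in the article.
YEAR_TO_YEAR_R = {"ground_ball": 0.30, "outfield_fly": 0.22,
                  "line_drive": 0.12, "pop_up": 0.17}
LEAGUE_BABIP = {"ground_ball": 0.240, "outfield_fly": 0.170,
                "line_drive": 0.730, "pop_up": 0.020}

def regress_to_mean(observed, kind):
    """Shrink an observed per-type BABIP toward the league mean.

    Assumption: next year's expected value is the league mean plus the
    year-to-year correlation times this year's gap from the mean.
    """
    mean = LEAGUE_BABIP[kind]
    return mean + YEAR_TO_YEAR_R[kind] * (observed - mean)

# Hypothetical hitter who beat the league average by 60 points on each type.
for kind in ("ground_ball", "line_drive"):
    observed = LEAGUE_BABIP[kind] + 0.060
    print(kind, round(regress_to_mean(observed, kind), 3))
```

Under this assumption the same 60-point edge is expected to keep about 18 points on ground balls but only about seven points on line drives.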

There are other statistics that show some ability to help guide a prediction of BABIP even after knowing batted-ball rates and BABIP on them.

While hitters seem to have more control over their BABIP on ground balls, this is primarily because faster players can reach on ground balls in the infield at a much higher rate than slower players. In the past, I have found that using ground-ball errors and infield hits together is better than looking at infield hits alone, so I have incorporated a variable called "Infield Reach Percentage," which is the percentage of ground balls staying in the infield on which a hitter reaches first base safely (excluding fielder’s choices). Infield Reach Percentage has a .55 year-to-year correlation, while "Outfield Ground-Ball Rate," my statistic representing the percentage of ground balls that reach the outfield, has only a .25 year-to-year correlation; it certainly represents a skill, but not one where there is much difference between major-league hitters.

The batted-ball type with the least persistent BABIP was line drives, with only a .12 year-to-year correlation. Most of hitting line drives away from fielders' gloves is a matter of luck, but hitters who hit the ball harder are better at getting them to fall in. In fact, home-run rate (or, actually, the natural log of home-run rate) correlates more highly with next year’s line-drive BABIP than this year’s line-drive BABIP does. In other words, you can most likely expect Andre Ethier (.618 LD-BABIP, 31 homers in 2009) to have a better BABIP on line drives than Erick Aybar (.861 LD-BABIP, five homers in 2009) this year.

Another statistic that has a high correlation with a lot of relevant aspects of BABIP is contact rate (as shown on FanGraphs), defined as the percentage of a hitter's swings on which he makes contact, generating either a foul or a fair ball (although I use the natural log of contact rate). Being able to make contact when a hitter swings is a good proxy for his ability to square up the ball, and hitters are more likely to improve their BABIP if they make more contact in general. Triples per at-bat were also used in some regressions as an additional proxy for speed.

Following my methods in previous articles, I developed simple OLS regressions to check BABIP on each batted-ball type (except pop-ups) before developing an overall BABIP model. I did make one important change for the sake of accuracy: I created weighted averages of each statistic over multiple seasons. Only including seasons with at least 300 PA, I created a weighted average of the three previous seasons by weighting three years ago by three, two years ago by four, and one year ago by five, and averaging those. For instance, the weighted average of ground-ball BABIP over the last three years was equal to:

(3*(Ground-ball Hits in 2007) + 4*(Ground-ball Hits in 2008) + 5*(Ground-ball Hits in 2009))
--------------------------------------------------------------------------------------------
(3*(Ground Balls in 2007) + 4*(Ground Balls in 2008) + 5*(Ground Balls in 2009))

Similarly, when I only had two previous seasons with at least 300 PA, the years were weighted by just four and five.
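The weighting scheme just described can be sketched as a small function. The season totals in the example are hypothetical, and the handling of a non-qualifying season (stop at the first season under 300 PA, working backward) is an assumption about how "consecutive" seasons are treated, not something spelled out above.

```python
# Weights by recency: one year ago, two years ago, three years ago.
WEIGHTS = (5, 4, 3)

def weighted_rate(seasons, min_pa=300):
    """Weighted multi-season rate.

    `seasons` lists (successes, opportunities, plate_appearances) tuples,
    most recent season first. Seasons are used only while each one meets
    the PA minimum (assumption: a gap ends the run of usable seasons).
    """
    numerator = 0.0
    denominator = 0.0
    for weight, (successes, opportunities, pa) in zip(WEIGHTS, seasons):
        if pa < min_pa:
            break
        numerator += weight * successes
        denominator += weight * opportunities
    if denominator == 0:
        raise ValueError("no qualifying seasons")
    return numerator / denominator

# Hypothetical ground-ball hits, ground balls, and PA for 2009, 2008, 2007.
three_years = [(60, 240, 600), (50, 220, 550), (55, 200, 580)]
two_years = [(60, 240, 600), (50, 220, 550), (20, 90, 250)]

print(round(weighted_rate(three_years), 3))  # weights 5, 4, 3
print(round(weighted_rate(two_years), 3))    # weights 5, 4 only
```

Summing weighted hits and weighted ground balls separately, rather than averaging three seasonal rates, matches the ratio written out above and lets seasons with more ground balls count for more.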

Using only data from 2003-09, and only hitters with at least 300 PA for four straight years, I used regression analysis to predict ground-ball BABIP (GB-BABIP) in the fourth year using data from the previous three years. The regression had an R^2 of .20; here it is:

Variable      Coeff.  P-Stat
GB-BABIP      0.412    .000
INF Reach%    0.179    .016
LN(Contact%)  0.082    .008
LN(HR/AB)     0.010    .015
TR/AB         1.248    .004
Constant      0.162    .000

Translated, this table says that:

Expected GB-BABIP =  .412*(GB-BABIP) + .179*(INF Reach%) + .082*(LN(Contact%))
                        + .010*(LN(HR/AB)) + 1.248*(TR/AB) + .162

Keep in mind that all of these statistics are weighted averages of the previous three years as described above.
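For readers who want to plug in numbers, the ground-ball equation can be written as a short function. The coefficients come from the table above; the input values are hypothetical, and treating every rate as a proportion (for example .80 rather than 80 for contact rate) is an assumption about the units.

```python
import math

def expected_gb_babip(gb_babip, inf_reach, contact, hr_per_ab, triples_per_ab):
    """Expected ground-ball BABIP from weighted three-year inputs.

    All inputs are assumed to be proportions (0-1), already averaged over
    the previous three seasons with the 3/4/5 weights.
    """
    return (0.412 * gb_babip
            + 0.179 * inf_reach
            + 0.082 * math.log(contact)
            + 0.010 * math.log(hr_per_ab)
            + 1.248 * triples_per_ab
            + 0.162)

# Hypothetical hitter: .240 GB-BABIP, 10% infield reach, 80% contact,
# a home run every 33 at-bats or so, a triple every 200 at-bats.
print(round(expected_gb_babip(0.240, 0.10, 0.80, 0.03, 0.005), 3))
```

Because the home-run and contact terms enter through natural logs of numbers below one, both contribute negative amounts that the constant offsets; a hitter with more power or more contact simply gets a smaller deduction.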

Predicting outfield fly-ball BABIP in the fourth year as a function of the weighted average of OF-BABIP, pop-up rate, and line-drive rate in the previous three years would look as follows:

Variable      Coeff. P-Stat
OF-BABIP       .307   .000
PU%           -.265   .002
LD%           -.240   .014
Constant       .186   .000

This equation had an R^2 of .09.

Projecting line-drive BABIP in the fourth year was not even helped by looking at previous years’ line-drive BABIP, but instead by looking at previous years’ line-drive rate and the natural log of home runs per at bat. This only had an R^2 of .04.

Variable       Coeff.  P-Stat
LN(HR/AB)      .017    .000
LD%            .205    .087
Constant       .741    .000

Using these results, I confirmed that using the same statistics would help me develop my BABIP model. Again, using all players with 300 PA in four straight seasons, I developed the following regression equation for BABIP the fourth season as a function of weighted averages from the three previous seasons. This regression had an R^2 of .31:

Variable                                  Coeff.  P-Stat
Line Drives/Balls in Play                 .277     .000
Ground Balls/Balls in Play                .091     .010
Popups/Balls in Play                     -.378     .000
Ground-ball BABIP                         .177     .003
IF Reaches/IF Reach Ground Balls          .109     .040
Outfield Fly-ball BABIP                   .181     .000
Ln(Home Runs/At-Bats)                     .011     .000
Ln(Contact Made/Pitches Swung At)         .054     .028
Constant                                  .200     .000

Using players with only three straight years of at least 300 PA, I developed the following regression equation for predicting BABIP in the third year as a function of weighted averages from the two previous seasons; this had an R^2 of .26:

Variable                                  Coeff.  P-Stat
Line Drives/Balls in Play                 .226     .000
Ground Balls/Balls in Play                .040     .127
Popups/Balls in Play                     -.387     .000
Ground-ball BABIP                         .138     .001
INF Reaches/INF Reach Ground Balls        .104     .005
Outfield Fly-ball BABIP                   .124     .000
Ln(Home Runs/At-Bats)                     .007     .002
Ln(Contact Made/Pitches Swung At)         .023     .180
Constant                                  .232     .000

Using the same statistics to project BABIP one year using only the previous year’s data, I got a lot of insignificant variables and needed to change the variables a little. With only one year of data, there is a lot of noise, and the best strategy is to use some other statistics to add information that would not have been valuable with more years to work with. I adjusted the pop-up variable to pop-ups per fly ball overall, which provided a more accurate assessment of how frequently the hitter makes a bad swing. I eliminated overall GB-BABIP, which had too much noise in it, and used only infield reach rate. I also used triples to add in some information about speed not contained in one year of infield reach rate. Finally, I used outfield fly-ball hits per all balls in play rather than just per outfield fly balls in play since that provided a slightly better prediction. This regression had an R^2 of .21.

Variable                                  Coeff.  P-Stat
Line Drives/Balls in Play                 .236     .000
Ground Balls/Balls in Play                .077     .000
Popups/(Popups + Fly Balls)              -.116     .000
IF Reaches/IF Reach Ground Balls          .132     .000
Ln(Home Runs/At-Bats)                     .003     .019
OF Fly-ball Hits/All Balls in Play        .218     .000
Triples/At-Bats                           .605     .000
Constant                                  .223     .000

It’s always better to use extra data, so to develop my model of BABIP, which I call Expected BABIP (or EBABIP), I used the first of these three regressions for all hitters with 300 PA or more in each of the previous three years, the second regression if the hitter had 300 PA in each of the previous two years but not three years ago, and the last of the three equations only for hitters who had 300 PA in the previous year alone.
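The selection rule can be sketched as a lookup over the three coefficient tables above. The dictionary keys, the function name `expected_babip`, and the sample hitter are all made up for illustration, and inputs are again assumed to be proportions that have already been weighted across the qualifying seasons.

```python
import math

# Coefficients copied from the three regression tables, keyed by the number
# of consecutive qualifying (300+ PA) prior seasons. Key names are hypothetical.
MODELS = {
    3: {"ld_rate": 0.277, "gb_rate": 0.091, "pu_rate": -0.378,
        "gb_babip": 0.177, "if_reach": 0.109, "of_babip": 0.181,
        "ln_hr_per_ab": 0.011, "ln_contact": 0.054, "constant": 0.200},
    2: {"ld_rate": 0.226, "gb_rate": 0.040, "pu_rate": -0.387,
        "gb_babip": 0.138, "if_reach": 0.104, "of_babip": 0.124,
        "ln_hr_per_ab": 0.007, "ln_contact": 0.023, "constant": 0.232},
    1: {"ld_rate": 0.236, "gb_rate": 0.077, "pu_per_fly": -0.116,
        "if_reach": 0.132, "ln_hr_per_ab": 0.003,
        "of_hits_per_bip": 0.218, "triples_per_ab": 0.605, "constant": 0.223},
}

def expected_babip(stats, qualifying_seasons):
    """Pick the regression matching the available history and evaluate it."""
    if qualifying_seasons < 1:
        raise ValueError("need at least one qualifying season")
    coeffs = MODELS[min(qualifying_seasons, 3)]
    inputs = dict(stats)
    inputs["ln_hr_per_ab"] = math.log(stats["hr_per_ab"])
    if "ln_contact" in coeffs:
        inputs["ln_contact"] = math.log(stats["contact"])
    return coeffs["constant"] + sum(
        value * inputs[name] for name, value in coeffs.items()
        if name != "constant")

# Hypothetical hitter with roughly average inputs.
hitter = {"ld_rate": 0.19, "gb_rate": 0.45, "pu_rate": 0.08,
          "gb_babip": 0.240, "if_reach": 0.10, "of_babip": 0.170,
          "hr_per_ab": 0.03, "contact": 0.80,
          "pu_per_fly": 0.08 / 0.36, "of_hits_per_bip": 0.170 * 0.28,
          "triples_per_ab": 0.005}

for seasons in (3, 2, 1):
    print(seasons, round(expected_babip(hitter, seasons), 3))
```

For this made-up hitter all three equations land within a few points of one another, just under .300; the tables differ mainly in how much weight they give a player's own track record versus the constant.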

Actually incorporating this model into a projection system would be a tricky endeavor, and with all the improvements going into PECOTA in 2010, this was not included in this year’s projections. However, it can be used by general managers and fantasy managers alike to better assess who is likely to outperform their projections. As other projection systems use similar amounts of data, this model serves as another approach to evaluate a few aspects of hitting skill.

Part 2 of this series will highlight the current BABIP Superstars, as I’ve called them before, showing the 10 highest BABIP projections for 2010 and the reasons why they are so high, and also the five lowest BABIP projections for 2010 and the reasons why they are so low. Part 3 will show the hitters where EBABIP and PECOTA BABIP differ the most and why you should trust each in different circumstances.

Excellent article. Thanks, Matt. I've been looking forward to E-BABIP since Idol last year. "Actually incorporating this model into a projection system would be a tricky endeavor..." but worth the effort. This could be the "next big step" in baseball analytics and projections. $$$. Kudos.
Really interesting article and analysis, Matt. Can't wait for Part 2. When will it be posted?
Part two should be tomorrow. Thanks!
Looking forward to part 2. Quick request - in addition to your write-ups for the ten best and five worst, would it be possible to link to an overall spreadsheet so readers can look at some of their own players of interest? Thanks!
Yeah, in part 3, there will be a google doc linked with all the E-BABIP projections. Thanks!
I probably should have posted this in the "Cult of Strike One" comments, but that is old and I'm not sure anyone would see it there. I just read an article on Manny Acta in which he appeared to say that first pitches had a BABIP of .070: "The favorite set of numbers in Acta's dossier, and the one that elicited the most wide-eyed reaction, showed that of the first-pitch strikes put into play, only seven percent resulted in hits." This just seems blatantly wrong. I looked it up on baseball reference and saw that the number was actually .303. That seems far more realistic. I like and respect Acta a lot. He loves sabermetrics and it's refreshing to see a manager use real numerical analysis. That's why this quote really jumps out at me. Was he misquoted? If the reporter misunderstood him, is it possible that his own players misunderstand him? Does he really think 0-0 BABIP is .070? How did he get to that number? I'm really curious about this.
Clearly that's wrong. What is true is that 3.5% of all first pitches are hits. Since batters swing at about 1/2 of pitches, and about 1/2 of pitches are over the plate, I'd assume that he could mean either "7% of all swings at first pitches are hits, while 93% are strikes or outs." or he could mean "7% of all pitches over the plate are put into hits, while 93% are strikes or outs."
Nice to see more thought being put into the idea of BABIP. Much appreciated!
Agree wholeheartedly. We've been stuck in "BABIP = luck" for a long time. I was surprised to see that liner rate doesn't have the highest correlation with BABIP. I'm looking forward to a conclusion on this one.
Thanks. The thing is that line drive rate DOES have a high correlation with BABIP that year, but line drive rate is not very persistent. So certainly a guy who has 19% line drives (league average) might have a BABIP of .300 (league average, holding everything else average), while if the same guy had 24% line drives, his BABIP would be .325. However, a guy with 24% line drives one year is probably going to have a 21% line drive rate the next year, which might correspond with a BABIP of .310. So 5% more line drives in 2009 indicates only .010 points more of BABIP in 2010.
Hi Matt, I'm interested in your thoughts on xBABIP which was formulated by Dutton and Bendix and described here: xBABIP was calculated off of batted ball data and other variables including plate discipline, contact rate, power, speed and handedness. Is it helpful to include such things or unnecessarily complicating?
xBABIP is a post-dictor/estimator, rather than a predictor of BABIP, which is a useful concept certainly (SIERA is a post-dictor/estimator of ERA), but serves a different purpose. More specifically, it uses same-year variables to predict that year's BABIP. E-BABIP predicts future BABIP based on batted ball breakdown. A lot of the original xBABIP model has been changed due to suggestions they got at the time. For instance, they realized that using home run rate to model power was better than pitches/XBH, mostly because it was not putting doubles and triples directly in the set of independent variables when it's supposed to be the dependent variable. The speed score is supposed to represent ability to get hits on infield ground balls or draw the infield in. However, to post-dict BABIP, you can't just include infield hits because obviously that's including some singles in the independent variables already. (That would be like including RBI doubles in an ERA estimator-- of course your ERA is higher, but really that's not what you're measuring.) However, to predict BABIP, it's okay to use historical infield hit rate, which is what I use (+ reaching on errors). Then you're simply asking how well the past predicts the future. I did use handedness in my regressions originally, but it did not do more than add or subtract .001 from overall BABIP and wasn't significant. The reason is that any of the ways that handedness would affect BABIP would have already affected GB-BABIP and OF-FB-BABIP in previous years. xBABIP was trimmed down a lot and adjusted later. The things like discipline had been introduced, and things like home run rate and pop-up rate were things I was already using in older models, and the lefty*(gb/fb) variable is really just modeling infield hit rate too. The real add-on, and data that I'd love to get my hands on and incorporate meaningfully, is that they introduced "spray" which basically measures how well the hitter sprays the ball across the field.
That would be useful to improve this too. I think that's the main contribution of xBABIP is proving that a wider distribution of batted balls across the field plays a major role in BABIP.
What a great series of articles! I was thinking that there were three variables that weren't adequately being considered. One of them was spray, which I'm glad to see was introduced to xBABIP. The other two are speed (I'm not convinced that triples is a reliable proxy) and fielding response. By this last one, I mean the extent to which the fielding team adjusts to the perceived hitting patterns of each hitter. Looked at the other way, what effect does a fielding shift have on a batter's BABIP? When the outfielders move back to defend against a home-run hitter, does that help or harm the hitter's BABIP?
Your stat makes a lot more sense. Now the question is does Manny Acta understand the real stat and was he misquoted or did he himself misunderstand the stat. Since it appears to be the central point in his argument, I think he might have misunderstood the stat himself. I don't think the real stat would create such a "wide-eyed reaction". I also think Indians' pitchers are taking the message to heart. Don't ever throw a ball on the first pitch. Strikes are always better. Nothing bad will ever happen. If they took this seriously, it would be a colossal mistake. I'm curious whether the Nationals had a significantly higher than average first pitch strike % last year. I'm going to look into it.
I looked up the numbers and the Nats pitchers threw a ball on the first pitch 42.1% of the time in 2009. It was 42.4% in 2008. The NL averages were 41.5% and 41.4%. So if Manny Acta is really spreading the "first pitch strike no matter what" message, he is either a very recent convert or the Nats pitchers didn't pick up on the message.
Except recent articles here suggest that "first pitch strike no matter what" is not the best method...
He wasn't saying it was right, he was saying Acta may have been spreading that thought.
Matt, do the same principles apply for pitchers? Does a pitcher's velocity, type of pitch thrown, perhaps handedness or park or altitude affect whether a pitcher can induce pop-ups or other kinds of outcomes? Perhaps it can explain the success of a pitcher with a mediocre strikeout rate but a high flyball ratio?
Check out the SIERA and BABIP article I wrote last Wednesday for a more detailed answer, but pretty much I think that pitcher BABIP is almost entirely explained by K, BB, and GB skills.
Awesome article.
I've only been subscribing to BP for about 2 years, but this is easily the best article that has been published in that time. Thanks.
Great article ... looking forward to PECOTA 2011 incorporating e-BABIP. Ahead of Thursday & Part III, can't you dangle some player names where there are large discrepancies between PECOTA BABIP and e-BABIP? No need to say whether the discrepancy is up or down ... in fact please don't disclose that - it'll be a challenge to guess the direction ahead of the article.
Okay, I'll pick a few that might be harder to guess: Casey Kotchman, Nick Punto, Martin Prado, Lyle Overbay. Take a look at those and see if you can guess whether I think PECOTA is high or low on them. They'll be in the article tomorrow along with 6-7 others IIRC.
If I read this right, your primary regression equation, when data is available from 3 years past, has an R^2 of .20? I ran a similar study using batted ball data from Fangraphs. I grouped the batted ball stats for each player (GB%, FB%, LD%, IFH%, IFFB%, HR/FB) into 3-year brackets from 2002-2008 for all hitters, with my response variable being, obviously, BABIP. The R^2 figure was 0.34, and the standard error was about 0.0185, but some of the P-stats were interesting. The intercept, as well as GB%, FB% and LD% P-stats were all higher than 0.6, while HR/FB, IFH% and IFFB% had P-stats all very nice and low. Then I set the intercept to 0, and got a highly suspicious R^2 of 0.99, but the standard error looked pretty good at 0.0185, or 18.5 BABIP points, and all P-stats had at least a couple zeros following the decimal point. While some of the p-stats and R^2s look suspicious, the standard errors look good indicating with 3 years of data a player is 65% likely to fall within 18.5 points of his regression prediction. What do you make of this?
My R^2 for the 3 years regression was 0.31, not 0.20. The reason that you are getting weird P-stats for GB%, FB%, and LD% is multicollinearity. They should all add up to 100% because Fangraphs includes Pop-ups and Outfield Flyballs in "Fly Balls", so one should be removed from the regression. One of them would actually have been removed automatically, except that, I'm guessing, Fangraphs' numbers are rounded, so they might have added up to 99.9% or 100.1% for some people. Pick one, take it out, and leave the other two in. Rerun the regression. Your R^2 is probably a little higher because you used Fangraphs' default PA, which is higher than 300. That makes your R^2 more exact for your dataset. Running mine on the same data set by restricting it to 500 PA gives me an R^2 of .38, for example. You're actually running almost the exact same information as I am, minus GB-BABIP, OFFB-BABIP, and CONTACT%, so you're going to get most of the way there anyway. It sounds like you did. Standard deviations of 18.5 points sound almost perfect, as they should, and naturally that means 65% should fall within 18.5. Running the regression with an intercept of 0 was an unnecessary sidebar. Setting the intercept equal to 0 means requiring the predicted value to be 0 when every variable is 0. But someone who theoretically has 0% GB, 0% FB, 0% LD, 0% IFH, 0% IFFB, and 0% HR shouldn't even have a BABIP. And doing so while dropping one of the three collinear variables (GB, FB, and LD) would be arbitrarily assuming that BABIP for someone with 100% of that batted-ball type is 0.