Last week, we talked a little bit about how we might assign credit and blame for different outcomes on a baseball diamond. For the most part, it turned out that the batter has much more control over the outcome of a plate appearance than the pitcher does. I only looked at six possible outcomes of a plate appearance: strikeout, walk, HBP, ground ball, line drive, and fly ball. For the three balls in play, I didn’t go any further than the fact that the batter hit, say, a ground ball. It might have made its way through the infield. It might have been scooped up and thrown to first for an out.
This week, let’s take a little deeper look into what happens after the batter hits the ball and how much he and the pitcher control what happens to it afterward.
Warning! Gory Mathematical Details Ahead!
If a plate appearance ends in a strikeout, walk, or HBP, it just ends right there. We know what the next game state will be (so long as we know the previous one). The number of outs generated is fixed. The runners move in a prescribed manner (unless there’s some sort of SB attempt, something that we can address later). With a ball in play, there are more variables to take into account. Classic DIPS theory suggests that a pitcher has little to no control over what happens to a ball once it leaves the bat and that all pitchers are essentially true talent .300 BABIP giver-uppers. Now, the extreme position on that has been batted back a bit over the years. The evidence shows that it’s not quite that simple.
Let’s start with the knowledge that the ball has been struck and the batter has hit a fly ball, and assume for a moment that’s all we know. From last week, we know that the batter gets 42.6 percent of the credit for the fact that it’s a fly ball, the pitcher gets 51.2 percent, and the rest (6.2 percent) goes to the “league,” which is a way of saying that sometimes fly balls just happen. A fly ball, though, is a bundle of cork, twine, leather, and possibilities. In 2014, there were 37,559 plate appearances that ended in what Retrosheet classified as either a fly ball or a pop up. In 84.7 percent of those, the ball ended up in someone’s glove for an out. In 0.3 percent of those cases, the ball ended up in someone’s glove… and then jumped out of his glove for an error. There were 1,103 singles on fly balls (2.9 percent of the time), 1,410 doubles (3.8 percent), 242 triples (0.6 percent), and 2,873 home runs (7.6 percent).
Now, that fly ball will eventually do one of those things. But at the moment it leaves the bat, we don’t know what. So, we’re going to use the idea of expected value. We can generate a run value for a fly out, and a single, and a double, etc. and multiply by the various percentages that we found above and get the expected value of a batter hitting a fly ball. That’s important, because we know that fly balls have generally different outcomes than ground balls, so we want to give a batter and pitcher credit for what they hit. Also, what happens to the fly ball is going to depend on a few other factors. Namely, the fielders. They didn’t help the fly ball get made, but they’re going to have a say in what happens afterward.
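For anyone who wants to see the expected-value arithmetic spelled out, here is a minimal sketch in Python. The outcome counts and the out and error percentages are the 2014 figures quoted above. The run values are NOT from this article: they are made-up placeholders, and `ILLUSTRATIVE_RUN_VALUES` and `expected_run_value` are hypothetical names, so treat the printed number as an illustration of the method and nothing more.

```python
# Fly-ball and pop-up outcomes from 2014, as quoted in the text.
TOTAL_FLY_BALLS = 37_559
hit_counts = {"single": 1_103, "double": 1_410, "triple": 242, "home_run": 2_873}

# Outs and errors are given in the text only as percentages.
probabilities = {"out": 0.847, "error": 0.003}
probabilities.update(
    {outcome: count / TOTAL_FLY_BALLS for outcome, count in hit_counts.items()}
)

# HYPOTHETICAL run values (runs relative to average), for illustration only.
# Swap in whatever run values you trust.
ILLUSTRATIVE_RUN_VALUES = {
    "out": -0.27,
    "error": 0.45,
    "single": 0.45,
    "double": 0.75,
    "triple": 1.05,
    "home_run": 1.40,
}


def expected_run_value(probs, run_values):
    """Probability-weighted average run value.

    The probabilities are renormalised so that rounding in the quoted
    percentages does not leak into the result.
    """
    total = sum(probs.values())
    return sum(p / total * run_values[outcome] for outcome, p in probs.items())


fly_ball_value = expected_run_value(probabilities, ILLUSTRATIVE_RUN_VALUES)
print(f"Expected run value of a generic fly ball: {fly_ball_value:+.3f}")
```

The same function would work for ground balls or line drives once you have their outcome shares.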
Now that we know we have a fly ball, we should ask whether the defense was needed to defend it at all. In baseball, fly balls that aren’t able to be caught because they sail over the wall have a particular importance within the game. One thing that researchers have noticed is that while a pitcher’s home run rate is somewhat stable from year to year, the percentage of fly balls that a pitcher allows to go over the wall is not. In fact, it seems that about 9-10 percent of all fly balls become home runs and pitchers all seem to regress heavily back toward that mean from season to season. When we say a pitcher is homer prone, it’s mostly because he gives up a lot of fly balls and eventually, probability catches up with all of us.
I used a similar method to last week. I used data from 2010-2014 for this one and selected out all of the pop ups and fly balls and then coded them for whether they went for a home run or not. I took everyone’s HR/FB rate, both from the batter’s perspective and the pitcher’s perspective and converted those probabilities into logged-odds ratios. I also took the league average for the year and used each as a predictor within a binary logistic regression at the plate appearance level. I looked to see how much variance each actor accounted for (once again, for the super-initiated, using -2log likelihood contributions as my measure). The results? Whether or not a fly ball ends up as a home run is 78.8 percent the fault of the batter, 19.7 percent the fault of the pitcher, and 1.6 percent the fault of the league average. (Yes, I know, 100.1 percent… rounding). The pitcher actually bears a little more responsibility than the batter for giving up the fly ball. The batter is the one who hits the home run. No wonder HR/FB rate is so subject to random year-to-year variation.
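For the super-initiated, here is a rough sketch of what that kind of analysis can look like. Everything in it is an assumption made for illustration: the data are simulated (not Retrosheet), the function names are hypothetical, and splitting credit by how much the -2 log likelihood worsens when each predictor is dropped is only one plausible reading of "-2log likelihood contributions," not necessarily the exact procedure behind the numbers above.

```python
import numpy as np


def log_odds(p):
    """Convert a probability (e.g. a HR/FB rate) into logged odds."""
    return np.log(p / (1.0 - p))


def fit_logistic(X, y, iterations=25):
    """Fit a logistic regression by Newton-Raphson and return its -2 log likelihood."""
    design = np.column_stack([np.ones(len(y)), X])  # intercept + predictors
    beta = np.zeros(design.shape[1])
    for _ in range(iterations):
        p = 1.0 / (1.0 + np.exp(-design @ beta))
        gradient = design.T @ (y - p)
        hessian = design.T @ (design * (p * (1.0 - p))[:, None])
        beta = beta + np.linalg.solve(hessian, gradient)
    p = 1.0 / (1.0 + np.exp(-design @ beta))
    return -2.0 * np.sum(y * np.log(p) + (1.0 - y) * np.log(1.0 - p))


def deviance_shares(X, y, names):
    """Share of credit per predictor, based on the rise in -2LL when it is dropped."""
    full = fit_logistic(X, y)
    drops = {}
    for j, name in enumerate(names):
        reduced = fit_logistic(np.delete(X, j, axis=1), y)
        drops[name] = max(reduced - full, 0.0)
    total = sum(drops.values())
    return {name: drop / total for name, drop in drops.items()}


# SIMULATED fly balls: every rate below is invented for the example.
rng = np.random.default_rng(0)
n = 20_000
batter = log_odds(rng.uniform(0.03, 0.20, n))           # batter HR/FB rates
pitcher = log_odds(rng.uniform(0.06, 0.13, n))          # pitcher HR/FB rates
league = log_odds(rng.choice([0.090, 0.095, 0.100], n))  # yearly league rates

# Simulated truth: batter and pitcher tendencies combine relative to the league.
true_logit = batter + pitcher - league
went_out = (rng.uniform(size=n) < 1.0 / (1.0 + np.exp(-true_logit))).astype(float)

predictors = np.column_stack([batter, pitcher, league])
shares = deviance_shares(predictors, went_out, ["batter", "pitcher", "league"])
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Because the simulated batters vary more in HR/FB rate than the simulated pitchers do, the batter ends up with the larger share here, which is the same mechanism the real numbers point to.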
But let’s tackle a tougher problem, and we’re going to do it with Retrosheet data. What about fly balls that don’t leave the yard? Once we dispatch all the home runs out of the data set, how well can pitchers and hitters steer the ball toward or away from the fielders? Thankfully for us, from 1993-1999, Retrosheet data contain some rudimentary batted ball locations for most balls hit. They used a grid diagram system that at least told us whether the ball was hit to deep left field or shallow right field. The granularity of the data is not amazing, and we have to treat them as a bit suspect because we have no quality control on them, but they give us a starting place.
Now, since we’re trying to figure out the contributions of the batter and pitcher in this equation, we want to leave the fielders out of this. Suppose that a batter hits a fly ball to the gap which most of the time would go for a double or triple. Except Juan Lagares happens. The batter, instead of celebrating an extra-base hit, gets to walk back to the dugout. The batter should get some sort of credit for hitting most of a double, but he doesn’t. A week later, he might get it back when he hits a lazy fly that someone just lets drop to the ground, but karma isn’t always so precise in baseball.
I started by looking at the “out” rate for fly balls in each zone in the Project Scoresheet data, again from 1993-1999. I then calculated each hitter’s expected BABIP on fly balls based on where he hit them, rather than on whether or not they were actually caught. I did the same for each pitcher. I also took the league BABIP on fly balls during those years. As above, it all went into a logistic regression predicting whether the ball was caught or not. This gives us an idea of whether a ball being caught is more associated with a batter’s ability to hit the ball away from fielders or a pitcher’s skill at getting the ball hit toward his fielders. The result: the batter controlled 48.1 percent, the pitcher controlled 32.5 percent, and the league background noise got the remaining 19.4 percent. There’s a lot of randomness when it comes to catching fly balls.
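To make the expected BABIP idea concrete, here is a small sketch. The records, zone labels, and player names are all invented for the example (real Retrosheet location codes look different), and `zone_hit_rates` and `expected_babip` are hypothetical helper names. The point is only the two-step logic: get a league-wide hit rate for each zone, then credit each batter with the zone rate for every ball he hit there, regardless of whether that particular ball was caught.

```python
from collections import defaultdict

# HYPOTHETICAL toy records: (batter, zone, was_caught).
fly_balls = [
    ("batter_a", "deep_left", False),
    ("batter_a", "deep_left", True),
    ("batter_a", "shallow_right", True),
    ("batter_b", "shallow_right", True),
    ("batter_b", "shallow_right", False),
    ("batter_b", "shallow_right", True),
    ("batter_b", "deep_left", True),
    ("batter_b", "shallow_right", True),
]


def zone_hit_rates(records):
    """League-wide share of balls in each zone that were NOT caught."""
    caught = defaultdict(int)
    total = defaultdict(int)
    for _, zone, was_caught in records:
        total[zone] += 1
        caught[zone] += int(was_caught)
    return {zone: 1.0 - caught[zone] / total[zone] for zone in total}


def expected_babip(records, rates):
    """Average zone hit rate over each player's balls, ignoring actual outcomes."""
    credit = defaultdict(float)
    balls = defaultdict(int)
    for player, zone, _ in records:
        credit[player] += rates[zone]
        balls[player] += 1
    return {player: credit[player] / balls[player] for player in balls}


rates = zone_hit_rates(fly_balls)
expected = expected_babip(fly_balls, rates)
for player, value in sorted(expected.items()):
    print(f"{player}: expected BABIP on fly balls = {value:.3f}")
```

Keying the records on the pitcher instead of the batter gives the pitcher's version of the same number.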
Let’s look at ground balls as well. Similar to the method I used above, I looked at the ground ball locations provided by Retrosheet from 1993-1999 and calculated an expected BABIP for both pitcher and hitter. (Actually, I looked at the chances that an infielder was the one who fielded the grounder. Whether he converted it into an out is a different story.) The batter gets 46.6 percent of that credit, the pitcher gets 40.6 percent, and the background noise gets 12.7 percent.
For line drives, I looked at whether an infielder would be the one who fielded the ball, which would generally require either a ball hit right at someone or a great play. Sure enough, batters only accounted for 18.7 percent of the variance, pitchers for 20.9 percent, and background noise for a whopping 60.2 percent. That tells us what we already sorta knew: when a line drive is caught on the infield, it’s no one’s fault really. Sometimes, you just hit it right at someone. For liners that make it to the outfield, it’s a different story. I looked at whether those would be caught (again, using an expected BABIP framework based on the Retrosheet zone information). Here, the credit for whether the ball is hit to an area where it is normally caught goes 46.9 percent to the batter, 41.4 percent to the pitcher, and 11.6 percent to background noise.
In general, it looks like the ability to steer balls away from fielders (or not) is slightly more in the control of the batter than the pitcher, but only slightly, and there’s a lot of just random noise in there. Now, somewhere out there, someone is thinking “Well, I’ve been told through the years that the pitcher has almost no control over where a batted ball lands. Are you telling me that we should actually give him half the credit?” Not exactly.
The secret lies in what’s known as variance partitioning. (It’s about to get really #GoryMath in here.) I’ve been figuring out how much blame to assign to the pitcher, the batter, and what we might call the “era” or “year” effect relative to each other. If you look at the overall variance accounted for within the logistic regressions I set up, they tell a different story. Logistic regressions don’t have a true R-squared statistic like your typical linear regression, but there is the Nagelkerke pseudo-R-squared. Suppose we ask the question “How good are we at predicting whether this particular PA will end in a strikeout, given those three factors?” For outcomes like strikeouts and walks, the answers are somewhere in the neighborhood of 6 or 7 percent. That might seem low, but really there are no matchups in baseball where anyone can be absolutely certain of a strikeout. It doesn’t mean that we’re going to be right 7 percent of the time. It means that by accounting for the batter’s, pitcher’s, and league’s tendencies to strike out overall, we’ve got about 7 percent of the recipe. The rest is a mishmash of other factors and dumb luck. When we look at the question “How good are we at predicting whether this ground ball will be fielded by an infielder?” the answer drops to a pseudo-R-squared of less than 1 percent. It was that way for most of the other ball in play outcomes. There’s just a lot more randomness overall in balls in play.
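For reference, the Nagelkerke pseudo-R-squared is the Cox and Snell pseudo-R-squared rescaled so that a perfect model scores 1. Here is a minimal sketch of the standard formula; the function name and the example inputs are made up for illustration and are not numbers from the regressions above.

```python
import math


def nagelkerke_r2(ll_null, ll_model, n):
    """Nagelkerke pseudo-R-squared.

    ll_null  -- log likelihood of the intercept-only model
    ll_model -- log likelihood of the fitted model
    n        -- number of observations (plate appearances)
    """
    # Cox & Snell: 1 - (L_null / L_model) ** (2 / n), written with log likelihoods.
    cox_snell = 1.0 - math.exp(2.0 * (ll_null - ll_model) / n)
    # Largest value Cox & Snell can reach, hit when the model is perfect (ll_model = 0).
    max_cox_snell = 1.0 - math.exp(2.0 * ll_null / n)
    return cox_snell / max_cox_snell


# HYPOTHETICAL inputs, just to show the call.
example = nagelkerke_r2(ll_null=-5000.0, ll_model=-4800.0, n=10_000)
print(f"Nagelkerke pseudo-R-squared: {example:.3f}")
```

If your software reports -2 log likelihood instead, divide it by -2 before passing it in.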
DIPS is still mostly right… It was just half right
Here we’re able to use some rudimentary data to ask the question of whether batters and pitchers bear much responsibility for steering the ball toward the fielders or away from them. We see from the data that what blame is out there is split about equally between the batter and pitcher, but we learn that there is a lot of randomness. To say that the batter and pitcher have absolutely no control over the fate of a batted ball (once it leaves the bat) is silly, but we see that they don’t have all that much in the grand picture. The further the ball gets from the bat (and the more things it has to bounce off) the more randomness there is in the outcome.
But if we look closely, there’s another important lesson. We’ve always assumed that pitchers had no control over what happened to a batted ball. But looking closely, we see that they generally have about as much control as the batter, at least relative to each other. So, if pitchers have “no” control over batted balls, and batters have the same amount, then by the transitive property, batters have “no” control over batted balls either? Last week we learned that the kind of ball that comes off the bat is much more in the control of the batter, and we already know that certain types of balls in play (line drives, especially) are more likely to go for hits. But it seems that the batter’s (relative) control over the process stops when the ball leaves his bat. Where the ball goes, at least relative to where the fielders are, is mostly a bit of randomness. If we see consistency over time in a batter’s BABIP, it’s not because he’s good at steering the ball but because he’s better at squaring it up, and it matters more that he is. DIPS was half right. Pitchers don’t have a lot of control over what happens when the ball leaves the bat. Neither do hitters.