When last we met, we were playing the blame game, and specifically asking the question, when a batter strikes out, who is responsible? The batter? The pitcher? Random noise? It turns out that after some numerical gymnastics we find that, on a strikeout, a batter deserves about 56 percent of the blame (from his perspective anyway), while the pitcher gets about 43.3 percent. Background noise from the league takes up the remaining 0.7 percent. Today, we’ll take a look at some of the other (more complicated) events in baseball, but if you haven’t read part I (which lays out the methodology I’ll be using), now would be a good time to go back and do so.

Consider the following situation. In the top of the sixth inning of a tie game, Ervin Santana (whose real first name is Johan… it’s hard enough having two Chris Youngs and two Tony Penas in baseball… thank you Ervin for helping us out) is pitching with Ichiro Suzuki on first and no one out. Chone Figgins steps up and whacks a ground ball to the right side off his former teammate. Howie Kendrick makes a play for it, but it scoots just out of the reach of his glove and out into right field for a single. As right fielder Bobby Abreu picks up the ball, he notices that Ichiro is rounding second and thinking third. Abreu comes up throwing, but Ichiro’s clearly got third, and the ball is cut off wisely by Erick Aybar (don’t call me Willy!). The Mariners have first and third with no one out.

How can we parse out who gets what credit in that situation? The problem with measures such as WPA as they are practiced now is that Figgins (the batter) will get all of the credit for the play, and Santana will take the entire hit for the play. Ichiro gets no love for "stealing" third on the single, nor does Abreu have to face the music for having a noodle arm in right field. If Kendrick had more range at second base, he might have gotten that ball and the play would have turned out very differently.

Can we hold Santana responsible for the fact that his second baseman didn’t get there? In fact, Santana did a pretty good job and got Figgins to hit a ground ball, which might have become a double play (it is Figgins running, so probably not. But you never know).

How to be a little more equitable? Let’s dive in.

Breaking It Down

Consider what can happen in the course of a plate appearance. Leaving aside very rare events (catcher’s interference, a strikeout with a dropped third strike so that the batter ends up on first), there are four basic outcomes: a walk, a strikeout, a hit batsman, or a "contact event," which is an event involving bat hitting ball. (Because someone will ask, I’m including home runs in "contact events.") The first three are negotiated exclusively between batter and pitcher, with no input from the defense, and they also end the plate appearance with very predictable results. These are the easy ones. Because the only actors that we have to worry about are the batter and pitcher, the calculations are fairly quick. Indeed, we did the calculations for strikeouts last week. I ran similar calculations for walks and HBP. The breakdown is as follows:

Outcome   Batter Control   Pitcher Control   Background Noise
K         56.0%            43.3%             0.7%
BB        63.3%            35.8%             0.9%
HBP       62.2%            36.6%             1.2%

Then, there are those situations that involve the ball going into play. These are the ones that will cause us the most grief, because now we have to deal with the batter, pitcher, any runners on base, and the fielders. Let’s take the example above. Figgins has hit the ball into play. In general, a plate appearance ending with a contact event breaks down into 60.3 percent batter, 39.1 percent pitcher, and 0.4 percent background noise, about on par with the K, BB, and HBP numbers above.

Now, before the at-bat as described above, the Mariners have a win expectancy of 51.9 percent, so the Angels can expect to win 48.1 percent of the time. That’s about to change.

Once the ball is in motion off the bat, it can be a ground ball, a line drive, or a fly ball. For those of you who have followed what Colin Wyers has written on the subject, you know that there are problems aplenty with batted-ball classifications, but for now, we’ll trust the available data files. Let’s parse out responsibility for the fact that this ball in play is a worm-burner. Because we are still only dealing with the pitcher and batter, the calculations are a little easier. Using the same framework I used in Part One of this series, the variance in a ground ball is generally 49.1 percent controlled by the batter, 50.2 percent by the pitcher, and 0.7 percent background noise. It’s a pretty even split between batter and pitcher as to how that ground ball got there.

Knowing that the ball is a grounder changes the game state. Now we’ve gone from runner on first with no out to runner on first with no out and a live ground ball to the right side. This could end in a few different ways. It could be a fielder’s choice, a double play, a regular old out at first, or a base hit of some sort.

In 2009, ground balls that were eventually fielded by the first baseman, second baseman, or right fielder while there was a runner on first and fewer than two outs ended in the following configurations. In parentheses afterward is the resulting win expectancy for that set of circumstances from the Mariners’ point of view:

29.5% double play (42.1%)
23.4% put out at first (49.3%)
20.2% fielder’s choice (48.0%)
15.4% single, runner to 2nd (60.3%)
8.6% single, runner to 3rd (70.1%)
1.5% double, runner to 3rd (69.5%)
0.4% double, runner to home (71.9%)
0.4% double, runner thrown out (49.3%)

There were a few other miscellaneous outcomes. We can take a look at these outcomes and average out what the expected change in win expectancy is given that all we know is that there’s a ground ball headed to the right side. It turns out that hitting a ground ball to the right side (before we know what will become of it), produces an expected win probability of 50.9 percent for the Mariners (down from 51.9 percent above). By hitting a ground ball, Figgins has actually taken away from his team’s chances of winning, at least in the aggregate.
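The averaging step described above can be sketched in a few lines of Python. This is only an illustration using the eight outcomes listed in the article: those outcomes cover 99.4 percent of plays, and renormalizing over them (that is, ignoring the unlisted miscellaneous outcomes) is an assumption of this sketch, so the result lands near the 50.9 percent quoted above rather than exactly on it. The names `OUTCOMES` and `expected_win_expectancy` are made up for the example.

```python
# Outcome shares and Mariners win expectancies as listed in the article.
# Each entry: (label, share of outcomes in percent, win expectancy in percent).
OUTCOMES = [
    ("double play", 29.5, 42.1),
    ("put out at first", 23.4, 49.3),
    ("fielder's choice", 20.2, 48.0),
    ("single, runner to 2nd", 15.4, 60.3),
    ("single, runner to 3rd", 8.6, 70.1),
    ("double, runner to 3rd", 1.5, 69.5),
    ("double, runner to home", 0.4, 71.9),
    ("double, runner thrown out", 0.4, 49.3),
]


def expected_win_expectancy(outcomes):
    """Probability-weighted mean win expectancy over the listed outcomes.

    Assumption: the unlisted miscellaneous outcomes are ignored, so the
    listed shares are renormalized by dividing by their total.
    """
    total_share = sum(share for _, share, _ in outcomes)
    weighted = sum(share * win_exp for _, share, win_exp in outcomes)
    return weighted / total_share


listed_share = sum(share for _, share, _ in OUTCOMES)
expected = expected_win_expectancy(OUTCOMES)
print(f"listed outcomes cover {listed_share:.1f}% of plays")
print(f"expected win expectancy: {expected:.1f}%")
```

The gap between this figure and the article's 50.9 percent would be explained by the miscellaneous outcomes left out of the list, though that is a guess rather than something the data here can confirm.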

Now, since Figgins is only halfway (well, 49.1 percent) responsible for the fact that the ball in play was a grounder, we’ll debit his WPA account by 0.491 percent, rather than the full 1.0 percent that the Mariners’ win expectancy has fallen. The rest of it goes to Santana (0.502 percent) and background noise/dumb luck (0.007 percent).

Right now, we have a live ground ball and no idea what happened to it. What happens when we unclick the pause button? The next step will be to see whether Kendrick can get to the ball, but it means we have another actor involved.

A batter has every incentive to steer the ball away from the fielder, if he can. The pitcher has every incentive to induce the batter to hit the ball toward a fielder, if he can. Clearly, the second baseman wants to get to the ball, if he can. So who has the most say over whether a grounder to the right side will end up being fielded? I looked at all grounders which were eventually fielded by the right fielder, the center fielder (we’re looking at the second baseman’s range, so we need to look at balls up the middle as well), or the second baseman. If the ball made it to the outfield, I gave the second baseman a debit of half a play, as he splits the blame 50/50 with the first baseman on a single to right, and with the shortstop on a single to center. (If I wanted to, I could get a little more fine-grained here as well… but that will have to wait for another day. Today, we just need a rough-range measure). For each ball that he made it to (whether he completed the play or not), he got a credit of one play. For the batter and pitcher, I took the percentage of such balls that were fielded by an infielder vs. an outfielder that happened on their watch.

The results were that the second baseman shoulders 3.8 percent of the responsibility for getting to the ball, the batter 52.6 percent, and the pitcher 43.0 percent. (The remaining 0.6 percent is background noise.) These numbers might seem a little surprising to some. In general, the story of DIPS has been that once the ball leaves the bat, the pitcher is absolved of almost all blame for that which his fielders do, as he cannot control them. More and more, I’ve come to view this line of reasoning with suspicion.

I would suggest a slightly different way to look at the traditional DIPS interpretation. The pitcher controls about 43 percent of the variance in whether the fielder will get to the ball (on a grounder to the right side, anyway), which is not the majority of the variance, but it dwarfs (by a factor of more than 10!) the variance explained by the fielder. It means that to some extent, the pitcher is going to have his BABIP dictated by the strength of the opposition batters, but that it’s his job to induce light contact. A pitcher who has a talent for sawing off bats and inducing weak grounders will have a lower BABIP. It does help to have a second baseman with some range, but it seems that the second baseman is much more the victim of what the pitcher sends his way, rather than the other way around. [/rant]

In any case, the ball got through. Before we knew what would become of that ground ball, we figured that the Mariners had a 50.9 percent chance of winning. Now that the ball is through and it’s clearly a single, but before Ichiro takes off for third, we have first and second with no out, still in a tie game. The Mariners can expect to win 60.3 percent of the time, an increase of 9.4 percent. Again, Figgins gets 52.6 percent of that credit (so a nice credit of 4.9 percent), Santana gets dinged 4.0 percent, and Kendrick gets a 0.4 percent debit. (The final 0.1 percent goes in the dumb luck bin.)

Then Ichiro makes his break for third, successfully, taking Seattle’s win probability from 60.3 percent to 70.1 percent, another increase of 9.8 percent. It seems especially odd that the pitcher or batter should bear much responsibility for this turn of events. In some sense, this is now a confrontation between the baserunner and right fielder.

But let’s take a look at what the numbers say. Maybe some hitters are good at placing the ball in such a way that the runner can more easily make it to third. I figured out the percentage of times that a runner had gone first-to-third safely on a single to the right, how often the right fielder had been so victimized, and how often it had happened on the batter’s watch and the pitcher’s watch. Which actor had the most to say about such situations? The results might surprise you:

Batter 9.2%
Pitcher 39.4% (sic)
Right Fielder 14.0%
Baserunner 26.2%
Noise 11.2%

The pitcher has a lot to do with whether that runner makes it to third on that single. Consider that pitchers who give up sharp grounders or line-drive singles are probably more likely to give up singles, but they get to the right fielder faster. It might also have something to do with him having a good (or bad) pickoff move and keeping the runner closer (or letting him get a bigger lead). As we might have guessed, the runner deserves much more credit than the batter for his exploits on the bases. What might jump out though is that the noise component is rather large, compared to what we’ve seen so far, taking up 11.2 percent of the variance.

Coming back to our example, the 9.8 percent jump that comes with Ichiro going to third can be split up as 0.9 percent to Figgins, 3.9 percent to Santana, 1.4 percent to Abreu, 2.6 percent to Ichiro, and 1.1 percent to noise. (Someone just added those and got 9.9 percent… rounding error.)
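For anyone who wants to check the rounding remark, here is a small Python sketch of that split. It simply multiplies the 9.8-point swing by the first-to-third shares from the list above; the dictionary keys and variable names are invented for the illustration.

```python
# Shares of the first-to-third outcome, taken from the list in the article.
SHARES = {
    "Figgins (batter)": 0.092,
    "Santana (pitcher)": 0.394,
    "Abreu (right fielder)": 0.140,
    "Ichiro (baserunner)": 0.262,
    "noise": 0.112,
}

# Swing in Mariners win expectancy when Ichiro takes third (percentage points).
SWING = 70.1 - 60.3

# Exact share of the swing for each actor, then the one-decimal version.
exact = {actor: SWING * share for actor, share in SHARES.items()}
rounded = {actor: round(points, 1) for actor, points in exact.items()}

exact_total = sum(exact.values())
rounded_total = round(sum(rounded.values()), 1)

for actor in SHARES:
    print(f"{actor:22s} {exact[actor]:.4f} -> {rounded[actor]:.1f}")
print(f"sum of exact values:   {exact_total:.1f}")
print(f"sum of rounded values: {rounded_total:.1f}")
```

The exact pieces add back up to the 9.8-point swing; the extra tenth only appears after each piece is rounded to one decimal place.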

Summing It All Up

In this one play, the Seattle Mariners saw their win expectancy go from 51.9 percent to 70.1 percent, for a total swing of 18.2 percent. But by breaking things down, we find a better way to chop up the credit than simply assigning everything to Figgins who had the good fortune to be the batter:

  • Figgins gets a negative 0.5 percent for his role in hitting a ground ball, but a positive 4.9 percent for his part in steering the ball through the hole, and 0.9 percent for hitting the ball in such a way as to make it easier for Ichiro to go to third. His total WPA contribution is 5.3 percent.

  • Santana gets a positive 0.5 percent for inducing the initial ground ball, but a negative 4.0 percent for not making it a more field-able grounder and 3.9 percent off for his role in letting Ichiro go to third. He "helps" the Mariners win an additional 7.4 percent of the time.

  • Kendrick and Abreu take hits of 0.4 percent and 1.4 percent respectively in WPA, the former for not getting to the ball in the first place, and the latter for neither throwing Ichiro out nor holding him at second.

  • Ichiro gets a WPA credit of 2.6 percent for going to third.

  • 1.1 percent of the credit goes to dumb luck.

So, the most valuable "Mariner" on the play was Santana. Figgins, who otherwise would have gotten a full 18.2 percent WPA credit, gets a mere 5.3 percent. Most of the bounty that Figgins receives under current practice he owes to members of the Angels (9.2 percent), his teammate (2.6 percent), and background noise (1.1 percent). Even if we focus on actual members of the Mariners, Figgins and Ichiro should split the credit about two-thirds to one-third for their relative contributions.
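The whole accounting can be written out as a small ledger. What follows is a rough Python sketch, not the actual program behind these numbers: it takes the three steps of the play (ground ball, ball gets through, runner to third), multiplies each swing in Mariners win expectancy by the responsibility shares quoted above, and adds up the pieces per actor. The names `STEPS` and `build_ledger` are hypothetical. Because it sums unrounded values, a couple of totals differ by a tenth from the bullets above, which add up step values already rounded to one decimal (Figgins comes out near 5.36 rather than 5.3, and noise near 1.15 rather than 1.1).

```python
from collections import defaultdict

# Each step: (win expectancy before, win expectancy after, {actor: share}).
# Win expectancies are for the Mariners, in percent; shares are from the article.
STEPS = [
    # The ball in play is a ground ball to the right side.
    (51.9, 50.9, {"Figgins": 0.491, "Santana": 0.502, "noise": 0.007}),
    # The grounder gets past the second baseman for a single.
    (50.9, 60.3, {"Figgins": 0.526, "Santana": 0.430,
                  "Kendrick": 0.038, "noise": 0.006}),
    # The runner goes first to third on the single.
    (60.3, 70.1, {"Figgins": 0.092, "Santana": 0.394, "Abreu": 0.140,
                  "Ichiro": 0.262, "noise": 0.112}),
]


def build_ledger(steps):
    """Sum each actor's share of every swing, in points of Mariners win expectancy.

    A positive total means the actor's part in the play helped the Mariners,
    whichever team he plays for.
    """
    ledger = defaultdict(float)
    for before, after, shares in steps:
        swing = after - before
        for actor, share in shares.items():
            ledger[actor] += swing * share
    return dict(ledger)


ledger = build_ledger(STEPS)
total = sum(ledger.values())

for actor, points in sorted(ledger.items(), key=lambda item: -item[1]):
    print(f"{actor:9s} {points:+.2f}")
print(f"{'total':9s} {total:+.2f}")
```

Adding a fourth step, or another actor within a step, only means appending to `STEPS`, as long as the shares within each step sum to one.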

What It Means

It should be pretty obvious that programming an algorithm that would cover all (or at least most) situations that actually happen in a baseball game is a big project. This one play took me hours. But it’s just an engineering problem, and my hope is that this work will provide a decent blueprint on how to start. But more than that, even in this one example, it should become clear that the current cultural practice of giving the lion’s share of the credit to the pitcher and batter for what happens during a plate appearance is silly. There are other actors and they should be noted and given credit (or blame) as is due. My ultimate hope is that we can begin to look at WPA through a slightly more nuanced view and incorporate the baserunning and fielding components that have been ignored for so long.

There are a few other lessons to be gathered here. One is that win probability and other related measures can show and incorporate the idea of "tipping your cap to the other guy." A batter who hits a line drive but is the victim of an outstanding diving play has nothing for which he should be ashamed. He hit it hard and was unlucky; sometimes that happens. We might not be able to completely capture that numerically, but this methodology can begin to tease some of it apart. Then there’s also the bad luck that comes from not being able to pick which pitcher you face. A batter who strikes out against Roger Clemens in his prime isn’t so different from the rest of his peers, and the strikeout may have been more to the credit of Clemens than the debit of the batter. Sometimes you do everything right, and the other guy does it better.

Then there is the unexpected lesson about DIPS. These numbers suggest that the pitcher has much, much more control over what happens to a ball in play than we have previously believed. It’s true that BABIP is a measure that takes a while to stabilize on an individual level over what amounts to a small sample size. It’s proper to say that a pitcher’s BABIP in one year doesn’t tell you much about what’s going to happen next year. But, using a few statistical tricks to pump up the sample size, an inconvenient truth (am I allowed to use that?) emerges. For a long time, we’ve allowed pitchers to get away with just about everything once the ball left the bat. It appears now that the result of a play does indeed depend a great deal on the pitcher.

harderj
4/12
I liked this article a lot, and think that breaking down contributions, positive or negative, is a sophisticated way to look at on field performance.

I'm just glad when I did my dissertation there was "only" Boswell's Total Average and James's Runs Created!
surfdent48
4/12
Great article. However I feel Ichiro's credit and Figgins' credit would not exist if the ground ball hadn't "found a hole" and gotten through. To me the biggest factor in the ball finding "the hole" is the speed of the grounder. For this the pitcher does deserve much of the blame for the lack of quality of the pitch. Still, this smash grounder could have been hit right at the fielder. (Luck) Also many times a 4 or 5 bouncer finds a hole on a great pitch. (Luck) My summary?: There's a MUCH MUCH greater aspect (more than a minuscule 1.1%) of dumb luck to all of this.
bravejason
4/12
How are the umpires accounted for? In the noise? Assume perfect umpires (inaccurate, of course, but I can appreciate a well timed simplifying assumption)? This would apply both to the non-contact and contact events.

Also, I didn't see anything about errors with respect to contact events. It would be interesting to look at how credit and blame should be assigned when a fast runner "forces" an error by a hurried defender. You could obviously look at the flip side too: a slow runner who gives the defense extra time to complete the play. Maybe it is time to develop a speed-neutral OBP.
nateetan
4/12
Yeah, I was wondering about the lack of reference to errors in the outcomes of groundballs hit to the right side.
pizzacutter
4/12
The outcomes on the ground balls include some errors. When I say "single, runner to second" that might just be a muffed grounder that went for an error. Either way, it doesn't matter. The point is that the end state was 1st and 2nd, no out. That's what I want to model.

Umpires aren't in there. There are a number of factors that in theory could be added, but then you overload the regression.
TheRedsMan
4/12
Great article, Russell. However, I've always interpreted DIPS theory a bit differently than you've suggested here. It's not that pitchers have relatively little influence over batted ball outcomes.

Rather, it's that the difference in ability between established major league pitchers to influence batted balls is much, much smaller than the difference in their ability to strike batters out, walk batters (or to not do so), and to not allow home runs. Or perhaps more accurately, because there is a relatively small set of skills that drive a pitcher's influence on PA outcomes and because of the preceding point, once you control for strikeouts, walks, and homers, you've pretty much already captured all of the information you can about his ability to influence PA outcomes.

It makes logical sense as well. If a pitcher is particularly talented at getting players to hit the ball poorly, he's probably pretty good about missing bats altogether and/or keeping the ball in the yard. I think your analysis is fascinating and quite useful from a WPA perspective, but I don't think it really sheds any new light on DIPS theory.
sensij
4/12
Tangentially... what drives the batter's 62.2% portion of the HBP outcome? Where he stands in the batter's box?
nateetan
4/12
I imagine a lot of it is determined by his inability to get out of the way (either intentionally or unintentionally).

pizzacutter
4/12
There are certain players who just seem to get beaned a lot. It's a relatively stable stat year to year for hitters. It probably has to do with standing close to the plate. Or being a jerk.
buckgunn
4/13
This article is astonishingly good -- in fact, I'd say this series is my favorite thing published at BP in the last couple years. Russell, I look forward to you cracking your ongoing "engineering problem," b/c I think it has exciting possibilities for player evaluation looking backwards and going forward. But in the meantime, congrats.
SaberTJ
4/13
Fantastic Article. Looking forward to more.
biglou115
4/13
I also wonder how widespread this interpretation of DIPS theory is. I had always been under the impression that the newest batch of DIPS-derived stats like xFIP and SIERA were predicated around the fact that while a pitcher's general skill level at non-in-play events has a high persistence, his success on BIP was more volatile, not because it couldn't be controlled but because its regression to a player's mean was much more severe. If FIP is higher than ERA it's usually because the pitcher's BABIP was lower than expected; FIP has a high predictability because BABIP is likely to regress in extremes, rather than simply stating that batted ball outcomes are outside the pitcher's control.
biglou115
4/13
I meant to say that if I'm thinking of that wrong then I would much appreciate it if somebody would point out the flaw in my understanding.
swartzm
4/19
Yeah, you basically have it right there. It's true that pitchers do have control over ball-in-play outcomes, but the amount that they do control is relatively small and what they do control is correlated with strikeout, walk, and ground ball rates. Since SIERA is a regression analysis, it actually picks up on the portion of BABIP control that is correlated with those DIPS stats.