April 12, 2010
Credit Where It's Due, Part 2
When last we met, we were playing the blame game, and specifically asking the question, when a batter strikes out, who is responsible? The batter? The pitcher? Random noise? It turns out that after some numerical gymnastics we find that, on a strikeout, a batter deserves about 56 percent of the blame (from his perspective anyway), while the pitcher gets about 43.3 percent. Background noise from the league takes up the remaining 0.7 percent. Today, we’ll take a look at some of the other (more complicated) events in baseball, but if you haven’t read part I (which lays out the methodology I’ll be using), now would be a good time to go back and do so.
Consider the following situation. In the top of the sixth inning of a tie game, Ervin Santana (whose real first name is Johan… it’s hard enough having two Chris Youngs and two Tony Penas in baseball… thank you Ervin for helping us out) is pitching with Ichiro Suzuki on first and no one out. Chone Figgins steps up and whacks a ground ball to the right side off his former teammate. Howie Kendrick makes a play for it, but it scoots just out of the reach of his glove and out into right field for a single. As right fielder Bobby Abreu picks up the ball, he notices that Ichiro is rounding second and thinking third. Abreu comes up throwing, but Ichiro’s clearly got third, and the ball is cut off wisely by Erick Aybar (don’t call me Willy!). The Mariners have first and third with no one out.
How can we parse out who gets what credit in that situation? The problem with measures such as WPA as they are practiced now is that Figgins (the batter) will get all of the credit for the play, and Santana will take the entire hit for the play. Ichiro gets no love for "stealing" third on the single, nor does Abreu have to face the music for having a noodle arm in right field. If Kendrick had more range at second base, he might have gotten that ball and the play would have turned out very differently.
Can we hold Santana responsible for the fact that his second baseman didn’t get there? In fact, Santana did a pretty good job and got Figgins to hit a ground ball, which might have become a double play (it is Figgins running, so probably not. But you never know).
How to be a little more equitable? Let’s dive in.
Breaking It Down
Consider what can happen in the course of a plate appearance. Leaving aside very rare events (catcher’s interference, a strikeout with a dropped third strike so that the batter ends up on first), there are four basic outcomes: a walk, a strikeout, a hit batsman, or a "contact event," which is an event involving bat hitting ball. (Because someone will ask, I’m including home runs in "contact events.") The first three are negotiated exclusively between batter and pitcher, with no input from the defense, and they also end the plate appearance with very predictable results. These are the easy ones. Because the only actors that we have to worry about are the batter and pitcher, the calculations are fairly quick. Indeed, we did the calculations for strikeouts last week. I ran similar calculations for walks and HBP. The breakdown is as follows:
Then, there are those situations that involve the ball going into play. These are the ones that will cause us the most grief, because now we have to deal with the batter, pitcher, any runners on base, and the fielders. Let’s take the example above. Figgins has hit the ball into play. In general, a plate appearance ending with a contact event breaks down into 60.3 percent batter, 39.1 percent pitcher, and 0.4 percent background noise, about on par with the K, BB, and HBP numbers above.
Now, before the at-bat as described above, the Mariners have a win expectancy of 51.9 percent, so the Angels can expect to win 48.1 percent of the time. That’s about to change.
Once the ball is in motion off the bat, it can be a ground ball, a line drive, or a fly ball. For those of you who have followed what Colin Wyers has written on the subject, you know that there are problems aplenty with batted-ball classifications, but for now, we’ll trust the available data files. Let’s parse out responsibility for the fact that this ball in play is a worm-burner. Because we are still only dealing with the pitcher and batter, the calculations are a little easier. Using the same framework I used in Part One of this series, the variance in whether a ball in play is a ground ball is generally 49.1 percent controlled by the batter, 50.2 percent by the pitcher, and 0.7 percent background noise. It’s a pretty even split between batter and pitcher as to how that ground ball got there.
Knowing that the ball is a grounder changes the game state. Now we’ve gone from runner on first with no out to runner on first with no out and a live ground ball to the right side. This could end in a few different ways. It could be a fielder’s choice, a double play, a regular old out at first, or a base hit of some sort.
In 2009, ground balls that were eventually fielded by the first baseman, second baseman, or right fielder while there was a runner on first and fewer than two outs ended in the following configurations. In parentheses afterward is the WPA for that set of circumstances from the Mariners’ point of view:
29.5% double play (42.1%)
There were a few other miscellaneous outcomes. We can take a look at these outcomes and average out what the expected change in win expectancy is given that all we know is that there’s a ground ball headed to the right side. It turns out that hitting a ground ball to the right side (before we know what will become of it), produces an expected win probability of 50.9 percent for the Mariners (down from 51.9 percent above). By hitting a ground ball, Figgins has actually taken away from his team’s chances of winning, at least in the aggregate.
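The averaging step can be sketched in a few lines of Python. This is only an illustration, not the code behind the numbers above. It assumes the figure in parentheses next to each outcome is the Mariners' win expectancy after that outcome, and the function names are made up for the example. Using only the double-play row and the 50.9 percent average quoted above, it works backward to what all the other outcomes must average.

```python
def expected_win_expectancy(outcomes):
    """Probability-weighted average of win expectancy.

    outcomes: list of (probability, win_expectancy) pairs covering
    every way the live ground ball can end.
    """
    total_p = sum(p for p, _ in outcomes)
    if abs(total_p - 1.0) > 1e-6:
        raise ValueError("outcome probabilities must sum to 1")
    return sum(p * we for p, we in outcomes)


def implied_rest(expected, p_known, we_known):
    """Average win expectancy of all outcomes other than the known one."""
    return (expected - p_known * we_known) / (1.0 - p_known)


# Figures quoted in the article: 29.5% double plays at 42.1%,
# and a 50.9% expected win probability overall.
P_DP, WE_DP, EXPECTED = 0.295, 0.421, 0.509

rest_we = implied_rest(EXPECTED, P_DP, WE_DP)
check = expected_win_expectancy([(P_DP, WE_DP), (1.0 - P_DP, rest_we)])

print(f"non-double-play outcomes average {rest_we:.1%}")
print(f"recombined expectation {check:.1%}")
```

Under that assumption, everything other than the double play has to average out to a bit under 55 percent for Seattle. That is why the grounder as a whole costs the Mariners only about a point.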
Now, since Figgins is only halfway (well, 49.1 percent) responsible for the fact that the ball in play was a grounder, we’ll debit his WPA account by 0.491 percent, rather than the full 1.0 percent that the Mariners’ win expectancy has fallen. The rest of it goes to Santana (0.502 percent) and background noise/dumb luck (0.007 percent).
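As a rough sketch of the bookkeeping, the split is just the change in win expectancy multiplied by each actor's share of the variance. The function name and the dictionary layout below are made up for illustration. All figures are in percentage points from the Mariners' point of view, so a negative number is bad news for Figgins and good news for Santana.

```python
def split_credit(delta_we, shares, tolerance=0.01):
    """Split a change in win expectancy in proportion to variance shares.

    delta_we: change in win expectancy, in percentage points.
    shares:   mapping of actor -> share of the variance (should sum to ~1).
    """
    total = sum(shares.values())
    if abs(total - 1.0) > tolerance:
        raise ValueError(f"shares sum to {total:.3f}, expected about 1")
    return {actor: delta_we * share for actor, share in shares.items()}


# Ground-ball shares from the article; win expectancy fell 51.9 -> 50.9.
ground_ball_shares = {"Figgins": 0.491, "Santana": 0.502, "noise": 0.007}
step1 = split_credit(-1.0, ground_ball_shares)

for actor, credit in step1.items():
    print(f"{actor:8s} {credit:+.3f}")
```

The same function works for any later step in the play. Only the size of the swing and the list of actors change.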
Right now, we have a live ground ball and no idea what happened to it. What happens when we unclick the pause button? The next step will be to see whether Kendrick can get to the ball, but it means we have another actor involved.
A batter has every incentive to steer the ball away from the fielder, if he can. The pitcher has every incentive to induce the batter to hit the ball toward a fielder, if he can. Clearly, the second baseman wants to get to the ball, if he can. So who has the most say over whether a grounder to the right side will end up being fielded? I looked at all grounders which were eventually fielded by the right fielder, the center fielder (we’re looking at the second baseman’s range, so we need to look at balls up the middle as well), or the second baseman. If the ball made it to the outfield, I gave the second baseman a debit of half a play, as he splits the blame 50/50 with the first baseman on a single to right, and with the shortstop on a single to center. (If I wanted to, I could get a little more fine-grained here as well… but that will have to wait for another day. Today, we just need a rough-range measure). For each ball that he made it to (whether he completed the play or not), he got a credit of one play. For the batter and pitcher, I took the percentage of such balls that were fielded by an infielder vs. an outfielder that happened on their watch.
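The tallying rule for the second baseman can be written down directly. This sketch covers only the counting described above, not the variance calculation from Part One. The position codes and the function name are assumptions made for the example.

```python
def second_base_range_tally(fielded_by):
    """Tally range credits and debits for the second baseman.

    fielded_by: iterable of position codes, one per ground ball,
    naming who eventually fielded it ("2B", "RF", or "CF").
    Returns (credits, debits) measured in plays.
    """
    credits = 0.0
    debits = 0.0
    for position in fielded_by:
        if position == "2B":
            # He got to the ball, whether or not he completed the play.
            credits += 1.0
        elif position in ("RF", "CF"):
            # The ball got through: blame is split 50/50 with the
            # first baseman (right field) or the shortstop (center field).
            debits += 0.5
        else:
            raise ValueError(f"unexpected fielder: {position}")
    return credits, debits


# A made-up handful of grounders, purely to show the counting.
print(second_base_range_tally(["2B", "RF", "CF", "2B"]))
```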
The results were that the second baseman shoulders 3.8 percent of the responsibility for getting to the ball, the batter 52.6 percent, and the pitcher 43.0 percent. (The remaining 0.6 percent is background noise.) These numbers might seem a little surprising to some. In general, the story of DIPS has been that once the ball leaves the bat, the pitcher is absolved of almost all blame for that which his fielders do, as he cannot control them. More and more, I’ve come to view this line of reasoning with suspicion.
I would suggest a slightly different way to look at the traditional DIPS interpretation. The pitcher controls about 43 percent of the variance in whether the fielder will get to the ball (on a grounder to the right side, anyway), which is not the majority of the variance, but it dwarfs the variance explained by the fielder by a factor of more than 10! It means that to some extent, the pitcher is going to have his BABIP dictated by the strength of the opposition batters, but that it’s his job to induce light contact. A pitcher who has a talent for sawing off bats and inducing weak grounders will have a lower BABIP. It does help to have a second baseman with some range, but it seems that the second baseman is much more the victim of what the pitcher sends his way, rather than the other way around. [/rant]
In any case, the ball got through. Before we knew what would become of that ground ball, we figured that the Mariners had a 50.9 percent chance of winning. Now that the ball is through and it’s clearly a single, but prior to Ichiro taking off for third, we have first and second with no out, still in a tie game. The Mariners can expect to win 60.3 percent of the time, an increase of 9.4 percent. Again, Figgins gets 52.6 percent of that credit (so a nice credit of 4.9 percent), Santana gets dinged 4.0 percent, and Kendrick gets a 0.4 percent debit. (The final 0.1 percent goes in the dumb luck bin.)
Then Ichiro makes his break for third, successfully, taking Seattle’s win probability from 60.3 percent to 70.1 percent, another increase of 9.8 percent. It seems especially odd that the pitcher or batter should bear much responsibility for this turn of events. In some sense, this is now a confrontation between the baserunner and right fielder.
But let’s take a look at what the numbers say. Maybe some hitters are good at placing the ball in such a way that the runner can more easily make it to third. I figured out the percentage of times that a runner had gone first-to-third safely on a single to the right, how often the right fielder had been so victimized, and how often it had happened on the batter’s watch and the pitcher’s watch. Which actor had the most to say about such situations? The results might surprise you:
The pitcher has a lot to do with whether that runner makes it to third on that single. Consider that pitchers who give up sharp grounders or line-drive singles are probably more likely to give up singles, but those balls get to the right fielder faster. It might also have something to do with the pitcher having a good (or bad) pickoff move and keeping the runner closer (or letting him get a bigger lead). As we might have guessed, the runner deserves much more credit than the batter for his exploits on the bases. What might jump out, though, is that the noise component is rather large compared to what we’ve seen so far, taking up 11.2 percent of the variance.
Coming back to our example, the 9.8 percent jump that comes with Ichiro going to third can be split up as 0.9 percent to Figgins, 3.9 percent to Santana, 1.4 percent to Abreu, 2.6 percent to Ichiro, and 1.1 percent to noise. (Someone just added those and got 9.9 percent... rounding error.)
Summing It All Up
In this one play, the Seattle Mariners saw their win expectancy go from 51.9 percent to 70.1 percent, for a total swing of 18.2 percent. But by breaking things down, we find a better way to chop up the credit than simply assigning everything to Figgins who had the good fortune to be the batter:
So, the most valuable "Mariner" on the play was Santana. Figgins, who otherwise would have gotten a full 18.2 percent WPA credit, gets a mere 5.3 percent. Figgins owes most of the bounty that he receives now to members of the Angels (9.2 percent), his teammate (2.6 percent), and background noise (1.1 percent). Even if we focus on actual members of the Mariners, Figgins and Ichiro should split the credit about two-thirds to one-third for their relative contributions.
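For anyone who wants to check the arithmetic, here is a minimal sketch of a ledger that replays the three steps of the play. The first two steps are computed from the variance shares quoted earlier. The third uses the per-actor credits as stated in the text, since those are the numbers given for it. The variable names are invented for the example, and everything is in percentage points from Seattle's point of view.

```python
from collections import defaultdict

# Steps 1 and 2: (change in win expectancy, variance shares).
share_steps = [
    (-1.0, {"Figgins": 0.491, "Santana": 0.502, "noise": 0.007}),
    (9.4, {"Figgins": 0.526, "Santana": 0.430, "Kendrick": 0.038,
           "noise": 0.006}),
]

# Step 3 (Ichiro going first to third): credits as quoted in the text.
step3_credits = {"Figgins": 0.9, "Santana": 3.9, "Abreu": 1.4,
                 "Ichiro": 2.6, "noise": 1.1}

ledger = defaultdict(float)
for delta_we, shares in share_steps:
    for actor, share in shares.items():
        ledger[actor] += delta_we * share
for actor, credit in step3_credits.items():
    ledger[actor] += credit

angels_total = ledger["Santana"] + ledger["Kendrick"] + ledger["Abreu"]

for actor, credit in sorted(ledger.items(), key=lambda kv: -kv[1]):
    print(f"{actor:9s} {credit:+.2f}")
print(f"Angels combined {angels_total:+.2f}")
print(f"Total {sum(ledger.values()):+.2f}")
```

Carried without intermediate rounding, Figgins comes out nearer 5.35 than 5.3, and the grand total lands at 18.3 rather than 18.2. That is the same rounding wobble that shows up in the first-to-third split.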
What It Means
It should be pretty obvious that programming an algorithm that would cover all (or at least most) situations that actually happen in a baseball game is a big project. This one play took me hours. But it’s just an engineering problem, and my hope is that this work will provide a decent blueprint on how to start. But more than that, even in this one example, it should become clear that the current cultural practice of giving the lion’s share of the credit to the pitcher and batter for what happens during a plate appearance is silly. There are other actors and they should be noted and given credit (or blame) as is due. My ultimate hope is that we can begin to look at WPA through a slightly more nuanced lens and incorporate the baserunning and fielding components that have been ignored for so long.
There are a few other lessons to be gathered here. One is that win probability and other related measures can show and incorporate the idea of "tipping your cap to the other guy." A batter who hits a line drive but is the victim of an outstanding diving play has nothing for which he should be ashamed. He hit it hard and was unlucky; sometimes that happens. We might not be able to completely capture that numerically, but this methodology can begin to tease some of it apart. Then there’s also the bad luck that comes from not being able to pick which pitcher you face. A batter who strikes out against Roger Clemens in his prime isn’t so different from the rest of his peers, and the strikeout may have been more to the credit of Clemens than the debit of the batter. Sometimes you do everything right, and the other guy does it better.
Then there is the unexpected lesson about DIPS. These numbers suggest that the pitcher has much, much more control over what happens to a ball in play than we have previously believed. It’s true that BABIP is a measure that takes a while to stabilize on an individual level over what amounts to a small sample size. It’s proper to say that a pitcher’s BABIP in one year doesn’t tell you much about what’s going to happen next year. But, using a few statistical tricks to pump up the sample size, an inconvenient truth (am I allowed to use that?) emerges. For a long time, we’ve allowed pitchers to get away with just about everything once the ball left the bat. It appears now that the result of a play does indeed depend a great deal on the pitcher.