March 31, 2015
The Most Important Player on the Field
In any one baseball game, there are 50 players who are eligible to play. Which of them is the most important? On any single play, there can be up to 13 players who can directly impact the outcome (the nine fielders, the batter, and potentially, three runners). Which one of them will have the biggest effect on what happens? Even if we zoom in on the batter and pitcher (because the answer is probably going to be one of them), should we worry more about what the batter brings to the at bat or the pitcher?
About five years ago (and two kids ago), I started a project to try to revamp how we give out credit for things in baseball. We have plenty of research on a lot of things in the game, but when it comes to stats like win probability added or even some linear-weights-based systems, we just sort of assume that just about everything that happens in a baseball game is the fault (or credit) of the pitcher and batter. A pitcher might induce a double play grounder directly to the shortstop, but then watch in horror as he realizes that his WPA for the play will suffer because the shortstop decided to try to turn a 6-9-4-3 double play. And that leaping catch the second baseman made to snare a line drive headed toward the gap? Totally the pitcher. Totally.
The problem is that I really haven’t done anything with it in… five years. So let’s pick that up again and take a look at what can happen when you look a little deeper into who is responsible for what in a baseball game.
Warning! Gory Mathematical Details Ahead!
I used a methodology similar to the one from five years ago. For 2010-2014, I coded all terminal batting events (the ones that ended the plate appearance) as either a strikeout or not a strikeout. I then took the seasonal strikeout percentage (K / PA) for the batter in each plate appearance, the pitcher, the league in general, and also the catcher. (As a check, I ran a version of this with the right fielder, who we can be pretty sure has no significant effect on the strike zone. Sure enough, he didn’t.) I converted all of these percentages into log odds, one for each actor (pitcher, hitter, catcher, league).
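For readers who want to see the conversion step, here is a minimal sketch. It assumes the conversion means taking the natural log of the odds, K% / (1 - K%). The function name and the example rates are made up for illustration and are not taken from the actual data set.

```python
import math

def log_odds(rate):
    """Convert a rate such as K / PA into the natural log of its odds."""
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be strictly between 0 and 1")
    return math.log(rate / (1.0 - rate))

# Hypothetical seasonal strikeout rates for one plate appearance.
example_rates = {"batter": 0.25, "pitcher": 0.22, "catcher": 0.20, "league": 0.20}

# One log-odds predictor per actor, ready to feed into a logistic regression.
predictors = {actor: log_odds(rate) for actor, rate in example_rates.items()}
for actor, value in predictors.items():
    print(actor, round(value, 3))
```

A rate of exactly 50 percent maps to zero, and anything below that (which covers every realistic strikeout rate) comes out negative.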
I created a binary logistic regression in which all four of these terms tried to predict whether or not a plate appearance would end up in a strikeout. It may seem a little strange, but here’s the logic behind it. Suppose that batters had absolutely no control whatsoever over whether a plate appearance ended in a strikeout. It was all the pitcher. Then, we would see that hitters would tend to strike out more when they faced pitchers who were good at striking hitters out. Any differences between hitters in their K rates would be a function of the pitchers they faced and random variation. If hitters were completely in control, they would strike out at a similar rate no matter who was on the mound. If neither had anything to do with it, we would find that the league rate would be the best predictor of whether an individual plate appearance ended with the letter K.
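To make that thought experiment concrete, here is a toy simulation of the "it was all the pitcher" world. This is not the code behind the article: the rates are invented, each plate appearance gets a randomly drawn batter and pitcher, and the little Newton-Raphson fitter stands in for whatever stats package you prefer. The point is only that when the pitcher alone drives the outcome, the regression puts its weight on the pitcher term and next to none on the batter term.

```python
import math
import random

def logit(p):
    return math.log(p / (1.0 - p))

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def solve(a, b):
    """Solve a small linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x

def fit_logistic(rows, ys, iterations=10):
    """Newton-Raphson logistic fit; each row starts with 1.0 for the intercept."""
    k = len(rows[0])
    beta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for x, y in zip(rows, ys):
            p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            w = p * (1.0 - p)
            for i in range(k):
                grad[i] += (y - p) * x[i]
                for j in range(k):
                    hess[i][j] += w * x[i] * x[j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta

def simulate_pitcher_only_world(n_pa=20000, seed=0):
    """Hypothetical league where only the pitcher's K rate decides the outcome."""
    rng = random.Random(seed)
    rows, ys = [], []
    for _ in range(n_pa):
        batter_rate = rng.uniform(0.10, 0.30)   # made-up talent spread
        pitcher_rate = rng.uniform(0.10, 0.30)  # made-up talent spread
        rows.append([1.0, logit(batter_rate), logit(pitcher_rate)])
        ys.append(1 if rng.random() < pitcher_rate else 0)
    return rows, ys

rows, ys = simulate_pitcher_only_world()
intercept, b_batter, b_pitcher = fit_logistic(rows, ys)
print("batter coefficient:", round(b_batter, 2))
print("pitcher coefficient:", round(b_pitcher, 2))
```

Flip the simulation so that `batter_rate` drives the outcome and the coefficients should trade places. Real data lands somewhere between the two extremes.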
Of course, the answer is somewhere in the middle and it’s tempting to say “just call it 50/50,” but we can do better than that. Instead, we can statistically look at the amount of variance that each predictor picks up. For the initiated, I did this by looking at the -2 log likelihood statistic. I only looked at the variance contributed by the actor factors (batter, pitcher, catcher, league) to get an idea of how much they pick up variance relative to each other, rather than the actual overall variance which includes a good chunk of just randomness in general.
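Here is a rough sketch of that bookkeeping. The article does not spell out the exact procedure, so this assumes one plausible reading: measure how much the -2 log likelihood drops when each actor's term is included, then express each drop as a share of the total drop across the four actors. The function names and the numbers in the example are hypothetical and are not the article's results.

```python
import math

def neg2_log_likelihood(outcomes, probs):
    """-2 times the log likelihood of 0/1 outcomes under predicted probabilities."""
    total = 0.0
    for y, p in zip(outcomes, probs):
        total += math.log(p) if y == 1 else math.log(1.0 - p)
    return -2.0 * total

def relative_shares(deviance_drops):
    """Scale each actor's drop in -2LL so that the shares sum to 1."""
    total = sum(deviance_drops.values())
    return {actor: drop / total for actor, drop in deviance_drops.items()}

# Made-up drops in -2LL attributed to each actor (assumption, for illustration).
example_drops = {"batter": 50.0, "pitcher": 30.0, "catcher": 10.0, "league": 10.0}
shares = relative_shares(example_drops)
for actor, share in shares.items():
    print(actor, f"{share:.1%}")
```

Because the shares are relative to each other, the large chunk of plain randomness in any single plate appearance never enters the denominator.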
As you might imagine, I did the same for walks, HBP, grounders, line drives, and fly balls, again using 2010-2014 Retrosheet data.
Based on that data set, we can portion out credit or blame for each event at these rates.
But let’s look at what this means, again looking at strikeouts. We know that while a pitcher is only responsible for 35.3 percent of an individual strikeout, if he’s the starter, he will pitch to 25-30 batters in a game, while a hitter will only take four or five plate appearances. Then again, the starter only goes once every five days while the hitter might play all five of those days. So, who actually ends up being responsible for more strikeouts over the course of a year? What follows is just last year’s strikeout leaderboard, weighted for the amount that each actor is responsible in the equation. For example, David Price led the majors with 271 strikeouts. His 95.7 value below is simply 271 multiplied by 35.3 percent.
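The weighting is nothing fancier than a multiplication. As a quick sketch, using only the 35.3 percent pitcher share and the David Price total quoted above (the function name is just for illustration):

```python
PITCHER_K_SHARE = 0.353  # pitcher's share of a strikeout, as quoted in the text

def credited_strikeouts(raw_strikeouts, share):
    """Weight a raw strikeout total by the share of credit assigned to the actor."""
    return raw_strikeouts * share

price_credit = credited_strikeouts(271, PITCHER_K_SHARE)
print(round(price_credit, 1))  # 95.7
```

The same function works for a hitter or catcher once you plug in that actor's share in place of the pitcher's.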
Framing answers a somewhat different question than we’re asking here. In the anatomy of a strikeout, it’s more important to have a guy on the mound who can fill up the strike zone than a guy who can steal a few strikes around the edges. Here we see that the ability to get it near the strike zone to begin with is much more important. Consider that the difference in extra strikes per game generated by a fantastic framer like Hank Conger and an awful one like Tom Telis was about 7 strikes, using 2014 numbers. In 2014, the difference between a guy who fills up the zone (Phil Hughes, 56.4 percent of his pitches are in the zone) and a guy who seems to avoid the zone (Francisco Liriano, 35.0 percent in the zone) would be 21 potential called strikes over the course of a standard 100 pitch outing. Even accounting for the fact that some of those were swung at, you also have to account for the fact that the difference between a strong swinging strike artist (Kershaw at 14.1 percent of his pitches) and a weak one (David Phelps at 5.4 percent) represents another source of talent spread that will certainly lead to variations in K rate.
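To put those gaps side by side, here is the back-of-the-envelope arithmetic as a short sketch. It uses only the rates quoted above and treats a "standard outing" as 100 pitches, as the text does. Note that the framing figure is quoted per game rather than per 100 pitches, so the comparison is rough, and the variable names are mine.

```python
PITCHES_PER_OUTING = 100  # the "standard 100 pitch outing" from the text

def gap_per_outing(high_rate, low_rate, pitches=PITCHES_PER_OUTING):
    """Difference in expected pitches of a given type over one outing."""
    return (high_rate - low_rate) * pitches

zone_gap = gap_per_outing(0.564, 0.350)   # Hughes vs. Liriano, pitches in the zone
whiff_gap = gap_per_outing(0.141, 0.054)  # Kershaw vs. Phelps, swinging strikes
framing_gap = 7                           # best vs. worst framer, extra strikes per game

print("zone gap:", round(zone_gap, 1))    # 21.4
print("whiff gap:", round(whiff_gap, 1))  # 8.7
print("framing gap:", framing_gap)
```

Either pitcher-driven gap on its own is bigger than the framing gap, which is the point of the paragraph above.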
But what we see is that even weighted for the number of times that they come to the plate, hitters are still somewhat more important to a team over the course of a season. There’s a strange duality, though, that baseball has long implicitly recognized. Within a single game, the starting pitcher is going to be the most important member of his team. It’s why individual games are often subtitled by their starting pitching matchup. The only time they really bother to publish the lineup in advance is the All-Star Game, mostly because it’s the All-Star Game.
Important to Whom?
But wait a minute. In the playoffs, where one game means a lot more (and occasionally means everything), the pitcher is suddenly much more important again. Over the long haul, which only the baseball junkies pay attention to, you’re better off betting on a team with a great offense. When everyone’s paying attention in Game 7, you’re better off looking at that day’s starting pitcher. Pitching wins games. Offense wins seasons. Offense gets you to the playoff series. An amazing starter wins you Game 7. That’s an over-simplification, since we’re talking about a variance composition that’s not terribly out of balance but is still somewhat more tilted toward the hitters. But our more complex mantra is instructive, and it’s more accurate than repeating “Pitching wins championships,” which basically starts from the mistaken belief that Game 7 is all that matters. Sure, if you get there, it is all that matters, but you gotta get there first. And the dirty little secret is that it’s actually your offense that is more likely to carry you there.
So, who’s the most important player on a team? I guess it depends on what level of zoom you have the microscope set to.