May 13, 2010
Checking the Numbers
Last week, I brought up the topic of catcher defense and how it had piqued my interest. One of the main reasons for my fascination is that a catcher’s defensive responsibility has several different aspects, some more quantifiable than others, yet the more qualitative components seem to matter a heck of a lot more than what can currently be measured. Another interesting point is that, even with the recent advancements in defensive metrics, determining how a catcher contributes to a team with actual glove work, and not just by virtue of being a catcher, has eluded analysts for quite some time. Even more interesting is the idea of catcher defense itself: while shortstops field grounders and right fielders shag fly balls, the catcher’s equivalents of balls in play, the foul outs and dinky three-foot balls on the ground, aren’t what first comes to mind when the subject is catcher defense.
It was great to hear the takes on defense from A.J. Hinch, Jason Jaramillo, and Gregg Zaun last time out, given their respective roles, but my goal today is to look at the other side of that coin, the analysts, and discuss where different attempts agree and disagree in how they account for certain components of catcher defense. Before developing any metric, it is important to understand what actually should be measured, as well as what others have done along those lines. While this literature review of sorts will not encompass every article penned on the subject, it should offer a solid foundation of the various components of catcher defense and how they would be converted to a runs-based or wins-based statistic. Perhaps the foundation will spur a discussion on specific components that can help us determine what should go into a metric, and with what weighting, so that it could instill confidence and become uniform in the field.
So what does a catcher do? Based on Keith Woolner’s outline in Baseball Between the Numbers and the interviews with the aforementioned triumvirate of major-league catchers, backstop responsibility essentially breaks down into three main groups: helping the pitcher execute, controlling the running game, and catching the ball. Each of these three involves its own subset of responsibilities. For instance, helping the pitcher execute encompasses game calling, making adjustments on certain hitters, keeping the pitcher’s head in the game, being his psychologist and confidant, and anything else qualitative that affects performance. Controlling the running game breaks down into nabbing would-be basestealers, deterring attempts through reputation, and the seldom-discussed throwing out of lead runners on bunt attempts. Catching the ball involves, as Zaun mentioned, being a good receiver, framing pitches to allow the umpire to make an accurate call, and blocking balls in the dirt to prevent wild pitches or passed balls.
With regard to the qualitative facets of catcher defense, before breaking down each of the components above, I want to point out that immeasurability should not be conflated with unimportance. Just because we don’t know right now how to convert pitches framed or adjustments made behind the plate into linear weight run values does not mean they should be completely ignored when discussing catcher defense. As I mentioned last year, it is still perfectly fine to craft metrics without these qualitative inputs, so long as we realize they might not be complete; in other words, being 65 percent there is great, and better than nothing, but we shouldn’t "dis" that which cannot be computed and assume we are 100 percent accurate.
Admit it: steals and caught stealings are the first two things you think about when the topic of catcher defense comes up. Heck, it happens to me all the time. Throwing runners out is sexy, relatively speaking, and it is increasingly common to judge a catcher’s defense by his stolen-base percentage against. Mike Piazza was considered to be defensively handicapped because everyone, Rich Garces included, could steal a base safely with him behind the plate. Though Piazza may have, in fact, been a poor receiver, his inability to prevent steals alone did not render him ineffective. It was just one of many inputs.
To account for stolen bases, most analysts utilize some variation of the following: take the numbers for the individual and compare them to the league’s, broken down by pitcher handedness and prorated to the catcher’s playing time. For example, if the league allows 0.10 steals per inning with a righty on the mound, and a catcher kneels down for 1,000 innings, he would be expected to allow 100 steals. If he only allowed 65, then he allowed 35 fewer than the league in this split. The same process can be run for times caught, and then the run values for steals and caught stealing are applied to the results. In the end, the calculation will offer how many runs above or below the league average a catcher added to or subtracted from his team in this component.
I have seen a few attempts that do not break the inputs down by pitcher handedness, which I consider to be vital to this component. Lefties have an advantage over righties with runners on first base and thus should be expected to have a lower rate of steals per inning. If a catcher receives pitches from an inordinate number of lefties, and a blanket percentage across the league is applied, the results are likely to be skewed.
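As a rough sketch of the split-by-handedness calculation described above, here is how it might look in Python. The league rates for the lefty split, the caught-stealing rates, and both run values are placeholder assumptions of mine rather than figures from any of the articles discussed; only the 0.10 steals per inning, 1,000 innings, and 65 steals come from the example.

```python
# Hypothetical run values (assumptions for illustration only).
SB_RUN_VALUE = 0.20  # runs the offense gains per stolen base
CS_RUN_VALUE = 0.45  # runs the offense loses per caught stealing


def steal_runs_saved(catcher_splits, league_rates):
    """Runs saved versus league average, summed over pitcher-handedness splits.

    catcher_splits: {hand: (innings, steals_allowed, runners_caught)}
    league_rates:   {hand: (steals_per_inning, caught_per_inning)}
    """
    total = 0.0
    for hand, (innings, steals, caught) in catcher_splits.items():
        sb_rate, cs_rate = league_rates[hand]
        expected_steals = sb_rate * innings
        expected_caught = cs_rate * innings
        # Fewer steals than expected and more runners caught both help the defense.
        total += (expected_steals - steals) * SB_RUN_VALUE
        total += (caught - expected_caught) * CS_RUN_VALUE
    return total


# League rates per inning: the righty steal rate is from the text, the rest is made up.
league_rates = {"R": (0.10, 0.04), "L": (0.06, 0.04)}

# The catcher from the example: 1,000 innings with righties, 65 steals allowed.
catcher = {"R": (1000, 65, 40)}
print(round(steal_runs_saved(catcher, league_rates), 2))
```

With these placeholder run values, the 35 steals the catcher prevented relative to the league would be worth roughly seven runs. Adding an "L" entry to the catcher's splits compares his lefty innings against the lower lefty baseline instead of a blanket league rate.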
Admittedly, I haven’t seen reputation runs used much save for Chuck Brownson’s article at Beyond the Box Score, but the idea is that certain catchers might not receive as much credit for throwing runners out because their reputation keeps them from being tested as often. Yadier Molina is a good example of this, as is Ivan Rodriguez from back in the day. Both catchers were held in such high esteem with regard to gunning down runners that fewer baserunners even attempted to steal. From a run-value standpoint, which aggregates in counting-numbers fashion, Molina would potentially pale in comparison to a catcher who throws out a good number of runners simply because he faces such a high volume of attempts.
To deal with this, Brownson proposed a methodology that would calculate reputation runs based on the value of a stolen base, the likelihood of the runner being caught, and the number of times runners have forgone the opportunity to steal based on said reputation. The calculation works very much like an expected value formula, where a catcher is expected to have a certain number of attempts against him; any shortfall from that number is attributed to his reputation. As of August 5, 2009, when Brownson described his methodology, Molina had produced 4.4 runs above average based on his reputation, well above anyone else.
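I don't have Brownson's exact formula in front of me, so the following is only one possible reading of the expected-value idea, built from the three inputs named above. The function name, the notion of "opportunities," and every number in the usage line are hypothetical.

```python
def reputation_runs(opportunities, actual_attempts, league_attempt_rate,
                    league_success_rate, sb_run_value):
    """One hypothetical reading of reputation runs (not Brownson's published formula).

    Attempts that a league-average catcher would have faced but this catcher
    did not are treated as forgone, and each forgone attempt is credited with
    the steal value it would have produced at the league success rate.
    """
    expected_attempts = opportunities * league_attempt_rate
    forgone_attempts = expected_attempts - actual_attempts
    return forgone_attempts * league_success_rate * sb_run_value


# Made-up inputs: 1,000 steal opportunities, 60 attempts actually faced,
# a 10 percent league attempt rate, 70 percent success, 0.20 runs per steal.
print(round(reputation_runs(1000, 60, 0.10, 0.70, 0.20), 2))
```

A fuller version would presumably also net out the caught-stealing value of the attempts that never happened; this sketch leaves that out, so treat it as a starting point only.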
Passed balls and wild pitches are lumped together most of the time due to Colin Wyers’ favorite subject other than run estimators: scorer bias. We might know how to explain the difference between a passed ball and a wild pitch, but if I had a nickel for every time I swore it was a wild pitch only to find out the pitch was scored a passed ball, I’d have $7.60. We can sit here and debate the merits of lumping the two events together, but because of the potential for errant classifications and the similarities between the events, it makes sense to combine them. Of course, the usual caveats with balls getting by the catcher are at work here, such as a discount for catching knuckleballers. Why should Tim Wakefield’s personal catcher suffer when he has a much tougher job than catching, say, Dan Haren?
Anyway, the sum of these events is converted to runs added or subtracted in very much the same fashion as stolen bases, where the individual is compared to the league and the run values are applied to the net result. One difference, or at least one I would propose, would be to use pitches caught as opposed to innings behind the plate as the denominator, as the numerator consists of pitches as well. Combined with the stolen-base numbers and reputation runs, the passed balls and wild pitches wrap up what is generally thought of right away when conjuring up what catcher defense actually entails.
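The per-pitch version of that comparison might be sketched as follows. The league rate, the run value, and the catcher's totals are all placeholder assumptions, and the knuckleballer discount is not modeled here.

```python
def blocking_runs_saved(pitches_caught, passed_balls, wild_pitches,
                        league_rate_per_pitch, run_value):
    """Runs saved versus league average on passed balls plus wild pitches.

    The denominator is pitches caught rather than innings, as proposed above.
    run_value is the assumed cost in runs of one ball getting past the catcher.
    """
    expected = league_rate_per_pitch * pitches_caught
    actual = passed_balls + wild_pitches
    return (expected - actual) * run_value


# Made-up season: 15,000 pitches caught, 8 passed balls, 42 wild pitches,
# against a league rate of 0.004 per pitch and 0.28 runs per event.
print(round(blocking_runs_saved(15000, 8, 42, 0.004, 0.28), 2))
```

Using pitches as the denominator keeps the rate in the same units as the numerator, so a catcher who works with high-pitch-count staffs is not penalized for simply seeing more chances per inning.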
Catchers don’t have to make plays as often as infielders, but there are foul balls in the air to catch, dribblers to gobble up, and sacrifice bunts to field. Curiously enough, a cursory scan of catcher defense articles produced not one that included sac bunts and throwing out lead runners. Perhaps the availability bias is at work in my mind after watching Molina put on a veritable clinic against the Phillies last week, throwing out lead runners on bunts seemingly at will, but think of the shifts in run expectancy. If the Twins have runners on first and second with one out, and a successful sacrifice bunt is executed, they have runners on second and third with two outs. If the opposing catcher instead nabs Jim Thome at third, the net result is no change in baserunners and another out. Then again, if he throws to third and the runner is safe, the bases are loaded with still just one out, and the change is even more substantial.
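To make the Twins example concrete, here is a small sketch of the three outcomes measured against a routine successful sacrifice. The run-expectancy values are illustrative placeholders roughly in the range of published tables, not numbers taken from any specific source, and the function names are my own.

```python
# Illustrative run-expectancy values keyed by (bases occupied, outs).
# These are placeholder assumptions, not an official table.
RUN_EXPECTANCY = {
    ("12_", 1): 0.97,  # first and second, one out (starting state)
    ("_23", 2): 0.63,  # second and third, two outs
    ("12_", 2): 0.47,  # first and second, two outs
    ("123", 1): 1.65,  # bases loaded, one out
}

START = ("12_", 1)
OUTCOMES = {
    "successful sacrifice": ("_23", 2),
    "lead runner out at third": ("12_", 2),
    "throw to third, runner safe": ("123", 1),
}


def run_expectancy_change(before, after):
    """Change in the offense's run expectancy from one base-out state to another."""
    return RUN_EXPECTANCY[after] - RUN_EXPECTANCY[before]


def catcher_credit(outcome):
    """Runs saved by the catcher relative to a routine successful sacrifice."""
    baseline = run_expectancy_change(START, OUTCOMES["successful sacrifice"])
    actual = run_expectancy_change(START, OUTCOMES[outcome])
    return baseline - actual


for name in OUTCOMES:
    print(name, round(catcher_credit(name), 2))
```

Under these placeholder values, the failed throw costs far more than the successful one gains, which is the asymmetry a lead-runner component would need to capture.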
Accounting for lead-runner plays could be calculated using these shifts in run expectancy. In the end, it might even out, but it certainly shouldn’t be ignored, nor should errors on foul balls and dribblers, as infrequently as they might occur. In this article from Justin Inaz, who had the good fortune of sitting next to me and having to listen to me talk for three hours straight at our PNC Park event last year, different run values were proposed for throwing errors and fielding errors. From the author himself:
Not much has been done on the subject of pitch framing, though I was able to find this gem of an article from Bill Letson, which describes a very interesting framework for computing the value of framed pitches. Just to be clear, framing pitches refers to the catcher holding his glove in place after catching the ball, as well as potentially catching the ball in a certain way that increases the likelihood of the pitch being deemed a strike. Sometimes it looks silly, but certain catchers look as though they have the art down pat, receiving the ball and inching it subtly back to a corner, holding the glove for the umpire to get a better sense of where the ball was "actually" thrown.
Summarizing Letson’s methodology, we first normalize strike zones and divide the pitches into different bins based on batter handedness and count. A LOESS regression is then run to estimate the probability of a pitch being called a strike. It should be noted that "pitch" here refers solely to those classified as a ball or a called strike. The actual result of the pitch is compared to the probability of it being called a strike. If the probability was 0.30 and the pitch was actually called a strike, then the catcher is credited with 0.70 extra strikes.
Next, Letson astutely discussed umpire normalization. As someone who has worked with umpire data quite a bit, I was very pleased with this control. Essentially, umps have different strike zones. The range between the thinnest and the thickest zones might not be as vast as you may think, but it adds another level of accuracy to either discount or add a premium to the extra strikes based on the zone of the ump behind the plate. If the ump for our hypothetical pitch above calls, on average, 0.05 extra strikes per pitch, then the pitch would add 0.65 extra strikes (0.70 from the actual vs. expected comparison minus 0.05, the ump factor).
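The crediting step, as I have summarized it, reduces to simple arithmetic per pitch. This sketch assumes the strike probability has already been estimated elsewhere (the LOESS fit is not reproduced), and the function names and sample pitches are hypothetical.

```python
def extra_strikes(called_strike, strike_probability, ump_extra_per_pitch=0.0):
    """Extra strikes credited to the catcher on one taken pitch.

    called_strike:       True if the umpire called the pitch a strike.
    strike_probability:  estimated chance of a called strike for that pitch,
                         assumed to come from a model fit elsewhere.
    ump_extra_per_pitch: average extra strikes per pitch for that umpire.
    """
    actual = 1.0 if called_strike else 0.0
    return actual - strike_probability - ump_extra_per_pitch


def total_extra_strikes(pitches):
    """Sum extra strikes over (called_strike, probability, ump_factor) tuples."""
    return sum(extra_strikes(*pitch) for pitch in pitches)


# The example from the text: a 0.30 pitch called a strike by an ump worth 0.05.
print(round(extra_strikes(True, 0.30, 0.05), 2))

# A few made-up pitches for one catcher.
sample = [(True, 0.30, 0.05), (False, 0.60, 0.05), (True, 0.90, -0.02)]
print(round(total_extra_strikes(sample), 2))
```

Turning the season total into runs would still require a run value per extra strike, which is a separate assumption not covered here.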
I love Letson’s idea and think it is a tremendous attempt at quantifying framed pitches, or at least the number of extra strikes that are not attributable to the ump or the pitcher, which amounts to backing into a definition of framed pitches.
The above components are certainly integral to crafting a metric measuring what catchers add or subtract behind the plate, but the qualitative aspects of calling a game and being a good receiver should not be forgotten. The problem is that I don’t feel incredibly comfortable using a blanket estimate range and attributing some number in that range to these aspects. While Catcher ERA isn’t persistent from year to year, that doesn’t mean what actually occurred was fake; it just means it might not happen next year.
Let’s open a discussion here on this topic. Keep in mind that what was presented here is more of a summary of others’ work to instill a solid foundation of what we’re working with, as opposed to my own attempt at a metric. How do you feel about how we are currently accounting for these components? How should the qualitative components be factored in? How important do you consider catcher defense to be given the massive positional adjustment of simply being a catcher? Hopefully we can get a good discussion going and hammer out some details or even add a few new ones into the fold.