Ever since the PITCHf/x system debuted in the 2006 playoffs, people have been interested in what it says about the strike zone that the umpires call.

John Walsh and Jonathan Hale provided some of the seminal work on the topic. John observed how the umpires called a strike zone that was wider than the rulebook definition but not as tall and that it was shifted toward the outside for left-handed batters. Jonathan also looked at the umpire zones and broke down the results by umpire and pitcher.

Jonathan and recent BP addition Dan Turkenkopf built on this work by examining how the strike zone changed in a variety of situations: by inning, by pitcher age/experience, by pitcher control, by home/away team, etc.

More recently, John Walsh and J-Doug Mathewson raised the profile of the discussion with articles about how umpire zones change based on the ball-strike count and other factors. Jonathan Hale and Dave Allen had observed many of these effects previously, but John and J-Doug’s work got the attention of Rob Neyer, and with that attention came a lot of criticism of umpire abilities from Rob and others. Unfortunately, the focus was more on inflaming hysteria about bad umpiring than on thoughtful sabermetric questioning:

From Major League Baseball's perspective, it doesn't matter why it's happening. It shouldn't be happening, and we can only hope that something's being done. The strike zone should be the strike zone, regardless of the count. If the umpires call it the same way every time, the players will adjust accordingly. And it's worth pointing out that the games would go just a little quicker if umpires weren't consistently extending plate appearances based on the count.

Why would umpires be influenced to change their strike zones in so many different ways? What physical or mental factors influence them, and is there any evidence to support those theories? Are the umpires really as inconsistent as the data presented by the articles to date would suggest?

I have spent portions of the last couple of years investigating the data about the strike zone and puzzling over those questions, mostly without finding answers I considered satisfactory. Recently, however, a number of separate ideas coalesced into a theory about how umpires actually call the strike zone. Most of the ideas are individually well known in baseball and will not come as a surprise. What surprised me was how they fit together into one consistent theory that explains a great deal of the observed data about the umpire strike zone.

Individual Batter Zones

As mentioned above, several analysts have found that different pitchers have different strike zones called by the umpires. It turns out that batters also have their own individual zones. That this is true in the vertical dimension is not surprising, of course. Batters are of differing heights, from David Eckstein at 5’7” to Adam Dunn and Corey Hart at 6’6”, and their crouches are more or less extreme, leading to varying vertical boundaries even for the rulebook strike zone. However, individual batter strike zones, like individual pitcher strike zones, also vary by an inch or two on the inside and outside edges.

In aggregate, this shows up as the previously observed difference between left-handed and right-handed batters, but it also extends to differences among individual batters of the same handedness.

The first theory I examined as an explanation for this variance among batter zones was that batters who stood closer to the plate had their zones shifted outside as a consequence of the umpire using the batter’s body as a reference point. It is true that there is a correlation between batters who stand closer to the plate, as measured by how often they are hit by inside pitches, and those whose strike zones are shifted outside. However, the correlation is not perfect, and the counter-examples are particularly instructive as to the true cause of shifted strike zones.

For example, Jason Kendall hangs over home plate and is among the batters with the highest hit-by-pitch rates. Nonetheless, his strike zone was shifted almost an inch inside relative to the average right-handed batter. On the other end of the spectrum, Nelson Cruz has a very upright stance that keeps him away from the plate, and he is near the low end on hit-by-pitch rates. However, his strike zone was shifted almost an inch outside relative to average. What could be causing this disparity?

You might know that Kendall displayed the last vestiges of a power stroke during the final years of the Clinton administration, while Cruz had a slugging percentage of .555 over the last three years. Despite Kendall’s tendency to crowd the plate, pitchers are unafraid to come inside and over the plate to him, whereas low and away is the favorite spot for a hurler confronting Cruz.

The typical pitch location seen by the batter has a strong correlation to the horizontal shift in his strike zone. Batters who see more pitches on the outside edge also see their strike zone boundaries shift farther away on both the outside and inside edges of the plate. Batters who see more pitches on the inside edge see their strike zone boundaries shift toward the inside.
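
For readers who want to check this kind of relationship on their own data, here is a minimal sketch of the correlation calculation. It assumes you have already computed, for each batter, an average horizontal pitch location and an estimated strike zone shift, both in inches with positive meaning farther outside. The batter labels and numbers below are made-up placeholders, not measurements from this study.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-batter values (inches, positive = farther outside):
# (average pitch location seen, estimated zone shift). Placeholders only.
batters = {
    "batter_a": (2.1, 0.8),
    "batter_b": (0.4, -0.6),
    "batter_c": (1.2, 0.1),
    "batter_d": (3.0, 1.1),
}

locations = [loc for loc, _shift in batters.values()]
shifts = [shift for _loc, shift in batters.values()]
print(f"correlation: {pearson(locations, shifts):.2f}")
```

A strong positive value would be consistent with the pattern described above, though a correlation by itself says nothing about which way the causality runs.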

The question, then, is whether the umpire calls the strike zone differently because the pitcher and catcher are aiming the pitches differently to the batter or whether the causality runs in the opposite direction. If the pitcher and catcher are adapting to an already existing umpire-specific zone for each batter, that still leaves us with the unanswered question of why the umpire zone differs from batter to batter, and batter positioning at best offers only a partial answer to that question.

I was unable to definitively answer the question of the direction of causality, but a mountain of circumstantial evidence points to the umpire zone being influenced by the location of the catcher’s target, rather than the other way around. Thus, I propose that the catcher target is the driving factor in how umpires call balls and strikes.

The Catcher Target Theory

In June 1993, Baseball Digest quoted Matt Nokes with his views on how umpires call the zone.

“Predictability is the key to getting borderline calls,” says Matt Nokes of the Yankees. “If the pitcher is consistent, then the umpire knows where to be looking. But if the catcher is jerking all around the plate and the ump does not know what is coming in where, it’s going to be harder for him to focus on those close pitches and you won’t get them. If the pitcher is throwing consistently where the catcher is setting up, he doesn’t have to be so fine. But if I set up inside and the pitch is on the outside corner, even if it is a strike, we’re not likely to get that call. Even if the pitch is over the outer half of the plate, it will be called a ball, because it missed the catcher’s target so bad. That’s just the way it is.”

If the umpires adapt their strike zones based on the location of the catcher target, it explains with one consistent theory many of the heretofore observed phenomena regarding the zone. If the catcher changes his target based on shifting umpire zones, these phenomena remain a collection of unrelated and unexplained oddities requiring a variety of unsubstantiated and sometimes contradictory theories about umpire motivations.

For example, as mentioned earlier, the zone for left-handed batters is shifted toward the outside. Do umpires have some bias against left-handed hitters? If so, why? Perhaps a more likely explanation is that they simply call more strikes outside to lefty hitters because that’s where the catchers are setting their targets, and the umpires are using the target as a cue. While right-handed batters see 58 percent of pitches outside of the midpoint of the plate, left-handed batters see 66 percent of pitches on the outside half. The average pitch to a left-hander is 2.4 inches farther outside than the average pitch to a right-hander, which dovetails nicely with John Walsh’s finding that the average strike zone for a left-handed batter was shifted 2.3 inches farther outside than the average zone for a right-handed batter.
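
The outside-half share and the average location by batter handedness can be computed along the following lines. This is a sketch under stated assumptions, not the exact procedure behind the numbers above: it assumes each pitch has a horizontal plate location `px_feet` in feet, measured from the center of the plate from the catcher's point of view with positive values toward the catcher's right, and a batter side `stand` of `'L'` or `'R'`. The function and parameter names are hypothetical.

```python
def outside_offset_inches(px_feet, stand):
    """Signed horizontal location in inches, positive = outside for this batter.

    Assumption: px_feet is positive toward the catcher's right, so the
    outside half is +px for a right-handed batter and -px for a lefty.
    """
    if stand not in ("L", "R"):
        raise ValueError("stand must be 'L' or 'R'")
    sign = 1.0 if stand == "R" else -1.0
    return sign * px_feet * 12.0


def summarize_by_stand(pitches):
    """Return {stand: (share_outside, mean_offset_inches)}.

    pitches: iterable of (px_feet, stand) pairs.
    """
    offsets = {}
    for px_feet, stand in pitches:
        offsets.setdefault(stand, []).append(outside_offset_inches(px_feet, stand))

    summary = {}
    for stand, values in offsets.items():
        share_outside = sum(1 for v in values if v > 0) / len(values)
        mean_offset = sum(values) / len(values)
        summary[stand] = (share_outside, mean_offset)
    return summary
```

With real data, the difference between the two mean offsets is the figure to set beside the measured shift in the strike zone.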

If umpires are influenced by the catcher target, it also explains why individual pitchers see such different zones. J-Doug Mathewson’s research placed Livan Hernandez and Felix Hernandez at opposite ends of the spectrum in benefiting and suffering from changing umpire zones. Look at where those two pitchers locate. (Pitches where the batter swung are not shown.)

Livan Hernandez aims toward the very edges of the zone, or even a little outside, both to righties and lefties, and it appears that the umpires give him the strike call when he hits the middle or inside of the catcher target. Felix Hernandez, on the other hand, aims closer to the middle of the zone. If he locates a pitch at the edge of the zone, he’s very likely to have missed his catcher target, and the umpires don’t give him the strike call in those cases.

Similarly, one of the variations in umpire zones that Dan Turkenkopf identified can be explained by variations in the locations that pitchers and catchers are targeting. Dan found that the older (or more experienced) a pitcher was, the bigger the zone he got from the umpires. It also happens to be true that the older a pitcher is, the more he pitches to the outside edges.

Why older pitchers pitch more on the outside edge is a question for further investigation, but it’s no accident that this affects the strike zone that pitchers see. This is another piece of circumstantial evidence that umpires are giving pitchers strikes on the edges when they hit the catcher target.

Catcher Framing

We can even look at pitchers who live on the edge of the zone and see some surprising differences among their receivers. Livan Hernandez got a bigger strike zone in 2008 with Joe Mauer behind the plate than he did when Mike Redmond was his catcher. In 2009, Hernandez got a bigger zone with Wil Nieves than with Omir Santos behind the dish. The effect is not huge, but it’s noticeable: the difference of a couple of strikes per game. Other pitcher-catcher pairs demonstrate this effect, as well.

Compare the strike zones that Javier Vazquez saw with Jorge Posada and Francisco Cervelli catching him in 2010.

Vazquez saw a slightly larger zone on the outside edge to left-handed batters and especially to right-handed batters when Cervelli was catching.

Analysts such as Dan Turkenkopf and Bill Letson have looked at the issue of catcher framing using the PITCHf/x location data. They found dramatic and repeatable differences in framing performance among catchers, to the tune of 50 runs per season or more. Our catcher target theory of the zone would suggest, however, that a large part of this difference may be due to the typical pitch distributions thrown by pitchers and seen by batters. The differences in batter pitch distributions would probably mostly wash out over a season-size sample for a full-time catcher, but the pitcher sample for each catcher could remain highly biased and have a large effect on the framing measurement.

When Bill’s catcher framing numbers for 2008-2009 are normalized by pitcher, the best and worst catchers are around +/- 20 runs per season. This method could benefit from some additional fine-tuning, but at least the size of the effect is now in a range much more compatible with the size of the catcher ERA effect that Sean Smith found by studying catcher-pitcher pairs in the Hardball Times 2011 Annual.
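
One simple way to normalize by pitcher is to credit a catcher only for called strikes above the rate that each pitcher gets across all of his catchers. The sketch below illustrates that idea. It is an assumption about how such a normalization could work, not a description of the method used to produce the numbers above, and the function name and input layout are hypothetical.

```python
from collections import defaultdict


def framing_vs_pitcher_baseline(called_pitches):
    """Extra called strikes per catcher relative to each pitcher's own rate.

    called_pitches: iterable of (pitcher, catcher, is_strike) for pitches
    the batter did not swing at. Returns {catcher: extra_strikes}.
    """
    called_pitches = list(called_pitches)

    # First pass: each pitcher's called-strike rate over all his catchers.
    totals = defaultdict(lambda: [0, 0])  # pitcher -> [strikes, pitches]
    for pitcher, _catcher, is_strike in called_pitches:
        totals[pitcher][0] += int(is_strike)
        totals[pitcher][1] += 1

    # Second pass: credit the catcher with the call minus that baseline.
    extra = defaultdict(float)
    for pitcher, catcher, is_strike in called_pitches:
        strikes, n = totals[pitcher]
        extra[catcher] += int(is_strike) - strikes / n
    return dict(extra)
```

Because each pitcher's baseline includes the pitches caught by the catcher being rated, this version pulls full-time catchers toward zero. It also ignores pitch location and stops at strikes; converting to runs would require a run value per strike, which is not given here.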

This is an effect that has been observed at least as far back as 1989, if not earlier. In The Diamond Appraised, Craig Wright discussed catcher framing skills.

Surprisingly, one of the key differences between the best and the worst is a mechanical factor. A catcher can get more strike calls on borderline pitches by not showing the umpire his glove as a target, or at least by drawing it back after the target is given. The best catchers—particularly the ones who call fewer walks in the matched innings—tend to give a full open-faced target to the pitcher and hold the glove closer to their body (watch Boone and Gary Carter). Holding the glove in toward the body is partially a physical reaction. Holding the glove perpendicular to the ground is a strain on the wrist and the forearm; holding the glove closer to the body eases the tension in the arm.

At first, the technique may seem counterproductive, giving a better target to the pitcher, but at the cost of losing the umpire by taking the glove out of his view. It would also seem to hurt your chances of getting a strike call by making you move more to go after bad pitches, particularly the low ones.

But that isn't the way it works. It's easy enough to handle the pitches around the strike zone with the glove held close to the body. The excess movement going after a bad pitch doesn't make a difference, because those are obvious ball calls anyway. It may even help emphasize to the umpire that if the catcher has to move a lot, it's a ball. Now consider the borderline pitch. Along with his natural judgment, the umpire is instinctively looking for clues. If he can't see the glove clearly, he may rely more on the catcher's movement; he didn't move, so it's a strike.

Not every strike zone variation can be explained by the catcher target theory, however. For example, there is a small but significant portion of the home field advantage (around 15 percent, according to Dan and J-Doug’s research) which derives from the strike zone. Average pitch distributions are very similar between home and visiting pitchers, implying that there is a different cause for the variation in umpire zone in that case.

The Effect of the Ball-Strike Count

Let’s dive in deeper on another source of strike zone variation that doesn’t seem to be explained by the catcher target theory: the changing strike zone by ball-strike count.

To recap, Jonathan Hale, Dave Allen, John Walsh, and J-Doug Mathewson have all observed that the strike zone is bigger in ball-strike counts that favor the hitter and smaller in counts that favor the pitcher. Since pitchers tend to pitch more to the edges in pitchers’ counts and more to the middle of the zone in hitters’ counts, our catcher target theory of the strike zone would suggest the zone should get bigger in pitchers’ counts and smaller in hitters’ counts. But that’s not what happens. What gives?

This is not a question to which I have an answer yet. However, it is instructive to look at the detailed location data by count. Though Dave conducted a regression that indicated that pitch type had no significant impact on the size of the zone at different counts, I found that pitch type and pitcher handedness did have a noticeable impact.

I compared mid-height pitches on the outside edge to right-handed batters at the 0-2 and 2-0 counts. The outside edge of the strike zone at 2-0 is about an inch and a half farther outside than it is at 0-2, as defined by the point where the umpire calls 50 percent balls and 50 percent strikes on average.
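
The 50 percent boundary can be estimated in several ways. The sketch below uses one of the simplest, which is an assumption for illustration rather than the procedure used for the figures above: group called pitches into one-inch bins by horizontal distance from the center of the plate, compute the strike rate in each bin, and interpolate linearly between the two adjacent bins where the rate drops through 50 percent. The function name and input layout are hypothetical.

```python
def fifty_percent_edge(called, bin_width=1.0):
    """Estimate where the called-strike rate falls through 50 percent.

    called: iterable of (offset_inches, is_strike), with offset measured
    from the center of the plate toward the outside edge.
    Returns the interpolated offset in inches, or None if no crossing.
    """
    bins = {}
    for offset, is_strike in called:
        key = int(offset // bin_width)
        strikes, n = bins.get(key, (0, 0))
        bins[key] = (strikes + int(is_strike), n + 1)

    # (bin center, strike rate), ordered from the plate outward.
    rates = [((k + 0.5) * bin_width, s / n) for k, (s, n) in sorted(bins.items())]

    for (x0, r0), (x1, r1) in zip(rates, rates[1:]):
        if r0 >= 0.5 > r1:
            # Linear interpolation between adjacent bin centers.
            return x0 + (r0 - 0.5) * (x1 - x0) / (r0 - r1)
    return None
```

Running this separately on mid-height called pitches at 2-0 and at 0-2 and subtracting the two results gives the kind of difference quoted above. With sparse bins the estimate is noisy, so a fitted curve may be preferable for small samples.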

At this boundary, changeups were highly likely to be called strikes, especially from left-handed pitchers. Changeups made up about nine percent of the mid-height pitches on the outside edge at 2-0, but only three percent at 0-2. Sinking fastballs displayed a similar effect. On the other hand, sliders and curveballs were highly likely to be called balls, especially from left-handed pitchers. Breaking balls made up about eight percent of the mid-height pitches on the outside edge at 2-0, and 24 percent at 0-2.

Nonetheless, even if only four-seam fastballs are considered, the strike zone is still larger at 2-0 than it is at 0-2, still by about an inch and a half, and the reason for this remains unclear.

I did not find any significant bias in the sample of batters or pitchers between the 0-2 and 2-0 counts in terms of their pitch location distributions. The differences were less than one tenth of an inch, which is a small fraction of the observed effect.

Even if the sample of players is relatively unbiased, the PITCHf/x data itself may be biased. There is some error associated with all measurements, and though the PITCHf/x plate location measurements are highly accurate, they are not perfect. Umpire ball-strike calls give us some information about the likely direction of this measurement error. The effect is usually not large, but small distinctions can become very important at the edge of the strike zone. Thus, corrected PITCHf/x plate location measurements may be needed for some types of strike zone research.

I have previously speculated in response to the findings about the zone and the count that PITCHf/x measurement errors could be playing a role in the measured size of the zone. However, this effect also turns out to be only a fraction of the observed effect. Moreover, correcting for it actually widens the difference in the size of the zone rather than narrowing it. After accounting for PITCHf/x measurement error, the actual strike zone edge is about 0.4 inches closer to the plate on the 0-2 count and about 0.2 inches closer to the plate on the 2-0 count.
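
As a rough worked example using the rounded figures in this section, let the measured outside edges be written as distances in inches from the center of the plate. The symbols are introduced here for illustration, and "closer to the plate" is read as a smaller distance from the center.

```latex
\begin{align*}
\Delta_{\text{measured}} &= e_{2\text{-}0} - e_{0\text{-}2} \approx 1.5 \text{ in} \\
\Delta_{\text{corrected}} &= (e_{2\text{-}0} - 0.2) - (e_{0\text{-}2} - 0.4) \\
&= \Delta_{\text{measured}} + 0.2 \approx 1.7 \text{ in}
\end{align*}
```

On that reading, the correction moves the gap between the two counts from about 1.5 inches to about 1.7 inches, so measurement error cannot account for the effect.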

A better understanding of how and why the zone changes size by count awaits the results of further research.

What’s Next?

Anyone researching the performance of umpires in calling balls and strikes is strongly encouraged to consider the catcher target theory. It does not fully explain every umpire variation, but it appears to be the primary factor in many cases.

Catcher framing is also an important topic in its own right. The specifics of what catchers do are worthy of further research with the PITCHf/x data. Noting that the catcher target affects the umpire zone is one thing; identifying and quantifying the effect of specific catcher mechanics is another. In any case, the ability to better quantify this aspect of catcher fielding is very important.

Whether baseball benefits from umpires adapting their zones to the catcher’s target is not necessarily a question with a simple answer. Umpires are rewarding pitchers for accuracy and command and penalizing them for being inconsistent and missing their target. Pitchers and catchers presumably expect this, and in As They See ‘Em, Bruce Weber argues that coaches and players in the dugout also judge balls and strikes in relation to the catcher target.

[Umpires will] say calling strikes is paramount, but they’ll withhold a strike call from time to time—if the pitcher badly misses the catcher’s target, for example, even if the ball might still graze the zone. If the catcher sets up outside and the pitch is up and in, the umpire ethos says the pitcher doesn’t deserve a close call for doing a poor job. Besides that, he’s made the catcher lunge; his glove probably moved out of the strike zone, which means it’ll look like a ball from the dugout, which means the umpire will be getting an earful if he calls a strike.

Is it a good thing or bad thing that some pitchers, like Livan Hernandez, are able to make a living by persistently targeting the edges of the zone? Similarly, some catchers can help their pitchers and their teams by gaining strike calls from the umpire through superior receiving mechanics. The umpire zone affects the career prospects of some players positively and others negatively. There is undoubtedly skill involved from both the pitchers and the catchers who are able to expand the strike zone. Such skill would be lost if the umpires were replaced with pitch-calling robots or were retrained to call the exact same size of zone for every pitch, regardless of the catcher target. Some people would find such fairness laudable; others would lament the passing of valuable baseball skills.

Even if it were desirable, it may not be possible for the umpires to cease using the catcher target as a physical reference point. It might be difficult for umpires to change that behavior, whether it is conscious or unconscious. The Zone Evaluation system, which is based upon PITCHf/x data and is used by Major League Baseball to grade umpires, takes into account the catcher target, according to statements made by Sportvision representatives at the 2008 PITCHf/x Summit. (The details of how catcher target is included in the umpire grading process were not given. It may be similar to the process used with Questec data, which is described in As They See ‘Em, pp. 198-199.)

In the cases where the catcher target theory does not appear to explain the strike zone variation, it’s worth taking a deeper look to see whether the mix of pitch types and pitcher and batter handedness is concealing instances of the catcher target theory operating in directions that cancel each other out. Moreover, the pitcher, batter, or even catcher samples may be biased toward types of players who throw or see atypical pitch distributions.

Even if the sample of players is relatively unbiased, the PITCHf/x data itself has measurement error which may bias the results. Thus, corrected PITCHf/x plate location measurements may be needed for some types of strike zone research.

Umpire grading, whether individually or collectively in various game situations, is a tricky task. Umpires appear to call a zone that is very dependent on the location of a pitch relative to the catcher’s target. Sample bias and PITCHf/x measurement error are also confounding factors. One may wonder how well Major League Baseball’s umpire grading accounts for these factors.

Many fans and writers rush to stick negative labels on umpires and to jump to conclusions of incompetence without carefully investigating the data. It’s not enough to throw up a PitchTrax graphic of the strike zone with a strike call shown outside the box in order to declare an umpire an incompetent idiot better replaced with a machine.

PITCHf/x data has been a great boon for baseball analysis, including the analysis of umpires and the strike zone, but it requires careful analysis if we are to come to conclusions that will stand up to scrutiny. The strike zone is an important topic, and the quality and motivations of umpires are worth investigating deeply. Let’s not stop with half answers and then delude ourselves that we are ready to sit in judgment of the umpires.