October 2, 2007
The Umpires, Part I
Last night's one-game playoff between the Padres and Rockies was an exciting end to the regular season, made a bit more exciting by shoddy play-calling from the umps. With technology giving us detailed pitch location data, and additional cameras and high-definition television giving us a better view and more angles than ever before (except on TBS, where last night's game was presented with the same level of intensity and production values as a mid-May tilt between the Braves and Pirates), today's umpires work the playoffs under scrutiny bordering on the microscopic. In the view of some commentators, the question isn't if but when the umpires are going to have more embarrassing moments this month, and how disastrous they'll be when they happen.
Today we're going to discuss some of the tools that are used to analyze umpires' performance, and some of the research that's been based on this raw data. We'll start with the Umpires Report, located on our statistics page. The report focuses on calls made by the umpire when he's behind the plate calling balls and strikes, and essentially it gives information on every type of event that occurred in those games, including data that may seem extraneous, such as hits and home runs allowed, hit by pitch, etc. Naturally, the calls that we're going to be interested in are those on which the umpire has a direct effect: balls, strikes, walks, strikeouts, and a few others.
First, let's look at some data that could identify umpires who might have been squeezing the strike zone in 2007, with a minimum of 200 innings behind the plate:
Name              G   UBB Rate (%)  Name            G   Ball %
Paul Schrieber    31  9.98          Paul Schrieber  31  40.3
Jerry Layne       19  9.24          Greg Gibson     34  40.2
Gerry Davis       35  9.24          Jerry Layne     19  40.0
Angel Hernandez   36  8.97          Gerry Davis     35  39.6
Greg Gibson       34  8.77          Jim Joyce       34  39.6
Dana DeMuth       34  8.74          Tim Timmons     33  39.4
Lance Barksdale   35  8.68          Ed Montague     31  39.4
Chuck Meriwether  33  8.54          Jerry Crawford  16  39.3
Ed Rapuano        34  8.44          Dana DeMuth     34  39.3
Mike Reilly       34  8.43          Randy Marsh     35  39.2
Some of you might remember unintentional walk rate (UBB Rate, sometimes also referred to as UBBR in our database) from the Non-Contact series. Eagle-eyed readers will note that there's no actual "Ball %" statistic in the Umpires Report, or in its customizable version. Nonetheless, the report does contain the raw data needed to work out what percentage of the pitches an umpire saw were called balls, and making this calculation on a spreadsheet is a relatively simple matter.
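For readers who would rather script the calculation than use a spreadsheet, here is a minimal sketch in Python. It assumes you have already pulled two totals for an umpire out of the report: called balls and total pitches seen. The function name, parameter names, and the sample totals are hypothetical and are not taken from the report itself.

```python
def ball_percentage(called_balls: int, total_pitches: int) -> float:
    """Return the share of all pitches seen that were called balls, in percent.

    Both arguments are assumed to be season totals for one umpire's games
    behind the plate; the names are illustrative, not the report's columns.
    """
    if total_pitches <= 0:
        raise ValueError("total_pitches must be positive")
    return 100.0 * called_balls / total_pitches


# Made-up totals for illustration only; not any real umpire's line.
example = ball_percentage(called_balls=4000, total_pitches=10000)
print(round(example, 1))  # 40.0
```

The same division, with strikes in the numerator instead of balls, gives the corresponding strike percentage.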
Now, we'll look at the opposite: umps whose strike zones might just have extended from batter's box to batter's box this season (again, with a minimum of 200 innings as the home plate ump):
Name              G   K Rate (%)    Name             G   Strike %
Andy Fletcher     18  20.0          Doug Eddings     34  46.6
Mark Wegner       33  20.0          Laz Diaz         36  45.8
Doug Eddings      34  19.6          Andy Fletcher    18  45.4
Tom Hallion       33  19.5          Bruce Froemming  33  45.3
Jeff Nelson       20  18.9          Jim Wolf         33  45.3
Laz Diaz          36  18.8          Mark Wegner      33  45.1
Chad Fairchild    28  18.8          Tom Hallion      33  45.1
Bill Miller       34  18.6          John Hirschbeck  26  45.1
Bill Welke        34  18.3          Bill Miller      34  44.9
Rob Drake         35  18.3          Bill Welke       34  44.9
If you were to calculate Ball Percentage and Strike Percentage for umps in 2007, you'd find that the numbers don't add up to 100 percent. This is because in the Umpires Report database batted balls (balls in play plus homers) are not treated as strikes. This makes some measure of sense, as an ump may not necessarily have much influence on how often the batter puts the ball in play, but we note it because it's not the way we're most accustomed to seeing things, where every pitch thrown is identified as either a ball or a strike.
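To see why the two percentages fall short of 100, it may help to split every pitch into three buckets rather than two. The sketch below is only an illustration of that arithmetic: the function name and the pitch counts are invented, and the three-way split is our reading of the description above, not the report's actual schema.

```python
def pitch_shares(balls: int, strikes: int, batted: int) -> dict:
    """Split total pitches into ball, strike, and batted-ball percentages.

    'batted' stands for balls in play plus home runs, which the text says
    are counted separately rather than as strikes.
    """
    total = balls + strikes + batted
    if total <= 0:
        raise ValueError("need at least one pitch")
    return {
        "ball_pct": 100.0 * balls / total,
        "strike_pct": 100.0 * strikes / total,
        "batted_pct": 100.0 * batted / total,
    }


# Invented counts for illustration only.
shares = pitch_shares(balls=400, strikes=450, batted=150)
print(shares["ball_pct"] + shares["strike_pct"])  # 85.0
print(sum(shares.values()))                       # 100.0
```

With these invented counts, balls and strikes together cover 85 percent of pitches; the missing 15 percent is the batted-ball bucket.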
The data above seems to be of substantial import to people interested in handicapping and placing wagers on baseball contests, presumably only in places where that kind of thing is legal. But the data we're looking at is raw, and doesn't account for that watchword of the analytical community: context. There are no adjustments for ballpark effects, strike zone control of batters, or, most importantly, quality of pitchers. We also don't have any indication of the relative influence of the umpire, pitcher, and hitter on the strike zone of any given game.
We're going to stop here for now. A few more warnings before you go off to play with the Umpires Report:
Christopher A. Parsons, Johan Sulaeman, Michael C. Yates, and Daniel S. Hamermesh, "Strike Three: Umpires' Demand for Discrimination": This study from the University of Texas, which was covered in Time magazine, indicates the presence of racial bias in ball/strike calls in favor of batters and pitchers who are of the same race as the umpire, and shows an advanced use of the type of raw data contained in the Umpires Report.
Mitchel Lichtman, "A Fascinating Study…": The comments on this blog post by one of the authors of The Book (particularly #24 and #25) contain follow-up work related to the University of Texas study on umpire discrimination.
Nate Silver, Lies, Damned Lies, "Fixing It": Inspired by the NBA referee scandal, this study looks at the possible on-the-field effect of a hypothetical attempted fix by a corrupt umpire. It's also potentially instructive for examining the effect of the bias claimed in the University of Texas study in terms of real-life wins and losses.
Dan Fox, Schrodinger's Bat, "Calling the Balls and Strikes": The next generation of baseball research will draw on increasingly advanced pitch- and ball-tracking systems such as MLBAM's Pitch f/x. In this study, Dan Fox uses Pitch f/x to examine the accuracy of ball and strike calls by the men in blue.