“When you get right down to it, no corner of American culture is more precisely counted, more passionately quantified, than the performance of baseball players.”
–Alan Schwarz, from the introduction to The Numbers Game

All of us who appreciate the numbers that baseball generates understand that they have an inherent visual appeal. Whether it’s the simple “slash stats” in the familiar 3-4-5 pattern (.331/.417/.559 describing Stan Musial’s career, for example) or an entire batting line from the back of a baseball card, for us the order, spacing, and shapes of the numbers themselves paint a picture of the players they describe. And from spray charts to spark lines to graphical representations of daily performance, we continue to find new ways to illustrate and appreciate the numbers of the game.

In a very small way, I hope to contribute in that vein this week by releasing version 2.0 of the Balls In Play Chart software I discussed at the end of 2006. In addition, we’ll take on a reader question and look at the size of the strike zone in the postseason.

Did You Say Free?

Some readers will recall that the BIPChart software breaks down all batted balls for hitters and against pitchers into fly balls, groundballs, line drives, and popups, and that it displays a percentage for each in a baseball field that appears on the left side of the window. It then further divides these into vectors based on the fielder who fielded the ball. Balls fielded by the third baseman, shortstop, and left fielder fall on the left side of the field, so they’re included in the slice to the left; those fielded by the catcher, pitcher, and center fielder are assigned up the middle; and balls fielded by the first baseman, second baseman, and right fielder are assigned to the right-side vector. The number and percentage of balls are then displayed on the diamonds for each hit type, as shown below for all left-handed hitters from 2003 through 2007.
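The fielder-to-slice assignment described above can be sketched in a few lines of Python. This is only an illustration, not the BIPChart source: the input format (a list of hit-type and fielder-position pairs) is an assumption, as is the choice to express each slice as a share of its own hit type.

```python
from collections import Counter

# Standard scorekeeping position numbers:
# 1=P, 2=C, 3=1B, 4=2B, 5=3B, 6=SS, 7=LF, 8=CF, 9=RF
SLICE_BY_FIELDER = {
    5: "left", 6: "left", 7: "left",        # third baseman, shortstop, left fielder
    2: "middle", 1: "middle", 8: "middle",  # catcher, pitcher, center fielder
    3: "right", 4: "right", 9: "right",     # first baseman, second baseman, right fielder
}

def slice_breakdown(balls_in_play):
    """Count balls in play by (hit type, slice).

    balls_in_play: iterable of (hit_type, fielder_position) tuples.
    This record layout is hypothetical.
    Returns {(hit_type, slice): (count, percent of that hit type)}.
    """
    counts = Counter(
        (hit_type, SLICE_BY_FIELDER[position])
        for hit_type, position in balls_in_play
    )
    totals_by_type = Counter()
    for (hit_type, _slice), n in counts.items():
        totals_by_type[hit_type] += n
    return {
        key: (n, 100.0 * n / totals_by_type[key[0]])
        for key, n in counts.items()
    }
```

Keying the lookup on the fielder rather than on hit coordinates mirrors the description above: the slice is determined by who fielded the ball, not by where it landed.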


Just as for the aggregate data, the data for individuals spans the 2003 through 2007 seasons, with hitters getting one set per year (except switch-hitters, who get two) and pitchers getting one each against lefties and righties. To find the player you’re looking for, you simply use the drop-down box in the upper left to start typing the last name of the player. You can compare individual seasons for a player by using the down arrow to scroll through each year.

New to this version are two primary enhancements. First, you’ll notice that on the different fields a batting average is displayed in parentheses after each percentage value, indicating the player’s batting average (for pitchers it’s average against) on the balls in play for that category. For example, for all left-handed hitters, line drives hit to the left side yielded a batting average of .657, while those hit to the right side resulted in a .729 average. At the top of the diamond on the left you can also see that, overall, lefties hit .326 on balls in play and righties .323. Note that this definition of batting average on balls in play includes home runs, so the numbers are typically higher than those you might find on various baseball sites. In addition, these balls in play do not include bunts.
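To make the difference between the two definitions concrete, here is a minimal sketch. The event format and its keys (`is_hit`, `is_home_run`, `is_bunt`) are hypothetical; the only things taken from the description above are that home runs count as balls in play and bunts do not.

```python
def average_on_balls_in_play(events, include_home_runs=True):
    """Batting average on batted balls, with bunts always excluded.

    events: iterable of dicts with hypothetical boolean keys
            'is_hit', 'is_home_run', and 'is_bunt'.
    include_home_runs=True follows the definition described above;
    False drops home runs from both hits and balls in play, as the
    more common definition does.
    Returns None when there are no qualifying balls in play.
    """
    pool = [e for e in events if not e["is_bunt"]]
    if not include_home_runs:
        pool = [e for e in pool if not e["is_home_run"]]
    if not pool:
        return None
    hits = sum(1 for e in pool if e["is_hit"])
    return hits / len(pool)
```

Because every home run is a hit, adding home runs to both the numerator and the denominator can only raise the average (or leave it unchanged), which is why the figures shown in the software run higher than the conventional ones.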

Second, you’ll see a button labeled “Grid Display” directly above the diamond on the left (it’s disabled in the screenshot above). When viewing a player line, the button will be enabled; when pressed, it will display a small window with the seasonal aggregates for that player. For example, when viewing David Ortiz, the window will display as follows:

Ortiz Chart

To get started installing and using the software (which runs on Windows) you can click on this link to download the set-up program. By running the .msi file, you’ll be prompted to install the application in a folder on your machine; it will create a shortcut both on the desktop and in your Start/Programs menu. The only prerequisite for installation is the .NET Framework 2.0 runtime, which you can download and install from here.

I hope you enjoy the new version and, as always, I welcome your feedback. And just to throw out an idea: I’d like to gauge the interest in making this software auto-updating. For example, during the season the application would recognize the availability of new data on startup and download it, thereby giving you up-to-date information.

The Incredibly Shrinking Strike Zone?

At this time of year, you’ll often hear announcers and read mainstream reporters opining about the way the game is played differently in the postseason. From employing one-run strategies to using a quick hook, they make it sound as if the game is played under a different set of rules when the weather turns cold (and thankfully, it appears cold is the worst we can expect at Coors Field this weekend).

It turns out that same mindset also plays into some folks’ perception of how umpires do their jobs in the postseason. Typical of that view is this question from a reader:

It seems that strike zones are shrinking this postseason. Can you use PITCHf/x to see if balls and strikes are being called any differently in the playoffs?

Yes, sir, indeed we can. To start, let’s do a simple comparison of the ball and strike percentages in the regular season and postseason. To do this, we’ll examine all called balls (excluding intentional balls) and called strikes that were recorded as reaching the front of home plate in a rectangle 30 inches wide, centered on home plate, and extending eight inches below and above the reported strike zone for the hitter. As in previous columns, we’re using a definition of the strike zone where we count as an “impartial strike” any ball that comes within 2.5 inches of the regulation strike zone (an inch and a half accorded because of the radius of the ball and another inch for the error in the system).
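The two filters just described can be sketched as follows. This is an approximation under stated assumptions, not the code used for the column: all measurements are taken to be in inches, `px` is the horizontal distance from the center of the plate, `pz` is the height of the pitch, `sz_bot` and `sz_top` are the reported bottom and top of the hitter's zone, and the 2.5-inch tolerance is applied by simply widening the zone on every side.

```python
PLATE_HALF_WIDTH_IN = 17.0 / 2     # home plate is 17 inches wide
TOLERANCE_IN = 2.5                 # ball radius plus system error
WINDOW_HALF_WIDTH_IN = 30.0 / 2    # study rectangle is 30 inches wide
WINDOW_VERTICAL_MARGIN_IN = 8.0    # eight inches below and above the zone

def in_study_window(px, pz, sz_bot, sz_top):
    """True if the pitch crossed the plate inside the rectangle studied."""
    return (abs(px) <= WINDOW_HALF_WIDTH_IN
            and sz_bot - WINDOW_VERTICAL_MARGIN_IN <= pz
            <= sz_top + WINDOW_VERTICAL_MARGIN_IN)

def is_impartial_strike(px, pz, sz_bot, sz_top):
    """True if the pitch is within the tolerance of the regulation zone."""
    return (abs(px) <= PLATE_HALF_WIDTH_IN + TOLERANCE_IN
            and sz_bot - TOLERANCE_IN <= pz <= sz_top + TOLERANCE_IN)

def call_agrees(call, px, pz, sz_bot, sz_top):
    """True if the umpire's call matches the impartial zone.

    call: 'called_strike' or 'called_ball' (hypothetical labels).
    """
    strike = is_impartial_strike(px, pz, sz_bot, sz_top)
    return strike if call == "called_strike" else not strike
```

Under these assumptions, a pitch 10 inches off the center of the plate at mid-zone height is an impartial strike (10 is less than 8.5 + 2.5), while one 12 inches off center is an impartial ball that still falls inside the study window.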

For the regular season, these criteria yield 111,349 pitches out of 174,563, or 63.8 percent. For the postseason, this yields 2,389 pitches out of 3,787, or 63.1 percent. The results can be summarized in the table below:

Time              CSPct      CBPct   AgreePct
Regular Season    83.5%      85.3%      84.4%
Postseason        82.8%      88.5%      85.5%

So what does this mean? When calling strikes, umpires have agreed with the regulation strike zone slightly less often thus far in the 2007 postseason than they did in the regular season. Conversely, they do better on called balls in the postseason than in the regular season, by over three percentage points. In other words, it would appear that in the postseason, umpires tend to call a slightly tighter strike zone than they do in the regular season.

From a statistical standpoint, the difference in called ball percentage is significant at the 95 percent confidence level, meaning a gap that large would be unlikely to arise by chance alone; the other differences are not statistically significant at that level.
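The column does not say which test was used, and the called-ball and called-strike counts behind the percentages are not given here. As a sketch of one common approach, a pooled two-proportion z-test compares two agreement rates given their underlying counts; the function name and arguments below are illustrative only.

```python
import math

def two_proportion_z_test(agree_a, n_a, agree_b, n_b):
    """Pooled two-proportion z-test (two-sided).

    agree_a / n_a and agree_b / n_b are the two agreement rates,
    for example postseason and regular-season called balls.
    Returns (z, p_value).
    """
    p_a = agree_a / n_a
    p_b = agree_b / n_b
    pooled = (agree_a + agree_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution:
    # 2 * (1 - Phi(|z|)) == erfc(|z| / sqrt(2))
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value
```

A p-value below 0.05 corresponds to significance at the 95 percent level. With a sample as small as the postseason's, a gap of a few percentage points can fall on either side of that line, which is why the counts matter as much as the percentages.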

What this analysis doesn’t take into account is the actual umpires who have worked in postseason games. After all, it could be the case that the particular set of umpires who have worked the 24 postseason games thus far (this was done prior to game one of the World Series) actually call a tighter strike zone than do others. When we weight the percentages utilizing the distribution of umpires who actually worked the games, we come up with the following table:

Time              CSPct      CBPct   AgreePct
Regular Season    83.1%      89.8%      86.4%
Postseason        82.8%      88.5%      85.5%

With that adjustment, the differences shrink and are not statistically significant at the 95 percent confidence level. In other words, this provides no evidence that umpires are systematically calling a different strike zone in the postseason than they do in the regular season.
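One way to make such an adjustment, offered as an assumption about the general idea rather than the exact calculation used here, is to average each umpire's regular-season rate using his share of the postseason workload as the weight. The dictionaries below are a hypothetical input format.

```python
def workload_weighted_rate(regular_season_rates, postseason_pitches):
    """Regular-season rate re-weighted to the postseason umpire mix.

    regular_season_rates: {umpire: regular-season agreement rate, 0 to 1}
    postseason_pitches:   {umpire: postseason pitches called}
    Only umpires who appear in postseason_pitches contribute.
    """
    total = sum(postseason_pitches.values())
    weighted_sum = sum(
        regular_season_rates[umpire] * pitches
        for umpire, pitches in postseason_pitches.items()
    )
    return weighted_sum / total
```

The result answers the question "what would the regular-season percentage have been if only these umpires, in these proportions, had worked the games?", so that it can be compared fairly with the postseason figure.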

Even so, I thought it would be interesting to look at the data for the umpiring crew who will work the World Series. The data in the table below includes all games for 2007, including the postseason:

2007 World Series Umpires
Name                Games Pitches     CS CSAgree  CSPct    CB CBAgr   CBPct  AllAgr AgreePct
Ed Montague            19    1825    855    737   86.2%   970   839   86.5%    1576   86.4%
Mike Everitt           18    1613    753    638   84.7%   860   738   85.8%    1376   85.3%
Ted Barrett            23    2167   1066    871   81.7%  1101   962   87.4%    1833   84.6%
Mike Reilly            17    1641    762    638   83.7%   879   743   84.5%    1381   84.2%
Laz Diaz               11    1013    521    419   80.4%   492   425   86.4%     844   83.3%
Chuck Meriwether       20    1925    884    750   84.8%  1041   840   80.7%    1590   82.6%

Crew chief Ed Montague–who was behind the plate last night–has the highest agreement percentage, and would seem overall to call the strike zone most in agreement with the rule book. Laz Diaz and Ted Barrett would seem to have the tightest strike zones.
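For readers who want to check the arithmetic, the three percentages in each row follow directly from the four counts: called strikes, called strikes that agree, called balls, and called balls that agree. The function name below is illustrative; the numbers in the usage lines are Ed Montague's row from the table.

```python
def agreement_percentages(cs, cs_agree, cb, cb_agree):
    """Return CSPct, CBPct, and AgreePct rounded to one decimal place."""
    return {
        "CSPct": round(100 * cs_agree / cs, 1),
        "CBPct": round(100 * cb_agree / cb, 1),
        "AgreePct": round(100 * (cs_agree + cb_agree) / (cs + cb), 1),
    }

# Ed Montague: 855 called strikes (737 agree), 970 called balls (839 agree)
montague = agreement_percentages(855, 737, 970, 839)
print(montague)  # {'CSPct': 86.2, 'CBPct': 86.5, 'AgreePct': 86.4}
```

Note that the overall figure is computed from the pooled counts (1,576 of 1,825 pitches) rather than by averaging the two percentages, so it leans toward whichever call type the umpire made more often.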

Finally, let me reinforce a bit of knowledge you already have, and see displayed in postseason telecasts on Fox, with a pretty graphic. Although the analysis above finds no evidence that umpires call a different strike zone in the postseason than in the regular season, the operative shape of the strike zone differs from the rule book shape. This can be seen using PITCHf/x data and creating a hexagonal bin chart, shown here:

bin chart

This chart summarizes 56,869 called strikes and plots their density using polygons that are roughly the size of baseballs. The more yellow the polygon, the more strikes were called in that area; a bright yellow polygon represents about 2.5% of all called strikes. As you might have guessed, this illustrates that the actual strike zone is flatter and wider than the rule book definition, and sits a little to the low end of the rule book zone vertically. It’s also the case that the zone differs for left- and right-handers, as we’ve talked about on a previous occasion, but a display of that fact will have to wait for another column.

For now, let’s just enjoy the World Series, and hope that it provides a fitting end to what has been another exciting and entertaining season.