October 26, 2010
Interpreting Pitch Classifications
Several of the leading pitchers in this year’s postseason make their living with a cut fastball, most notably Roy Halladay and Mariano Rivera. The list of playoff pitchers who have the cutter as an important pitch in their arsenal, though, is long. It includes Cliff Lee, C.J. Wilson, and Tommy Hunter on the Rangers; Andy Pettitte and Phil Hughes on the Yankees; and Cole Hamels on the Phillies.
Rivera and Halladay’s excellent cutters have made news for years, but with all these other successful pitchers using the cut fastball, is that a sign that it is seeing increasing adoption in the major leagues?
Sky Kalkman of Beyond the Box Score has been championing the cutter as the hot new pitch for over a year now. Dave Cameron wrote a couple posts about it recently at FanGraphs. In his second article, Dave quoted pitch type percentages that showed that cutter usage had increased from 2.2 percent of pitches thrown in 2005 to 4.7 percent of pitches thrown in 2010. His source was the pitch type classifications by Baseball Info Solutions (BIS) video scouts that are available on FanGraphs. Is that data accurate?
What is a pitch type?
The basic pitch types in baseball are the four-seam fastball, sinking (two-seam) fastball, cut fastball, changeup, slider, and curveball. There are varieties of changeups: circle change, straight (three-finger) change, split-finger fastball, palm ball, and forkball. There are rarer pitches like the knuckleball, screwball, and eephus pitch. That covers all the possible pitch types, and we all know and agree on what they mean, right? Well, no.
It does cover the range of pitch types thrown today in the major leagues, but agreement on the exact definitions is a little tougher to attain. There are a number of things that contribute to confusion when discussing pitch types.
Probably the biggest one is that every pitcher is a little bit different. Speed, movement, location, handedness, grip, delivery, and deception all combine to make a pitcher’s unique take on a pitch type. A slider from Francisco Liriano does not look or behave exactly like a slider from Brad Lidge.
In order to make some sense of this variety of pitches, it is useful to attach pitch type labels to them based on their similarities, while ignoring subtler or less important differences between them. It may not make sense to apply these sorts of traditional labels for all types of analysis, but for many purposes, it is helpful.
How the baseball moves gives rise to pitch names. A curveball curves; a slider slides. The grip that a pitcher uses may also influence the name, as with a split-finger fastball or knuckle curveball. Though pitchers may give them creative pet names like “The Thing,” pitches almost always fit in one of the basic known types. The pitch names come directly from the pitchers in interviews or filtered through the words of catchers, coaches, opposing hitters, and broadcasters, and give a sense of what pitchers call their own pitches. The increasing availability of high-resolution game photos over the past couple years also provides a fair idea of the repertoire of pitch grips for many major-league pitchers from direct evidence.
However, some pitchers throw pitches that seem to fall in between the “traditional” pitch types thrown by the majority of pitchers. A common example is the slurve, such as that thrown by Carlos Marmol and Francisco Rodriguez, with speed and movement in between that of a typical slider and curveball.
How does PITCHf/x change our level of knowledge?
The advent of the PITCHf/x system, beginning in the 2006 postseason, and the public availability of that data, has provided a detailed record of the speed and movement of pitches thrown in the major leagues. That should make the subject of labeling pitch types quite straightforward, right?
In some ways, it does. It is certainly much easier, more accessible, and more accurate than in the days of relying on newspaper accounts. (If you’re interested in this topic and those good old days, read The Neyer/James Guide to Pitchers.) It’s also easier, more accessible, and more accurate than trying to identify pitch types from a television feed.
PITCHf/x shows that pitches of the same type from a given pitcher tend to group together into identifiable clusters in the speed and movement space. In one sense, that was obvious to the pre-PITCHf/x observer since pitch identification was already possible without that data. However, in another sense, the distinctiveness of the clusters for most pitchers and most pitch types was not expected. Much pitching lore was built around the idea that good pitchers constantly varied the speed and movement on their pitches.
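The idea of pitches grouping into clusters can be made concrete with a small sketch. The example below runs a bare-bones k-means pass over (speed, horizontal spin deflection) pairs. Everything in it is illustrative: k-means is a stand-in technique rather than the method behind any classifications discussed in this article, and the six pitches and two starting centers are invented values, not PITCHf/x measurements.

```python
import math

def kmeans(points, centers, iterations=10):
    """Very small k-means: returns (final_centers, label_per_point)."""
    centers = [tuple(c) for c in centers]
    for _ in range(iterations):
        # Assign every pitch to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: math.dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    labels = [min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
              for p in points]
    return centers, labels

# INVENTED sample: (speed in mph, horizontal spin deflection in inches).
pitches = [(93.1, -6.2), (92.5, -5.8), (93.8, -6.5),   # harder, more tail
           (88.9, 0.8), (89.4, 1.2), (88.5, 0.5)]      # slower, little tail
start = [(95.0, -5.0), (87.0, 0.0)]                    # hypothetical starting guesses

final_centers, labels = kmeans(pitches, start)
print(final_centers)
print(labels)
```

The sketch mixes miles per hour and inches in a single distance, which is a simplification; a more careful treatment would scale the two dimensions before comparing them.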
PITCHf/x has facilitated a quantitative description of the typical speed and movement of the various pitch types around the league, an understanding of the variation between pitchers, and the definition of quantitative boundaries between pitch types.
The process of defining names and boundaries is not a completely objective one. It works hand in hand with knowledge of what pitchers call their pitches. It’s also somewhat arbitrary in the case of borderline pitches like the slurve. This mix of quantitative and subjective does not render the process useless for analysis. It’s certainly much more exact and more objective than the eyeball test of someone observing a pitcher on video.
Using PITCHf/x data, an analyst can tell you why he labeled a pitch a cutter or a slider based on a certain speed differential off the four-seam fastball and a certain vertical and horizontal spin deflection. He can also tell you which other pitchers in the league fall near each side of the cutter-slider boundary, and exactly how close to the boundary in each dimension. On this basis, the reader can decide for himself whether the placement of the boundary makes sense. It’s incredibly difficult or impossible for a visual observer to provide that kind of objective evidence to allow others to judge his decisions.
What is a cut fastball?
Thus we return to the question of the moment. Is cutter usage in the majors increasing? Let’s start by defining the cut fastball.
A list of pitchers who are popularly known to throw the cutter with some frequency and the PITCHf/x data for those pitchers can be used to identify the speed and movement characteristics of the corresponding pitch cluster for each of those pitchers. A general definition can be devised encompassing as many of these example pitches as possible while at the same time excluding as many other pitches as possible. A definition that is either too expansive or too restrictive is not useful; the goal is a bowl of porridge that is “just right.”
Knowledge of the physics of pitching guides the interpretation of the PITCHf/x spin data. It explains how pitch grips create the observed differences in speed and spin among pitch types. Consistent labeling of pitch types uses this physical and scientific understanding to avoid being thrown off track if one pitcher here or there uses a designation for his pitch that is inconsistent with the labels used by the vast majority of other pitchers.
A cutter is a pitch that’s thrown with a grip that is somewhat off-center, so that both sidespin and backspin are applied to the ball. Thus, it is deflected by spin, relative to the “straight” four-seam fastball, more toward the glove side of the pitcher. This is similar to the movement of the slider and opposite to the movement of the sinking/tailing two-seam fastball. The cutter still has enough backspin, however, that it does not experience as much drop as the slider or curveball.
Speed is also an important determining factor for identifying pitch types, and this is true for the cutter as well. The physics of pitching indicates that speed and spin are related. As a pitcher gets more off-center or around the side of the ball, he sacrifices the force of finger drive behind the ball and loses a few miles per hour of speed off the pitch. The more sidespin he applies to get the ball to cut in on the hands of an opposite-handed hitter, the more speed he must sacrifice. Eventually, if he sacrifices enough speed and gets around the side of the baseball enough, he’s throwing a slider rather than a cutter. In my experience, this line is best drawn between the cutter and slider at around a five-mph speed loss off the four-seam fastball and a few inches of positive vertical spin deflection.
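One way to read that boundary is as a simple rule applied relative to each pitcher's own four-seam fastball. The sketch below is only an illustration: the five-mph figure comes from the paragraph above, but the 3.0-inch stand-in for "a few inches," the requirement that both conditions hold, and the function and parameter names are all assumptions made for the example.

```python
SPEED_LOSS_LIMIT_MPH = 5.0          # "around a five-mph speed loss" (from the text)
MIN_VERTICAL_DEFLECTION_IN = 3.0    # ASSUMPTION: stand-in for "a few inches"

def cutter_or_slider(pitch_mph, vertical_deflection_in, four_seam_mph):
    """Label a glove-side breaking pitch as a cutter or a slider.

    ASSUMPTION: a pitch counts as a cutter only when it stays within the
    speed-loss limit AND keeps enough positive vertical spin deflection.
    """
    speed_loss = four_seam_mph - pitch_mph
    keeps_speed = speed_loss < SPEED_LOSS_LIMIT_MPH
    keeps_lift = vertical_deflection_in >= MIN_VERTICAL_DEFLECTION_IN
    return "cutter" if keeps_speed and keeps_lift else "slider"

# Hypothetical pitcher whose four-seam fastball averages 93 mph.
print(cutter_or_slider(90.0, 5.0, four_seam_mph=93.0))
print(cutter_or_slider(85.0, 1.0, four_seam_mph=93.0))
```

Because the thresholds are measured against the pitcher's own fastball, the same 88-mph pitch can land on different sides of the line for a hard thrower and a soft thrower.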
The line between four-seam fastball and cut fastball can be a little harder to draw. Some pitchers, e.g., Tim Lincecum and Ricky Romero, use an over-the-top arm angle or apply a little sidespin to get a small bit of cutting action to their hardest-thrown pitch. I hesitate to label such pitches as cutters unless the cutting action is more extreme, relative to arm angle, such as it is for Halladay, Rivera, and Joakim Soria.
What does the data say about cutters?
With this working definition of a cutter, let’s examine the cutter usage trends over recent years.
In his article, Dave Cameron quoted the BIS pitch classification data going back to 2005. These data, going back to 2002, are available on FanGraphs, and indeed they show quite an upswing in cutter usage.
BIS began tracking cutters in 2004 and has recorded a higher number of cutters thrown every season since. There are at least three possible interpretations of this information. One, BIS is basically accurate, and cutter usage has been steadily rising for the last six or seven years. Two, cutter usage has not been rising, but BIS has applied a changing definition of a cutter, either explicitly or implicitly, over this time period. Three, both effects may be occurring. Cutter usage may be rising, but not to the extent reflected in the BIS counts due to changing definitions. Which of these possibilities is correct?
BIS creates its pitch classifications by having its video scouts watch replays of the same broadcast television footage that the fans see. Identifying pitch types from television takes skill. An inexperienced analyst may even find it quite frustrating. However, with some knowledge of how to approach the process, a good idea of what’s in a pitcher’s repertoire, and some experience, an analyst can develop skill at the process.
Speed is the biggest clue for identifying pitches, and the television reports this information. The bend in the trajectory is another clue, but it’s only an obvious clue for the curveball. For other pitch types, and particularly for certain pitchers, the bend in the trajectory may look very similar for more than one pitch type, greatly complicating accurate identification. Spending a lot of time becoming familiar with a particular pitcher’s repertoire can help an observer identify subtle clues and usage patterns that aid in the difficult task of accurate pitch classification from video.
For cut fastballs, speed may or may not be a helpful indicator. Some pitchers’ cutters are thrown 3-4 mph slower than their main four-seam fastball, and that’s enough to make a reliable distinction from the TV speed reading. But if the speed differential is only 1-2 mph, it becomes hard to tell whether the pitcher just took a little off his four-seamer, got tired and lost a couple mph, or threw a cutter. Cut fastballs with more backspin than sidespin can be hard to distinguish from four-seam fastballs based on the bend in the trajectory. Similarly, cut fastballs with more sidespin than backspin can be hard to distinguish from sliders based on the bend in the trajectory. It’s not an impossible task, but it’s an extremely challenging one.
A BIS video scout tasked with this challenge would be wise to learn all he could about a pitcher’s repertoire to make these fine distinctions with aplomb. But how does one know what a pitcher throws? There is the past record of BIS pitch classifications for this pitcher, and there is the anecdotal information from the media. It is certainly plausible that if BIS video scouts rely on either of these sources of information, it would take some time for BIS records to catch up to recording most of the cutters thrown in the league, even if cutter usage rates were constant. If the pitch type were very easy to identify, this adaptation period should be short. If the pitch type were very difficult to identify from video, as cutters are, this adaptation period could be very long.
It bears mention that there is one television angle that is occasionally shown where the observer has a decent chance to differentiate fastball types visually. It’s the shot from directly behind and just above the pitcher, at high zoom and in slow motion. From this angle, one can determine the lateral deflection of the pitch, which is a great clue for identifying cutters, four-seam fastballs, and sinkers from each other. Unfortunately for classification purposes, this view is rarely available on the broadcast footage.
If cut fastballs and sinking fastballs were easy to identify separately from four-seam fastballs, BIS would surely have been doing that from the beginning. Instead, they are the most difficult pitches to identify, and BIS only started identifying cutters in 2004 and has never identified sinkers separately. (Or, if they have, that data has not been made available on FanGraphs.)
However, short and long are relative terms. Did BIS adapt to recording cutter usage rates accurately within the first season of trying or are they still adapting today?
A comparison of BIS and PITCHf/x data
PITCHf/x data is another source for information about cutter usage rates. PITCHf/x data is available for nearly all the pitches thrown in the majors from 2008-10 and for just under half of the pitches thrown in 2007.
For the sake of simplicity, let’s define a cutter as any pitch thrown 85-90 mph with between +1 and +7 inches of vertical spin deflection and between -2.5 and +2.5 inches of horizontal spin deflection. That produces the following cutter usage rates for the four seasons covered by PITCHf/x.
This crude definition isn’t reliable for counting the absolute number of cutters thrown or identifying exactly which pitchers do and don’t throw the cutter. A more accurate definition of a cutter would consider each pitcher individually and adjust for speed and movement relative to his main fastball. Nonetheless, for the purpose of detecting changes in cutter usage, this definition should be sufficient.
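The crude bucket described above amounts to a three-way range check, which can be sketched as follows. The thresholds are the ones stated in the text; the field names, the choice to treat the boundaries as inclusive, and the four sample pitches are assumptions made up for illustration, not real PITCHf/x records.

```python
from dataclasses import dataclass

@dataclass
class Pitch:
    # Hypothetical field names; real PITCHf/x files use their own labels.
    speed_mph: float        # pitch speed
    horizontal_in: float    # horizontal spin deflection, inches
    vertical_in: float      # vertical spin deflection, inches

def is_crude_cutter(p: Pitch) -> bool:
    """True if the pitch falls inside the crude cutter bucket from the text.

    ASSUMPTION: the range endpoints are treated as inclusive.
    """
    return (85.0 <= p.speed_mph <= 90.0
            and 1.0 <= p.vertical_in <= 7.0
            and -2.5 <= p.horizontal_in <= 2.5)

def cutter_usage_rate(pitches) -> float:
    """Share of pitches that land in the crude cutter bucket."""
    pitches = list(pitches)
    if not pitches:
        return 0.0
    return sum(is_crude_cutter(p) for p in pitches) / len(pitches)

# INVENTED sample pitches, for illustration only.
sample = [
    Pitch(88.0, 0.5, 4.0),    # inside the bucket
    Pitch(93.0, -5.0, 9.0),   # too fast, too much tail and rise
    Pitch(84.0, 2.0, 1.0),    # too slow
    Pitch(87.0, 3.0, 3.0),    # too much horizontal deflection
]
print(cutter_usage_rate(sample))
```

Running the same fixed filter over every season is what makes the bucket useful for spotting changes over time, even though it will mislabel individual pitchers.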
These data do show a small increase in cutter usage. Without several additional years of data, it’s hard to know with certainty whether that’s part of a long-term trend or simply random fluctuations either in cutter usage itself or in how pitches happen to fit within our crude bucket. Nonetheless, even if it represents a true increase in cutter usage, the rate of increase is only a third or a fourth of that reported by the BIS data.
Outside of the first two years when BIS was classifying cutters, the biggest jump in cutter usage in the BIS data occurred between 2008 and 2009. The availability of PITCHf/x data for both of those seasons enables a comparison of the BIS classifications for individual pitchers throwing the cutter to the PITCHf/x data for those same pitchers.
The following comparison uses the author’s own classifications of the pitches based on the PITCHf/x data. Major League Baseball Advanced Media (MLBAM) provides its classifications of pitch types along with the PITCHf/x data on Gameday. In a quest to improve classifications, MLBAM has made numerous changes to its classification algorithm over time. An apples-to-apples comparison, however, requires a consistent definition of a cutter across the entire time period of interest.
BIS records 156 pitchers as having thrown at least one cutter in 2008 and 141 pitchers in 2009, for a total of 214 pitchers who threw a cutter between those two years. The sample chosen for comparison included the 96 pitchers who threw at least 200 cutters in 2008-2009 according to BIS. This sample covered over 90 percent of the cutters thrown. PITCHf/x data was used to count the number of cutters thrown in each season for each pitcher. In many cases, these counts agreed closely with BIS totals. In some cases, they did not. Here are the counts of cutters for the whole sample.
The results from PITCHf/x-based classifications agree fairly closely, in aggregate, with the BIS totals for 2009. However, there was significant undercounting of cutters in the 2008 BIS data. There was an increase in cutter usage between 2008 and 2009, from 4.3 percent to 4.7 percent, but it was less than half the increase reported by BIS and more consistent with the earlier crude PITCHf/x bucket-based estimate.
A full record for all the pitchers counted is available as a Google spreadsheet here, but let’s illustrate the point with some specific examples.
The following graphs show pitch speed vs. the left/right deflection of the pitch trajectory due to spin as determined from the PITCHf/x data. That’s typically a good way to identify clusters for pitch classification. The graphs show data from the 2008 season as compared to data from the 2009 season for three pitchers: Brian Tallet, Chad Durbin, and Jamey Wright.
BIS reported that Tallet threw no cutters in 2008 and 333 cutters in 2009. PITCHf/x shows about 307 cutters in 2008 and about 484 cutters in 2009. Clearly, BIS was slow to adapt to the fact that Tallet threw a cutter, and they missed a lot of his cutters in 2008 and a few in 2009.
BIS reported that Durbin threw 12 cutters in 2008 and 306 cutters in 2009. PITCHf/x shows about 557 cutters in 2008 and about 417 cutters in 2009. Again, BIS was slow to adapt to the fact that Durbin threw a cutter, and they missed a lot of his cutters in 2008 and a few in 2009. (There was a PITCHf/x system calibration issue at Citizens Bank Park in April-May 2008 that caused the pitch clusters for Durbin to be more spread out in 2008, but it is still clear that he threw a cutter in both 2008 and 2009.)
BIS reported that Wright threw 13 cutters in 2008 and 189 cutters in 2009. PITCHf/x shows about 314 cutters in 2008 and about 210 cutters in 2009.
BIS missed cutters from a number of pitchers in 2008. If the PITCHf/x estimates are correct, BIS missed, in aggregate, about 17 percent of the cutters that were thrown during 2008. It’s possible they did worse than that since this analysis did not examine any pitchers whom BIS did not record with at least one cutter in either 2008 or 2009. For example, Kevin Millwood threw the cutter in both seasons, but BIS did not count any of them; thus, his cutters were not included in this analysis.
Moreover, the 17 percent figure already understates errors on individual pitchers since it is the total for the league, where undercounting and overcounting cancel out. BIS did much better in 2009, getting the aggregate total for the league to within about 1 percent of the PITCHf/x estimate from this method. By 2009, BIS video scouts probably were themselves using PITCHf/x data to establish their prior expectations of pitchers’ repertoires.
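To see how large the misses were for individual pitchers, the counts quoted above for Tallet, Durbin, and Wright can be turned into the share of PITCHf/x-identified cutters that BIS also recorded. This is a rough sketch: the PITCHf/x counts are the approximate figures given in the text, and the function and variable names are made up for the example.

```python
# Cutter counts quoted in the text: (BIS count, approximate PITCHf/x count).
counts = {
    "Tallet": {2008: (0, 307), 2009: (333, 484)},
    "Durbin": {2008: (12, 557), 2009: (306, 417)},
    "Wright": {2008: (13, 314), 2009: (189, 210)},
}

def capture_rate(bis_count, pitchfx_count):
    """Fraction of the PITCHf/x cutter count that BIS also recorded."""
    return bis_count / pitchfx_count

# Per-pitcher, per-season capture rates.
rates = {
    name: {season: capture_rate(*pair) for season, pair in seasons.items()}
    for name, seasons in counts.items()
}

# Combined rate for just these three pitchers in each season.
combined = {}
for season in (2008, 2009):
    bis_total = sum(seasons[season][0] for seasons in counts.values())
    pfx_total = sum(seasons[season][1] for seasons in counts.values())
    combined[season] = capture_rate(bis_total, pfx_total)

for name, by_season in rates.items():
    print(name, {s: round(r, 3) for s, r in by_season.items()})
print("combined", {s: round(r, 3) for s, r in combined.items()})
```

These three pitchers were chosen as clear examples of undercounting, so their combined rate is far worse than the league-wide figure and should not be read as typical.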
Trustworthiness of BIS classifications from earlier years
With PITCHf/x data available, there is less need for BIS pitch classifications today. However, there is no PITCHf/x data before 2007. To what extent is BIS data trustworthy for pitch classifications in seasons prior to PITCHf/x? There is no complete, quantitative answer to that question. However, a couple data points may prove instructive.
If BIS increasingly classified pitches as cutters over the seasons, as the 2008-2009 investigation indicates they did, what pitch type was BIS using for those cutters before they realized they were cutters? Let’s look at the full summary of BIS pitch type data for 2002-10.
Increasing cutter classification comes at the expense of the fastball bucket. Note also that some curveballs and a few changeups have migrated into the slider bucket over time. In addition, BIS has greatly reduced the number of unknown pitches (XX) since 2005. That bucket is excluded when calculating the percentages of other pitch types.
Finally, let’s look to a pitcher who has been well known for relying nearly exclusively on a cutter during the whole time period in question, one Mariano Rivera.
The anecdotal evidence indicates that Rivera was primarily throwing the cutter as far back as 2002.
OK, you're back at home plate at Yankee Stadium. You're watching Rivera lurch into his delivery. You know what's coming….
Rivera's cutter comes at you at 95 mph, as fast as a four-seam fastball. And once in a while, that pitch heading your way turns out to be a four-seam fastball. But it's probably going to explode into destructive cutter mode—boring in on the hands of left-handed hitters, scaring the equilibrium out of right-handed hitters.
In addition, the PITCHf/x data tells with a great deal of accuracy how many cutters Rivera threw in 2007-10. The following table shows the percentage of Rivera’s pitches that BIS labeled as cutters in each season along with the percentage of cutters identified by PITCHf/x in the seasons where it was available.
BIS claims that Rivera was throwing his famous cutter less than half the time from 2004-2006. This does not square at all with common wisdom or with press reports. Though BIS caught up to reality somewhat in 2007 and 2008, it still under-reported his cutter usage by 15 percent and 7 percent in those two years. Finally in 2009, with a two-year track record of PITCHf/x data available for consultation, BIS appears to have begun classifying Rivera’s cutters correctly.
Baseball lore and PITCHf/x data both agree that Rivera has been using his cutter as his primary pitch, perhaps nearly 90 percent of the time, going back nine seasons or more. BIS took six years to realize this fact and reflect it in their pitch classification data.
Pitch classification from video is a difficult task, particularly so for distinguishing types of fastballs, and this article is not intended as an indictment of the effort of BIS video scouts. Instead, it is an acknowledgement of the difficulty of the human enterprise of classification and an attempt to gain a sense of the reliability of the resultant data.
What of the original question? Is cutter usage increasing? A truly accurate and complete answer is elusive. However, based upon PITCHf/x evidence, cutter usage has been fairly flat over the last few years, or if it has been increasing, it has only been increasing slightly. There has certainly been no sea change or mass adoption of the cutter since 2007.
Unfortunately, the BIS pitch classification data going back to 2002 is no help in answering the question due to the effectively changing definition of the cutter from season to season. The best information from those earlier seasons is mostly anecdotal, and a quantitative measurement of the rate of cutter usage prior to 2007 appears to be beyond our grasp.