

May 1, 2009
Checking the Numbers: Whiffery
On May 28, 2007, Freddy Garcia took the hill for the Phillies, squaring off against the Diamondbacks in a standard, run-of-the-mill game that would ultimately have no bearing on the standings. Nor would it boast any outstanding feats you'd have cause to recall. Quite simply, the meeting served as the perfect example of a nondescript game that lives in the memories of a few avid baseball fans for one minuscule reason or another. Though Garcia proved to be a bust for the Phillies, he pitched very effectively in this particular outing, missing a flurry of bats. By missing bats, I am not referring to the shorthand for fanning a hitter, but rather the literal definition: he induced a lot of swings and misses. A quick glance at the box score shows that Garcia recorded 18 swinging strikes against the Snakes, an impressive tally, and one he had not reached in almost two years. Swinging strikes are rare in major league baseball, especially when compared to the other events capable of occurring on a pitch, and they usually signal some sort of overpowering of the hitter, whether that's with a deceptive off-speed delivery or an ample supply of late movement.
How rare is the swinging strike? Consider that last season Juan Cruz posted the highest swinging-strike rate among pitchers with 50 or more frames logged. His mark was 15.5 percent. The average swinging-strike rate was considerably lower, 8.6 percent, meaning that in big-league action, it would be perfectly normal for other outcomes to take place on nine of every ten pitches thrown. The highest swinging-strike rate achieved in this decade belongs to Brad Lidge, who recorded whiffs on 24.2 percent of his pitches during the 2004 season, a mark still short of a quarter of the time. Since 2000, there have been just five instances in which a pitcher exceeded 20 percent with this rate, all of which belong to either Lidge or Eric Gagne.
Though the leaderboard in this same span is topped primarily by relievers, adjusting the playing-time qualifier to 120 or more innings places Francisco Liriano's 15.9 percent in 2006 atop all others. Perhaps not at all surprisingly, names like Randy Johnson, Roger Clemens, and Johan Santana appear the most frequently among the starters. Earlier this year, Matthew Carruth of FanGraphs penned a very interesting article in which swinging-strike rates were broken down by pitcher, pitch, and batter handedness using Pitchf/x data. The research concluded that Ryan Madson's changeup, when delivered to same-handed hitters, caused the highest percentage of whiffs, at 36 percent. The Pitchf/x data set is still only a toddler, making similar inquiries into past years impossible, but Retrosheet has kept track of pitch breakdowns for quite a while; their data makes possible at least some semblance of this kind of research for previous seasons. With this idea in mind, I first calculated the overall pitch breakdowns for everyone from 2003-08, and further partitioned the data based on batter handedness. Here are the swinging-strike percentages for the four different types of matchups over the last six seasons:
Pitcher  Batter  Swinging Strike %
LHP      LHH     9.3
LHP      RHH     8.7
RHP      LHH     7.9
RHP      RHH     9.9
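For readers who want to tally matchup rates like these themselves, here is a minimal sketch, not the actual query behind the table. It assumes pitch sequences stored as strings of single-letter codes in the Retrosheet style and counts only a simplified subset of them (B ball, C called strike, F foul, S swinging strike, X ball in play); real sequences carry additional codes that this sketch simply skips. The function name and the sample plate appearances are hypothetical.

```python
from collections import defaultdict

# Simplified subset of pitch codes (an assumption for this sketch):
# B = ball, C = called strike, F = foul, S = swinging strike, X = ball in play.
SWINGING_STRIKE = "S"
COUNTED_PITCHES = set("BCFSX")


def swinging_strike_rates(plate_appearances):
    """Return swinging-strike % keyed by (pitcher_hand, batter_hand).

    plate_appearances: iterable of (pitcher_hand, batter_hand, pitch_sequence).
    """
    whiffs = defaultdict(int)
    pitches = defaultdict(int)
    for pitcher_hand, batter_hand, sequence in plate_appearances:
        key = (pitcher_hand, batter_hand)
        for code in sequence:
            if code not in COUNTED_PITCHES:
                continue  # ignore anything outside the simplified subset
            pitches[key] += 1
            if code == SWINGING_STRIKE:
                whiffs[key] += 1
    return {key: 100.0 * whiffs[key] / pitches[key] for key in pitches}


# Hypothetical plate appearances, invented purely for illustration.
sample = [
    ("R", "R", "CSBFX"),
    ("R", "R", "BBSCS"),
    ("R", "L", "BCFBX"),
    ("L", "L", "SFX"),
]
rates = swinging_strike_rates(sample)
```

The denominator here is every counted pitch, matching the article's definition of swinging strikes relative to all pitches thrown.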
As expected, pitchers perform better against same-handed hitters, but righties were almost an entire percentage point less likely than lefties to get a swing-and-miss against opposite-handed hitters. Aside from averages, who are the leaders this decade against hitters from each side of the plate? The data below consists of the top swinging-strike rates from pitchers with at least 150 batters faced from a specific side of the plate in this decade. Take note of the much higher percentages from northpaws to same-handed hitters, which makes sense given that this specific matchup boasted the highest average rate. Also of interest is how the top southpaw against right-handed hitters posted a lower rate than the six pitchers shown on the righties vs. left-handed hitters chart, suggesting that the latter matchup featured much more extreme results, with certain righties faring very well, and others missing bats at a Jon Garland-like clip.
LHP-LHH
Pitcher            Year  Swinging Strike %
Randy Johnson      2005  15.4
Andy Pettitte      2005  14.8
Scott Olsen        2008  14.7
CC Sabathia        2007  14.6
Randy Johnson      2004  14.3
CC Sabathia        2008  14.3

LHP-RHH
Pitcher            Year  Swinging Strike %
Johan Santana      2002  16.5
Randy Johnson      2002  15.9
Billy Wagner       2005  15.8
Francisco Liriano  2006  15.8
Billy Wagner       2002  15.8
Randy Johnson      2001  15.4

RHP-LHH
Pitcher            Year  Swinging Strike %
Brad Lidge         2004  21.6
Eric Gagne         2003  20.4
Eric Gagne         2002  19.3
F. Rodriguez       2000  19.2
John Smoltz        2002  18.9
Eric Gagne         2004  18.6

RHP-RHH
Pitcher            Year  Swinging Strike %
Brad Lidge         2004  26.8
Eric Gagne         2003  23.8
Eric Gagne         2004  23.2
Ugueth Urbina      2001  20.5
Octavio Dotel      2002  20.4
Akinori Otsuka     2004  20.3

Although the data mining can be very enjoyable (yeah, I'm a nerd), what should be of great interest is whether or not recording swinging strikes is an actual ability. In other words, are these rates consistent from year to year for each pitcher?
When testing the relationship between two variables on a multi-year level, an AR(1) Intra-Class Correlation becomes a very valuable statistical test, working the same way as a year-to-year correlation, but incorporating more than just two years of data. For the sake of this study I queried all data from 2003-08, since it is important to avoid very large time frames when performing such tests; comparing a pitcher in his age-27 season to his performance in his age-37 season will naturally produce different numbers based on changes in style and approach, skewing the results. Prior to crunching the numbers I hypothesized that the correlations would be closer to zero, proposing that swinging-strike rates were more random than a sustainable skill, but what does the actual data say?
Swinging Strike %  ICC
Overall            0.60
LHP-LHH            0.55
LHP-RHH            0.61
RHP-LHH            0.57
RHP-RHH            0.65
These coefficients work similarly to standard bivariate correlations in the sense that marks closer to 1.0 indicate a stronger relationship. All five of the aforementioned relationships ended up being particularly stable over the last six seasons, defying my hypothesis and suggesting that cajoling barren swings is actually more skill-based than random. The overall swinging-strike rate also shared a moderately strong 0.44 correlation with total percentage of strikes, indicating that just under 20 percent of the variation in total strike percentage, including balls in play, relative to all pitches thrown can be chalked up to swinging-strike rates. Even though it would make sense for these rates to go hand in hand with foul-ball rates given the lack of solid contact made, their 0.05 correlation points to randomness in the data. Strikeout and walk rates were also particularly independent of swinging-strike rates, with low correlation coefficients of their own.
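To make the arithmetic behind these coefficients concrete, here is a minimal sketch. It uses a plain year-to-year Pearson correlation as a simpler stand-in, not the AR(1) Intra-Class Correlation used in the study, and it squares the reported 0.44 correlation to get the share of variation. The helper name `pearson_r` and the two lists of rates are hypothetical and invented for illustration.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Squaring a correlation gives the share of variation the two measures have
# in common: 0.44 squared is 0.1936, a bit under 20 percent.
shared_variance = 0.44 ** 2

# Hypothetical swinging-strike rates for five pitchers in consecutive
# seasons (invented values, not taken from the study).
year_one = [7.1, 8.4, 9.0, 10.2, 12.5]
year_two = [7.6, 8.0, 9.5, 9.9, 11.8]
r = pearson_r(year_one, year_two)
```

A value of `r` near 1.0 on real data would point the same direction as the ICCs above, though a two-season correlation uses less information than the multi-year test.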
Since strikeout and walk rates are decidedly stable themselves, and are direct components of the more accurate metrics exhibiting success through controllable skills, it stands to reason that swinging strikes really have no advantage over fouls or called strikes. At this juncture, it dawned on me that certain pitchers posted lower swinging-strike rates relative to all pitches thrown, but they might actually be much more apt to record whiffs relative only to the pool of swinging strikes, called strikes, and foul balls. If balls and balls in play are removed from the number of pitches thrown, do these correlations change at all?
SS%/K    ICC
Overall  0.58
LHP-LHH  0.53
LHP-RHH  0.58
RHP-LHH  0.54
RHP-RHH  0.63
SS%/K = Swinging-Strike Rate relative to non-BIP strikes
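The difference between the two denominators can be sketched as follows. This is an illustration of the definitions given in the text, with hypothetical function names and an invented pitch breakdown, not the code used to build the table.

```python
def swinging_strike_rate(swinging, total_pitches):
    """Swinging strikes as a percentage of all pitches thrown."""
    if total_pitches == 0:
        return 0.0
    return 100.0 * swinging / total_pitches


def swinging_strike_rate_per_strike(swinging, called, foul):
    """SS%/K: swinging strikes as a percentage of non-BIP strikes.

    Balls and balls in play are left out of the denominator, which holds
    only swinging strikes, called strikes, and foul balls.
    """
    non_bip_strikes = swinging + called + foul
    if non_bip_strikes == 0:
        return 0.0
    return 100.0 * swinging / non_bip_strikes


# Hypothetical season line (invented counts, for illustration only).
total_pitches = 3000
swinging, called, foul = 270, 510, 520

overall = swinging_strike_rate(swinging, total_pitches)
per_strike = swinging_strike_rate_per_strike(swinging, called, foul)
```

With these made-up counts the same 270 whiffs are 9 percent of all pitches but a little over a fifth of the non-BIP strikes, which is why the two rates can rank pitchers differently.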
A minimal drop-off (at best) is observed when comparing the overall swinging-strike rates to the rates relative to non-BIP strikes. Interestingly, these two rates share virtually no relationship, with an r of 0.032. In fact, the only significant relationship I found involving the rate per strike was with foul balls per strike at 0.38, an inverse relationship: as the distribution of non-BIP strikes tilted in favor of swings and misses, pitchers experienced a decrease in foul balls. Thinking of pitches along the lines of non-BIP strikes paves the way for some other intriguing ideas, such as quantifying the term "effectively wild." These types of pitchers are notorious for being all over the place with location, throwing higher percentages of balls, but possessing "stuff" capable of getting the job done. Matt Clement instantly springs to mind as an example of such a pitcher, one whose control could not be predicted, and yet for a while the wildness worked to his advantage. Though this is merely a cursory attempt to quantify the aforementioned scouting term, I got to thinking that effectively wild pitchers would throw strikes, including balls in play, less than half of the time, but would post high marks in the swinging-strike area.
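One way to express that screen is sketched below, under stated assumptions. It keeps pitcher-seasons that clear a playing-time bar (the 120-inning qualifier used earlier for starters), have a strike percentage under 50, and then sorts by SS%/K. The `PitcherSeason` record, the function name, and the four sample seasons are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PitcherSeason:
    name: str
    innings: float
    strike_pct: float     # strikes, including balls in play, per pitch thrown
    ss_per_strike: float  # SS%/K: swinging strikes per non-BIP strike


def effectively_wild(seasons, min_innings=120.0, max_strike_pct=50.0):
    """Keep qualified low-strike seasons, highest SS%/K first."""
    qualified = [
        s for s in seasons
        if s.innings >= min_innings and s.strike_pct < max_strike_pct
    ]
    return sorted(qualified, key=lambda s: s.ss_per_strike, reverse=True)


# Hypothetical pitcher-seasons, invented for illustration only.
seasons = [
    PitcherSeason("Pitcher A", 180.0, 48.2, 21.0),
    PitcherSeason("Pitcher B", 200.0, 63.5, 24.0),  # throws too many strikes
    PitcherSeason("Pitcher C", 95.0, 47.0, 25.0),   # too few innings
    PitcherSeason("Pitcher D", 150.0, 49.5, 22.5),
]
names = [s.name for s in effectively_wild(seasons)]
```

Both thresholds are parameters, so the same screen can be rerun with a looser or tighter definition of "wild."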
Querying for pitchers with at least 120 innings in a season and a strike percentage below 50 percent, sorted by swinging-strike rate relative to non-BIP strikes, produced the following list:
Pitcher          Year  Strike%  SS%/K
Victor Zambrano  2004  47.8     23.1
Daniel Cabrera   2006  49.1     22.8
Jason Jennings   2003  48.8     21.7
Victor Zambrano  2003  47.2     21.5
Zach Day         2003  48.4     21.2
Jason Jennings   2004  48.4     21.1
Victor Zambrano  2005  49.6     20.3
Jason Jennings   2005  47.2     19.5
Damian Moss      2003  45.6     19.4
Barry Zito       2006  49.2     18.8
SS%/K = Swinging-Strike Rate relative to non-BIP strikes
Swinging-strike rates are a very stable metric, year to year and pitcher to pitcher, but the lack of a noteworthy relationship to any other performance-based metric that makes some sort of logical sense (swinging-strike rates have no relationship with shutouts, no matter what any correlation coefficient argues) makes them much less meaningful. What would be of interest is the underlying root of these fruitless swings, searching for patterns in pitch repertoire and sequencing that lead to the whiffs. This sort of research will not be meaningful until the Pitchf/x data set is expanded, but it may help us to determine solid ways to attack a hitter as well as potential out-pitches not currently being used in the correct fashion. For example, it would be very valuable to know if Ryan Madson's changeup to righties displays stability in the swing-and-miss department over a predetermined time span, or if the pitch loses its effectiveness.
Swinging and missing at the major league level is very rare, especially on fastballs. That's reflective of the level of talent; a hitter who whiffs too often had better exhibit tremendous ability in some other facet of performance, or else he will not last very long in the big leagues.
Due to this kind of rarity, pitchers who miss bats with the greatest of ease are incredibly appealing assets, but the lack of a relationship between this feat and other performance-based metrics suggests that there is no true advantage over those who struggle to elude the whooping sticks.
Eric Seidman is an author of Baseball Prospectus. 8 comments have been left for this article.

Interesting stuff, thanks...
What pitchers have some of the lowest swing-and-miss % of this decade?