Note: This is based on a presentation at the SABR Analytics Conference on Saturday, March 10. Audio of the presentation is here and presentation slides are here.
Part 1 of this article described the history of attempts to identify clutch hitting, dating back over 40 years. While the general sabermetric consensus is that clutch hitting does not exist as an identifiable and replicable skill, the topic remains controversial.
Using play-by-play data, we identified the difference between situation-dependent (i.e., weighted by win probability) and non-situation-dependent (neutral) run creation for 8,963 batters with 500 or more plate appearances since World War II. We calculated a z-score for each player to identify whether his performance was clutch (positive z-score) or unclutch (negative).
To consider the existence (or not) of clutch hitting as a skill, we looked at the distribution of z-scores for individual hitters. We first identified the players who were, by our methodology, most frequently among the best clutch hitters. Bill James once noted that a good statistic is one that gives us new insight while confirming much of what we already knew. His first mass market Baseball Abstract, published after the strike-shortened 1981 season, introduced his Runs Created formula to a wide audience. Among the top hitters in baseball, per Runs Created, were familiar names such as Mike Schmidt, Andre Dawson, George Foster, Keith Hernandez, and Eddie Murray. Some of the leaders, such as Dwight Evans, Rickey Henderson, and Bobby Grich, were surprises to many. We learned that players who draw walks and hit doubles can be as valuable as those who hit homers and get a lot of RBIs. Runs Created provided new information while confirming much of the then-extant previous knowledge.
The list of the best hitters by our clutch metric is, well, curious. Here are the players who finished in the theoretical top five percent (i.e., z-score 1.645 or greater) for clutch hitting most frequently:
- Six times: Bert Campaneris
- Four times: Hank Aaron, Sandy Alomar Jr., Glenn Beckert, Jose Cardenal, Vince Coleman, Nellie Fox, Steve Garvey, Ken Griffey Sr., Dick Groat, Tony Gwynn, Enos Slaughter, Billy Williams, Tony Womack
Taking nothing away from Bert Campaneris, he seems an odd choice for a consistent clutch leader. The list of players to finish in the top five percent four times includes some expected results along with several head-scratchers.
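As a rough frame of reference for these counts, the sketch below computes how often a batter would land in the top five percent several times if each season were an independent five percent draw. The 15-season career length is a hypothetical figure chosen for illustration, and independence across seasons is an assumption, not something the analysis above establishes.

```python
from math import comb

def prob_at_least(k: int, n: int, p: float = 0.05) -> float:
    """P(X >= k) for X ~ Binomial(n, p): k or more top-5% seasons in n seasons."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))

# Hypothetical career of 15 qualifying seasons (assumed, not taken from the article).
SEASONS = 15
for k in (1, 4, 6):
    print(f"P(at least {k} top-5% seasons in {SEASONS}): {prob_at_least(k, SEASONS):.6f}")
```

Because the sample covers a large pool of careers, even small per-player probabilities can produce a handful of repeat names by chance alone, which is one way to read the odd mix of players on the list.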
The list of players most frequently in the bottom five percent is similarly inconsistent:
- Six times: Manny Ramirez
- Five times: Torii Hunter, Chet Lemon, Frank Thomas
- Four times: Jay Bell, Wade Boggs, Barry Bonds, Robinson Cano, Dwight Evans, Todd Helton, Victor Martinez, Magglio Ordonez, Frank Robinson, Alex Rodriguez
- Three times: Jeff Bagwell, Adrian Beltre, Miguel Cabrera, Jermaine Dye, Damion Easley, Edwin Encarnacion, Ken Griffey Jr., Richard Hidalgo, Edgar Martinez, David Ortiz, Albert Pujols, Ivan Rodriguez, Benito Santiago, Sammy Sosa, Bernie Williams
Several of those 29 players were stars. Six are in the Hall of Fame, seven are on a Hall of Fame trajectory, and four more would likely be in were it not for PEDs. Are they all unclutch?
These lists would seem to violate James’ observation that statistics should confirm much of what we already know. (Alternatively, they may be an example of Twyman’s Law: If a statistic looks interesting or unusual, it is probably wrong.) Do Bert Campaneris and Manny Ramirez represent the opposite poles of clutch hitting skill? Or is clutch hitting random, varying widely from year to year, more indicative of chance than of skill?
The second feature we considered is consistency. A valid skill should be statistically replicable, occurring with some regularity. The five percent threshold above is high, but it is not unusual for the top performers in the game. Rod Carew finished in the top five percent of batting average 11 times. Ted Williams was in the top five percent of walk rate nine times. Harmon Killebrew was in the top five percent of home run rate eight times. Randy Johnson was in the top five percent of strikeout rate 12 times. Excellence is rare, but it does exist, and if clutch hitting is a skill, we should not be surprised to see batters consistently appear at the extremes of the distribution.
What we should be surprised to see, though, is batters who appear at both extremes. We don’t expect Rod Carew to bat .225, or Ted Williams to walk in four percent of plate appearances, or Harmon Killebrew to hit one homer every 70 at-bats, or Randy Johnson to strike out four batters per nine innings. Even for one season.
However, we found it not uncommon for batters to hit well below average in clutch situations in some seasons and well above average in those situations in other seasons. From 1946 to 2017, there were 120 players who appeared in the top five percent and the bottom five percent at least once in their career. And a dozen were in both the top five percent and the bottom five percent two times or more:
- Garret Anderson: Top 5% 1997, 2005; bottom 5% 1999, 2000
- Dwight Evans: Top 5% 1977, 1981; bottom 5% 1978-80, 1989
- Tony Gwynn: Top 5% 1984, 1988, 1991, 1997; bottom 5% 1987, 1990
- Joe Morgan: Top 5% 1970, 1973, 1982; bottom 5% 1974, 1980
- David Ortiz: Top 5% 2005, 2006; bottom 5% 2007, 2011, 2013
- Frank Robinson: Top 5% 1958, 1969; bottom 5% 1959, 1965-1967
- Pete Rose: Top 5% 1980, 1982, 1983; bottom 5% 1963, 1979
- Gary Sheffield: Top 5% 1990, 2002; bottom 5% 1996, 2000
- Miguel Tejada: Top 5% 2002, 2009; bottom 5% 2005, 2007
- Bobby Thomson: Top 5% 1951, 1952; bottom 5% 1949, 1953
- Larry Walker: Top 5% 1993, 2002; bottom 5% 1995, 1998
- Carl Yastrzemski: Top 5% 1966, 1972, 1979; bottom 5% 1973, 1974
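To gauge how surprising a list like this is, one can ask how often a batter would land in both tails at least twice if every season were an independent draw: five percent top, five percent bottom, 90 percent in between. The sketch below computes that probability exactly by summing trinomial terms. The function name and the season counts in the loop are illustrative assumptions, and the independence assumption is a simplification.

```python
from math import factorial

def prob_both_tails(n: int, k: int = 2, p: float = 0.05) -> float:
    """P(at least k top-tail AND at least k bottom-tail seasons) in n seasons.

    Each season independently falls in the top tail (p), the bottom tail (p),
    or the middle (1 - 2p).
    """
    total = 0.0
    for top in range(k, n + 1):
        for bottom in range(k, n - top + 1):
            middle = n - top - bottom
            # Number of ways to arrange the three kinds of seasons.
            ways = factorial(n) // (factorial(top) * factorial(bottom) * factorial(middle))
            total += ways * p**top * p**bottom * (1 - 2 * p) ** middle
    return total

# Hypothetical career lengths, in qualifying seasons (assumed values).
for seasons in (10, 15, 20):
    print(f"{seasons} seasons: {prob_both_tails(seasons):.5f}")
```

Multiplying the result by the number of long careers in the sample gives a chance-only expectation to set against the dozen names above.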
To provide an example of what a top five percent and bottom five percent year looks like from a clutch perspective, we’ll examine the five seasons noted above for David Ortiz, using his plate appearances with a leverage index above 1.5. (For a frame of reference, these high-leverage plate appearances accounted for 18 percent of plate appearances in 2017 and, as noted earlier, batters performed almost identically in high-leverage and all other plate appearances.)
We’ve also included FanGraphs’ “clutch” metric, which, similarly to our z-scores, compares a player’s Win Probability Added (WPA) in high-leverage situations to his overall performance. “Clutch” values above 1.0 are considered clutch, and below -1.0 considered unclutch.
- 2005: z-score 2.66, 3.31 Clutch, 1.312 OPS in high-leverage situations, .928 in other plate appearances
- 2006: z-score 1.81, 1.48 Clutch, 1.100 OPS in high-leverage situations, 1.037 in other plate appearances
- 2007: z-score -2.25, -1.68 Clutch, 1.019 OPS in high-leverage situations, 1.079 in other plate appearances
- 2011: z-score -2.70, -1.61 Clutch, .846 OPS in high-leverage situations, .981 in other plate appearances
- 2013: z-score -2.07, -1.08 Clutch, .867 OPS in high-leverage situations, .983 in other plate appearances
It is tempting to think that clutch hitting is a skill that players acquire with age, but for every one of the above batters other than Ortiz and Rose, the good and bad years were interspersed throughout his career. It is incongruous to suggest that a skill that certain players possess would vary so wildly over a player’s career. Often, players acquire power and plate discipline while losing speed and fielding ability as they age. But players do not swing from one extreme to another over the course of their careers for these replicable skills. Gary Sheffield was one of the best clutch hitters in baseball when he was 21 and again when he was 33. He was one of the worst when he was 27 and 31. That’s just not how replicable baseball skills work.
Conclusions
This analysis, which builds on Pete Palmer’s and Dick Cramer’s research from the 1970s with a much more robust data set, illustrates continued difficulty in identifying clutch hitting as a replicable skill.
Again, this is not to say that clutch hitting does not exist. Over the course of our research, we noted some remarkable clutch hitting seasons. Pete Reiser in 1946 batted .338/.421/.541 in high-leverage plate appearances and .248/.333/.383 in his other plate appearances. Charlie Maxwell in 1960 hit .281/.348/.587 in high leverage and .222/.315/.391 otherwise. Both seasons equated to a z-score over 4.0, a 0.003 percent probability.
Three players had z-scores on the other side of the spectrum as well: Bill Mueller in 2003 (-3.03 Clutch, .223/.297/.330 in high leverage, .349/.421/.586 otherwise), Alex Rodriguez in 2008 (-3.13 Clutch, .264/.372/.434 in high leverage, .312/.398/.609 otherwise), and Aaron Judge in 2017, whom FanGraphs’ Travis Sawchik proclaimed “the least clutch player on record” (-3.64 Clutch, .219/.361/.500 in high leverage, .298/.435/.655 otherwise). Every year, batters have notable success and failure in clutch situations.
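The probabilities attached to these z-scores can be checked with a few lines of code. The sketch below assumes the z-scores follow a standard normal distribution and uses a one-tailed probability; the function name is ours.

```python
from math import erfc, sqrt

def upper_tail(z: float) -> float:
    """P(Z >= z) for a standard normal variable Z."""
    return 0.5 * erfc(z / sqrt(2))

# The five percent cutoff used for the leader lists, and the z = 4.0 seasons.
print(f"P(Z >= 1.645) = {upper_tail(1.645):.4f}")
print(f"P(Z >= 4.0)   = {upper_tail(4.0):.6%}")
```

By symmetry, the same function gives the probability of a season at or below a negative z-score: pass in its absolute value.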
However, we found no evidence that clutch hitting is a replicable skill. For that to be the case, we would need to see players repeat at the top (or bottom) of the charts, year after year. These are the year-by-year z-scores of the aforementioned Ortiz, heralded as the greatest clutch hitter of this generation, possibly of all time:
| Season | Z Score |
| --- | --- |
| 2003 | +0.04 |
| 2004 | -1.34 |
| 2005 | +2.66 |
| 2006 | +1.81 |
| 2007 | -2.25 |
| 2009 | +0.14 |
| 2010 | -0.67 |
| 2011 | -2.70 |
| 2013 | -2.07 |
| 2014 | +0.94 |
| 2015 | -1.38 |
| 2016 | -1.40 |
There is no consistent pattern of success (or failure).
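One simple way to put a number on "no consistent pattern" is to summarize the table and correlate each season with the next. The sketch below is not part of the original analysis. It uses only the z-scores listed above, pairs consecutive calendar years, and skips the gaps (2008 and 2012 do not appear in the table). With nine pairs, any correlation is a rough indication at best.

```python
from statistics import mean

# Year-by-year z-scores for David Ortiz, copied from the table above.
ortiz = {2003: 0.04, 2004: -1.34, 2005: 2.66, 2006: 1.81, 2007: -2.25, 2009: 0.14,
         2010: -0.67, 2011: -2.70, 2013: -2.07, 2014: 0.94, 2015: -1.38, 2016: -1.40}

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

# Pair each season with the following calendar year where both are listed.
pairs = [(ortiz[y], ortiz[y + 1]) for y in sorted(ortiz) if y + 1 in ortiz]
this_year, next_year = zip(*pairs)

positive = sum(1 for z in ortiz.values() if z > 0)
negative = sum(1 for z in ortiz.values() if z < 0)
avg_z = mean(ortiz.values())
r = pearson(this_year, next_year)

print(f"seasons above zero: {positive}, below zero: {negative}")
print(f"mean z-score: {avg_z:.2f}")
print(f"year-to-next-year correlation over {len(pairs)} pairs: {r:.2f}")
```

A repeatable skill would tend to show a clearly positive year-to-year correlation and a mean well away from zero in the direction of the player's reputation.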
As we have illustrated, clutch performance for players often swings wildly from one season to the next. This, we feel, is a greater indictment of the concept of clutch hitting than is the apparent clutch-ness of Bert Campaneris, Vince Coleman, and Tony Womack, and the apparent unclutch-ness of Manny Ramirez, Frank Thomas, and Barry Bonds. A viable statistical metric must be replicable, with results generally consistent over time. Our measure of clutch hitting—the excess performance of a hitter in high win-expectancy plate appearances compared to others—fails to meet this test. We therefore echo Cramer’s conclusion from 41 years ago that while clutch hitting may exist as a feature, it does not exist as a repeatable skill.
Pete Palmer is the co-author with John Thorn of The Hidden Game of Baseball and co-editor with Gary Gillette of the Barnes and Noble ESPN Baseball Encyclopedia (five editions). Pete worked as a consultant to Sports Information Center, the official statisticians for the American League from 1976 to 1987. Pete introduced on-base average as an official statistic for the American League in 1979 and invented on-base plus slugging (OPS). He won the SABR Bob Davids Award in 1989, was selected by SABR in 2010 as a charter recipient of the Henry Chadwick Award, and is the 2018 recipient of the SABR Analytics Conference Lifetime Achievement Award.
If you want to measure whether a hitter performs better when the pressure is on (a good definition of "clutch" hitting), you have to filter the statistics against a reasonable definition of clutch hitting _from the hitter's point of view_. 2nd & 3rd, 2 outs in the bottom of the 1st of a 0-0 game may actually have a reasonably high potential for WPA, but while the hitter might feel some pressure to deliver in that spot, the "clutch" factor isn't as strong. Likewise, simply batting at all in a close game in the late innings has a higher potential for WPA than some RBI situations earlier in the game, but most people don't include a tally of fly outs made with one out in the 9th while the team leads by 1 as negative points against someone's clutch ability. Most of the time, when the team is already leading, those at-bats are not really considered key ones. It's nice to get some insurance, but that's not where people look for clutch hitting. Adding these extra values will tend to wash out any effects of hitting in the real clutch situations, making the results far more random. The team's situation within the season also matters - later in the season, closer to but not assured of contention, playing a rival, etc. None of this is to say that a far more detailed analysis would turn anything real up - obviously many efforts have been tried with stuff like late & close RBIs or detailed analyses of Ortiz's career, and human brains love to find patterns in noise.
The ratio of pure statistical Runs Created and WPA seems like an interesting measure, not of clutch hitting as it is usually perceived, but of situational hitting - doing the right thing to get your team runs. It doesn't capture that entirely, though - it gets at stuff like swinging for the fences when a homer would do it or shortening up and just trying to get on base when you aren't enough to tie the game alone, but misses other potentially important factors like expanding the zone in a key RBI spot when the hitter behind you is significantly worse, or stretching or stealing your singles into doubles in front of low-ISO hitters, but not taking the bat out of the hands of Mike Trout. All the stuff that relies on awareness of not just the game state but the upcoming hitters.
However, your observation in the final paragraph got me thinking: maybe we're looking at the less interesting question. I'm sure people think they have a working definition of clutch, but I don't know of any person who made a determination of clutch on the basis of rational consideration/analysis. So we do analyses like this based on a definition of clutch that is reasonable enough (and which clutch believers may even sign on to), but which may not actually map to what clutch means in practice. Accordingly, we can establish that this definition of clutch-as-a-skill likely does not exist without doing much to change the conversation.
It would be interesting to start with a list of subjectively determined "clutch" players and then try to reverse engineer how people make the clutch determination in practice. Perhaps it's mostly a function of batting average w/ RISP (or, more likely, H/PA) or RBI production instead of more robust measures of production. Perhaps the clutch label is determined almost entirely in the first few years of a player's career. Perhaps it's based on a few high-profile events, or established by 1 great clutch season. Likely all of these play into it.
Broader point, I suspect there's some element of missing the point when we choose to analyze an intuitive judgment like "that player is clutch" based on formalized holistic performance measures like OPS or wOBA and "high leverage" situations. Perhaps the question "What do people mean when they call a player clutch" would be more insightful than proving and re-proving that a formalized conceptualization of clutch is not a skill that players might possess.
1) Implicit in most people's definitions of "clutch" hitting is an added requirement that the player perform better in certain higher-leverage situations ~when I'm watching.~ It's hard to imagine any statistical evidence rebutting Yankees and Red Sox fans' sure knowledge of Derek Jeter's or David Ortiz's clutchness in light of their salient October moments, despite the fact that both experienced epic bouts of post-season ineptitude during their long careers. I think there is a subjective (to the viewer) element inherent in most people's conceptions of clutch hitting that does not lend itself to proof or disproof.
2) I don't think clutch is a thing. It's an adjective we apply to particular outcomes in particular situations that does not exist in the brains (or hearts) of flesh and blood humans. If God (or an evil demon) were to peer into the brains (or hearts) of both clutch and un-clutch baseball players at the moment of truth, I don't think they would look any different, because there is no a priori fact about someone's clutchness that exists before the outcome of the at-bat. This is consistent with the view that clutch = skill + luck.