In 1977, Dick Cramer wrote the seminal “Do Clutch Hitters Exist?” in SABR’s Baseball Research Journal. He used earlier work by Pete Palmer to identify batters who performed well in clutch situations. (Just to show how times have changed, Cramer noted that one of the calculations he used “is tedious to compute with a slide rule or ordinary calculator but is almost as accessible as a batting average with a programmable calculator such as the Hewlett-Packard HP-65.”)
He found that a) clutch hitting is mostly (“about 80 percent”) a byproduct of the player’s overall offensive ability, and b) there was no tendency for hitters who were clutch (or unclutch) in 1969, the first year of his two-year study, to remain so in 1970. He concluded that clutch hitting was more a matter of chance than of skill.
Subsequently, the topic of clutch hitting has been one of the more provocative areas of study within the sabermetric community. Various analysts have arrived at conclusions similar to Cramer’s. This has often been framed as a denial that clutch hitting exists. That’s an oversimplification. It is undeniable that, for example, Allen Craig hit .454/.500/.638 with runners in scoring position for the Cardinals in 2013.
The sabermetric contentions are that:
- The batters who do best in clutch situations are the batters who do best overall—that is, the best hitters are the best in all situations. Since 1988, there have been five batters with a career OPS above 1.000 in high-leverage plate appearances (defined as leverage index greater than 1.5, minimum 500 high-leverage plate appearances): Barry Bonds, Joey Votto, Mark McGwire, Manny Ramirez, and Albert Pujols. They ranked first, sixth, third, second, and 12th in overall OPS during the same time frame. Good clutch hitters are hitters who are good, period.
- Clutch hitting is not a repeatable skill. In 2014, the year after Craig’s big season with runners in scoring position, he hit .216/.306/.319 in the same situations. Clutch hero David Ortiz’s batting averages in at-bats with a leverage index over 2.0 over his final six seasons were, chronologically, .229, .316, .196, .397, .208, and .273.
The thinking about clutch hitting shifted somewhat in 2004, when Bill James wrote “Underestimating the Fog,” also in the Baseball Research Journal. James suggested that prior analyses, including Cramer’s, are limited by the data at our disposal. He posited that clutch hitters may, in fact, exist, though we may not know how to identify them. Many subsequent studies have attempted to reach a conclusion, and while most fall on Cramer’s side, the question remains open.
The argument that we lack sufficient data to reach a conclusive judgment on clutch hitting can be, and has been, applied to almost any area of scientific research. It would be hubris to suggest that at any point we’ve collected all available information. We acknowledge both the limitations of current data and the current usefulness of that data. Our findings here may well be superseded by information gleaned from Statcast, more powerful algorithms, or statistical methods that don’t yet exist. But waiting for the cognitive quantum leap suggested 14 years ago by James makes little sense when we have tools today that can aid our understanding.
In order to calculate clutch performances, we first had to identify what “clutch” entails. We settled on a definition of a clutch player as one who does better in higher-leverage situations than in others.
We didn’t seek just hitters who do well in clutch situations; that would lead us to a list like the one above featuring Bonds, Votto, McGwire, Ramirez, and Pujols. That hitters who are good overall are also good in clutch situations is already evident. We are seeking out players who “rise to the occasion,” performing better in clutch situations than in non-clutch situations. That could be a star who becomes a superstar when the pressure is on or an everyday player who somehow raises his game when he must.
To perform this analysis, we created a simple simulation of 18 equal players over an entire season. (We also created arguably more realistic simulations, adding variations in baserunner advancement and lineup-dependent batter characteristics. These did not meaningfully affect the overall results.) We calculated the runs they produced from two perspectives:
- The win probability before and after each of their plate appearances. This measurement, like Win Probability Added, gives greater weight to plate appearances that significantly move a team’s likelihood of winning. We multiplied this result by 10 to yield runs. This is our proxy for situation-dependent runs.
- Linear weights batting runs for the player’s full-season statistics, irrespective of when the plate appearances occurred. This is our proxy for situation-independent runs.
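The two perspectives above can be sketched in a few lines of Python. This is only an illustration, not the simulation used in the study: the function names, the sample inputs, and the linear-weights coefficients are assumptions made for the example (the article does not list the weights it used). Only the conversion of 10 runs per win comes from the text.

```python
# Assumed linear-weights run values per event. These are illustrative
# placeholders, NOT the coefficients used in the study.
ASSUMED_WEIGHTS = {
    "1B": 0.47, "2B": 0.78, "3B": 1.09, "HR": 1.40, "BB": 0.33, "OUT": -0.25,
}

RUNS_PER_WIN = 10  # from the article: win probability change x 10 yields runs


def situation_dependent_runs(wp_before_after):
    """Sum the change in win probability over plate appearances, scaled to runs.

    wp_before_after: iterable of (win_prob_before, win_prob_after) pairs.
    """
    return RUNS_PER_WIN * sum(after - before for before, after in wp_before_after)


def situation_independent_runs(event_counts, weights=ASSUMED_WEIGHTS):
    """Linear-weights runs from season totals, ignoring game situation."""
    return sum(weights[event] * count for event, count in event_counts.items())


# Hypothetical inputs: three plate appearances and a small stat line.
sample_pas = [(0.50, 0.55), (0.30, 0.28), (0.80, 0.95)]
sample_line = {"1B": 2, "HR": 1, "OUT": 4}

print(f"{situation_dependent_runs(sample_pas):.2f}")
print(f"{situation_independent_runs(sample_line):.2f}")
```

The first function rewards the same hit more when it swings the game; the second credits every single, walk, or home run identically regardless of score or inning.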
We ran 1,000 full-season simulations. Our objective was to derive the standard deviation of the difference between a player’s situation-dependent and situation-independent runs. We found an average standard deviation of about 10 runs, equal to 0.4 x the square root of the player’s plate appearances.
We then analyzed every player with 500 or more plate appearances (8,963 in total) between 1946 and 2017, and calculated a Z-score equal to the difference between the player’s situation-dependent and situation-independent batting performance divided by the standard deviation calculated above. This enabled us to identify batters who performed better or worse in high-leverage situations compared to their overall norms.
For example, in 1998, Robin Ventura had 674 plate appearances. Using our conventions, he generated 19 situation-dependent runs and three situation-independent runs. The difference is 16 runs. The standard deviation for his season is 0.4 x 674^.5 = 10.4. So Ventura’s 1998 season yields a Z-score of 16 / 10.4 = 1.5, one-and-a-half standard deviations above expectation. That season he had a 1.099 OPS in high-leverage plate appearances and a .713 OPS in other plate appearances.
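The Ventura arithmetic can be reproduced with a short sketch. The function and constant names are ours, chosen for illustration; the 0.4 coefficient and the run totals are the figures given in the text.

```python
import math

SD_COEFFICIENT = 0.4  # from the article: SD = 0.4 x sqrt(plate appearances)


def clutch_sd(plate_appearances):
    """Standard deviation (in runs) of the dependent-minus-independent gap."""
    return SD_COEFFICIENT * math.sqrt(plate_appearances)


def clutch_z_score(dependent_runs, independent_runs, plate_appearances):
    """Gap between the two run measures, in standard deviations."""
    return (dependent_runs - independent_runs) / clutch_sd(plate_appearances)


# Robin Ventura, 1998: 674 PA, 19 situation-dependent runs, 3 situation-independent.
sd = clutch_sd(674)
z = clutch_z_score(19, 3, 674)
print(f"SD = {sd:.1f} runs, Z = {z:.1f}")  # SD = 10.4 runs, Z = 1.5
```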
We found a modest overall skew toward positive Z-scores, as the histogram below illustrates. We would expect the average Z-score to be 0.00; the actual mean was 0.08 and the median 0.06. Under a normal distribution, we would expect 1,422 players with Z-scores greater than 1.00; there were 1,828 in our sample.
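The expected count of 1,422 follows from the upper tail of a standard normal distribution, assuming the Z-scores were normally distributed with mean zero. A minimal check, with variable names of our own choosing:

```python
import math


def normal_upper_tail(z):
    """P(Z > z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(z / math.sqrt(2))


player_seasons = 8963  # seasons of 500+ PA, 1946-2017, from the article
observed = 1828        # seasons with Z > 1.00, from the article

expected = player_seasons * normal_upper_tail(1.0)
print(round(expected))             # 1422
print(observed - round(expected))  # 406 more than expected
```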
We attribute this modest skew to equally modest selection bias. By limiting our sample to regular players (i.e., those with 500 or more plate appearances), we are selecting better-than-average players. A slight positive skew does not impugn our methodology and should have no impact on our conclusions. We did not set out to determine whether batters, in aggregate, do better or worse in higher-pressure situations. In fact, over the 2017 season, batters hit slightly better in clutch situations: .256/.329/.422 in high-leverage plate appearances and .255/.323/.427 in other plate appearances (defining high-leverage as a leverage index above 1.5, which accounted for 18 percent of 2017 plate appearances).
Our goal is to determine whether clutch hitting is a replicable skill that some players possess in greater quantities than others.
Coming in Part 2: Results
Pete Palmer is the co-author with John Thorn of The Hidden Game of Baseball and co-editor with Gary Gillette of the Barnes and Noble ESPN Baseball Encyclopedia (five editions). Pete worked as a consultant to Sports Information Center, the official statisticians for the American League from 1976 to 1987. Pete introduced on-base average as an official statistic for the American League in 1979 and invented on-base plus slugging (OPS). He won the SABR Bob Davids Award in 1989, was selected by SABR in 2010 as a charter member of the Henry Chadwick Award, and is the 2018 recipient of the SABR Analytics Conference Lifetime Achievement Award.