If you’ve paid any attention to the 2012 season, you know that Albert Pujols has yet to hit a home run. The three-time MVP, fresh off the first homerless month of his career, is hitting just .208/.252/.287 with career-worst walk and strikeout rates. Jered Weaver’s no-hitter last night temporarily deflected some attention away from Albert’s struggles. But while Weaver mowed down Minnesota, Pujols’ homerless streak was extended to 107 plate appearances, ensuring that scrutiny of his every swing will only intensify once the no-hitter hubbub dies down.
Pujols averaged 39 home runs for the Cardinals over the past five seasons. After factoring in some age-related decline and the difficulty of hitting home runs from the right side in Angel Stadium, PECOTA projected him to hit 33 in 2012. The probability that a 33-home-run hitter would go homerless over 107 plate appearances by chance alone is just 0.3 percent. Either Pujols has been extremely unlucky, he’s declined more quickly than PECOTA expected, or he’s pressing at the plate.
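For readers who want to check a figure like that, here is a minimal sketch of the arithmetic. It assumes every plate appearance is an independent trial with the same home-run probability. The article gives the 33-homer projection but not the projected plate appearances, so the `season_pa` values below are assumptions chosen only to show how sensitive the answer is; the function name is likewise just illustrative.

```python
def prob_homerless(season_hr: float, season_pa: float, streak_pa: int) -> float:
    """Chance of zero homers in streak_pa independent plate appearances."""
    p_hr = season_hr / season_pa        # per-PA home-run probability
    return (1.0 - p_hr) ** streak_pa    # every PA must fail to produce a homer

# season_pa values are assumptions; the article states only the 33-HR projection.
for season_pa in (600, 650, 700):
    pct = 100 * prob_homerless(33, season_pa, 107)
    print(f"{season_pa} PA: {pct:.2f}% chance of a 107-PA homerless streak")
```

Under any of those assumed workloads the result lands at a few tenths of a percent, the same order of magnitude as the figure quoted above.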
Privately, Pujols is probably feeling some pressure. Publicly, though, he claims to be unconcerned. “I don’t think about that, man. It could be tomorrow, maybe the next day, a month from now, I don’t know. My job is to get myself ready to play and take my swing. Home runs, when they come, they come in bunches.”
At this point, Pujols would probably settle for hitting homers in dribs and drabs, let alone bunches. According to his comments, though, when he does start hitting homers, they’ll add up in a hurry. But can Albert be believed?
The belief that home runs are hit in bunches (in other words, that they’re hit in flurries followed by droughts rather than at regular intervals) isn’t unique to the struggling Angels star. When Bryan LaHair went homerless this spring, Cubs manager Dale Sveum said, “People forget that home runs come in bunches.” Since then, they have for LaHair, who has hit six in the regular season, if not for the Cubs, who have collectively hit fewer home runs than Matt Kemp. But the history of “home runs in bunches” goes back well beyond Bryan LaHair. Writers and players alike have been referring to the idea at least since the middle of last century: in 1958, Willie Mays said, “When I hit home runs I get them in bunches and then no more for a time.”
Is there anything to this, or is “home runs are hit in bunches” another baseball myth that deserves to be busted? Google “clutch hitting,” “pitching to the score,” or a host of other time-honored baseball beliefs, and you’ll find countless studies that have tried and failed to find any statistical evidence supporting them. The contention that homers are hit in bunches seems to have escaped investigation so far, but it’s just as easy to check.
Using the binomial distribution, we determined the theoretical rates of zero-, one-, two-, three-, and four-homer games for the average major-league batter. By comparing those predicted rates to how often those games actually occurred, we could see whether there was anything to the idea that home runs are hit in bunches. If players actually alternate between home-run hot streaks and dry spells, their long balls would be bunched together, and we would see higher rates of two- and zero-homer games and lower rates of one-homer games than predicted.
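A minimal sketch of that kind of calculation follows. It assumes a batter gets a fixed number of plate appearances per game and has the same home-run probability in each one; the model behind the tables in this article may treat plate appearances per game differently, so this illustrates the technique and is not a reconstruction of it. The function name and the sample inputs are hypothetical.

```python
from math import comb

def homer_game_rates(p_hr: float, pa_per_game: int, max_hr: int = 4) -> list[float]:
    """Binomial probability of hitting 0..max_hr homers in one game."""
    return [
        comb(pa_per_game, k) * p_hr**k * (1 - p_hr) ** (pa_per_game - k)
        for k in range(min(max_hr, pa_per_game) + 1)
    ]

# Illustrative inputs only: a 3% per-PA homer rate and four PA per game.
for k, rate in enumerate(homer_game_rates(0.03, 4)):
    print(f"{k}-HR games: {100 * rate:.2f}%")
```

If homers really came in bunches, the observed shares of zero- and two-homer games would sit above the numbers this produces, and the one-homer share would sit below.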
Over small samples, of course, some players do have more two-homer games than predicted. In 78 games last season, Mike Cameron hit nine home runs, six of which came in three two-homer bunches. Those three two-homer games were about 2.6 more than the model would have predicted. Cameron was one of five players to have at least two more two-homer games than he “should” have in 2011:
| Name | PA | G | Pred. 2-HR G | Actual 2-HR G | Difference |
|---|---|---|---|---|---|
|  | 476 | 122 | 0.96 | 4 | 3.04 |
|  | 506 | 127 | 0.31 | 3 | 2.69 |
|  | 723 | 155 | 2.33 | 5 | 2.67 |
| Mike Cameron | 269 | 78 | 0.42 | 3 | 2.58 |
|  | 620 | 155 | 3.00 | 5 | 2.00 |
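A predicted two-homer-game count like the ones above can be approximated by summing the binomial probability of exactly two homers over each game a player appeared in. The sketch below does that for Cameron's line (nine homers, 269 PA, 78 games). The split of his plate appearances across games is a hypothetical one made up to match those totals, and the function name is illustrative, so the result should only land near the table's value, not on it.

```python
from math import comb

def expected_k_homer_games(total_hr: int, pa_by_game: list[int], k: int = 2) -> float:
    """Expected number of games with exactly k homers, assuming no bunching."""
    p_hr = total_hr / sum(pa_by_game)   # constant per-PA home-run probability
    return sum(
        comb(n, k) * p_hr**k * (1 - p_hr) ** (n - k)   # P(exactly k HR in n PA)
        for n in pa_by_game
    )

# Hypothetical split: 35 four-PA games and 43 three-PA games = 269 PA in 78 games.
cameron_pa = [4] * 35 + [3] * 43
print(round(expected_k_homer_games(9, cameron_pa), 2))
```

With this assumed split the estimate comes out around 0.36, a bit under the 0.42 listed for Cameron. A real game log, with some five-PA games and some pinch-hit appearances, would shift the number.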
Over larger samples, though, we don’t see correspondingly large differences. Of the 211 players with at least 3000 plate appearances from 2002-2011, only nine had at least five more two-homer games than expected:
| Name | PA | G | Difference |
|---|---|---|---|
|  | 6015 | 1414 | 8.68 |
|  | 5851 | 1344 | 7.22 |
|  | 4778 | 1109 | 6.66 |
|  | 6039 | 1411 | 6.26 |
|  | 4022 | 1021 | 5.89 |
|  | 4783 | 1106 | 5.54 |
|  | 4526 | 1176 | 5.38 |
|  | 6295 | 1455 | 5.36 |
|  | 5083 | 1250 | 5.11 |
It’s possible that Vlad’s homers had some slight tendency to be “bunched,” but even in his case, it’s likely that the difference was due to chance.
So what’s the verdict when we look at home-run distributions for all players? The following table shows the predicted and observed percentages of games in which an average major-league batter hit each number of home runs from 1994-2011. The model predicted that the average player would go homerless in 89.29 percent of his games, hit one homer in 9.99 percent of his games, and hit two homers in 0.68 percent of his games. The predicted and observed results are almost identical, and the slight differences aren’t significant.
| % of games | 0 HR | 1 HR | 2 HR | 3 HR | 4 HR |
|---|---|---|---|---|---|
| Predicted | 89.29 | 9.99 | 0.68 | 0.03 | 0.00 |
| Observed | 89.09 | 10.17 | 0.71 | 0.03 | 0.00 |
Here is what those percentages look like for Pujols’ career. As one would expect, both the theoretical model and the in-game results show that he’s been much more likely to go deep than the typical player, but he hasn’t had more multi-homer games than expected.
| % of games | 0 HR | 1 HR | 2 HR | 3 HR | 4 HR |
|---|---|---|---|---|---|
| Predicted | 75.75 | 21.57 | 2.52 | 0.15 | 0.01 |
| Observed | 75.57 | 21.77 | 2.41 | 0.26 | 0.00 |
So why do Pujols and so many other players mistakenly believe that they’re hitting home runs in bunches? A cognitive bias called the "availability heuristic" might be to blame. According to Amos Tversky and Daniel Kahneman, the psychologists who coined the term, the availability heuristic is our “tendency to make a judgement about the frequency of an event based on how easy it is to recall similar instances.” The easier it is to summon instances of an event to our minds, the more often we believe that event to occur. For hitters, few events are more memorable than a multi-homer game or a long stretch without hitting a homer, so it’s not surprising that those events seem to them to happen more often than they do.
Home runs aren’t really hit in bunches, but it’s probably in the Angels’ best interests not to burst Albert’s bubble. There could be some psychological benefit to believing in bunches. In the midst of a home-run barrage last May, Mark Teixeira explained his success by saying, “Home runs come in bunches, and right now I’m just in one of those streaks where I’m hitting them out of the park a lot.” After ending the longest homerless stretch of his career in July of 2009, Teixeira used the same reasoning to explain his struggles: “I’m a streaky home run hitter. They come in bunches, and after hitting a bunch in a row, it took a while to get another one.”
Teixeira’s all-purpose explanation suggests that while hitting homers in bunches isn’t fact, it is a useful fiction. One of the most important qualities for a hitter to have is confidence, and the “bunches” belief provides a confidence boost for any occasion. A player who has homered recently can go to the plate believing he’s mid-bunch and about to hit another. A player who hasn’t homered in ages can console himself with the thought that a bunch of long balls could be a game away. What Albert Pujols could really use right now is a homer. But some confidence can’t hurt.
Colin Wyers provided research assistance for this article.
A version of this story originally appeared on ESPN Insider.
As with all April stats and streaks, would this be anywhere near as big a story if a) Albert were still a Cardinal or b) this happened in August? Or May, even?
This wouldn't be as big a story if Pujols hadn't just changed teams and signed a giant contract, or if it had happened later in the season. But it's definitely reached the point at which it's a valid cause for concern.
I understand why it's "less computationally intensive". I just want somebody to do that heavy lifting! ;-)
Good article. Thanks Ben!
3/31-4/13: 1/12 drought
4/14-4/23: 6/9 bunch
4/24-5/29: 1/33 drought
5/30-6/7: 6/8 bunch
6/8-8/21: 17/51 steady
8/22-8/30: 0/7 drought
8/31-9/1: 3/2 bunch
9/2-9/28: 3/25 drought
2010: hr/games (6 multihomer games)
4/5-4/12: 5/7 bunch
4/14-5/26: 3/39 drought
5/27-6/6: 6/10 bunch
6/7-6/26: 1/17 drought
6/27-7/3: 5/7 bunch
7/4-7/30: 3/22 drought
7/31-8/27: 12/23 bunch
8/28-9/7: 0/10 drought
9/8-9/12: 4/5 bunch
9/13-9/22: 0/9 drought
9/23-9/26: 3/4 bunch
9/27-10/3: 0/6 drought
bunch: 60 homers in 75 games (130 hrs per 162 games = God himself)
drought: 12 homers in 174 games (11 hrs per 162 games = Casey Kotchman)
steady: 17 homers in 51 games (54 hrs per 162 games = prime Pujols)
Following a game in which a player homered, what is the predicted chance he will homer in the following game, and what is the actual chance he homered in the following game?
Repeat for the chances (predicted and actual) that he would/did homer within either of the next TWO games.
And, maybe for good measure, extend that out to the next 3 or 4 games.
That's what I'd be more curious to see rather than chance of them having multi-homer games.
Therefore, if it's not distributed exactly evenly, then they must be unevenly distributed... in bunches. QED.
The same home stand, the same week, the same month, okay. But not the same game. Talk about a small sample size.
I'm not the least bit surprised that the analysis discovered basically nothing. A lot of effort por nada.
However, coming up with a column named "Overthinking It" has now been revealed to have been a perfect exercise in thought! :)
Six games later Stanton has four home runs and has raised his OPS to .804.
That is hitting home runs in bunches. But there was not even a single multiple home run game in the "bunch."
One way would be to investigate the likelihood of multiple-HR games vs. the observed data. But I think we're in agreement that multi-homer games are not going to capture the essence of what we all think of as "in bunches."
So, what other ways might one go about it?
However, until someone gives it some serious study, I guess that none of us know.
I think that study will definitely need to deal with longer periods of time, probably up to one month. One game just seems too limited. It is barely more revealing than trying to determine the answer by looking at how often home runs come in consecutive at-bats.
BTW, sure, it is anecdotal, but Giancarlo Stanton now has raised his OPS from .598 to .840 at this moment, by hitting six home runs in his last nine games. That's a bunch.
And Matt Kemp is now oh-for-May.