It Ain’t Over ‘Til It’s Over, Baseball Prospectus’ book on the best pennant races of all time, is available in stores and online through Amazon. If you like what you read here in this sidebar from the chapter covering the 1967 American League pennant chase, you’ll love a book with more than 420 additional pages of this sort of content, perfect reading for every fan as he or she settles in to enjoy the final stretch drives and then October’s postseason action.

The Summer of Loving Carl Yastrzemski
by Jay Jaffe

In the simplified narratives that our sports media produce, the notion of one player’s carrying a team is a popular and appealing one. It puts a human, even superhuman, face on a disparate collection of players, emphasizing the strengths of one hitter’s or one pitcher’s accomplishments while glossing over his own weaknesses and those of his teammates. Who cares about Babe Ruth’s lousy baserunning, or who was riding shotgun to Joe DiMaggio in 1941, or even Barry Bonds’s peevishness unless it actually cost his team a game? Can one player carry a team? Performances like Carl Yastrzemski’s final two weeks of September 1967, when he hit a jaw-dropping .523/.604/.955, certainly suggest it’s possible for a short time. In the longer term, the nature of baseball would suggest not. Aside from the obvious (the simple unlikelihood of one player’s maintaining such a high level of performance over a larger time frame), there’s the inherent structure of the game. The best hitter can bat only once in every nine of his team’s plate appearances, the most durable pitcher needs a few days’ rest between starts, and even the best fielder (beyond catchers) handles the ball only a handful of times each game, making it extremely unlikely that a team could keep relying on the same player over and over again for that extra boost.

As superficial as the notion of one player’s carrying a team may be, our ability to quantify the contributions of each player via an all-encompassing value metric like wins above replacement player (WARP) lends itself well to exploring the limitations of this concept as it applies to a full season. WARP measures each player’s hitting, pitching, and fielding contributions against those of a freely available reserve or waiver-wire pickup. The metric calculates these contributions in terms of runs and then converts those runs into the currency of wins. Park and league contexts are built right into WARP, so that, for example, a player in a barren offensive environment such as mid-1960s Dodger Stadium and another player in a bountiful one such as turn-of-the-century Coors Field can be measured on the same scale. With WARP in hand, we can answer questions such as the following:

  1. How much impact does the presence of one great player have on a team’s chances?
  2. How much impact does the presence of one great player have on a team’s chances if he’s head-and-shoulders above all his other teammates?

To address these questions, we created a pool consisting of every AL and NL team since 1901, excluding the 1981 and 1994 strike years, for 2,082 teams in all. We logged the WARP scores of each team’s top two players, the team’s win-loss record, its spot in the standings (ignoring the wild card), and games behind first place. The average team in the sample had a .500 record, of course. It won its pennant or division title 14.6 percent of the time and finished an average of 16.9 games out of first place.

With that baseline in mind, Table 1-6 shows a composite look at how the teams did, solely according to the WARP levels of their top-ranked players. The sample sizes of the upper rows in the table are small enough that the composites can be dragged down by a few great seasons put up by players on horrendous teams (Cal Ripken Jr.’s 17.4 WARP for the 67-95 Orioles in 1991 or Steve Carlton’s 15.4 WARP for the 59-97 Phillies in 1972, for example), but it’s clear that having one great player greatly increases the chances of a team’s winning its league or division. Teams with a player of at least 13.0 WARP, such as George Brett in 1985 (.335/.436/.585) or Albert Pujols in 2006 (.331/.431/.671), fill the top three rows. Combined, they give us 129 teams with a winning percentage of .556, a 27.1 percent chance of winning a pennant or division, and an average finish of 7.3 games out of first. In all, these are outcomes comparable to those of the 12.0-12.9 bracket, with slightly fewer successes but a greater number of close finishes. Once the team’s best player falls below 12.0 WARP, the odds of winning take a significant hit. Once the WARP falls below 10.0, the advantage is pretty much lost, and it takes an extreme fluke to make a winner out of a team whose best player is worth less than 8.0 WARP, such as the 1980 Astros, led by Jose Cruz (.289/.367/.421).

Table 1-6
The Impact of a Team's Top Player on a Team's Performance
WARP Score of       # of   W-L     Pennant
Team's Top Player   Teams  Record  Success (%)   GB
>15.0                23    .550      17.4       10.2
14.0-14.9            42    .563      31.0        6.1
13.0-13.9            64    .553      28.1        6.9
12.0-12.9           128    .557      32.8        7.8
11.0-11.9           206    .536      23.8       10.6
10.0-10.9           329    .532      21.9       10.8
9.0-9.9             390    .512      14.9       14.5
8.0-8.9             384    .492       9.9       18.1
7.0-7.9             265    .457       3.8       24.8
6.0-6.9             169    .429       0.0       29.6
<6.0                 82    .370       0.0       39.9

The number of teams winning without a star of at least 8.0 WARP is just 10 out of 516, or 1.9 percent. The average WARP scores of the top five players on those 10 winners are 7.5, 6.8, 6.4, 6.1, and 5.6 (32.5 total for the top five). Even with a balance of players having good years, it's very tough to win without at least one player having a star-caliber year. One player can't carry a team, but somebody has to do the heavy lifting.
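The pooled figures quoted for Table 1-6 can be checked with a team-weighted average of the bracket rows. The sketch below is only an illustration built from the rounded table entries; the names `TABLE_1_6` and `pool` are made up for this example, and the original study presumably worked from team-level data rather than from these bracket averages.

```python
# Rows copied from Table 1-6:
# (top-player WARP bracket, teams, win pct, pennant success %, avg games behind)
TABLE_1_6 = [
    (">15.0",      23, 0.550, 17.4, 10.2),
    ("14.0-14.9",  42, 0.563, 31.0,  6.1),
    ("13.0-13.9",  64, 0.553, 28.1,  6.9),
    ("12.0-12.9", 128, 0.557, 32.8,  7.8),
    ("11.0-11.9", 206, 0.536, 23.8, 10.6),
    ("10.0-10.9", 329, 0.532, 21.9, 10.8),
    ("9.0-9.9",   390, 0.512, 14.9, 14.5),
    ("8.0-8.9",   384, 0.492,  9.9, 18.1),
    ("7.0-7.9",   265, 0.457,  3.8, 24.8),
    ("6.0-6.9",   169, 0.429,  0.0, 29.6),
    ("<6.0",       82, 0.370,  0.0, 39.9),
]


def pool(rows):
    """Combine brackets, weighting each bracket's averages by its team count."""
    teams = sum(n for _, n, _, _, _ in rows)
    win_pct = sum(n * w for _, n, w, _, _ in rows) / teams
    pennant_pct = sum(n * p for _, n, _, p, _ in rows) / teams
    games_behind = sum(n * g for _, n, _, _, g in rows) / teams
    return teams, win_pct, pennant_pct, games_behind


# Teams whose best player was worth at least 13.0 WARP (top three rows).
star_teams, star_win, star_pennant, star_gb = pool(TABLE_1_6[:3])

# Teams whose best player was worth less than 8.0 WARP (bottom three rows).
weak_teams, _, weak_pennant, _ = pool(TABLE_1_6[-3:])
# Recover the whole number of winners from the rounded percentage.
weak_winners = round(weak_teams * weak_pennant / 100)

print(star_teams, round(star_win, 3), round(star_pennant, 1), round(star_gb, 1))
print(weak_winners, weak_teams, round(100 * weak_winners / weak_teams, 1))
```

From the rounded rows this gives 129 teams at .556 and 27.1 percent, and 10 winners out of 516 (1.9 percent). The pooled games-behind figure comes out near 7.2 rather than the 7.3 quoted in the text, a gap small enough to be explained by rounding in the table entries.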

Turning to the second question, we looked at teams with the biggest WARP gaps between their best and second-best players (Table 1-7). Note how much lower the composite winning percentages are at each level than they were when we considered only the team's best player.

Table 1-7
Binary Star Power: The Impact of a Team's Top Two Players
on a Team's Performance
                       Avg. WARP,                Pennant
WARP Gap*  # of Teams  Top Player    W-L Record  Success (%)   GB
>7.0           24          15.0        .512        4.2       15.5
6.0-6.9        31          13.6        .490        9.7       20.3
5.0-5.9        41          12.5        .512       19.5       15.1
4.0-4.9        94          11.8        .513       18.1       14.7
3.0-3.9       200          11.0        .509       16.0       15.6
2.0-2.9       333          10.0        .501       15.6       16.4
1.0-1.9       551           9.1        .499       14.5       16.9
0.0-0.9       808           8.3        .496       13.7       17.6

*Difference between the WARP of the best and the second-best players on a team.

The lack of a supporting star clearly hampers the winning effort. If we combine all the teams with a WARP gap greater than 4.9, we have 96 teams with an average best player of 13.5 WARP. According to Table 1-6, we'd expect this group to have a winning percentage of around .553, a pennant success rate of around 28 percent, and an average finish about seven games out of first. Yet the composite for the group is just .505, 12.5 percent, and 16.9 games behind. Given a star player of a certain quality, the lack of a second effective player appears to significantly lower a team's pennant chances.

A simple correlation confirms these observations. Across the sample of 2,082 team-seasons, the correlation between the team-high WARP score and that team's winning percentage is 0.49. But the correlation between the team's second-highest WARP score and winning percentage is actually higher, at 0.60. In other words, having a second star-level player is a better predictor of winning. Furthermore, balance and depth appear to be quite important; the correlations between winning percentage and WARP score increase the deeper one drills into the roster, at least from third-highest (correlation 0.64) to fourth-highest (0.68) to fifth-highest (0.69).
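The comparison between the wide-gap teams and the Table 1-6 expectation can be reproduced from the published rows. This is a rough sketch under stated assumptions: the names `WIDE_GAP_ROWS`, `EXPECTED`, and `weighted_mean` are invented for the example, and using the 13.0-13.9 row of Table 1-6 as the expectation is an assumption about how the quoted ".553, 28 percent, seven games" benchmark was chosen.

```python
# Rows copied from Table 1-7 for teams with a WARP gap above 4.9:
# (gap bracket, teams, avg top-player WARP, win pct, pennant success %, games behind)
WIDE_GAP_ROWS = [
    (">7.0",    24, 15.0, 0.512,  4.2, 15.5),
    ("6.0-6.9", 31, 13.6, 0.490,  9.7, 20.3),
    ("5.0-5.9", 41, 12.5, 0.512, 19.5, 15.1),
]

# Assumed benchmark: the 13.0-13.9 row of Table 1-6, which matches the
# "around .553, around 28 percent, about seven games" quoted in the text.
EXPECTED = {"win_pct": 0.553, "pennant_pct": 28.1, "games_behind": 6.9}


def weighted_mean(rows, column):
    """Average one column across brackets, weighted by teams per bracket."""
    teams = sum(row[1] for row in rows)
    return sum(row[1] * row[column] for row in rows) / teams


total_teams = sum(row[1] for row in WIDE_GAP_ROWS)
actual = {
    "top_warp": weighted_mean(WIDE_GAP_ROWS, 2),
    "win_pct": weighted_mean(WIDE_GAP_ROWS, 3),
    "pennant_pct": weighted_mean(WIDE_GAP_ROWS, 4),
    "games_behind": weighted_mean(WIDE_GAP_ROWS, 5),
}

# How far the wide-gap teams fall short of what their star alone would predict.
shortfall = {key: round(actual[key] - EXPECTED[key], 3) for key in EXPECTED}

print(total_teams, {key: round(value, 3) for key, value in actual.items()})
print(shortfall)
```

The pooled rows round to the 96 teams, 13.5 WARP, .505, 12.5 percent, and 16.9 games cited above. Against the assumed benchmark, that is roughly 48 points of winning percentage, 16 points of pennant success, and 10 games in the standings.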

That finding shouldn't be very surprising, given the reasons we have just discussed. Baseball is primarily a team sport; a team can't simply expect its best player to do the bulk of the heavy lifting and still succeed. Nonetheless, the rare situations in which a player rose far above his teammates for a winning season are instructive. Table 1-8 shows the top 10 WARP differentials for pennant or division winners.

Babe Ruth tops the list with what was, according to WARP, the greatest season of all time. Though he hit "only" 41 home runs in 1923, Ruth's .393/.545/.764 season established career highs in batting average, on-base percentage, hits (205), walks (170), and doubles (45). Ruth definitely carried the Yankees' offense that year; second baseman Aaron Ward (7.4 WARP) was the only other hitter above 4.8 WARP, and his value was largely defensive. Pitchers Joe Bush (7.8 WARP) and Herb Pennock (7.1) keyed a staff that was slightly better at preventing runs (relative to the league) than the offense was at scoring. Debuting on June 15 of that year but drawing just 26 at-bats was Lou Gehrig, who would give the Bambino an effective compatriot and would keep Ruth off the upper reaches of this list, except for 1926. That year, the Babe was 21st on the list when the Iron Horse reached double digits in WARP for the first of 11 times. Ruth also ranked fourth in his 1921 season, when he bashed 59 home runs to fuel .378/.512/.846 hitting, but he had moderate offensive help from Bob Meusel (6.8 WARP) and Ward (6.3), as well as a pitching staff with an effective one-two punch in Carl Mays (9.5) and Waite Hoyt (6.9). The Ruth-Mays tandem also ranks 16th on the list for the 1916 season, when the Bambino was exclusively a pitcher for the Red Sox. The game's future savior was 23-12 with a 1.75 ERA in 323.2 innings, helping the Sox to the third of four World Championships between 1912 and 1918.

Table 1-8
Batman and Robin: Top 10 WARP Differentials Between a
Team's Best and Second-Best Players
Rank Year Team Player 1     WARP  Player 2   WARP   Dif   W-L    GB
 1   1923 NYA  Babe Ruth    18.2  Joe Bush    7.8   10.4  98-54 16.0
 2   1945 DET  H. Newhouser 14.4  Cullenbine  8.0    6.4  88-65  1.5
 3   1968 SLN  Bob Gibson   13.9  Lou Brock   7.6    6.3  97-65  9.0
 4   1921 NYA  Babe Ruth    15.5  Carl Mays   9.5    6.0  98-55  4.5
 5   1962 SFN  Willie Mays  13.1  O. Cepeda   7.5    5.6 103-62  1.0
 6   1990 PIT  Barry Bonds  13.0  Doug Drabek 7.5    5.5  95-67  4.0
7T   1984 CHN  R. Sandberg  11.9  Leon Durham 6.5    5.4  96-65  6.5
7T   1997 SFN  Barry Bonds  12.2  Shawn Estes 6.8    5.4  90-72  2.0
 9   2003 SFN  Barry Bonds  13.4  J. Schmidt  8.2    5.2 100-61 15.5
10T  2004 LAN  A. Beltre    12.8  Eric Gagne  7.7    5.1  93-69  2.0
10T  1995 CLE  Albert Belle 14.0  Jim Thome   8.9    5.1 100-44 30.0
10T  1995 ATL  Greg Maddux  13.6  Tom Glavine 8.5    5.1  90-54 21.0
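
The "Dif" column is simply the best player's WARP minus the runner-up's, and the ranking follows from sorting on that difference. The short sketch below recomputes the column from the other two; `TABLE_1_8` and `warp_gap` are names chosen for this illustration only, and the figures are the rounded ones printed in the table.

```python
# Rows copied from Table 1-8:
# (year, team, best player, WARP, second-best player, WARP, listed Dif)
TABLE_1_8 = [
    (1923, "NYA", "Babe Ruth",    18.2, "Joe Bush",    7.8, 10.4),
    (1945, "DET", "H. Newhouser", 14.4, "Cullenbine",  8.0,  6.4),
    (1968, "SLN", "Bob Gibson",   13.9, "Lou Brock",   7.6,  6.3),
    (1921, "NYA", "Babe Ruth",    15.5, "Carl Mays",   9.5,  6.0),
    (1962, "SFN", "Willie Mays",  13.1, "O. Cepeda",   7.5,  5.6),
    (1990, "PIT", "Barry Bonds",  13.0, "Doug Drabek", 7.5,  5.5),
    (1984, "CHN", "R. Sandberg",  11.9, "Leon Durham", 6.5,  5.4),
    (1997, "SFN", "Barry Bonds",  12.2, "Shawn Estes", 6.8,  5.4),
    (2003, "SFN", "Barry Bonds",  13.4, "J. Schmidt",  8.2,  5.2),
    (2004, "LAN", "A. Beltre",    12.8, "Eric Gagne",  7.7,  5.1),
    (1995, "CLE", "Albert Belle", 14.0, "Jim Thome",   8.9,  5.1),
    (1995, "ATL", "Greg Maddux",  13.6, "Tom Glavine", 8.5,  5.1),
]


def warp_gap(best, second):
    """WARP differential, rounded to one decimal like the table."""
    return round(best - second, 1)


computed = [warp_gap(row[3], row[5]) for row in TABLE_1_8]

# Rows where the recomputed gap disagrees with the printed Dif column.
mismatches = [
    (row[0], row[1], gap, row[6])
    for row, gap in zip(TABLE_1_8, computed)
    if gap != row[6]
]

# The table should already be ordered from largest gap to smallest.
is_ranked = computed == sorted(computed, reverse=True)

print(computed)
print(mismatches, is_ranked)
```

Every printed differential matches the subtraction, and the rows are already in descending order of gap, with ties accounting for the "7T" and "10T" ranks.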

Pitcher Hal Newhouser, second on the list, rose to prominence during World War II, when many of the majors' top players were serving in the military. Newhouser had been a sub-.500 pitcher on mediocre teams, but, freed from service because of a congenital heart ailment, he rocketed to 29-9 with a 2.22 ERA in the softened balata-ball season of 1944. He went 25-9 with a 1.81 ERA in 1945, earned MVP honors in both years, and led the Tigers to the 1945 World Championship. Bob Gibson places third with his legendary 22-9, 1.12 ERA season in 1968, the Year of the Pitcher. Though Lou Brock ran a distant second at 7.7 WARP, just a hair's breadth separates Brock from Curt Flood (7.6) and Dal Maxvill (7.5) as the Cards' second-best player that year, a good illustration of the value of balance beyond one superstar. Willie Mays, the fifth-ranked player, hit .304/.384/.615 with 49 home runs in 1962. Orlando Cepeda was the WARP runner-up to Mays on a talented team with five future Hall of Famers on the roster: Mays, Cepeda, Juan Marichal, Gaylord Perry, and Willie McCovey. Blocked by Cepeda at first base, McCovey was a platoon outfielder that year, hitting .293/.370/.590 overall and worth 3.8 WARP, or better than half the WARP of Cepeda in roughly 40 percent of the playing time.

Barry Bonds cracks the top 10 in three very different eras of his career. In his Pirate guise in 1990, he had no particularly outstanding teammate, but, like Gibson, had several who were close in value. Doug Drabek (7.6), Bobby Bonilla (7.4), Jay Bell (7.3), and Andy Van Slyke (7.2) proved a more-than-ample supporting cast. In 1997, before the alleged steroid use that pumped up his muscles and statistics, he had less support; besides Shawn Estes (6.8), only Jeff Kent (6.6) topped 6.0 WARP, though J. T. Snow (5.9) came close. By 2003, he'd lost Kent as a teammate, but had help from Jason Schmidt (8.2) and José Cruz Jr. (7.3). Ray Durham (6.3) might have challenged for the number two spot had he not missed nearly all of August with injury.

The all-time top differential didn't come from a pennant winner, but from the greatest season in baseball history, according to WARP. On the strength of a 36-7, 1.14 ERA season, Walter Johnson's 18.3 WARP in 1913 produced the largest gap of all, 12.4 WARP ahead of Chick Gandil's 5.9. With no supporting cast (the future Black Sox instigator Gandil should hardly count), Johnson dragged the Senators to a second-place, 90-win finish. Laboring for some wretched teams, Johnson produced five of the top 20 differentials in our study.

Yastrzemski's 1967 WARP total (12.6) was 3.6 wins better than that of teammate Rico Petrocelli, a differential that sits in a five-way tie for 42nd place on this list, alongside Sandy Koufax's of 1966, Lou Gehrig's of 1936, Ron Guidry's of 1978, and Vlad Guerrero's of 2004. Yastrzemski's season was unimpressive by that measure, perhaps, but the WARP gap statistics hardly lessen the magic of his accomplishment. Yaz showed that a player might spark a team for a spell of 10 or 14 games, but maintaining that kind of hot streak over a full season is the real Impossible Dream.