January 3, 2002
The MVP Prediction System
Third in a Series
From 1946 through 1993, National League Most Valuable Player awards could be safely predicted, with only a handful of exceptions, using just a few indicators. Since that time, however, the system has already made three major mistakes (the MVP was not selected as a candidate by the system) and one minor mistake (the tie-breaker selected the wrong candidate). That's four out of eight correct calls, a rate that on the face of it suggests that the system may no longer work.
In this conclusion to the series, I'll look at reasons why National League MVP voters may be changing how they go about their business, examine the wrong predictions since 1994, and speculate about the future usefulness of the MVP predictor.
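Four things have changed that may have spelled the end for the old ways of choosing MVPs: the expanded playoffs, higher levels of offense, the increase in available information, and the expansion to Denver along with the debut of Coors Field.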
The first two factors, the expanded playoffs and higher levels of offense, both have the same result: they increase the number of candidates produced by the system. In other words, the guidelines that in the 1950s and 1960s could often identify an MVP, and in the 1970s and 1980s could narrow the field to two or three candidates, may now yield several candidates.
Recall that the system supposes that writers narrow the field by giving credit for a small set of accomplishments, including playing for a winning team, playing an up-the-middle defensive position for a winning team, achieving 100 RBIs, maintaining a .300 BA, and leading the league in a triple-crown category. The system works, presumably, because it accurately reflects the need of the voters to find shortcuts, to make the job manageable even as they continue to file daily game stories and prepare for the off-season.
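As a rough sketch, the point count described above might be written as follows. It assumes one point per accomplishment and one point for each triple-crown category led, for seven in all. The `Season` class, its field names, and the reading of "up-the-middle" as catcher, second base, shortstop, or center field are illustrative assumptions, not part of the original system's definition.

```python
from dataclasses import dataclass


@dataclass
class Season:
    """One player's season, reduced to what the predictor looks at (hypothetical layout)."""
    avg: float                 # batting average, e.g. 0.319
    rbi: int                   # runs batted in
    on_winner: bool            # played for a first-place team
    up_the_middle: bool        # assumed to mean C, 2B, SS, or CF
    led_avg: bool = False      # led the league in batting average
    led_hr: bool = False       # led the league in home runs
    led_rbi: bool = False      # led the league in RBIs


def system_points(s: Season) -> int:
    """Count how many of the seven shortcut criteria a season meets."""
    checks = [
        s.on_winner,
        s.on_winner and s.up_the_middle,  # position counts only on a winner
        s.rbi >= 100,
        s.avg >= 0.300,
        s.led_avg,
        s.led_hr,
        s.led_rbi,
    ]
    return sum(checks)


# A made-up shortstop on a first-place team, hitting .310 with 105 RBIs.
example = Season(avg=0.310, rbi=105, on_winner=True, up_the_middle=True)
print(system_points(example))  # 4
```

The players with the most points in a given year are the system's candidates; when more than one player shares the top total, a separate tie-breaker decides.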
In 1951, when Roy Campanella won the award, there was only one first-place team (although Campanella's Dodgers forced a playoff with the Giants). Campanella was one of eight National League players with a .300 batting average and also one of eight with at least 100 runs batted in. In 1971, Joe Torre was one of just four National Leaguers to knock in at least 100 runs, although the list of .300 BA hitters was longer at 14. In 2001, I count 25 players with 100 RBIs and 26 regulars with at least a .300 BA. At that level, whether or not the writers still believe that these figures are important milestones, there may simply be too long a list generated to serve as a useful shortcut.
The third change for present day voters is the increase in available information, including the mainstream emergence of sabermetric ideas. Presumably none of the sportswriters makes use of Equivalent Average or VORP, but in an era in which Peter Gammons talks about the wonders of OPS, it wouldn't be shocking if writers used evidence beyond the old BA/HR/RBI.
Readers under the age of 25 may not realize just how little information was available as recently as 1980. USA Today began printing a full statistical package once a week in the 1980s, and many newspapers converted to a more informative daily leaderboard feature a few years later. Before that, not only were basic stats such as OBP and SLG almost completely unavailable, but even the raw material for those stats was hard to come by. Standard box scores did not include batters' bases on balls, and I do not believe that The Sporting News's weekly stats package included them either. Most newspapers ran triple-crown leaders daily--usually the top ten in batting average and a handful of leaders in RBIs, HRs, and little else--and a weekly list of all players, ranked by batting average, that included very little information.
I don't know whether the packages provided by the teams to their beat reporters included, for example, walks, but I suspect that the beat writers who did the voting had little more to help them than the public did. What it all comes down to is that an enterprising sportswriter who had read and absorbed the first Bill James Baseball Abstracts would have had an awful hard time making use of that new perspective in MVP voting. There was an explosion of new information in the 1980s, including the deservedly-mocked situational stats, but until reporters were able to turn to the Web, they were dependent on the press-relations departments of the various teams to put it all together. So the proper time frame for the impact of the information explosion is the mid-1990s.
The last curve thrown to the National League MVP voters was expansion to Denver in 1993, followed by the debut of Coors Field in 1995. Baseball people, including sportswriters, have known a little bit about park effects for decades, but there's no sign that MVP voters took any account of Wrigley Field or the Astrodome when filling out their ballots. Not so here: everyone knew right away that Denver was different. However, knowing that parks affect what happens in them is only the first step in knowing how to deal with Coors Field when voting for MVPs, and it was not at all clear what the voters would do. First, they would be faced with system-labeled candidates who were undeserving, and on top of that, to the extent that Rockies dominated the triple-crown stats, the voters would have to decide how to award credit for leading the league. Should a runner-up to a Mile High player receive credit for "leading" the league?
In the last several years, then, National League MVP Award voters who had previously used the shortcuts that make up the MVP predictor presented here have been faced with confusion. Of the seven points available, four (the two involving winning teams, and the two statistical standards) had become far less discriminating measures, and the other three (leading the league in triple crown stats) were often unavailable or pointed to guys such as Dante Bichette, who really didn't strike anyone as an MVP. Plus there was all that new information, and some senior sportswriters and some baseball executives were saying that you had to pay attention to some of it--it wasn't just a few cranky computer nerds, although that group was probably peppering plenty of the voters with e-mail full of odd acronyms.
What to do? The voters could continue as they had in the past, as best as they could. They could reject the old statistical approaches and find new, non-statistical ways of picking MVPs; the new information certainly could arm anyone wishing to vote for a non-traditional candidate with some fancy-sounding information. They could adopt new statistical standards to replace the old ones. Or, they could muddle through, depending on the circumstances.
The first year under the new conditions, 1994, was an odd one on top of all of that, because the season ended in August. Nevertheless, the system correctly predicted the winner: Jeff Bagwell was one of four candidates, and he easily won the tie breaker (BA + HR + RBI + 15 for division winners + 15 for up-the-middle defensive position). In calculating the predicted winner, I assume that voters treated the teams in first place when play stopped as division winners, and that voters retained the 100-RBI standard even in the short season. Both assumptions seem reasonable to me, but it is worth noting that Bagwell's Astros finished second in the NL Central (and were leading for the wild card), so that Bagwell's candidacy would be enhanced if voters ignored the standings. Bagwell, however, is a case that proves very little; he would have been the MVP for 1994 under most plausible criteria. (Caveat: had the season continued, Bagwell's broken hand, suffered just before the strike, would have damaged or even ended his candidacy.)
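A minimal sketch of the tie-breaker formula quoted above. It assumes that batting average is counted in points (.319 becomes 319), which is the only scaling that keeps it comparable to home run and RBI totals, and that the two 15-point bonuses apply independently. The function name and the two sample players are hypothetical.

```python
def tiebreak_score(avg, hr, rbi, division_winner=False, up_the_middle=False):
    """BA in points + HR + RBI, plus 15 for each bonus that applies."""
    score = round(avg * 1000) + hr + rbi
    if division_winner:
        score += 15
    if up_the_middle:
        score += 15
    return score


# Two made-up candidates tied on system points.
corner_slugger = tiebreak_score(0.320, 30, 100)                 # 320 + 30 + 100
shortstop = tiebreak_score(0.305, 25, 95, division_winner=True,
                           up_the_middle=True)                   # 305 + 25 + 95 + 30

print(corner_slugger, shortstop)  # 450 455
```

In this made-up case the two bonuses are enough to lift the shortstop past a hitter with better raw numbers, which is the kind of swing the tie-breaker is built to allow.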
In 1995, Coors Field opened, and for the next three years Coors swamped the MVP predictor. In 1995 and 1996, the system produced just one candidate each year: Dante Bichette in 1995, and Andres Galarraga in 1996. Voters rejected both. In 1997, the system identified a record six candidates, including Rockies Galarraga and Larry Walker; Walker was the easy winner of the tiebreaker, and this time the voters rewarded the Coors candidate.
What can we conclude from these three cases? First, there does not appear to be a simple deduction for playing in Coors. Both Bichette in 1995 and Galarraga in 1996 were stronger candidates under the system than was Walker in 1997, so if there was an across-the-board discount for Rockies, then Walker would not have been the only winner. Beyond that, it is harder to draw lessons.
The Bichette case looks basically like the 1988 Darryl Strawberry/Kirk Gibson case and other previous system mistakes. In each of these cases, the winner was elevated from system obscurity. Bichette had four system points in 1995, while Barry Larkin was one of several players with three points, but fared poorly in the tie breaker; Mike Piazza (.346/32/93) should have been an easy choice over Larkin (.319/15/66). The voters opted for non-system candidate Larkin in a close race (281-251) over system selection Bichette, just as they had bolted the system for Gibson over Strawberry in another narrow contest.
The 1996 race was different. Instead of a strong runner-up, four-pointer Galarraga fared poorly in the voting, finishing in a distant tie for sixth. The winner, Ken Caminiti, was one of a number of three-point candidates, and had impressive triple-crown stats (.326/40/130), within 15 of Galarraga's combined numbers (.304/47*/150*), and easily the best of any non-Rockie. In other words, while 1995 looks like a typical system miss, in which the system candidate is narrowly defeated by an MVP chosen for qualities the predictor ignores, in 1996 the voters almost ignored the system candidate and selected the system runner-up.
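The "within 15" comparison can be checked by hand or with a few lines of code. This sketch assumes that "combined numbers" means batting average in points plus home runs plus RBIs, with no bonuses applied; the function name is hypothetical.

```python
def triple_crown_sum(avg, hr, rbi):
    """Batting average in points plus home runs plus RBIs."""
    return round(avg * 1000) + hr + rbi


caminiti = triple_crown_sum(0.326, 40, 130)   # 326 + 40 + 130
galarraga = triple_crown_sum(0.304, 47, 150)  # 304 + 47 + 150
gap = galarraga - caminiti

print(caminiti, galarraga, gap)  # 496 501 5
```

Under that reading the gap is five, smaller than either of the 15-point tie-breaker bonuses.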
Then in 1997, the voters rewarded the first Coors MVP. Unlike Bichette and Galarraga, Larry Walker had established a strong reputation before arriving in Colorado (Galarraga had also played well north of the border, but was a disaster in St. Louis); perhaps that made his Coors performance more "real" to the voters. Walker's season was also one that, in fact, looked good (but not MVP-level) after adjusting for park context, while the other two seasons were really pretty mediocre but for Coors Field; still, it is hard to believe the voters make such distinctions. If they do, it is hard to imagine exactly what triggers the distinction. All in all, I see no explanation other than a long list of ad hoc rationalizations, none of which can be even strongly supported, let alone proven.
Following the three Coorsflation seasons, we have now had four seasons in which the candidate lists have been relatively Rockie-free. The system was correct in two of those years. Jeff Kent was the lone four-point candidate in 2000, and Sammy Sosa beat seven other three-point candidates in the 1998 tiebreaker. As discussed earlier, Barry Bonds narrowly lost the tiebreaker to Sosa in 2001, but took home the award anyway.
In each of these three seasons, the predictor seemed to function normally, with two correct selections and one understandable tie-breaker error. Indeed, while it is never possible to know what would have happened had circumstances changed, it is not entirely far-fetched to imagine that Sosa might have won the 2001 award had the Cubs won the NL Central, or that Luis Gonzalez might have won had he managed to hold on to the league lead in RBIs, despite the amazing year Bonds had. It is tempting to suppose that only the narrow differences between the candidates allowed the voters to ignore their normal procedures and support the new home-run king.
A far different story, however, needs to be told about the 1999 season. Chipper Jones, late-season heroics notwithstanding, was one of several candidates with three system points. For the first time in the entire era under consideration, however, a system winner was completely ignored by voters. Jones, with the division-winning Braves, had a .319 BA and 110 RBIs. Three points. Mark McGwire (.278/65*/147*) also had three points, and easily outpaced Jones in the tiebreaker. McGwire finished fifth in the voting, behind Jones, three-pointers Jeff Bagwell (.310/47/132) and Matt Williams (.303/35/142), and single pointer Greg Vaughn (.245/45/118). Bagwell drew support, but another division-winning Astro was thoroughly ignored by the voters. Carl Everett (.325/25/108) had four system points, but finished 17th in the voting. He is, in the almost six decades considered here, the first system selection that the voters had no use for at all.
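The 1999 point totals can be reproduced with a short tally. In this sketch the batting averages, RBI totals, and league-leading marks are the ones quoted above; the team and position flags are assumptions inferred from the point totals the paragraph reports, and the tuple layout is purely illustrative.

```python
# (name, avg, rbi, on_winner, up_the_middle, triple_crown_categories_led)
# Stat lines follow the paragraph above; the two boolean flags are
# inferred from the reported point totals, not stated outright.
SEASONS_1999 = [
    ("Chipper Jones", 0.319, 110, True,  False, 0),
    ("Mark McGwire",  0.278, 147, False, False, 2),  # led in HR and RBI
    ("Jeff Bagwell",  0.310, 132, True,  False, 0),
    ("Matt Williams", 0.303, 142, True,  False, 0),
    ("Greg Vaughn",   0.245, 118, False, False, 0),
    ("Carl Everett",  0.325, 108, True,  True,  0),
]


def points(avg, rbi, on_winner, up_the_middle, categories_led):
    """One point per criterion met, out of seven possible."""
    return (int(on_winner)
            + int(on_winner and up_the_middle)
            + int(rbi >= 100)
            + int(avg >= 0.300)
            + categories_led)


totals = {name: points(*stats) for name, *stats in SEASONS_1999}

for name, total in sorted(totals.items(), key=lambda item: -item[1]):
    print(f"{name:14s} {total}")
```

Run as written, the tally puts Everett alone at four points, with Jones, McGwire, Bagwell, and Williams at three and Vaughn at one, matching the totals described above.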
Why? One possibility is that the voters were repulsed by Everett's personality. His reputation, while possibly not as bad in 1999 as it is now, after his tour with the Red Sox, has never been very positive. However, this explanation seems implausible to me. Turning with some trepidation to the American League, I note that Albert Belle was similarly "robbed" of an award in both 1995 and 1996, yet he finished second in one year and third in the other. Of course, I do not claim that the predictor works in the American League anyway--it has rarely picked a correct winner in the last 15 years--but the point here is that voters appear capable of voting for a player they presumably dislike, even if they are unwilling to place him first on their ballots.
Instead, I suspect that the answer is that in an age with three division winners, each of whom has four players at up-the-middle positions, it is simply not enough for one of those players to hit .300 and knock in 100 runs. By not enough, I mean that it's not enough to get noticed. Perhaps voters simply don't care about those landmarks any more, but more likely they are finding other ways of narrowing their choices, or, when they find that their normal methods do not work, they depart from them and find new, non-statistical methods of finding a winner.
Indeed, there was another somewhat similar case in 2001. The Cardinals finished in a dead heat with the Astros for the Central Division lead and for the wild card; thanks to the tie-breaker, St. Louis was the wild card. Generally, the system does not give credit for winning to wild card teams, but if the Cardinals were a "winner" in 2001, then Jim Edmonds would be the predicted MVP with a four-point season. As it is, the system gives Edmonds just two points, and the voters gave him absolutely nothing, not even a stray tenth-place vote.
From these eight cases in the new era, I think there are a few lessons to be learned. Sometimes--five out of eight so far, if we include the Bonds tie-breaker error as understandable--the old norms still apply. Sometimes, Coors will produce a winner so unqualified that the voters will reject him, but so far at least there's no way to tell what will trigger the voters to do that. And sometimes, the new conditions of the game will produce a system prediction that the voters reject, as they did with Everett in 1999. In those cases where the normal standards do not guide voters, there is no apparent objective new set of standards replacing the old.
When they do reject the candidate, the winner is about as unpredictable as ever. Chipper Jones and Ken Caminiti both had very visible big series late in the pennant race, and Caminiti and Barry Larkin were both widely considered the leaders of their teams, but other players have had big pennant-race games or a good reputation for leadership without getting the award. Each of the recent winners in system-mistake years was a veteran player who had not previously won an MVP (Jones won it in his fifth season, at the age of 27), but that was not true of Roy Campanella back in 1955. Each of the recent winners on the list played for a division winner, but not Roberto Clemente in 1966. None of the 1990s group was on a miracle team, unlike Frank Robinson in 1961. Jones's Braves and Larkin's Reds won their divisions easily. Perhaps voters are seeing things that outsiders cannot spot. Or, perhaps, there are bandwagon effects; when the normal ways of narrowing the field don't work, writers use baseball columnists and "Baseball Tonight" commentators to suggest candidates.
On balance, I think the predictor is still a useful tool. It defines the normal way that NL MVPs are selected; it is the other awards that are in need of explanation. While their American League colleagues increasingly vote in seemingly random ways (Ichiro!?), National League voters appear to apply consistent standards, at least most of the time, at least when the standards yield plausible results.
Jonathan Bernstein's favorite Blue Sox heroes have always been Pete Gibbs and Tweet Tillman. He once again thanks the folks at rec.sport.baseball and others who have informed this study, and especially thanks Henry Schulman for his helpful comments.