Back at the end of August, our friends at ESPN tasked me with building an MVP predictor in the spirit of a system such as Bill James’ Hall of Fame Monitor, one that awards points for various accomplishments in an attempt to identify who will win as opposed to who should win. Limiting my scope to the post-strike timeframe to take advantage of the fact that none of the ensuing winners were pitchers, that all of them save for the 2003 version of Alex Rodriguez came from teams that finished above .500, and that 22 of the 28 hailed from teams that qualified for the expanded postseason, I built a carefully-gerrymandered system—Jaffe’s Ugly MVP Predictor (JUMP)—that “predicted” 14 winners, and put 27 out of 28 winners in its top three in points for that year.
Getting any closer on the direct hits proved impossible (at least for me), because over the 1995-2008 span, the voters were only so predictable. Nonetheless, certain tendencies made it easier to identify the top candidates, enabling me to rack up secondary hits in all but one election. Instead of focusing on the round-numbered benchmarks typical of a Jamesian system (30 homers, 100 RBI, etc.), I focused on league rankings among batting title qualifiers in key offensive categories, discarding anything that didn’t have the power to predict a player into that top three. To a bit of my surprise and disappointment, on-base percentage fell by the wayside in the process—one simply can’t do a better job of predicting voter behavior by taking it into account. The categories ultimately found to have some bearing on prediction were as follows: batting average, slugging percentage, home runs, total bases, runs, RBI, intentional walks and stolen bases. Additionally, player position and team success carried significant bonuses which had a major impact on winnowing the field of candidates, with middle infielders getting a boost, and designated hitters a penalty. Team success generated points for three levels of accomplishment, a .500 record, a wild card, or a division title. Playing for the Colorado Rockies carried an additional penalty.
Ultimately, the one player who slipped through the cracks, who didn’t make the top three in his league in points but won an MVP award during the timeframe, was the one who appeared at that juncture to have some bearing on the 2009 AL MVP race: Ivan Rodriguez in 1999. I could come up with no positional or team bonus which could explain his election in what was an unusually crowded field; Pudge ranked 10th in JUMP points but won an especially close election. At the time of the article’s publication, Joe Mauer was leading the AL in all three triple-slash categories (as he did at season’s end), but because the Twins were a game under .500, he wasn’t anywhere close to the top three in points; I could figure out no bonus for catcher that didn’t upset the rest of the system’s historical “predictions.” One need only remember the 2006 vote, when teammate Justin Morneau received 15 first-place votes and won, while Mauer finished a distant sixth, to appreciate the fact that recent voters haven’t awarded backstops any extra credit for their position. Mind you, that was not at all to suggest Mauer wasn’t worthy of winning (a point initially misunderstood by a few readers) but simply that it would have been an anomaly given 1995-2008 voting patterns. Instead, the system identified Mark Teixeira, Miguel Cabrera and Chone Figgins as the three frontrunners for the award.
The Twins, as you’ll recall, did pull it together long enough to make the playoffs, providing Mauer with enough points to crack the final top three; he ranks between Teixeira and Derek Jeter. The Yankee first baseman, whose anointment by Tyler Kepner as the obvious frontrunner touched off the debate which led to my piece, led the league in homers, total bases and RBI while playing for a division winner—a set of accomplishments that are usually enough to garner the trophy, at least in years when a catcher doesn’t hit .365. With the voting to be announced today, however, there’s little doubt that Mauer will actually win; the question is more whether he’ll do so unanimously. While he won’t count as a direct hit on JUMP, he’ll keep the system’s secondary hit total intact.
As for the NL, the top three in order are Albert Pujols, Ryan Howard and Chase Utley, the same as they were back in late August. Pujols’ league lead in slugging, homers, total bases, runs, and intentional walks (not to mention top-three finishes in batting average and RBI) while playing for a division winner should be enough to carry the day, and to keep the JUMP system’s 50 percent rate of direct hits intact.
A quick list of those direct hits, for anyone interested:
American League
1997: Ken Griffey
1998: Juan Gonzalez
2000: Jason Giambi
2003: Alex Rodriguez
2004: Vladimir Guerrero
2005: Alex Rodriguez
2007: Alex Rodriguez
2008: Dustin Pedroia
National League
1996: Ken Caminiti
1998: Sammy Sosa
1999: Chipper Jones
2002: Barry Bonds
2003: Barry Bonds
2005: Albert Pujols
Jay: I'm surprised that in this article you haven't addressed the recent Cy Young votes, which seem to indicate a shift in the way the electorate thinks. It would seem likely that if Will Carroll and Keith Law use advanced stats in deciding on their Cy votes, other new voters will do the same for MVP.
I think that goes to the point where he's trying to predict who will win instead of who should win. Until the electorate actually changes the behavior and JUMP is way off base, the formula shouldn't change too much.
Jay, how do you build these formulae? The process sounds like something that might be interesting for a genetic algorithm approach, where you take a bunch of variables together, weight them randomly, then iterate over successive "generations" combining the most successful approaches and introducing random "mutations" to get past the sticking point. Apologizing for explaining something you already know, but it just struck me as an interesting application of something I learned in a random comp sci elective years ago and never followed up on.
Also, when you do something like this with a relative paucity of data (plenty of stats, but in the end really only 28 members in the set) are you ever concerned that you haven't created something predictive but rather just refined your algorithm until it learned the data set? I remember that being a big problem
Jack, my process was not tremendously far off from what you describe, though I didn't think of it exactly in those terms. At one point when I ran into trouble, I called it my "Three-I" system: intuition, iteration, and idiocy.
You can read the details in the original piece, but the short version goes something like this:
* I began by tabulating how many MVPs from the era had led their leagues or placed in the top 5 or 10 in several key offensive categories, and building a point system that would account for that, essentially as you suggest - experimenting with different weightings (usually double or half, or in some cases - OBP - zero) and watching whether tweaking them improved or compromised the predictive nature.
* For the next phase I added in team performance, ultimately settling on a system which awarded "team success points": one for finishing at .500 or above, another for winning the wild card, or two more for winning the division. Because team performance has had such a bearing on the vote, ruling out virtually everybody on sub-.500 teams during this era, this category ends up being multiplied by a much larger factor - I settled on Team Wins divided by nine, so a player from an 81-win team would get nine points (one "team success point" times 81/9), and from a 99-win division winner would get 33 points (three "team success points" times 99/9).
* Finally, I introduced positional bonuses and penalties as well as the Rockies penalty, basically arriving at optimal values based upon trial and error. There's no real rhyme or reason why playing second base is worth 3.33 points, shortstop 5 and DH -13 except that when I move the values or introduce other ones (say, a catcher bonus), fewer direct or secondary hits occur.
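For anyone who wants to check the team-performance arithmetic in the second bullet, here is a minimal Python sketch of just that one component. The function names and the 162-game default are assumptions made for illustration, not part of the original system, and the sketch deliberately leaves out the category rankings, positional bonuses and Rockies penalty.

```python
def team_success_points(wins, games=162, won_division=False, won_wild_card=False):
    """Hypothetical helper: "team success points" as described in the list above."""
    points = 0
    if 2 * wins >= games:      # finished at .500 or above
        points += 1
    if won_division:           # two more for winning the division
        points += 2
    elif won_wild_card:        # or one more for winning the wild card
        points += 1
    return points


def team_component(wins, games=162, won_division=False, won_wild_card=False):
    """Team success points multiplied by team wins divided by nine."""
    success = team_success_points(wins, games, won_division, won_wild_card)
    return success * wins / 9


print(team_component(81))                      # 81-win team, no playoffs
print(team_component(99, won_division=True))   # 99-win division winner
```

The two calls reproduce the nine and 33 points quoted above; a 95-win division winner works out to roughly 31.7 points by the same formula.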
In the end, "algorithm" is probably a better term to use than "predictor," because while the primary goal was to predict the 2009 MVP races, the system really was designed to "predict" races which had already happened.
"While he won’t count as a direct hit on JUMP, he’ll keep the system’s secondary hit total intact."
This statement is confusing to me. It doesn't seem like Mauer is listed in the top 3 in your system and "intact" would imply that the secondary hit total would stay the same.
Looking at it again, keeping the "hit total intact" wasn't the right wording on my part. Since Mauer did finish in the final top three (between Teixeira and Jeter) it would have been better to say that he helped maintain the system's nearly-perfect secondary hit rate.
Questions about 1999. How badly did it rate Ivan Rodriguez? He did hit .332 with 32 homers, 25 steals, 113 RBI and his customary Gold Glove (plus the otherworldly 34 stolen bases allowed in 75 attempts and 10 runners picked off of first base).
If Pudge had been a shortstop with outstanding defense, how would that have changed the numbers?
The vast majority of Pudge's points (31.7) are from the Rangers winning 95 games and the division. He had top 10 finishes in AVG, SLG, TB and Runs, but as they were all outside of the top 5, those are worth only one point apiece. He was just 14th in RBI, so he gets no points for that. Nor does he get any points for defense, as great as his numbers were, because there's little evidence to suggest voters have looked at defensive numbers in any MVP vote during the era. Had he played shortstop, which would have given him the maximum positional bonus, he'd have climbed to eighth. But assigning catchers the same point totals as shortstops sets off more false positives elsewhere in the system than it solves, which is why there isn't a positional bonus applied.
One could argue that the fact that Pudge was the only catcher to win between Munson in 1976 and Mauer in 2009 suggests that the voters actually looked at defensive numbers in this instance; certainly, knowing that a catcher threw out more than 50 percent of base thieves is the kind of thing that might garner some attention. But Pudge did that multiple times, and I'd imagine a few other catchers did so as well, so again, handing out points for such accomplishment runs the risk of setting off false positives elsewhere (Mike Matheny, MVP candidate?) even if it does help shore up the explanation for that vote.
That 1999 vote was a very interesting one and totally anomalous. Pudge won narrowly over Pedro Martinez (whom JUMP discounts, right?). I also recall that two writers left Pedro off their ballots completely because they didn't believe MVP is a pitcher's award. However, I believe that one of those two was a writer for the NY Post who had previously voted for David Wells. So....
I wonder if the 1999 AL MVP race was just so close that all the little idiosyncratic prejudices that the voters have had a greater impact and played havoc with attempts to establish a consensus set of rules. For example, there were writers who wouldn't vote for a Red Sox pitcher, writers who wouldn't vote for an umpire spitter, writers who wouldn't vote for Manny Ramirez, writers who wouldn't vote for a DH (Palmeiro). All of that led to votes for Pudge that really should have gone to someone else.
Are there other close votes where JUMP was incorrect?
Interestingly, the Hall of Fame monitor faces the PED question now, doesn't it? Mark McGwire is a 170 on the scale - a sure-fire HOFer by that measure and yet he's not getting in any time soon. There is just no perfect way of accounting for how human beings will ultimately think and act. You couldn't put a factor in the HOF monitor for suspected PED usage or congressional testimony performance, could you?
On a system-wide basis, I don't know offhand whether JUMP was more likely to be incorrect on the closer votes, but I suspect so. From 1995 through 2008 there were actually three votes which were decided by narrower margins, all in the AL, and the system misses each time. It misses in 1995 on Mo Vaughn (calling for Albert Belle, who finished second by eight points), in 1996 on Juan Gonzalez (calling for Belle again; he finished a distant third, with A-Rod finishing second by just three points), and in 2001 on Ichiro Suzuki (calling for Bret Boone, who finished third, with Jason Giambi finishing second by just eight points).
As for the HOF Monitor, it's pretty out of date, period. Not only can't it account for PED questions - hey, who can? - it's not built to account for expansion, the wild card and additional round of playoffs, the reduction of complete games, and the rising home run and scoring rates which prevailed in the late '90s and early '00s. A 30 HR/100 RBI season just isn't all that special in this day and age.
I've tried to, unsuccessfully. One need only look at Mike Piazza to be reminded of why. He had a handful of top five finishes in batting average, and top 10s in other categories. He put up some of the greatest offensive seasons ever for a catcher and still failed to win an award. JUMP places him in the top 3 only in 1995, and when one rejiggers the system to put him in there more often, it reduces the number of direct and secondary hits.
The bottom line is that the system is built to recognize guys who "look like" MVP candidates. In the wild card era, those guys have by and large not been catchers, so it can't make much headway in figuring out what a catcher has to do to win the award. Perhaps with another year of data and another several hours to jigger with it, I can improve the system. But it ain't happening today.
Interesting. I'm trying to build a similar system to determine which LOOGYs will receive a down-ballot vote but it just keeps returning "You're an idiot".
The more interesting question than why Konishi voted for Cabrera is why he had a vote in the first place. There's got to be, what, 800,000 people in the world more aware of what goes on in MLB. That kind of gross ignorance sullies the voting process (more than it's already sullied by idiots)...