December 30, 2011
Prospectus Hit and Run
Morris on the Ballot, Smith to Close
After delivering the JAWS piece on first basemen earlier this week, I had planned to tackle the outfielders (Tim Raines, Bernie Williams et al.) next. The sad news of Greg Spira's untimely passing on Wednesday presented me with a reason to change course, however. In the service of working on a chapter on Jack Morris’s Hall of Fame case for Extra Innings: More Baseball Between the Numbers in November, I had called upon the Internet Wayback Machine to unearth Greg's seminal research piece questioning whether Morris "pitched to the score," a piece that was published in Baseball Prospectus 1997, predating Morris’s arrival on the BBWAA ballot by three years and Joe Sheehan's own outstanding Morris research by five years. I suggested to Dave Pease that we republish it on our site to run alongside yesterday’s article in tribute to our fallen colleague and friend, a fine example of his intellectual curiosity and dogged research efforts, particularly as the work dated to a time when Retrosheet was in its infancy and the relevant data not easily compiled. This piece is dedicated to his memory.
Last year, the Baseball Writers' Association of America broke a 20-year streak when they elected Bert Blyleven to the Hall of Fame. Not since Fergie Jenkins' election in 1991 had a starting pitcher with fewer than 300 wins been voted in. It was Blyleven's 14th turn on the ballot, a span longer than the existence of JAWS (even in its unnamed form), longer even than my own history of ballot analyses, which date back to 2002. With him safely tucked into Cooperstown, and with no particularly compelling newcomers on the ballot, the spotlight falls to Jack Morris, a candidate around whom there is passionate debate.
If you missed the introduction to this year's series, please read here. Note that this is the first ballot in which I've worked with the revised version of our Wins Above Replacement Player measure that Colin Wyers has spent the past year implementing around these parts. This has created something of a seismic shift, in that higher replacement levels and different methods of measuring offensive, defensive, and pitching value have shaken up the standings of some candidates relative to the standards, which have shifted as well—after all, they're averages of individual player values. In general, the WARP values for most players are lower, and in some cases very different from what previous iterations or various competing systems have told us. Baserunning is now in the mix, as is a play-by-play defensive system.
For pitchers, this is especially true. Our previous WARP values were inflated because pitchers were compared to a replacement-level pitcher backed by replacement-level fielders. Research has shown that replacement-level fielding is a misnomer, and it makes far more sense to compare pitchers to a replacement-level pitcher backed by average fielding, which serves to lower cumulative pitching WARP across the board. Relievers are especially hard hit as they're compared against a different baseline than starters, because the historical record strongly suggests that replacement-level relievers perform better than replacement-level starters.
Our new WARP values are driven by Fair Run Average, a runs-per-nine measure that adjusts for some of the shortcomings of Earned Run Average. Going through the play-by-play record with a fine-toothed comb, FRA does a better job of dividing up the responsibility when a pitcher departs with men on base by taking into account the run expectancy of the situation, the expected yield given the number of outs and the location of baserunners. While it does away with the distinction between earned and unearned runs—and thus scales about nine percent higher than ERA, pegged to the league scoring rate—it adjusts for the quality of defensive support received. Furthermore, it credits the pitcher’s sequencing; a walk issued with the bases empty is less costly than one with the bases loaded. It also credits a pitcher's ability to get groundballs and infield popups; a grounder with a man on first base and less than two out is worth more than one with nobody on base. For more on FRA, please read Colin's re-introduction here.
A brief explanation for the alphabet soup items, first from the old school table: AS is All-Star and CY is Cy Young Awards won; 3C is a tally of leagues led in the Triple Crown categories for pitchers (wins, ERA, and strikeouts); HoFS and HoFM are the Bill James Hall of Fame Standards and Hall of Fame Monitor, respectively; Bal is how many years the player has appeared on the ballot, and 2011% is the player's share of the vote on the last ballot, with 75 percent needed for election.
From the new school table: FRA is Fair Run Average, FRA+ is its normalized equivalent (like ERA+, but computed using the formula (2 - FRA/LeagueFRA) * 100 for reasons best explained here), PRAA is Pitching Runs Above Average, VORP is Value Over Replacement Player, Career is career WARP, Peak is WARP from a player's best seven seasons at large, and JAWS is the average of the two.
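For readers who want to check the arithmetic, the formulas in the legend above can be sketched in a few lines of Python. This is only an illustration of the definitions as stated in the text, not BP's actual implementation; the function names are invented for the example, and treating Peak as the sum of the seven best seasonal WARP totals is an assumption based on the description above.

```python
def fra_plus(fra, league_fra):
    """Normalized FRA, using the formula quoted in the text:
    (2 - FRA / LeagueFRA) * 100. League average comes out to 100."""
    return (2 - fra / league_fra) * 100


def peak_warp(season_warps, n=7):
    """Peak: total WARP from a player's n best seasons, consecutive or not.
    (Summing the seasons is an assumption; the text only says 'best seven'.)"""
    return sum(sorted(season_warps, reverse=True)[:n])


def jaws(career_warp, peak):
    """JAWS: the average of career WARP and peak WARP."""
    return (career_warp + peak) / 2
```

Under this formula a pitcher whose FRA matches the league's scores exactly 100, and one whose FRA is 10 percent lower than the league's scores 110.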
Note that unlike the positional standards, there is no adjustment to be made for positional distribution with the pitchers. However, as in that revised methodology, I am no longer throwing out the bottom scores in computing the standard—it's a straight average of the Hall's starters.
Morris racked up high win totals over the course of his 18 seasons, and put up some stellar performances in October (7-4, 3.80 ERA) beyond that Game Seven. He reached the 20-win plateau three times, and won at least 18 games six times. He accumulated more wins in the Eighties than any other pitcher, as his supporters are prone to say, as if we should privilege a total created by arbitrary endpoints that define the decade of skinny ties and trickle-down economics more highly than any other measure. In this view, pitcher wins are an endangered species because of the move to the five-man rotation and the systematic use of specialized bullpens designed to take advantage of late-inning matchups. Since he debuted in 1977 (another arbitrary endpoint alert), Morris’s 254 wins have been surpassed by only six pitchers: Greg Maddux (355), Roger Clemens (354), Tom Glavine (305), Randy Johnson (303), Mike Mussina (270), and Jamie Moyer (267).
One problem with exalting Morris’s win total is that it ignores the level of offensive support that he received. Borrowing a concept from Pete Palmer and Gary Gillette in the ESPN Baseball Encyclopedia (a book for which Spira served as an associate editor) but using BP's own park factors, we can express a pitcher's run support in normalized form just as we can ERA+ or FRA+. Morris’s SUP+ was 106.4, meaning that he received run support that was 6.4 percent better than the park-adjusted league average. Via the Pythagorean Theorem, each extra percentage point difference in run support translates roughly to a .005 gain in winning percentage, or an extra win for every 200 decisions. All else being equal, Morris’s 6.4 percent advantage would translate to a record of 234-206 over the course of his 440 decisions, assuming average run prevention ability.
Run prevention is where Morris’s problem relative to the Hall of Fame really begins. His 3.90 ERA would be the highest in Cooperstown, supplanting Red Ruffing's 3.80. His 104 ERA+ would be the second lowest, ahead of only Rube Marquard's 103. Just eight Hall of Fame pitchers have an ERA+ lower than 110. Morris’s supporters dismiss his high ERA by noting that it’s distorted by the 5.91 mark he put up over his final two seasons; through 1992, he stood at 3.73, with a 109 ERA+ but "only" 237 wins. This is hardly unique, even among Hall of Famers. Catfish Hunter was hit for a 4.52 ERA and an 86 ERA+ while battling injuries over his final three seasons; he finished with a 105 ERA+, one percent better than Morris. Steve Carlton was rocked for a 5.72 ERA over his final three seasons. Phil Niekro was lit for a 6.30 ERA in his final year. Blyleven posted a 4.35 ERA and just a 90 ERA+ over his final four seasons, a span that included a full year missed with injury; he had one stellar year (17-5, 2.73 ERA) and two with ERAs above 5.00 in that span. All of them elevated their win totals by hanging on, but with the possible exception of Blyleven, none enhanced their Hall of Fame cases.
Morris isn't helped any by the move to Fair Run Average. His FRA+ of 100 (99.52, actually) means he was basically league average at run prevention once you adjust for defense and bullpen support. That's 10 percent worse than the average Hall of Fame starter, and it would be the third-worst among that lot, ahead of only Hunter (98.6) and Bob Lemon (94.0). His seasonal WARP totals aren't very impressive either. His 1983 season rates as his most valuable at just 4.0 WARP, and he's got just four other seasons above 3.0, and one other between 2.0 and 3.0. Just to throw out a comparison to the most recently elected pitcher, Blyleven had six seasons above 4.0, with a high of 9.1 (the highest single-season mark since 1950, in fact), and another two above 3.0. With the new numbers, Blyleven ranks 19th among all starting pitchers in JAWS. Morris ranks 167th, tied with Aaron Sele while coming nowhere near either the career or peak mark.
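The run-support arithmetic above can be reproduced with a short Python sketch. It is a rough check rather than the method actually used for the article: the function names are invented, and the exponent of 2 in the Pythagorean formula is an assumption (the classic Bill James form), since the text does not specify one.

```python
def rule_of_thumb_wpct(sup_plus, gain_per_point=0.005):
    """Winning percentage for a pitcher with average run prevention,
    using the text's rule of thumb: about .005 per point of SUP+ above 100."""
    return 0.5 + (sup_plus - 100) * gain_per_point


def pythagorean_wpct(run_ratio, exponent=2.0):
    """Pythagorean winning percentage from the ratio of runs scored to runs
    allowed. The exponent of 2 is an assumption, not taken from the article."""
    return run_ratio ** exponent / (run_ratio ** exponent + 1)


def expected_record(wpct, decisions):
    """Turn a winning percentage into a rounded (wins, losses) record."""
    wins = round(wpct * decisions)
    return wins, decisions - wins


# Morris: SUP+ of 106.4 over 440 decisions, league-average run prevention.
print(expected_record(rule_of_thumb_wpct(106.4), 440))  # (234, 206)
print(expected_record(pythagorean_wpct(1.064), 440))    # (234, 206)
```

Both routes land on the 234-206 record cited above, which suggests the .005-per-point shortcut is a reasonable approximation of the full formula for run-support edges of this size.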
Supporters have dismissed Morris’s high ERAs with claims that he "pitched to the score." The research efforts of Spira and Sheehan have long since put the lie to this claim. In studying Morris’s won-loss record through 1993 (his second-to-last season), Spira found that he was just four wins ahead of his projected record based upon run support. Sheehan, who pored over Morris’s career inning-by-inning via Retrosheet, concluded: "I can find no pattern in when Jack Morris allowed runs. If he pitched to the score—and I don't doubt that he changed his approach—the practice didn't show up in his performance record." Morris’s record is more a product of strong run support than it is special strategy. For all of his extra wins and post-season success, his case rests on a distortion of the value of one shining moment rather than a well-rounded career.
His candidacy is at a critical juncture. After vaulting from 44.0 percent in 2009 to 52.3 percent in 2010, his 11th year on the ballot, Morris inched forward only slightly last year, his 12th on the ballot, to 53.5 percent. With Barry Bonds, Roger Clemens, Craig Biggio, Curt Schilling, Sammy Sosa and Mike Piazza all hitting the ballot next year, and Greg Maddux, Tom Glavine, Mike Mussina, Jim Edmonds, and Jeff Kent all showing up in 2014, the ballot will produce such a deluge of eligible players with reasonable cases for Cooperstown that there's no way Morris will sneak through, barring a Blylevenesque jump into the 70 percent range and a subsequent lowering of resistance by the opposition. I don't see it happening.
The original was an eighth-round pick out of a Tampa high school by the Twins in 1991, just months before Morris would pitch his brilliant Game Seven. The same round of that draft produced Jason Schmidt, Mike Matheny, Derek Lowe, and Steve Trachsel, a bumper crop as these things go. Radke debuted with the Twins as a 22-year-old in 1995, and spent the first six years of his career toiling for a team that didn't finish higher than fourth place or with more than 78 wins in any season. He struggled in his first two years, leading the league in homers allowed both times, but he broke through in his third season, going 20-10 with a 3.87 ERA, a showing that placed him third in that year's Cy Young balloting behind Clemens and Johnson.
Radke was still at the front of the Twins' rotation in 2001, when they went 85-77 and finished in second place in Tom Kelly's final season; he went 15-11 with a 3.94 ERA, and led the league with a microscopic walk rate of 1.0 per nine. That was the only time he led the league, but he finished no lower than sixth in any of his 12 seasons, and ranked in the top three eight times. He led the league in strikeout-to-walk ratio as well that year with 5.3, one of six top-five and nine top-ten finishes in that category. He missed two and a half months the following season due to a groin strain, and finished with a career-worst 4.72 ERA; it was the only time between his rookie season and his final one in which he didn't pitch at least 200 innings or make at least 31 starts. The Twins made the postseason for the first time since 1991 that year, however, and Radke sparkled in the Division Series, beating the A's in Games One and Five for what still stands as their only postseason series victory in the Ron Gardenhire era.
The Twins would make it back to the postseason three more times during Radke's tenure, but he could never help them advance further. He had one more strong year in 2004, when he posted a career-low 3.48 ERA. In 2005, he pitched through the pain of a stiff neck and a torn labrum; rather than undergo surgery and an arduous rehab for the latter, he decided before the 2006 season that it would be his final one. He held to that, with his final appearance a four-inning start in an elimination game in the Division Series against the A's. His career was done three weeks shy of his 34th birthday.
Radke's 2004 season rates as his best according to our advanced metrics, worth 4.2 WARP. He had five other seasons where he was worth between 2.9 and 3.8 WARP, all of them in a row from 1997-2001. His FRA+ numbers in those seasons went as high as 113 in 1998; in a high-scoring era, he beat the league average every year from 1997-2004, with his 120 in that latter season his career best. Even so, he's nowhere near the career or peak standards for a starter, and falls far short on JAWS. He'll remain a point of reference for the Twins' organization long after he's off the ballot.
Traded from the Phillies to the Yankees in February 1994, Mulholland began to bounce around endlessly; during one seven-year stretch, he spent the whole year with the same team just twice. He was a more or less league average starter for most of that spell, though by the late Nineties, he referred to himself as a "utility pitcher," willing to start, mop up, or do anything in between. "I'm like a plumber who's on call 24 hours a day," he said. That willingness to do what was needed helped when he was traded to the Braves on July 31, 1999. He replaced Bruce Chen in the rotation, made a handful of relief appearances, and pitched out of the bullpen as the team made it to the World Series. He would get back to the playoffs again with the Braves in 2000, and with the Twins in 2004, but he would never win a championship. He won't get into Cooperstown either, but he made the most of his two-decade ride.
When I first cobbled together the system that became JAWS, just two relievers were in the Hall of Fame: Hoyt Wilhelm and Rollie Fingers. Since then, that number has more than doubled with the elections of Dennis Eckersley (2004), Bruce Sutter (2006) and Rich Gossage (2008). Though there's plenty to quibble about with regard to Sutter's election in 2006, the larger class made it easier to sketch out a standard for relievers, particularly with Keith Woolner’s development of the Reliever Expected Wins Added (WXRL) stat. WXRL accounted for the discovery that a reliever at the end of a ballgame has a quantitatively greater impact on winning and losing (a ratio called leverage) than a starter does. It measured that impact by comparing a team's chances of winning based on the game state (bases, outs, score differential) before he enters and after he leaves. Our new regime has done away with WXRL, however. Colin has voiced his dissatisfaction with the use of win expectancy to measure reliever contributions, basically objecting on the grounds that it was rewarding a player for things that were out of his control before he showed up. "Leverage travels in only one direction; players can create or destroy leverage for the players after them, but they cannot benefit in the same way from the actions of the player after them," he wrote.
As noted above, in revamping WARP, Colin has used a higher replacement level baseline for relievers than for starters, with the result that even some of the game's best closers don't account for much more than 2.0 WARP per year. Mariano Rivera's best WARP since moving into the closer role is 2.5 in 2007; he has averaged 1.8 per year since then despite ERAs under 2.00 each year. Rivera's career mark of 33.2 WARP still compares very favorably to the enshrined relievers, but this isn't about him.
Traded to Boston after 1987, he continued to post high-quality seasons, though his workload and save totals dipped a bit. Traded again to the Cardinals, he flourished, topping Sutter's NL save record and recording 160 saves in parts of four seasons—taking over the all-time lead in that category—before packing his bags again. Through five more stops, the innings began to take a toll, and his managers limited his usage to about 50 frames a year, one inning at a time, to keep him effective. He spent his last two seasons in a set-up role, with diminishing returns, finally hanging it up in 1998.
From a traditional standpoint, Smith's case starts with his status as the number three guy on the all-time saves list, his seven All-Star selections, and an amazing string of consistency which followed him to virtually every stop on his 18-year ride. Until his abbreviated final season, his ERA+ was always better than league-average, 32 percent better for his career. On the downside, his teams never went further than an LCS appearance, and he got bombed in his brief post-season appearances, blowing two ballgames in best-of-fives.
Smith's career WARP matches the standard for enshrined relievers, a standard based on an admittedly small sample and skewed by Eckersley's high total, much of which owes to his days as a starter:
Smith's career total actually outdoes those of all of the non-Eckersley relievers. He falls shy on peak by what amounts to a couple of runs per year, topping only Fingers and Wilhelm, but in the balance, his score fits behind Eckersley and Gossage.
It's a borderline call. In the years I've done JAWS, I've come down on both sides for Smith, and have remained open-minded as to his qualifications. While the new WARP may have me convinced that it's not a great idea to put relievers in the Hall, it's pretty clear that Smith is more or less even with the standards of those who are in. Given a short slate of JAWS-approved candidates this year, I'm inclined towards inclusion.