For a number of reasons, the move from this trade deadline that occupied the biggest share of our collective consciousness was the Aroldis Chapman trade. He was sent from the Yankees to the Cubs in exchange for Gleyber Torres, Adam Warren, Billy McKinney, and Rashad Crawford. Chapman was suspended earlier this season for violating the league's domestic violence policy, firing a handgun during an argument with his girlfriend and allegedly choking her as well, which meant this trade was accompanied by numerous thorny moral issues. It also happened before the real madness of the 48 hours leading up to the deadline, which meant it faced less competition for our attention than some of the later trades.
But most relevant to this article, the return seemed huge. Yes, the Cubs are almost definitely going to make the playoffs, and they'll appreciate having a lights-out closer if/when they do, and yes, Torres was almost certainly blocked by other Chicago players, but this still seemed like a high price to pay. Andrew Miller, one of the other elite Yankees relievers, was also traded, and his DRA this season is almost a full run lower than Chapman's. Miller brought back a package of Clint Frazier, Justus Sheffield, Ben Heller, and J.P. Feyereisen, and while I'm not a prospect expert, my sense is the Chapman return is more desirable. It's at least close, which is amazing, because Miller has not only been better than Chapman this year, he's also signed for two years after 2016, at $9 million annually, while Chapman will be a free agent and will probably cost a lot more than Miller over those two years.
What do we do with this? What are we supposed to make of a team like the Cubs, which by all indications is run by very intelligent people, making a decision that looks indefensible? A common (and I think reasonable) reaction is to try to find the assumption that makes the decision look indefensible, and to ask whether the team might know something that proves that assumption wrong. After the Chapman trade, the most common such argument I saw looked generally like this:
What if the volatility of relievers makes the value of the very good ones go up because they are less volatile? https://t.co/TLpEbMmtrW
— Matt Winkelman (@Matt_Winkelman) August 1, 2016
There's something really satisfying about this idea, that teams are paying a premium for reliability. Relievers as a group are so volatile that when one of them has a record of consistency, it stands out. Chapman has been no worse than very good since his debut in 2010, and consistently outstanding since 2012. Miller, on the other hand, bloomed later, in 2014, and has several seasons of poor performance in his past.
It's totally plausible, however, that consistency isn't actually a skill that pitchers possess. Maybe Chapman has been consistent in the way that a coin that lands on heads five consecutive times has been consistent, which is to say, in a way that tells us nothing about his consistency going forward. We've got to at least try to test these ideas about what teams might know that we don't, which is what this article is supposed to do.
I’m going to default to using WARP as my measure of reliever excellence, for two reasons. One, DRA is the best measurement we have of pitcher skill, and it covers nearly all the bases by taking into account both the traditional three true outcome abilities and batted ball suppression. Two, WARP is a counting stat, and so automatically includes playing time and health controls, which I think is appropriate. When considering relievers, teams are presumably hoping for a guarantee of playing time and health, not just skill. I’m spending as little time on these definitional questions as possible, but they lurk in the background of everything that follows, so just keep that in mind when thinking about what exactly this shows.
Here’s a simple way of looking at things, to start. From 2007 through 2014, there were 93 reliever-seasons with a WARP of 2.0 or above. The next season:
- 27 of those players (29%) had a WARP of 2.0 or above
- 39 (42%) had a WARP between 1.0 and 2.0
- 16 (17%) had a WARP between 0.0 and 1.0
- 6 (6%) had a WARP below 0.0
- 4 (5%) didn’t pitch in the majors
- 1 (1%) was converted to starting
From 2008 through 2015, there were 86 reliever-seasons with a WARP of 2.0 or above. The previous season:
- 27 of those players (31%) had a WARP of 2.0 or above
- 21 (24%) had a WARP between 1.0 and 2.0
- 25 (29%) had a WARP between 0.0 and 1.0
- 3 (3%) had a WARP below 0.0
- 5 (6%) didn’t pitch in the majors
- 5 (6%) were converted from starting
So, excellent relievers have stayed excellent in the next year about 1/3rd of the time, and stayed at least good about 2/3rds of the time, but only half of them were good or excellent the year before. This is interesting, but it doesn’t really answer the question, since the allure of Chapman is not just a single year of dominance, but a sustained string.
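As a rough illustration of the bookkeeping behind those breakdowns, here is a minimal Python sketch that classifies what a set of elite reliever-seasons did the following year. The player labels and WARP values are invented for illustration only; the actual study used the 2007–2014 reliever-seasons described above.

```python
# Sketch of the forward-looking breakdown: for each reliever-season with
# WARP >= 2.0, classify what the same pitcher did the next year.
# The data below is a tiny hypothetical sample, not the real dataset.

from collections import Counter

# (player, year) -> WARP; invented values for illustration
warp = {
    ("A", 2013): 2.4, ("A", 2014): 2.1,   # stayed excellent
    ("B", 2013): 2.2, ("B", 2014): 1.3,   # slipped to merely good
    ("C", 2013): 3.0, ("C", 2014): -0.2,  # collapsed
    ("D", 2013): 2.5,                      # no MLB season the next year
}

def bucket(w):
    """Label a next-season WARP with the buckets used in the article."""
    if w is None:
        return "no MLB season"
    if w >= 2.0:
        return "2.0+"
    if w >= 1.0:
        return "1.0-2.0"
    if w >= 0.0:
        return "0.0-1.0"
    return "below 0.0"

# Tabulate next-season outcomes for every elite season in the base year
counts = Counter(
    bucket(warp.get((player, year + 1)))
    for (player, year), w in warp.items()
    if w >= 2.0 and year == 2013
)
print(dict(counts))
```

Dividing each bucket's count by the number of elite base-year seasons gives the percentages in the lists above; the backward-looking breakdown is the same tabulation with `year - 1` in place of `year + 1`.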
So to answer that, I took three groups of player-seasons from 2008 to 2015: those with three or more prior consecutive seasons of 2.0 WARP or more, those with two such prior consecutive seasons, and those with only one. Maybe there’s some difference between those with three seasons of excellence and those with four, but I feel like once a player has put in three straight seasons like this, he has established himself as a consistent performer in the collective baseball mind. Plus, these are some stringent requirements, and the samples get pretty small: only four player-seasons had four consecutive prior years of excellence, and only seven had three, so combining them felt appropriate. I then looked at how those groups performed in the following year.
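The streak-counting step behind that grouping can be sketched as follows. The two career lines are hypothetical, chosen only to mimic the Chapman-like and Miller-like shapes discussed earlier; the threshold and the three-or-more cap match the groups defined above.

```python
# Sketch of the grouping step: for a given season, count how many
# consecutive immediately-prior seasons the pitcher posted 2.0+ WARP.

def prior_streak(warps, year, threshold=2.0):
    """Consecutive seasons at or above `threshold` immediately before `year`.

    `warps` maps season year -> WARP; a missing season breaks the streak.
    """
    streak = 0
    while warps.get(year - 1 - streak, float("-inf")) >= threshold:
        streak += 1
    return streak

# Invented career lines: one consistently elite, one a late bloomer
chapman_like = {2012: 3.1, 2013: 2.6, 2014: 2.4, 2015: 2.7}
miller_like = {2012: 0.4, 2013: 0.8, 2014: 2.3, 2015: 2.6}

# Entering 2015, the first pitcher has three straight elite seasons,
# the second only one
print(prior_streak(chapman_like, 2015))  # 3
print(prior_streak(miller_like, 2015))   # 1

# Capping at 3 assigns each season to the article's three groups
group = min(prior_streak(chapman_like, 2015), 3)
```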
Each group is relatively small—11 players are in the three-or-more consecutive year group, 17 in the two, and 63 in the one—so differences of only a few percent shouldn’t be made too much of. The three-year group doesn’t come out looking bad, but the rate at which those relievers generate two or more WARP is nearly identical to the rate at which the relievers in the two-year group do. If consistency is instead defined as being at least good (i.e., 1 WARP or better), the three-year group has a somewhat clearer lead; 82 percent of its players hit at least that threshold, compared to 65 percent of the two-year group and 73 percent of the one-year group. And, for whatever it’s worth, none of the 11 players in the three-year group had a below-replacement level season. Based on this, past consistency seems to have something to do with future consistency, but not a ton; the biggest gap between the groups in any of the performance buckets is 17 percent. The pattern, of being pretty reliably good but very possibly bad, is consistent across all three groups.
It’s really hard to get around the small number of players in this analysis. There just aren’t that many elite relievers, which leaves this question in an unsatisfying place. If you believed there was something special about Chapman before reading this article, you probably still do; if you believe the Indians got a bargain in Miller, or the Nationals in Mark Melancon, you also probably still do.
The thing about small samples is that teams have to deal with them, too. They probably have countless advantages over the public sphere, but having more excellent reliever seasons to analyze is not one of them. It’s possible the Cubs’ internal analysts were able to tell Theo Epstein and Jed Hoyer with certainty that Chapman would be at least good, while the alternatives might falter at any moment, but I doubt it. I think they’re probably in the dark about relievers, along with the rest of us.