Last year, the must-have item of the Winter Meetings shopping season was a catcher with mad framing skills. In a short period of time, Hank Conger, Ryan Hanigan, Miguel Montero, and everyone who owned a chest protector on the Padres roster changed teams. This year, tastes have changed. Now, the new hip thing that all the cool teams have is a crazy good closer. More to the point, a second crazy good closer to pitch in the role once known as “the eighth-inning guy.” It’s not enough to have one shoe anymore. You need two.

Since Wade Davis got Wilmer Flores looking in the 12th inning of Game 5, Craig Kimbrel has been traded for a quartet of young prospects. Apparently not satisfied with that, the Red Sox also nabbed Mariners wunderkind set-up man Carson Smith. Not to be outdone, the Phillies got a quintet of kids for their closer, Ken Giles. The Dodgers had agreed to trade for Aroldis Chapman to pair with Kenley Jansen, although allegations of Chapman being involved in a domestic violence incident scuttled the deal. And the Yankees are reportedly dangling Andrew Miller as possible trade bait.

What’s interesting is that while deals for high-end relievers have always happened, the knee-jerk reaction that a smart team should never spend resources to chase something as fleeting as relievers, and especially closers, seems to have been skipped this year. Yes, the save stat is still silly. Yes, all of those relievers will pitch only 65 innings this year, and it’s still hard to trust their numbers in such small sample sizes. Yes, there will be some small number of relievers that put up microscopic ERA numbers in a small sample and will score a huge contract based on that next year.

A few weeks ago, in reviewing the Kimbrel trade, I suggested that the issue with paying for high-end relievers isn’t that it doesn’t make sense in terms of expected value, but that it’s a high volatility investment. Signing (or trading for) a very good closer makes perfect sense, even if you have to put some assets into it. I think a lot of the closer-hate came from a misunderstanding of what those truths mentioned above really meant. Saves are silly because they reward only the guy who was the last pitcher and ignore the other two guys who pitched scoreless innings in the seventh and eighth. It’s not that the closer doesn’t deserve an “attaboy.” He did his job, but the fact that he pitched in the ninth and racked up saves doesn’t necessarily mean that he’s better than the guy who did most of his work in the eighth. Teams shouldn’t pay for past saves; they should pay for the ability to get outs in tough situations. Saves might be an indicator of that talent, but not always. But while saves are not necessarily a good indicator of pitcher talent, I think people forgot to look at how incredibly valuable the act of “saving” a game is.

Fast forward to last week on Effectively Wild, where Ben and Sam wondered aloud whether the market was going crazy or if this was all a rational thing to do. Maybe it made sense to grab one elite reliever, but what about the teams that were grabbing two? It was Sam who spoke the profound words that the flip side of realizing the ninth inning is just like the eighth inning is realizing the eighth inning is just like the ninth. If we’re going to say that if you can pitch the eighth, you can pitch the ninth, then it also makes sense that if you’re going to get someone to lock down the ninth, you might as well get someone to lock down the eighth.

Sam also brought up the idea that one reason for opening up the vault for a reliever was the idea that relievers decline faster than starters. All big contracts take a risk on the back end, often paying for years when a player won’t be worth his salary to buy the upfront years where he will be a bargain, but are relievers more likely to turn more quickly into pumpkins?

What to make of the market for relievers? Have we been undervaluing them all along?

Warning! Gory Mathematical Details Ahead!

In reviewing the Kimbrel deal, I tried to make the case that WAR is a bad way to evaluate relievers. The problem with WAR in this case is that in our attempt to strip the context out of everything, we neglect that relievers, particularly closers, are used in a very specific context. Closers pitch at times when the leverage is high. By definition, if they don’t do their job well, it means much more in the win-loss column (the one that actually counts) than just a randomly selected inning. And yes, WAR calculations do attempt to account for that, but I think the best way to show the difference is to reference this chart of the top pitchers by WAR and this one showing the top pitchers by win probability added. The “Best WAR” chart is all starters. The WPA chart is a healthy mix of both starters and relievers. When your role is defined by being in high leverage situations, it makes more sense to use the metric that accounts for leverage.

The flip side of that is that even a “decent” closer is going to rack up a lot of WPA, not because it’s a sign of him being good, but because he’s going to pitch in a lot of high leverage situations. Still, since he is pitching in those spots, we want him to be as good as he can be.

We see that a reliever pitching in high leverage spots and who has a good (or lucky… or both) year can add value on par with that of a starting pitcher. So, why is it that, even with today’s inflated prices for relievers, we haven’t seen the Zack Greinke contract being given to a top reliever? Or even the Johnny Cueto one? And realistically, does it make sense for a team to have two elite relievers, or is that a waste of resources?

In reviewing the Kimbrel trade, I found that the “average” team faces roughly 40 ninth-inning save situations per year. “Save” situations in the eighth inning (the pitching team is winning at the start of the inning by three or fewer) actually happen slightly more often, with the average team getting about 44 of them over the course of a year. Better teams will face more of them because they will be more likely to have a lead. We also know that in the modern game of baseball, it’s rare that a starter is actually out there for the eighth inning anymore, so it’s likely that the manager will be in his bullpen at the time. Why not have one of the best relievers in the game to hold down the fort then too?

In ninth inning situations, the difference between a completed save and a blown one is fairly easy to figure on the win probability scale, especially in the bottom of the ninth. If the visiting team’s closer blows it and gives up both the tie and lead, his team loses. If he only allows the other team to tie, he’s basically taken the game to a 50/50 shot. (The exact win probability will depend on whether the closer is actually pitching for the home or road team.) But as shorthand, we assume that an extra blown save in the ninth inning is worth about .75 wins to a team.

In the eighth inning, using data from 2005-2014, a home team entering the top of the eighth inning up by one had a 77.87 percent chance of winning. If they kept the lead, their win expectancy rose to 89.04 percent, but if they gave up one run, it dropped to 64.58 percent, and if they gave up two runs (now they are down a run!), it fell to 29.56 percent. For the home team, an extra “blown save” in the eighth inning is worth at least .41 wins. For the visitors, getting into the top of the ninth with a one-run lead has a win probability of 84.59 percent. If the eighth-inning guy allows the home team to tie it up going into the top of the ninth, his team now has a win probability of 46.96 percent. If he allows the home team to take even a one-run lead, the visitors are now looking at a win probability of 17.74 percent, making an extra blown eighth-inning “save” for a visiting team worth an average of about .52 wins. The average extra “eighth-inning save” is worth at least .46 wins. Let’s round it to half a win. That’s the difference between preserving the lead and giving it up.
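For readers who want to check the arithmetic, here is a minimal sketch that reproduces figures close to the ones above from the quoted win probabilities. It assumes that the two "blown" outcomes (tied, or down a run) are weighted equally, which is not stated in the text, so the results land near rather than exactly on the quoted .41, .52, and .46. The function and constant names are illustrative only.

```python
# Win probabilities quoted in the text (2005-2014 data), as fractions.
HOME_KEEPS_LEAD = 0.8904  # home team still up one after the top of the eighth
HOME_TIED = 0.6458        # home team gave up one run
HOME_DOWN_ONE = 0.2956    # home team gave up two runs
AWAY_KEEPS_LEAD = 0.8459  # visitors still up one going into the ninth
AWAY_TIED = 0.4696        # visitors let the home team tie it
AWAY_DOWN_ONE = 0.1774    # visitors let the home team take a one-run lead


def blown_save_cost(keeps_lead, tied, down_one):
    """Average win-probability drop across the two blown outcomes.

    Assumption: the tied and down-one outcomes are weighted equally.
    """
    drop_if_tied = keeps_lead - tied
    drop_if_behind = keeps_lead - down_one
    return (drop_if_tied + drop_if_behind) / 2


home_cost = blown_save_cost(HOME_KEEPS_LEAD, HOME_TIED, HOME_DOWN_ONE)
away_cost = blown_save_cost(AWAY_KEEPS_LEAD, AWAY_TIED, AWAY_DOWN_ONE)
average_cost = (home_cost + away_cost) / 2

print(f"home: {home_cost:.2f}  away: {away_cost:.2f}  average: {average_cost:.2f}")
```

Under the equal-weighting assumption, all three values come out at or slightly above the "at least" figures in the text, which is consistent with rounding the average to half a win.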

We saw last week that a good year for a closer was a save conversion rate of about 90 percent while a bad one might be around 80 percent. We see that ninth-inning saves are more valuable in terms of win probability added, but half a win is nothing to sneeze at. If there are 44 eighth-inning “save” chances for an average team (probably a few more for a good one), and having a good guy in the eighth-inning role can save 10 percent (4.4 of them) from becoming blown saves, that’s about 2.2 wins’ worth of value.
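The calculation above can be sketched in a few lines. The inputs are the figures from the text (44 eighth-inning chances at about .5 wins each, and for comparison 40 ninth-inning chances at about .75 wins each); the function name is hypothetical.

```python
def wins_saved(chances, good_rate, bad_rate, wins_per_blown_save):
    """Wins gained by converting leads at good_rate instead of bad_rate."""
    extra_conversions = chances * (good_rate - bad_rate)
    return extra_conversions * wins_per_blown_save


# Eighth inning: 44 chances, roughly half a win per blown lead.
eighth_inning_wins = wins_saved(44, 0.90, 0.80, 0.5)

# Ninth inning: 40 chances, roughly three-quarters of a win per blown save.
ninth_inning_wins = wins_saved(40, 0.90, 0.80, 0.75)

print(f"eighth inning: {eighth_inning_wins:.1f} wins")
print(f"ninth inning:  {ninth_inning_wins:.1f} wins")
```

The eighth-inning figure comes out to about 2.2 wins and the ninth-inning figure to about 3 wins, so the eighth is worth less than the ninth but not by much.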

To put it another way, if I could administer a potion to a closer and have it guarantee that the reliever would protect 90, rather than 80 percent of close leads in the eighth inning, that would be worth 2.2 wins, and that vial of potion should be priced accordingly. The problem is that we know that randomness makes it far from a guarantee that things will work that way. Now, it’s a matter of how confident you are in the potion.

With the Kimbrel trade, I put the value of moving from an 80 percent closer to a 90 percent closer at 3 wins, although with the caveat that because of sample size and randomness issues, a guy who would be a 90 percent closer given a billion chances might only notch an 80 percent rate over the course of 162 games. Still, we see that if a team could come up with some amount of predictive certainty that they had a reliever who was more on the 90 percent side, even if they were only 33 percent sure, they would have a good case to make that he’s worth a win or so more than the average closer. Still, that win is worth “only” $7 million, but it makes sense that high-end closers would be making that much more than a standard issue closer salary, which usually checks in around $5 million-$6 million, for those who were obtained in free agency. So, somewhere in the $10 million-$15 million range would be about right. Coincidentally, that’s around where top closer salaries tend to fall. And having a second, eighth-inning reliever in the $8 million-$12 million range could probably be justified too for some teams. So, the salary scale for relievers seems to be about right. Whether we’re handing the salaries out to the right people is another question, but we’re at least dealing with rational numbers.
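As a rough sketch of that salary logic: multiply the team's confidence that the reliever really is a 90 percent guy by the 3 wins he would add, price those wins at $7 million each, and add the premium to a standard closer salary. This assumes the reliever adds nothing beyond a standard closer if the team is wrong about him, which is a simplification; the names below are illustrative.

```python
WINS_IF_ELITE = 3.0          # value of a 90 percent closer over an 80 percent one
DOLLARS_PER_WIN = 7_000_000  # price of a win used in the text


def expected_extra_wins(confidence, wins_if_elite=WINS_IF_ELITE):
    """Expected wins added, assuming zero extra wins if the team is wrong."""
    return confidence * wins_if_elite


def justified_salary(base_salary, confidence):
    """Standard closer salary plus the dollar value of the expected extra wins."""
    return base_salary + expected_extra_wins(confidence) * DOLLARS_PER_WIN


extra_wins = expected_extra_wins(0.33)
low_estimate = justified_salary(5_000_000, 0.33)
high_estimate = justified_salary(6_000_000, 0.33)

print(f"expected extra wins: {extra_wins:.2f}")
print(f"justified salary: ${low_estimate:,.0f} to ${high_estimate:,.0f}")
```

With only 33 percent confidence, the estimate lands at roughly $12 million to $13 million, inside the $10 million-$15 million range given above.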

And while a player’s seasonal save conversion rate might have a lot of variance in it, we’re not completely in the dark figuring out who the good ones and who the bad ones are. In fact, while a typical closer pitches to only 250-300 batters per year, that’s enough to get a good read on his strikeout, walk, and ground ball rates. We know that those are fairly consistent over time, so why not use those as proxies? Enter the question mentioned by Mr. Miller of whether relievers actually decline at a rate faster than starters. That’s plausible, but is it true?

To look, I used data from 2010-2015 and separated out consistent starters (95 percent of their appearances in a season came as starters and they threw more than 150 IP, and the same was true the following year) from consistent relievers (95 percent of their appearances were in relief and they threw more than 50 IP in two consecutive years). I looked at their strikeout, walk, and ground ball rates in consecutive years. I put together a mixed design ANOVA (for the initiated, time[2] x group[2]) looking at each stat. In this sort of analysis, the most important thing to look at is the interaction between time and group.


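[Table: strikeout, walk, and ground ball rates by group and year. Columns: Starters, year 1; Starters, year 2; Relievers, year 1; Relievers, year 2. The values did not survive in this copy.]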

The interaction term for GB% and BB% was not significant, but the results for strikeouts were. Starters, on average, pretty much kept their strikeout rates level. Relievers, though they strike more people out in general, lost 0.7 percentage points on their strikeouts over time. When I played around a little bit breaking the data out by age, it became clear that the strikeout finding was being driven mostly by “older” relievers (31 and up).
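For readers unfamiliar with the interaction test, here is a minimal sketch of the idea. With only two time points, the time x group interaction in a mixed ANOVA is equivalent to a pooled-variance t-test comparing the year-over-year changes of the two groups (the interaction F equals t squared). The strikeout rates below are made-up numbers for illustration, not the data used in this article, and the function name is hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def interaction_t(group_a_pairs, group_b_pairs):
    """Compare year-over-year changes between two groups.

    Each pair is (year 1 rate, year 2 rate) for one pitcher. Returns the
    difference in mean change (group B minus group A) and the pooled-variance
    t statistic for that difference.
    """
    changes_a = [year2 - year1 for year1, year2 in group_a_pairs]
    changes_b = [year2 - year1 for year1, year2 in group_b_pairs]
    n_a, n_b = len(changes_a), len(changes_b)
    pooled_variance = (
        (n_a - 1) * variance(changes_a) + (n_b - 1) * variance(changes_b)
    ) / (n_a + n_b - 2)
    standard_error = sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    difference = mean(changes_b) - mean(changes_a)
    return difference, difference / standard_error


# Hypothetical K% pairs, invented purely to show the mechanics.
starters = [(18.0, 18.0), (20.0, 21.0), (22.0, 21.0), (19.0, 19.0)]
relievers = [(24.0, 23.0), (26.0, 26.0), (22.0, 20.0), (25.0, 24.0)]

difference, t_statistic = interaction_t(starters, relievers)
print(f"relievers changed by {difference:+.1f} points relative to starters")
print(f"t = {t_statistic:.2f}")
```

A negative difference means relievers lost more strikeout rate than starters did; whether it is significant depends on the t statistic and the sample sizes, which in a real analysis would be far larger than four pitchers per group.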

I also looked at two more measures that should have decent reliability: fastball velocity and horizontal break on sliders.


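[Table: fastball velocity (FB velo, mph) and slider horizontal break (SL X-break, in) by group and year. Columns: Starters, year 1; Starters, year 2; Relievers, year 1; Relievers, year 2. The values did not survive in this copy.]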

Relievers don’t seem to be losing velocity or movement on their breaking pitches at a greater rate than starters (both interactions were non-significant), so the strikeout drop doesn’t look like a loss of stuff. Relievers don’t appear to physically deteriorate faster than starters, but their results do.

(Side note: For the super-duper-initiated, you might be thinking, if he used a bunch of year-matched pairs, doesn’t that violate the independence of observation assumption? In other words, A.J. Burnett’s 2010 compared to his 2011 numbers gets a line. So does his 2011 compared to his 2012. Yes. You are right. And I actually went to the trouble of fitting a proper growth curve model. And it said the same basic thing.)

It’s not clear why relievers see a drag on their strikeout rate. One possibility is that we know that relievers often get banished to the pen because they lack a proper third pitch. In the bullpen, they might face only four hitters, so they don’t have to worry too much about mixing pitches and they can get by on stuff. But over time, “the book” on a pitcher gets out, and even if a batter is just kinda guessing against a pitcher with only two pitches, there’s a 50-50 chance he guesses the correct one. Then, it’s just a matter of seeing if he can time it up. Relievers may also be guys who specifically get by more on “stuff” and so when they lose a little bit, there goes their one weird trick to get guys out.

But the message is clear. Relievers are liable to decay more quickly as they get older. So, when signing a free agent, a starter is more likely to hold his value than is a reliever over a few years. We’d expect to see shorter duration contracts for relievers… and that’s exactly what we get.

The Market is Wise

The idea of having a “second closer” or of spending resources to chase after an elite reliever just to have him pitch the eighth doesn’t seem to make sense at first, and running the numbers, having that sort of firepower in the eighth inning isn’t quite as valuable as having it in the ninth. That’s not the question that a baseball team has to answer, though. The question is how much value having that ace reliever in the eighth provides toward winning a ballgame, and if the answer is “a lot,” then how much is being charged for that privilege, and is that the best deal out there?

Given that there’s been a bias against paying anyone but the one guy who will pitch in the ninth for you any sort of money, there seems to be a nice little market inefficiency. I’ve made the case before that it’s a market inefficiency that has some big error bars around it. Relievers do still have to deal with the small sample size issue, and paying $10 million per year or a couple of prospects for a flashy eighth-inning guy who puts up a 4.50 ERA is never fun. But from a rational point of view, it makes sense to do.

What surprised me is that the types of contracts given to relievers seem entirely rational in terms of their scale, both in dollars and years. Maybe we’re still curing teams of the over-emphasis on previous save totals, but the market seems to be in the correct place for relievers. And perhaps we’ll finally see eighth-inning guys get their due. And bigger paychecks.

This is the drum I keep beating: baseball records are not generated by some simple sum of a game-level statistic (such as WAR) but are really the sum of the results of discrete contests. Maximizing the value of some yearly statistic is not equal to maximizing the value of wins produced by a squad. Oddly enough, there was once a hitting statistic that tried to get at this marginal effect (albeit poorly): game-winning RBI. Saves are also a way to try and measure this effect on the pitching side. Many, many analysts have confused the flaws in those statistics with a flaw in the underlying phenomena they are trying to measure.
Hey Russell, I don't think I followed you when you said that paying $10M for an 8th inning guy who puts up a 4.50 ERA makes sense. That seems pretty horrendous to me.
It isn't that it makes sense. It's that sometimes you sign a guy who you think will give you 65 IP of 1.95 ERA, and maybe that's his true talent, but sometimes, randomness is not your friend. It makes sense to sign that guy for $10M if you're fairly well convinced that's his true talent, but you have to live with the risk that he might lay an egg when he actually gets on the mound.
By the way, Russell...Happy Birthday!
I'm officially a double-adult.
Have we been underpricing birthdays?
Really interesting stuff. But doesn't it seem inaccurate to measure some players' contributions by WAR and others by WPA?

I mean, both are closed systems (I think), so you could measure everyone by WPA. But then Bryce Harper is worth 6.25 wins, compared to 9.5 WAR, and barely "outwins" Mark Melancon, at 5.19 WPA.
WAR specifically seeks to strip out the context from everything (including leverage) in order to compare everyone to the same baseline regardless of team context and the role in which they were used. That has its uses. But the entire point of "closer" is that the role is based entirely on context. You go in during very specific situations. It isn't that you evaluate the player differently, it's that you evaluate the effect that this player will have in that magnified role differently.
Turning that around, without high-WAR players, your reliever doesn't pitch in many high-leverage save situations because you're usually losing.

Let's look at WPA differently. Let's say next year, the Nationals convert Bryce Harper to an offensive "closer"-- a pinch hitter who only bats in critical situations. Harper might have a high WPA, despite few plate appearances.

Wouldn't the Nationals be better off keeping Harper in his current role?

The argument about elite closers should be "Do I get more wins from an elite closer by instead using him as a 1970s-vintage 'fireman'?"

Mariano Rivera's best season by both bWAR (5.0) and WPA (+5.4) remains 1996, when he had 5 saves and 107.2 IP.

Baseball Reference has Willie Hernandez (1984) as #3 all time for WPA in a season, with +8.65. (Only 1985 Gooden and 1940 Feller are higher.) Non-legendary pitchers named John Hiller (1973; #5) and Doug Corbett (1980; #13) are also on the all-time list. What links them? Good but not otherworldly relievers who pitched 120+ IP.
The answer depends in part on whether you would get 2 or 3 IP of the same quality if you routinely extended your closer's appearances. The Harper analogy slips a bit because if you increase Bryce Harper's PAs from 150 to 600 you will get something like 4X the production.
Last year was the only year, after twenty some odd years of playing fantasy baseball, that I ever kept a closer on my Scoresheet fantasy team.

Curse you, Greg Holland.
Late inning relievers do grow on trees. How does that figure into pricing?

I think we are entering into a short-lived era of grossly overpaying for relievers. Let this be known as the Royals effect. The Royals previous two "elite" closers, Soria and Holland, had very short windows of dominance and their current bullpen ace is a flamed out SP prospect. I think too much of the Royals success is attributed to the bullpen. It is an asset, but not the primary source of their recent success as it is portrayed.
It's not true that WAR strips out LI. It just reduces it in order to better estimate a reliever's true replacement level. Top notch relievers generally get 1.5 or 2.0 WAR, which would net them $12M to $16M per year in today's market, which is what the best ones get I think.

To me, Russell is arguing that this approach slightly undervalues setup guys (or LIPRA, according to Jeff Long). It may be a good point.
Sorry. LiRPA. Still learning.
It isn't that WAR strips out LI. It's that it does a really bad job of adjusting for LI. It basically looks at the average leverage index that the pitcher faced and goes halfway. So, if the reliever faced an average LI of 1.5, it multiplies his WAR by 1.25 (halfway between 1.5 and a league-average LI of 1.0).

Closers really have two jobs. There are the save situations that they are actually paid for, and then they soak up an inning or so here and there because the manager needs an arm on the mound and he was available. (These are the 'Why is Kimbrel out there if we're losing 13-4' innings.) They just happened to be the short-straw/need to pitch guy in that case. Teams do not care how they perform in these situations, but LI records it as "He was in the game with an LI of 0.02, and we're going to weight that equally with the 5-out, came into the 8th inning with 2 runners on, up by 1 save he had last week."
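To make the "halfway" adjustment described above concrete, here is a small sketch. It follows the description in this comment rather than any published WAR formula, and the leverage values for the individual appearances are hypothetical.

```python
from statistics import mean

LEAGUE_AVERAGE_LI = 1.0


def halfway_multiplier(average_li, league_average_li=LEAGUE_AVERAGE_LI):
    """Multiplier halfway between the pitcher's average LI and league average."""
    return (average_li + league_average_li) / 2


# The example from the comment: an average LI of 1.5 gives a 1.25 multiplier.
example_multiplier = halfway_multiplier(1.5)

# Hypothetical appearances: three save situations, then one mop-up inning.
save_situations = [2.0, 2.5, 1.5]
with_mop_up = save_situations + [0.02]

saves_only_multiplier = halfway_multiplier(mean(save_situations))
diluted_multiplier = halfway_multiplier(mean(with_mop_up))

print(example_multiplier, saves_only_multiplier, round(diluted_multiplier, 4))
```

In this toy case, a single low-leverage mop-up inning pulls the multiplier from 1.5 down to about 1.25, because every appearance counts equally in the average.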
Well Russell at least you now agree that WAR doesn't totally strip out context.

Your criticism of LI may be valid, but I don't see that WAR is impacted by it. WAR does what it does in order to better address the replacement level issue. Your criticism has more to do with how we evaluate the criticality of relief stats in general.

You're suggesting that it's better to do something such as counting the number of appearances made by a reliever when the LI is over 2.0--something like that--instead of averaging over all appearances. It's a good point and worth considering.
[quote]LI records it as "He was in the game with an LI of 0.02, and we're going to weight that equally with the 5-out, came into the 8th inning with 2 runners on, up by 1 save he had last week."[/quote]

That should read "WAR records it as". And yes, that is what it is doing, just like it's doing it for all pitcher and hitter stats.

Now, you could decide to do the "halfway" with WPA instead. You'd take the LI at the PA level you find at Fangraphs, and take it halfway to 1.0. Then divide WPA by that figure. Then add in the replacement level portion.

But if you do this for a group of closers, you won't find any difference.