Last week, the Tampa Bay Rays signed Grant Balfour to be their closer for 2014 (and presumably 2015), committing to pay him $12 million over the next two seasons. It’s not an expensive closer contract, as these things go. But for the cost-conscious Rays, it seemed a little strange. The team also re-signed Juan Carlos Oviedo (formerly Leo Nunez) and traded for Heath Bell over the winter. Another sabermetric darling team, the Oakland A’s, signed Eric O’Flaherty last week and, earlier in the winter, traded for Josh Lindblom and Jim Johnson.

Wait a minute, these are the two franchises that have had books written about them and how they embrace advanced analytics. It was the A’s who practically invented the Billy Taylor/Huston Street model of “developing a closer” (i.e., getting someone a bunch of saves) and then flipping him for other pieces. I thought that the official sabermetric orthodoxy was that teams shouldn’t spend any of their precious resources chasing relievers. Isn’t the wisdom that when a trade involves a reliever and something else, the team that gave up the reliever won? Bullpen guys are too volatile! For them, the traditional metrics used to evaluate pitchers (ERA, saves) are either not very reliable or outright junk stats. When you look at relievers through the lens of WAR(P), they don’t produce anywhere near as much as elite starters or position players, so why pay them similarly? Teams would be better off getting a couple of fire-balling pre-arbitration guys and some guys with checkered records, and spending the money saved elsewhere. Then, they can hope that a couple of them have a BABIP-driven amazing season. Why blow money or prospects on a guy who’s going to pitch only 70 innings at most?

I’d argue that WAR(P), as we have defined it, doesn’t do a very good job of describing relievers. The disconnect can be summed up by looking first at this chart and then at this one. In case you don’t want to click through, the first chart is a listing of the top WARs of 2013, while the second is the top win probability added (WPA) scores of 2013. The WAR chart Top 30 doesn’t contain any relievers at all. The WPA chart alternates between elite starters and back-end relievers, mostly closers. There’s a lesson in here, if you’re careful to look for it.

WAR answers (or attempts to answer) the question “What is Smith worth over and above our common baseline, replacement level?” It does that by specifically trying to isolate the contributions that Smith made independent of any context. The reason that RBI totals are a bad way to compare players is that batters who happen to play on teams where they hit behind guys who are always on base will have big numbers. Those whose managers stick them in the leadoff spot and those who are just stuck on bad teams will have lower numbers. WAR also ignores any information about when in the game the event happened. To WAR, a single is a single is a single, no matter whether it was to lead off the first or to drive in the winning run of Game 7 of the World… sorry, bad Edgar Renteria flashbacks.

For position players, you can make the case that it all sort of evens out. You can’t really leverage a specific hitter to a specific situation (pinch hitting aside). Hitters take their appointed turn in the order, no matter the circumstances. If it’s the bottom of the ninth, two on, two out, down by one, and the no. 7 spot is due up, the cleanup hitter can’t just say “I got this one.” Hitters have little control over what situation they will find themselves in, and about the best prediction going forward is that they will have some big situations, some little situations, and some good old average situations to deal with. You might make the same sort of argument with starting pitchers as well. Relievers, on the other hand…

In the modern bullpen, it’s generally known ahead of time who will pitch in what situation. There is plenty to say about the way the modern bullpen is constructed, both good and bad, but let’s just lay that aside for now. Closers will pitch in the ninth inning with their teams up by one to three runs, whether we like that or not. There are other relievers who only suck up low-leverage innings when it’s 10-3. That brings us to the WPA chart. We know that Greg Holland, who finished second in MLB in WPA last year behind Clayton Kershaw, did so because he was placed into a lot of high-leverage situations where there was a lot of win probability available. It would be a mistake to assume, because of that fact (and that fact alone), that Greg Holland was the best reliever in baseball last year. (Then again, it wouldn’t be a silly statement either!)

WPA has its problems (the biggest being that it credits or debits everything that happens in an inning to the pitcher, even things over which he has little control), and it isn’t a very good tool to evaluate individual pitchers. Had Holland done the same work in low-leverage situations, WAR(P) would still have recognized him for it, but WPA would not have. Holland had a good year, no doubt, but more importantly, he illustrates a point. Because teams have a lot more control over which relievers are placed into which situations, having a good reliever (or a reliever having a fluky good season) for those high-leverage situations can have a big impact on a team’s chances of winning games. I suppose this isn’t really news; we’ve just confirmed it with #GoryMath.

To flip the coin around, because Holland did his work in high-leverage situations, WAR(P) does not recognize his accomplishments as much as WPA does, and here WPA is more sensitive to a key aspect of how relievers are actually used. (In fairness, Baseball-Reference’s version of WAR has an adjustment for leverage when calculating reliever WAR scores, although for some technical reasons I still think it undervalues relievers’ actual contributions.)
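To put rough numbers on the gap between a context-neutral view and a leverage-sensitive one, here is a minimal sketch. Everything in it is an assumption for illustration: the 10-runs-per-win conversion is a common rule of thumb, the leverage values (where 1.0 is an average situation) are made up, and simply multiplying by leverage is a toy model, not the formula that WAR(P), WPA, or Baseball-Reference actually uses.

```python
# Toy comparison of context-neutral value and leverage-weighted value.
# All constants are illustrative assumptions, not figures from any real metric.

RUNS_PER_WIN = 10.0  # assumed rule-of-thumb conversion from runs to wins


def context_neutral_wins(runs_saved):
    """Wins credited when every run counts the same, regardless of situation."""
    return runs_saved / RUNS_PER_WIN


def leverage_weighted_wins(runs_saved, leverage_index):
    """Same runs, scaled by how much the situations mattered (1.0 = average)."""
    return context_neutral_wins(runs_saved) * leverage_index


# Two hypothetical relievers who each save 15 runs over replacement.
runs_saved = 15.0
neutral = context_neutral_wins(runs_saved)
closer = leverage_weighted_wins(runs_saved, leverage_index=1.8)   # assumed closer usage
mop_up = leverage_weighted_wins(runs_saved, leverage_index=0.5)   # assumed mop-up usage

print(f"context-neutral: {neutral:.2f} wins")
print(f"closer role:     {closer:.2f} wins")
print(f"mop-up role:     {mop_up:.2f} wins")
```

Under these assumptions the two pitchers look identical to a context-neutral stat, while the leverage-weighted version credits the closer with several times the impact of the mop-up man. That is the disconnect between the two leaderboards in miniature.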

Lately, there seems to have been a shift in the free agent market. As more and more teams begin to use the WAR framework as a way to evaluate players going forward (and believe me, most of them do in some way or another), it’s led to a lot of free agent signings where the general consensus has been “Yeah, that’s about right according to WAR.” Back when the market was responding to batting average, the A’s found a flaw in the stat and how it failed to match up with the realities of the game. In the same way, maybe we’re just seeing teams start to take advantage of the flaws in WAR. Stop me if you’ve read this book before.

Betting on relievers is most certainly risky, but the point isn’t to avoid risk. The point is to properly manage it. The starters on the WPA leaders list make (or will eventually make) much more than the relievers on the list will, but the starters are also a safer bet to get the kind of performance that produces that sort of WPA year after year. A reliever is a lower-cost, high-risk, high-reward bet, but when you live in a “small market,” sometimes those are the only bets you can afford.

It is true that it’s hard to get a handle on which relievers are good and which ones are not. However, we can certainly agree that there are some who are better at the craft than others, even if the numbers don’t always show it over 70 innings, and quality costs a little more. And with teams finally moving away from judging a reliever by his saves total, more fully understanding statistical reliability, and doing some deep sub-atomic studies using PITCHf/x and other mystical voodoo things, it’s a lot clearer who is a better investment. Yes, because of the small sample size, relievers will have big error bars around their range of expected outcomes, no matter what, but it’s worth the due diligence to at least make sure the mid-point of that error bar is as high as you can get it. It will take some luck to get the full benefit of the reliever, but it takes some luck to get anything fun in life.
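For a sense of how much wider those error bars get over a reliever’s workload, here is a rough sketch. It treats each batter faced as an independent coin flip for reaching base, which is a big simplification, and the numbers are assumptions of mine: a .300 on-base rate allowed, about 290 batters faced for a 70-inning reliever, and about 830 for a 200-inning starter.

```python
import math

# Rough binomial sketch of sampling noise in a single season.
# The rate and batters-faced totals below are illustrative assumptions.


def rate_standard_error(true_rate, batters_faced):
    """Standard error of an observed rate over independent plate appearances."""
    return math.sqrt(true_rate * (1.0 - true_rate) / batters_faced)


ASSUMED_ON_BASE_RATE = 0.300  # hypothetical true talent level
RELIEVER_BATTERS = 290        # roughly 70 innings (assumed)
STARTER_BATTERS = 830         # roughly 200 innings (assumed)

reliever_se = rate_standard_error(ASSUMED_ON_BASE_RATE, RELIEVER_BATTERS)
starter_se = rate_standard_error(ASSUMED_ON_BASE_RATE, STARTER_BATTERS)

print(f"reliever standard error: {reliever_se:.3f}")
print(f"starter standard error:  {starter_se:.3f}")
print(f"ratio: {reliever_se / starter_se:.2f}")
```

In this model the width of the error bar depends almost entirely on the number of batters faced, so a team can’t shrink it for a reliever. What a team can control is the assumed talent level in the middle of the bar, which is where the due diligence goes.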

Why are smart teams spending money on relievers? Well, for the same reason that smart teams spend money on anything. There’s a case to be made that relievers aren’t properly valued by the metrics. In addition, the conventional wisdom is that relievers aren’t worth paying very much, and maybe that’s depressing the market unfairly. If you want to make a case against a specific player (Heath Bell? Really?), that’s reasonable, but as an asset class, relievers might just have come around to having an expected value that’s more than they cost.