Image credit: © Kamil Krzaczynski-USA TODAY Sports

In 2019, I worked for the New York Mets. I was employed on a consulting basis, I had (and still maintain) another “real job,” and I wasn’t really knee-deep in the day-to-day operations of the team, nor was I even in New York. But I had a seat at the table in the Mets Slack R&D channel. There were times when those of us “in the room” would be asked for our opinion on potential moves. I still have the hat.

In August of that year, Joe Panik had been designated for assignment and then released by the Giants. Panik wasn’t having a great year in San Francisco, slashing .235/.310/.317, but at the time, the Mets were in a bit of a crunch at second base. Robinson Canó was hurt, and the plan to have Jeff McNeil play all nine positions at once was running into some serious limitations. I found myself being asked for input on a real-life MLB roster decision. Perhaps Panik was worth a short-term flyer as cover at second? History records that the answer the Mets came to was “yes.” I have no illusions that my voice alone carried the day, but hey, someone asked.

That’s the dream, right?

Panik logged 103 PA over the remainder of the year and hit .277/.333/.404 in Blue and Orange. Neither one of us returned to the Mets for the 2020 season.


Yesterday, in the ongoing saga of the MLB labor negotiation, “sources” told the usual intrepid reporters that franchise owners had agreed in principle (though not in dollar amount) to the idea of a pre-arbitration bonus pool. The pool, which would be funded from “central revenue” (no indication of from where the pot of money would be drawn), would provide bonuses to a select group of pre-arbitration eligible players. There was some ambiguity in which players would receive bonuses, but it would likely be “the good ones” and “good” would be determined by WAR.

Here we go again.

A couple of months ago, we saw a similar proposal around using WAR as the basis for setting salaries during a player’s “arb years” that may or may not have ever been on the table. But even as a trial balloon, it was slightly terrifying. WAR, by definition, needs the freedom to change as the game changes. The strength of WAR isn’t so much that it’s completely objective, it’s that its biases are laid bare for all to see. If I tell you that Smith was a 2-win player last year, I show you (if you care to stick around for it) exactly how I got to that number. I can do the same exact process with everyone else in the league and tell you their numbers.

What could possibly be more fair? I show no favoritism to anyone. I tell everyone how I define value and the rest is just plugging in the numbers. And assuming that I have some idea of what I’m doing, I could probably do a decent job of defining “value” and coming up with a reasonable estimate for everyone’s contributions. It’s the sort of thing that sounds objective until you realize that while I’m treating everyone equally, I’m also the one deciding how everyone gets treated.


How does one produce value on the baseball diamond? 

It seems like an inane question to even ask, but there are places where it’s a little harder to answer than you might want to admit. There are a few dozen of these cutpoints, but let me just put one of them out there. How should WAR handle luck? If Smith times one up and hits an absolute screamer of a line drive only to watch the ball be snared by a leaping shortstop, and then Jones gets jammed and somehow hits a dying quail that just happens to land in the one spot that the same shortstop couldn’t get to, Jones gets a single and Smith made an out. 

As fans, we know that 9 times out of 10, Smith’s line drive turns into a base hit. We know that Jones was just as lucky as Smith was unlucky. And we know that things like that happen in baseball. Should the formula reflect what Smith could control (hitting a screaming line drive) or the result of the play, however unmoored from that probability it is? There isn’t a right answer to that. There’s a reasonable case on both sides of that one. You can pick the one you find most compelling. If you’re an ethical researcher, you should disclose and discuss why, but we all know that you could just as easily have gone the other way. And come to think of it, we can probably calculate what would have happened if you had made that decision.

And someone might like those results better. Especially if there are a few million dollars on the line.
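To make the Smith and Jones example concrete, here is a minimal sketch of the two accounting choices. The 90 percent figure for Smith comes from the example above; the 10 percent figure for Jones and both run values are illustrative assumptions, not the inputs of any published version of WAR.

```python
# Illustrative run values only; real linear weights vary by season and source.
SINGLE_VALUE = 0.45   # assumed run value of a single
OUT_VALUE = -0.27     # assumed run value of an out

# hitter: (what actually happened, assumed chance that contact becomes a hit)
batted_balls = {
    "Smith": ("out", 0.90),     # screaming line drive, snared by the shortstop
    "Jones": ("single", 0.10),  # dying quail that found the one open spot
}

def outcome_value(result):
    """Credit the hitter for what happened on the field."""
    return SINGLE_VALUE if result == "single" else OUT_VALUE

def expected_value(hit_prob):
    """Credit the hitter for the quality of contact, ignoring the result."""
    return hit_prob * SINGLE_VALUE + (1 - hit_prob) * OUT_VALUE

credit = {
    name: {"outcome": outcome_value(result), "expected": expected_value(prob)}
    for name, (result, prob) in batted_balls.items()
}

for name, values in credit.items():
    print(f"{name}: outcome {values['outcome']:+.3f}, expected {values['expected']:+.3f}")
```

Under the outcome-based choice, Jones comes out ahead. Under the expected-value choice, Smith does. Same two swings, same data, and the ranking flips depending on a decision that someone has to make before the numbers are ever plugged in.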


When the idea of incorporating WAR into arbitration salaries was floated last November, I was worried about the power issues that went with it. MLB would be putting the fate of millions of dollars – and a decent chunk of the payrolls of all 30 teams – in the hands of a website. I have the highest respect for our friends at FanGraphs, but that’s a lot of power, and there would suddenly be a lot of people very interested in how they were using it. Even minor wiggles in the numbers might buy someone a house.

I thought giving my opinion on Joe Panik was a lot of power.

Using a pre-determined pool of money, as yesterday’s proposal suggests, removes some of that pressure, because the money is already in the pot. Teams don’t have to worry about a change in the definition of WAR affecting their payroll flexibility, and the gatekeepers of WAR, who usually work for media companies that rely on access provided by teams and who themselves often harbor dreams of working for those teams, don’t have to worry about which team they made angry. But the players might have a thing or two to say. And it’s nice when the players give us quotes.

There’s a pretty big problem that’s still left to solve, even after using a pool of money. What happens when it’s time to make a change in WAR? The chassis of WAR is 20 years old. It’s based on a model where position players have a primary position that they played almost every day. What happens as teams and players move toward a multi-positional model, or perhaps one where even the nine positions that we thought we knew aren’t actually how things work anymore? Am I really playing “third base” if half the time I’m shifted over into short right field? How do we deal with Shohei Ohtani? Should we include framing in catcher value? What happens to the “follower” after the opener? Is that pitcher a starter? A reliever? Something entirely new? Does it make sense to continue to use the same WAR framework for that? Twenty years ago, that wasn’t a question. Come to think of it, five years ago, that wasn’t a question.

If this proposal, or anything like it, goes through, WAR will be asked to step in to mediate a rather large issue in the ongoing MLB labor battle. Eventually, there’s going to be a disagreement in how the statistic itself should be structured. If I may put on my “other hat” for a moment (my professional training is as a clinical psychologist), I can only hope that the human(s) who are running whatever version of WAR is settled on are familiar with how to conduct family therapy.


So, if we’re really going to do this, using WAR in any way to determine salaries – and at this point, the details of how exactly WAR fits in don’t matter, just the fact that WAR appears anywhere in the plan – there are going to need to be some ground rules.

1) It would probably be best if the exact formula for WAR were agreed on ahead of the season. Since this is a collective bargaining agreement, I assume that franchise owners and the Players’ Association will both want a say. But once it’s settled, no backsies. I’m confident that everyone will see the wisdom in this one.

2) As baseball changes – and it will – WAR will need to change with it. Otherwise, you’re going to have a static measure that will eventually become as outdated as judging players by batting average. Better data sources will become available. New strategic wrinkles will emerge. The Rays will… I’ll just leave it at “the Rays.”

You can pay someone to write a formula, but if you really want WAR in all its glory, you’re going to need someone or several someones to be an independent body charged with making those changes when needed. That probably means that they would have the ability to act independently and without the prior consent of either side. To insulate them from undue influence, they’re probably going to have to be on the payroll.

You still in on this?

3) Then there are the system-gaming issues. Under this proposal, if we’re using a bonus pool where the money’s already committed, teams actually have an incentive to play the pre-arb players that they like more, in the hopes of sneaking them into the bonus pool and keeping them happy. Or… maybe they want to keep them out of the bonus pool so that they don’t get extra money and are a little hungrier for that “team friendly” extension.

Players, on the other hand, might hear that the version of WAR that’s being used is FIP-based, and know that this means strikeouts are good while weak contact is not really rewarded, and so move their approach toward a more strikeout-seeking one. Does that put yet more upward pressure on K-rates? If so, would there be yet more pressure on the definition of WAR to change to compensate?

If you’re going to keep the definition of WAR independent from interference, then you’re going to have to bite your lip on that one.
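The incentive is easy to see in the standard FIP formula, which counts only home runs, walks, hit batters, and strikeouts. The sketch below compares two hypothetical pitchers with identical innings, walks, and home runs; the stat lines are invented for illustration, and the constant is an assumed round value (the real one is recalculated every season to put FIP on the ERA scale).

```python
FIP_CONSTANT = 3.10  # assumed for illustration; the actual constant varies by season

def fip(hr, bb, hbp, k, ip, constant=FIP_CONSTANT):
    """Fielding Independent Pitching: (13*HR + 3*(BB+HBP) - 2*K) / IP + constant."""
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant

# Two hypothetical pitchers who differ only in how they get their outs.
strikeout_arm = fip(hr=20, bb=50, hbp=5, k=220, ip=180)  # misses bats
contact_arm = fip(hr=20, bb=50, hbp=5, k=130, ip=180)    # induces weak contact

print(f"strikeout pitcher FIP: {strikeout_arm:.2f}")
print(f"contact pitcher FIP:   {contact_arm:.2f}")
```

Every out the contact pitcher gets on a weak grounder is invisible to the formula except through innings pitched, while each strikeout is an explicit credit. If the bonus money runs through a FIP-based WAR, the pitcher who trades 90 groundouts for 90 strikeouts picks up a full run of FIP in this example without necessarily preventing a single extra run.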


As a former therapist, I know firsthand that there are times when you know that the patient you are working with is about to do something that might not end well. You’re probably not going to talk them out of it and you just have to hope that they talk themselves out of it. Sometimes the best you’re going to accomplish is doing damage control ahead of time.

The fact that the idea of using WAR in determining salaries has come up twice now makes me think that there’s some will behind it. The fact that, apparently, both the franchise owners and MLBPA have bought into the idea of a bonus pool, but are just in disagreement on the money, makes me think it even more.

If MLB and MLBPA are serious about this idea, I would encourage them to think deeply about how they will structure it. This could easily spin out of control in ways that don’t work for anyone. It sounds like an easy fix to “just use WAR.” It’s actually a field of landmines that you’re tap-dancing into.

Joseph H
System gaming seems like a time bomb. To throw in an additional example: platoon splits (which WARP might account for but other models clearly don't).

Re: changing WAR frameworks, aren't the examples you cited stuff that MLB & MLBPA are going to have to work through regardless of whether or not they embrace a specific framework of WAR? The "starter v. follower" distinction is already impacting arb eligible players.
It's worth flagging that the current system takes what you consider an obviously wrong approach: it ignores the new data (Statcast) and demands that arb debates be constructed on data basically from the three or four major baseball websites not owned by MLB. You're probably right that this is going to be unsustainable over the long run, but the only downside of WAR in this context is that it demands a rigorous pre-defined model on how to assign value (which could also be an upside for using WAR).
Craig Goldstein
Statcast data isn't wholly ignored in the current system, even though it is supposed to be.
Leonard V.
I find it hard to argue that WAR should be a major consideration for hall of fame induction, but that it's off limits for determining player salaries.

I think the key is the first point: settle on an exact formula for WAR (not just "look up the current fWAR values"). To the second point, let the formula be updated with each new CBA. That provides enough flexibility to keep it current without introducing some external body with authority to change the game.

This is a good idea, but the devil is in the details of how you select the right formula. Keeping it separate from any one site's approach means that the ideas from all the sites can influence the final formula.

My approach would be:
Each CBA sets criteria for what properties are desirable in a WAR measure (stability, alignment with team results, transparency, sensitivity to specific features such as how many innings of defense / how many at-bats a player had, etc.)
Anyone who wishes to submit a measure can do so during a window of time after the CBA is approved
The submitted measure that has performed the best over the past N years (3? 5?) according to the chosen criteria is selected as the official measure for the following year.
Formula selection is done during the offseason (maybe in Nov?).
New measures submitted before the first game of a season can be considered during that offseason.
Future CBAs can of course tweak the criteria as needed.

System gaming might happen, players might change their approach to play more to the current WAR formula, but that will already happen, because these measures are already part of arbitration and free agent negotiations.
The Hall of Fame does not matter. Player salaries do. Objectivity is not necessary or always even desirable when determining who gets a plaque in a museum, it is vital when determining how people get paid. No one is saying WAR is off limits for determining player salaries, just that there are huge issues that are obviously not present with regards to the machinations of a hall of fame that has already been basically rendered irrelevant by the inconsistent application of the character clause.
Craig Goldstein
I don't know why you'd find it hard to argue that given they're completely different things, but also just given the scope. One-year WAR(P) values can have a substantial margin of error. Career WAR values are more likely accurate just due to the sample size.
Leonard V.
Is there a more accurate measure of performance in a small sample size? Or are you saying that performance bonuses shouldn't be a thing?
Craig Goldstein
My preference is that pay for performance not be included in a CBA in any capacity. Beyond that, I think most metrics carry concerns with them from varying viewpoints, and that it's a benefit that we have many metrics and ways to evaluate players. Boiling it down to one version of one stat is needlessly specific. You could have a panel of people responsible for creating the top 30 list based on a spectrum of stats and debate. You could also have the PA decide how to divvy up the money without a formula, because it's just a big pot of money already set aside from the central revenues, so there's no reason for the league to ask for a formula-based solution other than they see it as a pathway to get it elsewhere in the economics of the game (like arbitration, which they asked for earlier).