Image credit: Tommy Gilligan-USA TODAY Sports

We ask “replacement level” to be a lot of things. Sometimes contradictory things. Sometimes I wonder if we know what it even means any more. The original idea was that it represented the level of production that a team could expect to get from “freely available talent”, including bench players, minor leaguers, and waiver wire pickups. It created a common benchmark to compare everyone to, and for that reason, it represented an advancement well beyond what was available at the time. In fact, it created a language and a framework for evaluating players that was not just better but entirely different than what came before it.

But then we started mumbling in that language. The idea behind “wins above replacement” was one part sci-fi episode and one part mathematical exercise. Imagine that a player had disappeared before the season and suddenly, in an alternate timeline, his team would have had to replace him. The distance between him and that replacement line was his value. We need to talk about that alternate timeline.

Without getting too into 2:00 am “deep conversations” with extensive navel-gazing, it’s worth thinking about why one player might not be playing, while another might.

  • A player might not be playing because he has a short-term injury or his manager believes that he needs a day off.
  • A player might not be playing because he has a longer-term injury that requires him to be on the injured list.

There’s a difference here between these two situations. In particular, the first one generally doesn’t involve a compensatory roster move, while the second one does. It’s possible, though not guaranteed, that the person who will be replacing the injured/resting player would be the same in either case. That matters. Teams generally carry a spare part for all eight position players on the diamond, although in the era of a four-player bench, those spare parts usually are the backup plan for more than one spot.

A couple of years ago, I posed a hypothetical question. Suppose that a team had two players in its system fighting for a fourth outfielder spot. One of them was a league average hitter, but would be worth 20 runs below average if allowed to play center field for a full season. One of them was a perfectly average fielder, but would be 15 runs below average as a hitter, if allowed to play an entire season. Which of the two should the team roster? It’s tempting to say the second one, as overall, he is the better player. That misses the point. A league average hitter on the bench isn’t just a potential replacement for an injured outfielder. He might also pinch hit for the light-hitting shortstop in a key spot. You keep the average hitter on the roster, even though he isn’t a hand-in-glove fit for one specific place on the field, because being a bench player is a different job description than being a long-term fill-in for someone. If you find yourself in need of a longer-term fill-in, you can bring the other guy up from AAA.

When we’re determining the value of an everyday player, though, the calculus changes. If he had disappeared before the season and a team had to replace his production, they likely would have done it with a long-term fill-in type, because they would have been replacing a guy who played every day. Maybe that’s the same guy that they would have rostered on their bench anyway, but we don’t know. It gets to the question of what we hope to accomplish with WAR. Are we looking for an accurate model of reality, or are we looking for a common baseline to compare everyone to? Both have their uses, but they are somewhat different questions.

Let’s talk about another dichotomy.

  • A player might not be playing because he isn’t very good and is a bench-level player.
  • A player might not be playing because there is another player on the team who has a situational advantage that makes him the better choice today. The classic case of this is a handedness platoon. On another day, he might be a better choice.

When we think about player usage, I think we’re still stuck in the model that there are starters and there are scrubs. We have plenty of words for bench players or reserves or backups or utility guys. We do still have the word “platoon” in our collective vocabulary, but in the age of short benches, it’s hard to construct one. It’s always been hard to construct them. You have to find two players who hit with different hands, have skillsets that complement each other, and probably play the same position. In the era of the short bench, one of them had probably better double as a utility player in some way. Baseball has a two-tiered language geared toward the idea of regulars and reserves. The fact that it was so easy for me to find plenty of synonyms for “a player whose primary function is to come into a game to replace a regular player if he is injured or resting” should tell you something.

I’m always one to look for “unspoken words” in baseball. What is it called when someone is both half of a platoon and the utility infielder? That guy exists sometimes, but he reveals himself in that role—usually by accident. We don’t have a word for that, and whenever I find myself saying “we don’t have a word for that”, I look for new opportunities. What do you call it, further, when the job of being the utility infielder is decentralized across the whole infield with occasional contributions from the left fielder? It’s not even a “super-utility” player. What happens when you build your entire roster around the idea that everyone will be expected to be a triple major?


I think someone else beat me to this one, and on a grand scale. Platoons work because we know that hitters of the opposite hand to the pitcher get better results than hitters of the same hand, usually to the tune of about 20 points of OBP. If you want to express that in runs, it usually comes out to somewhere around 10 to 12 runs of linear weights value prorated across 650 PA. But hang on a second. Let’s say that we have two players who might start today, both of roughly equal merit with the bat. One has a handedness advantage, but is the worse fielder of the two. In that case, as long as his “over the course of a season” fielding projection at whatever position you want to slot him into is less than a 10-run drop from the guy he might replace, he’s the better option today.
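To make that trade-off concrete, here is a minimal sketch of the arithmetic. The function names are made up for illustration, the platoon edge is set to the low end of the 10-to-12-run range, and the two hitters are assumed to be otherwise equal with the bat, so treat it as a back-of-the-envelope check rather than a real lineup model.

```python
# Hypothetical helper names; all run values are per 650 PA, as in the text.
PLATOON_EDGE_RUNS = 10.0  # low end of the 10-to-12-run range quoted above


def net_runs_from_platoon_start(fielding_drop_runs, platoon_edge_runs=PLATOON_EDGE_RUNS):
    """Net runs per 650 PA gained by starting the platoon-advantaged bat.

    Assumes the two hitters are otherwise equal at the plate, so the only
    terms are the handedness edge and the defensive downgrade.
    """
    return platoon_edge_runs - fielding_drop_runs


def should_start_platoon_bat(fielding_drop_runs, platoon_edge_runs=PLATOON_EDGE_RUNS):
    """True when the handedness edge outweighs the fielding downgrade."""
    return net_runs_from_platoon_start(fielding_drop_runs, platoon_edge_runs) > 0


print(should_start_platoon_bat(6.0))   # a 6-run fielding drop
print(should_start_platoon_bat(12.0))  # a 12-run fielding drop
```

A fielder who costs six runs over a full season still comes out ahead on the day he has the platoon edge; one who costs twelve does not.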

We’re not used to thinking of utility players as bat-first options, who would play below-average defense at three different infield positions. That guy might hook on as a 2B/3B/LF type (Howie Kendrick, come on down!) but teams usually think to themselves that they need as their utility infielder someone who “can handle” shortstop, the toughest of the infield spots to play. If someone can do that and hit well, he’s probably already starting somewhere, so he’s not available as a utility infielder. It’s easier for those glove guys to find a job. In a world where the replacement for a shortstop has to be the designated utility infielder, that makes sense.

But as we talked about last week, we’re living in a different world. The rate at which a replacement for a regular starter turns out to be another starter shifting over to cover has gone way up over the last five years. There was always some of it in the game, but this has been a supernova of switcheroos. Now if your second baseman is capable of playing a decent shortstop, that 2B/3B/LF guy can swap in. He’s not actually playing shortstop, and maybe the defense suffers from the switch, but if he’s got enough of a bat, he might outhit those extra fielding miscues. And in doing so, he is effectively your backup shortstop.

Somewhere along the line, teams got hip to the idea of multi-positional play from their regulars. I’ve written before about how you can’t just put a player, however athletic, into a new position and expect much at first. The data tell us that. Eventually, players can learn to be multi-positionalists, but it takes time, roughly on the order of two months, before they’re OK. But there’s a hidden message in there. If you give a player some reps at a new spot, and he’s a reasonably gifted athlete who is somewhat smart and willing to learn, he could probably pick it up enough to get to “good enough,” and it doesn’t take forever. You just have to be purposeful about it. Maybe you get to the point where you can start to say “he’s still below average but we could move him there and get another bat into the lineup, and it’s a net win.”

Teams have started to build those extra lessons into their player development program. It used to be seen as a mark of weakness to be relegated to “utility player” because that meant that you were a bench player (all those synonyms above come with a side of stigma). Now, it’s a way of building a team. If you get a few reps in the minors (where it doesn’t count) at a spot, you’ll have at least played the spot at game speed before. There are limits to how far you can push that. A slow-footed “he’s out in left field because we don’t have the DH” guy is never going to play short, but maybe your third baseman can try second base and not look like a total moose out there. 


Back to WAR. I’d argue that the world of starters and scrubs is slowly disintegrating, and for good reason. In the event that a regular starter really does go down with an injury – ostensibly, the alternate universe scenario that WAR is attempting to model – that positional flexibility makes the team a little more resilient when it has to replace him. And the good news is that you’re more likely to be able to replace him with the best of the bench bunch, rather than the third-best guy, because the best guy doesn’t have to be an exact positional match for the guy who got hurt. And that’s what the manager would want to do. He’d want to replace that long-term production, not with an amalgam of everyone else who played that position, but with the best guy available from his reserves.

Now this is still WAR. We still want to retain the principle that we should be measuring a player, and not his teammates. We need some sort of common baseline, and despite what I just said, we’ll still need some sort of amalgam. To construct that, I give to you the idea of the tranche. The word, if you’ve not heard it before, refers to a piece of a whole that is somehow segmented off. It’s often used in finance to talk about layers of a financial instrument.

Here, I want you to consider that there are 30 starters at each of the seven non-battery positions (catchers should have their own WAR, since only a catcher can replace a catcher). We can identify them by playing time, and we can futz around with the definition a little bit if we need to. Next, among those who aren’t in that starting pool, we identify the top tranche of the 30 best bench players, which I would again identify by playing time, and then the second and third and fourth and so on. If a player were to disappear, his manager would probably want to take a guy from that top tranche of the bench to replace him. In a world where even the starters can slide around the field, that becomes more feasible.
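For the code-minded, the sorting described above might look something like this. It is only a sketch: the tuple layout, the function name, and the toy player pool are assumptions for illustration, and a real version would need the eligibility rules and the futzing with definitions mentioned above.

```python
from collections import defaultdict


def build_tranches(players, n_teams=30):
    """Split players into a starter pool and playing-time tranches.

    players: list of (name, primary_position, plate_appearances) tuples.
    The top n_teams players by PA at each position form the starter pool;
    everyone else is ranked by PA and cut into groups of n_teams.
    """
    by_position = defaultdict(list)
    for player in players:
        by_position[player[1]].append(player)

    starters, bench = [], []
    for position_players in by_position.values():
        ranked = sorted(position_players, key=lambda p: p[2], reverse=True)
        starters.extend(ranked[:n_teams])
        bench.extend(ranked[n_teams:])

    bench.sort(key=lambda p: p[2], reverse=True)
    tranches = [bench[i:i + n_teams] for i in range(0, len(bench), n_teams)]
    return starters, tranches


# Toy pool for a pretend two-team league (names and PA totals are made up).
toy_players = [
    ("A", "SS", 600), ("B", "SS", 550), ("C", "SS", 300), ("H", "SS", 90),
    ("D", "2B", 620), ("E", "2B", 500), ("F", "2B", 250), ("G", "2B", 120),
]
starters, tranches = build_tranches(toy_players, n_teams=2)
for number, tranche in enumerate(tranches, start=1):
    print(number, [name for name, _, _ in tranche])
```

Ranking purely by plate appearances is the simplest reading of "identify by playing time"; swapping in games or innings would only change the sort key.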

We can take a look at that top tranche and ask, “How many of them showed that they are able to play (first, second, etc.)?” In other words, how many of them could have been a direct substitute for our injured player? We don’t know whether one of them would be on a specific team, but we can say that 40 percent of the time, a manager would have been able to draw from tranche 1 in filling the role, and 35 percent of the time from tranche 2. Within tranche 1, we can also look at how many of those players played a position whose regular could have then shifted over and covered for that spot. We’d need some eligibility criteria for all of this (probably a minimum number of games played) but it would just be a matter of multiplication. Shortstop would be harder to fill, and managers would probably be dipping a little further down in the talent pool, and so replacement level would be lower, as it is now.

Doing some quick analysis, I found that the difference in just batting linear weights (haven’t even gotten into running or fielding) between tranche 1 and tranche 2 in 2019 was about 6.5 runs, prorated across 650 PA. Between tranche 1 and tranche 3, it’s 10.8 runs. The ability to shift those plate appearances up the ladder has some real value.
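Here is a rough sketch of what that "matter of multiplication" could look like, combining the 40 and 35 percent shares used earlier with the 2019 batting gaps. Two loud assumptions: the shares are treated as illustrative, and the remaining 25 percent is assigned entirely to tranche 3, which the text does not say. The function name is hypothetical.

```python
def expected_gap_from_top_tranche(shares, gaps):
    """Probability-weighted batting gap (runs per 650 PA) versus tranche 1.

    shares: chance that the replacement is drawn from each tranche.
    gaps:   batting runs per 650 PA that each tranche trails tranche 1.
    """
    if len(shares) != len(gaps):
        raise ValueError("shares and gaps must be the same length")
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1")
    return sum(share * gap for share, gap in zip(shares, gaps))


# 40% and 35% come from the text; the 25% for tranche 3 is an assumed remainder.
shares = [0.40, 0.35, 0.25]
# Gaps relative to tranche 1, from the 2019 batting linear weights figures.
gaps = [0.0, 6.5, 10.8]

expected_gap = expected_gap_from_top_tranche(shares, gaps)
print(round(expected_gap, 3))
```

Under those assumptions, the blended replacement trails a tranche 1 bat by about five runs over 650 PA, and any roster flexibility that shifts weight toward tranche 1 shrinks that number.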

This part is important. We can also give credit to starters for the positions that they showed an ability to play, even if they didn’t play them (this is the guy fully capable of playing center, but who’s in a corner because the team already has a good center fielder) because he allows a team to carry a player who hits like a left fielder to functionally be the team’s backup center fielder. He facilitates that movement upward among the tranches. We can start to appreciate the difference between a left fielder who would never be able to hack it in center (and the compensatory move that his team would have to make) and the left fielder who could do it, but just didn’t have to very often.

Past that, you can continue to use whatever hitting and fielding and running metrics you like to determine a player’s value, but when we get down to constructing that baseline, I’d argue we need a better conceptual and mathematical framework. It’s going to require some more #GoryMath than we’re used to, but I’d argue it’s a better conceptualization of the way that MLB actually plays the game in 2020. If… y’know… MLB plays in 2020. If WAR is going to be our flagship statistic among the acronymati, then we need to acknowledge that it contains some old and starting-to-be-out-of-date assumptions about the game. We may need to tinker with it. Here’s my idea for how.

Great stuff as always. One point you almost went to earlier in the article is that the timing of replacement is important. Player gets hurt in game, his replacement comes off the existing bench roster. Going to be out for a while, maybe the AA prospect (presumably better than the bench guy) gets called up to replace him going forward. Going to be out for a year, maybe you trade for a replacement (yes, not 'freely available' but it's more real-world replacement).
Cliff Mayo
I am so glad you are writing about this stuff.
Ryan Sullivan
Outstanding Russell!
Byron Hauck
This is going to look like an unhinged rant because we're not allowed to use line breaks. I've learned a lot about WAR from these articles over the past couple years, principally that replacement level is just a flat subtraction from whatever average is. I don't like that method any more than you do. But I just cannot get behind your assumption that the "replacement player" is simply the replacement in the starting lineup. The definitions I saw when I was learning this stuff a decade ago all emphasized that a "replacement player" is a freely available replacement for a roster spot that we then pretend gets all the playing time at a position. They're not major-league bench players, or even borderline major-league bench players. They're the 28-year-old SS on someone else's AA team that they'll sell you for $10k. They're your team's 15th best pitching prospect from five years ago. They're the 35-year-old C with 50 MLB PA across four teams. As I said below the article you linked, when Dave Cameron looked for replacement players in 2013, he looked at waiver claims, not bench players who got more PT than expected. The point of WAR is how much worse a team would be if he disappeared and they were prevented from replacing him with anything other than a player they can add to the roster for free. It's really hard to confidently state how bad that player would be because they so rarely play and there's survivor bias, but I don't think you can start using people who are on 25-man rosters before anyone stepped out of the lineup instead.
Byron Hauck
I don't know, maybe it's dumb to measure off a black hole rather than the league-average ability for a roster to reorganize itself around an absence.
Byron Hauck
Oh, and I say "a decade ago" not as a brag but to expose that I wasn't around for the actual development of WAR. I guess it's a humblebrag that I'm not a child?
Craig Goldstein
Not to speak for Russell but I don't think he's saying that it has to be a literal player, but that the flexibility of literal players affects the composition of the black hole player in terms of quality and how much you want to adjust for positional value.
It's true, and has been since the inception of replacement level, that not every player who is lost gets replaced by the same barely adequate sort of guy, cf. Wally Pipp. But I would worry if accounting for that went so far as to reduce a player's measured value based on circumstances like that, which are out of his control. That would be a step away from adjusting for context. For adjusting positional value, though, I think it probably does make a lot of sense.