BP Unfiltered: Daddy, What’s Replacement Level?

March 4, 2013

In the pages of yesterday’s Boston Globe, veteran sports reporter Bob Ryan declared war on WAR. We get that one a lot. But the unusual part of this particular declaration was that it was based on the belief that the “RP” in WARP—for “replacement player”—was a "judgment call" rather than the product of a mathematical formula. Ryan argued that the "replacement level" comparison, as currently constituted, is just a matter of opinion, and therefore arbitrary and unreliable. It's not often that we’re told that we’re not using enough math.

It seems that Mr. Ryan might be misunderstanding what replacement level is and how it’s calculated, mistaking a mathematical abstraction for "something that we make up as we go along." In fact, replacement level is the result of a perfectly logical calculation. So let me take a moment to set the record straight.

WARP seeks to answer this basic question: If Smith suddenly vanished from the face of the earth, how much production would his team lose as a result? The general idea is that his team would do the best that it could, either promoting a guy from the bench to the starting job, bringing someone up from the minors, or signing a scrap-heap free agent who plays the same position. It wouldn’t get the same production it would have gotten from Smith, but it would get something. We need a way to compare the value that Smith supplies to the value of these guys on the bench, in the minors, and on the scrap heap.

Mr. Ryan correctly points out that WARP converts a player's exploits on the diamond into run values; for hitters, that includes hitting, defense, and baserunning contributions. We might say that Mike Trout contributed five billion runs (okay, the number might have been slightly smaller) to the Angels last year, all told. But to what shall we compare him? A summer's day? No, we compare him to the value of the "replacement players," who are the bench/minor league/scrap heap guys. Because Trout played center field last year, we need to find all the bench/minors/scrap heap center fielders out there. The 30 guys who led their teams in time spent in CF don't count. But everyone else who primarily played center field (i.e., center field was the position where he spent the most time) does. We can look to see how much value these guys collectively brought to their teams.
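To make the selection rule concrete, here is a minimal Python sketch of how one might pick out that pool. It is an illustration under stated assumptions, not BP's actual code: the record layout, the function names, and the made-up players are all hypothetical, and it ignores wrinkles such as players who changed teams mid-season.

```python
def primary_position(innings_by_pos):
    """Return the position where a player logged the most innings."""
    return max(innings_by_pos, key=innings_by_pos.get)


def replacement_pool(players, position="CF"):
    """Names of players whose primary position is `position`,
    excluding each team's leader in innings at that position.

    `players` is a list of dicts with hypothetical keys:
    "name", "team", and "innings" (position -> innings played).
    """
    # Find each team's leader in innings at the position.
    leaders = {}
    for p in players:
        inn = p["innings"].get(position, 0)
        best = leaders.get(p["team"])
        if inn > 0 and (best is None or inn > best[1]):
            leaders[p["team"]] = (p["name"], inn)
    leader_names = {name for name, _ in leaders.values()}

    # Everyone else who primarily played the position is in the pool.
    return [
        p["name"]
        for p in players
        if primary_position(p["innings"]) == position
        and p["name"] not in leader_names
    ]


# Made-up data for two hypothetical teams.
players = [
    {"name": "Starter X", "team": "Team X", "innings": {"CF": 900, "LF": 300}},
    {"name": "Backup X", "team": "Team X", "innings": {"CF": 400, "RF": 50}},
    {"name": "Corner X", "team": "Team X", "innings": {"RF": 1100, "CF": 20}},
    {"name": "Starter Y", "team": "Team Y", "innings": {"CF": 1200}},
    {"name": "Fill-in Y", "team": "Team Y", "innings": {"CF": 150, "LF": 100}},
]
pool = replacement_pool(players)
print(pool)
```

The two team leaders drop out, and so does the corner outfielder who only dabbled in center, which leaves the two backups as the comparison group.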

Had Trout himself disappeared, the Angels probably would have responded by playing Peter Bourjos and Torii Hunter more often. But we don't want to credit or blame Trout for the presence of other players who just happen to be on his team, so we take an average of what everyone else's bench players might have done in Trout's place, rather than compare him just to the Angels’ backup options. Then we look at how much value those backup center fielders, on average, would have provided in the amount of time that Trout played last year.

Replacement level is a mathematical abstraction in that no such "replacement player" actually exists; you can’t point to Larry over there and say that he is the gold standard of replacement level. But really, a replacement player is just the weighted average performance, per plate appearance (or per inning), of all backup center fielders, multiplied by the number of plate appearances (or innings) that Trout (or any other player whose value we want to assess) played.
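As a rough sketch of that arithmetic (again, not the actual WARP code), the Python below computes a playing-time-weighted replacement rate and scales it to a player's plate appearances. The class, the function names, and every number are invented for illustration, and run values are assumed to sit on some common scale, such as runs relative to league average.

```python
from dataclasses import dataclass


@dataclass
class BackupLine:
    """One backup's season line (hypothetical fields)."""
    name: str
    plate_appearances: int
    runs: float  # total run value on a common scale (assumption)


def replacement_rate(backups):
    """Runs per plate appearance for the pool, weighted by playing time.

    Dividing total runs by total PA is the same as averaging each
    backup's per-PA rate with his PA as the weight.
    """
    total_pa = sum(b.plate_appearances for b in backups)
    if total_pa == 0:
        raise ValueError("replacement pool has no plate appearances")
    return sum(b.runs for b in backups) / total_pa


def runs_above_replacement(player_runs, player_pa, backups):
    """Player's runs minus what the composite backup would have
    produced in the same number of plate appearances."""
    baseline = replacement_rate(backups) * player_pa
    return player_runs - baseline


# Made-up pool: 400 PA and -8 runs in total.
pool = [
    BackupLine("Backup A", 200, -4.0),
    BackupLine("Backup B", 100, -1.0),
    BackupLine("Backup C", 100, -3.0),
]
rate = replacement_rate(pool)
rar = runs_above_replacement(player_runs=50.0, player_pa=600, backups=pool)
print(rate, rar)
```

With these invented numbers the composite backup is worth -0.02 runs per plate appearance, so over 600 plate appearances he would have produced about -12 runs, and a player worth 50 runs comes out roughly 62 runs above replacement. Converting runs to wins is a separate step that this sketch does not attempt.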

In using this composite sketch of the state of backups in MLB, we trade the ability to answer the question, "What really would have happened to the Angels if Trout had vanished into thin air?" for the ability to compare everyone in MLB against a common baseline. Depending on the question that you want to answer, this may or may not be a beneficial assumption. It has advantages and disadvantages, but I'd argue that the advantages have more weight here.

If you'd like to take issue with how WAR defines value (and the assumptions inherent in it), then that's fine. If you'd like to take issue with the methodology used to calculate it, perhaps to say that the math and the definition don't fully match, that's fine too. A good scientist—and I consider myself to be a proper scientist—should give a fair hearing to a reasonable argument. But as always, we've started with a reasonable definition of what we're looking for, tried to create the best mathematical model that we can based on that definition, and then let the numbers fall where they will. That’s a better approach than making it up as we go along.

Russell A. Carleton

bishopscreed

3/04

As a physics teacher, I'm mulling sending this article to some students, to help them see that sometimes mathematical abstractions make you understand the world better.

One thing I've wondered about from time to time is how well a team of all replacement players would fare against the rest of the league. Also, does aggregate team WARP scale linearly against team wins? That is, do excellent teams perform better than their total WARP compared to a replacement level team, perhaps because of lineup synergy or having unusually good bench players?

swarmee

3/04

A theoretical team of replacement level players (let's say the Astros for simplicity's sake) is normally considered to win 40 games in a 162 game season. That's why each additional WAR a team accrues is added onto the baseline of 40 games.

nberlove

3/04

Well, let's do a little math with this year's PECOTA projections. The sum of all PECOTA WARP projections for all players projected to play at the MLB level this year (using the DC_FLAG on the spreadsheet) is 966. Divide that by 30, and the average team is projected for about 32 WARP. Since the average team wins 81 games, a replacement team would be expected to win about 49.
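For anyone who wants to check that arithmetic, a small Python sketch follows. The 966 total comes from the comment above; the function names are hypothetical, and the reverse calculation is just the same identity run backwards for the 40-win baseline cited earlier in the thread.

```python
TEAMS = 30
GAMES = 162


def implied_replacement_wins(total_warp, teams=TEAMS, games=GAMES):
    """Wins for an all-replacement team, given league-wide WARP.

    The average team wins half its games, so subtracting the average
    team's WARP from that leaves the replacement baseline.
    """
    avg_team_wins = games / 2
    avg_team_warp = total_warp / teams
    return avg_team_wins - avg_team_warp


def total_warp_for_baseline(replacement_wins, teams=TEAMS, games=GAMES):
    """League-wide WARP implied by a given replacement-team win total."""
    return (games / 2 - replacement_wins) * teams


baseline = implied_replacement_wins(966)    # 81 - 32.2
needed_for_40 = total_warp_for_baseline(40)  # (81 - 40) * 30
print(round(baseline, 1), needed_for_40)
```

A 966 total implies a baseline of about 48.8 wins, while a 40-win baseline would require 1,230 WARP league-wide, so the two figures in this thread cannot both describe the same set of projections.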

jdeich

3/04

I'm not aware of a way to quickly tabulate this, but it would be a good check of PECOTA to compare the WARP projected in the pre-season for 2012 to the WARP assigned to players in 2012. The number of wins available each year is constant, and a well-normalized system should roughly obey this.

BillJohnson

3/04

And in the difference between this 49-win figure, and the 40-win figure accurately cited by swarmee as the theoretical basis for replacement level, can be seen the essence of Ryan's argument, which I believe has some merit -- not a lot, but some.

Let's be honest here: there IS a component of judgment -- or if you prefer, "subjectivity" -- as to where replacement level resides. If there wasn't, everybody's values for players' WAR (or WARP) would be the same. They aren't. Most of the time they are fairly similar, but there are occasional extreme outliers. This is in contrast to traditional stats, where at least everyone's understanding of batting average, HRs, RBIs, ERA, etc., is exactly the same (which is not to say that that understanding then translates to a correct understanding of value).

Furthermore, we ARE making it up as we go along. There is constant fine tuning of the calculation of WAR/WARP/whatever. I submit that this is a good thing, not a bad thing; it means that we are continuing to think hard about what it means to be good at baseball. The "making it up" is on a long-term basis, rather than just pulling numbers and formulas out of some body orifice that are different today than yesterday so that we can "demonstrate" the superiority of the guy we perceive today as being the best player. (Ryan apparently doesn't get that.) But it still goes on.

This doesn't mean that I agree with Ryan; at the 90% level, I don't. However, I do agree with him that it's important to avoid overclaiming what WAR can tell us about players, and about baseball.

(Incidentally, I'm also a physicist.)

gweedoh565

3/04

"If there wasn't, everybody's values for players' WAR (or WARP) would be the same."

There are a LOT of things that differentiate WAR calculations aside from the chosen replacement level:

http://www.baseball-reference.com/about/war_explained_comparison.shtml

BillJohnson

3/04

Oh, for sure, and I just focused in on replacement level as one of those components. The point still stands, though: WARP and things like it are outputs of models. A scientist knows that a model's outputs are only as good as the model itself, in combination with the input data. Since there are differences of opinion/assumptions from model to model, not to mention that the data set itself gets squirrely once one gets away from balls, strikes, hits, etc., to defensive "zones" and definitions of line drives versus fly balls, there's a lot of judgment/subjectivity to the models that we often ignore. As long as those things are handled consistently and with some detachment, the resulting conclusions have value. They're still not as detached and objective as some of the more fervent advocates of the models would have us believe they are.

dethwurm

3/04

I agree with everything you say, but I think we need to exercise caution when using phrases like "making it up". People here get it, but to most baseball fans that sure sounds like "just pulling numbers and formulas out of some body orifice that are different today than yesterday so that we can "demonstrate" the superiority of the guy we perceive today as being the best player." "Refining" might be a better euphemism.

TangoTiger1

3/04

I've half-joked that if I ever roll out my own implementation of WAR, I'd call it W/W (Wins over Willie, for Bloomquist), since he embodies the ideals of the replacement-level player, someone who manages to find a job, usually with an under .500 team, rarely gets to be more than a platoon player, and can play any position on the field.

BillJohnson

3/04

LOL, and rec'd, but in all seriousness, is a guy who "can play any position on the field" really a replacement-level player? There aren't many of those -- whatever there actual worth may be.

BillJohnson

3/04

"Their" actual worth, obviously. Wish these comments were editable!

Oleoay

3/04

No one said Willie played every position well :)

rcbussbp

3/04

If/when MLB goes through another round of expansion, how would that affect replacement level?

I'd assume that the new RL would be lower than the old one, but the setting on the "replacement level-o-stat" would need to be turned down, so that your new replacement-level player would still be ~0.0 WAR / WARP / WOW. You would then no longer be comparing apples to apples.

BarryR

3/04

Uh-oh. You found a hole in the system.

Logic tells us the replacement level would be lower - it would have to be, since a number of the previous season's "replacements" were now regulars, or in the major leagues. But something else happens in expansion seasons - the best players and pitchers take advantage of the watered down league. Watching the 1962 Mets against quality pitchers was truly sad (not for the pitchers, of course). So better players are having better seasons because they're facing replacement level players, then their WARP increases more because the replacements also got weaker.

The reverse of this should have happened between 1947 and 1960, where integration added a steadily increasing stream of quality players to the mix. Players who were regulars were pushed to the bench or the minors, which improves the level of replacement player, while at the same time pitchers had to deal with Jackie Robinson and Willie Mays, while hitters had to face Don Newcombe.

Maybe WAR/WARP/whatever accounts for this in some way. I'd be curious as to how.

Oleoay

3/04

As I recall, the replacement player level was recalculated two to three years ago just because of new ballparks opening and the decline in offense from the 90s.

Oleoay

3/04

I thought the definition of replacement player was basically a quad-A player or minor league free agent. I don't remember bench players ever being part of the equation. Technically, aren't those bench players generally better than the quad-A/freely available talent anyway?

lopkhan00

3/04

I'll admit to being a little confused by this also, as the WARP definition in the glossary refers to minor league players: "Essentially, replacement-level players are of a caliber so low that they are always available in the minor leagues because the players are well below major-league average." This doesn't read the same to me as "we take an average of what everyone else's bench players might have done." Russell also refers to "bench/minors/scrap heap" being involved in the reckoning of replacement level. Which leaves me wondering who all are thrown into the equation?

BarryR

3/04

It can't just be major leaguers, because the sample size of bench players based on just their primary position is much too small. So I would think minor leaguers must be in the equation. Which means that the formula includes translating minor league offensive and defensive numbers into major league equivalents and using minor league park factors to correct for biases, which makes the numbers a tad less precise than one might think.

bobbygrace

3/04

I thought that link in the first paragraph was going to take me to a Wilco song.

jimcal

3/04

I just want to say the first sentence I said to myself a couple times and now it won't go away.

"We are going to WARP on this thing."

apfeffer

3/04

I'm curious about the history. Why not use wins above average? Seems easier to understand and measure. You would also know how many wins a team would be expected to get based upon the wins above average of their players?

mickeyg13

3/05

Suppose Player A plays at league average level for all 162 games, and Player B plays at a league average level for half a season and then gets hurt and misses the rest of the season. Both put up the same number of wins above average (0 WAA), but Player A helped out his team far more than Player B. Average players have significant value, and teams pay far more than the league minimum for their services on the open market.

Contrast this with the scenario where a replacement-level player gets injured. Typically he can be replaced by another player who would be willing to play for the league minimum. So, in a sense, a full season of 0 WAR and a half season of 0 WAR have the same value, whereas the same cannot be said of a full season of 0 WAA vs. half a season of 0 WAA.

There can be legitimate reasons to favor other baselines of course though.
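The Player A / Player B example can be put into numbers with a short Python sketch. The gap between an average player and a replacement player is set here to 2.0 wins per full season purely as an illustrative assumption, and the function names are hypothetical.

```python
SEASON_GAMES = 162
# Assumed gap between an average and a replacement player over a
# full season. Illustrative value only, not taken from the article.
AVG_MINUS_REPL_PER_SEASON = 2.0


def waa(rate_vs_avg, games):
    """Wins above average, given a per-full-season rate vs. average."""
    return rate_vs_avg * games / SEASON_GAMES


def war(rate_vs_avg, games):
    """Wins above replacement: the same rate, measured from the lower
    replacement baseline and scaled by playing time."""
    return (rate_vs_avg + AVG_MINUS_REPL_PER_SEASON) * games / SEASON_GAMES


# Player A: league average for 162 games. Player B: league average for 81.
a_waa, a_war = waa(0.0, 162), war(0.0, 162)
b_waa, b_war = waa(0.0, 81), war(0.0, 81)

# A replacement-level player sits exactly at the replacement baseline.
repl_rate = -AVG_MINUS_REPL_PER_SEASON
repl_full_war, repl_half_war = war(repl_rate, 162), war(repl_rate, 81)

print(a_waa, b_waa)  # both zero: WAA cannot tell A and B apart
print(a_war, b_war)  # WAR credits A for the extra half season
print(repl_full_war, repl_half_war)  # both zero, as the comment says
```

Under that assumed gap, both players show 0 WAA, but Player A earns 2.0 WAR to Player B's 1.0, while a replacement-level player is worth 0 WAR whether he plays a full season or half of one.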

apfeffer

3/05

I think this is the crux of the issue: wins above replacement player is a well-defined and useful concept, but WARP is only an approximation to wins above replacement player, as evidenced by the fact that different publications have different metrics for the same underlying concept. As an approximation, it's useful, but it's worth acknowledging its limitations.

apilgrim

3/05

If it is of interest to anyone, I went to a seminar last year on protein folding prediction, where the big announcement was a method that brought accuracy up to around 37% from the industry standard (Princeton's group) of 30%.

Oleoay

3/05

I'd be happy if the weather forecasts were 30% accurate in Denver...

dfloren1

3/05

Maybe Bob Ryan's beef with WARP is that he seeks from baseball only the mythological and all of its fuzzy comforts, and, in a Caddyshack moment, saw statisticians pop, pop, popping up everywhere like rabid gophers, destroying the baseball gods' perfectly-mown pitch, thus momentarily causing Mr. Ryan to literally become a grass-addled Bill Murray intent upon drowning and dynamiting those pesky varmints, only using his keyboard instead of a detonator! Or, perhaps he was merely following the great French polymath Jean Cocteau's thought process ... "Man seeks to escape himself in myth, and does so by any means at his disposal. Drugs, alcohol, or lies. Unable to withdraw into himself, he disguises himself. Lies and inaccuracy give him a few moments of comfort." Well, that's probably a little too bleak, but there is a gleaming kernel of truth in the idea that lots of folks seek comfort from the great game of baseball. And when we value something sufficiently, we are usually willing to fight to protect it.
