Integral to those numbers is something called WAR, which stands for wins above replacement. What replacement? A replacement player, of course, but he’s mythical.
Statistics zealots apparently love to deal with mythical or hypothetical players. The problem for those of us who prefer dealing with reality and actual human beings is we can’t buy into the idea of using mathematical formulas instead of real players.
—Murray Chass, September 5, 2010.

I have considered WAR and VORP (“value over replacement player;” yes there’s that replacement guy again), and I have a basic problem with them. The replacement player isn’t real; he’s a myth, and I’ve never seen a myth play baseball. It’s like fantasy baseball. That stuff isn’t real either. —Chass, March 6, 2011

Before we begin, a disclaimer of sorts, or at least a plea for indulgence. I know we hit ol’ Murray quite recently, and at that time some of the comments suggested that we stop shooting at this fish and leave him in his Hall of Fame barrel. I’m sympathetic to that point of view to the extent that I suspect we in the sabermetric community are the only people paying the slightest attention, and unsympathetic because (a) the existence of retrograde thought offends me, (b) battling ignorance is part of my job description, and (c) attacking it is so darned fun.

I would still have left Murray alone except that while in Boston I found myself violently thrown out of a dream and quite nearly onto the floor, and the thought that echoed in my sweat-drenched skull was, “If you can teach the story of the three bears to children, you can teach replacement level to Murray Chass,” except the story that I dreamt was not that of the bear trio, but that of Babe Ruth’s Dead Fat Cat. Yes, my subconscious is due for an overhaul if I’m dreaming of Murray Chass, but I believe it spoke truly. Thus, the exhumation of Murray’s DOA ideas one last time.

It is important to draw a distinction between the stupid and the willfully obtuse. No one has ever seen a bear family set up a bed and breakfast and start cooking oatmeal. Nor, for that matter, has anyone stuffed a cat into a steel box along with a decaying radioactive substance and a flask of hydrocyanic acid, as Erwin Schrödinger proposed in his famous critique of quantum mechanics. Even he called this illustration “quite ridiculous.” In addition, most recent scholarship debunks the notion that Aesop came upon a fox and a crow arguing over a piece of cheese in the middle of the woods. These are clearly stories meant to illustrate some point, extended metaphors. Those of less than average intelligence may take such stories as intended to be literal representations of what they portray, and insist they are being lied to in some way—“He-ey! Bears don’t do that!”—but as that level of thickness is, even for the benighted human race, extremely rare, we are forced to assume that Chassian protestations in the same vein are therefore perversely inspired, a tool for argument. We will not guess at his motivations, but will proceed to the argument, such as it is.

Chass is correct that the replacement-level player is a hypothetical, but it is hypothetical in the same way as those bears or Schrödinger’s cat, imaginary figures that can teach us something valuable and real. The idea of the replacement level is slightly more complex than the three bears, but its purpose is not. Put aside cavils about exactly where the replacement level should be set, because it’s irrelevant to the utility of the concept itself, which is great as long as you’re not bent on insisting that something figurative be taken literally.

There is an old saying, “In the dark, all cats are gray.” Measuring ballplayers without sufficient context is like judging the color of cats in the dark. The idea of the replacement level is simply that in order to understand the real value of something, you need something else with which to compare it. Here we come to Babe Ruth’s Dead Fat Cat:

One fine Monday in 1923, Babe Ruth came into the clubhouse carrying a large canvas sack. “What you got there, Babe?” asked Wally Pipp as he bumped into the hat rack.

“Is it a piñata?” asked Bob Meusel.

“Whatever it is, it had better not get in my way,” growled Carl Mays.

All the players gathered around Ruth’s locker. “C’mon, Babe,” they said, speaking with almost one voice. “Tell us what it is!”

Ruth threw back his big head and laughed. “Take a look at this, fellows!” He shook out the bag. THUD! Something heavy, furry, and stiff landed on Wally Pipp’s foot.

“Owww!” shouted Wally.

“Is that what I think it is?” Wally Schang asked. He pulled on his face mask and frowned through the grill.

“It sure stinks,” cried Jumping Joe Dugan.

“It’s a dead cat!” said Ruth. “I was walking down the street and some guy just gave it to me!” They all looked down at the remains on the floor. The cat had a beautiful white coat, a pink nose, and it was clearly deader than Mad King Ludwig of Bavaria. Most shocking of all, though, was just how thick and round it was.

“It sure is fat!” Meusel said. “Let’s weigh it!”

They all rushed into the trainer’s room and placed the former feline on the scale. Home Run Baker fiddled with the weights until they came into balance. “Twenty-three pounds!” he announced.

“That is the fattest dead cat I have ever seen,” said Dugan, “and I have seen some dead cats in my time.”

“When I was with Connie Mack, he said the fattest dead cat on record was only 18 pounds,” Whitey Witt said wistfully, “and Connie fought in the War of 1812.”

“It’s a new record!” said Pipp. He shouted “Hooray!” and vanished over a cliff.

“Mark another one down for the Babe!” Ruth crowed. “Not only am I the greatest hitter of all time, but I am the owner of the world’s fattest dead cat!”

The players all applauded. Just then, Miller Huggins stormed into the clubhouse and, seeing what was going on, suspended Ruth for seven games. “And you’ve got the longest suspension ever, too,” he said, smirking.

The next day, Ruth entered the clubhouse with another sack, this one bulging as if it could barely contain its burden. “What’s that, Babe,” Dugan asked. “Another dead cat?”

“Yup,” Ruth said glumly. “Fellow just gave it to me.”

“Must be a Giants fan,” said Meusel.

“He had better stay away from me,” Mays spit.

“Why’d you take it, Babe?” Pipp asked as he was run over by a herd of buffalo.

“It seemed impolite not to,” Ruth shrugged.

“What’s the point?” Baker asked. “No way it’s bigger than yesterday’s. That was the biggest dead cat ever.” He watched as Ruth dumped the contents of the bag out on the floor. THUD! THUD! Once again, the cat had a beautiful white coat, a pink nose, and was stiffer than U.S. Grant. Yet, there was one crucial difference between Tuesday’s dead cat and Monday’s: the first dead cat had only two double chins. The second had three. The players all took a step back and marveled at its girth.

“I don’t know, Frank,” Schang said to Baker, warding off the dead thing with his chest protector. “You may have to eat those words.”

“I guess we had better weigh it,” Ruth sighed. As one, they trooped into the trainer’s room and commandeered the scale. Once again, Baker tapped the weights into place. “You won’t believe this,” he said in a disbelieving whisper. “This dead cat is 28 pounds.”

“Gee, we were wrong yesterday when we said that dead fat cat was the fattest dead cat ever.”

“It’s a new record,” Pipp shouted, accidentally severing his artery with a scimitar. Dugan offered him a band-aid.

“Hooray,” Ruth said limply. Just then, Miller Huggins stormed into the clubhouse and suspended Ruth another seven games. “I don’t approve of the way you treat dead animals,” he hissed.

On Wednesday, the players all arrived at the clubhouse early, wondering if Ruth would bring another dead cat. “Yep, I got one, kids,” Ruth said as he bustled into the clubhouse. He carried a small paper bag. His face was smudged and there was a brown grease stain on his camelhair coat.

“Where?” asked Meusel.

“Right here in this bag,” Ruth said, inverting it. BIMP. A cat tumbled out and wafted gently to the floor. It had a white coat, a pink nose, and was no more well off than Edgar Allan Poe. Yet, it seemed strangely bereft of cellulite. “What do you think, boys?” Ruth said nervously. “Is it another record?”

“It’s not terribly fat, is it?” asked Pipp, burning his hand with a steam iron.

“In truth, it seems a bit malnourished,” Baker said.

“All cats is rotten,” Mays spat.

The players slowly drifted away. “Don’t you wanna weigh it?” Ruth pleaded.

“There hardly seems a point,” said Witt, nearly weeping. “Are you sure the guy wanted to give you this one?”

“Yeah, he—aw, it’s no use. I can’t lie to you fellahs. The man with the dead cats didn’t show up today. I didn’t want to disappoint you guys, so I spent all morning crawling around back alleys trying to find one.”

“Aw,” the assembled Yankees said as one.

“We’re touched, Babe,” Meusel said. “The only problem is, this one is below the limit. By the looks of it, it’s no more than eight pounds. You shoulda thrown it back.”

“No, it was a good thing he brought it in, Bob,” Baker interrupted. “If we weren’t sure what a dead fat cat looked like before, we sure know now that we have this one to compare it with.”

“We ought to keep it around, just in case,” Dugan squeaked. “Someone make room in the ice box.”

“Say, that looks like my cat,” Mays said.

On Thursday, Ruth entered staggering under the weight of the most massive sack that any of the players had ever seen. He had to be pushed through the door by two clubhouse boys, both of whom were easily dwarfed by the mammoth bag.

“Holy Moly,” said reserve catcher Al DeVormer, who no one had ever heard speak before.

“Mercy,” moaned Ruth from under the bag.

The players pulled the Babe free. Mays then came forward and chewed away the straps that held the bag closed. THUD! THUD! THUD! The cat had a white coat, a pink nose, and was clearly as extinct as King Kelly. Yet, this cat was less cat than a mountain of fat and fur the size and rough dimensions of a steamer trunk. Together, the players rolled it to the trainer’s room and pushed it onto the scale. At first, it seemed as if Baker would not be able to get the scale to balance, but finally he stepped back and admired his work.

“What is it, Frank?” the players shouted. “Tell us! Tell us!”

“The Babe’s dead fat cat weighs four-hundred and seventy-nine pounds.” A hush fell over the room.

“Where does he get these?” Dugan asked.

“John McGraw, probably,” Meusel grumbled.

“We’re running out of places to put these dead cats,” Schang whined.

“Stuff ‘em in Mays’ locker,” said Baker.

“Anyone seen Wally?” Whitey Witt asked, but no one paid any attention.

“So, Monday’s dead fat cat wasn’t the fattest ever, even though we thought it was,” Ruth said, scratching his head. “And Tuesday’s dead fat cat wasn’t the fattest ever, even though we thought it was.”

“We were wrong because we hadn’t yet seen the biggest cat,” said Dugan. “This one just has to be it!”

The players gathered around Ruth. “The Babe has the biggest dead fat cat of all time!”

Just then, Miller Huggins stormed into the clubhouse and suspended all of the Yankees on general principles. That afternoon, the team forfeited its game to the Browns, and all the players went home early. It was only later, when Carl Mays went to skin the 479-pound beast, that the Yankees discovered the half-digested body of Wally Pipp inside. And the next day, Ruth brought in another dead fat cat. This time, he needed a truck.


Ruth’s dead fat cat is mythical, but he serves a useful purpose because he demonstrates that measures of greatness are relative. If you’ve only seen one dead fat cat, you proclaim it the fattest ever at your own risk, because you haven’t assessed the entire population of dead fat cats, as the Yankees found to their sorrow. It would have helped them to have some kind of baseline against which to measure each new find.

The same is true of baseball players. Considering a player in a vacuum is no different than asking, “How far is Hong Kong?” without supplying any other point of reference. Do you mean, “How far is Hong Kong from Beijing?” “From Hoboken?” “From Mars?” Or more accurately, it’s like asking how far Hong Kong is from “here” when you don’t know where you are. When it comes to baseball, we frequently, incorrectly, assume that we don’t require a point of reference because our basic knowledge of the game has taught us to distinguish good from bad. Given a line like .295/.360/.500, we may feel that our almost a priori knowledge of the game is sufficient to tell us that this is a very good season, but that is not necessarily true, because the ground keeps shifting beneath us. The definition of “good” changes with each season, and it is only by knowing how far we are from some fixed point of reference that we can judge quality.

Consider again that .295/.360/.500 line. Coming from an American League shortstop of 2010, it would have been of MVP quality. From a first baseman of 1998, it would have been merely average. From a first baseman of 1930, it would have been downright poor. Because the bar keeps moving, we need something to which we can tether ourselves. For a long time, before sabermetrics, that something variously took the form of a .300 average, 30 home runs, 100 RBIs, or 20 wins. Baseball writers of the past century had mistaken figures with changing meanings (or, in the case of RBI, the wrong meaning, and with individual pitcher wins, little meaning at all) for values as fixed as an inch or a pound. A pound of lead and a pound of feathers may weigh the same, but a pound of .300 has a different value depending on when and where it was hit.

Once one is aware of the necessity of having a point of reference, its exact nature depends on how much you want to know. You could judge players against the league average, but that wouldn’t solve the problem of differing parks and positions. Further, judging players against the average can render too harsh a judgment of their skills: the population of ballplayers is large enough, and being average so difficult, that players can be below average and still have value. Setting a lower bar, a replacement-level bar (that is, Babe Ruth’s skinniest dead fat cat) is a measure that better conveys the ability of some players simply to show up regularly and play decently—at least it’s a cat, at least it’s dead; Nick Johnson is incapable of rising to that level. If his substitute is a below-average hitter, at least he’s better than zero, which is all that Johnson left in his wake when he hit the DL.
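For readers who think better in arithmetic than in cats, here is a minimal sketch of the idea. It takes the .295/.360/.500 line discussed above and measures it against two baselines in three made-up settings. Everything other than that batting line is an assumption for illustration: OPS is used as a crude stand-in for value, the context labels and baseline figures are hypothetical rather than real league data, and none of this is how WARP or VORP is actually calculated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Context:
    """A hypothetical run environment; the numbers are illustrative, not real league data."""
    label: str
    average_ops: float      # assumed league-average OPS at the position
    replacement_ops: float  # assumed OPS of freely available replacement talent


def ops(obp: float, slg: float) -> float:
    """On-base plus slugging, used here only as a rough proxy for offensive value."""
    return obp + slg


def margin(player_ops: float, baseline_ops: float) -> float:
    """Distance from a baseline in OPS points, rounded to three decimals."""
    return round(player_ops - baseline_ops, 3)


# The .295/.360/.500 line from the article.
PLAYER_OPS = ops(0.360, 0.500)

# Three hypothetical settings in which the same line could have been produced.
CONTEXTS = [
    Context("light-hitting position, low-offense year", 0.700, 0.620),
    Context("slugging position, high-offense year", 0.860, 0.780),
    Context("slugging position, extreme-offense year", 0.920, 0.840),
]


def report(player_ops: float, contexts: list[Context]) -> list[tuple[str, float, float]]:
    """Return (label, margin over average, margin over replacement) for each context."""
    return [
        (c.label, margin(player_ops, c.average_ops), margin(player_ops, c.replacement_ops))
        for c in contexts
    ]


if __name__ == "__main__":
    for label, over_avg, over_repl in report(PLAYER_OPS, CONTEXTS):
        print(f"{label}: {over_avg:+.3f} vs. average, {over_repl:+.3f} vs. replacement")
```

Under these assumed baselines the identical line is far above average in the first setting, exactly average in the second, and below average in the third, where it still clears the lower replacement bar. That last case is the one the average-only comparison misjudges: a player can trail the league and still be worth more than whoever would take his place.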

Alas, we are now far afield from the three bears and Ruth’s fictional felines. No doubt the Chass is somewhere grumbling “Ruth didn’t have any bloody cats.” If that’s the case, then we’ve run up against a wall of literalism so thick that naught will penetrate it. Yet, if even children can understand that Goldilocks had to experience too hot and too cold before she could grasp just right, then there is hope that even some of our most recalcitrant commentators might come to understand that a player’s value is only visible when observed in context. The replacement-level player is merely a yardstick, a more sensitive method of measurement than one that merely says that any number higher than 19 must be good, or any average higher than .299.

Those that cling to the notion that baseball—or anything—can be assessed using such blunt measures have ceased to think; they might as well be dead fat cats themselves.

I live in Hong Kong.

I don't even know where you are, but I can promise you that it's pretty far.
Well, the North Pole doesn't exist. It's mythical - there's no actual pole up there!

So clearly measuring longitude and latitude is just more poppycock from the mouths of sabermetricians, geographers and the like. Bah!
In that movie classic "Spies Like Us," Dan Aykroyd summed up all that is Murray Chass's MO: "We mock what we don't understand."

Chass may be the poster child for willful ignorance, but unfortunately, he's far from the only one. And generally speaking, the bigger the megaphone, the falser the prophet.

Brilliant stuff Steve. This article alone makes my subscription worthwhile.
I wish I had dreams like this :(
Are you allowed to say that? I thought all the Joe Sheehan fanboys had coopted the phrase "worth my subscription" and copyrighted it or something. Oh, my bad. They copyrighted "the only reason I subscribed" and "won't be renewing."

Nice piece, SG. I think all of BP has stepped it up this spring and I am enjoying the huge amount of content you all are providing. Well done!
With all due respect to Stephen, who's a wonderful writer and a fine baseball mind, I found this piece to be the exact opposite of brilliant. I found it to be long-winded drivel that does an awful job of making a simple point, a point that no one reading this site needs made.

Stephen, I love your work, but this was so far beneath your usual standards that I wonder whether Murray Chass, Bruce Jenkins or Dan Shaughnessy hijacked your computer and signed your name to this dreck.
I think Steve misses the point. The first cat was really the biggest because he was "feared" by the other cats as being the fattest.

If you have any doubts ask Dan Shaughnessy. He was actually there and doesn't need any weights or measures to tell him what he saw.
From Shaughnessy's piece this weekend on the NCAA Tournament. Apparently we eat pudding. Um, ok.

Here’s a little test: Walk out your door and try to find someone who can name five players in this year’s tournament. You won’t find anyone unless you live next door to Bob Ryan, my boss Joe Sullivan, or one of the pudding-eating, basement-dwelling blog boys who’d normally be tracking UZR or NFL fantasy teams.

You think it would do any good to e-mail this to Bruce Jenkins, too?
Just wanted to say that I've really been enjoying the new column and keep up the good work. This has been one of my favorites so far.
Following up OnDeck_Matt's spot-on comment, here's Jenkins from the SF Chronicle just a week or so ago:

"It won't be long before we get the first wave of nonsense from stat-crazed dunces claiming there's nothing to be learned from a batting average, won-loss record or RBI total. Listen, just go back to bed, OK? Strip down to those fourth-day undies, head downstairs (to "your mother's basement and your mother's computer," as Chipper Jones so aptly describes it) and churn out some more crap. For more than a century, .220 meant something. So did .278, .301, .350, an 18-4 record, or 118 RBIs. Now it all means nothing because a bunch of nonathletes are trying to reinvent the game?"

I generally like Jenkins's columns, but he really has a blind spot here.
Someone should ask Jenkins if he thinks Earl Weaver was a stat-crazed dunce.
Don't worry folks. Joe P. handled this one.
nice link. thanks
My word, that entire article described my childhood.
At this time of year, the same tale rewritten with VCU basketball players in place of the Yankees could be used on Basketball Prospectus.

So many commentators on ESPN and elsewhere said that teams like VCU, Penn State, and Richmond didn't deserve to be in the NCAA tournament because they didn't pass the "eyeball test", whatever that is. Certainly picking the last few at-large teams is subjective, but credit to the NCAA for using objective measures (RPI, SOS, etc.) to pick schools rather than the hilarious "eyeball index."
Problem is, RPI is a terrible objective stat. I would liken it to using RBI and Wins.
Then let's replace it with something better, but still objective.

I got sick and tired of Bilas and others saying the ADs and commissioners currently on the NCAA committee needed to be replaced by "basketball people", who would vote in teams on subjective measures such as the "eyeball test." That would guarantee that teams like VCU and Butler would be left out, while major conference schools like Colorado, Alabama, and VTech would take their place.
Wally Pipp sure is one unfortunate SOB, but I think we already knew that
Wally Pipp prefigured Kenny. Dude, there's nothing new!
People will fight hard to believe what they want to believe. Look at politics, climate change deniers, and religion.
Or people conceited enough to believe a) humans cause everything, and b) humans can fix anything.
Mr. Jenkins managed to knock down a straw man and spray ad hominem all over in just that short passage deholm1 quoted. Logical fallacies per paragraph, a stat we can all grasp the meaning of.

@jrbdmb: The problem is that we have no way of knowing how the committee used such data, or even whether they really did. They go in a locked room, make their decisions, and emerge with long mealy-mouthed explanations that explain nothing.
Wouldn't it be great if there were a place on that displayed exactly what that fictitious, zero-warp, league-average player looks like in terms of everyday baseball statistics?

If Chass' ilk could come to a web page and see .264-65-10-65-4 or 8W-4.20 ERA-6.50 K/9 actually defined as BP's league average player, perhaps it would put it in understandable context. As is, we have to take it on blind faith that such a fictional person exists (or doesn't, actually) but most of us don't know what his stats would be.

The only argument against the hypothetical player right now is that he's transient or under-defined. has a link up front "click here if this is your first time!" BP could use something like that. "Click Here to Meet our League Average Players!". Perhaps some dynamic website whiz could even have it update based on the newest available historical stats every season, and identify the closest equivalent real-life player.

That would be sweet.

Great article, Steven.
That first paragraph. Yes. If you really want to emasculate Chass, have a publicly available web page that says, "This is what a Replacement Player is for //."
The autoformatter ate the end of my last sentence.

I had written:

This is what a Replacement Player is (statswise) for {league} / {position}.

I think it's a great idea.
I like that. "The Baseball Prospectus 2011 MLB Replacement Level Team!" I almost used the term "All-Stars" but that seems more than a bit inappropriate here.
All this dead cats parable shows is that we need signposts in order to measure. I think everyone who reads this forum agrees with that, indeed, I'm not sure that even Murray Chass would disagree with that. The question he raises in this particular instance -- whether the concept of "replacement player" is a good signpost -- is not a trivial question.

I'm not going to defend Murray Chass. He once was a decent baseball reporter. He then turned into an accountant, who had a "story" because no one else was really covering that beat. But when the business of baseball became a story for many, and most of the information became easily accessible on the web, he became a bore and was put out to pasture. It is very possible he thinks BA is a more informative stat than OPS (which is absurd), but that is not the issue here.

The issue is that the ideal stat or stats would tell us -- and put a number on -- how much a player is helping his team win. There are numerous problems with doing so: the rest of the team has an impact on each player's contribution, as does the park in which he plays, and we have a hard time comparing defensive runs prevented with offensive runs created.

And, of course, one additional problem in finding the stat or stats that answer the question how much a player is helping his team win is the question "compared to what?" Replacement player is an attempt to address that problem. But Chass' criticism that it is a mythical concept is not without force: replacement player is an imaginary concept, not a fixed point from which we can be sure our measurements were accurate. (Indeed, "replacement player" is not even defined in the BP Glossary.) While replacement player tries to establish a fixed point from which we can measure the contributions of others, replacement player is itself relative: if the current 750 players in MLB all decided to play poker instead, the 750 players who replaced them would, of course, be on average worse than the CURRENT view of the value of replacement player, but the vast majority of them would, of necessity, be higher than what would become the NEW value of replacement player -- some of that "second" 750 would be doing substantial things to help their team win.

The problem I have with this article is that it brings us down to Chass' level. Unlike Chass, I have no basis to reject "replacement player", WARP or VORP, at least not until I or someone else comes up with something better. But it clearly isn't perfect, and it may not even be good, and it is far better for us to recognize the weaknesses and think about what, if anything, might be better, than to waste time proving something we already know.
This is a must-read column. Keep playing like a 479lb dead cat and in another 5-8 years, I will love you as much as I love CK.

My only issue is that I think a replacement player isn't mythical. The average performance that the current crop of non-prospect minor league players would produce if allowed to play is literally replacement level, right? The replacement baseline will naturally change depending on a host of factors, but essentially AAA organizational warriors are as bad as it gets.
We could have a lengthy discussion as to how bad a replacement level player should be. Theoretically, when a starter gets hurt, he is replaced by the top man on the bench who in turn is replaced by the best AAA player that team has who can fill his position. More often than ever teams have professional benches and call up that AAA player or a hot prospect at AA to replace an injured player - that is if the starter is hurt enough to go on the D.L.

It isn't likely that the 30 best players at each position have starting jobs. Sometimes, teams have an abundance of talent - such as having Jed Lowrie as a back-up utility player. Sometimes, teams will keep a guy in the minors to avoid having to go to arbitration a year earlier than otherwise.

My point is the abstract replacement level is hard to define, but if we look at the worst starting players in each league, we have a good idea of what we are looking for. The replacement-level player couldn't be more than slightly worse than that.

I know the A.L. better than the N.L., so I will suggest looking at:

Pitcher: Matt Harrison
Catcher: Jeff Mathis
firstbase: Matt LaPorta
secondbase: Christopher Getz
shortstop: Orlando Cabrera
thirdbase: Alberto Callaspo
fast OF: Juan Pierre
slow OF: Jeff Francoeur
dedicated DH: Jack Cust

There you go, Mr. Chass, the replacement level in flesh and blood.
So Babe Ruth likes dead cats and Murray Chass is an old curmudgeon? Funny if not exactly enlightening. Wally Pipp has worse luck than Dickens's Pip, though with lower expectations.
Just to play devil's advocate for a minute...

What was Wally Pipp's Value Over Replacement Player?

Was it high? He was one of the top hitters in the league.

Or was it low? When he got hurt, the Yankees plugged Lou Gehrig into his spot for the next 2,130 games and didn't miss him.

I think that is the point Murray Chass was trying to make... Pipp wasn't replaced with a hypothetical generic "replacement player", there was a specific guy waiting to take his place. A player might be more or less "valuable" to his team based on what backup options that team has.

The point is that not everyone has a Ryan Howard pushing aside a Jim Thome, so it's only fair to have a reference point that is equal across the board.
Measuring performance against a more or less arbitrary baseline is useful to understanding performance level. Not the only way to look at it, but one useful way. Specific cases where replacements turn out to be better than the player they replace do not affect that fact. Wally Pipp's value to the 1925 Yankees is a different question from Wally Pipp's context-free performance level. The hypothetical aspect of RP is actually crucial to exploring the latter.
Has Murray ever been to the hospital for any sort of surgery or treatment? The basis for that procedure is the same sort of statistical generalization that Chass rails against. He should be glad we all figured out how to do clinical analysis in medicine - and in baseball for that matter. That he can't or won't tells you all you need to know about his view of the sport: it's religious, not scientific.