Last week in this space, we talked about the emergency catcher. Most teams carry two real catchers on their roster, but every once in a while, like with position players pitching, we see a guy with little-to-no previous catching experience strapping on the shin guards and heading behind the plate. It turns out that when we look at the record of these pretend catchers, as a whole, they are worse than the worst defensive catchers in the game.
It got me wondering about the rest of the diamond. Sure, fake pitchers and catchers are the most fun (i.e., cringe-worthy) to watch, but what happens when someone gets the call to play second base or right field for the very first time? Ever. What happens then?
Warning! Gory Mathematical Details Ahead!
Like last week, I’m using data from 1993-2017. The upside is that we have a large data set from which to draw our sample, which is good because these sorts of things, by definition, don’t happen often. The downside is that the data aren’t super detailed, owing to the fact that they were partly collected in the 1990s. I’m defining an “emergency” player as one who actually played the position, but did so for fewer than 18 innings in a season. To make sure that they’re not guys who only got a cup of coffee but who were actual trained shortstops or center fielders, the innings logged at the new position had to represent less than 10 percent of their workload for the year.
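The two-part definition above can be written as a simple filter. This is only a sketch: the record layout and field names are hypothetical, and "workload" is assumed here to mean total defensive innings for the season, which the text does not spell out.

```python
from dataclasses import dataclass

MAX_INNINGS = 18.0  # "fewer than 18 innings in a season" at the position
MAX_SHARE = 0.10    # "less than 10 percent of their workload for the year"


@dataclass
class PositionSeason:
    # Hypothetical record: one player, one position, one season.
    player: str
    position: str
    innings_at_position: float
    total_innings: float  # assumed: all defensive innings that season


def is_emergency(row: PositionSeason) -> bool:
    """Return True if the player-season meets both emergency criteria."""
    if row.innings_at_position <= 0 or row.total_innings <= 0:
        return False  # never actually played the position (or no workload)
    share = row.innings_at_position / row.total_innings
    return row.innings_at_position < MAX_INNINGS and share < MAX_SHARE


# A regular third baseman who filled in at short for one game qualifies;
# a cup-of-coffee shortstop with 12 of his 40 innings there does not.
print(is_emergency(PositionSeason("A", "SS", 9.0, 900.0)))
print(is_emergency(PositionSeason("B", "SS", 12.0, 40.0)))
```

The second condition is what screens out the September call-up who is a trained shortstop but barely played.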
Shall we go for a little tour?
First Base
The most common time for a first baseman to touch the ball is when it is thrown to them on the “3” end of a 5-3 ground ball. Still, they do field some grounders that are hit to the right side. Because we’re using rather rough-hewn data, we’re going to estimate range for a first baseman on ground balls by looking at the number of grounders that were fielded by the first baseman versus those which were fielded by the right fielder. For some of those grounders, no living human being would have been able to stop the ball from going into the outfield, and some of them were probably the domain of the second baseman, but over a large enough sample a lot of that noise will wash out.
Regular first basemen had a success rate of 73.0 percent on this measure. “Fake” first basemen snagged 73.5 percent of our sample of ground balls. It seems that fake first basemen have a bit more range? Well, yes and no. First basemen aren’t often selected for their lithe feet. In fact, first base has long been considered the place you put the guy who can hit but can’t field. Our replacement, by definition, hails from another part of the diamond, where range is more likely to have played a bigger role in their assignment to that spot. They can take that range with them.
But what about receiving throws? Adam Dorhauer did an investigation of this very question in 2016, and found that our novice first basemen had the ball clank off their gloves one extra time out of a few hundred. Given that a first baseman receives so many throws over the course of a season, those extra errors can add up. I similarly found that on ground balls to one of the other three infielders where a throw was made, the first baseman made an error 0.1 percent of the time. The fake first basemen did so 0.2 percent of the time. Chances are, within a single game, the replacement first baseman would get it right, but over time value might bleed away.
But then I wondered whether errors were the correct metric. It’s easy to see the catch that the first baseman didn’t make, but what about the catch that he could have made better? Many plays aren’t that close and any minimally competent major leaguer could make them. The money is made on the marginal plays, and sometimes mistakes are made or not made in ways that aren’t entirely obvious. For example, on a ground ball in the hole, the shortstop may range over, grab the ball, and make a throw that will require a good scoop. A trained first baseman is not only going to be familiar with how to make that scoop, but also tricks that might make all the difference on a proverbial “bang-bang” play. Our fake first baseman might not know those tricks and the ball might take an extra tenth of a second to find the pocket of his glove. No official scorer would ever give him an error, but a more experienced fielder would have done it better and there wouldn’t be a runner at first.
I looked at ground balls hit to an infielder, and 3.7 percent of them ended up as hits or reached-on-errors when a regular first baseman was patrolling the bag. When a fake first baseman was over there, there was a 4.4 percent reach rate. Over the course of 162 games, that’s about 8.25 ground balls that don’t get turned into outs, which is worth roughly six runs. Our first baseman does gain a little bit back with his slightly superior range, but we can see that the average emergency first baseman has been about five runs below average on ground balls.
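The arithmetic in that paragraph can be checked with a few lines. This is a sketch under two assumptions: the number of infield ground-ball chances per 162 games is not stated, so it is backed out from the 8.25 figure, and the run value of 0.75 per out turned into a single is the one quoted later in the piece.

```python
RUNS_PER_OUT_TO_SINGLE = 0.75  # run value quoted later in the article


def extra_non_outs(regular_rate, emergency_rate, chances):
    """Extra batters reaching base over `chances` ground balls."""
    return (emergency_rate - regular_rate) * chances


def implied_chances(regular_rate, emergency_rate, extra_plays):
    """Back out the number of chances implied by a stated play difference."""
    return extra_plays / (emergency_rate - regular_rate)


regular_reach = 0.037    # reach rate with a regular first baseman
emergency_reach = 0.044  # reach rate with a fake first baseman

# Assumption: chances are inferred, not taken from the source data.
chances = implied_chances(regular_reach, emergency_reach, 8.25)
runs = 8.25 * RUNS_PER_OUT_TO_SINGLE

print(f"implied ground-ball chances per 162 games: {chances:.0f}")
print(f"run cost of 8.25 extra non-outs: {runs:.2f}")
```

The implied workload comes to a little under 1,200 ground balls, and 8.25 plays at three-quarters of a run each is about 6.2 runs, which matches the "roughly six runs" in the text.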
I also looked at how often first basemen catch pop-ups headed their way. For the regulars, 98.8 percent of those pop-ups ended up as outs. For the replacements, only 97.4 percent did. Sure, most of the time pop-ups on the infield get caught, but what about those little ducksnorts that might just have enough on them to clear the infield? Over the course of 162 games, a fake first baseman would drop one extra pop-up. (Worth noting: The pop-up sample size was small enough that it’s worth hemming and hawing over whether this is a reliable estimate.)
All told, an emergency first baseman is probably 5-6 runs worse than league average. Or at least, he has been over the years. (We’ll talk more about this later.)
Second Base, Shortstop, and Third Base
Turning to the other three infielders, we look at their main job, which is to track down ground balls. For second basemen, we take the somewhat unfair view that a ground ball that reaches either the right fielder or the center fielder is their fault. It’s not always, but it’s the best we’re going to do with the available evidence, and again, over a large enough sample, the noise washes out. For the shortstop, we’ll debit him if the left fielder or the center fielder receives the ground ball. For the third baseman, it’s only his fault if it gets through to left field.
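The blame rules above translate into a small lookup table. The following is a sketch, not the actual code behind the numbers: it assumes each ground ball is reduced to the position code of the first fielder to handle it, and the function and variable names are hypothetical.

```python
from collections import Counter

# Outfield spots charged to each infielder, per the rules described above.
# The first-base rule (right field only) comes from the first base section.
CHARGED_TO = {
    "1B": {"RF"},
    "2B": {"RF", "CF"},
    "SS": {"LF", "CF"},
    "3B": {"LF"},
}


def range_rate(fielded_by, position):
    """Share of 'his' ground balls that the infielder actually fielded.

    fielded_by: iterable of position codes, one per ground ball, naming
    the first fielder to handle the ball (assumed input format).
    """
    counts = Counter(fielded_by)
    made = counts[position]
    missed = sum(counts[of] for of in CHARGED_TO[position])
    total = made + missed
    return made / total if total else None


sample = ["SS", "SS", "LF", "CF", "2B", "RF", "SS"]
print(range_rate(sample, "SS"))  # 3 fielded, 2 reached LF/CF
print(range_rate(sample, "2B"))  # 1 fielded, 2 reached RF/CF
```

Note that a grounder reaching center field is charged to both the second baseman and the shortstop, which is the "somewhat unfair" part of the measure.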
How likely is it that a ground ball will get through the infield when a regular vs. an emergency infielder is on the case?
| Position | Regular “range” rate | Emergency “range” rate | Difference |
Fake shortstops are much, much worse than real ones, and the magnitude by which they are worse dwarfs that of the differences between second and third base fill-ins and their regularly scheduled fielders. Over the course of 162 games, these differences would add up to about 3.6 extra plays not made by a second baseman, 4.7 extra plays not made by a third baseman, and a whopping 19 plays at shortstop. That’s just extra ground balls that get through the infield.
Once the ball is in the infielder’s glove, how often does that result in an out being made?
| Position | Regular “throw out” rate | Emergency “throw out” rate | Difference |
Prorated over 162 games—I’ve been using the 2017 numbers for the frequencies of how often an “average” team sees their fielders called on to make one of these plays—it’s 4.6 additional non-plays for a second baseman, 13.4 plays for the third baseman, and 16.0 plays for the shortstop.
We’re now up to 35 plays on ground balls that a regular shortstop would have made that an emergency one would not. Even if we assume that those all end up as singles, turning an out into a single is worth three-quarters of a run, meaning that we’re already 26 runs below average. The entire defensive “spectrum” (and we’ll talk about those scare quotes in a bit) is assumed to be about 20 runs wide from first base to shortstop. The calculations that form the basis of WAR(P) assume that a league-average first baseman would be roughly 20 runs below an average shortstop, if given a chance, over the course of a season.
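The same tally can be run for all three positions using only the prorated play counts quoted above. This is a sketch covering ground balls only (range plus throws), with every missed play valued as an out turned into a single.

```python
RUNS_PER_PLAY = 0.75  # out turned into a single, per the text

# Extra plays not made per 162 games, as quoted in the article.
range_plays = {"2B": 3.6, "3B": 4.7, "SS": 19.0}
throw_plays = {"2B": 4.6, "3B": 13.4, "SS": 16.0}


def ground_ball_penalty(position):
    """Return (extra plays not made, runs below average) on ground balls."""
    plays = range_plays[position] + throw_plays[position]
    return plays, plays * RUNS_PER_PLAY


for pos in ("2B", "3B", "SS"):
    plays, runs = ground_ball_penalty(pos)
    print(f"{pos}: {plays:.1f} plays, {runs:.1f} runs below average")
```

For shortstop this reproduces the 35 plays and roughly 26 runs cited in the paragraph; the second and third base totals follow from the same two inputs.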
The reality is that this emergency shortstop sample isn’t populated with below-average first basemen. It’s mostly guys whose primary position was second or third base. If a manager needs an emergency shortstop, he’s probably going to pick the guy who maybe at one point could have been a shortstop, and maybe even played there at some point in his life, but eventually had to move off the position. Of course, he’s not a great fit for the spot, but he’s probably the best available among the bad ideas. Both second and third base are separated from shortstop on the defensive “spectrum” by five runs. Our analysis here paints a much darker picture. You can’t just slide over, take a small penalty, and be fine.
What’s frustrating is that the effect sizes we’re talking about in terms of plays made or not made are small, and not always visible to the naked eye. We’re talking about 1-2 percent of plays where the regular gets it but the fill-in doesn’t. If a team really did play a guy out of position, fans would see him chasing after ground balls and getting to most of them. And hey, he missed a few, but no one gets to all of them. But in the same way that the difference between a .260 hitter and a .300 hitter is one extra hit a week, those extra balls that get through add up.
There are other places where our fake infielders are found wanting. As with first basemen, there’s a (very) small pop-up problem for shortstops and third basemen. (Inexplicably, fake second basemen are a little better at catching pop-ups.) And in the middle infield, our emergency keystone players aren’t as good at getting the second out in a double play situation in which we have evidence that they’ve already received the first throw. Pretend second basemen were particularly bad at it (65.3 percent “pivot” rate for real second basemen vs. 56.1 percent for the fill-ins).
All told, we’re talking about estimates of 15-30 runs below average over a full season for our emergency replacements at second, short, and third. The kicker is that the majority of all three samples are drawn from players who primarily play the other two spots. It’s mostly second basemen and shortstops who are having so much trouble manning the hot corner, despite the fact that a second baseman is (theoretically) just as good as a third baseman and a shortstop is (theoretically) better. Far from it, everybody got worse (and much worse!) when put into a new place.
That’s a bit of a problem.
Left Field and Right Field
I’ve previously shown that left field has become something of a “slush” spot for teams. The regular starting left fielder has become something of a relic and players from all over the diamond have taken their turns patrolling the area behind third base. What happens when we have an emergency left fielder? On fly balls to left field (that don’t fly over the wall), regular left fielders catch 88.0 percent of what’s hit to them. The left fielder who just found his way to the space? He catches 84.2 percent.
There’s a split that’s worth looking at within these data. If the guy patrolling left field is normally a center or right fielder, he’s got an 85.8 percent chance of catching the ball. If he’s from the infield, his success rate drops to 82.9 percent. It looks like familiarity with playing the outfield is important, but not enough to completely make it work in left field. A spare outfielder, patrolling left field, would miss an extra five balls over the course of a season. For a utility infielder slumming it in left field, that number jumps to 12 balls. We see the same pattern with line drives to left field. Not very many are caught, but regular left fielders get to more of them (22.1 percent) than even regular center and right fielders (21.1 percent). Infielders playing left are even worse (18.1 percent).
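As a rough check on the fly-ball split, the catch rates can be turned back into missed balls. The number of fly-ball chances a left fielder sees in a season is not given in the text, so the figure of 230 below is purely an assumption chosen for illustration.

```python
ASSUMED_FLY_BALL_CHANCES = 230  # assumption: not stated in the article

REGULAR_LF = 0.880       # catch rate, regular left fielders
SPARE_OUTFIELDER = 0.858  # catch rate, CF/RF regulars filling in at LF
INFIELDER = 0.829         # catch rate, infielders filling in at LF


def extra_misses(fill_in_rate, regular_rate=REGULAR_LF,
                 chances=ASSUMED_FLY_BALL_CHANCES):
    """Extra fly balls not caught over a season relative to a regular."""
    return (regular_rate - fill_in_rate) * chances


print(round(extra_misses(SPARE_OUTFIELDER)))  # spare outfielder
print(round(extra_misses(INFIELDER)))         # utility infielder
```

Under that assumed workload, the 2.2-point and 5.1-point gaps round to the five and 12 extra missed balls quoted above.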
I also looked at situations in which an outfielder’s arm might come into play, like a potential sacrifice fly or a base hit where a runner has a chance to steal an “extra” base (first to third on a single, for example). If the left fielder managed to throw the runner out or keep him from even trying, it was a success for him. It turns out that our emergency left fielders were actually better than the regular ones, by a little bit.
Left field, supposedly one of the “easiest” positions to play, turns out to be incredibly hard to play when staffed by someone with no experience there. No, the guy out there isn’t going to be a complete fool, but he’s going to let a few extra runs in on the margins. That matters. I could re-write the same paragraphs for right field, but it would be the same numbers and the same basic findings. There is a “new guy” penalty in right field, as measured by success on fly balls, line drives, and “arm plays,” but it is less of a penalty for guys with outfield experience than guys who are primarily infielders.
All told, if the new guy is gliding in from another outfield position, he’s performed at a rate that’s 6-7 runs below average over the course of a season. If he’s really an infielder, the penalty is about double that.
Center Field
A funny thing happens in center field. As we might expect, emergency center fielders haven’t been quite as good as regulars at catching fly balls hit to center (87.0 percent vs. 87.9 percent for the regulars), but the stopgaps have actually performed better at stopping liners from getting into the gap (25.9 percent vs. 24.9 percent). Even more interesting is that when we look at left and right fielders who do a guest appearance in center, they are actually worse than the infielders who take a turn out there. Weird.
Emergency center fielders have—and this is not a misprint—performed at roughly the same level defensively as regular center fielders over the years. That could be an artifact of our sample. These might be fourth outfielders or corner guys who are fully capable of playing center, but never get to do it because there’s someone better on the depth chart.
Maybe this chart helps to explain it. It’s the percentage of plate appearances in the sample (1993-2017) in which a team was operating with an “emergency” fill-in at the position.
Managers make it a point not to find themselves in a place where they have to use an emergency fill-in with their up-the-middle spots. And while sometimes injuries happen, managers may be a little more careful in how much they “over-manage” if it means being exposed in center field. And maybe they only do it if they have a corner guy who can move over and do the job. Or maybe center field isn’t quite as hard as it’s advertised to be.
Re-Imagining (Mostly Getting Rid Of) the Defensive Spectrum
Let’s take stock of the findings.
- Players playing out of position at first base have performed at roughly 5-6 runs below average. That’s not a heavy penalty, but it’s something to think about before just sticking someone there and saying, “He’ll figure it out.”
- In the outfield, left and right field are not “the same” and outfielders who shift between the two without previous experience tend to play with a quality that prorates to 6-7 runs below average over the course of a season. However, corner outfielders playing center field tend to do OK.
- Infielders make terrible outfielders.
- Second, short, and third all see their new arrivals playing at abysmally bad levels, even though most of the sample for each spot is players from the other two positions. Infielders seem to be more specialized than we might have otherwise thought. It also means that utility infielders who play all three decently (or guys who can competently handle both the infield and the outfield) might have a skill that we don’t value fully.
We need to clean up one other methodological issue: It’s one thing to say that players have performed poorly at their new positions, but perhaps they were flawed players coming into the exercise? I actually checked using my own home-brewed fielding numbers (mostly because they were handy). This chart shows, for each position, the average fielding “runs saved” (pro-rated this time to 150 games/1,350 innings) of those playing there on an emergency basis compared to those playing their primary positions.
(It’s much clearer like this: Emergency shortstops were mostly guys who primarily played second and third base. When those emergency shortstops were playing that primary spot, they were, on average, just below average. Even clearer: Emergency shortstops are mostly defensively average second and third basemen.)
| Position | Primary Position Fielding |
We now have something that should not theoretically be happening. The players who moonlight at first base are (slightly) above-average fielders at their regular positions. By definition, all of those players are parachuting in from positions that are further up the defensive spectrum. In theory, their performance at first base should be well above average. Instead, it’s below average.
We also see that emergency 2B/3B/SS tend to be imports from other infield spots and tend to be fairly close to average fielders in those spots. And yet, we see a group of average infielders shifting to a new spot and putting up truly awful fielding numbers. The positional adjustment chart suggests they should be fine. They are not.
In the outfield, for the most part, below-average center and right fielders produce a slightly worse performance in left (and below-average center and left fielders produce a slightly worse performance in right). Perhaps the “new guy” penalty isn’t quite so severe in the outfield … if you’re already an outfielder. But infielders—theoretically higher up on the defensive spectrum—are pretty much butchers in the outfield.
Something has gone wrong.
We need to talk about the fundamental question of WAR in light of these findings (and the ones from last week about catchers). WAR is supposed to answer the question: “If Smith had disappeared prior to the season, what would we have gotten from a replacement-level player (bench/waivers/minors) who plays his same position?” In theory, that replacement level could be position specific. Center fielders could be compared only to those who actually played center field. Instead, players are compared to the entire range of replacement-level position players for their hitting (and running) ability, and their “positional value” is adjusted upward and downward based on the “defensive spectrum.”
Yes, you’re being compared to backup first basemen, but we’ll give you some extra credit based on the fact that center field is harder to play. If we have a solid estimate of how different the two positions are, using a positional adjustment makes sense. This approach has a couple of advantages. For one, if replacement level were position specific, then the smaller sample size of backups from which to draw might lead to year-to-year and position-to-position volatility in how much to expect from a bench guy. A broad range of replacement-level candidates smooths that out a bit.
It also sidesteps some rather sticky questions that would have to be asked if replacement level were position specific. Should we take in anyone who played center field? What about the guy everyone knows would be just fine in center, but has never been needed there? What about the guy who would never have played center for any other team, but hey, it was a rough two weeks and we needed a warm body out there? Do we weight everyone in that group evenly? Do we weight by how often they played center field? What’s the minimum number of innings someone has to play in center field before they become part of the sample? With the current approach, no one has to answer those questions.
But the hidden assumption in the WAR system, as it is currently constructed, is that players are infinitely modular and that we can compare a third baseman to a shortstop because moving back and forth between the two positions would have a penalty (or a bonus) of X runs. The data suggest that this is a seriously flawed assumption.
The positional adjustments are based largely on the study of utility players. The theory has been that if you want to know how much more difficult it is to play shortstop compared to second base, look at players who amassed a sufficiently large sample at both positions to see how they score on some reasonably well-developed defensive metric. I’ve critiqued before how these comparisons tend to be based on very selective samples. Guys who play both left field and first base tend to be hide-a-players with defensive deficiencies. Guys who play both left field and center field tend to be above-average left fielders who can “handle” center. Yet, we chain those two samples together to get an adjustment number from first base to center field.
And even utility players appear to be a special class of player. Someone thought that it was a good idea to give them a reasonably large amount of playing time at both shortstop and second base. If a player only ever plays second, and never short, is his team indirectly trying to tell us something? Maybe. What we find in these analyses is that players are not infinitely modular. If you put a guy at a position he barely ever plays, he’s going to be really bad. You can’t just put a guy at third base and expect him to get by on sheer athleticism. There seems to be a lot of value in having practiced at a spot for years, even if you aren’t the greatest athlete. Not just anyone can be a utility player. It seems that this is a skill that one must aspire to and achieve.
Consider a team that suddenly finds itself with an opening on its roster at shortstop. They have, in their minor-league system, a player who profiles as a blue-chip hitter, but a corner outfielder. They also have a 33-year-old “depth guy” whose main talents include hitting .230, but also playing a non-embarrassing shortstop. Which one will get the call-up? It’s almost always the guy who’s played short, even if you could make the case that the blue-chip hitter would out-hit the old vet at a rate above the positional penalty. Teams rarely throw guys into positions they’ve never played. And from these findings, we see why. When they do, it’s usually either a special case with some combination of a particularly strong player, a particularly weak set of other options, or a player with at least some experience at the “new” position.
So, we confront a metric in WAR that assumes infinite modularity with the reality that players aren’t really all that modular. Even shortstops, who should “play up” as they move down the defensive spectrum, seem to have trouble when they get to their new address. When we think about who would actually replace (the “R” in WAR) an injured shortstop or third baseman or left fielder, we need to be a bit more realistic about the candidates. Of course, we could look at other players who have played in the spot, but there’s the temptation to say, “Well, we could look at guys who played right field, because right field is basically the same as left field.” It turns out, it’s not. At least, not initially.
The only guy who’s going to replace a catcher is a catcher. The only guy who’s going to replace a shortstop is a shortstop. The same seems to go for second and third basemen. It’s not that it could never happen, it’s that the circumstances where it would make sense are rare. So, effectively, it’s not really an option. Even in the “easy” corner positions, you might shift an outfielder around or stick a guy at first base, but there is going to be a defensive penalty to be paid, even if the replacement is coming from higher up on the “defensive spectrum.”
It’s possible that I’m being too hard on the new guy at a position. I did pick guys who played very few innings at a spot as my sample. Perhaps if they really did stick around at a position over the course of a year, they’d get less bad. Maybe after 10 or 20 or 50 games in the spot, they’d be fine, or at least in line with what is now known as the positional adjustment chart. But that’s a very big “maybe,” and how many teams can afford to spend 20 games with a guy who’s an absolute sinkhole defensively because he’s learning on the job? That initial transaction friction is strong and probably prohibitive within the context of an actual season.
I think that we need to ask whether WAR is really reflecting the reality of baseball, and this isn’t an academic exercise. If catchers are only capable of replacing catchers, then the bar for replacement level on offense just got a lot lower for a catcher. Against that lower bar, good starting catchers might have more value than we initially imagined. Maybe they’re undervalued.
The main lesson, though, is that sabermetrics has never really questioned the word “spectrum” when it comes to defensive abilities. When Bill James introduced the concept, it was to point out that we expect less of our shortstops offensively than our first basemen offensively, and that this was a commentary on the relative difficulty of the positions. Taken in isolation, this is a good point. But the word spectrum seems to have taken on a life of its own. It denotes, perhaps even subconsciously, baseball positions as existing on some sort of uni-dimensional sliding scale, as if they were all the same basic thing, just with differing difficulty levels.
The data suggest that they are not. It appears that each position has its own set of skills, and to play that position at an acceptable level requires at least some experience. Each position is its own box and the boxes are a lot less similar to each other than we might have thought. It’s possible to learn how to go between boxes, but that appears to be a skill unto itself. We need metrics that reflect that reality.