Manufactured Runs: Solving the Mays Problem

September 8, 2010

So, we’ve been talking about revising the metrics we use here at Baseball Prospectus—I’ve described a fielding metric and a complementary batting metric. So now let’s go about discussing some of the ways they fit together.

One of the big things we need to do when we build all-encompassing metrics is adjust for position. That’s because of the way we construct our metrics—we have offensive metrics that compare players to all other players, but defensive metrics that compare players only to other players at that position.

That makes it difficult to compare two players who play vastly different positions. Baseball fans of course know this intuitively—if you have a first baseman and a shortstop with the same batting line, the shortstop is likely the better player.

Given the nature of the problem, the most straightforward solution would seem to be to compare players’ batting to their peers at the same position. And for the most part, that approach works well.
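To make that concrete, here is a minimal sketch of comparing a hitter to his positional peers. The positional averages and the function name are made up for illustration; they are assumptions, not real league figures or the actual Baseball Prospectus implementation.

```python
# Hypothetical positional averages in runs per plate appearance (RPA).
# These values are illustrative only, not real league data.
POSITION_AVG_RPA = {"1B": 0.130, "SS": 0.105}


def batting_runs_above_position(player_rpa: float, pa: int, position: str) -> float:
    """Runs a hitter produced beyond an average hitter at his own position."""
    return (player_rpa - POSITION_AVG_RPA[position]) * pa


# Same batting line, two different positions.
first_baseman = batting_runs_above_position(0.125, 600, "1B")
shortstop = batting_runs_above_position(0.125, 600, "SS")
print(first_baseman, shortstop)
```

With identical batting lines, the shortstop comes out ahead simply because the bar at his position is lower, which is the intuition described above.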

The Mays Problem

But you run into a problem in some extreme cases. Willie Mays is one of those cases.

See, the thing about Mays is, he could have walked into Cooperstown wearing a first baseman’s glove. He was simply an astonishing hitter. It’s just that he could play an excellent defensive center field as well. The problem with analyzing Willie Mays is that he was just so superlative that he moves the baseline to which he’s being compared.

And I call it the Willie Mays problem, but it wasn’t Willie Mays alone causing it. Mays had a lot of help. Looking at the top players in games played in center field in the 1950s:

Name	Games
Richie Ashburn	904
Duke Snider	887
Larry Doby	836
Jim Busby	746
Johnny Groth	729
Mickey Mantle	528
Willie Mays	458
Gus Bell	452
Bill Bruton	440
Sam Jethroe	418

(Bold indicates players in the Hall of Fame.)

That’s a really impressive list of baseball players. You have Mays and Mantle of course—Mantle, like Mays, is a guy whose bat was impressive for any position. Ashburn, Snider, and Doby were also incredible ballplayers, though, and prolific during the '50s.

See, if we use raw positional averages, we end up asserting that the average center fielder is as good as the average corner outfielder. In the '50s, that wasn’t true—the average center fielder was a better defensive outfielder than the men in the corners, but he was also a better hitter than them.

So in those cases, using positional averages for offense falls short. What else are we to do? We can’t simply increase our sample size to wash out the noise—as I said, even in a 10-year sample, we don’t see it wash out.

Adjusting Defense

What Tom Tango has proposed, and what has been adopted in many of the prevalent Wins Above Replacement metrics outside of Baseball Prospectus, is a set of adjustments based on comparing the fielding stats of players who play multiple positions.

This is going to do OK on the Mays problem—at least the one actually involving Mays—but I feel that it introduces quite a few problems of its own along the way:

It is sensitive to the starting assumptions used. And some of these assumptions, like “How good an infielder would Melvin Mora be if he were left-handed?” are impossible to test empirically.
There are selective sampling biases in position switcher data. Teams do not move all players between positions. They move players based upon what positions they’re suited to, and if they aren’t suited to any, they go to the minors and find someone else. Our population of position switchers is biased.
It does not consider defensive skills outside of fielding batted balls. Outfield throwing arms, infielders turning the double play, first basemen scooping throws: these are all skills that add value. And to tie in with the second point, the players who switch readily between second base and third base aren’t randomly drawn from the population of all third basemen; they’re the third basemen that teams think can handle the double play pivot.
It ignores the impact a player’s fielding position can have on his hitting. Certain positions call for certain body types over another, for instance. Does that affect a player’s offensive output?

In other words, what you are doing is analyzing a much smaller (and biased) population with much cruder analytical tools instead of analyzing a very large sample, all players, with very sharp analytical tools, like modern offensive metrics.

The larger problem you come to is that the distribution of defensive talent shifts over time. Comparing position switchers is a very crude way to track that—you need a lot of years of data to do that sort of analysis and you need to manually intervene in some of the analysis yourself. So it’s very hard to see where those shifts are occurring and include those in the positional baselines.

OK, but do those problems present themselves in such a way as to cause problems with our comparisons of players? I think they do, and it’s along the boundary between second and third base and between left and right field.

For second and third base, we have over 50 years of data that says that third basemen practically always outhit second basemen. And yet looking at the position switchers in terms of fielding data, what you see is they’re basically even. What are the possibilities here?

Well, the first is that there is always a Willie Mays problem at third base—always a cluster of absurdly talented Hall of Famers at the hot corner biasing our evaluation. That seems rather unlikely, given that there are no players in common between, say, the '50s and the '90s, and yet we always see the same pattern.

The next possibility is that baseball teams are just doing a poor job of allocating talent, and that they are needlessly diluting the second-base talent pool by keeping a greater portion of the good players at third base. I don’t see any hard evidence for that contention, and it doesn’t seem to be particularly reasonable.

The third possibility is that we’re simply missing something—that the model is failing to represent the underlying reality. To be perfectly blunt, I think the responsible way to do sabermetrics is to be very careful in asserting that it is our model, not reality, that is correct when the two are in conflict. And the way actual baseball teams behave, it seems like the defensive responsibilities at second are harder, and therefore teams are more likely to put better defenders there than at third.

And again you see the same thing with the corner outfield spots—although interestingly enough in the late '70s you see a shift, where before then the left fielders were generally the better hitters and after that the right fielders generally were.

So can we solve the Mays problem while still using offensive adjustments?

Outliers

At first blush, the Mays problem really looks like a simple (and common) problem in determining the average of a population—outliers.

The arithmetic mean, the most common form of average, is very sensitive to outliers. There is a lot of good existing research on how to use more robust measures of central tendency—the median, truncated means, log transformations, and so on.

Imagine my frustration when I discovered that none of them were effective.

The problem? Willie Mays (and the superlative players) aren’t the only outliers. They’re not even the biggest outliers.

What you tend to find is that the biggest outliers at a position are typically the worst, not the best, players. That’s almost entirely a function of the data set: when you’re looking at your very sub-marginal players, what you’re not seeing are the guys just like them who are playing in the high minors.

And so if you’re trying to reduce the influence of outliers, you end up curtailing the effects of the below-average players as much as those of the exceptional players, and it washes out. (Actually, you tend to raise the positional averages more than you lower them.)
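A small sketch shows why the standard robust measures backfire here. The ten RPA values are invented for illustration, and for simplicity the example ignores plate-appearance weighting (both are assumptions, not part of the original analysis).

```python
from statistics import mean, median


def trimmed_mean(values, trim_each_side=1):
    """Mean after dropping the lowest and highest `trim_each_side` values."""
    ordered = sorted(values)
    kept = ordered[trim_each_side:len(ordered) - trim_each_side]
    return mean(kept)


# Made-up RPA values for ten players at one position: one star (0.20),
# but an even more extreme sub-marginal player (0.00) at the other end.
rpa = [0.00, 0.05, 0.10, 0.11, 0.12, 0.12, 0.13, 0.13, 0.14, 0.20]

raw = mean(rpa)
trimmed = trimmed_mean(rpa)
mid = median(rpa)
print(raw, trimmed, mid)
```

Because the low tail is longer than the high tail in this toy sample, both the trimmed mean and the median land above the raw mean: the baseline goes up, not down.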

What I’ve done is split the sample in two: above and below average. Then I looked at the distance between the means of the two halves. If there is very little skew, the halfway point between those two is going to be the same average I used to split the dataset. But that’s rarely what we see. Instead we see one half with a larger distance from the average and fewer representative plate appearances. The weighted difference of the two halves will give us the initial average, but the unweighted difference is an estimate of how the skew is tilting the average.

So I used the unweighted difference to “shift” the average, and then applied an average amount of skew-related difference back to each position so that the positional averages will add back up to the total league average.
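The procedure above can be sketched roughly as follows. This is one reading of the description, not the actual implementation: the function names and sample data are hypothetical, and the sign and size of the shift (moving each average away from its longer tail by the full gap) are assumptions, since the text does not spell them out.

```python
from typing import Dict, List, Tuple

Player = Tuple[float, int]  # (runs per PA, plate appearances)


def weighted_mean(players: List[Player]) -> float:
    """PA-weighted average of runs per PA."""
    total_pa = sum(pa for _, pa in players)
    return sum(rate * pa for rate, pa in players) / total_pa


def skew_estimate(players: List[Player]) -> float:
    """Unweighted midpoint of the two half-means minus the PA-weighted mean."""
    avg = weighted_mean(players)
    above = [p for p in players if p[0] >= avg]
    below = [p for p in players if p[0] < avg]
    if not above or not below:
        return 0.0  # nothing to compare against
    midpoint = (weighted_mean(above) + weighted_mean(below)) / 2.0
    return midpoint - avg


def adjusted_positional_averages(
    by_position: Dict[str, List[Player]]
) -> Dict[str, float]:
    pa = {pos: sum(n for _, n in pl) for pos, pl in by_position.items()}
    total_pa = sum(pa.values())
    raw = {pos: weighted_mean(pl) for pos, pl in by_position.items()}
    # ASSUMPTION: shift each average away from its longer tail by the full gap.
    shifted = {pos: raw[pos] - skew_estimate(pl) for pos, pl in by_position.items()}
    # Add the PA-weighted average shift back to every position so the
    # positional averages still add up to the league average.
    league = sum(raw[pos] * pa[pos] for pos in raw) / total_pa
    shifted_league = sum(shifted[pos] * pa[pos] for pos in shifted) / total_pa
    correction = league - shifted_league
    return {pos: shifted[pos] + correction for pos in shifted}


# Made-up sample: center field has one star hitter, left field is symmetric.
sample = {
    "CF": [(0.20, 100), (0.12, 300), (0.10, 300)],
    "LF": [(0.14, 200), (0.10, 200)],
}
print(adjusted_positional_averages(sample))
```

In this toy sample the center-field baseline drops relative to its raw average, while the PA-weighted mean of the adjusted positional averages still matches the league average.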

This process is, as one might imagine, rather unstable over a single season. But over a period of nine years it stabilizes quite nicely. (I chose nine because it let me focus on the four seasons before and after the season of interest, as well as the season itself.) That isn’t to say that the picture is entirely clean. Looking at corrected runs per plate appearance by position over the years:

Adjusted positional average RPA by season

You still do see a lot of shifts (little and big) over the years. Some of them may in fact be just noise. Others may be teams shifting talent around as conditions change. I mean, honestly, I wish it was cleaner, I do. But baseball analysis can get messy sometimes, and I think this is one of those times. Best to acknowledge the mess, rather than trying to put some throw pillows over it and act like it’s not there.
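For reference, the nine-year window described earlier (four seasons on either side of the season of interest) could be pooled along these lines. The function names and the shape of the `by_season` mapping are hypothetical, chosen only for this sketch.

```python
from typing import Dict, List


def centered_window(season: int, half_width: int = 4) -> List[int]:
    """Seasons from `half_width` before to `half_width` after, inclusive."""
    return list(range(season - half_width, season + half_width + 1))


def pool_seasons(by_season: Dict[int, list], season: int, half_width: int = 4) -> list:
    """Concatenate player records for every season in the window that has data."""
    pooled = []
    for year in centered_window(season, half_width):
        pooled.extend(by_season.get(year, []))
    return pooled
```

Seasons near the start or end of the data simply contribute fewer years; whether those edge seasons need separate handling is left open here.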

Colin Wyers

JinAZReds

9/08

Nice article. Position adjustments are a hairy issue, but they're so critical to everything.

...

Do you think the "Mays Problem" as you describe it is really an issue of outliers? Or is it just an overall talent gap between positions? Teams often start most players out of high school and college at the far left of the defensive spectrum (SS if righty, CF if lefty) and then move them to the right when you decide they can't play that fielding position adequately.

In other words, doesn't your method still assume, in the end, that total position talent (offense + fielding) is constant across positions? I've always felt that assumption was the biggest problem with offense-based position adjustments, because I don't think that's likely to be the case. 2B, in particular, seems to be a position to which players are moved to when they can't field well enough to be a SS, but you're also not a particularly good hitter. I'll accept that they should be better fielders than 3B's (skills should be different, of course), but I do tend to think that 2B is a below-average position in terms of talent.

Here are data from Tango's fan scouting report last year. No position-specific weights, just overall averages of skills (rated 1-5) across positions:

CF 3.7
SS 3.6
2B 3.4
3B 3.3
RF 3.3
C 3.0 (weird position, though, so not apples to apples)
LF 3.0
1B 2.98

This seems about right to me in order and size of gap between players, maybe with the exception of CF's over SS. I wonder if this kind of data, though, is the better solution to all of the various issues you've raised with position switching data than going back to offense-based adjustments. Obviously it only applies to modern baseball, but it's a start.

Cheers,
Justin

cwyers

9/08

"2B, in particular, seems to be a position to which players are moved to when they can't field well enough to be a SS, but you're also not a particularly good hitter. I'll accept that they should be better fielders than 3B's (skills should be different, of course), but I do tend to think that 2B is a below-average position in terms of talent."

Okay, but you could say that about most positions - guys who can't field well enough to play short move to 2B, 3B or CF, depending on their other talents. We know that the ones that move to 2B tend to hit less well than the ones that move to 3B or CF.

What we're left wondering is why? Is it because teams have just decided to put the worst of the failed shortstops at 2B? I don't think so - it makes no sense, and I tend to require a preponderance of evidence before I believe teams are behaving in a wholly irrational fashion.

Well, we know why the CF pool tends to hit better - it's where you put the guys who failed at being shortstop on account of being left-handed. So it really comes down to a question of 2B and 3B.

To speak broadly - the guys who move from SS to 2B are the guys with the range to play SS but not the arm. The guys who move to 3B are the ones with the arm but not the range. (This makes one wonder about who the players are who are represented in the position switcher pool, doesn't it?)

What I'm not doing is requiring the positions (offense and defense) to be equal. If you take the average center fielder from 1955 (without correcting for skew, as I have) in a full season, in this system he'll probably be +5, assuming neutral defense. A left fielder is going to be more like a -2. So you aren't forcing everything to zero - center fielders as a group were more valuable than left fielders that year, and you're capturing that difference.

gmolyneux

9/08

Colin: Any chance you could add a version of the chart adjusting for MLB runs/PA? That would make it easier to see how the relative rankings of the positions changes over time.

cwyers

9/08

Sure thing.

DavidHNix

9/08

My hypothesis: Teams tend to move a guy from 3rd to 2nd because he doesn't have the arm for 3rd but is a better hitter than the other available 2nd basemen. They tend to move him from 2nd to 3rd because they have another guy who can hit big league pitching but doesn't have as good an arm. Historically, of course, judgments like "as good a glove" and "a better arm" have been largely subjective, and often meant no more than "reminds the manager of a guy he used to play with who played that position."

salvomania

9/08

Wait a minute... Willie Mays only played 458 games in CF during the 1950s? Where did he play during the other 600 games he appeared in that decade? And only six players in all of baseball played at least 500 games (3-1/2 full seasons) in CF?

Or am I just misreading that chart???

cwyers

9/08

No, I made a mistake in preparing that chart. I have two different data sources for position played information, and I screwed up the data. My apologies, and I will get a corrected version out in a little while.

(For those who want a little more detail - official fielding stats don't break down between the outfielder positions, or at least didn't historically. So it's a bit convoluted to break down a player's games by position prior to the introduction of full play-by-play records. I mean, not so convoluted that I shouldn't have CAUGHT the mistake, but still.)

Clemente

9/08

Yes, the CF games played chart cannot be right. Not sure it affects the analysis, though.

beeker99

9/08

This is fascinating on a number of levels, but what I want to know is, when do we see the next part?

hotstatrat

9/08

Bill Bruton - Hall of Fame? One of us needs a fact check.

jwferg

9/09

Mays played an inning at shortstop in 1963. It was his first time with the Giants that he was not in CF.

lonechicken

9/09

Is it me, or does that first table need to be changed a little? Say, make italics the indicator for Hall of Fame. Because the "keyword matching" scheme of this site will bold all the names the first time they appear, which it did.
