Image credit: Gregory J. Fisher-USA TODAY Sports
There’s a growing consciousness in MLB of the league’s unfortunate history (and ongoing problems) with racism. Scouts, players and prospect writers (including Baseball Prospectus’ own team) are confronting the legacy of racist language used to describe players of color. But a careful analysis of the data shows that the problem of racial bias in the game goes deeper than how talent evaluators talk about players: It has actually shaped careers to a significant degree, denying promotions to BIPOC and preventing deserving hitters from ever making the majors.

A few weeks ago I wrote about the racial biases in a collection of Reds scouting reports released to Ben Lindbergh and me. Reds scouts deployed a host of negative words when referring to Black, Indigenous, and people of color (BIPOC) players, while reserving far more praise for White prospects. I noted in that article that the concerns about player makeup (one of the areas where racist comments were most prominent) didn’t seem to be predictive of a lack of production from BIPOC players at the major league level.

Given the disparities in how scouts talked about White players and their BIPOC colleagues, I figured there must also be an impact on the career progression of BIPOC players. If scouts are constantly feeding their GMs a steady diet of negative reports on BIPOC minor leaguers, it stands to reason that those players may not get the same opportunities to make the majors. So I grabbed minor league data going back to 1991 to test whether BIPOC position players had slower journeys moving upwards through the minors.

I used, with permission, a database of race information originally gathered by Mark Armour and Daniel Levitt for their landmark study on the demographics of MLB. A limitation (but for my purposes, in some ways a strength) of the data is that it only covers players who received at least some MLB playing time. As Armour and Levitt note, race may be best established by asking each individual player how they self-identify, but getting in contact with tens of thousands of current and former baseballers simply isn’t feasible. They describe using skin color in combination with country of origin to determine racial background. This isn’t a foolproof method and some individual players’ assignments may be erroneous. But with more than 7,000 players to examine and some 30,000 possible promotions, any small proportion (<20%) of misidentifications is unable to explain these results.

Armour and Levitt’s data puts players into one of four groups: White, African American, Latino, and Asian. This is not an exhaustive list of races and, importantly, people can identify as more than one race at once (for example, Afro-Latino people make up a significant subset of the players in MLB). Because the purpose of the study was not to identify differences among races but rather to determine whether there was a racial bias at all, I grouped all players into two categories: White and BIPOC. This choice is not meant to erase differences in experience between races, which surely exist, but rather to center the main question of whether bias exists in promotions between White and BIPOC players.

I considered only the major levels: Low-A through Triple-A in the minors, plus MLB. I also didn’t consider promotions that occurred mid-season, counting only year-to-year transitions. I did this to minimize the effects of (for example) injury replacements, which can confound an organization’s decision on a player’s readiness to make the jump with their need to field a complete team at each level.

I coded each year by whether a player made the jump to the next level (Low-A -> High-A, High-A -> Double-A, Double-A -> Triple-A, and Triple-A -> MLB) and I considered three facets of the player’s on-field performance: their offense (as described by year- and league-scaled OPS), their defense (as measured by FRAA), and their position (as determined by BP’s in-house positional adjustment runs, the same ones we use in WARP). I also included the player’s age relative to their level to account for how teams view prospects: Obviously a 19-year-old with an average slash line in Double-A is more promising than a 26-year-old posting the same performance.

I found a significant difference between the probability of a White player vs. a BIPOC player making the jump to the next level each year. Even controlling for their performance on the field, White players had a roughly three percent better shot at moving upwards each season than an equivalent BIPOC colleague. There are a variety of different ways to specify statistical models to test this hypothesis: including and excluding each variable, building in other considerations like national origin, limiting age ranges to exclude very young (or very old) prospects, and so on. None of the model specifications attempted returned an effect smaller than about 2.6 percent (or larger than four percent), suggesting that the bias is not an artifact of one particular way of modeling minor league promotions.
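For readers who want to see what a model of this general shape looks like, here is a minimal sketch in Python. It is not the model used in this article: the article does not name the model family, so the choice of logistic regression is an assumption, the data are entirely synthetic, and every coefficient in `simulate_player_seasons` is invented for illustration (FRAA and positional adjustment are left out to keep it short). The simulated race gap is set so that an average White player is promoted about 22 percent of the time versus about 19 percent for an otherwise identical BIPOC player, and the fit then tries to recover that gap while controlling for offense and age.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def simulate_player_seasons(n, rng, race_gap=0.184):
    """Build synthetic player-seasons. Every coefficient here is invented."""
    rows = []
    for _ in range(n):
        white = 1.0 if rng.random() < 0.5 else 0.0
        ops_z = rng.gauss(0.0, 1.0)    # stand-in for year- and league-scaled OPS
        age_rel = rng.gauss(0.0, 1.0)  # age relative to level (positive = older)
        # -1.45 is roughly logit(0.19); race_gap is roughly logit(0.22) - logit(0.19)
        logit = -1.45 + race_gap * white + 0.6 * ops_z - 0.4 * age_rel
        promoted = 1.0 if rng.random() < sigmoid(logit) else 0.0
        rows.append(([1.0, white, ops_z, age_rel], promoted))
    return rows


def fit_logistic(rows, steps=200, lr=2.0):
    """Fit logistic regression by gradient ascent on the mean log-likelihood."""
    k = len(rows[0][0])
    n = len(rows)
    beta = [0.0] * k
    for _ in range(steps):
        grad = [0.0] * k
        for x, y in rows:
            p = sigmoid(sum(b * xi for b, xi in zip(beta, x)))
            for j in range(k):
                grad[j] += (y - p) * x[j]
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


rng = random.Random(0)
rows = simulate_player_seasons(10000, rng)
intercept, b_white, b_ops, b_age = fit_logistic(rows)

# Implied promotion-probability gap for a player with average offense and age.
gap = sigmoid(intercept + b_white) - sigmoid(intercept)
print(f"race coefficient (log-odds): {b_white:.3f}")
print(f"implied gap in promotion probability: {gap:.3f}")
```

The useful part of the exercise is the last step: the race coefficient is read off after performance and age are already in the model, which is the sense in which the comparison is between "equivalent" players.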

A three to four percent difference may seem small, but bear in mind a few factors. First, most true major-league talents will be so glaringly obvious that even if their race is held against them, it won’t be enough to stop them from making the majors. (Likewise, most players of all races simply aren’t good enough to play major league baseball, and bias won’t make the difference for someone with a 0.01 percent chance of getting to MLB anyway.) A scout may issue a racist report on Barry Bonds, but comments on his earrings aren’t going to be enough to stop his general manager from calling him up anyway.

Second, the three percent impact is per player-year, which means that the cumulative impact over the course of a career is much larger. To illustrate this, consider two cohorts of White and BIPOC players with exactly equivalent starting abilities, both starting in Low-A in the same year. The BIPOC players have a 19 percent chance per year to move on to the next level, while the White players have a 22 percent chance. After five years, roughly 6.7 percent of the White players will have made MLB, while only ~4.3 percent of the BIPOC players will have. The cumulative impact of that three percent bias multiplies over time. Overall, in the data I have, BIPOC players take about 0.6 years longer to go from MiLB to MLB. Considering the economics of MLB and the relatively small timeframe players have to get paid, losing a half-season or more of prime playing ability is a high price to pay for being BIPOC.
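The compounding can be sketched with a few lines of Python. This is a deliberately simplified model, not the calculation behind the figures above: it assumes every season is an independent coin flip with the same promotion probability, that a player starting in Low-A needs four promotions to reach MLB, and that there are no demotions. The function name `prob_reach_mlb` and its defaults are illustrative assumptions. The absolute values depend heavily on those assumptions and need not match the percentages quoted above; the point is only that the relative gap after several seasons is wider than the per-season gap.

```python
from math import comb


def prob_reach_mlb(p_promote, promotions_needed=4, seasons=5):
    """P(at least `promotions_needed` promotions in `seasons` independent seasons)."""
    return sum(
        comb(seasons, k) * p_promote**k * (1 - p_promote) ** (seasons - k)
        for k in range(promotions_needed, seasons + 1)
    )


white = prob_reach_mlb(0.22)
bipoc = prob_reach_mlb(0.19)

print(f"per-season ratio (White / BIPOC):  {0.22 / 0.19:.2f}")
print(f"five-season ratio (White / BIPOC): {white / bipoc:.2f}")
```

Changing `seasons` or `promotions_needed` shifts the absolute numbers, but the cumulative ratio stays above the per-season ratio because the promotion probability is applied at every rung of the ladder.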

Finally, in this article I am only studying the promotion aspect of minor league ball. In reality, the process of ascending the ladder into MLB is more complex than just working your way up. You also have to avoid demotions and attain enough playing time at each stop to continue making your case, two other areas where racial bias could (and based on the results of this study, likely would) also appear. So the total effect of all three processes (moving upwards, avoiding demotions, and getting plate appearances) is probably much larger than the three percent estimate.

It is always difficult in real-world data to make a clean identification of racial bias, especially as a causal force. Hundreds of intervening factors differ between races, any one of which could prove to be important. For example, race is undeniably intertwined with developmental factors and how a player learned the game. Growing up attending a Dominican baseball academy and jumping into the pros at 19 may give a player a different skillset than going to college at an NCAA baseball program.

It’s worth noting, however, that players in this sample of all races started in the minors with very similar average OPSs. (Considering that all players eventually made the majors, perhaps this is unsurprising.) Over time, BIPOC players tended to post better offensive and defensive numbers than their White colleagues in the same leagues, which is consistent with the idea that they were being held down from promotions they deserved and forced to become better and better than the leagues they were playing in. This is also consistent with evidence from the Reds’ scouting reports, which showed that BIPOC pitchers often had higher average fastball velocity than White colleagues, even though they often received worse reports. By objective measures, BIPOC players excelled; their sluggish journeys through the minors did not reflect that excellence.

Even considering the difficulty of establishing racial bias beyond any doubt, there’s a good case to make that racism has shaped the promotions of thousands of BIPOC players over the years. We know that BIPOC players performed at least as well by conventional statistical measures as their White counterparts. Armour notes in his demographic study that even in the major leagues, BIPOC players produce an outsized amount of WAR relative to their proportion in the league. We also have evidence (from leaked scouting reports and the testimony of front office personnel) that scouts and front office decision-makers display prominent biases in how they grade BIPOC players. Finally, we have a clear difference in outcomes between players of different racial backgrounds, even accounting for their performance on the diamond. 

It doesn’t take a huge leap to see that BIPOC players, systematically judged because of racism as lesser prospects than White peers by talent evaluators despite equal or better statistics, have received fewer opportunities to progress through affiliated ball over the years. The impact of this bias is potentially staggering. While all of the players studied made it to MLB, the system shut the door on thousands of other Black and Latino players over the years who should have had chances in the majors. It’s true that most of these would have been fringe major leaguers, destined to bounce back and forth between the high minors and the bottom of the roster. But just as with any sample of a few thousand minor leaguers, a handful likely could have become great if exposed to the rigors of MLB competition.

The stalled or prematurely ended careers of hundreds or thousands of BIPOC players are a tragedy. Viewers and fans suffer from great talents never playing, but the people whose ambitions were crushed by bias paid a much larger price. Sports reflect society and baseball, branded America’s Pastime, may reflect the United States best of all: from the use of coded language to diminish BIPOC to the cracked assumption of a meritocracy, baseball’s biased talent pipeline has eerie echoes of the problems facing the country as a whole.

Hi Rob, great analysis. You mentioned the dataset began in 1991, so I'm wondering whether there's been a shift over the years. When 'Moneyball' (using that as a shorthand for analytical front offices) happened, did that cut down on the racial discrepancy? If OBP is life, then who gets promoted becomes less about in-person scouting reports.
Robert Arthur
I didn't look into this question too deeply because it could be quite complex, depending on expansion, new international markets opening up, etc. It's a good avenue for follow-up analysis.
Great and thought-provoking piece, as mentioned above. To piggyback on that question to some extent for follow-up analysis, is there sufficient data to attach a monetary value to the projected service time and projected money lost through arbitration and free agency on average?
Rob - back in ancient times (1979) I did my senior college thesis on "Is there Racial Bias in Major League Baseball". Of course, resources were much more limited in those days, so my primary source of statistics was The Baseball Register (from the Sporting News), and I had to determine if a player was, in simplistic terms, American Black, American White, or Latin from the bio picture (less-PC terms from the day) and birthplace info listed in the book. Data had to be keyed onto punch cards to be loaded into the computer for analysis. I mention this to give a sense of how much has changed since then, and thus how limited my study was. My assumption was basically that there was an unlimited pool of players of all 3 backgrounds, and that a Black or Latin player had to perform better than a white to make the majors. I did find a statistical difference in performance between White Americans vs. the other two groups.
Robert Arthur
Wow, incredible! I can't imagine how much work that must have been. I had it so easy by comparison, thanks to Mark, Dan, and modern computing software.
Earl Weaver
Mr. Arthur, very nice analysis, especially having to deal with all of the data issues. Just over 20 years ago, I did a similar analysis that was published in the Journal of Sports Economics, which covered 1968-69, 1976-77, and 1991-97. Here is the cite and the abstract:

Bellemore, Fred A., “Racial and Ethnic Employment Discrimination: Promotion in Major League Baseball,” Journal of Sports Economics (November 2001), 2:256–68.

Employment discrimination is studied by examining the performance of baseball players at the highest minor league level in the 1960s, 1970s, and 1990s. Both Blacks and Hispanics face discrimination in promotion to the major leagues. Blacks faced it in the 1960s and 1970s and still did in the 1990s, but it subsided in years when jobs were created through expansion in the number of teams. Hispanics faced discrimination in the 1960s and 1970s that abated in years of expansion, but there is no significant evidence that they did in the 1990s.

I'm not actually Earl Weaver. The analysis covered AAA to MLB only, but included players who did not ever make the majors, and only used batting statistics (no fielding or pitchers). African-Americans were 9.3% less likely to be promoted in the 1968-77 period and still 8.1% over 1991-97. Of the 11 years, there was no significant evidence of discrimination in the 3 expansion years of 1969, 1977, and 1993. Since your data set included the 1993 and 1998 expansions, I was wondering whether you saw any less of a problem in those years.
Robert Arthur
That's fascinating; I wasn't aware of your work but will certainly look it up. I'm glad to hear that you found similar effects to me. I will check those expansion years--it makes sense that bias would become less visible in those seasons.
You briefly mentioned national origin. Is there any difference in the disadvantage to African-American players and foreign-born players?
Robert Arthur
I didn't find country of origin to be a significant predictor in my regressions. I did find race-specific differences but I don't feel like they have much weight because of the difficulty of classifying Afro-Latino players, the paucity of Asian players, and the confounding of factors like race, national origin, and training before entering professional baseball.
I suspect that the players had significantly different backgrounds. A white US college player, a black US high school player, and a Dominican recruit have very different backgrounds before entering an MLB organization. Is there classism going on? (That is, college is seen as better than high school, which is seen as better than foreign.)
John Johnston
Does it ever end?
Anton Dahbura
Another factor that is linked to race because of the way the baseball recruiting system is designed is the signing bonus a player is paid. I've long suspected that players in whom an MLB organization has more invested (with scouting/front-office careers potentially on the line) are more likely to be pushed up through the minor league system than Latino players who are considered to be more expendable dollar-wise, even if the Latino players are superior performers. Only recently have some Latino players received non-trivial signing bonuses.

Is there a link? I would think so.
Mac Guyver