Race and the color line have played a central role in baseball history. One of the most well-known stories of the game’s past is Jackie Robinson’s breaking of the color line in 1947. Every year, Major League Baseball celebrates Jackie Robinson Day, leading observers to recall that prior to 1947, baseball was segregated. That sin touched every aspect of the game, including its numbers: the history of baseball statistics is enmeshed in the history of race in baseball.

Baseball statistics receive an enormous amount of attention. Likewise, the history of baseball gets its fair share of treatment in the form of popular histories, academic works, and SABR biographies. The history of baseball statistics, however, garners far less consideration. Indeed, it’s seldom acknowledged that statistics even have a history outside of themselves. Not only that, but when the history of baseball statistics is afforded scrutiny, the results suggest the markings of a still emerging site of study. The story tends to travel from Alexander Cartwright in the 19th century to Bill James near the turn of the millennium. While those figures are important, there is more to the history of baseball statistics than a series of individual actors.

The objects of study in this article are two of the most significant publications regarding baseball statistics: the Official Encyclopedia of Baseball (BE), first published in 1951, and the 1969 Macmillan Baseball Encyclopedia (MEB). Specifically, it examines them with regard to the nature of encyclopedias and in the specific context in which they emerged. They were products of a moment in baseball history when the color line was a fresh memory and full integration was just getting underway. Because of that, the manner in which they addressed black baseball was specific and revealing.[1]

The intersection of these elements (the nature of encyclopedias, the moment they emerged, and the way they handled black baseball) is significant because it had a cascading effect. In particular, these ingredients combined to present a neutral and linear narrative of baseball history based on statistics. The problem with such a narrative is that it removes the color line from the statistical record, which is another way of saying that the color line is removed from baseball history altogether. In order to integrate the history of baseball statistics into baseball history in general, statistics have to be seen through the lens of race.

Baseball Encyclopedias in the Midst of Integration
The single most crucial thing to understand about statistics and encyclopedias is that they are neither objective nor neutral. First, the very field of statistics emerged as a tool for European states to survey and classify for self-interested aims—it was initially termed “political arithmetic.”[2] Second, encyclopedias are argumentative in nature. They are not collections of knowledge offering objective information, though they are frequently presented as such, but texts ripe for interpretation based on the contexts in which they emerge and the contentions they put forth. This was true of the first modern encyclopedia, Denis Diderot and Jean le Rond d’Alembert’s Encyclopédie, which sought to show that knowledge could be centralized and instrumentalized in order to combat institutions of power, such as the state and the church.[3] The BE and MEB also make arguments. They submit that the history of baseball is best told through the supposedly neutral lens of statistics. The result, however, is the detachment of statistics from baseball history.

The BE and the MEB presented themselves as objective and authoritative. The evidence for this is in the names. We need only recall that both the BE and the MEB included “official” in their titles. The “official” sanction represented the institutional relationship the books held with Major League Baseball. The “official” stamp indicates that the books were endorsed by MLB, and that endorsement invested them with authority.

The encyclopedias also made a more direct assertion. They claimed to be complete. In the 1956 revised edition of the BE, American League President Will Harridge turned to the language of comprehensiveness to illustrate the authority of the encyclopedia. He wrote that the BE “is a complete record, under one cover, of major league playing history . . . the Official Encyclopedia adds the ‘last word’ to the great volume of baseball records.”[4] The first edition of the MEB, published in 1969, has “complete” in the title itself: The Baseball Encyclopedia: The Complete and Official Record of Major League Baseball.

These encyclopedias—with endorsements of authority and claims of completeness—and the statistics they held also have to be understood at the moment in time in which they emerged. What the BE was designed to do was entirely new, and the novelty of the encyclopedic project took place at a time when the presence of baseball players of color in MLB games and in MLB records was also new. The environment couldn’t help but shape the encyclopedias.

When the BE was first published in 1951, it was an initiation rather than a culmination. From the late 19th century to the mid-20th, scorecards and box scores throughout the country held MLB statistics. The numbers that formed the backbone of early baseball encyclopedias weren’t always reliable, especially those from the 19th century, but they were there. The problem was that they were scattered. In 1969, the MEB arrived as competition to the BE and advanced the BE’s project of centralization. That it generated its figures from a computer and had more advanced statistics isn’t relevant for this story.

The significance of integration is evident in the way the first edition of the BE, and later the first edition of the MEB, dismissed 1947’s importance in baseball history. Both volumes fetishize the idea of the anniversary, and by doing so they turn attention away from a recent momentous change and emphasize a history of continuity.

For instance, the 1951 BE identifies that year, 1951, as singularly important for the publication of a centralized statistical record because it represented 80 years of consecutive MLB play, the 75th anniversary of the National League, and the 50th anniversary of the American League.[5] Similarly, the 1969 MEB contains a statement by then commissioner Bowie Kuhn. He asserted, “it is particularly fitting” for the volume to be “published in Professional Baseball’s centennial year.”[6] Like the presentation of statistics in the encyclopedias, the focus on anniversaries leads the reader to view baseball history linearly. It’s particularly striking coming in 1951, when MLB was undergoing a fundamental change in the form of integration.

Works of history respond to context, and both the BE and the MEB were explicitly meant to be historical. Indeed, it was central to their arguments. The first edition of the BE opens with a question: “What is the best way to present a history of baseball?” Using philosopher Thomas Carlyle’s famous dictum that “history is the essence of innumerable biographies,” the authors state that “this Encyclopedia first presents baseball history in the form of the sport’s pithiest biography: actual playing records.”[7] By doing so, the encyclopedia established playing statistics as the standard that constituted a sufficient biography to be included in baseball history. The result was a history of continuity without change.

The Absence and Presence of Black Baseball
Neither the BE nor the MEB completely ignored black baseball.[8] But what they did do was confine black baseball to the periphery, which reinforced the notion that race was an incidental matter in baseball history when seen through a statistical lens. The handling of black baseball in these encyclopedias is also part of the history of baseball statistics: Baseball players of color did not receive the legitimacy of inclusion that statistics invested.

The argument held within the encyclopedias and the moment in which they emerged led both of them to handle black baseball, the Negro Leagues and their players, in specific ways. Baseball changed with integration. But the real arc of baseball’s evolution, the MEB argued, was in the changes to statistics dating back to the 19th century. The MEB suggested, “the evidence that baseball is a changing game can be found in the following pages, which, through the use of tables and graphs, clearly shows the trends in major league baseball from 1876 to 1968.” This is the linear narrative of baseball history upon which the encyclopedias rested. It was one without a color line.

The exclusion of black baseball was not because the encyclopedias were only concerned with the National and American Leagues. For instance, there is a section of the BE, after the statistical biographies, titled “Major League History.” This history has a subsection titled “Other Major Leagues.” It is here that the encyclopedia integrates short-lived and defunct baseball leagues. The American Association, which operated from 1882 to 1890, was the longest lived of the leagues detailed. The remaining three were the Union Association, the Players’ League, and the Federal League. The first two functioned for one season each, in 1884 and 1890, respectively, while the last existed for two years, 1914 and 1915.[9] There was nothing self-evidently “major” about these leagues. In fact, when the MEB wrote its own narrative of baseball history addressing “other major leagues,” it adhered to the decisions of the Special Baseball Records Committee. The committee was formed in 1968 to determine which 19th-century leagues could be considered “major.”[10]

According to both the BE and the MEB, the various Negro Leagues did not fit into the encyclopedic paradigm of what was a “major league.” Although it is unclear what the standards of qualification were, it is likely that playing records were a consideration. What is evident, however, is that both encyclopedias considered Negro Leagues, and the players included, as outside the parameters of major-league baseball.

Such a view might have been reinforced by the lack of statistics available. The BE, for instance, covered the Negro Leagues in the section “Outside Professional Baseball.” It placed the Negro Leagues between sandlot play and softball. And this despite the fact that the Negro National League operated from 1920 to 1931, which was three more seasons than the American Association and 11 more than the single seasons the Union Association and the Players’ League each lasted, and competed with the Eastern Colored League in a World Series starting in 1924.[11] Later editions relegated discussion of the Negro Leagues to a “Features” section. Doing so cast them as novelties, located side by side with essays on spitball pitchers and players who wore glasses.[12]

Neither encyclopedia addressed Negro League players without an MLB record until the final editions of each. There was a legitimate reason for this. Record keeping in the Negro Leagues was not as consistent as it was in the major leagues. Interestingly enough, recent successful efforts on the part of the National Baseball Hall of Fame to unearth Negro League statistics indicate that the records did exist. Still, the BE addressed black baseball as something unknowable because its editors did not have the players’ statistical records.

For example, the BE presented Negro League players as if they were fictional creations who played in an entirely other realm and lacked the substance that made them visible (playing records). “Between World Wars I and II,” one entry in the BE reads, “Negro baseball boasted such legendary heroes as shortstop John Henry Lloyd, catcher Josh Gibson, pitchers Cyclone Joe Williams and Cannonball Dick Redding.” And yet: “Only one of the fabled figures of this lost chapter in baseball history managed to benefit through modern emancipation: Leroy (Satchel) Paige.”[13] The MEB’s language of legitimacy was similar. It referred to the Mexican League, for instance, as “outlaw,” which was another term that designated a “non-major” league.[14]

It is also worth noting that the earlier editions of the BE and the MEB might have directly contributed to the inability of the later editions to include these records. The quote above shows that the BE referred to black baseball from about 1920 to 1939 as “lost.” It did so in 1951—not long after the supposedly hidden period of baseball history. In fact, the Negro Leagues still existed at that time, although, partly due to integration, they were quickly diminishing.

Not only that, but the MEB was more interested in players who never existed than it was in baseball players of color prior to 1947. Author Alan Schwarz writes that one of the MEB researchers’ “responsibilities in 1966 and 1967 was to ferret out ‘phantom’ players: men who appeared in existing record books but had not, in fact, existed.”[15] Nobody was tasked with researching baseball players of color before integration: The nature of the researchers’ questions led them exclusively to white statistics.

Here, we can see the snake eating its own tail. The lack of a statistical record meant exclusion, and exclusion meant the lack of a statistical record in the “official” books that were ostensibly and misleadingly “complete.”

Where to Go From Here
Statistics are an inextricable part of baseball history. The relationship, however, is not unidirectional. Baseball history is also an inextricable part of the history of statistics, and race plays a central role in that history. A history of baseball statistics that does not include race is incomplete. It’s as incomplete as a history of baseball without reference to the color line. In fact, narrating baseball history through a statistical lens without acknowledging the fact that statistical records were indelibly stamped by segregation—that all statistics prior to 1947 were white, and they continued to be so after—is simply a way of removing the color line from baseball history.

None of this means we should stop using statistics to compare players across eras. In fact, doing so would just be another way of erasing the color line from baseball’s past. It’s incumbent upon us to remember that segregation touched all areas of baseball, not to forget it. Nor do the racial implications of statistics at mid-century mean that 1947 should necessarily be the point at which statistics become more usable. While 1947 was a watershed moment in baseball history, it did not change the nature of the game overnight. Integration was a piecemeal process that took years to complete. For comparative purposes, starting in 1961 might make sense because it was the first full season after all teams began integrating, and because it dovetails with the first round of expansion. In any case, accounting for the whiteness of statistics should be no more difficult than accounting for the steroid era.

The BE and the MEB are two of the most important documents in the history of baseball and the history of baseball statistics. An examination of these encyclopedias represents just one way in which texture can be added to the history of baseball statistics. That history is not a progressive story that underwent a series of innovations from the box score to Wins Above Replacement, but a complex history with implications for the matters that do nothing short of define baseball, such as race and the color line.

This is just one story. Indeed, another one might consider statistics as a representation of equality. Perhaps by the late 1970s baseball statistics’ veneer of neutrality was a way to equalize white players with players of color. Maybe statistical data were quantitative counterpoints to the qualitative stereotyping that still existed within organizations and in the press. When and why were Roberto Clemente’s statistics cited? In another iteration, the history of statistics could be told through the lens of labor. Would statistics ever have existed if it weren’t for clashes between players and owners? How did that change across the twentieth century? A more critical history of baseball statistics might also tell us something about baseball as a global game. To what extent have trust and distrust of statistics from foreign leagues shaped MLB and popular perceptions of “foreign” baseball? Yet another might consider whether the general lack of statistical data in women’s sports has anything to do with how far we seem to be from a woman playing in Major League Baseball.

Whatever the question might be, the history of baseball statistics needs more questions whose answers are not just statistical—it needs questions whose answers are essential.

Eric Garcia McKinley is an editor and writer at Purple Row, SB Nation's Colorado Rockies blog, and a contributor at Beyond the Box Score. He holds a PhD in European history from the University of Illinois, Urbana-Champaign.

[1] The Baseball Encyclopedia: The Complete and Official Record of Major League Baseball (New York: Macmillan, 1969), 22; emphasis in original.

[2] Theodore Porter, The Rise of Statistical Thinking, 1820-1900 (Princeton: Princeton University Press, 1988), 18.

[3] Daniel Brewer, The Discourse of Enlightenment in Eighteenth-Century France: Diderot and the Art of Philosophizing (Cambridge: Cambridge University Press, 1993).

[4] The Official Encyclopedia of Baseball, Jubilee Edition, Hy Turkin and S.C. Thompson (New York: A.S. Barnes and Company, 1951), first edition, xii-xiii.

[5] BE, first edition, Preface.

[6] MEB, first edition, 5.

[7] BE, first edition, 1.

[8] I use the term “black baseball” as a short-hand to refer to all of those excluded from Major League Baseball, which also included Latinos and non-African-American persons of color.

[9] BE, first edition, 398-401.

[10] Alan Schwarz, The Numbers Game: Baseball’s Lifelong Fascination with Statistics (New York: St. Martin’s Griffin, 2005).

[11] BE, first edition, 398-401.

[12] BE, revised edition, 1956, 506.

[13] BE, first edition, 427.

[14] The Baseball Encyclopedia: The Complete and Official Record of Major League Baseball (New York: Macmillan, 1969), 15.

[15] Schwarz.

Alan Schwarz' The Numbers Game is highly recommended for those who want a history of baseball statistics.
You state that "In any case, accounting for the whiteness of statistics should be no more difficult than accounting for the steroid era." - is this a dig at the HoF process, and those voters who refuse to induct players who've performed under the PED cloud?
We've immortalized those players who excelled in an "inferior" product, as Pre-Integration MLB had a much smaller talent pool thus giving those with talent that much more of an advantage. Should we recast our opinions of those Pre-Integration players so that they get the same type of *asterisk-bound* treatment as those players who excelled during the PED Era, some of whom knowingly ingested steroids and such?
It's less of a dig than it is an observation that, given the nature of change in baseball history, all eras have their own asterisk. I do think that pre-Integration Hall of Famers and their stats should be viewed in the context of segregation—how can they not be? I don't, however, think the opinions should be recast to the extent that the HoF should be rethought completely (not that you're implying that). That would just be another way of erasing this history.
I would speculate that the black players certainly may have made it more difficult for white players to amass their statistics had they been playing together in the early part of the century. The flip side being that many of the black players' stats may also have been diluted by playing against the white major league ball players. I don't believe the two statistical bases are comparable; each should be judged on its own merit and competition.

The steroid question of era is odd because who knows when the era started or ended? That is more speculation than reality based and doesn't help in any discussion of how to resolve the lingering doubts about player A or B. A player still had to work out and try to get stronger in order to benefit from steroids. They couldn't sit around the couch during the off season drinking beers and taking steroids and hope to come back cut and fit. Personally, I think taking a couple of aspirin in the morning after a long night of partying has a lot more immediate impact on a player's performance than taking a steroid, but that's just me.
I really enjoyed this article. It's a great example of the "victors" writing the history, ignoring the contributions of the oppressed.