July 16, 2013
The Secret History of Sabermetrics
Most of our writers didn't enter the world sporting an @baseballprospectus.com address; with a few exceptions, they started out somewhere else. In an effort to up your reading pleasure while tipping our caps to some of the most illuminating work being done elsewhere on the internet, we'll be yielding the stage once a week to the best and brightest baseball writers, researchers and thinkers from outside of the BP umbrella. If you'd like to nominate a guest contributor (including yourself), please drop us a line.
Jack Moore can be seen at Sports on Earth, CBSSports.com, and wherever else the internet will have him. He owns a set of Russell Wilson Gastonia Grizzlies baseball cards.
Reprinted with permission from The Classical Magazine.
The above screed could easily be mistaken for something off the electronic pages of FanGraphs or Baseball Prospectus in the current millennium, or perhaps from one of the paper pages of Bill James's famous Baseball Abstracts from the 1970s and 1980s. In reality, the story it comes from is approaching its 100th birthday.
F.C. Lane was the editor-in-chief of Baseball Magazine, one of the first monthly baseball publications, from 1912 until 1938. Lane is also considered by many to be the first sabermetrician. In 2012, the Society for American Baseball Research—SABR, the root of the word “sabermetrics”—posthumously bestowed upon him the Henry Chadwick Award, established “to honor those researchers, historians, analysts and statisticians whose work has most contributed to our understanding of the game and its history.”
In a 1915 issue of Baseball Magazine, Lane penned an article titled “Why the System of Batting Averages Should Be Changed,” and subtitled “Statistics Lie at the Foundation of Baseball Popularity – Batting Records Are the Favorite – And Yet Batting Records Are Unnecessarily Inaccurate.”
Not much has changed in 98 years.
In the article, Lane asks the following question: “Suppose you asked a close personal friend how much change he had in his pocket and he replied, 'Twelve coins,' would you think you had learned much about the precise state of his exchequer?”
Here, in this question, lies the foundation of sabermetric thought. Baseball demands numbers. No fandom looks to its statistical history with more frequency or reverence than baseball's, and no sport has a statistical record as clean or as robust as baseball's. Data demands analysis, and thus the early statistics like batting average and earned run average were born. Lane's question is specific, but it alludes to deeper, more primal concerns: Do our measurements describe what happens on the field? Are we closer to understanding how baseball teams score runs, get outs, and win games?
Batting average makes enough sense on its surface. The batter's goal is to get a hit, therefore measuring how often he does so should describe his quality. But this idea collapses once we take a closer look at how run scoring actually occurs. The hitter's goal is twofold: to reach base himself—whether via a hit, walk, hit by pitch, or anything else—and to move runners, including himself, along the path to home plate. As such, the great comparative value of the home run to the triple, the triple to the double, and the double to the single becomes clear. Calling every hit a hit is as silly as calling every coin a coin.
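Lane's coin analogy can be made concrete with a short sketch. The two stat lines here are invented for illustration (they are not real players), and the names `BattingLine`, `batting_average`, and `total_bases` are hypothetical helpers rather than anything from Lane's article. Both hitters collect 150 hits in 500 at-bats, so batting average cannot tell them apart, yet counting the bases behind those hits shows a very different pocketful of change.

```python
from dataclasses import dataclass

# Bases the batter gains on each kind of hit.
BASES_PER_HIT = {"singles": 1, "doubles": 2, "triples": 3, "home_runs": 4}


@dataclass
class BattingLine:
    """A season stat line; all values used below are made up for illustration."""
    at_bats: int
    singles: int
    doubles: int
    triples: int
    home_runs: int

    @property
    def hits(self) -> int:
        return self.singles + self.doubles + self.triples + self.home_runs


def batting_average(line: BattingLine) -> float:
    # Every hit counts the same, like counting coins without their values.
    return line.hits / line.at_bats


def total_bases(line: BattingLine) -> int:
    # Each hit is weighted by the bases it is worth.
    return sum(getattr(line, kind) * bases for kind, bases in BASES_PER_HIT.items())


slap_hitter = BattingLine(at_bats=500, singles=150, doubles=0, triples=0, home_runs=0)
slugger = BattingLine(at_bats=500, singles=90, doubles=30, triples=5, home_runs=25)

for name, line in (("slap hitter", slap_hitter), ("slugger", slugger)):
    print(f"{name}: AVG {batting_average(line):.3f}, total bases {total_bases(line)}")
```

Both lines print a .300 average, while the total bases come out to 150 and 265. Total bases is only the simplest possible weighting; it is used here because it needs no assumed run values.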
Even in 1915, people understood this was a problem. John Heydler, the secretary-treasurer of the National League and the creator of the earned run, admitted the inaccuracy of batting average but told Lane, “It has never seemed practicable to use any other system.”
Imagine how difficult record keeping must have been for baseball in its early days, without computers or even calculators to ease the load. Compiling records and ensuring accuracy must have been enough of a bear. Applying complex calculations on top? Not, as Heydler said, practicable.
Practicality was not Lane's concern. Baseball Magazine under Lane was devoted to deepening and widening the fan's knowledge of the game, and Lane was one of the first to realize how numbers could help in this endeavor. As Lane wrote, “Fans and figures have a mutual attraction.” This is the importance of statistics: as flawed as they can be, these numbers are the easiest and most effective way for us to communicate our games to one another.
The great sabermetricians—the researchers whose findings allowed fans and teams alike to make great strides in understanding how baseball works—have all been people with an intense curiosity about how the world works. Most people with this kind of curiosity focus it on worldly subjects like science or profitable ones like the stock market. A select few choose, for some reason, baseball.
Bill James, as Daniel Okrent wrote in a 1981 Sports Illustrated profile, had “a B.A. in English and economics, graduate credits in psychology, a passion for William Faulkner and an abiding interest in the French Revolution”—that is, nothing that would get him a well-paying job. So he spent his time as a night watchman in a Kansas food packing plant poring through records and writing his Baseball Abstracts, some of the most influential statistical writings on baseball to this day.
Pete Palmer spent his days as a radar systems engineer for Raytheon Corporation, a company resembling a real-life Stark Industries. Raytheon's projects have ranged from computer guidance on Apollo 11 to the Patriot missiles used in the Persian Gulf War. After work hours, the same computers used for advanced warfare and space exploration became tools for Palmer to create his own solution to Lane's batting average problem. Alongside missile guidance systems, Raytheon's computer systems held within them the first true database of baseball records, the building blocks for Palmer's seminal Hidden Game of Baseball.
This isn't to say baseball progress didn't come from within the game itself. Branch Rickey arguably changed the game more than anybody else in baseball history, from the obvious Jackie Robinson story to the development of the farm system, to being one of the first baseball men—if not the first—to use on-base percentage and isolated slugging percentage. Scouts opened pipelines to new baseball lands like the Dominican Republic and Japan. And of course there was Billy Beane, who made the Moneyball phenomenon possible from within the Athletics' front office.
But front offices kept—and keep—their advances to themselves. The book Moneyball was an extraordinarily rare glimpse into the development of a baseball team's philosophy, and Beane wasn't thrilled to have Michael Lewis hanging around the clubhouse. The baseball establishment has no incentive to enlighten its fans, and teams run the risk of losing their advantages if they make new discoveries public.
Any push for those outside the establishment to understand how baseball works—how and why the wins and losses fans, players and executives stress over and pass down as memories actually happen—has thus been forced to come from outside.
“The honest truth is, it has never been easier to be a sabermetrician than it is right now,” Colin Wyers, Baseball Prospectus's Director of Statistical Operations and self-professed “sabermagician,” wrote in an e-mail. “You can go into any Wal-Mart in the country and pick up a (in historical terms) absurdly powerful personal computer for a few hundred dollars. You get that thing home and hook it up to any of several widely available sources of high-speed Internet and you can download large amounts of rich, rich baseball data for free.”
The internet revolution—and the nerd revolution it brought on—has created spaces for niche communities of people who would otherwise be too isolated and restricted for those communities to exist. The relatively small group of people who saw a James Baseball Abstract or Palmer's Hidden Game of Baseball as among the most influential literature of their lifetimes finally had a robust space to unite and discuss their ideas.
Baseball Think Factory and Baseball Prospectus each debuted on the web in 1996. If, as Wyers says, James and Palmer mark the first sabermetric golden age, 1996 was the origin point for the second. Add in the ubiquity of the personal computer, which Wyers says “made it possible to do better in an afternoon what Palmer was able to do with months of effort,” and conditions were ripe for the new community to spread and improve the ideas Palmer and James (and Lane before them) postulated.
Over roughly the next decade, baseball statistics and statistical concepts underwent a radical metamorphosis. Voros McCracken's defense-independent pitching statistics—DIPS—pushed a view of pitching that proposed the pitcher did not have control over balls in play and shifted the focus to strikeouts, walks, and home runs. The introduction of OPS+, the first “indexed stat,” acknowledged and adjusted for the differences in league quality and park effects over the years. The development of the flawed but powerful advanced fielding metric UZR forced a new focus on discovering and isolating the value of defensive play in run prevention. The multiple formations of Wins Above Replacement statistics have taken an ambitious step towards quantifying a player's entire value in one number.
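The DIPS idea of judging a pitcher only on strikeouts, walks, and home runs is easiest to see in Fielding Independent Pitching (FIP), a later statistic built on McCracken's insight. The sketch below uses the commonly published FIP formula. The additive constant is recalculated for each league and season so that FIP lines up with ERA; the default of 3.10 here is a placeholder assumption, not an official figure, and the sample stat line is invented.

```python
def fip(home_runs: int, walks: int, hit_by_pitch: int, strikeouts: int,
        innings_pitched: float, constant: float = 3.10) -> float:
    """Fielding Independent Pitching.

    Only outcomes the defense cannot touch are counted. `constant` is a
    league- and season-specific scaling term; 3.10 is an assumed placeholder.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    numerator = 13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts
    return numerator / innings_pitched + constant


# Invented stat line: 20 HR, 50 BB, 5 HBP, 200 K over 200 innings.
print(f"FIP: {fip(20, 50, 5, 200, 200.0):.2f}")
```

Nothing about hits on balls in play enters the calculation, which is the whole point of the DIPS view: those outcomes are treated as belonging largely to the defense and to luck.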
Perhaps more importantly, old ideas from the first golden age finally began to spread and take hold with a larger audience. The concept of win probability has been around since the brothers Eldon and Harlan Mills proposed “Player Win Averages” in 1970, but it didn't gain a popular foothold in the baseball world until FanGraphs established a scoreboard of live win expectancy charts in 2005. James's Pythagorean record concept—a formula to show how many games a team should have won based on run scoring and prevention—was refined and included in standings and even playoff odds at Baseball Prospectus.
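James's Pythagorean record is compact enough to sketch in a few lines. In its original form, expected winning percentage is runs scored squared divided by the sum of runs scored squared and runs allowed squared. The function names and the 800/700 run totals below are illustrative assumptions; the exponent is exposed as a parameter because later refinements of the formula adjust it.

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Expected winning percentage from run scoring and run prevention."""
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals cannot be negative")
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    if scored + allowed == 0:
        raise ValueError("at least one run total must be positive")
    return scored / (scored + allowed)


def expected_wins(runs_scored: float, runs_allowed: float,
                  games: int = 162, exponent: float = 2.0) -> float:
    """Scale the expected winning percentage to a full schedule."""
    return games * pythagorean_win_pct(runs_scored, runs_allowed, exponent)


# Hypothetical team: 800 runs scored, 700 allowed over 162 games.
print(f"Expected wins: {expected_wins(800, 700):.1f}")
```

A team that actually won 85 games with that run differential would be described as having underperformed its Pythagorean record by about seven wins.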
Most famously, the Moneyball A's spearheaded a renewed and invigorated approach to answering Lane's question of the improved batting average. The advent of on-base percentage was the most visible result. Even OPS—on-base percentage plus slugging percentage—became so mainstream as to appear on Topps cards in 2004. Eventually, statistics like Weighted On Base Average (wOBA) and True Average (TAv; formerly Equivalent Average/EqA) that gave separate, historically derived weights to singles, doubles, triples, home runs, and other events emerged and now headline the offensive statistics of FanGraphs and Baseball Prospectus, respectively. These solutions to Lane's problem—the most advanced all-encompassing batting statistics—are remarkably similar in spirit and in function to Lane's work in Baseball Magazine.
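For readers who want the arithmetic behind OPS, here is a minimal sketch using the standard definitions of on-base percentage and slugging percentage. The stat line is invented, and the function names are hypothetical. The weighted statistics mentioned above (wOBA, TAv) follow the same pattern but swap in empirically derived weights that change from season to season, so they are not reproduced here.

```python
def on_base_percentage(hits: int, walks: int, hit_by_pitch: int,
                       at_bats: int, sac_flies: int) -> float:
    """Times on base divided by the plate appearances that count toward OBP."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)


def slugging_percentage(singles: int, doubles: int, triples: int,
                        home_runs: int, at_bats: int) -> float:
    """Total bases per at-bat."""
    return (singles + 2 * doubles + 3 * triples + 4 * home_runs) / at_bats


def ops(obp: float, slg: float) -> float:
    """On-base plus slugging: a simple sum of the two rates."""
    return obp + slg


# Invented season: 500 AB, 90 1B, 30 2B, 5 3B, 25 HR, 70 BB, 5 HBP, 5 SF.
singles, doubles, triples, home_runs = 90, 30, 5, 25
hits = singles + doubles + triples + home_runs
obp = on_base_percentage(hits, walks=70, hit_by_pitch=5, at_bats=500, sac_flies=5)
slg = slugging_percentage(singles, doubles, triples, home_runs, at_bats=500)
print(f"OBP {obp:.3f}, SLG {slg:.3f}, OPS {ops(obp, slg):.3f}")
```

Adding two rates with different denominators is mathematically untidy, which is part of why the weighted statistics eventually displaced OPS among analysts even as it reached the backs of baseball cards.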
For all the progress since “Why the System of Batting Averages Should Be Changed,” Lane's assessment of the relative state of batting, fielding, and pitching analysis still rings true today. Batting statistics are the most accurate—least debated, certainly—of the set. Pitching and defense—and the question of how to separate the two—remain somewhere between murky and incomprehensible.
Since Baseball Prospectus and FanGraphs made Wins Above Replacement publicly available in the middle of the decade, though, there has been all of one radical, game-changing sabermetric discovery: the notion of catchers impacting the game with pitch-framing. Baseball Prospectus's Mike Fast—now an employee of the Houston Astros—was among the first to publish work suggesting a large spread in this talent among catchers. According to Fast's work, skilled catchers like Yadier Molina and Jonathan Lucroy save their teams somewhere in the range of 30 to 40 runs over the course of a season compared to the average catcher—about equal to the offensive contribution of Adrian Beltre or Matt Holliday compared to a league-average hitter in 2012, per FanGraphs.
Quantifying the catcher's defensive impact was one of the largest sabermetric roadblocks, and to knock it down is a great achievement. But elsewhere, progress has been minimal. Specifically, public defensive metrics have copious issues with both accuracy and precision. Many analysts have thrown up their hands in surrender to a lack of quality data. MLB has plans for a system called FIELDf/x—a robust camera system to track fielders much as MLB Gameday's PITCHf/x tracks pitches—and one commonly held belief suggests sabermetricians need this data before there can be another breakthrough.
In this attitude, we see the situation the sabermetric community faces now. “What's at stake,” Wyers says, “is the notion of public, amateur sabermetrics as sort of the forefront of baseball research.” Should we be surprised? Teams have seen how analytics can directly lead to winning and have poured money—not large sums, but more than can typically be made in the public sphere—into analytics departments.
Whereas PITCHf/x data can be easily accessed at multiple sources online, FIELDf/x data will be locked down and sold at fees too exorbitant for outside analysts to afford. PITCHf/x, Wyers says, “was intended to be simply an entertaining addition to MLB.com's Gameday app.” It wasn't until public sources got their hands on it and started analyzing it that the usefulness of the data was revealed. The same mistake—at least, from the perspective of those who control the data—will not be made twice.
The same goes for the people who would analyze this data. At least 15 noted seamheads have gone from the mastheads of Baseball Prospectus or FanGraphs into MLB analytics departments—and, of course, Nate Silver parlayed his analyst chops into a lucrative career dissecting politics. People who used to occupy the spot of a Palmer or a James now have their findings held under lock and key rather than open for public consumption.
The second golden age of sabermetrics is over. But for about a decade, findings by public amateur sabermetricians changed how front offices evaluated players. The result was a wholesale change in the world of sports, from the front office through the clubhouse and all the way down to fans.
Every major league team has established an analytics department, in some form. Players are forced to abandon the approaches learned from old-school coaches of their youths and adapt to new philosophies—like the Moneyball patience ethos or the radical defensive shifts employed by teams like the Rays now. Formerly alien concepts like “run expectancy” are displayed on scoreboards and applications like ESPN's Gamecast—and yes, Jeff Francoeur, so is on-base percentage.
The integration of sabermetrics into front offices was inevitable. Big Data has taken over the business world, from which more and more front office employees (like Tampa Bay Rays general manager Andrew Friedman) trace their origins. These business types don't just acknowledge the value of data in developing ideas. They also acknowledge the value of protecting this data.
The more radical change, from the fan perspective, is the integration of sabermetrics into mainstream baseball discussion. “Now you have sabermetrics that's in a sense establishment,” Wyers said. “There are shows on MLB Network that put forth a very sabermetric point of view—and yes, they put up old-school guys like Harold Reynolds to speak up for the old view, but what's important is that it's there at all. Brad Pitt is showing up on the cover of Sports Illustrated with the formula for FIP right next to him.”
The sabermetric community's constantly growing profile was easy to justify: this stuff worked. Concepts appearing in Baseball Prospectus and FanGraphs were actually manifesting themselves in the construction of MLB rosters. The 2004 and 2007 Red Sox respectively signed players like David Ortiz and J.D. Drew in an effort to improve on-base and slugging while eschewing batting average—at the behest of Bill James, no less. The Rays, once they had dropped the “Devil” from their name, targeted slick defenders and pitchers favored by DIPS theory rather than ERA. Billy Beane's Athletics “never won anything,” as fired Memphis Grizzlies coach Lionel Hollins and many others will remind us—except five division titles in 13 years despite a payroll consistently in the league's basement.
The mainstream is an alien frontier for sabermetrics. Despite James's obvious influence, the job he took with Boston in 2003—26 years after he published his first Baseball Abstract—was his first within the game. Much of the practically infinite reservoir of data at Baseball-Reference comes from a crowdsourced project called Retrosheet that has collected and digitized box scores and play-by-play for every single game dating back to the early 1950s—well over 200,000 games.
There was no establishment; no Dave Cameron on MLB Network or Jay Jaffe at Sports Illustrated or Jonah Keri on ESPN. Sabermetrics was just a hobby, something for a few ambitious oddballs to tinker with in their spare time. The Internet gave it force, a way for the vast numbers of curious baseball fans to get together and create a way for those without access to baseball's ivory towers to learn and understand the game. This force changed the game from the ground up.
“The question,” according to Wyers, “is really whether or not the current sabermetrics movement is going to be a part of that. And if so, will they be on the side of the outsiders, or will they be the establishment being rebelled against?”
Three years ago, I was a 20-year-old with a nice little position analyzing baseball at FanGraphs. I was about as well-positioned as any college kid could hope to be in this sabermetric revolution. I decided to try to make it a career. I switched my academic path from a mathematics PhD track and realized, with a few summer courses, I could add an economics major to my BS in Mathematics. It would be the perfect supplement to my baseball knowledge. Economics is the study of scarcity in the world. Sabermetrics, when it is boiled down, is the study of scarcity—of runs, of wins, and of talent—in baseball.
I finished the coursework and received my diploma a year later, but I learned more about economics from a 368-page book my father gave me my freshman year: Robert L. Heilbroner's The Worldly Philosophers, a history of the great economists and what shaped their ideas. In his introduction, Heilbroner perfectly captures why I found my economics coursework so disengaging when economics in life can be so fascinating:
The sabermetric stakes—my stakes—are far lower, of course. No sabermetrician will find himself in the barricades. But this science has changed baseball irrevocably, to the point where it and its discourse would be nigh unrecognizable from merely 20 years prior. That alone makes it important and worth studying.
There will be other Jameses and other Palmers—of this I have no doubt. The majority of our research, though, will fall behind the work of the teams; locked out from their information for the foreseeable future, it will be impossible to maintain pace. So the crisis, for outsiders, is not how to stay at the forefront of baseball research. The crisis for outsiders is keeping an audience even when we no longer possess the utmost authority on baseball research—whether or not this authority was deserved, we did have it, for a fleeting moment in baseball history.
I use the term crisis a bit liberally. What I actually see is opportunity. James and Palmer, and the multitude of formerly public sabermetricians who have been hired and will have their findings held under lock and key, are Heilbroner's logisticians: doing the gruntwork, but removed from the thrill of discovery and the excitement of a marketplace of ideas. The logistics are undeniably important; they are the base from which sabermetrics springs. But for sabermetrics to become a rich, full, and widely interesting discipline, it needs more than just logistics.
So it's time to move on, from the “What?” to the “Why?” You've shown us the average value of a single is 0.71 runs. The average center fielder makes a catch on a ball over his head 90 percent of the time, but Carlos Gomez makes it 97 percent of the time. The best catchers can save up to three extra strikes per 100 called pitches. Great. Now we can start talking about why it matters.
Baseball—all sport, really—derives its power from the stories it can tell. What is a statistic if not a story? Ted Williams recorded a hit over 40 out of every 100 times he stepped to the plate in 1941. Carl Yastrzemski led the league in batting average, home runs, and RBI in 1967. Pedro Martinez recorded a 1.74 ERA in 2000. Mike Trout was worth 10 Wins Above Replacement (via FanGraphs) in 2012. These are all stories, albeit very simple ones without the context of the people and places behind them.
Still, these simple stories are important. They have moved fans since the game's infancy. I believe deepening our knowledge of what goes on behind the stories—what actually creates the stories—can only make these stories richer. I think this is what F.C. Lane, the sabermetric pioneer, was after in his investigation of the systems of batting averages. Back then, nobody else was doing the research, so he dove in and did it himself. His end goal was simple: tell a better baseball story.
“The time has passed,” Lane concludes, “when the public will any longer swallow the palpable falsehood that a home run is no better than a scratch single. It knows better, instinctively feels better, and should be told the truth by a presentation of the season's statistics founded upon a sane workmanlike basis.”
Lane, of course, was too far ahead of his time, too optimistic that his work could overcome the inertia of the establishment. But now, 98 years later, sabermetrics has demonstrated its value to baseball—demonstrated so much value, in fact, that the game's institutions have moved to wrest data and computing power away from the public and into the establishment. The sabermetricians have done enough over the past two decades to earn this spot on the stage. We have the tools. Now it's time for the next evolution. It's time to tell our stories.
The Same Old Game is the third issue of The Classical Magazine. In addition to this piece, it features writing by Carson Cistulli, Tim Marchman, Eric Freeman and more; as well as nine artists including Craig Robinson, Dmitry Samarov, and Amelie Mancini.
You can read The Same Old Game a few different ways. The most highly recommended, if you're an Apple user, is to get our app: it's free to download the app, $3.99 for the one issue, or $29.99 for a year (12 issues). Otherwise, we have PDF, Kindle, and .Puig files available DIY style at the same prices. Just get in touch with Pete Beatty at firstname.lastname@example.org to work out a transaction.