August 13, 2013
SABR and the Importance of Preserving Sabermetric History
This was my third year attending the annual convention of the Society for American Baseball Research, in this case the 43rd such event. It is one of the social highlights of the year for a community that essentially suffered a diaspora at birth. It’s never been easier for baseball researchers to communicate, but every so often it’s vital to actually bring them together under one roof, and the SABR convention is one of the best ways of doing that.
There are panel discussions, keynotes, presentations, posters, and committee meetings. There are also discussions in hallways and on escalators and in line at cheesesteak vendors and in bars… well, okay, mostly in bars. And those ad hoc interactions are at least as important as the formal events, if not more so. I’ve tried to recap the formal events, at least the ones I found of suitable interest. But it doesn’t really do enough to capture the sense of what the thing is. So let’s talk a bit. I don’t mean so much talk about SABR, although I’ll do that plenty. I mean let’s talk like we’re at the bar, with room to meander and ruminate and think about larger things. Now, obviously, I’m going to be doing most of the talking here to start, but a few of my victims from the hotel bar on Saturday night can tell you that’s pretty typical of being at SABR too.
I think there are a lot of people out there who aren’t exactly sure what to make of SABR. There’s the public at large, which equates SABR with sabermetrics, despite the fact that it’s a very small part of the organization’s mission. On the flip side is the larger sabermetric community and its fellow travelers, who often have a hard time seeing how SABR is or could be relevant to the discussion in the Internet age. And there’s the leadership of SABR itself, which is unsure of how to make SABR more relevant to the modern generation of sabermetricians without driving off the current members in the process. And it has to find a way, because SABR faces an existential crisis if it does not: the organization is aged and literally dying, and if younger people are not brought into the fold, eventually it will simply run out of members.
Bill James named sabermetrics after SABR, in homage to the organization. But most SABR members are not metricians; the organization has a much stronger focus on historical baseball than it does on statistical baseball. And most practitioners of sabermetrics do their work privately and either self-publish their results or publish through organizations like ours or through other websites. The brief heyday of SABR’s “By The Numbers” newsletter as a hotbed of baseball research has passed. Data comes from people like Sean Lahman and the devoted volunteers at Retrosheet. Lahman and the Retrosheet board were both at SABR, but neither receives material support from SABR, and both could function without it.
SABR is facing a crisis, and a healthy field of inquiry named after it is too good an opportunity to pass up as it tries to preserve itself for the future. But there is room to ask whether or not sabermetrics has any use for SABR, and if so, whether or not the role SABR is trying to take on is one that’s useful for the field.
Let’s start with the role SABR seems to have taken on: public advocacy for sabermetrics. It has set up the annual analytics conference. It has partnered with Rawlings to try to bring modern defensive metrics into the Gold Glove discussion. It’s easy to see why this approach appeals to SABR. It lets it put its name on the field’s progress as a whole, even where it hasn’t directly contributed to any of it. It doesn’t require any of the actual researchers to change how they go about things, nor does it require SABR to get involved at a more fundamental level.
The question is, is it needed? And I think one has to conclude that it really isn’t. If sabermetrics has a problem these days, it isn’t reach. There is a Brad Pitt movie about how the underdog stats geeks took over the world. There are TV shows that discuss the sabermetric viewpoint. There are websites devoted to espousing sabermetric player measures, and they’re far from obscure. They get cited during actual baseball broadcasts.
And it’s not clear that SABR is particularly well equipped to be the PR arm of the sabermetricians. It’s been a largely private organization for most of its existence; most people know of it through sabermetrics, rather than the other way around. Sabermetricians have a larger following in the media than SABR does.
The last bastion of the old guard in the media is the newspaper writers and the like in the Baseball Writers Association of America. But let’s be honest: they face many of the same demographic challenges that SABR does. With the continuing downward spiral of the newspaper industry and the rise of blogging, the BBWAA is trying to get younger, and it’s turning to writers who grew up with a sabermetric viewpoint. (BP has BBWAA-credentialed writers of its own.) It’s a long road until those sorts of people become the majority, but the same is true of SABR, and it’s not at all clear that SABR has figured out how to make that transition smooth for its own organization, much less another.
So if SABR is inserting itself somewhere that isn’t a real area of need for the field of sabermetrics, it can be tempting to conclude that there isn’t a role for it to play. But before we do that, let’s take a look at the problems with the field of sabermetrics and see if there are some that SABR is well suited to correcting.
The first problem with the field of sabermetrics we should probably address, because we’re already wandering past it, is that not enough people are asking the question, “What are the problems facing the field of sabermetrics?” A little introspection is healthy. A little outside perspective is good, too. (And sabermetrics needs to do a better job of accepting criticism from outside the field.) But I don’t see much of a role for SABR there.
So having gotten that out of the way, what other problems are there? A very big problem is brain drain. As sabermetrics becomes more popular, it also loses many of its best and brightest to teams and to other fields of study (one of the most famous sabermetricians is largely famous for his work on predicting election outcomes, not his baseball research). Could SABR offer incentives to help keep researchers in the public domain? The answer is likely no; there’s far less money in public baseball research than there is in professional baseball, and it’s not realistic or fair to expect SABR to find a way to make that less so. (It should be noted that SABR is offering scholarships to young researchers to encourage new people to enter the field; it is in fact attempting to do something here.)
There is another problem, though, that if not exactly related, at least is exacerbated by the constant turnover in the field. It’s that sabermetrics, in many ways, is a field with a shallow connection to history—both its own history and the history of baseball in general. And that’s a problem.
There are really two distinct phases of the sabermetric movement: the “books and letters” phase and the Internet phase. What’s rather incredible—I do not mean this in the modern meaning of the word, which means “wonderful,” but the literal meaning, “difficult to believe or comprehend”—is how little of either is being preserved.
The biggest name from the “books and letters” phase was of course Bill James, who mostly resorted to self-publishing until 1982, when Ballantine became his publisher. All of James’ abstracts are out of print now. The Hidden Game Of Baseball, by John Thorn and the criminally underrated Pete Palmer, is similarly out of print. Many lesser writers, getting by largely on self-publishing via mimeograph, are even less well preserved.
And the situation actually doesn’t improve until fairly late in the Internet era. A lot of the early history of sabermetrics on the Internet comes from USENET groups like rec.sport.baseball, which has been preserved by Google (at least, until Google gets tired of it), but it’s like wandering a forest without a map or compass much of the time. A fair number of sabermetricians got their start on a forum called BaseballBoards.com, later renamed Fanhome, but that site is lost pretty much entirely. Much of the early sabermetric output on the Internet has been lost entirely, and what remains is poorly catalogued if it’s catalogued at all.
Why does this matter? As an illustration, consider Total Average, created by Tom Boswell in the late 70s, back when he was more interested in dabbling in sabermetrics than mocking it. Total Average was a summation of bases (total bases, walks, and steals) divided by a player’s outs. At the same time, Barry Codell independently derived the same idea, which he called Base-Out Percentage.
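To make the definition concrete, here is a minimal sketch of bases per out as described above. The stat line is invented purely for illustration, and treating outs as at-bats minus hits is a simplifying assumption of this sketch; published versions of the formula may count additional kinds of bases and outs.

```python
from dataclasses import dataclass


@dataclass
class BattingLine:
    """A season batting line (hypothetical structure for this sketch)."""
    at_bats: int
    hits: int
    total_bases: int
    walks: int
    steals: int


def total_average(line: BattingLine) -> float:
    """Bases gained (total bases + walks + steals) divided by outs made.

    Assumption: outs are approximated as at-bats minus hits.
    """
    outs = line.at_bats - line.hits
    if outs <= 0:
        raise ValueError("a batting line needs at least one out")
    bases = line.total_bases + line.walks + line.steals
    return bases / outs


# Invented numbers, not any real player's season.
hypothetical_line = BattingLine(
    at_bats=500, hits=150, total_bases=250, walks=60, steals=10
)
print(f"Total Average: {total_average(hypothetical_line):.3f}")
```

The whole metric is a single ratio, which may go some way toward explaining why people keep arriving at it on their own.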
Total Average/Base-Out Percentage isn’t a bad metric. Certainly at the time it was conceived of, it was better than a lot of what was out there (but not as good as OPS, which is a big reason why it hasn’t stuck around). But a funny thing has happened: people keep independently “discovering” bases per outs. And it seems to be legitimately independent, too, not simply appropriating the ideas of someone else, if for no other reason than there are really quite a few better ideas to steal. But when you lack good records and good cataloguing of those, nobody knows the idea has been done before. And more importantly, they don’t know all the research that shows why the idea isn’t worth keeping around.
It’s true that the history of sabermetrics contains a lot of dead-ends and ideas that have been supplanted. But preserving those doesn’t matter only if you think we’ve reached the endpoint of the sabermetric movement. If we’re still developing new things and improving old things, that history is vital. It tells us what others have tried, it tells us how they came to their conclusions, it tells us why they did what they did, and it tells us what flaws others saw in it. Without that, every time a researcher wants to build something, he or she has to start almost from scratch. And that means a lot of time and effort spent making the same mistakes someone has already made. Best-case scenario, the mistakes are caught and someone just wasted their time. Worst-case scenario, the mistakes aren’t caught and people waste even more time believing things that have already been debunked.
There’s also a disconnect between sabermetrics and the history of the game. The problem is that baseball was not birthed fully formed; it developed over time. The rules of the game, as well as the unwritten rules (not the silly ones everyone talks about, but the vital ones that seem to dictate how managers behave), evolved in response to real learning about the game. If you look at the finished product, and not all the history behind it, you lose all the education that went into that, and for a field that’s about the accumulation of knowledge, that’s a deep loss.
One of the things that confounded sabermetricians early on was the bunt. Looking at the run expectancy tables made it clear that the sacrifice bunt was almost always a terrible idea, unless you were a pitcher hitting. But upon closer scrutiny, it turns out that the bunt is not as terrible as raw run expectancy would lead you to believe:
[A] sac bunt attempt obviously does not lead to an out and a base runner advance 100% of the time (or even close to 100%); in fact the average result from a sac bunt attempt is not even equivalent to an out and a base runner advance. Also, the average result varies a lot with the speed and bunting skill of the batter and whether and by how much the defense is anticipating the bunt or not (among other things).
Because you can have a single or a reach on error on a sacrifice attempt, and because those aren’t recorded as sacrifices, the bunt is actually a better percentage play than the change in run expectancy would indicate. And because of game theory, even suboptimal bunting may be of value in that it affects how the defense is lined up against you.
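The arithmetic behind that point can be sketched as a weighted average over the possible outcomes of a bunt attempt. Every number below is a made-up placeholder chosen only to show the shape of the calculation; none of it is a measured run expectancy or an observed bunt outcome rate, and the three-outcome breakdown is itself a simplification.

```python
import math

# Hypothetical run expectancies (expected runs over the rest of the inning).
RE_RUNNER_ON_FIRST_NO_OUTS = 0.85   # the situation before the bunt
RE_RUNNER_ON_SECOND_ONE_OUT = 0.65  # the textbook sacrifice result
RE_FIRST_AND_SECOND_NO_OUTS = 1.45  # batter reaches on a hit or an error
RE_RUNNER_ON_FIRST_ONE_OUT = 0.50   # bunt fails, runner does not advance

# (description, hypothetical probability, resulting run expectancy)
BUNT_OUTCOMES = [
    ("out, runner advances", 0.70, RE_RUNNER_ON_SECOND_ONE_OUT),
    ("batter reaches base", 0.15, RE_FIRST_AND_SECOND_NO_OUTS),
    ("out, runner stays put", 0.15, RE_RUNNER_ON_FIRST_ONE_OUT),
]


def expected_runs(outcomes):
    """Probability-weighted average run expectancy over all outcomes."""
    total_probability = sum(prob for _, prob, _ in outcomes)
    if not math.isclose(total_probability, 1.0):
        raise ValueError("outcome probabilities must sum to 1")
    return sum(prob * run_exp for _, prob, run_exp in outcomes)


# The naive view treats every attempt as a guaranteed out plus an advance.
naive_bunt_value = RE_RUNNER_ON_SECOND_ONE_OUT
attempt_value = expected_runs(BUNT_OUTCOMES)

print(f"Swing away:           {RE_RUNNER_ON_FIRST_NO_OUTS:.3f}")
print(f"Naive bunt value:     {naive_bunt_value:.3f}")
print(f"Bunt attempt average: {attempt_value:.3f}")
```

With these placeholder inputs the attempt comes out ahead of the naive figure but still behind swinging away. Real conclusions would depend on real outcome rates, which vary with the batter and the defense as the quoted passage notes.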
Early sabermetricians simply couldn’t understand why managers would bunt, given the evidence they had, so they concluded that the bunt was an egregious wrong. Later analysis is much more forgiving of the bunt (even if it would probably conclude that it’s overdeployed). People of a certain bent are likely to be struck by how this resembles the principle of Chesterton’s Fence. As G.K. Chesterton wrote:
In the matter of reforming things, as distinct from deforming them, there is one plain and simple principle; a principle which will probably be called a paradox. There exists in such a case a certain institution or law; let us say, for the sake of simplicity, a fence or gate erected across a road. The more modern type of reformer goes gaily up to it and says, “I don’t see the use of this; let us clear it away.” To which the more intelligent type of reformer will do well to answer: “If you don’t see the use of it, I certainly won’t let you clear it away. Go away and think. Then, when you can come back and tell me that you do see the use of it, I may allow you to destroy it.”
This paradox rests on the most elementary common sense. The gate or fence did not grow there. It was not set up by somnambulists who built it in their sleep. It is highly improbable that it was put there by escaped lunatics who were for some reason loose in the street. Some person had some reason for thinking it would be a good thing for somebody. And until we know what the reason was, we really cannot judge whether the reason was reasonable. It is extremely probable that we have overlooked some whole aspect of the question, if something set up by human beings like ourselves seems to be entirely meaningless and mysterious. There are reformers who get over this difficulty by assuming that all their fathers were fools; but if that be so, we can only say that folly appears to be a hereditary disease. But the truth is that nobody has any business to destroy a social institution until he has really seen it as an historical institution. If he knows how it arose, and what purposes it was supposed to serve, he may really be able to say that they were bad purposes, or that they have since become bad purposes, or that they are purposes which are no longer served. But if he simply stares at the thing as a senseless monstrosity that has somehow sprung up in his path, it is he and not the traditionalist who is suffering from an illusion.
If sabermetric inquiry cannot figure out why major league managers behave, almost to a man, in a certain fashion, that means nothing more or less than a failure of inquiry. And the sabermetrician needs to go back and study the matter further until he understands why. That reason may well end up being wrong. But until you find out what it is, how will you know that?
Many questions currently being discussed among sabermetricians—how to evaluate managers, how bullpens should be organized and used, how often teams should employ defensive shifts—are questions that could greatly benefit from similar amounts of depth. On the subject of shifts in particular, there is very little questioning of why teams line up in the defensive alignment they do. It may be that proponents of radical changes in defensive alignments are correct. But it may be that there’s a lot of hidden wisdom in the way the defense is traditionally aligned, and until you know what that is, you risk losing it altogether.
As an organization, SABR has been only loosely connected with the development of sabermetrics; most of the important work in the field has been done without it. In terms of preserving the game’s history, though, few have done the kind of work SABR has done. (Even in terms of preserving the history of sabermetrics itself, SABR’s archive of its “By The Numbers” newsletter is probably the single greatest record of the work of sabermetricians not named James or Palmer prior to the dawn of the Internet era.)
SABR is an organization with deep roots in history that needs to find ways to be relevant to the here-and-now to survive. Sabermetrics is a field that’s very relevant now but that has underdeveloped roots in history, both its own and the history of what it studies. The two could complement each other beautifully. It remains to be seen whether or not they will.