November 16, 2006
The Numb3rs Game
"It is a characteristic of statisticians that they see the game by the thousands. It's a way of looking at the biggest possible picture of the game. Backing away from it a great distance and trying to see patterns that aren't apparent close up."
-Bill James, from "Baseball's Secret Formula"
There is no doubt that the last several years have seen a rapid proliferation of sabermetric ideas. Whether it's the publication of books like Moneyball and The Numbers Game, the increased visibility of advanced metrics in mainstream writing about the game, or the front office hiring of analysts like Keith Law and Bill James, who directly bring to bear insights gained from performance analysis, all signs point to a synthesis of traditional and quantitative approaches that will create a new orthodoxy.
These ideas and insights branch out to touch not only the game on the field and those who report on it, but the legions of fans whose interests range from deep-seated allegiance to a particular franchise to the projected performance of the players on their fantasy teams. It should come as no surprise then that these ideas have also made their way into the larger culture in small yet interesting ways. This week, we'll take a closer look at two examples of this phenomenon, including last Friday night's episode of the CBS crime show Numb3rs and a program that first aired earlier this year on The Science Channel titled Baseball's Secret Formula.
What Are All Those Numb3rs?
As a person who certainly likes numbers and appreciates math, I was almost giddy some months ago when I saw the listing for a new drama on CBS titled Numb3rs. The premise revolves around cases that an FBI agent solves with help from his younger brother, who just happens to be a college mathematics professor. It sounded like something right up my alley, so accompanied by the thought "how cool is that?", I set the DVR and prepared to be entertained. Unfortunately the series, despite several viewings, has never really clicked with me, not just because the algorithms and their explanations are glossed over so much as to be unintelligible, but also because the storylines are often too simplistic and predictable. I have tried to like the show. I really have.
So, as you can imagine, I was a little hesitant when I saw the listing (hat tip to a fellow Cubs fan) for the episode titled "Hardball," in which sabermetrics is used to help solve a crime. Nevertheless, curiosity got the better of me, so I again set the DVR, and even asked my wife to watch with me. WARNING: If you haven't seen the episode, you can watch it here, but press on at your own risk, since I'm now going to ruin the plot for you.
The result was a bit of a mixed bag.
Our story begins when former major league slugger Vic Johnston, struggling through rehabbing injuries at the minor league level, takes his cuts in what appears to be a practice game. After hitting a gapper, he slides into second safely. Well, sort of: He's actually dead on arrival. After vials of steroids are found in his clubhouse locker, the FBI arrives and later discovers that the slugger died of a stroke from a tainted dose of a designer steroid. Interestingly, later in the episode Johnston's manager tells the FBI that players are not tested unless there is suspicion, although of course in the minor leagues players are actually subject to four unannounced tests per year (including the offseason) under a policy put in place in 2002. Be that as it may, during the investigation the FBI also recovers Johnston's laptop with e-mails indicating that someone had found out he was juicing. The method of detection was, yes, you guessed it, "advanced statistical baseball analysis," or sabermetrics. You see, the e-mails contain an attachment with several pages of formulas like:
GJ = 1.4HP + TMP + .72AGR + .43(TMP)(AGR)
At this point the math professor brother, needing help deciphering the formulas, enlists a colleague at the university (who is also a fantasy league champion), played by none other than Bill Nye, The Science Guy. After Nye finds similar notations (where TB stands for "thrown bats" and FHR is "foul home runs") on a fantasy site called "Boxscore Times," the investigation leads to a 25-year-old unemployed high school dropout named Oswald Kitner. After reverse engineering the formulas, our math professor discovers that Kitner has reinvented a particular "change-point detection procedure" that identifies steroid use in the players to whom it is applied.
It should be noted that change-point detection is in fact a field of statistical research, although it typically deals with detecting state changes in a set of signals such as a heart rate or an electroencephalogram. Its techniques have also been applied to quite different problems, such as detecting denial-of-service (DoS) attacks in network traffic.
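To make the idea concrete, here is a minimal sketch of one classic change-point technique, a one-sided CUSUM detector, written in Python. It is not the procedure depicted in the episode, which is never actually specified; the function name, the parameter names, and the heart-rate readings are all made up for illustration.

```python
def cusum_alarm(values, target, slack, threshold):
    """Return the index where an upward shift is first flagged, or None.

    target:    the level the signal is expected to hover around
    slack:     how far above target a reading must be before it counts
    threshold: how much accumulated excess triggers the alarm
    """
    excess = 0.0
    for i, x in enumerate(values):
        # Accumulate readings above target + slack; never drop below zero.
        excess = max(0.0, excess + (x - target - slack))
        if excess > threshold:
            return i
    return None


# Hypothetical heart-rate readings: steady at 60, then a jump to 70.
readings = [60.0] * 20 + [70.0] * 20
alarm_index = cusum_alarm(readings, target=60.0, slack=5.0, threshold=12.0)
print(alarm_index)
```

The detector works here because the shift is large and the signal is otherwise perfectly steady. Real performance data is far noisier than this toy series.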
Back in the real world, performance metrics, even advanced ones devised by unemployed high school dropouts, are far too blunt an instrument to approach the subject of PEDs on an individual level. As Nate Silver discussed in the chapter "What do Statistics Tell Us About Steroids?" in Baseball Between The Numbers, "unexplained changes in performance are the norm and not the exception," and even when analyzing in the aggregate players who we know used because they were caught, there is no statistically significant change detectable in their performance after they were caught and presumably stopped using.
Even though the formulas shown are nonsensical and the idea implausible, the show did contain two bits of truth. First, in a discussion of Kitner's work, our hero the math professor notes that sabermetrics "really is a powerful form of analysis," and, in a voice-over accompanied by some nice graphics, continues:
The physical nature of the game involves chance. So the difference between a hit and out could be millimeters or milliseconds. So when you have athletic situations involving chance repeated over and over again, a statistical analysis can isolate and reveal human performance.
Well said, yes indeed. His colleague then goes on to say that Kitner's work wouldn't be the first time that math revealed a surprising truth about the sport, and cites a real study published by Jay Bennett in a 1993 issue of The American Statistician in which he uses the Mills brothers' Player Game Percentage method (which is based on the concept of win expectancy) to analyze Joe Jackson's 1919 World Series performance, in which he hit .375, for evidence that he was involved in throwing the series.
In the study, Bennett finds "substantial support to Jackson's subsequent claims of innocence" but unfortunately the colleague (also a professor who should know better) says that Bennett's study "proves" that Jackson played up to his potential in each game of the series. While it's nice to see a reference to some real work in the field, leaving the impression that statistical analysis "proves" a particular viewpoint, whether it's in the case of Shoeless Joe Jackson, steroid use, or some other study, is at the very least an incorrect message to leave with viewers.
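For readers unfamiliar with win expectancy, the rough idea can be sketched in Python: each play moves a team's chance of winning up or down, and a player is credited with the net movement on the plays he was involved in. This is only an illustration of the concept, not Bennett's actual computation; the function name and the win probabilities below are invented.

```python
def net_win_expectancy(plays):
    """Sum the change in win probability over a player's plays.

    Each play is a (before, after) pair giving the team's win
    probability immediately before and after the play.
    """
    return sum(after - before for before, after in plays)


# Hypothetical plays: a single, a strikeout, and a double.
hypothetical_plays = [(0.50, 0.58), (0.58, 0.55), (0.40, 0.47)]
net = net_win_expectancy(hypothetical_plays)
print(round(net, 2))
```

A player trying to lose would be expected to show a negative total in the situations that mattered most, which is the kind of pattern a study like this looks for.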
The limits of statistical analysis are such that, when testing a hypothesis, the most that can be said is that we either do or do not find sufficient statistically significant evidence to reject the hypothesis. If we do, it is still not necessarily false, and if we don't, it is not therefore true. This is at the heart, for example, of studies that test for the presence of clutch hitting or the existence of "hot hitting." A study can never definitively prove that these phenomena don't exist, but can instead only show that, as in the case of clutch hitting, if they are real, they likely have a small effect and are difficult to discern statistically.
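A small calculation shows why a real but modest effect is so hard to discern. The Python sketch below assumes a hitter whose true average rises from .270 to .285 in clutch situations and asks how often a one-sided test at the 5 percent level would detect that over 150 clutch at-bats. The averages, the sample size, and the function names are assumptions chosen for illustration, and treating at-bats as independent coin flips is itself a simplification.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k hits in n at-bats with hit probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(k, n, p):
    """Probability of k or more hits in n at-bats."""
    return sum(binom_pmf(i, n, p) for i in range(k, n + 1))


def critical_hits(n, p0, alpha=0.05):
    """Smallest hit total that would be significant at level alpha under p0."""
    for k in range(n + 1):
        if upper_tail(k, n, p0) <= alpha:
            return k
    return n + 1


def power(n, p0, p1, alpha=0.05):
    """Chance of a significant result when the true hit probability is p1."""
    return upper_tail(critical_hits(n, p0, alpha), n, p1)


print(critical_hits(150, 0.270))
print(power(150, 0.270, 0.285))
```

Under these assumptions the chance of detecting the effect falls well short of even odds, so a non-significant result says little about whether the effect exists.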
So what of the case of the dead slugger? Once the sabermetric connection was made, the story unraveled rather quickly. It turns out that the young Kitner's work was stolen by a friend and e-mailed to Johnston, who then told his agent. The agent, knowing full well that Johnston was juicing, killed Kitner's friend, attempted to kill Kitner, and arranged for the fatal dose to the slugger with the help of the evil drug company. One is left pondering why the agent would kill Johnston since the agent still stood to make a lot of money if Johnston made it back to The Show, and even more importantly he would have to have known that an autopsy would reveal Johnston's cause of death, and eventually point the finger in his direction. But as discussed earlier, the plotlines in Numb3rs often leave something to be desired, and this episode was no different. The happy ending? Kitner gets a gig with the FBI, and another with Johnston's minor league team, so all's well that ends well.
Not So Secret Formulas
Our second representation of sabermetric ideas aired earlier this year and was re-run in the last month on The Science Channel, and is titled "Baseball's Secret Formula." Although it is certainly more fact than the fiction of the previous program, this show unfortunately also had its issues.
Those problems manifest themselves at the very start, when the narrator, after providing a brief introduction to Bill James, intones, "So what if all it really took is a calculator and stat book to win the World Series?" An interesting question, but one that right away sets up a false picture of what performance analysis and "sabermetricists" (as they call James et al.) can do.
From this dramatic introduction, the program backtracks to review traditional scouting methods centering on the five tools, with the help of the Angels' Eddie Bane. The show notes that this approach "alone" is steeped in subjectivity, and also comes with what James calls "a great deal of received wisdom," much of which is false. As James says, many times "the things that people know are the enemies of what they learn."
However, in introducing more objective measures (and they do mention the weaknesses in traditional measures like batting average and RBI that leave out skills or context) the program only mentions Linear Weights, Win Shares, and something called "Slugging to Winning," whose meaning eludes me. This revealed a basic weakness in the presentation. In 45 minutes they simply can't do justice to most of the concepts they introduce (setting aside for the moment the question of whether they even discussed the right ones), and as a result leave viewers with some misconceptions.
For example, the formula for on-base percentage is shown incorrectly with at-bats in the denominator, and in discussing Win Shares they leave the impression that the offensive component combines OBP and Runs Created somehow. More problematic, though, is the segment discussing the greatest players of all time, complete with James wandering around the grounds of the Hall of Fame, looking thoughtful. There they briefly discuss James' similarity scores, as if these were somehow useful in that discussion. This comes after a fine discussion of the evolution of the game and its various eras, park effects, and how all of that context should shape how we evaluate performance.
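Returning to the on-base percentage error noted above: the standard definition puts at-bats plus walks, hit-by-pitches, and sacrifice flies in the denominator, not at-bats alone. A short Python sketch shows how much the mistake matters. The function names and the stat line are made up for illustration.

```python
def on_base_percentage(h, bb, hbp, ab, sf):
    """Standard OBP: (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denominator = ab + bb + hbp + sf
    return (h + bb + hbp) / denominator if denominator else 0.0


def obp_with_at_bats_only(h, bb, hbp, ab):
    """The incorrect version, with only at-bats in the denominator."""
    return (h + bb + hbp) / ab if ab else 0.0


# Hypothetical season: 150 hits, 70 walks, 5 HBP, 500 at-bats, 5 sac flies.
correct = on_base_percentage(h=150, bb=70, hbp=5, ab=500, sf=5)
wrong = obp_with_at_bats_only(h=150, bb=70, hbp=5, ab=500)
print(f"{correct:.3f} vs. {wrong:.3f}")
```

Because walks and hit-by-pitches are added to the numerator but left out of the denominator, the incorrect version inflates the figure, and it inflates it most for exactly the patient hitters the statistic is meant to identify.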
But still, there were a number of parts of the program that were both instructive and accurate.
First, in an early segment I was particularly struck by the time devoted to discussing just why it is that baseball can be analyzed so well in comparison to other sports. The geometric configuration of the field, which keeps players separated, the fact that players take turns, and the limited options available to each player all contribute, creating what James calls an "orderly universe." They then build on this foundation to emphasize that the out is the only finite component of the game. That insight is then used to introduce Sandy Alderson, who discusses his use of sabermetrics to find players who possess skills that were undervalued. All great stuff.
Secondly, they spent a later segment reviewing the role sabermetric insights can play in the approximately 11,000 in-game decisions managers make each season, while cutting to parts of an interview with Red Sox manager Terry Francona. Francona comes off as open-minded, as someone who is willing to re-evaluate traditional methods and who uses James' input to "not overvalue some statistics that have been traditionally overvalued." He also later points out that "baseball's traditionally been twenty years behind...and we're just trying to catch up." The in-game strategies that are presented are:
The last segment, shot in the offices of Baseball Info Solutions, was well done, providing a peek at both the methodology in use and the tools that are being developed. The lead-in begins with the idea that fielding statistics are the "last bastion" for sabermetrics. Alongside interviews of John Dewan and Steve Moyer, there is an excellent overview of the system behind The Fielding Bible that then transitions nicely to future ideas that include computer chips embedded in the baseball, on-field cameras for better tracking, and even sensors implanted in the field of play that would allow "every activity" to be quantified.
As mentioned previously, the beginning of the program featured the throwing down of the gauntlet by questioning whether all it really takes to win a World Series is a calculator and stat book. Although not really focused on in the subsequent segments, at the very end, as the credits roll, they answer the question by briefly touching on the complexity of any human activity. Appropriately enough, they show Schwarz and James saying that reality is far too complicated to understand perfectly, and that sabermetrics is indeed only one piece of the puzzle. Not better than the scout's way or the manager's way, but simply different. It's too bad that message wasn't given more prominent play in an effort to provide better context.
Clarifying the Message
So what, if anything, have we learned from these two very different depictions of performance analysis? Perhaps one of the lessons is that, as the childhood game of "telephone" teaches us, the farther an idea travels from its originators, the more watered-down and misunderstood it often becomes. That makes it especially incumbent on those of us within the community to clearly communicate just what this kind of analysis can do, and perhaps more importantly, just what its limits are.