September 7, 2012
Do the Dodgers Lack Chemistry?
So, Ken Rosenthal threw down the gauntlet this morning:
How else indeed. Lemme get out the old calculator, and we'll see how else one can account for it. Let's treat everything as an independent binomial (I really don't feel like breaking out the hypergeometric distribution for this, and it'll be close enough). And let's give the Dodgers a win probability estimate of about .583 over that time period—I'll hand out all the formulas you need here, so if you have a different estimate, please feel free to use it. To come up with the expected random standard deviation, we simply take the square root of p(1 − p)/n, where p is our winning percentage estimate (.583) and n is the number of games (12).
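If you'd like to run the numbers yourself, here's the whole calculation as a quick Python sketch. The .583 talent estimate and the 12-game window are the assumptions from above; swap in your own if you disagree:

```python
from math import sqrt

# Standard deviation of winning percentage over an n-game stretch,
# treating each game as an independent binomial trial.
p = 0.583  # assumed true-talent win probability (from above)
n = 12     # games in the stretch

sd = sqrt(p * (1 - p) / n)
print(round(sd, 3))  # 0.142
```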
Run those figures, and we get an expected random standard deviation of .142, which is to say that over a 12-game stretch, we should expect a team with a "true talent" of .583 to fall somewhere between .441 and .725 68 percent of the time. (This assumes a symmetrical margin of error, which there really isn't a good reason to expect—we should expect to see outcomes below .441 more often than we see outcomes of .725 and above. That's because we should, all things being equal, expect performance closer to the mean than farther away from it. We can model this, but for our purposes right now, we don't really have to. We're also not accounting for uncertainty in our estimate of true talent. The cumulative effect of all the things we aren't accounting for is that we're probably understating how often we should expect a 5-7 run in 12 games.)
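And if you'd rather not lean on the normal approximation at all, the exact binomial answer is just as easy to get. A sketch, again under the same p = .583 and n = 12 assumptions from above:

```python
from math import comb

p, n = 0.583, 12

def pmf(k: int) -> float:
    """Probability of exactly k wins in n games at true talent p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Probability of a 5-7 stretch or worse (a .417 clip or below)
p_five_or_fewer = sum(pmf(k) for k in range(6))
print(round(p_five_or_fewer, 2))  # 0.19
```

Roughly one 12-game stretch in five comes out that badly (or worse) for a .583 team, no chemistry required.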
Okay, so over that stretch, we have the Dodgers winning at a .417 clip, or at 1.17 standard deviations away from our initial estimate, so we go from about 68 percent to 75.8 percent—in other words, three quarters of the time, we should expect teams to fall that close to their true talent. So that means a quarter of the time we should expect teams to be more extreme than that. Given how many 12-game sets there are in a 162-game schedule, we should expect to see results like this several times per season. In fact, the Dodgers went 1-11 from June 19th through June 30th. Granted, that may have been an inferior edition of the Dodgers, but not so inferior that we would have projected a 12-game stretch like that. So one other way we can account for a talented team losing over a small sample is sheer randomness. And from the results alone, it's nearly impossible to tell randomness and a chemistry problem apart.
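To put a rough number on "several times per season": carve the 162-game schedule into 13 non-overlapping 12-game windows (an assumption of convenience; consecutive stretches of a real season aren't truly independent) and ask how many should land more than 1.17 standard deviations from true talent:

```python
from math import erf, sqrt

z = 1.17  # the Dodgers' 5-7 stretch, in standard deviations

# Two-sided normal probability of landing outside +/- z standard deviations
p_outside = 1 - erf(z / sqrt(2))

windows = 162 // 12  # 13 non-overlapping 12-game windows in a season
print(round(windows * p_outside, 1))  # about 3 such stretches a season
```

Three-ish extreme stretches a year, for every team in the league, purely from coin-flip variance.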
Now, attend please, sabermetricians and fellow travelers. The fact that something is indistinguishable from randomness given our data and our model does not make it random. Rosenthal closes with a remark about the "absolute certainty that the statistically inclined crave," which should be the opposite of what those of us in the sabermetric community crave. If you remember nothing else, folks, remember this: What profits a man if he pisses off Ken Rosenthal but loses his soul? We should strive to acknowledge uncertainty wherever it exists, and in this case, we are uncertain about the validity of Rosenthal's claims. (And by "strive to" I mean do it or the Krampus will eat you.)
The reason we're uncertain about Rosenthal's theory is that we lack good ways to test it. We should, of course, seek out such tests and actually run them. But until then, his theory is of very little utility. It doesn't help us predict what the Dodgers will do. (And let's face it, the rest of the season is not so long that a poor performance out of the Dodgers would do much in confirming it.) And because the theory seeks to explain previously observed behavior, it has the uncomfortable feeling of being a post hoc rationalization, rather than a prediction (as J. Wheatley-Schaller of Vegas Watch pointed out, the article would be much more compelling had it been written BEFORE the poor run of performance). So while Rosenthal is not necessarily wrong (again, absence of evidence is not evidence of absence), he's not contributing a lot of substance to the discussion with his theory, either.