August 15, 2012
The Importance of Imperfect Models
From the Twitters yesterday morning:
All the context you really need to know to understand that tweet:
I leave the matter of Barney’s defensive rating as an exercise for the reader—that equine hasn’t stopped twitching, but I’ll hold off on the beatings for right now. No, I want to discuss something that came out of a long discussion on Twitter in response to that remark: What does it mean when we think WAR (or any other metric) is wrong? Can we still use it? Should we discard it until we’ve worked out all the imperfections?
Unlike the WAR example earlier, we can confirm that FIP is wrong in Chapman’s case, since negative runs are impossible in baseball. So if FIP can be wrong for Chapman, is it possibly wrong in other cases? Does that mean we should stop using FIP?
The answer to those questions, if you’re impatient:
It’s quite easy to see how FIP “breaks” here: it’s a linear model, and the slope of the line means that it will go below zero if the conditions are right. Unlike reality, FIP is not bounded at zero at the lower end. If a pitcher’s strikeout rate, relative to his walks and home runs, gets very high, you will see FIP go negative. But FIP will bend before it breaks: there are going to be some above-zero pitchers who nonetheless have lower estimates than they would if FIP were realistically bounded at zero on the lower end.
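For the curious, here is a rough sketch of how a linear formula ends up below zero. It assumes the commonly published form of FIP, (13×HR + 3×(BB+HBP) − 2×K) / IP plus a league constant. The 3.10 constant is only an illustrative value (the real one changes by year and by site), and both stat lines are made up for the example rather than taken from any real pitcher.

```python
def fip(hr, bb, hbp, k, ip, constant=3.10):
    """Commonly published FIP form.

    ASSUMPTION: the weights (13, 3, 2) follow the widely cited formula;
    the constant of 3.10 is illustrative and varies by league and season.
    """
    return (13 * hr + 3 * (bb + hbp) - 2 * k) / ip + constant


# Hypothetical stat lines, invented purely for illustration.
typical_starter = fip(hr=20, bb=50, hbp=5, k=150, ip=200.0)
extreme_reliever = fip(hr=0, bb=2, hbp=0, k=40, ip=20.0)

print(f"typical starter:  {typical_starter:.2f}")
print(f"extreme reliever: {extreme_reliever:.2f}")

# Nothing in the formula stops the strikeout term from outweighing
# everything else, so the estimate can drop below zero runs.
print("negative estimate:", extreme_reliever < 0)
```

The strikeout term is the only one with a minus sign, and nothing caps it. Pile up enough strikeouts per inning with few walks and no home runs, and the line simply keeps going down through zero.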
To be blunt, FIP is a model. What we can say about models is this:
Let’s take an example from physics. Most everyone is taught Isaac Newton’s theories of gravity in school these days, despite the fact that Newtonian physics has been shown to be at best incomplete, and in terms of working physics has been supplanted by Einstein’s theories of relativity and by quantum mechanics. There are real, observed phenomena (like measurements of the rotation of the galaxy) that point out problems with Newton’s theories. So why do we still teach them and use them?
The answer is simple: because they’re still useful in predicting the behavior of gravity as we can observe it in our everyday lives. The extreme cases where they break down, either at the level of large galaxies or of minute quantum particles, are simply not relevant to us. Moreover, learning Newton’s theories can still teach us quite a bit about gravity.
(Some of you may be trying to reconcile this statement with my opinions about, say, batted ball data. I do believe that imperfect models, which is to say all models, can be useful. But that isn't to say that all of them are. When looking at batted ball data, there is evidence that suggests that the model isn't adding anything to our understanding, while adding to our complexity. When it comes to accepting a model, I tend to err on the side of parsimony, which is to say the idea that the simplest theory that explains what we're looking at is best. That doesn't mean that complexity is bad, but for a more complicated model to be embraced it should offer conclusive evidence that it's adding to our understanding.)
The world is not as simple as being right or wrong. As science fiction author Isaac Asimov wrote, "When people thought the earth was flat, they were wrong. When people thought the earth was spherical, they were wrong. But if you think that thinking the earth is spherical is just as wrong as thinking the earth is flat, then your view is wronger than both of them put together." As Patriot (whose blog is the sabermetric site with the lowest ratio of readers to useful insights out there, and which you owe it to yourself to read more often) put it:
So let me tell you a little story.
I was in the kitchen, doing dishes. Suddenly, I hear the sound of a child’s tears from the living room, and as I look over my shoulder, I see my little girl standing in the doorway of the kitchen. In her left hand is the saucer section of the USS Enterprise, registry number NCC-1701-D. In her right hand is the engineering section of the USS Enterprise.
That particular model kit (which I had assembled all the way back in high school) was not intended to detach the two sections of the ship from each other. She looked up at me and said, “I broke your spaceship, daddy.”
Now, I am going to tell you what I told her that day.
Sometimes, these things happen. Sometimes models get old. Sometimes they’re brittle. Sometimes they’re built by teenagers living in their mother’s basement and they’re not built to last. That’s okay. It was put out there to be used, played with, and enjoyed. We’ll see if it’s still usable, or if we can fix it. If not, we’ll throw it away and get a new one.*
This is of course a fine line to be walked. We don’t want to keep models around after they’ve long been supplanted by superior ones. We want to keep progressing, and if we’re too accommodating, we’ll stagnate. (This is, if you think about it, one of the key reasons Bill James was able to have the impact he did—because by and large that’s what the generation of baseball writers before him had done.)
But if we wait around for perfect models, we will be waiting forever. This doesn’t mean we shouldn’t be humble about how we use our models, because we should. And we should be infinitely more cautious when using a model where we haven’t found flaws than using a model where we have found flaws. We should avoid false certainties, instead presenting our uncertainties and our uncertainties about our uncertainties. But we can do that only by playing with our toys more, not less. We learn so much more when we take the toys off the shelves and start to use them.
*In case you were wondering – the Utopia Planitia shipyards were unable to salvage the craft. In life, as in art, it has been replaced by the Sovereign-class USS Enterprise, NCC-1701-E. It lights up and makes cool noises when you press the buttons on it.