April 27, 2004
Pooh-Poohing Pythag
Bill James came up with a team stat a long time ago that's found wide traction. It's the Pythagorean theorem, and it looks like:
Winning Pct = WPct = RS^2 / (RS^2 + RA^2)
As Clay Davenport wrote in an excellent 1999 article on the theorem and improving its calculation:
RS = runs scored, RA = runs allowed, and ^ means "raised to the power of", in this case, 2. It was the "raised to the power of 2" parts that reminded James of geometry's Pythagorean theorem (a^2 = b^2 + c^2), hence the bestowment of an unwieldy name.
James later said that 1.82 was an even better exponent, and Davenport (in that article) proposed an even better way of calculating it. What's happened with the stat since then is what interests me.
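For anyone who wants to poke at the formula, here's a minimal Python sketch of it. The function names and the 800/700 run totals are mine, made up for illustration; the only things taken from the formula above are RS, RA, and the exponent (2 by default, 1.82 as the alternative James suggested). Davenport's refinement isn't attempted here.

```python
def pythagorean_wpct(runs_scored, runs_allowed, exponent=2.0):
    """Pythagorean winning percentage: RS^x / (RS^x + RA^x)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def pythagorean_wins(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Scale the winning percentage to a win total over `games` games."""
    return games * pythagorean_wpct(runs_scored, runs_allowed, exponent)


# Hypothetical season totals, not any real team's numbers.
rs, ra = 800, 700

print(pythagorean_wpct(rs, ra))                 # exponent 2
print(pythagorean_wpct(rs, ra, exponent=1.82))  # exponent 1.82
print(pythagorean_wins(rs, ra))                 # expected wins over 162 games
```

For a team that outscores its opponents, the smaller exponent pulls the expected winning percentage a little closer to .500.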
The stat itself turns out a win percentage that generally lands within a couple of wins of a team's actual total over the whole season. It's amusing, but ultimately not that useful. It's also served as a starting point for all kinds of statistical silliness that we should be wary of.
A discrepancy between "Pythagorean Wins" (what you'd expect from a team given a specific runs scored/runs allowed set) and actual wins leads to all kinds of investigation, chin-scratching, nose-picking, and navel-gazing. Some people will say a team is "stronger" than its actual record because it's underperforming the formula, and so forth. Suspects for the gap typically include:
This leads to interesting observations and theories (team x is 12-0 in one run games, manager Joe's teams consistently outperform their Pythagorean record except when they don't) but rarely insight. It's not as bad as putting a couple of stats into a blender, pressing the "pulse" button a couple of times, and claiming the resulting undrinkable smoothie is some kind of innovation. But it's still a waste of time.
Minnesota, as I write this, is two games over their "expected win-loss" record. Kansas City is two games below. Montreal, at 5-14, is two games over its "expected win-loss" record. I put expected in quotes not to make a derogatory implied comment, but because that's the term listed on ESPN's stat pages (even though at the bottom of the page there's this sentence: "This formula was designed to relate a team's runs scored and runs allowed to its won-lost record.")
At what point did this little stat become the metric used to generate a yardstick for team performance?
Take weather. Chaos theory arose out of meteorology, when Lorenz noticed that running the same simulations with extremely small variations in starting conditions resulted in wildly divergent results down the road. It's part of why long-range weather forecasting is a fool's task. In the same way, so is baseball forecasting: reading pre-season forecasts from anyone (especially me) about what things will look like at the end of a 162-game season, much less one then followed by a couple of short playoff series, should be for entertainment value only.
In the same way that Lorenz's weather simulations turned out differently, so does the season. I can run two Diamond Mind simulated seasons, and while over time you'll get a pretty good idea of where things most likely will end up, there's no way to know for certain what will happen each season. You can watch the chaos arise from the first pitch thrown, which affects the first at-bat, difference piling on difference until it tips who goes to the World Series and who doesn't.
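You don't need Diamond Mind to see the spread. Here's a far cruder sketch, and the assumptions are all mine: every game is an independent coin flip with a fixed win probability, which ignores everything that makes baseball baseball. The function name, the seed, and the 1,000-season count are arbitrary choices for illustration.

```python
import random


def simulate_season(win_prob, games=162, rng=None):
    """Count wins in one season where every game is an independent coin flip."""
    if rng is None:
        rng = random.Random()
    return sum(1 for _ in range(games) if rng.random() < win_prob)


# Replay the same .500 team 1,000 times; the seed only makes runs repeatable.
rng = random.Random(2004)
seasons = [simulate_season(0.5, rng=rng) for _ in range(1000)]

print("fewest wins:", min(seasons))
print("most wins:  ", max(seasons))
print("average:    ", sum(seasons) / len(seasons))
```

The average settles near 81 wins, but individual seasons scatter well away from it, even though nothing about the team changed between runs.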
Or imagine doing economic analysis this way. The modern world economy is probably the most open and complicated system in existence. There are thousands of high-level economic indicators that show how home buying is going, or consumer debt spending. These represent all people on the planet doing their business, no matter what country or economic system they work in or mooch off. If the U.S. economy goes into a recession, does it do any good to say "but given that steel prices were down and bonds were up, we should have seen a boom period"?
Of course not. What happened happened, and while maybe someone can show me that generally when those two things happen, there's happy days, that still neglects every other force at work, the forces that ultimately resulted in something unexpected happening. To scratch your head and say "Why didn't steel and bond prices determine the economy's performance? Is the economy stronger than it appears?" obviously misses the point.
Back to expected win-loss record, though. If you re-ran this season exactly as it was today, Minnesota would be two games over its expected win-loss record, a product of all the thousands of tiny threads that came together in each game and each series to produce the record they hold today. A reliever used today can't come in tomorrow to save a game, and the guy who does come in might give up three home runs and cost his team the game. The difference between using that reliever and having him available is just a nudge, even in that game: a groundball caught by an infielder instead of trickling out for a single, a couple of more efficient outs allowing the starter to pitch an extra inning.
All the tiny events add up to force GMs into trades to cover for injuries, to get managers fired, and to push franchises to tear down and rebuild, and each of these has impacts of its own.
Teams aren't stronger than their records, or weaker than their records. They stand on their records, and the won-loss columns are only what they are, a history of what's happened so far. Perhaps there's a pot of gold at the end of this rainbow-chase, and the Pythagorean formula will lead us to understand something about how baseball is played, or how it can be played better. For now, though, the variances from the Pythagorean record only show that baseball's a complicated and fascinating sport that's also not easily reducible, and I think everyone knows that already.