
Baseball Prospectus home
  
  

April 27, 2004

Breaking Balls

Pooh-Poohing Pythag

by Derek Zumsteg

Bill James came up with a team stat a long time ago that's found wide traction. It's the Pythagorean theorem, and it looks like:


                            RS^2
   Winning Pct = WPct = ------------
                         RS^2 + RA^2

As Clay Davenport wrote in an excellent 1999 article on the theorem and improving its calculation:

RS = runs scored, RA = runs allowed, and ^ means "raised to the power of", in this case, 2. It was the "raised to the power of 2" parts that reminded James of geometry's Pythagorean theorem (a^2 = b^2 + c^2), hence the bestowment of an unwieldy name.

James later said that 1.82 was an even better exponent, and Davenport (in that article) proposed an even better way of calculating it. What's happened with the stat since then is what interests me.
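For readers who want to play with the formula, here is a minimal Python sketch of it. The function name `pythagorean_wpct` and the 800/700 run totals are made up for illustration, and the code covers only the fixed-exponent versions (2 and 1.82), not Davenport's refinement.

```python
def pythagorean_wpct(runs_scored: float, runs_allowed: float,
                     exponent: float = 2.0) -> float:
    """Expected winning percentage: RS^x / (RS^x + RA^x)."""
    if runs_scored < 0 or runs_allowed < 0:
        raise ValueError("run totals cannot be negative")
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    if rs + ra == 0:
        raise ValueError("at least one run total must be positive")
    return rs / (rs + ra)


if __name__ == "__main__":
    # Hypothetical team: 800 runs scored, 700 runs allowed.
    for x in (2.0, 1.82):
        wpct = pythagorean_wpct(800, 700, exponent=x)
        print(f"exponent {x}: expected winning pct {wpct:.3f}")
```

For a team that outscores its opponents, the lower exponent pulls the expected winning percentage slightly back toward .500.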

The stat itself turns out a winning percentage that generally lands within a couple of wins of a team's actual total over the whole season. It's amusing, but ultimately not that useful. It's also served as a starting point for all kinds of statistical silliness that we should be wary of.

A discrepancy between "Pythagorean Wins" (what you'd expect from a team given a specific runs scored/runs allowed set) and actual wins leads to all kinds of investigation, chin-scratching, nose-picking, and navel-gazing. Some people will say a team is "stronger" than its actual record because it's underperforming the formula, and so forth. Suspects for the gap typically include:

  • Strength of bullpen

  • Managerial use of bullpen

  • Clutch hitting

  • Clutch pitching

  • Chemistry

  • Managerial strategies in tight games

  • Luck

This leads to interesting observations and theories (team x is 12-0 in one run games, manager Joe's teams consistently outperform their Pythagorean record except when they don't) but rarely insight. It's not as bad as putting a couple of stats into a blender, pressing the "pulse" button a couple of times, and claiming the resulting undrinkable smoothie is some kind of innovation. But it's still a waste of time.

Minnesota, as I write this, is two games over their "expected win-loss" record. Kansas City is two games below. Montreal, at 5-14, is two games over its "expected win-loss" record. I put expected in quotes not to make a derogatory implied comment, but because that's the term listed on ESPN's stat pages (even though at the bottom of the page there's this sentence: "This formula was designed to relate a team's runs scored and runs allowed to its won-lost record.")
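The "games over expected" figure is simple arithmetic, sketched below in Python. This assumes the plain exponent-2 formula, which may not match how ESPN computes or rounds its number. The function name and the run totals are hypothetical; they are not Montreal's actual numbers.

```python
def games_over_expected(wins: int, losses: int,
                        runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Actual wins minus Pythagorean expected wins for the games played."""
    games = wins + losses
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    expected_wins = games * rs / (rs + ra)
    return wins - expected_wins


if __name__ == "__main__":
    # Hypothetical 5-14 team with made-up run totals (50 scored, 115 allowed).
    diff = games_over_expected(5, 14, runs_scored=50, runs_allowed=115)
    print(f"games over expected: {diff:+.1f}")
```

A positive result means the team has won more than the formula says it should have; a negative one means it has won fewer.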

At what point did this little stat become the metric used to generate a yardstick for team performance?

Take weather. Chaos theory arose out of meteorology, when Lorenz noticed that running the same simulations with extremely small variations in starting conditions resulted in wildly divergent results down the road. It's part of why long-range weather forecasting is a fool's task. In the same way, so is baseball forecasting: reading pre-season forecasts from anyone (especially me) about what things will look like at the end of a 162-game season, much less one then followed by a couple of short playoff series, should be for entertainment value only.

In the same way that Lorenz's weather simulations turned out differently, so does the season. I can run two Diamond Mind simulated seasons, and while over time you'll get a pretty good idea of where things most likely will end up, there's no way to know for certain what will happen each season. You can watch the chaos arise from the first pitch thrown, which affects the first at-bat, difference piling on difference until it tips who goes to the World Series and who doesn't.
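A toy version of this is easy to sketch in Python. To be clear, this is not how Diamond Mind works. It assumes every game is an independent weighted coin flip, so it shows plain randomness rather than true chaos, and every name in it is made up for illustration. Even so, it shows how far one season can land from the long-run average.

```python
import random
from typing import List, Optional


def simulate_season(win_prob: float, games: int = 162,
                    rng: Optional[random.Random] = None) -> int:
    """Count wins over a season of independent coin-flip games."""
    if rng is None:
        rng = random.Random()
    return sum(1 for _ in range(games) if rng.random() < win_prob)


def simulate_many(win_prob: float, seasons: int, seed: int) -> List[int]:
    """Replay the same team many times, seeded so runs can be repeated."""
    rng = random.Random(seed)
    return [simulate_season(win_prob, rng=rng) for _ in range(seasons)]


if __name__ == "__main__":
    # A hypothetical true-.500 team, replayed 1,000 times.
    totals = simulate_many(0.5, seasons=1000, seed=2004)
    average = sum(totals) / len(totals)
    print(f"fewest wins: {min(totals)}, most wins: {max(totals)}, "
          f"average: {average:.1f}")
```

The average settles near 81 wins, but the individual seasons spread well to either side of it, which is the point: the aggregate is predictable and any one season is not.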

Or imagine doing economic analysis this way. The modern world economy is probably the most open and complicated system in existence. There are thousands of high-level economic indicators that show how home buying is going, or consumer debt spending. These represent all people on the planet doing their business, no matter what country or economic system they work in or mooch off. If the U.S. economy goes into a recession, does it do any good to say "but given that steel prices were down and bonds were up, we should have seen a boom period"?

Of course not. What happened happened, and while maybe someone can show me that generally when those two things happen, there are happy days, that still neglects every other force at work, the forces that ultimately resulted in something unexpected happening. To scratch your head and say "Why didn't steel and bond prices determine the economy's performance? Is the economy stronger than it appears?" obviously misses the point.

Back to expected win-loss record, though. If you re-ran this season exactly as it was today, Minnesota would be two games over its expected win-loss record, a product of all the thousands of tiny threads that came together in each game and each series to produce the record they hold today. A reliever used today can't come in tomorrow to save a game, and the guy who does come in might give up three home runs and cost his team the game. The difference between using that reliever and having him available is just a nudge, even in that game: a groundball caught by an infielder instead of trickling out for a single, a couple more efficient outs allowing the starter to pitch an extra inning.

All the tiny events get together to force GMs into trades to cover for injuries, to get managers fired, and to push franchises to tear down and rebuild, and each of these has impacts of its own.

Teams aren't stronger than their records, or weaker than their records. They stand on their records, and the won-loss columns are only what they are, a history of what's happened so far. Perhaps there's a pot of gold at the end of this rainbow-chase, and the Pythagorean formula will lead us to understand something about how baseball is played, or how it can be played better. For now, though, the variances from the Pythagorean record only show that baseball's a complicated and fascinating sport that's also not easily reducible, and I think everyone knows that already.
