Baseball Prospectus home

October 2, 2008

Prospectus Hit and Run

Predictive Power

by Jay Jaffe

If you've been following this space recently, you know I've been focused on teams that overachieve or underachieve relative to their projected records. I've particularly had my eye on the Angels, who wound up setting a record by exceeding their third-order Pythagenpat projection (that is, their projected record after adjusting for run elements, park, league, and quality of competition) by a staggering 16 games, 3.4 more than any other team since 1900. They're squaring off in the first round against the Red Sox, who fell short of their projected record by an MLB-high 7.1 games, which raises the question: does this mean anything?

The answer, while not definitive, surprised me. Like a lost set of car keys, it turned up in the last place I looked. I began by examining whether there was anything to be gleaned from the series with the widest third-order discrepancies between two clubs (their D3s, in Adjusted Standings terms) and stumbled onto something a bit more interesting. As it turns out, those D3 numbers are generally better predictors of short-series outcomes than either raw won-loss records or projected won-loss records.

To examine this, I went back to the historical Adjusted Standings data set that I've used for those recent articles, and tallied how often the outcome of a given playoff series could be predicted by actual winning percentage (W0), first-order winning percentage based on runs scored and runs allowed (W1), second-order winning percentage based on run elements (W2), third-order winning percentage (W3), or Hit List Factor (HLF, the average of those four, used for tabulating the weekly Hit Lists). What I discovered was that actual records are rather poor predictors of short-series outcomes; they get it right less than 50 percent of the time. Any of the projected records calls the outcome correctly more often, but still only around 50 percent of the time, if that. For some reason, first-order records do a slightly better job than their orderly cousins.
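For readers unfamiliar with how a first-order record is derived from runs scored and allowed, here is a minimal sketch. It assumes the commonly cited Pythagenpat form, in which the exponent is the run environment raised to a constant near 0.287; the exact constant behind the Adjusted Standings is not stated here, so `k`, the function names, and the example team are all illustrative assumptions. The second- and third-order adjustments (run elements, park, league, quality of competition) are not modeled.

```python
def pythagenpat_pct(runs_scored, runs_allowed, games, k=0.287):
    """Projected winning percentage from runs scored and allowed.

    k is an assumed constant; published variants use values near 0.28-0.29.
    """
    runs_per_game = (runs_scored + runs_allowed) / games
    exponent = runs_per_game ** k
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


def discrepancy(actual_wins, projected_pct, games):
    """Games by which a team beat (positive) or trailed (negative) its projection."""
    return actual_wins - projected_pct * games


# Hypothetical team: 800 runs scored, 700 allowed, 95 wins in 162 games.
w1 = pythagenpat_pct(800, 700, 162)
d1 = discrepancy(95, w1, 162)
print(f"W1 = {w1:.3f}, D1 = {d1:+.1f}")
```

The same `discrepancy` arithmetic applied to a third-order projection would give a D3-style figure, assuming D3 is simply actual wins minus third-order wins.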

That alone is an interesting finding that bears repeating: in a short series, teams' projected records are better predictors of their chances of winning than their actual records, but they still don't predict the outcome significantly more often than 50 percent of the time. We have to dig deeper, and so I did, at least as far as the Adjusted Standings data goes. As it turns out, the winner of this beauty pageant of predictive power is a team's third-order discrepancy (D3). The team with the greater discrepancy (the one that overachieved more relative to its projected record, in other words) turned out to be the victor more often.
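The tallying described above can be sketched roughly as follows. This is not the actual code behind the study; the `Team` and `Series` structures, the made-up values, and the choice to count a tie between the two teams as a miss are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    indicators: dict  # hypothetical layout, e.g. {"W0": 0.600, "D3": -2.0}


@dataclass
class Series:
    winner: Team
    loser: Team


def tally_hits(series_list, key):
    """Count series in which the team with the higher `key` value won.

    Ties on the indicator are counted as misses (an assumption).
    """
    hits = 0
    for s in series_list:
        if s.winner.indicators[key] > s.loser.indicators[key]:
            hits += 1
    return hits


# Made-up teams and outcomes, purely to exercise the function.
a = Team("A", {"W0": 0.600, "D3": -2.0})
b = Team("B", {"W0": 0.550, "D3": 4.0})
c = Team("C", {"W0": 0.580, "D3": 1.0})
d = Team("D", {"W0": 0.560, "D3": -3.0})
series = [Series(winner=b, loser=a), Series(winner=c, loser=d)]

for key in ("W0", "D3"):
    print(key, tally_hits(series, key), "of", len(series))
```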

I've broken the data down by era and by series length, excluding the 1981 strike-year playoffs and separating the five-game League Championship Series era (1969-1984) from the seven-game one (1985-1993). Obviously, there's no 1994 set either. The "#" is the number of series in the set, and the numbers under each column represent how often that indicator correctly predicted the outcome:

Era     Series       Period       #   W0   W1   W2   W3  HLF   D1   D2   D3
2-DIV   5-Game LCS   1969-1984   30   12   10   10    9   11   18   17   16
2-DIV   7-Game LCS   1985-1993   18    6   10   10   10    8    8    8    9
2-DIV   7-Game WS    1969-1993   24   11   10   10   11   11   10   10   10
2-DIV   All          1969-1993   72   29   30   30   30   30   36   35   35

3-DIV   5-Game LDS   1995-2007   52   24   29   28   28   26   25   29   31
3-DIV   7-Game LCS   1995-2007   26   14   16   16   16   15   14   14   15
3-DIV   7-Game WS    1995-2007   13    5    6    5    5    6    6    6    5
3-DIV   All          1995-2007   91   43   51   49   49   47   45   49   51

        All 5-Game   1969-2007   82   36   39   38   37   37   43   46   47
        All 7-Game   1969-2007   81   36   42   41   42   40   38   38   39

        All Non-WS   1969-2007  126   56   65   64   63   60   65   68   71
        All WS       1969-2007   37   16   16   15   16   17   16   16   15

        All          1969-1993   72   29   30   30   30   30   36   35   35
        All          1995-2007   91   43   51   49   49   47   45   49   51

That's a ton of data, and before going much further, it makes sense to pare this down to a few logically grouped sets with larger sample sizes. Expressing each indicator's success rate as a winning percentage:

Series       Period       #    W0    W1    D3
All 2-Div    1969-1993   72  .403  .417  .486
All 3-Div    1995-2007   91  .473  .560  .560

All 5-Game   1969-2007   82  .439  .476  .573
All 7-Game   1969-2007   81  .444  .519  .481

All Non-WS   1969-2007  126  .444  .516  .563
All WS       1969-2007   37  .432  .432  .405

All          1969-2007  163  .442  .497  .528
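The percentages in this summary follow directly from the hit counts in the full table. A short sketch that reproduces them, with the counts copied from that table and the overall row built by summing the two eras (the variable and function names are my own):

```python
# (series, W0 hits, W1 hits, D3 hits), copied from the full table above
counts = {
    "All 2-Div":  (72, 29, 30, 35),
    "All 3-Div":  (91, 43, 51, 51),
    "All 5-Game": (82, 36, 39, 47),
    "All 7-Game": (81, 36, 42, 39),
    "All Non-WS": (126, 56, 65, 71),
    "All WS":     (37, 16, 16, 15),
}
# The overall row is the sum of the two eras.
counts["All"] = tuple(
    x + y for x, y in zip(counts["All 2-Div"], counts["All 3-Div"])
)


def rate(hits, series):
    """Success rate formatted like a winning percentage, e.g. '.573'."""
    return f"{hits / series:.3f}".lstrip("0")


rates = {
    label: tuple(rate(h, n) for h in hits)
    for label, (n, *hits) in counts.items()
}

for label, (w0, w1, d3) in rates.items():
    print(f"{label:<11} {counts[label][0]:>4}  {w0}  {w1}  {d3}")
```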

Any resemblance to the NL West standings is entirely coincidental, though it does make for a convenient metaphor. The data underscores the utter futility of using actual records to predict playoff series; that .442 winning percentage is a nearly exact match for the actual record of this year's Giants (.444). The success rate is considerably higher using first-order records, over .500 in some blocks but not all of them, enough to suggest that even using those is pretty much a crapshoot. It's at its highest with the third-order discrepancies, a little higher on the whole than the actual record of this year's Dodgers (.519), and at times about as high as that of those big, bad Phillies (.568).

I don't want to overstate the claims about what all of this tells us given the sample sizes, but it's worth laying out the inferences we can draw:

  1. Projected records appear to be solid indicators of series success in the Wild Card era, much more so than in the two-division era.
  2. Those projected records appear to do a much better job in the intermediate series than they do in the World Series (the smallest sample here).
  3. Third-order discrepancies appear to be the strongest indicators in five-game series, and they match up well across the entire Wild Card era.
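The sample-size caveat can be made a little more concrete. The sketch below is my own addition, not part of the original study: it assumes each series is an independent coin flip under the null hypothesis and computes an exact two-sided binomial p-value for a few of the D3 splits, using hit counts from the tables above.

```python
from math import comb


def two_sided_p(hits, n):
    """Exact two-sided binomial p-value against a fair coin (p = 0.5)."""
    k = max(hits, n - hits)  # the distribution is symmetric around n / 2
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# (D3 hits, series) taken from the tables above
d3_samples = {
    "All 5-Game": (47, 82),
    "All Non-WS": (71, 126),
    "All":        (86, 163),
}

for label, (hits, n) in d3_samples.items():
    print(f"{label:<11} {hits}/{n}  p = {two_sided_p(hits, n):.3f}")
```

Treat the output as a rough gauge of how much room chance leaves in samples this small, not as a formal test of the inferences above; playoff series are not truly independent trials.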

The first and third points have the current era in common, and when we consider the difference between this period and the two-division one, one factor that stands out is the evolution of the bullpen's importance. Recall that Nate Silver found closer quality (as measured by WXRL) to be a significant enough predictor of postseason success that he incorporated it into what he termed the "Secret Sauce." Add to this my own finding of a modest correlation (r = .42) between team WXRL totals and third-order discrepancies across the 1954-2007 Retrosheet era, a correlation that edges up to .49 in the Wild Card era. What we appear to have stumbled upon is some further evidence of a link between regular season over- or underachievement, bullpen quality, and postseason success, one that merits further exploration.

That will have to wait for another day. In the meantime, it's too late to get money down on the four first-round series at hand, but it's worth a quick gander at those first- and third-order indicators:

Team         W0    W1    D3
Cubs       .602  .612   2.5
Dodgers    .519  .535  -4.7

Phillies   .568  .575   5.2
Brewers    .556  .539   3.6

Red Sox    .586  .592  -7.1
Angels     .617  .543  16.0

Rays       .599  .566   0.0
White Sox  .546  .550   0.9

In the National League, both the W1 and D3 indicators match the conventional wisdom that the Cubs and Phillies will prevail. In the American League, the indicators split on both series. The D3's take on the White Sox-Rays series is perhaps the most surprising given the relative rankings of the two bullpens, with Tampa Bay leading the majors in WXRL and the White Sox running 10th. By comparison, this year's Secret Sauce rankings call for both Chicago teams, Boston, and Milwaukee to prevail (a refresh of my browser window tells me that the White Sox gained one point in the closer category via Game 163; prior to that, they were even with the Rays). Clay Davenport's Postseason Odds favored the Cubs, Brewers, Red Sox, and Rays before play opened yesterday. Our own prognosticators (myself included) have called for the Cubs, Phils, Red Sox, and Rays to move on, incorporating subjective factors as well as objective ones.
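The W1 and D3 picks discussed above can be reproduced from the table with a few lines. This is only a sketch using the published values; the `pick` helper is my own, and it sides with the second team on a tie, which does not occur in this data.

```python
# W1 and D3 values for the four first-round series, from the table above
matchups = [
    ({"team": "Cubs", "W1": 0.612, "D3": 2.5},
     {"team": "Dodgers", "W1": 0.535, "D3": -4.7}),
    ({"team": "Phillies", "W1": 0.575, "D3": 5.2},
     {"team": "Brewers", "W1": 0.539, "D3": 3.6}),
    ({"team": "Red Sox", "W1": 0.592, "D3": -7.1},
     {"team": "Angels", "W1": 0.543, "D3": 16.0}),
    ({"team": "Rays", "W1": 0.566, "D3": 0.0},
     {"team": "White Sox", "W1": 0.550, "D3": 0.9}),
]


def pick(a, b, key):
    """Return the name of the team with the higher value of `key`."""
    return a["team"] if a[key] > b[key] else b["team"]


w1_picks = [pick(a, b, "W1") for a, b in matchups]
d3_picks = [pick(a, b, "D3") for a, b in matchups]

for (a, b), w1, d3 in zip(matchups, w1_picks, d3_picks):
    print(f"{a['team']} vs. {b['team']}: W1 picks {w1}, D3 picks {d3}")
```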

One slate of results certainly won't prove anything, but it should be fun to sort out where these various methodologies succeeded and failed. In the meantime, enjoy the current smorgasbord of postseason baseball.

Jay Jaffe is an author of Baseball Prospectus. 