Prospectus Hit and Run: Predictive Power

If you’ve been following this space recently, you know I’ve been focused on teams that overachieve or underachieve relative to their projected records. I’ve particularly had my eye on the Angels, who wound up setting a record by exceeding their third-order Pythagenpat projection (that is, their projected record after adjusting for run elements, park, league, and quality of competition) by a staggering 16 games, 3.4 more than any other team since 1900. That they’re squaring off in the first round against the Red Sox, who fell short of their projected record by an MLB-high 7.1 games, raises the question: does this mean anything?

The answer, while not definitive, surprised me. Like a lost set of car keys, I found it in the last place I looked. I began by examining whether there was anything to be gleaned from the series with the widest third-order discrepancies between two clubs (their D3s, in Adjusted Standings terms) and stumbled onto something a bit more interesting. As it turns out, those D3 numbers are generally better predictors of short-series outcomes than either raw won-loss records or projected won-loss records.

To examine this, I went back to the historical Adjusted Standings data set that I’ve used for those recent articles, and tallied how often the outcome of a given playoff series could be predicted by actual winning percentage (W0), first-order winning percentage based on runs scored and runs allowed (W1), second-order winning percentage based on run elements (W2), third-order winning percentage (W3), or Hit List Factor (HLF, the average of those four, used for tabulating the weekly Hit Lists). What I discovered was that actual records are rather poor predictors of short-series outcomes; they get it right less than 50 percent of the time. Each of the projected records calls the outcome correctly more often, but still only around 50 percent of the time, if that. For some reason, first-order records do a slightly better job than their orderly cousins.

That alone is an interesting finding that bears repeating: in a short series, teams’ projected records are better predictors of their chances of winning than their actual records, but they still don’t predict the outcome significantly more often than 50 percent of the time. We have to dig deeper, and so I did, at least as far as the Adjusted Standings data goes. As it turns out, the winner of this beauty pageant of predictive power is a team’s third-order discrepancy (D3). The team with a greater discrepancy (the one that overachieved more relative to its projected record, in other words) turned out to be the victor more often.
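For readers who want to try a similar tally, here is a minimal Python sketch of the counting step described above. It is not the code behind this article: the record layout, the function names `hit_list_factor` and `tally`, and the rule that a tie counts as a miss are all assumptions made for illustration, and the three series in `toy_series` are made-up placeholder values, not real teams or real results.

```python
from statistics import mean


def hit_list_factor(w0, w1, w2, w3):
    """HLF as described in the text: the average of the four winning percentages."""
    return mean([w0, w1, w2, w3])


def tally(series_list, indicator):
    """Count the series in which the team with the higher indicator value won.

    Assumption: a tie on the indicator counts as a miss.
    """
    hits = 0
    for s in series_list:
        if s["winner"][indicator] > s["loser"][indicator]:
            hits += 1
    return hits


# PLACEHOLDER data, invented purely to exercise the function.
toy_series = [
    {"winner": {"W0": 0.580, "W1": 0.560, "D3": 3.0},
     "loser": {"W0": 0.600, "W1": 0.550, "D3": -2.0}},
    {"winner": {"W0": 0.550, "W1": 0.570, "D3": -1.0},
     "loser": {"W0": 0.590, "W1": 0.540, "D3": 4.0}},
    {"winner": {"W0": 0.610, "W1": 0.530, "D3": 6.0},
     "loser": {"W0": 0.570, "W1": 0.580, "D3": 1.0}},
]

for key in ("W0", "W1", "D3"):
    print(key, tally(toy_series, key), "of", len(toy_series))
```

Run against a real list of series, each call to `tally` would give one cell of a hits-by-indicator table.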

I’ve broken the data down by era and by series length, excluding the 1981 strike-year playoffs and separating the five-game League Championship Series era (1969-1984) from the seven-game one (1985-1993). Obviously, there’s no 1994 set either. The “#” is the number of series in the set, and the numbers under each column represent how often that indicator correctly predicted the outcome:


Era     Series       Period       #   W0   W1   W2   W3  HLF   D1   D2   D3
2-DIV   5-Game LCS   1969-1984   30   12   10   10    9   11   18   17   16
2-DIV   7-Game LCS   1985-1993   18    6   10   10   10    8    8    8    9
2-DIV   7-Game WS    1969-1993   24   11   10   10   11   11   10   10   10
2-DIV   All          1969-1993   72   29   30   30   30   30   36   35   35

3-DIV   5-Game LDS   1995-2007   52   24   29   28   28   26   25   29   31
3-DIV   7-Game LCS   1995-2007   26   14   16   16   16   15   14   14   15
3-DIV   7-Game WS    1995-2007   13    5    6    5    5    6    6    6    5
3-DIV   All          1995-2007   91   43   51   49   49   47   45   49   51

        All 5-Game   1969-2007   82   36   39   38   37   37   43   46   47
        All 7-Game   1969-2007   81   36   42   41   42   40   38   38   39

        All Non-WS   1969-2007  126   56   65   64   63   60   65   68   71
        All WS       1969-2007   37   16   16   15   16   17   16   16   15

        All          1969-1993   72   29   30   30   30   30   36   35   35
        All          1995-2007   91   43   51   49   49   47   45   49   51

That’s a ton of data, and before going much further, it makes sense to pare this down to a few logically-grouped sets with larger sample sizes. Using winning percentages:


Series       Period       #    W0    W1    D3
All 2-Div    1969-1993   72  .403  .417  .486
All 3-Div    1995-2007   91  .473  .560  .560

All 5-Game   1969-2007   82  .439  .476  .573
All 7-Game   1969-2007   81  .444  .519  .481

All Non-WS   1969-2007  126  .444  .516  .563
All WS       1969-2007   37  .432  .432  .405

All          1969-2007  163  .442  .497  .528
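The percentages in this table are simply the hit counts from the larger table divided by the number of series in each group. A short Python sketch of that arithmetic follows; the counts are copied from the table above, while the dictionary layout and the `rates` helper are illustrative choices rather than anything from the Adjusted Standings reports.

```python
# (number of series, W0 hits, W1 hits, D3 hits), copied from the larger table
counts = {
    "All 2-Div": (72, 29, 30, 35),
    "All 3-Div": (91, 43, 51, 51),
    "All 5-Game": (82, 36, 39, 47),
    "All 7-Game": (81, 36, 42, 39),
    "All Non-WS": (126, 56, 65, 71),
    "All WS": (37, 16, 16, 15),
}

# The overall row is the sum of the two eras.
counts["All"] = tuple(
    a + b for a, b in zip(counts["All 2-Div"], counts["All 3-Div"])
)


def rates(n, *hits):
    """Convert hit counts to success rates, rounded to three decimals."""
    return tuple(round(h / n, 3) for h in hits)


for label, row in counts.items():
    w0, w1, d3 = rates(*row)
    print(f"{label:<12}{row[0]:>4}  {w0:.3f}  {w1:.3f}  {d3:.3f}")
```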

Any resemblance to the NL West standings is entirely coincidental, though it does make for a convenient metaphor. The data underscores the utter futility of using actual records to predict playoff series; that .442 winning percentage is a nearly exact match for the actual record of this year’s Giants (.444). The success rate is considerably higher using first-order records, over .500 in some blocks but not all of them, enough to suggest that even using those is pretty much a crapshoot. It’s at its highest with the third-order discrepancies, a little higher on the whole than the actual record of this year’s Dodgers (.519), and at times about as high as those big, bad Phillies (.568).

I don’t want to overstate the claims about what all of this tells us given the sample sizes, but it’s worth laying out the inferences we can draw:

Projected records appear to be solid indicators of series success in the Wild Card era, much more so than in the two-division era.
Those projected records appear to do a much better job in the intermediate series than they do in the World Series (the smallest sample here).
Third-order discrepancies appear to be the strongest indicators in five-game series, and they match up well across the entire Wild Card era.
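One rough way to see why the sample-size caveat matters is to ask how often a coin flip would produce a split at least as lopsided as the ones in the tables. The Python sketch below computes an exact two-sided binomial p-value against a 50 percent success rate. This check is an added illustration, not part of the original analysis, and it assumes each series is an independent trial, which is a simplification; the function name `two_sided_binomial_p` is made up for the example.

```python
from math import comb


def two_sided_binomial_p(hits, n):
    """Exact two-sided p-value for `hits` out of `n` against a fair coin.

    With p = 0.5 the distribution is symmetric, so the two-sided value is
    twice the tail at or beyond the observed count, capped at 1.
    """
    k = max(hits, n - hits)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


# Hit counts and series totals copied from the tables above.
for label, hits, n in (
    ("D3, all 5-game series", 47, 82),
    ("D3, all series", 86, 163),
    ("W0, all series", 72, 163),
):
    print(f"{label}: {hits}/{n}, p = {two_sided_binomial_p(hits, n):.3f}")
```

Under those assumptions, none of these three splits falls below the conventional .05 threshold, which is consistent with treating the inferences above as suggestive rather than conclusive.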

The first and third points have the current era in common, and when we consider the difference between this period and the two-division one, one factor that stands out is the evolution of the bullpen’s importance. Recall that Nate Silver found closer quality (as measured by WXRL) to be a significant enough predictor of post-season success that he incorporated it into what he termed the “Secret Sauce.” Add to this my own finding of a modest correlation (r = .42) between team WXRL totals and third-order discrepancies across the 1954-2007 Retrosheet era, a correlation that edges up to .49 in the Wild Card era. What we appear to have stumbled upon is some further evidence of a link between regular season over- or underachievement, bullpen quality, and postseason success, one that merits further exploration.
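To put "modest" in perspective, squaring a correlation coefficient gives the share of variance in one measure that a simple linear relationship with the other would account for. The Python sketch below applies that standard identity to the two r values quoted above; the helper name `shared_variance` is made up for the example, and the linear framing is an assumption rather than something the correlation itself establishes.

```python
def shared_variance(r):
    """Coefficient of determination (r squared) for a simple linear fit."""
    return r ** 2


# r values quoted in the paragraph above
for label, r in (("1954-2007", 0.42), ("Wild Card era", 0.49)):
    print(f"{label}: r = {r:.2f}, r^2 = {shared_variance(r):.3f}")
```

That works out to roughly 18 percent and 24 percent, respectively, which leaves most of the variation in third-order discrepancies unaccounted for by bullpen WXRL alone.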

That will have to wait for another day. In the meantime, it’s too late to get money down on the four first-round series at hand, but it’s worth a quick gander at those first- and third-order indicators:


Team         W0    W1    D3
Cubs       .602  .612   2.5
Dodgers    .519  .535  -4.7

Phillies   .568  .575   5.2
Brewers    .556  .539   3.6

Red Sox    .586  .592  -7.1
Angels     .617  .543  16.0

Rays       .599  .566   0.0
White Sox  .546  .550   0.9

In the National League, both the W1 and D3 indicators match the conventional wisdom that the Cubs and Phillies will prevail. In the American League, the indicators split on both series, with the D3’s take on the White Sox-Rays perhaps the most surprising given the relative rankings of the two bullpens, with Tampa Bay leading the majors in WXRL and the White Sox running 10th. By comparison, this year’s Secret Sauce rankings call for both Chicago teams, Boston, and Milwaukee to prevail (a refresh of my browser window tells me that the White Sox gained one point in the closer category via Game 163; prior to that, they were even with the Rays). Clay Davenport’s Postseason Odds favored the Cubs, Brewers, Red Sox, and Rays before play opened yesterday. Our own prognosticators (myself included) have called for the Cubs, Phils, Red Sox, and Rays to move on, incorporating subjective factors as well as objective ones.
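The W1 and D3 picks described above can be read off the table mechanically by taking the team with the larger value in each matchup. Here is a minimal Python sketch of that rule; the values are copied from the table, while the tuple layout and the `pick` helper are illustrative. On a tie, this `pick` would return the second team listed, an arbitrary choice that does not come into play here.

```python
# (team, W1, D3) for each first-round matchup, copied from the table above
matchups = [
    (("Cubs", 0.612, 2.5), ("Dodgers", 0.535, -4.7)),
    (("Phillies", 0.575, 5.2), ("Brewers", 0.539, 3.6)),
    (("Red Sox", 0.592, -7.1), ("Angels", 0.543, 16.0)),
    (("Rays", 0.566, 0.0), ("White Sox", 0.550, 0.9)),
]


def pick(a, b, idx):
    """Return the name of the team with the larger value at position idx."""
    return a[0] if a[idx] > b[idx] else b[0]


w1_picks = [pick(a, b, 1) for a, b in matchups]
d3_picks = [pick(a, b, 2) for a, b in matchups]

print("W1 picks:", w1_picks)
print("D3 picks:", d3_picks)
# W1 picks: ['Cubs', 'Phillies', 'Red Sox', 'Rays']
# D3 picks: ['Cubs', 'Phillies', 'Angels', 'White Sox']
```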

One slate of results certainly won’t prove anything, but it should be fun to sort out where these various methodologies succeeded and failed. In the meantime, enjoy the current smorgasbord of post-season baseball.


coreyk626

10/02

Would it be possible to account for which team had the home-field advantage and then rerun the numbers?


jjaffe

Definitely possible but not immediately, as it requires manual data gathering.

It should be noted that except for the World Series, the HFA usually goes to the team with the better won-loss record (and hence W0) anyway except when that record is held by a Wild Card winner.


davejsch

The biggest X-factor of the short series is the use of only the top 3 or sometimes 4 pitchers on the staff. Is there a way to filter for this? Obviously, team A's #5 doesn't always play team B's #5 throughout the season, so it may not be possible.


There's probably a way to do that, but it's something that more likely falls into a Secret Sauce-level analysis that can incorporate the performances of individual personnel as well as team. What I've tried to do here is stay at the team level via the Adjusted Standings report to see if there's anything to be gleaned.

ericmvan

10/04

The slight advantage of D3 over D2 is almost certainly meaningless, especially considering that the W3 / D3 numbers are completely inaccurate, and that the adjustment for strength of schedule really should be made separately for actual, Pyth, and EqRS/RA Pyth win percentages, the raw versions of which can then be safely ignored.

(See http://sonsofsamhorn.net/index.php?showtopic=36962 for the problem with W3.)


Jay Jaffe
