
It happens every week: a reader sees his favorite team trailing one of its division rivals in the Hit List rankings despite leading in the actual division race, and fires off a snarky e-mail or comment questioning the validity of the list, occasionally while making anatomical references, and usually citing last year’s division race or post-season results. Yes, Phillies fans, I can assure you that we’ve counted the rings. Well into my fifth season of writing the Hit List, I’m far more amused by such occurrences than I am offended, but the weekly give-and-take serves as a reminder of the occasional need to explain the list’s workings in greater detail. As such, I annually set aside a column called the Hit List Remix to walk readers through the process.

First, a quick refresher course on the Hit List’s basics. It’s BP’s version of the power rankings, created by me back in 2005, and based upon an objective formula which averages a team’s actual, first-, second-, and third-order winning percentages via the Adjusted Standings. To go into a bit more detail:

  • First-order winning percentage is computed (via Pythagenpat, Pythagoras’ slightly more sophisticated sibling) using actual runs scored and allowed.

  • Second-order winning percentage uses equivalent runs scored and allowed, based on run elements (hits, walks, total bases, stolen bases, etc.) and the scoring environment (park and league adjustments).

  • Third-order winning percentage adjusts for the quality of the opponent’s hitting and pitching via opposing hitter EqA (OppHEqA) and opposing pitcher EqA (OppPEqA), both of which Clay Davenport recently added to the Adjusted Standings report for those of you curious enough to take an interest in such things.
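
As a rough illustration of the first-order step, here is a minimal sketch of a Pythagenpat-style calculation. It uses the commonly published form, in which the exponent is runs per game raised to a small power; the 0.287 constant, the function name, and the sample run totals are assumptions for illustration, not values confirmed from the Adjusted Standings.

```python
def pythagenpat_wpct(runs_scored, runs_allowed, games, exponent_power=0.287):
    """Expected winning percentage from runs scored and allowed.

    exponent_power=0.287 is a commonly quoted Pythagenpat constant and is
    an assumption here; the Adjusted Standings may use a different value.
    """
    runs_per_game = (runs_scored + runs_allowed) / games
    x = runs_per_game ** exponent_power  # exponent floats with the run environment
    return runs_scored ** x / (runs_scored ** x + runs_allowed ** x)


# Hypothetical team: 600 runs scored, 584 allowed over 120 games.
print(round(pythagenpat_wpct(600, 584, 120), 3))
```

The second- and third-order figures would feed equivalent runs and opponent-adjusted equivalent runs into the same kind of function in place of actual runs.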

With the exception of an injection of pre-season PECOTA projections during the season’s first month, those numbers are all that go into the rankings, which are averaged into what I’ve called the Hit List Factor (HLF). There are no subjective choices to be made, no additional tweaking to favor the A’s or to hurt the Phillies or fit into any of the other 28 conspiracy theories our readers might think of offering. No recent hot or cold streaks or head-to-head records are accounted for, either, despite the frustration of readers wondering why their team hasn’t vaulted to the top thanks to a 5-2 week against their division rivals. It’s all about runs, actual and projected, because run scoring and run prevention give us the best indication of a team’s strength going forward. Using all four percentages is a way of correcting for teams that over- or underperform relative to the various areas examined.

With that in mind, let’s take a look at the American League Central race, which has drawn comment because, despite maintaining at least a share of first place since May 10, the Tigers have consistently trailed either the White Sox or the Twins on the Hit List, and sometimes both of them. In last Friday’s edition of the Hit List, which I’ll use for all of the examples in this article, the White Sox ranked 12th, the Tigers 16th, and the Twins 17th despite the Tigers holding a 2½-game lead on the Sox and a four-game lead on the Twins at the time. Here’s the breakdown of the various winning percentages that went into that week’s Hit List Factor:

Rk  Team          W0     W1     W2     W3    HLF
12  White Sox   .512   .514   .524   .515   .516
16  Tigers      .533   .513   .491   .486   .506
17  Twins       .479   .501   .506   .509   .499
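
The HLF column in the table above can be reproduced with a plain average of the four percentages. This is a minimal sketch: the function and variable names are made up for illustration, and it leaves out the early-season PECOTA component, which no longer applies this late in the year.

```python
def hit_list_factor(w0, w1, w2, w3):
    """Unweighted mean of actual and first- through third-order winning percentages."""
    return (w0 + w1 + w2 + w3) / 4


# (W0, W1, W2, W3) as printed in the AL Central table.
al_central = {
    "White Sox": (0.512, 0.514, 0.524, 0.515),
    "Tigers": (0.533, 0.513, 0.491, 0.486),
    "Twins": (0.479, 0.501, 0.506, 0.509),
}

# Rank the three teams by HLF, best first.
ranked = sorted(al_central, key=lambda team: hit_list_factor(*al_central[team]), reverse=True)
for team in ranked:
    print(f"{team:<10} {hit_list_factor(*al_central[team]):.3f}")
```

Run as written, this orders the teams White Sox, Tigers, Twins with factors of .516, .506, and .499, matching the table.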

Of the three teams, the Tigers had the best winning percentage (W0), but the White Sox had the best run differential (+16 to the Tigers’ +15 and the Twins’ +1) and thus a very slight edge in first-order percentage (W1). Those two figures were almost perfectly in sync for the Sox, but the Tigers were 2.4 wins ahead of their expectation, the Twins 2.6 wins behind theirs. In terms of run elements, the gap grew even wider, with the Sox compiling enough hits, walks and other goodies to project as outscoring their opponents by 28 runs and the Twins doing so by seven runs, but the Tigers projecting to be outscored by 11 runs.

In terms of third-order adjustments, all three teams had faced below-average slates of opposing hitters and pitchers. Recall that .260 is defined as the league average:

Team       OppHEqA  OppPEqA
White Sox   .2576    .2596
Tigers      .2584    .2594
Twins       .2591    .2584

The Sox had faced the easiest hitters of the three, the Twins the easiest pitchers, and when all that was factored in, the Sox maintained a 29-point third-order lead on the Tigers and wound up with a Hit List Factor right in line with their winning percentage. The Tigers, on the other hand, were 47 points ahead of their third-order winning percentage, a difference of 5.7 wins. That overperformance is why they’re atop the AL Central, and it’s been partially credited here via the inclusion of W0. But it’s also not necessarily something to bank on going forward, given that the other metrics suggest they’re so far ahead of expectation.
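
Converting a gap in percentage points into wins is simply the gap multiplied by games played. The sketch below assumes a figure of 120 games for the Tigers, which the article does not state; because the percentages in the table are rounded to three places, the result lands near the quoted 5.7 wins rather than exactly on it.

```python
def wins_vs_expectation(actual_pct, expected_pct, games_played):
    """Wins above (positive) or below (negative) an expected winning percentage."""
    return (actual_pct - expected_pct) * games_played


# games_played=120 is an assumed value, not taken from the article.
tigers_gap = wins_vs_expectation(0.533, 0.486, games_played=120)
print(round(tigers_gap, 1))
```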

Turning to another race which you may have heard about:

Rk  Team          W0     W1     W2     W3    HLF
2   Yankees     .628   .584   .607   .608   .607
3   Rays        .542   .563   .587   .589   .570
4   Red Sox     .575   .577   .552   .561   .566

Prior to this past weekend’s series in Fenway, the Yankees led the Red Sox by 6½ games and the Rays by 10½ despite the fact that the first-order spread encompassing the three teams was only about 2½ games. The Yankees had been unusually efficient in converting their runs scored and allowed into wins, and the Rays had not, a result that likely owed something to the fact that the Yanks led the league in WXRL at the time while the Rays were seventh. The Red Sox, whose bullpen ranked second, had been on target in converting their runs into wins but trailed both teams on the Hit List because their second- and third-order winning percentages were lower than those of either of the other teams; in all, though, their performance had been the closest to their various projected winning percentages.

As of last week, those were the only divisions where the Hit List rankings deviated from the standings as far as the contenders were concerned, although that hasn’t always been the case; such anomalies are more common early in the season, but they tend to sort themselves out along the way, even if the pace at which they do can seem glacial.

Speaking of divisions but turning from the micro to the macro, here’s a look at how the six of them stack up:

            --------2009-------   --------2008-------    HLF
Division    Avg RK  WPct    HLF   Avg RK  WPct    HLF    +/-
AL East       9.8   .522   .534     7.6   .538   .549  -.015
NL West      13.4   .511   .516    20.0   .463   .474   .042
AL West      14.0   .533   .513    18.8   .487   .475   .038
NL East      15.2   .492   .500    15.4   .490   .495   .005
AL Central   18.8   .470   .480    16.0   .501   .505  -.025
NL Central   20.5   .482   .467    15.8   .515   .498  -.031

Last year saw a historically strong AL East, one which ranked fourth in winning percentage within the Wild Card Era, as well as third in Hit List Factor. This year’s AL East is strong enough to rank fifth in the latter category, thanks to that trio of top five teams, though it misses the top 10 in the former. That is mainly a product of the Blue Jays’ falloff from being the best fourth-place team ever; they’ve declined from being a .556 HLF team in 2008 to a .506 one this year, though they spent much of the first half living up to last year’s performance before the cracks in their foundation started to show.

Meanwhile, the AL West has the highest winning percentage this year, good enough for fifth in the Wild Card Era, this after posting the 12th-lowest winning percentage in the era last year, one point ahead of the NL West. Both Wests have made drastic improvements relative to the rest of the pack since 2008. While the Angels aren’t runaway favorites in the AL West, they’re still a very strong club, and the Rangers have improved enough to become Wild Card threats. In the NL West, the Dodgers have ranked atop the Hit List for most of the year, and while they’ve come back to the pack a bit in the division race, that’s in part due to the Rockies and Giants playing some strong baseball as they jockey for the wild-card lead. The Rockies ranked 22nd during the week they canned Clint Hurdle as manager, but they’ve methodically climbed the rankings to the point where they were sixth last week.

The NL Central, on the other hand, is bad enough to rank as the seventh-worst of the era in HLF; no fewer than four of the division’s teams (the Brewers, Astros, Pirates, and Reds) are strewn among the bottom 10 spots on the most recent list. That’s in a division that spent most of the first half with the top four teams separated by just three games and appeared to have an interesting race for the postseason on tap. At least those Central teams can take comfort in the fact that they’re about four wins ahead of their third-order projections apiece, the widest average discrepancy of any division and enough to push their raw winning percentage past that of the AL Central.

On a league level, the split between the AL and NL isn’t as wide as it was last year:

Year  AL HLF  NL HLF   Diff
2009   .509    .493    .016
2008   .512    .490    .022
2007   .506    .495    .012
2006   .513    .488    .025
2005   .509    .492    .017

The shrinking gap owes something to the fact that the AL’s interleague advantage this year was only 137-114, down from 149-103 last year, though in both years the AL has beaten its first-order Pythagenpat projection by a couple of wins. Given the persistence of that split, it might make sense to include a league-based adjustment, particularly since that isn’t built in elsewhere. That’s something I’m toying with back in the Hit List lab, and while I haven’t decided whether to implement it on a weekly basis, it’s something I’m considering, and certainly something that readers have suggested. Here’s what last week’s rankings would look like if I applied a nine-point bonus to the AL teams and a seven-point penalty to the NL ones (the numbers aren’t exactly equal because the NL has 16 teams to the AL’s 14, so they are actually .0086 and .0075):

Rk   Team          AHLF
 1   Yankees       .615
 2   Dodgers       .603
 3   Rays          .579
 4   Red Sox       .575
 5   Angels        .562
 6   Rangers       .560
 7   Phillies      .556
 8   Rockies       .552
 9   Cardinals     .537
10   Braves        .536
11   White Sox     .525
12   Giants        .523
13   Blue Jays     .515
14   Tigers        .514
15   Twins         .507
16   Marlins       .507
17   Cubs          .505
18   Mariners      .498
19   Indians       .484
20   Athletics     .466
21   Diamondbacks  .465
22   Brewers       .459
23   Mets          .456
24   Astros        .441
25   Orioles       .428
26   Royals        .413
27   Nationals     .407
28   Pirates       .407
29   Reds          .403
30   Padres        .402
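
One reading that reproduces the unequal bonus and penalty is to split the AL-NL gap so that the adjustments cancel out across all 30 teams. The sketch below works under that assumption, which the article does not confirm as the actual method; the gap is taken as the sum of the quoted .0086 and .0075, and the function names are hypothetical.

```python
AL_TEAMS, NL_TEAMS = 14, 16


def league_adjustments(gap, al_teams=AL_TEAMS, nl_teams=NL_TEAMS):
    """Split a league HLF gap into an AL bonus and NL penalty that net to zero.

    Assumes the adjustment is zero-sum across teams:
    al_teams * bonus == nl_teams * penalty, and bonus + penalty == gap.
    """
    total = al_teams + nl_teams
    al_bonus = gap * nl_teams / total
    nl_penalty = gap * al_teams / total
    return al_bonus, nl_penalty


gap = 0.0086 + 0.0075  # back-solved from the quoted adjustments
al_bonus, nl_penalty = league_adjustments(gap)


def adjusted_hlf(hlf, league):
    """Apply the league bonus or penalty to an unadjusted Hit List Factor."""
    return hlf + al_bonus if league == "AL" else hlf - nl_penalty


# White Sox: unrounded HLF from the four percentages in the AL Central table.
white_sox_hlf = (0.512 + 0.514 + 0.524 + 0.515) / 4
print(f"{al_bonus:.4f} {nl_penalty:.4f} {adjusted_hlf(white_sox_hlf, 'AL'):.3f}")
```

Under these assumptions the split comes out to .0086 and .0075, and the White Sox land at .525, in line with the table above.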

Via what I’ll provisionally call AHLF for Adjusted Hit List Factor (greaaaat, another acronym), the effect isn’t overwhelming. The top spot changes hands between the Dodgers and the Yankees, but five of the top 10 teams are still from the weaker senior circuit, though they move down an average of one rung apiece. Many of the rankings in the middle of the list are unchanged; at the bottom of the list, the Royals benefit by vaulting from 30th to 26th. In all, the magnitude of the adjustment may be a bit conservative, but conceptually, such an adjustment is probably an appropriate step to take. Consider it a topic for further exploration, and an appropriate spot to end this tour of the Hit List sausage factory. I’ll be back on Friday with the full serving of links.

Looks like you have a snafu there with your AHLF table.
My biggest issue is you consider them 'power rankings', when they really aren't, they are just a summary of how teams have performed (while ignoring the biggest indicator of performance, wins) to date.

Why ignore 0th-order wins? BP likes to call teams that outperform their 1st, 2nd, or 3rd order wins 'lucky', but I haven't seen a recent article showing that the correlation from year to year on Wins Above Expectation is near 0. Unless that's the case, I think wins are a necessary part of a power ranking measurement.

Additionally, Power Rankings ought to be informing us of how teams are playing right now, so additional weight ought to be given to the most recent 7, 14, or 30 days. It would be interesting to see what the correlation is between the current power ranking formula and the performance in the next 2 weeks or so. I'm guessing that correlation could be improved by weighting the more recent games more heavily.

On a different subject, if you are going to 'adjust' for league difficulty, please make sure you do it right. I don't have the extensive database to prove it, but I'm 100% certain that if the AL and NL were perfectly evenly matched, the AL's expected winning percentage in interleague would be significantly higher than .500, because the advantage of having a DH on your roster and using them while the NL has to use the first guy off their bench in AL parks is way more than the advantage the NL has in their park, where their pitcher will hit a little better than the AL pitcher due to more practice. If the AL should win 56% of their home games and the NL should win 52% of theirs, the AL would have an expected record of 131-121.
If I read correctly, real wins (W0) are indeed part of the equation.
I thought it was just W1 (based on runs scored/allowed) + W2 (equivalent runs scored/allowed) + W3 (Eq. RS/RA adjusted for schedule) divided by 3.
I missed the word "actual" on first reading too, but Jay explains that it "averages a team's actual, first-, second-, and third-order winning percentages."
Oops. So it does. I missed it on the 2nd reading as well. Ok, so ignore my second paragraph above. The rest of it still applies. :)
And that's why I have to write these articles, because people seem to miss the fine print. "...based upon an objective formula which averages a team's ACTUAL, first-, second-, and third-order winning percentages via the Adjusted Standings."

As for the idea of weighting recent performance more heavily, there isn't really any evidence to suggest that two weeks will tell us much about a team's level of success going forward; if anything, it's likely to indicate future regression to the mean.
No one said only use 2 weeks of data, but I think the last 2 weeks tells us more than the first two weeks of April on who is likely to win in the next week or two.
If you've got a yen to see who the hottest teams are, you can check out the Elo-adjusted playoff odds here, with accompanying background here.
I guess my objection is more with the term power rankings if that's NOT what your goal is. Power Rankings, in general usage, are typically rankings of 'who would beat who right now'. The Hit List is more of an adjusted standings than that, in my opinion.
OK. I just did a quick-and-dirty study which, while hardly definitive, may shed some light into the predictive nature of recent performance versus year-to-date performance in this context.

Using last year's Hit List, I broke the season up into four-week chunks and tested the correlation between each team's "monthly" w0, w1, w2, w3, and HLF and their following month's actual record. I used these four-week splits because that's what I could easily create from the master Hit List spreadsheet (I only save the adjusted standings for the days I use to compile the list). I don't have end-of-month splits available, and waiting for Clay Davenport to dig them up would take some time.

The correlations for "monthly" ____ winning percentage to next "month's" actual winning percentage:

actual: .21
first-order: .24
second-order: .18
third-order: .17
HLF: .22

I then tested the correlation between the various year-to-date winning percentages from those increments and the next month's actual winning percentage.

actual: .304
first-order: .289
second-order: .298
third-order: .296
HLF: .312

This is a pretty slapdash study, but it does support the none-too-controversial idea that a larger sample size such as a year-to-date performance is more useful in predicting W-L performance going forward than a recent increment is. And at the very least, using HLF for that purpose is no worse than using any of the individual winning percentages, and possibly better.
Fair enough. That completely contradicts what I found when I did something similar for a different sport, but I guess every sport has their quirks.

oops, second link should be this.
I think the reason that looking at the last two weeks might be significant is that team personnel changes throughout the year. As an example, I'll compare my favorite team, the Rangers, from last Sunday's lineup to Opening Day's lineup:

C Saltalamacchia -> Ivan Rodriguez
1B Davis -> Blalock
2B Kinsler -> Kinsler
3B Young -> Young
SS Andrus -> Andrus
LF Byrd -> Byrd
CF Hamilton -> Hamilton
RF Cruz -> Cruz
DH Blalock -> Jones
SP Millwood -> Millwood
SP Padilla -> Feldman
SP McCarthy -> Holland
SP Benson -> Hunter
SP Harrison -> Nippert

Surprisingly, the only changes are at catcher, first base, and DH, and tonight's lineup might have Davis at first and Blalock DH'ing, leaving the only change at catcher. But Kinsler has missed some time, and Cruz has been hurt and Hamilton has been out too. And of course the rotation is entirely different. Benson lasted like 2 starts, and Harrison and McCarthy both lasted 11 (Hunter has had 11 starts as well....) and Padilla 18.

Trying to rank teams is always kind of a moving target, but the Hit list isn't a bad way to do it, as long as you understand what it is.

Pretty easy to see why people would think there are 3 components when you have 3 bullet points. It would have been easier to read/comprehend if you just listed the four bullet points, and not explained the actual winning percentage part in detail.
"an objective formula which averages a team's actual, first-, second-, and third-order winning percentages"

That's W0, W1, W2, and W3. I suspect you were interpreting *actual* as a stray adjective, rather than a category.

Anyway, you're assuming your conclusion by calling wins the "biggest indicator of performance". Wins are the biggest indicator of wins, and the perfect stat for figuring out where you are in the standings. But if you want to predict future wins, you're better off looking at performance -- i.e. what players have done in individual plate appearances or batters faced. Better still, adjust those for park and league and quality of opposition.
Gaah. That will teach me not to take time to think about how to phrase things...
I don't disagree ... but winning matters too.

And honestly, until we see that the Hit List factor is actually correlated with future performance, they really aren't much more than a different form of standings, are they? They tell us about the past, not about the future.
I can't wait to see your Power Rankings prognosticating future performance, MH. You make sure to tell us all when you have that ready, OK?
I used to do it for another sport, but no longer do so due to time constraints. If you are willing to pay me whatever BP pays Jay to do Hit List, though, I'll reconsider.
Is a .600 record over two weeks of games against the Nationals and Pirates more indicative of future performance than a .500 record against the Yankees and Red Sox over the same time period?

Maybe building in a "remaining strength of schedule" factor would matter more than recent performance.
the point of measures like expected wins and underlying performance formula is the idea that actual records obscure more stable and baseline performance.

it is a reasonable direction of inquiry given the factor of luck, and the expectation that luck does not continue.

One thing confuses me about your first two tables. If all three AL Central teams have faced below-average opposition, shouldn't their third-order winning percentages be lower than the second-order? That's the case for the Sox and Tigers, but not the Twins.
Is BP's baserunning metric (ex-stolen base runs) good enough to add into the mix? It seems a natural addition to figure second-order W%...hits, walks, total bases, steals, and bases advanced / outs avoided. Or is that part of "etc."?
It's part of the etc.; the link to the glossary entry is there for a reason. The full buffet for Raw EqA, the starting point, is

Don't the third order projections take league into account through EqA? If AL teams are stronger they should have better OppHEqA and OppPEqA.
If memory serves, the Phillies have consistently outperformed their second and third order wins over the past three years. I'm curious if there might be a particular reason why a particular team might consistently outperform its second and third order wins over several years
omg i sorry for snarky comment last time! i forgot to delete the last line of my post xD even though i felt bad about it afterwards. this is still the most interesting and best power ranking. ^_^
Thanks for a comprehensive explanation of the Power Rankings. Perhaps you could provide a link in future posts to this article to limit your need for any further explanation.

I would think that the methodology to develop a model which best fits historical data would be completely different than that used to develop a predictive model so I wouldn't feel compelled to alter the Power Rankings to simulate the Power Rankings used by bettors of college football.

Jay, I definitely think you should add the league adjustments to the Hit List factor. Intuitively though, it's not obvious that the adjustment should be linear - Pythagenpat isn't... so, maybe the adjustment should be more for teams in the middle? It would take some figuring out.
Allow me to say that the "Hit List" is easily the most useless recurring feature at BP.
Hmmm, I find them fun!!! The comments by Jay are simply more informational than the weekly scout's comments, which are similar to expressions one finds in a fortune cookie or the daily horoscope found in newspapers.
Besides, if there wasn't a ranking list of teams, BP readers would definitely ask for one. Personally I like how Jay does it.
Make that informative