August 26, 2008
Prospectus Hit and Run
Angels closer Francisco Rodriguez notched his 50th save on Sunday night, and if you've been following some of Joe Sheehan's recent work, you know that K-Rod is well on his way to smashing Bobby Thigpen's single-season record of 57 saves, set back in 1990. You're probably not aware that he and the rest of the Halos are in sight of another record as well.
Through Sunday, the Angels were 29 games over .500 at 79-50, despite having outscored their opponents by only 53 runs. That put them 9.7 wins above their expected record, meaning their first-order Pythagenpat projection based on actual runs scored and runs allowed. They're 13.5 wins above their second-order projection, based on Equivalent Runs scored and allowed as derived from run elements (hits, walks, total bases, stolen bases, etc.) and adjusted for their park and league scoring environment. They're 12.2 wins above their third-order projection, which adjusts for the quality of their opponents' pitching and hitting via Equivalent Average (EqA) allowed and opponents' EqA. That last figure would tie for third all-time had the season ended on Sunday (the Angels lost on Monday night, slightly lowering these figures). Turning to the big board for the top 20:
Rk   Year Team   W   L   PCT    R   RA  AEQR  AEQRA    D3  Won
1    2004 NYA  101  61  .623  897  808   911    831  12.7  Division
2    1970 CIN  102  60  .630  775  681   757    676  12.6  Pennant
3T   2007 ARI   90  72  .556  712  732   708    739  12.2  Division
3T   2008 ANA   79  50  .612  600  547   588    566  12.2
5T   1954 BRO   92  62  .597  778  740   782    749  12.1
5T   2005 CHA   99  63  .611  741  645   740    684  12.1  World Series
7    1905 DET   79  74  .516  512  604   524    601  11.9
8T   1924 BRO   92  62  .597  717  679   717    684  11.7
8T   2002 MIN   94  67  .584  768  712   759    741  11.7  Division
10   1954 CLE  111  43  .721  746  504   717    511  11.4  Pennant
11T  1907 CHN  108  44  .711  574  390   552    394  11.2  World Series
11T  1961 CIN   93  61  .604  710  653   705    658  11.2  Pennant
13T  1931 PHA  107  45  .704  858  626   841    639  11.0  Pennant
13T  1972 NYN   83  73  .532  528  578   533    583  11.0
15   1984 NYN   90  72  .556  652  676   657    671  10.7
16T  1936 SLN   87  67  .565  795  794   808    809  10.2
16T  1977 BAL   97  64  .602  719  653   719    662  10.2
16T  2006 OAK   93  69  .574  771  727   791    772  10.2  Division
19T  1997 SFN   90  72  .556  784  793   780    789  10.0  Division
19T  2007 SEA   88  74  .543  794  813   792    824  10.0
Since the spreadsheet provided to me by Clay Davenport (who cooks up the Adjusted Standings every day) doesn't go beyond the first decimal place, I haven't bothered to break the ties here. AEQR and AEQRA are the adjusted Equivalent Run figures once opponent strength has been incorporated; D3 is the difference between third-order wins and actual wins, with a positive number representing a team that's exceeded its projection. "Won" notes whether a team won their division, pennant, or World Series.
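For readers who want to check a D3 figure by hand, here is a minimal sketch of a Pythagenpat-style calculation. The function names are illustrative, and the 0.287 constant in the exponent is an assumption (a commonly cited value), not something taken from Clay Davenport's spreadsheet, so results may differ slightly from the Adjusted Standings.

```python
def pythagenpat_wins(runs_for, runs_against, games, k=0.287):
    """Expected wins from runs scored and allowed under a Pythagenpat model.

    The exponent floats with the run environment: x = (runs per game) ** k.
    k = 0.287 is an assumed constant, not one stated in the article.
    """
    runs_per_game = (runs_for + runs_against) / games
    x = runs_per_game ** k
    win_pct = runs_for ** x / (runs_for ** x + runs_against ** x)
    return win_pct * games


def wins_above_projection(wins, runs_for, runs_against, games, k=0.287):
    """Actual wins minus projected wins; positive means overachieving."""
    return wins - pythagenpat_wins(runs_for, runs_against, games, k)


# 2008 Angels through Sunday: 79-50, with AEQR 588 and AEQRA 566 from the table.
angels_d3 = wins_above_projection(79, 588, 566, 129)
print(f"{angels_d3:.1f}")
```

Fed the Angels' AEQR and AEQRA, this comes out close to the 12.2 listed in the D3 column. Swapping in actual runs or unadjusted Equivalent Runs gives the first- and second-order versions, though those need not match the published figures exactly under the assumed constant.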
All in all, that's a pretty interesting group of overachieving teams spanning more than a century. Three of the top eight squads since 1901 in terms of single-season winning percentage are represented here, in the form of the 1954 Indians (third), 1907 Cubs (seventh), and 1931 A's (eighth). As you'd expect, reaching that rarefied air generally requires a team to outplay their already-favorable projections. The 22 teams who have finished full seasons with actual winning percentages of .682 or above (three of whom tied for 20th-best all-time, with 105-49 records) averaged 6.4 wins above their third-order projections, while only the 1939 Yankees (legendary for pairing a young Joe DiMaggio with Lou Gehrig in his final season) finished below theirs. Believe it or not, those Yankees actually should have been 1.5 games better than their already impressive 106-45 record. Blame Gehrig, the slacker.
At the other end of the spectrum from those juggernauts are six teams on the list who were actually outscored on the season, including the 2007 Diamondbacks, with whom the Angels are currently tied. All six used their Pythagorean voodoo to wind up with winning records, but none won more than 90 games; the 1997 Giants were the only ones besides those Snakes to make the playoffs (six teams with negative run differentials have made the postseason in all). Oddly enough, just over half of the above teams (11 of the 19 with completed seasons) made the postseason, and of the ones that did, only the 1907 Cubs and 2005 White Sox won the World Series. Overachievement can apparently only take you so far.
The Angels are currently tied with last year's Diamondbacks, who finished 18 games over .500 while being outscored by 20 runs. It's misleading to say that they're "on pace" to break the 2004 Yankees' record, since teams that are exceeding or underplaying their Pythagorean records by wide margins tend to regress to the mean. The Angels may well pass those Yanks before the end of the year, only to slide back behind them if the results don't fall just so. Finishing anywhere in the vicinity of the top of this list would be impressive enough.
Cracking the list should come with a warning label, however. Back when Bill James introduced the original Pythagorean analysis in his Baseball Abstract series, he noted that teams who exceeded their projections one year tended to decline the following year. He named that concept the Plexiglass Principle—plexiglass, once bent, tends to return to form—but it's really just another name for regression to the mean. The 2007 Mariners, who also crack the list above, weren't alone in seeing their fortunes fade the following season. A quick-and-dirty check of the 17 teams above who have full season follow-ups in the books shows that they declined by an average of 7.5 games the following year.
We tend to talk about teams that are over- or under-playing their projected records as "lucky" or "unlucky," but it's a mistake to chalk up the entirety of this to luck. Such aberrations generally stem from an irregular distribution of runs; an overachieving team may win most of the close games but get blown out a few times. The 2004 Yankees were exemplars of this, as they were 24-16 in one-run games and 26-8 in two-run games, but 6-7 in games decided by nine runs or more, including a 22-0 drubbing at the hands of the Indians that apparently earmarked them for the top of this list. As for this year's Angels, they're 25-16 in one-run games and a staggering 23-7 in two-run games. They're also 0-3 in games decided by nine runs or more, a trio of contests in which they were outscored by 30 runs.
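To make the run-distribution point concrete, here is a rough sketch that splits a team's record by margin of victory. The bucket boundaries follow the splits quoted above; the function names and the eight sample scores are made up for illustration and are not real game results.

```python
from collections import defaultdict


def margin_bucket(margin):
    """Map an absolute run margin onto the splits discussed in the text."""
    if margin == 1:
        return "one-run"
    if margin == 2:
        return "two-run"
    if margin >= 9:
        return "nine-plus"
    return "other"


def record_by_margin(scores):
    """Return {bucket: (wins, losses)} from (runs_for, runs_against) pairs."""
    records = defaultdict(lambda: [0, 0])
    for runs_for, runs_against in scores:
        if runs_for == runs_against:
            continue  # ignore the rare tie
        bucket = margin_bucket(abs(runs_for - runs_against))
        records[bucket][0 if runs_for > runs_against else 1] += 1
    return {bucket: tuple(rec) for bucket, rec in records.items()}


# Hypothetical scores: close wins plus two blowout losses.
sample_scores = [(3, 2), (5, 4), (1, 2), (6, 4), (4, 2), (0, 22), (7, 3), (2, 11)]
splits = record_by_margin(sample_scores)
run_diff = sum(rf - ra for rf, ra in sample_scores)
print(splits, run_diff)
```

In this toy sample the team goes 5-3 while being outscored by 22 runs, the same shape of result, in miniature, that pushes a club above its projection.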
Overplaying one's projected record can also stem from having relatively more success in higher-leverage situations, such as hitting well with runners in scoring position, or being especially stingy in surrendering runs late in the game. In fact, a strong bullpen is probably the most consistent means of such overachievement. Of the 15 teams above who played after 1953 (the boundary of our sortable stat database), 14 of them had bullpens that finished in the top three in the league in Reliever Expected Wins Added (WXRL), and the trend continues if we round out the list of post-1953 third-order overachievers to an even 20:
Year Team    D3  Rk
2004 NYA   12.7   2
1970 CIN   12.6   2
2007 ARI   12.2   2
1954 BRO   12.1   2
2005 CHA   12.1   2
2002 MIN   11.7   1
1954 CLE   11.4   1
1961 CIN   11.2   1
1972 NYN   11.0   2
1984 NYN   10.7   3
1977 BAL   10.2  14
2006 OAK   10.2   3
1997 SFN   10.0   3
2007 SEA   10.0   3
1960 NYA    9.6   2
1959 CHA    9.3   1
1978 CIN    9.3   1
1961 LAN    9.0   3
1969 NYN    9.0   1
2001 NYN    9.0   2
Now that's a trend. In this table, Rk is the team's rank in WXRL within its league. Six of these teams led the league, eight more were the runners-up, five finished third (only one of them from the eight-team league days)... and then you have the 1977 Orioles, who did their best to drive a few nails into Earl Weaver's coffin with a bullpen that was the worst in the league, finishing right at replacement level (0.0 WXRL). They still went 32-19 in one-run games and 20-9 in two-run games, overcoming that leaky bullpen with the league's best starting rotation, namely Jim Palmer, Mike Flanagan, Rudy May, Ross Grimsley, and sometimes Dennis Martinez. They threw 65 complete games, 12 more than the second-ranked team, and led the league in SNLVAR going away.
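The rank tally is easy to verify from the table itself. A small sketch, with the (year, team, D3, Rk) rows transcribed from the table above and variable names chosen only for illustration:

```python
from collections import Counter

# (year, team, D3, WXRL rank in league), transcribed from the table above.
overachievers = [
    (2004, "NYA", 12.7, 2), (1970, "CIN", 12.6, 2), (2007, "ARI", 12.2, 2),
    (1954, "BRO", 12.1, 2), (2005, "CHA", 12.1, 2), (2002, "MIN", 11.7, 1),
    (1954, "CLE", 11.4, 1), (1961, "CIN", 11.2, 1), (1972, "NYN", 11.0, 2),
    (1984, "NYN", 10.7, 3), (1977, "BAL", 10.2, 14), (2006, "OAK", 10.2, 3),
    (1997, "SFN", 10.0, 3), (2007, "SEA", 10.0, 3), (1960, "NYA", 9.6, 2),
    (1959, "CHA", 9.3, 1), (1978, "CIN", 9.3, 1), (1961, "LAN", 9.0, 3),
    (1969, "NYN", 9.0, 1), (2001, "NYN", 9.0, 2),
]

# Count how many teams finished at each bullpen rank.
rank_counts = Counter(rank for _, _, _, rank in overachievers)
top_three = sum(n for rank, n in rank_counts.items() if rank <= 3)
outliers = [(year, team) for year, team, _, rank in overachievers if rank > 3]

print(sorted(rank_counts.items()))
print(top_three, outliers)
```

That gives six first-place bullpens, eight seconds, and five thirds, with the 1977 Orioles as the lone team outside the top three.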
Those Orioles are the exception that proves the rule. Between the rotation and the bullpen, the latter is far more likely to influence a disparity between actual and projected record. The correlation between a team's D3 and their SNLVAR total is just .21, whereas between their D3 and WXRL total it's .42. Alas, I don't have any hitting-based win-expectancy data at my fingertips to provide a correlation from the other side of the aisle.
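For anyone who wants to run the same kind of comparison on their own data, here is a minimal sketch of a Pearson correlation. It assumes the figures above are ordinary Pearson coefficients, which the text doesn't specify, and the D3 and WXRL values below are invented placeholders rather than real team totals.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical team seasons: D3 alongside a bullpen WXRL total.
sample_d3 = [12.7, 9.0, -3.1, 4.2, -6.5]
sample_wxrl = [14.1, 9.8, 6.0, 11.2, 3.3]
print(round(pearson_r(sample_d3, sample_wxrl), 2))
```

Running the same function over D3 against SNLVAR and then against WXRL for every team-season would be one way to reproduce the .21 versus .42 comparison, given access to the underlying totals.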
This year's Angels, of course, have a typically excellent bullpen; they're second in the league in WXRL, though a good 3.5 wins behind the Rays (all stats and rankings through Sunday). K-Rod himself is second behind Joe Nathan, set-up man Scot Shields has rebounded from a rough 2007 season to rank 15th, rookie Jose Arredondo is 20th, and lefty Darren Oliver is 22nd. The Angels have a pretty good rotation too; while they rank sixth in the league in SNLVAR, they're closer to first place than to seventh. All of which is why they're coasting to their fourth division title in the last five years with a hefty double-digit lead in the AL West. As the old saying goes, you can't have too much pitching.
The all-time D3 list above suggests several lines for further inquiry in addition to the ones already mentioned (follow-up seasons, correlation to other team-based statistics) or implied (a bizarro list of the teams who finished with the largest negative third-order discrepancies). One of the more obvious subjects is the fact that seven of the top 20 overachievers hail from 2000 or later, raising the question of whether there's something about the current era that lends itself to a wider spread in over- or under-achievement, such as today's more specialized bullpen usage patterns. As with the 2008 Angels' ultimate fate, that's a topic worth revisiting.