“My @#$% doesn’t work in the playoffs. My job is to get us to the playoffs. What happens after that is #@%$% luck.”–A’s General Manager Billy Beane, as quoted in Moneyball eloquently explaining the essence of probability in the postseason
Don’t you just love playoff baseball? I have to admit that the many hours spent watching the division series and now the league championship series have cut into the time I normally spend reading, researching, and writing about our shared passion. In the end, though, what true fan wouldn’t gladly trade those derivative experiences, as interesting and diverting as they may be, for enjoying the real thing, especially when every pitch counts? With that in mind, this week we’ll take a look at (literally) a few odds related to the League Championship Series, and some ends from a previous column on the case of the missing triples.
Simulating the Crapshoot
If there’s one thing that analysts know intuitively but don’t always do a good job of explaining, it’s that in a short series almost anything can happen. Witness the Tigers’ three-games-to-one upset of the heavily favored Yankees in the ALDS and my own “prediction” that the Padres would take care of business in four games against the seemingly slumping Cardinals.
One need only look at the following track record since divisional series play started in 1995 to get the picture:
- 25 of the 44 teams or 56.8% with equivalent or better Pythagorean records (the expected winning percentage of the team based on the total of their runs scored and runs allowed calculated using the exponent 1.83) won the Division Series
- 13 of the 22 teams or 59.1% with better Pythagorean records won the League Championship Series
- 5 of 11 teams or 45% with better Pythagorean records won the World Series
So teams with better Pythagorean records are more successful overall, winning 43 of 77 postseason series (55.8%). It’s not exactly a total crapshoot, as in the immortal words of Billy Beane. But there is still plenty of room for variability to influence the outcome. The fact that playoff teams are in most cases very closely matched makes prognostication, if not a fool’s errand, a particularly dicey endeavor. Beane himself reiterated his mantra in a slightly different form recently when he said before this year’s division series matchup with the Twins:
“It becomes five hands of blackjack. There’s no sense counting cards. You’re at the mercy (of the cards). A great blackjack player wants to play a lot of hands. If you only play five, you won’t find out who is the best player.”
That said, we still love to predict, and so here I take a crack at the 2006 League Championship Series using a simple simulation program.
The program simulates two teams playing a seven-game series in the 2-3-2 format and uses the log5 formula Bill James introduced in the 1981 Baseball Abstract to calculate the probability of the home team winning each game. For those who don’t remember, that formula is:
P = (A – (A * B)) / (A + B – (2 * A * B))
Where A is the winning percentage of one team, and B is that of another, and the result is the probability of Team A winning the game. So if Team A played .550 ball at home and Team B played .450 on the road, the probability of Team A winning a game at their home park would be:
.599 = (.550 – (.550 * .450)) / (.550 + .450 – (2 *.550 *.450))
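For readers who would rather check the arithmetic in code, here is a minimal sketch of the log5 calculation in Python. The function name `log5` is my own label for illustration; the formula itself is the one described above.

```python
def log5(a: float, b: float) -> float:
    """Probability that a team with winning percentage `a` beats a team
    with winning percentage `b`, using Bill James's log5 formula."""
    return (a - a * b) / (a + b - 2 * a * b)


# The worked example from the text: Team A plays .550 ball at home,
# Team B plays .450 ball on the road.
home_win_probability = log5(0.550, 0.450)
print(round(home_win_probability, 3))  # 0.599
```

Note that two evenly matched teams come out at exactly .500, which is a quick sanity check on any implementation.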
To make the simulator a bit more interesting I calculated the expected winning percentage of each team both at home and on the road using their Pythagorean winning percentage (using an exponent of 1.83). In other words, the expected winning percentages for the ALCS and NLCS teams at home and on the road were calculated as follows:
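| Team | Home RS | Home RA | Home Pyth W% | Away RS | Away RA | Away Pyth W% |
|---|---|---|---|---|---|---|
| Tigers | 392 | 349 | .553 | 430 | 326 | .624 |
| A's | 372 | 346 | .533 | 399 | 381 | .521 |
| Mets | 395 | 347 | .559 | 439 | 384 | .561 |
| Cardinals | 399 | 348 | .562 | 382 | 414 | .463 |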
In the AL, the Tigers clearly appear to be the better team here, in light of their excellent run differential on the road, a fact that in the simulator erases the home field advantage the A’s have in (potentially) playing four games in Oakland. Likewise, in the NL the Mets seem to have a decided advantage due to their performance in road games and the Cardinals’ less-than-stellar output in games away from the new Busch Stadium.
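The home and road percentages above can be reproduced with a few lines of Python. This is only a sketch of the standard Pythagorean calculation with the 1.83 exponent mentioned earlier; the function name `pythagorean_pct` is my own.

```python
def pythagorean_pct(runs_scored: float, runs_allowed: float,
                    exponent: float = 1.83) -> float:
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)


# Tigers' run totals at home and on the road, taken from the table above,
# which lists .553 and .624 respectively.
print(round(pythagorean_pct(392, 349), 3))
print(round(pythagorean_pct(430, 326), 3))
```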
In any case, the simulator plays each series 100,000 times and the following were the results.
Here, you can see the Tigers win 66% of the time and the A’s just 34%, and in only 18% of those outcomes does the series get to seven games. This is primarily because of the Tigers’ excellent projected winning percentage on the road. But what’s interesting here, and one of the reasons why prediction in a short series is so difficult in baseball, is that even given the expected winning percentages, a clearly outmanned A’s team still wins in six games or fewer almost a quarter (23%) of the time.
Over in the NLCS, the Mets win fully 62.5% of the time, the Cardinals 37.5%, although 31% of the simulated series went the full seven games.
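For the curious, a simulator along these lines can be sketched in a few dozen lines of Python. This is not the program used for the numbers above, just an approximation under stated assumptions: each game is independent, the home team's home Pythagorean percentage and the visitor's road percentage are fed into log5, and the team with home-field advantage hosts games 1, 2, 6, and 7. All function and variable names are mine, and because the details may differ from the original program, the output should not be expected to match the figures quoted here exactly.

```python
import random


def log5(a: float, b: float) -> float:
    """Bill James's log5 probability that a team at `a` beats a team at `b`."""
    return (a - a * b) / (a + b - 2 * a * b)


# 2-3-2 format: True where the team with home-field advantage is at home.
HOME_SCHEDULE = (True, True, False, False, False, True, True)


def simulate_series(host_home, host_road, visitor_home, visitor_road,
                    trials=100_000, seed=0):
    """Simulate a best-of-seven series `trials` times.

    "Host" is the team with home-field advantage. Returns a tuple of
    (fraction of series won by the host, fraction reaching a seventh game).
    """
    rng = random.Random(seed)
    p_host_at_home = log5(host_home, visitor_road)
    p_host_on_road = 1.0 - log5(visitor_home, host_road)

    host_series_wins = 0
    seven_game_series = 0
    for _ in range(trials):
        host_wins = visitor_wins = 0
        for at_host_park in HOME_SCHEDULE:
            p = p_host_at_home if at_host_park else p_host_on_road
            if rng.random() < p:
                host_wins += 1
            else:
                visitor_wins += 1
            if host_wins == 4 or visitor_wins == 4:
                break
        if host_wins + visitor_wins == 7:
            seven_game_series += 1
        if host_wins == 4:
            host_series_wins += 1
    return host_series_wins / trials, seven_game_series / trials


# ALCS: the A's hold home-field advantage; percentages are from the table above.
athletics_share, alcs_game_sevens = simulate_series(0.533, 0.521, 0.553, 0.624)
print(f"A's win {athletics_share:.1%}, game seven reached {alcs_game_sevens:.1%}")

# NLCS: the Mets hold home-field advantage.
mets_share, nlcs_game_sevens = simulate_series(0.559, 0.561, 0.562, 0.463)
print(f"Mets win {mets_share:.1%}, game seven reached {nlcs_game_sevens:.1%}")
```

Fixing the seed makes a run repeatable; changing it gives a feel for how much the percentages wobble even at 100,000 trials.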
Obviously this simple model leaves out a tremendous amount of information–the most important being the pitching matchups and particularly the strength of each team’s top three starters, but also the results of the previous series, the health of individual players (Scott Rolen, for instance), the weighting of late-season as opposed to early-season games as the playoff roster takes shape, and even more subtle factors like the manager’s tendencies and willingness to manage differently in the postseason than in the regular season. For example, will Jim Leyland simply hand over the reins to Joel Zumaya and his 103 mph fastball?
But if we had all of that, what would be the fun in watching?
Hitting the Trifecta
Several weeks ago in this space, we took a look at the decline in triples in the major leagues, and laid out the primary theories used to explain their increasing rarity. Those theories ranged from better fields, to changing park configurations, to risk aversion, and also the aging of the player population.
Several readers, however, wrote in to augment that list and point out additional factors that might be considered. For example, one reader postulated that the decline in triples is related to the increase in home runs. In other words, fly balls that would once have landed in the alleys or down the lines for triples are now making their way over the fence. On this view, triples have decreased simply because there are fewer potential triples hit.
To look at this idea, I graphed the number of doubles, triples, and home runs per 500 AB+BB, as shown in this graph:
What’s interesting here is that while home runs have indeed risen fairly steadily over the long haul, there have been long stretches where home run output has remained relatively constant while triples steadily decline. For example, from 1953 through 1993 the home run rate essentially remained constant with some fluctuation in the late ’50s and again in the late ’60s, while triples continued to decline (albeit with the spike in the late 1970s that I discussed in the previous column). In addition, the great spike in power beginning in 1921 does not seem to correlate with a corresponding decrease in triples; triples continued to be hit at the same rate until the early 1930s. As a result, it would seem that the overall increase in home runs is probably not to blame.
As you can also see, the rate at which doubles were hit has varied more over time, and doesn’t appear to correlate with the triples trend either.
Several readers, including David Salsburg and Ben Lauderdale, also wondered whether the theory about changing ballpark configurations could be tested by taking a look at triples hit in ballparks whose configurations haven’t changed much over time. An excellent idea; to take a look at this I have created graphs of the number of triples per 500 AB+BB at Fenway Park, Wrigley Field, and Yankee Stadium (with the exception of 1974 and 1975 when it was being renovated), going back to 1970, that include the best fit linear trend line:
As you can see, in all three cases the rate at which triples are hit has declined, although less in Fenway Park than in the other two, and just slightly less at Wrigley Field than in Yankee Stadium. Interestingly, in Yankee Stadium that decline really began in earnest around 1985. Although a more radical configuration change was made before the 1976 season, in 1985 the distance to the left side of the bullpen gate in left-center went from 387 to 376, left center went from 430 to 411, and the left side of the centerfield screen went from 417 to 410, all of which may have conspired to reduce the number of triples hit. Lest you think an increasing walk rate may be responsible for the downward trend, the rate of triples per AB shows the same downward trend, with the trend lines having almost identical negative slopes.
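For anyone who wants to run the same kind of check on a park of their choosing, the two steps involved (normalizing triples per 500 AB+BB and fitting a least-squares trend line) can be sketched as below. The season totals in the example are made-up placeholders to show the mechanics, not actual figures for any ballpark, and the function names are my own.

```python
def rate_per_500(triples: int, at_bats: int, walks: int) -> float:
    """Triples per 500 AB+BB."""
    return 500 * triples / (at_bats + walks)


def linear_trend(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# HYPOTHETICAL placeholder data: (season, triples, AB, BB). Not real park totals.
placeholder_seasons = [
    (1970, 30, 5000, 500),
    (1980, 27, 5000, 500),
    (1990, 25, 5000, 500),
    (2000, 21, 5000, 500),
]

years = [season for season, _, _, _ in placeholder_seasons]
rates = [rate_per_500(t, ab, bb) for _, t, ab, bb in placeholder_seasons]
slope, intercept = linear_trend(years, rates)
print(f"Change in triples per 500 AB+BB per season: {slope:.4f}")
```

A negative slope indicates a declining triples rate; swapping AB+BB for AB alone in the denominator is how one would repeat the walk-rate check mentioned above.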
Overall, however, these data points (cursory as they may be) do not provide support for the idea that the decrease in triples can be chalked up entirely to ballpark configuration, or a simple increase in home runs driving down the balls in play that might turn into triples. That leaves the idea that an increasing level of play by fielders, both in physical and strategic terms, may largely be the culprit in the case of the missing triples.
That said, enjoy the rest of the postseason. Let’s hope we see a few triples.