August 13, 2003
Lies, Damned Lies
A Roll of the Dice
The Red Sox ended Tuesday night four games behind the Yankees in the AL East. What are the odds that they can make up that deficit to take the division? And, failing that, what are their chances to edge out the A's for the wild card?
Seriously. Grab a pencil and a piece of paper, come up with your best guesstimate, and write it down. Harder than you thought, huh? Keep reading, and we'll have an answer for you in a bit.
Whether they realize it or not, major league teams are making calculations like this all the time. Implicitly or explicitly, they can determine the direction that a team chooses to take: whether to move prospects for veterans at the trade deadline, whether to shut a young pitcher down for the season, or try (injury risk be damned) to get as much work out of him as they can. Wins are the currency that baseball transacts in, but for many purposes, they're only as good as the pennants and postseason appearances that they can be redeemed for. Much as some pundits like to talk about Mystique, Aura, and Veteran Leadership, the postseason is a lottery of sorts. Winning 11 playoff games is often a lot easier than winning 90 or 95 in the regular season, and many teams consider their season a success if their postseason ticket is punched, and they get to take their chance in the playoffs.
What's less certain is whether these teams are estimating these odds properly. It's human nature to evaluate your own strengths and weaknesses less rationally than you would anyone else's, and it's all but inevitable that, like a poker player gone on tilt, at least a couple of teams each year will imagine only the rosiest possible scenarios--making a bunch of bad bets in the process. While no estimate of postseason odds will be perfect, laying down some ground rules is potentially valuable.
It is, in fact, relatively easy to come up with a reasonable estimate of these odds if you know just three things:
OK, so I'm making things out to be a little bit more straightforward than they actually are; only one of those factors is self-evident. Nevertheless, we can make some pretty good educated guesses about the other two, and by doing so can help to crystallize the playoff picture, sometimes producing surprising results.
Intrinsic team quality is something that has been researched pretty thoroughly. The W-L record that a team has produced in the past may not be the best indicator of how they're going to do in the future. Most of you are no doubt familiar with Pythagorean records, which estimate team quality based on runs scored and runs allowed, and have consistently been proven to be a more accurate predictor of future W% than W-L records themselves. Clay Davenport's Adjusted Standings page takes the concept a step further by evaluating two additional factors: how well a team's runs scored and runs allowed correspond to the raw inputs (hits, walks, homers, etc.) that produce these figures, and how the quality of opponents that a team has faced has affected their results in the season to date. Like Pythagorean records, these adjustments are designed to take luck out of the question. The result is something that we call third-order winning percentage (W3%), which provides an even better estimate of how a team is going to fare going forward.
The other factor that we do need to consider is strength of schedule. While normally a concept more familiar to football fans, the unbalanced schedule makes quality of opponents a critical component of baseball playoff races too, especially as the season draws to a close and what symmetries there are in scheduling are lost.
Fortunately, if we have an estimate of each team's quality--the W3% numbers--then determining strength of schedule is straightforward. We can simply average the W3% of each of a team's remaining opponents, weighting based on the number of games played against them. We can also build in an adjustment for home field advantage, adding 40 points to an opponent's W3% for a road game, and subtracting 40 points from it for a home game. (Home teams typically win about 54% of the time in baseball, hence the scale of the adjustment.)
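As a rough sketch of that weighted average, here is how the calculation might look in Python. The function name, the input layout, and the team names and W3% figures in the example are all made up for illustration; only the 40-point home field adjustment comes from the text.

```python
HOME_FIELD_POINTS = 0.040  # 40 points of winning percentage, as described above


def schedule_strength(remaining_games, w3):
    """Average opponent W3%, weighted by games and adjusted for home field.

    remaining_games: list of (opponent, games, is_road) tuples (hypothetical layout)
    w3: dict mapping team name -> third-order winning percentage
    """
    total_games = 0
    weighted_sum = 0.0
    for opponent, games, is_road in remaining_games:
        # An opponent is effectively tougher when we visit them,
        # and effectively weaker when they visit us.
        adjustment = HOME_FIELD_POINTS if is_road else -HOME_FIELD_POINTS
        weighted_sum += (w3[opponent] + adjustment) * games
        total_games += games
    return weighted_sum / total_games


# Illustrative inputs only: two fictional opponents, three games each.
w3 = {"Team A": 0.550, "Team B": 0.450}
remaining = [("Team A", 3, True), ("Team B", 3, False)]
strength = schedule_strength(remaining, w3)
print(f"Opponents W3%: {strength:.3f}")
```

With an even split of home and road games, the home field adjustments tend to cancel out; they matter most late in the season, when a team's remaining schedule can be lopsided.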
As I've already alluded to, schedule strength can make a large amount of difference. Here, for example, were the teams with the strongest and the weakest schedules entering Tuesday night's games:
Easiest remaining schedules based on opponent W3%
Minnesota       .456
Chicago Cubs    .462
Cleveland       .470
Houston         .473
Philadelphia    .479
Toughest remaining schedules based on opponent W3%
Baltimore       .575
Tampa Bay       .554
Boston          .528
Oakland         .520
N.Y. Mets       .520
Six of the ten teams mentioned in those tables are in the midst of playoff races. Looking at the most extreme examples, Minnesota will face the equivalent of a series of 74-88 teams for the balance of the season, while Baltimore will play opponents with an average record on the order of 93-69.
Given a reasonable estimate of these two factors--a team's quality, and the quality of its opponents--we're just one small step away from guessing at their winning percentage for the balance of the season. For example, let's look at the Cubs' situation entering Tuesday night's game against the Astros:
Actual winning percentage: .513 (60-57)
Based on Clay's numbers, the Cubs have been a somewhat stronger team than their winning percentage alone suggests.
Opponents W3%: .462
We estimate that an average team would go 21-23 in playing out the rest of the Cubs' schedule. Chicago stands to gain a game or two in the standings by virtue of playing against a relatively weak set of opponents.
Rest of Year W% (ROY%): .585
Comparing the Cubs' W3% against that of their opponents leads to an estimate of how they'll perform for the rest of the year (the tool used to create this number is Bill James' log5 formula, nicely described here in an article by Tom Tippett). We figure the Cubs should finish with around 86 or 87 wins.
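For readers who want to see the log5 step spelled out, here is a minimal Python sketch. The .462 opponent figure, the 60-57 record, and the .585 result are from the numbers above. The .548 team quality is an assumption, back-solved here so the example lands on .585; it is not a figure quoted in this article.

```python
def log5(team_pct, opponent_pct):
    """Bill James' log5 estimate of how often a team of quality
    `team_pct` beats an opponent of quality `opponent_pct`."""
    team_edge = team_pct * (1 - opponent_pct)
    opponent_edge = opponent_pct * (1 - team_pct)
    return team_edge / (team_edge + opponent_edge)


SEASON_GAMES = 162

cubs_quality = 0.548      # assumed for illustration, see the note above
opponents_w3 = 0.462      # from the text
wins, losses = 60, 57     # from the text

roy_pct = log5(cubs_quality, opponents_w3)
games_left = SEASON_GAMES - (wins + losses)
projected_wins = wins + roy_pct * games_left

print(f"ROY%: {roy_pct:.3f}")
print(f"Projected wins: {projected_wins:.1f}")
```

Two evenly matched teams come out at exactly .500 under this formula, which is a handy sanity check.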
Of course, we're neglecting the very concept that I mentioned at the start of the article. Our best guess might be that the Cubs will win 87 games, but what are the chances that they'll leg out a couple of tight games and finish the season with 90 instead? And is that likely to be enough to get them into the postseason?
Our estimate for the Cubs' final record is better represented by something like this:
That chart is generated by what's known as a binomial distribution. You could run a simulation to do the same thing, as we did in last week's Braves PTP, but the binomial distribution estimates those probabilities precisely.
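A minimal Python sketch of that binomial calculation follows. It treats every remaining game as an independent coin flip weighted by the rest-of-year winning percentage, which is a simplification. The function name is hypothetical; the 60 wins and .585 come from the Cubs example, and the 45 games remaining follow from a 162-game schedule.

```python
from math import comb


def win_distribution(current_wins, games_left, win_prob):
    """Map each possible final win total to its binomial probability."""
    return {
        current_wins + k: comb(games_left, k)
        * win_prob ** k
        * (1 - win_prob) ** (games_left - k)
        for k in range(games_left + 1)
    }


cubs = win_distribution(current_wins=60, games_left=45, win_prob=0.585)

most_likely = max(cubs, key=cubs.get)
p_90_or_more = sum(p for total, p in cubs.items() if total >= 90)

print(f"Most likely final win total: {most_likely}")
print(f"Chance of 90 or more wins: {p_90_or_more:.1%}")
```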
Of course, contrary to any Dusty Baker admonitions about scoreboard watching, the Cubs also have the Cardinals and the Astros to worry about. Here are the win distributions for the three teams:
Notice how much overlap there is, which indicates that this is a wide-open race. We can translate a chart like that into postseason odds by following a relatively simple, three-step process.
Applying this procedure, we can come up with an overall estimate of the Cubs' probability of winning the N.L. Central--in this case, it's about 27%.
The process for estimating wild card odds is slightly more complicated, but works along similar lines. With so many teams in the hunt, we figure that the Cubs have less than a 3% chance to take the wild card if they don't win their division. Their composite odds to reach the postseason are around 30%. Not bad for the Loveable Losers, but don't start printing those playoff tickets just yet.
(Mathematically inclined readers will notice a flaw in this method: it assumes that the win probabilities for each team are independent of one another. Of course, that isn't true--a win for any one team means a loss for another--but the difference in the results isn't substantial in most cases, and this approach is considerably easier to calculate and explain).
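One way to turn a set of win distributions into division odds, under the same independence assumption, is sketched below in Python. This is an illustration rather than the exact procedure used for the numbers in this article: the even split of ties is an assumption, and the three teams' records and winning percentages are invented.

```python
from math import comb


def win_distribution(current_wins, games_left, win_prob):
    """Map each possible final win total to its binomial probability."""
    return {
        current_wins + k: comb(games_left, k)
        * win_prob ** k
        * (1 - win_prob) ** (games_left - k)
        for k in range(games_left + 1)
    }


def division_odds(distributions):
    """Chance each team finishes first, treating teams as independent.

    Ties for first are split evenly among the tied teams (an assumption).
    """
    odds = {}
    for team, dist in distributions.items():
        rivals = [d for name, d in distributions.items() if name != team]
        total = 0.0
        for wins, p in dist.items():
            # tied[j] = chance that exactly j rivals match `wins`
            # while every other rival finishes below it.
            tied = [1.0]
            for rival in rivals:
                below = sum(q for w, q in rival.items() if w < wins)
                equal = rival.get(wins, 0.0)
                nxt = [0.0] * (len(tied) + 1)
                for j, prob in enumerate(tied):
                    nxt[j] += prob * below
                    nxt[j + 1] += prob * equal
                tied = nxt
            total += p * sum(prob / (j + 1) for j, prob in enumerate(tied))
        odds[team] = total
    return odds


# Invented inputs: (current wins, games left, rest-of-year win probability).
race = {
    "Team A": win_distribution(60, 45, 0.585),
    "Team B": win_distribution(62, 44, 0.550),
    "Team C": win_distribution(61, 44, 0.540),
}
odds = division_odds(race)
for team, p in odds.items():
    print(f"{team}: {p:.1%}")
```

Because ties are divided evenly, the odds across all teams in the division add up to 100%.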
Let's turn back the clock by two weeks or so, and look at the postseason odds as of the trade deadline on July 31:
AL East             W'     Div      WC   Total
----------------------------------------------
Baltimore         72.6    0.0%    0.0%    0.0%
Boston            95.0    8.0%   62.3%   70.2%
N.Y. Yankees     102.2   92.0%    5.6%   97.7%
Tampa Bay         60.3    0.0%    0.0%    0.0%
Toronto           82.8    0.0%    0.8%    0.8%

AL Central          W'     Div      WC   Total
----------------------------------------------
Chi. White Sox    85.2   56.0%    0.2%   56.2%
Cleveland         71.3    0.0%    0.0%    0.0%
Detroit           47.3    0.0%    0.0%    0.0%
Kansas City       82.4   19.0%    0.2%   19.2%
Minnesota         83.2   25.0%    0.2%   25.3%

AL West             W'     Div      WC   Total
----------------------------------------------
Anaheim           81.5    0.0%    0.4%    0.4%
Oakland           92.4   12.8%   25.3%   38.1%
Seattle           98.2   87.2%    5.0%   92.1%
Texas             69.0    0.0%    0.0%    0.0%

NL East             W'     Div      WC   Total
----------------------------------------------
Atlanta          103.3   98.5%    1.3%   99.7%
Florida           89.1    0.3%   26.3%   26.5%
Montreal          79.5    0.0%    0.1%    0.1%
N.Y. Mets         63.1    0.0%    0.0%    0.0%
Philadelphia      91.8    1.3%   61.2%   62.5%

NL Central          W'     Div      WC   Total
----------------------------------------------
Chicago Cubs      85.5   25.4%    1.3%   26.7%
Cincinnati        70.8    0.0%    0.0%    0.0%
Houston           87.2   45.6%    1.4%   47.0%
Milwaukee         65.7    0.0%    0.0%    0.0%
Pittsburgh        74.4    0.1%    0.0%    0.1%
St. Louis         85.9   28.9%    1.4%   30.3%

NL West             W'     Div      WC   Total
----------------------------------------------
Arizona           85.7    0.1%    6.4%    6.6%
Colorado          79.3    0.0%    0.1%    0.1%
Los Angeles       81.0    0.0%    0.5%    0.5%
San Diego         64.2    0.0%    0.0%    0.0%
San Francisco    101.2   99.9%    0.1%   99.9%
There's a lot of good stuff here. The Braves and Giants--no surprise--were already virtual cinches to reach the postseason. The White Sox, in spite of trailing the Royals at the time, were the odds-on favorites to win the AL Central (they still are). The Phillies had a big edge in the wild card race, while the Diamondbacks were longshots (they've gained ground in the past couple of weeks).
How much would these odds be affected if a team added a star player to the mix? Well, that depends a lot on the team. What I did was to add a fixed percentage to each team's W3%, representing a gain of two additional wins for the balance of the season. Since the trade deadline falls almost exactly two-thirds of the way through the season, that's equivalent to acquiring a player who's worth six wins over a full season--think someone like Jose Vidro or Kevin Millwood, an All-Star caliber player. Making an acquisition of this magnitude, for example, would improve the Marlins' projected postseason odds from 26% to 42%. That's a pretty good incentive to consider a deadline move.
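The arithmetic behind that two-win adjustment can be sketched in a few lines of Python. The function names are hypothetical, the 54 games remaining is an approximation of one-third of a 162-game season, and the bump is applied here directly to a rest-of-season winning percentage rather than to W3%, which is a simplification of the approach described above.

```python
SEASON_GAMES = 162


def full_season_equivalent(extra_wins, games_left):
    """Scale wins added over the remaining games up to a full season."""
    return extra_wins * SEASON_GAMES / games_left


def boosted_win_pct(win_pct, extra_wins, games_left):
    """Raise a rest-of-season winning percentage by `extra_wins`
    spread evenly across the remaining games, capped at 1.0."""
    return min(1.0, win_pct + extra_wins / games_left)


games_left = 54  # roughly the final third of the season (approximation)

player_value = full_season_equivalent(2, games_left)
boosted = boosted_win_pct(0.500, 2, games_left)

print(f"Full-season value of the acquisition: {player_value:.1f} wins")
print(f"A .500 team's rest-of-season W% after the move: {boosted:.3f}")
```

The boosted percentage can then be fed back through the same win-distribution and odds calculations to get the "after" figure for each team.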
Here are the same numbers for all the teams that had a non-laughable chance to reach the playoffs as of the deadline:
Impact of Adding Two Wins at the Trade Deadline:
                      Postseason %
Team              Before   After   Change
-----------------------------------------
Houston            47.8%   66.5%   +18.7%
St. Louis          29.7%   48.1%   +18.3%
Chicago Cubs       26.2%   44.0%   +17.8%
Chi. White Sox     56.2%   74.0%   +17.8%
Florida            26.5%   42.8%   +16.4%
Oakland            38.1%   53.9%   +15.7%
Minnesota          25.3%   33.9%   +14.7%
Kansas City        19.2%   40.8%   +15.6%
Philadelphia       62.3%   76.9%   +14.6%
Boston             70.2%   79.4%    +9.1%
Arizona             7.2%   15.2%    +8.0%
Seattle            92.1%   97.2%    +5.1%
N.Y. Yankees       97.7%   99.4%    +1.7%
Los Angeles         0.5%    1.9%    +1.3%
Toronto             0.8%    2.0%    +1.2%
Anaheim             0.4%    1.2%    +0.7%
Montreal            0.2%    0.7%    +0.5%
Colorado            0.1%    0.5%    +0.4%
San Francisco      99.7%   99.9%    +0.2%
Atlanta            99.7%   99.9%    +0.2%

Interesting that the Astros and the Cardinals, the two teams most often criticized for failing to make a deadline move, stood to benefit more than anyone else by doing so, increasing their playoff potential by nearly 20%. On the other side of the spectrum, we have the Yankees, who look silly for trading a great pitching prospect for Aaron Boone when they were already all but guaranteed to reach the playoffs.
To provide some context for these figures, I reran a version of Michael Wolverton's study on pennants added, which originally appeared in Baseball Prospectus 2002. The only adaptation I made to Michael's study is that I looked not at pennants added, but rather, "postseason appearances added," using data from the past eight seasons involving the current playoff setup.
I found that a six-win player, if added to a team at the start of the season, will help them advance to the playoffs 12-13% of the time when they otherwise would not have. Think about that for a second: for the first nine teams on our list, adding a six-win player at the trade deadline has a larger marginal impact on playoff odds than we would have expected it to at the start of the season.
That seems counterintuitive, and we've often criticized teams for paying hefty ransoms for veteran help at the trade deadline, when they'll only get to use their acquisition for a third of the season. That argument, however, neglects the asymmetry of information that's in play here. A team knows a heck of a lot more about its playoff opportunities on July 31 than it does on March 31. If it makes an expensive move at the trade deadline, it's not being irrational, so much as paying a premium for that information.
We're going to begin publishing reports like these for each team, updated fresh for you every morning. The first installment is available here, where you can get an answer to the Red Sox question we posed. It's a lot of fun to see the numbers shift around from day to day. The results are already swinging as much as 10-15% based on a single night of competition, and that number will only get higher as the season gets closer to its endpoint.