September 2, 2004
Return of the Playoff Odds Report
Even the Rockies Can Dream
Last year about this time, we brought you a report that estimated the end-of-season standings, along with the likelihood of each team making the playoffs.
Well, that report is back this year, this time with a significant difference. Last year's edition did not actually track each game; it relied on the team's winning percentage and that of its opponents to estimate the likelihood of winning 88 games, 89 games, 90 games, and so on. However, it did this for each team independently, so when it considered the likelihood of making the playoffs, it would essentially permit both teams to win the same games against each other.
This year's stat report relies on a completely different model to generate playoff possibilities. We set up a Monte Carlo simulation to run the rest of the season one million times.
What is a Monte Carlo simulation, you ask? If the name suggests a roulette wheel to you, you are on the right track. It is an analytical technique that solves a problem by performing a large number of trials, and inferring the solution from the collective results of the trials. Each trial relies on a random number (that's the Monte Carlo part); the value of that number controls the outcome of the trial, based on a pre-determined rule.
In our case, the random numbers control who wins or loses each game. We start the process by looking at the Adjusted Standings Report, updated daily on our stats page. From this report, we take away the current wins and losses, which will be our starting point for every iteration of the model. We also use the W3 and L3 scores for each team to set their expected winning percentage (EWP) for the remainder of the season. W3 is derived from the team's Pythagenport wins and losses, adjusted for, essentially, their strength of schedule, which we think is a better estimator of their future performance than their actual current record.
Given our starting points, we start stepping through the schedule. Suppose the Yankees are set to play the Red Sox in Fenway. Through August 31, the Yankees had an EWP of .577; for the Red Sox, it was .636. Since it is in Fenway, we are going to add in a typical home field advantage of .020 points, with a corresponding drop for the visitors, so that adjusts the Yankees down to .557 and the Sox up to .656. The likelihood of the Yankees winning is determined by Bill James' log5 method:
$$\mathrm{WinPct}(A) = \frac{A - AB}{A + B - 2AB},$$
where A is the first team's EWP and B is the second team's EWP. For the Yankees,
WinPct = (.557 - .557*.656)/(.557+.656-2*.557*.656) = .397
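For readers who want to check the arithmetic, here is a minimal Python sketch of the log5 calculation with the home field adjustment applied first. The function and constant names (`log5`, `visitor_win_prob`, `HOME_FIELD_EDGE`) are ours for illustration and are not taken from the actual report code.

```python
HOME_FIELD_EDGE = 0.020  # the home field adjustment quoted above


def log5(a: float, b: float) -> float:
    """Chance that a team with EWP `a` beats a team with EWP `b` (log5)."""
    return (a - a * b) / (a + b - 2 * a * b)


def visitor_win_prob(visitor_ewp: float, home_ewp: float,
                     edge: float = HOME_FIELD_EDGE) -> float:
    """Shift the visitor down and the home team up by `edge`, then apply log5."""
    return log5(visitor_ewp - edge, home_ewp + edge)


# Yankees (.577) visiting the Red Sox (.636), as in the example above.
print(round(visitor_win_prob(0.577, 0.636), 3))  # 0.397
```

One property worth noting: the two teams' log5 probabilities always sum to one, so every simulated game has exactly one winner.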
At this point, our random number generator spits out a number between 0 and 1. If it is less than .397, theeeeeeee Yankees win; if it isn't, they lose. Assuming we draw a .126, then we credit the Yankees with a win and the Red Sox with a loss, and move on to the next game on the schedule.
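The rule for turning a random draw into a result fits in a few lines. This is a sketch only: it assumes Python's standard `random` module as the generator (the report does not say which generator it uses), and the names `visitor_wins` and `play_game` are hypothetical.

```python
import random


def visitor_wins(p_visitor: float, draw: float) -> bool:
    """The visitor wins when the draw falls below its win probability."""
    return draw < p_visitor


def play_game(p_visitor: float, rng: random.Random) -> bool:
    """Draw one number in [0, 1) and decide the game with it."""
    return visitor_wins(p_visitor, rng.random())


# The draw from the example above: .126 is below .397, so the Yankees win.
print(visitor_wins(0.397, 0.126))  # True

# Over many games, the share of visitor wins should settle near .397.
rng = random.Random(2004)  # arbitrary seed, chosen only for repeatability
games = 100_000
wins = sum(play_game(0.397, rng) for _ in range(games))
print(wins / games)
```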
When you do that for every game on the schedule, then you've played out the season in a Monte Carlo simulation. At that point, it is simply a matter of sorting the teams by wins to figure out who won each division and who won the wild card. While that is a pretty trivial thing to sort out most of the time, things get a little hairy when you start sorting through two-, three-, or four-way ties. Actually, I'm still trying to get some bugs out of the four-way tie procedure, but I didn't think it was worth holding up the report for what is currently a 0.2 percent chance.
Now do it a million times.
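To make the whole loop concrete, here is a toy version in Python. Everything in it is an assumption for illustration: the three teams, their records, their EWPs, and the schedule are invented, there is a single division and no wild card, and ties for first are broken by a random pick rather than by the real tie procedures described above.

```python
import random
from collections import Counter


def log5(a, b):
    """Chance that a team with EWP `a` beats a team with EWP `b`."""
    return (a - a * b) / (a + b - 2 * a * b)


def simulate_season(wins, ewp, schedule, rng, edge=0.020):
    """Play every remaining (visitor, home) game once; return final wins."""
    final = dict(wins)
    for visitor, home in schedule:
        p_visitor = log5(ewp[visitor] - edge, ewp[home] + edge)
        winner = visitor if rng.random() < p_visitor else home
        final[winner] += 1
    return final


def division_odds(wins, ewp, schedule, trials, seed=0):
    """Share of simulated seasons in which each team finishes first."""
    rng = random.Random(seed)
    titles = Counter()
    for _ in range(trials):
        final = simulate_season(wins, ewp, schedule, rng)
        best = max(final.values())
        tied = [team for team in final if final[team] == best]
        titles[rng.choice(tied)] += 1  # simplification: random tiebreak
    return {team: titles[team] / trials for team in wins}


# Invented three-team division; none of these numbers come from the report.
wins = {"Ants": 80, "Bees": 78, "Crows": 70}
ewp = {"Ants": 0.550, "Bees": 0.560, "Crows": 0.480}
schedule = [("Ants", "Bees"), ("Bees", "Ants"),
            ("Crows", "Ants"), ("Bees", "Crows")] * 5

odds = division_odds(wins, ewp, schedule, trials=20_000)
print(odds)
```

Because every trial plays each game exactly once, the head-to-head results stay consistent across teams, which is the point of stepping through the schedule rather than treating each team independently.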
With that many tries, you get to see some pretty unlikely results actually happen, like the 33 times out of a million that the Mets make up 15 games on the Braves with 33 games left to win the division, or the 47 times the Cardinals don't win the division. Then there are the two times the Rockies--the Rockies!--won the wild-card slot. For eight teams, even a one-in-a-million shot at the playoffs has passed beyond their reach.
The playoff odds will be updated daily.