Flu-Like Symptoms: The World Series of Coin Flipping

November 29, 2016

Let’s have our own World Series, you and I. A World Series of coin flipping. First to four wins. Your call, heads or tails. Ready? Let’s go.

There are four ways you can win. First, you can win four flips in a row. Since the odds of winning any one flip are 50/50, your chances of a sweep are ½ x ½ x ½ x ½, or ½⁴, which equals .0625. You have a 6.25 percent chance of winning four straight flips.

There are a few ways you can beat me 4-1. I can win the first flip, you the next four. Or you can win one, I win one, then you win three. Or you win two, I win one, and you win another two. Or you take three, I win my one flip, and then you finish me off. The probability of each of those sequences, since there are five coin flips total, is ½⁵. But there are four such sequences, so 4 x ½⁵ = .125. You have a 12.5 percent chance of winning our World Series 4-1.

I’ll spare you the narrative, but you have a 15.625 percent chance of beating me 4-2 (10 possible sequences x ½⁶). And our series will go to seven games, with you victorious, 15.625 percent of the time as well (20 possible sequences x ½⁷).

We can double those amounts to figure the chances that our World Series will last a given number of flips. If you have a 6.25 percent chance of winning in four straight, so do I, so there is a 12.5 percent chance of a sweep. Similarly, there’s a 25 percent chance that one of us wins in five. The series goes to six flips 31.25 percent of the time, and to seven flips 31.25 percent of the time as well.
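The arithmetic above can be checked by brute force. The following is a minimal Python sketch, not part of the original analysis: it lists all 128 equally likely sequences of seven fair flips and records the flip on which either side reaches four wins. The names `series_length` and `length_probs` are illustrative.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

WINS_NEEDED = 4  # first to four, as in the article


def series_length(flips):
    """Return the flip number on which either side reaches four wins."""
    heads = tails = 0
    for n, flip in enumerate(flips, start=1):
        if flip == "H":
            heads += 1
        else:
            tails += 1
        if heads == WINS_NEEDED or tails == WINS_NEEDED:
            return n
    raise ValueError("seven flips always decide a first-to-four series")


# Every sequence of seven fair flips has probability 1/128; flips that
# come after the series is decided are simply ignored.
counts = Counter(series_length(seq) for seq in product("HT", repeat=7))
length_probs = {n: Fraction(c, 2 ** 7) for n, c in sorted(counts.items())}

for n, prob in length_probs.items():
    print(f"{n} flips: {float(prob):.4%}")
```

Enumerating every sequence is slower than the counting argument, but it makes no use of the shortcut, so it serves as an independent check.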

Why did I bother telling you all of that? Because it’s relevant as we look back on seven-game postseason series. Let’s look at only the divisional era, beginning in 1969. Since then, there have been 47 best-of-seven World Series (every year, 1969-2016, excluding 1994) and 62 best-of-seven Championship Series (every year beginning in 1985—the LCS prior to then was best-of-five—excluding 1994). That’s 109 best-of-seven postseason series. How have they shaken out?

Before I answer that, let me address something you may be thinking: Arbitrary endpoints. There have been World Series since 1903. (I’m ignoring the 1884-1890 World Series and 1892 Championship Series, which were considered exhibitions.) Why am I limiting the analysis to 1969 and later? Well, baseball in 1903 was a lot different than baseball in 2016, and comparisons from over a century ago are strained.

Yes, the difference between 1969, which I’m including, and 1968, which I’m not, was just one year, but there were significant changes to the strike zone, the mound, and the composition of the two leagues (divisional play, four expansion teams) in 1969. Runs per game shot up 19 percent from 1968 to 1969, the largest one-year increase in the two-league era. It’s as clean a divide as we’ve seen. Further, as I’ll explain, the addition of playoff rounds changes the dynamic.

For the LCS, there were eight four-game sweeps. Sixteen teams won in five, 23 in six, and 15 in seven. In the World Series, there were nine sweeps, 11 five-game series, 11 that went six, and 16 that went the distance. So adding them together, there have been:

17 four-game series
27 five-game series
34 six-game series
31 seven-game series

Let’s compare the distribution to that of the World Series of Coin Flipping:
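The comparison can be recomputed from the numbers already given. This is a minimal sketch: the series counts are the combined LCS and World Series tallies listed above, and the coin-flip probabilities are the ones worked out at the start of the article. The variable names are illustrative.

```python
# Series counts by length, from the combined tallies above.
observed = {4: 17, 5: 27, 6: 34, 7: 31}
# Fair-coin probability of each series length, from the earlier arithmetic.
fair_coin = {4: 0.125, 5: 0.25, 6: 0.3125, 7: 0.3125}

total = sum(observed.values())
shares = {games: count / total for games, count in observed.items()}

print(f"{'Games':>5} {'Series':>6} {'Baseball':>9} {'Coin flip':>10} {'Diff':>7}")
for games in sorted(observed):
    diff = shares[games] - fair_coin[games]
    print(f"{games:>5} {observed[games]:>6} {shares[games]:>9.1%} "
          f"{fair_coin[games]:>10.1%} {diff:>+7.1%}")
```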

See what’s going on there? There are more four-game sweeps and fewer seven-game series in baseball than in random coin flips. Here it is graphically:

Why is there a difference?

You’re probably thinking: Because baseball isn’t coin flips. And you’re right! Ratings for our World Series of Coin Flips would be terrible (though the games would be shorter). More to the point, a coin flip is a 50/50 proposition. Baseball isn’t. There are some series that appear to be evenly matched, but then there are teams like the 1998 Yankees, with a 114-48 regular-season record, who went 11-2 in the postseason. Every team they played was inferior. So might that explain what’s going on here?

Maybe. After all, if the odds aren’t 50/50, we’d expect more four-game series and fewer seven-game series, which is exactly what we’ve got. We can test for that. If the coin isn’t fair, the results will be different. If, for example, we have a coin that comes up heads 75 percent of the time, we’d expect heads to win most of the time, often in short succession. All told, we’d expect to see our series go to four flips 32.0 percent of the time, five flips 32.8 percent, six flips 22.0 percent, and seven flips 13.2 percent.
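For an unfair coin the same counting argument still works, with the two sides weighted differently. Below is a minimal sketch under that assumption; the function name `length_probs` and its parameters are illustrative. A series ends on flip n when the winner takes that flip plus three of the previous n - 1.

```python
from math import comb


def length_probs(p, wins_needed=4):
    """Probability that a first-to-wins_needed series lasts n flips,
    when heads comes up with probability p on every flip."""
    q = 1 - p
    probs = {}
    for n in range(wins_needed, 2 * wins_needed):
        # The winner takes the last flip and wins_needed - 1 of the first n - 1.
        ways = comb(n - 1, wins_needed - 1)
        losses = n - wins_needed
        heads_wins = p ** wins_needed * q ** losses
        tails_wins = q ** wins_needed * p ** losses
        probs[n] = ways * (heads_wins + tails_wins)
    return probs


for n, prob in length_probs(0.75).items():
    print(f"{n} flips: {prob:.1%}")
```

Calling `length_probs(0.5)` gives the fair-coin case from the top of the article.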

It turns out that we can minimize the error between the actual distribution of postseason series lengths and the theoretical distribution by assuming that the coin comes up heads more than 50 percent of the time. The best-fitting number is about 58.3 percent. That gives us something like this:
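One way to run a fit like this is a simple grid search. The sketch below is an assumption about the method, not a record of it: the article does not say which error measure was minimized, so this uses the sum of squared differences between the observed and theoretical shares. The names `length_probs`, `squared_error`, and `best_p` are illustrative.

```python
from math import comb

# Combined LCS and World Series counts by series length, from the article.
OBSERVED = {4: 17, 5: 27, 6: 34, 7: 31}


def length_probs(p):
    """Best-of-seven length distribution when one side wins each game with probability p."""
    q = 1 - p
    return {n: comb(n - 1, 3) * (p ** 4 * q ** (n - 4) + q ** 4 * p ** (n - 4))
            for n in range(4, 8)}


def squared_error(p):
    """Assumed error measure: sum of squared gaps between observed and model shares."""
    total = sum(OBSERVED.values())
    model = length_probs(p)
    return sum((OBSERVED[n] / total - model[n]) ** 2 for n in OBSERVED)


# The distribution is the same for p and 1 - p, so search only 0.500 to 1.000.
candidates = [0.5 + i / 1000 for i in range(501)]
best_p = min(candidates, key=squared_error)
print(f"best-fitting single-game win probability: {best_p:.3f}")
```

A different error measure (absolute differences, say, or a likelihood fit) would shift the answer somewhat, so the result should be read as approximate.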

Or, graphically:

See how the yellow bars and the red dots line up better? This underlying assumption—that one side of the coin, or one team, can be expected to win 58.3 percent of the time—creates a better fit to what actually has happened.

But now we have another problem. The 58.3 percent part … does that make sense? Is there a World Series matchup between two teams in which one of the teams goes into each game with a 58.3 percent likelihood of winning? In 1981, Bill James introduced the Log5 formula, used to calculate the expected winning percentage when two teams of unequal ability play one another.

For example, if a team with a talent level of .600 plays a team with a talent level of .400, Log5 predicts that the .600 team has a 69.2 percent chance of winning. If a .600 team plays a .500 team, the odds drop to 60.0 percent. If the opponent is a .550 team, the probability falls further, to 55.1 percent.
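The Log5 calculation is short enough to write out. Here is a minimal sketch of the standard form of the formula, with `log5` as an illustrative function name and talent levels expressed as winning percentages between 0 and 1.

```python
def log5(a, b):
    """Log5 estimate of the chance that a team with winning percentage a
    beats a team with winning percentage b."""
    return a * (1 - b) / (a * (1 - b) + b * (1 - a))


# The three matchups described above.
for opponent in (0.400, 0.500, 0.550):
    print(f".600 team vs {opponent:.3f} team: {log5(0.600, opponent):.1%}")
```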

So what kind of teams would yield a 58.3 percent winning probability for the better team, which best fits the distribution of series length we’ve seen since 1969? Well, it’s a sliding scale. If a 75-87 team plays a 62-100 team, the better team can expect to win 58.2 percent of the time. The thing is, we don’t get a lot of 75-87 teams in the postseason. So let’s look at a team that went 93-69. A team with that record can expect to win 58.3 percent of the time against a team that wins 79 or 80 games.
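The question can also be turned around: given one team's record and a target win probability, solve Log5 for the opponent. This is a minimal sketch, assuming a 162-game schedule; `opponent_pct` is an illustrative name for the algebraic inverse of the Log5 formula.

```python
GAMES = 162  # assumed full regular-season schedule


def opponent_pct(team_pct, win_prob):
    """Invert Log5: the opponent winning percentage at which a team with
    winning percentage team_pct wins a single game with probability win_prob."""
    return (team_pct * (1 - win_prob)
            / (team_pct * (1 - win_prob) + win_prob * (1 - team_pct)))


team_pct = 93 / GAMES  # a 93-69 team
opp_wins = GAMES * opponent_pct(team_pct, 0.583)
print(f"opponent wins over {GAMES} games: {opp_wins:.1f}")
```

The answer falls between two whole-number win totals, which is why the text gives the opponent as a team that wins 79 or 80 games.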

Think about that for a second. The pattern of series lengths—four, five, six, or seven games—in the 109 best-of-seven postseason series since 1969 best resembles the outcome of a best-of-seven series between this year’s Red Sox and the Marlins. Or the Dodgers and the White Sox. Or the Nationals and the Royals. Or the Yankees and the Phillies. That doesn’t seem right, does it? Postseason series may be uneven, due to personnel or luck or injuries, but they’re not that lopsided. Are they?

So what’s going on here? Are we seeing good teams play mediocre ones in the postseason without realizing it? I think there are three explanations.

First, while 109 best-of-seven series aren’t a tiny sample size, they aren’t a huge number, either. Maybe we should be looking at all postseasons. Prior to 1969, there weren’t any Championship Series, but of the 66 best-of-seven World Series (it was best-of-nine in 1903, 1919, 1920, and 1921, and not played in 1904), there were 27 that went seven games, 13 that went six games, 16 that went five games, and 10 that went four games. Add those to the 109 in this study, and the distribution looks a lot more like the 50/50 coin flip distribution:

But, as I pointed out earlier, there are some significant differences between baseball prior to 1969 and the years since. Further, the addition of the LCS, a second round of the postseason in 1969, and the Division Series, a third round in 1995 (and the 1981 strike season), increased the likelihood that a superior team will be eliminated early, creating more unbalanced contests (if not Red Sox vs. Marlins unbalanced).

Second, I’m using win-loss record to define which team is better. That’s fine in general, but may not necessarily apply in the postseason, when (take your choice) a dominant starter or two, a shutdown bullpen, contact hitters, or a home run-based offense may be a superior predictor of success. Postseason matchups since 1969 may not be Dodgers vs. White Sox on the face of them, but in some cases, that may be how they work in October.

Finally, this could just be noise. As noted, 109 series isn’t an enormous sample size. The disparities shown in the first table and graph in this article—four-game series comprising 15.6 percent instead of the theoretical 12.5 percent and seven-game series comprising 28.4 percent instead of the theoretical 31.3 percent of postseason series—are notable, but don’t rise to the level of statistical significance. Maybe we’re just going through a fallow period for long series, just as 1955-1968, when 10 of 14 World Series went seven games, was a bumper crop.
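One way to check the significance claim is a chi-square goodness-of-fit test against the fair-coin distribution. The sketch below is an assumption about the method: the article does not say which test was used. The 5 percent critical value for three degrees of freedom, 7.815, is taken from standard chi-square tables.

```python
# Observed series counts for 4, 5, 6, and 7 games, from the article.
observed = [17, 27, 34, 31]
# Fair-coin probabilities for the same four series lengths.
fair_coin = [0.125, 0.25, 0.3125, 0.3125]

total = sum(observed)
expected = [total * p for p in fair_coin]

# Pearson chi-square statistic: sum of (observed - expected)^2 / expected.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Four categories give three degrees of freedom; 7.815 is the 5 percent cutoff.
CRITICAL_5PCT_DF3 = 7.815
significant = chi_square > CRITICAL_5PCT_DF3

print(f"chi-square = {chi_square:.3f}, significant at 5 percent: {significant}")
```

The statistic comes out far below the cutoff, which agrees with the conclusion that the gap from the coin-flip distribution is not statistically significant.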

I’d argue for a combination of the latter two factors. I do believe that the three rounds of the postseason (four if you count the Wild Card game) create the possibility, if not the likelihood, that postseason matchups are more imbalanced than they were in the past. That imbalance likely exacerbates differences between teams in the traits that lead to postseason success.

But we can’t dismiss randomness, either. We shouldn’t be surprised to see more seven-game series (like the last World Series, especially Game 7, please) going forward, and for non-gambler’s fallacy reasons. Postseason matchups may not be as close as they were before divisional play, but they’re probably closer than the past few decades might suggest.

Rob Mains

@Cran_Boy

D1Johnson

11/29

Why does the Baseball seven-game series percentage change from 28.4% in the first table to 27.5% in the second table?

mainsr

11/29

Because I'm a moron. Thought I'd fixed that, forgot to save the fix. Thanks for catching it; I'll get it fixed this AM once I can get to a hotspot.

mainsr

11/29

The 28.4 figure in the first table is correct, so the error in the second table should be (0.3%).

johnwood427

11/29

this is great

mainsr

11/29

Thanks, glad you liked it.

Silvergun

11/29

Where's the jerk pointing out tails' probability is closer to 51%?

RIGHT HERE!

mainsr

11/29

Ah, but I have a Certified Fair Coin. (The 50/50 one, not the 58.3/41.7 one)

greenengineer

11/29

And here lies the basic problem with the extended post season - it decreases the likelihood of the two strongest teams playing each other.

newsense

11/29

One variation on the explanations you propose: a team that falls behind in a seven game series (particularly if they are two or three games down) may adopt riskier strategies to avoid elimination at the cost of reducing their chances of winning subsequent games. Such an example would be overtaxing their best pitchers in an elimination game.

mainsr

11/29

Yes, good example. That wouldn't happen before 1969 unless there was a down-to-the-wire pennant race (e.g. NL 1951).

lichtman

11/29

While we have little idea how much noise is present in that 58%, I think it is perfectly reasonable for the average superior team to be 16% better than its opponent for various reasons. I think we can easily simulate seasons to test that. The only thing we really need is an assumption about the spread in talent among teams. Interesting exercise for an aspiring saberist!

mainsr

11/29

Thanks, I have two thoughts. First, do you think the average superior postseason team is 16 percentage points better than its opponent on a true talent basis? That seems high to me, but as I suggested, we don't know all the factors that go into postseason vs. regular season superiority. Second, I agree that the 58% number contains noise, and may well settle down, but I figured I'd throw the idea out there (the prior research I saw, based on WS, suggested anomalously high levels of 7-game series, I assume owing to that blip in the 50s and 60s) for future research. The sample size grows agonizingly slowly.

hdruschel

12/13

Weeks late to this, but: one possibility (and I think it's a long shot) is that bullpen use could be making unbalanced matchups even more lopsided. If you take a series where Team A wins 55% of the time, and Team B wins 45% of the time, and the two teams are totally identical but Team A is just slightly better at everything, and then give them both super-elite bullpens, it seems like Team A would see a bump in their odds. A great bullpen needs a good offense and starting rotation to do much of anything, and in this hypothetical, it seems like Team A would get more early leads for their super-bullpen to come in early and hold, while Team B would see their super-bullpen languish.
This would line up nicely with the observed time frame, since it coincides with the rise of the modern bullpen. Ultimately, I think it's probably a coincidence, because I have trouble accepting that the bullpen's effect could be that large, but it seemed like an interesting possibility.

mainsr

12/18

And I'm weeks late in replying! Sorry about that.

I think you have a point here. I'm thinking of this year's Indians. They kind of limped into the postseason, with the Salazar and Carrasco injuries, and no Brantley, plus they compiled their record in part by playing in the relatively weak AL Central, where they won 70% of their games against three clubs (Chicago, Detroit, KC--they somehow went only 10-9 against the Twins) that comprised 35% of their schedule. But that bullpen, and Francona's management of it, seems to have shifted their odds in each series. I don't know whether two teams with super-elite bullpens bumps up the odds for one team or the other (though that gives me a research idea) as much as the team that does have a super-elite bullpen increases its own odds more than they would be increased during the regular season (when you can't, say, use Andrew Miller in two thirds of your games, averaging 1.9 innings per appearance--that's a 209 inning season!). But I like the point, thanks.
