August 18, 2005
Crooked Numbers: Royal Flush
There's bad, there's the Colorado Rockies, and then there's the Kansas City Royals. If you're into the Jayson Stark "Useless Info" columns, you could easily notch thousands of words about how bad the Royals have been for the past decade or more, a situation only highlighted by their recent losing streak. It's a tough time to be a Royals fan, and if you're one of the few, the proud, read on and perhaps you'll feel a little bit better about your team.

To start, let's get some perspective. The Royals' streak of 18 straight losses is not the worst run of baseball of all time. The worst losing streak in the major leagues since 1901 belongs to the 1961 Phillies, who managed to lose 23 games in a row from 7/29/61 to 8/20/61. Interestingly, it could have been a lot worse; the Phillies lost five in a row just before the streak, so they actually lost 28 of 29 games in what may very well be the worst month any team has ever had. Here are the rest of the worst:

Year  Team                   Games
1961  Philadelphia Phillies  23
1988  Baltimore Orioles      21
1969  Montreal Expos         20
1943  Philadelphia A's       20
1916  Philadelphia A's       20
1906  Boston Red Sox         20
1975  Detroit Tigers         19
1914  Cincinnati Reds        19
1906  Boston Braves          19

The Royals are on the cusp of greatness, but they're not quite there yet.

But how bad is this streak? People have a tendency to grasp onto streaks because they're easily quantified. A team that's lost 15 games in a row is clearly worse than a team that's lost 12 or 10 games in a row. But streaks are as easily broken as they are quantified. Take baseball's greatest streak of all time: Joe DiMaggio's 56-game hitting streak. As many of you know, after his streak was broken DiMaggio hit in another 16 games straight, meaning the Yankee Clipper notched a hit in 72 of 73 games. It's easy to say that DiMaggio's 56-game hit streak was more impressive and improbable than Pete Rose's 44-game streak, but what's more impressive: hitting in 56 games in a row or hitting in 72 of 73 games?
To determine this, we need to get into some binomial distributions. If we assume that DiMaggio had a "true" probability of getting a hit in a game, then the question becomes quite simple. However, that's not quite true because we should instead assume he had a probability of getting a hit in an at bat; as such, things become much more complicated.

Instead, let's get back to teams and winning games. The odds that a team with a winning percentage w will win any x number of games in a row is simply w^x. Conversely, the odds that they will lose any y number of games in a row is (1 - w)^y. This is a binomial distribution, but a very simple one. If, however, the goal is to determine if a team will win at least 2 of 3 games, the formula becomes more complex because more situations meet the standards for success. For example, if the team wins all three games, wins the first two, wins the last two, or wins the first and last games, all four situations must be counted. The odds of the team winning two particular games and losing the third, w^2 * (1 - w), must be multiplied by three, since there are three ways in which the team can win two out of three games. That result must then be added to the odds that they will win all three, w^3.

The key to the puzzle is Pascal's Triangle, a tool that reveals the binomial coefficient by which each result must be multiplied (three, in the case above). Essentially, the triangle shows how many different ways the final counted result can be achieved by different distributions of the binomial choice. There are three different ways the team can win two games and lose one, but only one way in which they can win all three. This is also referred to as "x choose y": essentially, if one is faced with the decision to choose y games out of x total games, how many different combinations of y games there are.
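As a rough illustration of the two-of-three example above, here is a minimal Python sketch. The function names `prob_exactly` and `prob_at_least` are made up for this illustration, and the .500 winning percentage is an arbitrary choice rather than a figure from the article. Python's standard `math.comb` supplies the Pascal's Triangle entry.

```python
from math import comb


def prob_exactly(n: int, k: int, w: float) -> float:
    """Probability of exactly k wins in n games at per-game win probability w."""
    # comb(n, k) is the Pascal's Triangle entry: the number of different
    # orderings that produce k wins and n - k losses.
    return comb(n, k) * w**k * (1 - w) ** (n - k)


def prob_at_least(n: int, k: int, w: float) -> float:
    """Probability of k or more wins in n games."""
    return sum(prob_exactly(n, j, w) for j in range(k, n + 1))


if __name__ == "__main__":
    w = 0.5  # illustrative winning percentage (an assumption, not from the article)
    # Ways to win exactly two of three, and ways to win all three.
    print(comb(3, 2), comb(3, 3))
    # 3 * w^2 * (1 - w) + w^3
    print(prob_at_least(3, 2, w))
```

The sketch assumes every game is an independent trial with the same win probability.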
Getting back to the Royals, we run into another problem when estimating the probability of their losing streak or comparing it to other losing stretches: what is the Royals' probability of winning an individual game? In baseball, we typically assume that this probability is a team's winning percentage. Isn't this what the whole regular season is about, determining who's the best team by who has the highest probability of winning a baseball game? But as Keith Woolner reminded us before the season, 162 games isn't enough time to properly discern a team's "true" winning percentage, the probability that they will win any given game.

This is to say nothing of the fact that a team's probability of winning any given game is not a constant. Winning probability is affected by any number of factors: whether the team is home or away, who the starting pitchers are, who's in the lineup, who the opposing team is, and so on. We generally like to assume that those kinds of breaks even out over the course of 162 games, but if they did, then BP's Quality of Batters Faced and Quality of Pitchers Faced reports and all that hand-wringing about the unbalanced schedule and the wild card would be for naught.

While a team's winning percentage over 162 games is the best guess we have about their true probability, we must admit that there is an overwhelming probability that that number is wrong. It's going to be close, but the odds that the next 162 games would fall exactly the same as the previous 162 are minuscule. It's possible that by using the full season's winning percentage as a guide for a team's true probability of winning any single game, we're making such stretches of losing appear easier, but using only those games not involved in the streak would be an arbitrary removal of data. Thus, the full-season winning percentage is as close as we can get, so we'll stick with that. Caveats aside, let's see what we can do about estimating just how bad the Royals are.
First, let's take a look at the probability that a team with a given probability of winning each individual game will lose a certain number of games in a row. In this graph, there are five hypothetical teams with winning percentages between .500 and .300. Note that their odds of losing the first game are exactly the complement of their winning percentages (one minus the winning percentage), as we'd expect. As the losses pile up, the probabilities decrease dramatically, to the point that by the time we get to 13 or 14 losses, it's nearly impossible to tell the difference between a .500 team and a .300 team. This is encouraging because it means that with streaks of the Royals' magnitude, the winning probability of the team doesn't make that much difference and we can continue knowing that our errors will be small in this regard.

Now, let's assume for a minute that the Royals are actually a .319 team (their current winning percentage). What are the odds that they'll lose any given 18 games in a row? By binomial distribution, we know that that probability is .000984, or approximately 1015.5:1 against. That seems very impressive, but that's only the probability that they'll lose any given stretch of 18 games. A 162-game season can be viewed as 144 separate 18-game opportunities to lose 18 games in a row. While the odds against the Royals losing any given 18 games in a row are 1015.5:1, the odds against their losing some stretch of 18 straight games over the course of a 162-game season are actually closer to 6.6:1. What's more, the Royals, given their .319 winning percentage, had a 50:50 chance of losing 13 games in a row at some point during the season.

How does that compare to historical streaks? Obviously it's not as bad as the '61 Phillies, but it's possible that the Royals are breaking up several losing streaks with lone wins to make things look better. So instead of looking simply at streaks, let's see how bad the Royals are over a given stretch of games.
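The arithmetic behind those streak odds can be sketched in a few lines of Python. This is only an approximation under stated assumptions: the helper names are invented for illustration, each game is treated as an independent trial, and the 144 opportunities are treated as independent of one another even though real windows overlap. Strictly speaking, a 162-game season has 162 - 18 + 1 = 145 possible starting points for an 18-game streak, but the sketch uses the article's count. Because .319 is a rounded winning percentage, the printed values will land near, not exactly on, the figures quoted above.

```python
def prob_losing_streak(w: float, length: int) -> float:
    """Probability of losing `length` specific games in a row: (1 - w)^length."""
    return (1 - w) ** length


def odds_against(p: float) -> float:
    """Convert a probability p into 'X:1 against' odds."""
    return (1 - p) / p


def prob_streak_in_season(w: float, length: int, windows: int) -> float:
    """Approximate chance of at least one such streak among `windows` opportunities.

    Assumption: the opportunities are independent, which overlapping
    stretches of games are not, so this is a rough estimate only.
    """
    p = prob_losing_streak(w, length)
    return 1 - (1 - p) ** windows


if __name__ == "__main__":
    w = 0.319       # Royals' winning percentage as quoted in the article
    windows = 144   # number of 18-game opportunities used in the article

    p_given = prob_losing_streak(w, 18)
    p_season = prob_streak_in_season(w, 18, windows)
    print(f"any given 18 games: p = {p_given:.6f}, odds {odds_against(p_given):.1f}:1")
    print(f"some point in season: p = {p_season:.4f}, odds {odds_against(p_season):.1f}:1")
```

Changing `length` to 13 gives the same kind of estimate for the shorter streak mentioned above, though the result depends heavily on how the opportunities are counted.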
For example, the Royals are 14-40 over their last 54 games, but let's round it off to 50, in which they were 13-37. Compare that to the worst 50-game stretches since 1901:

Year  Team  W   L   Win Pct  Prob    Odds (:1)  In-Season Prob  In-Season Odds (:1)
1916  PHA   4   46  .235     0.41%   240.6      37.15%          1.7
1937  PHA   7   43  .358     0.06%   1796.7     6.04%           15.6
1932  BOS   8   42  .279     3.74%   25.7       98.60%          0.0
1915  PHA   8   42  .283     3.31%   29.2       97.70%          0.0
1961  PHI   8   42  .305     1.51%   65.2       81.80%          0.2
2004  ARI   8   42  .315     1.05%   94.1       69.38%          0.4
1943  PHA   8   42  .318     0.92%   107.3      64.60%          0.5
1949  WS1   8   42  .325     0.71%   138.9      52.22%          0.8
1996  DET   8   42  .327     0.65%   153.5      51.67%          0.9
1979  OAK   8   42  .333     0.50%   197.7      43.17%          1.3
1907  SLN   8   42  .333     0.50%   197.7      43.17%          1.3
1923  BSN   8   42  .351     0.24%   413.6      23.70%          3.2
1982  MIN   8   42  .370     0.10%   1011.8     10.47%          8.5

The Royals are not even close. Getting back to our original question, what's more impressive: the Philadelphia A's going 4-46 over 50 games or the Philadelphia Phillies losing 23 in a row in 1961? Getting back to our binomial distributions, the odds against a .235 team (the '16 A's) winning 4 games or fewer in a given 50-game stretch are about 240.6:1, and over a season a mere 1.7:1. This year's Royals, by virtue of their .319 winning percentage, have a 148.9:1 chance of matching that feat any time in a season (16,733.9:1 in any 50-game stretch).

As mentioned, the odds against the Royals losing 18 games in a row at any point in the season are 6.6:1. Expand that to 50 games and the Royals would have to lose 44 of 50 to match those odds. Furthermore, while there have been several streaks longer than the Royals', only the '37 Athletics and '82 Twins can claim more improbable stretches of bad baseball since 1901. The Royals' streak is already more improbable than all but two stretches of 50 games since 1901, as well as those few teams that notched longer pure streaks than they did. But each game that the Royals lose makes their stretch more and more improbable, likely vaulting them past those few teams remaining ahead of them.
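The 50-game figures above come from the lower tail of the binomial distribution: the chance of winning some number of games or fewer. A hedged Python sketch of that calculation follows. The function names are hypothetical, games are assumed independent with a fixed win probability, and the in-season estimate treats the 162 - 50 + 1 = 113 possible 50-game stretches as independent opportunities, which is an approximation. The window count is an assumption made for this sketch, not a method stated in the article. Rounded inputs mean the output should be close to, not identical to, the 1916 A's row in the table.

```python
from math import comb


def prob_at_most(n: int, k: int, w: float) -> float:
    """Probability of k or fewer wins in n games at per-game win probability w."""
    return sum(comb(n, j) * w**j * (1 - w) ** (n - j) for j in range(k + 1))


def prob_in_season(p_stretch: float, season_games: int, n: int) -> float:
    """Approximate chance that at least one n-game stretch in a season qualifies.

    Assumption: the season_games - n + 1 stretches are independent.
    They overlap in reality, so treat this as a rough estimate.
    """
    windows = season_games - n + 1
    return 1 - (1 - p_stretch) ** windows


def odds_against(p: float) -> float:
    """Convert a probability p into 'X:1 against' odds."""
    return (1 - p) / p


if __name__ == "__main__":
    # 1916 Philadelphia A's: .235 winning percentage, 4-46 over 50 games.
    p_stretch = prob_at_most(50, 4, 0.235)
    p_season = prob_in_season(p_stretch, 162, 50)
    print(f"given 50-game stretch: {p_stretch:.2%}, odds {odds_against(p_stretch):.1f}:1")
    print(f"some stretch in season: {p_season:.2%}, odds {odds_against(p_season):.1f}:1")
```

Swapping in 0.319 for the winning percentage gives the corresponding estimate for this year's Royals matching a 4-46 stretch.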
Is there some solace to be taken in the fact that the Royals' improbably bad stretch was over one 18-game stretch and not a 50-game valley? Maybe, but if you're looking for the most improbable losing streak in baseball, the Royals are certainly making a case.
