April 28, 2003 Doctoring The Numbers: Hot Starts, Part II
Welcome to Part 2 of our look at the importance of hot starts. If you haven't already, read Part 1 first. We'll wait for you to get back.

Last time, I looked at how teams fared at season's end after starting the season with a particular record, varying the data by looking at starts of varying lengths. While I pointed out general trends in the data (as well as the exceptions that proved the rule), I did not sum up the data concisely into a single, coherent formula to predict a team's final record. That's what today's article is about. In Part 3 (yes, there will be a Part 3), I want to examine how the interaction between a team's record at the start of the season and its record the previous season affects its final winning percentage.

WARNING: This article contains lots of graphs, formulas, and other items found in an egghead's tool kit. If numbers aren't your thing, you're welcome to scroll down to the end and skip to the conclusions. For those of you who like to know what went into the batter (no, not that kind of batter), read on.

As I pointed out in Part 1, there are measurable differences in the final performances of teams based on their record after even just 10 games. The following graph compares a team's winning percentage after 10 games (meaning there are only 11 possible records, ranging from 0-10, 1-9, etc., all the way to 10-0) to its final winning percentage. The graph shows plots for all 1,300-odd teams in the study:
The blue line represents a "best-fit" linear regression that comes closest to representing the fortunes of every team in a single formula. That formula, expressed numerically, is:

Y = .398 + .205 * X
r² = 0.197

Where Y is the team's final winning percentage, and X represents the team's winning percentage after 10 games.

There are two numbers in that formula that are important to understand. The .398 figure represents what was called a "Y-intercept" in your 8th-grade math class, while the .205 figure represents the "slope". Essentially, the Y-intercept represents what a team's overall winning percentage would be if their initial winning percentage was 0; i.e., they started 0-10. An 0-10 team would therefore be projected to win 39.8% of their games, which corresponds to about a 65-97 record.

The slope represents the impact that an increase in the team's winning percentage at the start would have on their final winning percentage. A 1-9 team has a .100 winning percentage; multiply .100 by .205 and you get .0205, so you would expect a 1-9 team to finish with a winning percentage .0205 better than the 0-10 team, which comes out to .4185, or about a 68-94 record. We'll get to the meaning of r² in a moment.

Now compare the previous graph to this one, which looks at the same data after 40 games:
Two differences should be apparent with this graph: The angle of the "best-fit" line is steeper than in the previous graph, and the individual data points are clustered closer together. Very few of the data points are located far away from the best-fit line, whereas in the 10-game graph, the data points were much more scattered, with many data points occurring far from the best-fit line. Here is the linear regression formula for the 40-game graph:

Y = .2145 + .571 * X
r² = 0.533

The Y-intercept is much lower than in the 10-game formula, while the slope is much higher. What this means, in a nutshell, is that changes in a team's winning percentage at the 40-game mark are much more meaningful than after 10 games. This should make intuitive sense. A 100-point swing in winning percentage after 40 games is the difference between a 20-20 team and a 24-16 team, which ought to be more significant than the difference between a team that starts 5-5 and one that starts 6-4.

And as the slope of the line increases, its Y-intercept must decrease, because the line is "centered" around the .500 mark on both axes. (If you plug .500 into either formula, you should get .500 out the other side, give or take a thousandth of a point.) The centering of the line around .500 corresponds to the principle of regression to the mean, which exists in baseball as in virtually all measurable things. As a whole, teams that play .700 ball at the start of the season are not going to keep up that pace all season. But as the Y-intercept drops, the centering force weakens, and teams tend to finish closer to the initial pace. Obviously, teams that start 28-12 are likely to finish closer to .700 than teams that start 7-3.

One other point that's important to make: You see how the "r²" figure is much greater for the 40-game graph than for the 10-gamer? That figure represents the coefficient of determination of the data, or in plain English, how closely the best-fit line predicts the actual data.
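For readers who would rather check the arithmetic than take it on faith, here is a minimal Python sketch of the two best-fit lines quoted above. The names `FITS`, `SEASON_GAMES`, and `project` are illustrative only and do not come from the original study, and the 162-game schedule is an assumption.

```python
# (Y-intercept, slope) of the two best-fit lines quoted in the article,
# keyed by the number of games played at the start of the season.
FITS = {10: (0.398, 0.205), 40: (0.2145, 0.571)}

SEASON_GAMES = 162  # assumption: a standard 162-game schedule


def project(games_played, start_pct):
    """Return (final winning percentage, projected wins) from a best-fit line."""
    intercept, slope = FITS[games_played]
    final_pct = intercept + slope * start_pct
    return final_pct, final_pct * SEASON_GAMES


# Print projections for a few starting paces under each line.
for games_played in (10, 40):
    for start_pct in (0.0, 0.1, 0.5, 0.7):
        final_pct, wins = project(games_played, start_pct)
        print(f"{games_played} games, start {start_pct:.3f}: "
              f"final {final_pct:.4f}, about {wins:.1f} wins")
```

Plugging in .500 returns a figure within a thousandth of .500 for both lines, and a .700 start projects noticeably higher under the 40-game line than under the 10-game line, which is the weakening pull toward .500 described above.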
As I mentioned before, the data points in the 40-game graph appear much more clustered together than in the 10-game graph. Again, this is intuitive; teams are much more likely to play at or close to their true level over a 40-game stretch than over just 10 games. The r² value in the 10-game graph is very small, .197, indicating that a team's initial winning percentage does not do a very good job of predicting its final record. In the 40-game graph, the r² of .533 is much higher and implies, obviously, that a team's winning percentage after 40 games is much more indicative of what its final record is going to be.

(Whew. Breathe.)

To save space, I won't display best-fit graphs for every number of games in the study. But below, I've listed the formula at every five games, up to 50 games.

After 5 games:  Y = .4527 + .0952 * X   r² = .079
After 10 games: Y = .3983 + .2053 * X   r² = .197
After 15 games: Y = .3612 + .2779 * X   r² = .254
After 20 games: Y = .3186 + .3630 * X   r² = .339
After 25 games: Y = .2904 + .4174 * X   r² = .383
After 30 games: Y = .2583 + .4827 * X   r² = .449
After 35 games: Y = .2412 + .5183 * X   r² = .475
After 40 games: Y = .2145 + .5709 * X   r² = .533
After 45 games: Y = .1950 + .6096 * X   r² = .563
After 50 games: Y = .1719 + .6568 * X   r² = .608

The trend is clear: The more games a team has played at the start of the season, the more significance their record has (as demonstrated by the increasing slopes), the less centered around .500 their overall record is (as demonstrated by the decreasing Y-intercepts), and the more precisely their final record can be projected (as demonstrated by the increasing r² figures).

Let's give one of these formulas a test run, using the 1984 Tigers' legendary 35-5 start. Using the 40-game formula:

Y = .2145 + .5709 * .875
Y = .2145 + .4995
Y = .7140

So the Tigers would have been projected to finish with a .714 winning percentage, which translates to 115.7 wins.
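The test run above can be reproduced with a small lookup table. This is only a sketch: the coefficients are copied from the list of formulas, while the names `FITS_BY_GAMES` and `projected_finish` and the 162-game schedule are assumptions made for illustration.

```python
# (Y-intercept, slope, r-squared) at each five-game checkpoint,
# copied from the list of formulas in the article.
FITS_BY_GAMES = {
    5: (0.4527, 0.0952, 0.079),
    10: (0.3983, 0.2053, 0.197),
    15: (0.3612, 0.2779, 0.254),
    20: (0.3186, 0.3630, 0.339),
    25: (0.2904, 0.4174, 0.383),
    30: (0.2583, 0.4827, 0.449),
    35: (0.2412, 0.5183, 0.475),
    40: (0.2145, 0.5709, 0.533),
    45: (0.1950, 0.6096, 0.563),
    50: (0.1719, 0.6568, 0.608),
}


def projected_finish(wins, losses, season_games=162):
    """Project (final winning percentage, wins) for a start of wins-losses.

    Only works at the checkpoints listed above; any other number of games
    raises KeyError, because the article gives no formula for it here.
    """
    games = wins + losses
    intercept, slope, _r_squared = FITS_BY_GAMES[games]
    final_pct = intercept + slope * (wins / games)
    return final_pct, final_pct * season_games


# The 1984 Tigers' 35-5 start, using the 40-game formula.
pct, wins = projected_finish(35, 5)
print(f"{pct:.4f} winning percentage, {wins:.1f} wins")
# prints "0.7140 winning percentage, 115.7 wins"
```

A dictionary keyed by games played mirrors the "look it up in a chart" approach, which is exactly the inconvenience a single all-purpose formula is meant to remove.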
They only finished with 104, but I think that 115.7 wins is a very good estimation, given that: 1) The Tigers had the best 40-game start in history, and it's not close: only one other team in my study (the 1939 Yanks) won more than 31 of their first 40 games; 2) as the 1998 Yankees and 2001 Mariners have shown, it is, in fact, possible for a team in the modern era to win around 115 games. Whether or not the Tigers would have finished with a better record had they not wrapped up the AL East by midseason is a question I can't answer, but it's worth noting that when their games started having meaning again, in the postseason, Detroit went 7-1.

We're only halfway to our goal. What we want is a single formula that applies for all teams, no matter how many games they have played, so you don't have to look up a chart to see what formula applies at the 37-game mark, for instance. What's clear from the above numbers is that the slope and Y-intercept of a formula are dependent on the number of games the team has played. So let's plot another graph, looking at how the slope and Y-intercepts change depending on the number of games played:
As you can see, the two plots on this graph are significantly more related than in the graphs above; the best-fit line nearly runs through all the data points. The r² of this formula is .975, which is very close to the maximum r² of 1. (Technical note: there are different r² values for each of the best-fit lines, but in these graphs those values are almost identical, so I'll just refer to a single value from this point on.)

There is one problem with the graph, though. Both data sets tend to curl as you approach zero games; the 5-game data points have the worst fit of any of the points on the graph. Which suggests a straight-line approach may not be the best way to come up with a best-fit line, because if you extend the line all the way to zero, funny things happen. The formula based on this best-fit line would suggest that a 1-0 team should have a .551 winning percentage, when we know from my previous article that teams starting 1-0 only had a composite winning percentage of .510.

There are many ways to get around this problem, and the most direct method is simply to change the X-axis; that is, rather than using games played as your independent variable, use some derivative of games played. After playing with the numbers for a lot longer than I should have, I can tell you that using the square root of games played as the independent variable yields the best fit, as the following graph shows:
The r² values of the best-fit lines on this graph are .999, which is about as close to perfect as we can get. Which means we should have our perfect formula. We should...except that it looks like this:

Y = 0.5818 - (.058 * SQRT(G)) + (((.1158 * SQRT(G)) - .1623) * X)

Where X is the team's winning percentage, and G is the number of games played.

That, my friends, is one ugly-looking formula. You're welcome to use it if you like, but I'm going to keep looking for something a little simpler, even if it means losing a tiny bit of accuracy. In the end, I elected to cheat a little, and simply deleted all data before the 10-game point from the study completely. Essentially, I'm biting the bullet and stating that it's not even worth looking at a team's start until they've played 10 games. But from the 10-game point to the 50-game point, the graph looks more like a straight line:
The r² of this graph is .983, which means we've shaved about a third of the error off the graph simply by eliminating that troublesome 5-game data. Best of all, it means we now have a straightforward formula that we can use. (Technical note for the mathematicians in the group: the below formula was actually derived from a multivariate linear regression of all the data in the original study, with winning percentage and (winning percentage * games) as the independent variables. Many thanks to Jeff Hildebrand for his assistance.) That formula is:

Y = 0.4428 - (.0057 * G) + (X * (.1145 + (.0114 * G)))

Yes, that still looks about as appealing as a Jack Nicholson/Kathy Bates naked hot tub scene. Let's simplify it a little:

Y = 0.4428 + (X * .1145) - (G * .0057) + (G * X * .0114)
Y = 0.4428 + (X * .1145) + [(2X - 1) * G * .0057]

Which, after knocking off a not-so-significant significant figure here and there, gives us our final formula:

Y = 0.443 + (X * .114) + [(2X - 1) * G * .0057]

All right, someone invite the mathophobes back in the room. Let's explain this formula a bit. Essentially, there are three terms to the formula, which are:

1) A constant, .443
2) The team's current winning percentage times .114, or (X * .114)
3) A term that depends on both winning percentage and games played, or [(2X - 1) * G * .0057]
What this means is that any team at .500 would be expected to finish at (.443 + (.114 * .5)), or (.443 + .057), or .500 exactly. Which makes sense. But it also means that teams over .500 will see their expected winning percentage increased by a factor of how far they are over .500, as well as the number of games they have played.

Let's look at two teams with identical winning percentages, a team that starts 7-3 and one that starts 21-9. Both teams, by virtue of their .700 winning percentage, would have identical first terms (.443) and second terms (.7 * .114, or .0798). But the final terms would be different:

7-3 projects to: .443 + .0798 + [0.4 * 10 * .0057] = .5228 + .0228 = .5456
21-9 projects to: .443 + .0798 + [0.4 * 30 * .0057] = .5228 + .0684 = .5912

The exact opposite effect occurs on sub-.500 teams; the more games they've played, the more their expected winning percentage will drop.

Going back to the formula again:

Y = 0.443 + (0.114 * X) + [(2X - 1) * G * .0057]

But wait. If X is winning percentage, it can be expressed as W/G, right? So (2X - 1) = 2(W/G) - 1.

(2X - 1) * G * .0057
= ((2W/G) - 1) * G * .0057
= ((2W - G)/G) * G * .0057
= (2W - G) * .0057

And since G = wins + losses, (2W - G) * .0057 becomes:

(2W - (W + L)) * .0057
= (2W - W - L) * .0057
= (W - L) * .0057

Which means our final formula is:

Y = 0.443 + (0.114 * X) + [(W - L) * .0057]

Where X = current winning percentage and (W - L) is wins minus losses, or simply games above .500. It's not exactly the Pythagorean formula (the real one or the baseball one, take your pick), but it's not nearly as complicated as wading through all those charts and graphs, is it?

Disclaimer: Keep in mind that this formula is only valid after a team has played 10 games. On the other side of the graph, I make no claims that the formula works beyond 50 games either. Actually, you can see that it shouldn't, because if you continue the lines far enough, eventually the slope will exceed one and the Y-intercept will go below zero, which is impossible.
(If it weren't, eventually we'd be projecting the Tigers to finish with 170 losses.)

So, to answer the question that first sparked this entire series: Where should we expect a certain collectively-possessed-by-aliens Midwestern team to finish? If that Midwestern team happens to be, say, 17-5, then we can calculate their expected finish to be:

Y = 0.443 + (.114 * .773) + [(17 - 5) * .0057]
Y = 0.443 + .088 + (12 * .0057)
Y = 0.531 + .0684
Y = 0.5994

Which is to say, the Royals should finish with approximately 97.1 wins.

Hmmm. There's something fishy about that result. Namely: Shouldn't it trouble us slightly that we're projecting the Royals' finish based on the first 22 games of the season, yet we're not taking into account their previous 162 games (that is, last year's record) at all? Shouldn't the fact that the Royals lost 100 games last year dampen our expectations slightly?

It should. And I'll cover that very topic next time. If there is a next time. I have this exam to...oh, never mind.
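For anyone who wants to try the final formula on their own team, here is a minimal Python sketch of Y = 0.443 + (0.114 * X) + [(W - L) * .0057]. The function name `project_final_pct` is hypothetical, the 162-game schedule is an assumption, and the 10-to-50-game guard simply enforces the range the disclaimer describes.

```python
def project_final_pct(wins, losses):
    """Project a final winning percentage from a start of wins-losses.

    Implements Y = 0.443 + 0.114 * X + 0.0057 * (W - L), where X is the
    current winning percentage. The article only claims validity between
    10 and 50 games played, so anything outside that range is rejected.
    """
    games = wins + losses
    if not 10 <= games <= 50:
        raise ValueError("formula is only claimed to hold from 10 to 50 games")
    current_pct = wins / games
    return 0.443 + 0.114 * current_pct + 0.0057 * (wins - losses)


SEASON_GAMES = 162  # assumption: a standard 162-game schedule

# The three starts worked through in the article.
for wins, losses in ((7, 3), (21, 9), (17, 5)):
    pct = project_final_pct(wins, losses)
    print(f"{wins}-{losses}: {pct:.4f}, about {pct * SEASON_GAMES:.1f} wins")
```

The 7-3 and 21-9 starts come out at .5456 and .5912, and the 17-5 start lands at roughly .5995, or about 97.1 wins; the tiny difference from the .5994 worked out by hand comes from rounding .114 * .773 to .088 along the way.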
Rany Jazayerli is an author of Baseball Prospectus.
