August 18, 2009
Ahead in the Count: Home-Field Advantages, Part Two
Last week, we began our look into home-field advantage by examining what home teams actually do better than road teams. It has been well documented throughout baseball history that the home team wins about 54 percent of ballgames, and last week we determined that the home team was better at pretty much everything. Home teams struck out less, walked more, hit more home runs, got more hits on balls in play, made fewer errors, converted more double-play opportunities, stretched more extra-base hits into triples, hit more line drives, and recorded more complete-game shutouts. The home team was able to take an advantage in nearly every aspect of the game. This week, we will carry that discussion of what home-field advantage helps into who it actually helps the most.

On June 4, Cole Hamels shut out the Dodgers, who at that time boasted the best record in the National League. More impressively, this feat came in Dodger Stadium. As mentioned last week, even though road teams scrape out 46 percent of ballgames, they only put together 41 percent of all complete-game shutouts. What was special about Cole Hamels? Was he a hundred feet tall? Was it something about the Phillies? The first clue might have been that, at that point, the Phillies had a 20-6 record on the road, but only a 12-14 record at home for 2009. The average difference between home winning percentage and away winning percentage is eight percent; after Hamels' shutout, the Phillies sat at negative 30 percent! This did not seem to surprise the Philadelphia media all that much. The Phillies only had a regular-season home-field advantage of 4.9 percent in 2008, and in the four years before that they had home-field "advantages" of 6.2, 3.7, 4.9, and 2.5 percent. It seemed that the Phillies played much worse than other teams at home relative to how they played on the road.
Phillies manager Charlie Manuel had an explanation for the team's 2009 performance at the ready: the fanfare surrounding the World Series celebrations was distracting them. There had been ceremony after ceremony through the first several weeks of the season, and Manuel supposed that this was keeping his players from concentrating at home. Last year, Jimmy Rollins said that the Phillies fans had intimidated the home nine, and Rollins even went so far as to call them "frontrunners," immediately giving every cable sports show the hottest topic in the world to run with for a few days. Would the Phillies fans boo Rollins when he came back to town? Were the Phillies fans and their anti-Santa agenda too much? Others suggested that the Phillies were a flyball pitching staff, and that they were thus more vulnerable to Citizens Bank Park's homer-friendly dimensions.

What complicated this speculation was that the Phillies' two percent home-field advantage for 2004-2008 was contradicted by their 7-0 playoff record at home in their World Series run in 2008. They only went 4-3 on the road in the playoffs, meaning that their home-field advantage for the playoffs was 43 percent, more than five times the league-average home-field advantage, and over twenty times their advantage over the previous five seasons. Was there something different about the playoffs? Alternatively, perhaps there was nothing special about either the Phillies in the regular season or the Phillies in the playoffs. Consider the following possibility: perhaps no team has a larger home-field advantage than any other team. That sounds impossible, right?
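The home-field advantage figures quoted for the Phillies can be reproduced directly from the win-loss records. The following is a minimal sketch, assuming home-field advantage is defined as home winning percentage minus road winning percentage, expressed in percentage points; the function names are hypothetical and chosen only for illustration.

```python
def win_pct(wins, losses):
    """Winning percentage as a fraction between 0 and 1."""
    return wins / (wins + losses)


def home_field_advantage(home, away):
    """Home minus road winning percentage, in percentage points.

    `home` and `away` are (wins, losses) tuples.
    """
    return 100 * (win_pct(*home) - win_pct(*away))


# Phillies after Hamels' shutout on June 4, 2009: 12-14 at home, 20-6 on the road
phillies_june_2009 = home_field_advantage(home=(12, 14), away=(20, 6))

# Phillies in the 2008 playoffs: 7-0 at home, 4-3 on the road
phillies_2008_playoffs = home_field_advantage(home=(7, 0), away=(4, 3))

print(round(phillies_june_2009, 1))      # -30.8
print(round(phillies_2008_playoffs, 1))  # 42.9
```

Both numbers come from very small samples (26 and 7 home games, respectively), which is part of why they swing so far from the eight percent league average.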
Look at the home-field advantage over the last five years for every major league team:

Team         HFA%
Phillies     1.98
Orioles      3.32
Tigers       3.95
Angels       3.95
Padres       4.32
Cubs         4.56
Giants       4.84
Marlins      5.31
Mets         5.43
Indians      6.68
D'backs      6.91
Royals       7.11
Cardinals    7.55
Nationals    7.99
White Sox    8.50
Athletics    8.52
Braves       9.14
Reds         9.14
Dodgers      9.88
Rangers      9.88
Pirates     10.73
Yankees     10.86
Twins       11.48
Astros      11.50
Mariners    12.16
Blue Jays   13.47
Rockies     13.94
Red Sox     14.32
Brewers     15.70
Rays        17.66

The Phillies had the smallest home-field advantage in the major leagues over that time span. There was a huge difference between the Phillies and the Rays, who had a winning percentage that was nearly 18 percentage points higher at home than on the road. However, we would not expect every team to have a home-field advantage of exactly eight percent, even if no team had any special home-field advantage; some teams would have some luck at home, or some luck on the road, and the numbers would change. So I checked the correlation between home-field advantage one year and the next for 2004-2008: the correlation was only 0.05. That is not statistically significant, not even close.

Instead of only running a simple year-to-year correlation, I ran an AR(1) intraclass correlation (with some help from Eric Seidman among others). Intraclass correlation is very similar to year-to-year correlation, but gives some extra credit to the correlation if a team is especially good at home in 2006 and 2008, but not in 2007. It looks at each team in general, rather than two consecutive seasons. The intraclass correlation was also only 0.05, which is not statistically significant either. From this, it seems very unlikely that any team has a significantly different home-field advantage than any other team, at least when looking at the last five years.
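To get a feel for how much spread luck alone can produce, here is a rough simulation sketch. It assumes, purely for illustration, that all 30 teams are identical: each wins 54 percent of its home games and 46 percent of its road games, and each plays 81 home and 81 road games per season for five seasons. These simplifications are not part of the analysis above; real teams differ in overall quality, which changes the numbers somewhat.

```python
import math
import random

HOME_GAMES = 81 * 5           # home games per team over five seasons (assumed)
P_HOME, P_AWAY = 0.54, 0.46   # assumed identical win probabilities for every team


def simulate_hfa(rng):
    """Simulate one team's five-year home-field advantage, in percentage points."""
    home_wins = sum(rng.random() < P_HOME for _ in range(HOME_GAMES))
    away_wins = sum(rng.random() < P_AWAY for _ in range(HOME_GAMES))
    return 100 * (home_wins - away_wins) / HOME_GAMES


rng = random.Random(2009)  # fixed seed so the run is repeatable
league = sorted(simulate_hfa(rng) for _ in range(30))

# Binomial standard deviation of (home pct - road pct) under the same assumptions
theoretical_sd = 100 * math.sqrt(
    (P_HOME * (1 - P_HOME) + P_AWAY * (1 - P_AWAY)) / HOME_GAMES
)

print(f"theoretical SD: {theoretical_sd:.2f} percentage points")
print(f"simulated range: {league[0]:.1f} to {league[-1]:.1f}")
```

Under these assumptions the standard deviation works out to about 3.5 percentage points, so a league of 30 identical teams would still be expected to show a range of several points on either side of eight percent.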
Although there is clearly a distribution of home-field advantages that vary from team to team, that is exactly what should happen if no team has a larger home-field advantage than another. If this theory holds true, any team should be expected to have an eight percent home-field advantage next year on average, regardless of what its home-field advantage was this year. It will not be exactly eight percent, but it would be just as likely to be above eight percent as below it.

The last five years made sense as a starting point for looking at home-field advantage, because team composition does not change as drastically over a five-year span as it does over a longer one. However, it is worth checking whether this holds true over a larger time period, to see if the smaller sample size is blurring an effect. I gathered the home-field advantage numbers for every team during 1998-2008 (the eleven-year time period in which there were 30 teams), and I attempted to discover whether there was any persistence to home-field advantage using that data; the correlation stayed low and insignificant, though it did rise to 0.102. The intraclass correlation only went up to 0.104 too, which is only weakly statistically insignificant, though slightly more noticeable. It seems pretty clear that if there is any persistence to home-field advantage, it must be a very small effect. Numerically, even if you see a team put up a home-field advantage of 18 percent one year, you probably would not even expect it to have a home-field advantage of nine percent the following year. As we will see below, that may even be too high as well. Although the correlations seem very low, I thought that it would be important to try some other angles to see if we can learn more about team-specific home-field advantage, if such a thing exists.
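The "18 percent this year, roughly nine percent next year" arithmetic is ordinary regression to the mean, and the significance of a correlation depends on how many season pairs go into it. The sketch below makes both calculations explicit under stated assumptions: the league mean is taken as eight percentage points, year-to-year variances are treated as equal, the pair counts (30 teams times 10 or 4 consecutive-season pairs) are a guess at how the correlations were formed, and the p-value uses a normal approximation to the t statistic rather than the exact t distribution.

```python
import math

LEAGUE_HFA = 8.0  # league-average home-field advantage, in percentage points


def expected_next_year(observed_hfa, r, league_mean=LEAGUE_HFA):
    """Regress an observed HFA toward the league mean using correlation r."""
    return league_mean + r * (observed_hfa - league_mean)


def correlation_p_value(r, n_pairs):
    """Approximate two-sided p-value for a correlation r from n_pairs observations.

    Uses the usual t statistic for a correlation, then a normal approximation,
    which is reasonable when n_pairs is in the hundreds.
    """
    t = r * math.sqrt((n_pairs - 2) / (1 - r * r))
    return math.erfc(abs(t) / math.sqrt(2))


# A team at 18 percent this year, regressed with the two reported correlations
print(round(expected_next_year(18.0, 0.102), 2))  # 9.02
print(round(expected_next_year(18.0, 0.05), 2))   # 8.5

# Assumed pair counts: 30 teams x 10 season pairs (1998-2008), 30 x 4 (2004-2008)
p_eleven_years = correlation_p_value(0.102, 300)
p_five_years = correlation_p_value(0.05, 120)
print(f"{p_eleven_years:.3f} {p_five_years:.3f}")
```

With these assumed pair counts, the eleven-year correlation lands between the 5 and 10 percent significance levels, while the five-year correlation is nowhere near either, which is in line with how the two results are described above.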
The thesis I am advancing here is that the variance we observe in home-field advantage is exactly what we would expect if every team had the same skill at creating a home-field advantage. Therefore, it makes sense to run a chi-squared test to see if the variance is in fact what we would expect. A chi-squared test allows us to compare the variance we would expect if every team had an equal home-field advantage against the variance actually observed for the 1998-2008 period that we are considering. To generate the expected variance, I found the winning percentage of each team over the last eleven years, and calculated the variance of home winning percentage minus away winning percentage that we would expect if each team's home winning percentage were about four percent above its overall winning percentage and its away winning percentage four percent below it. The expected variance would have been 0.0166 according to this estimate; the actual variance was 0.0195. The chi-squared statistic is therefore 34.1, which is statistically insignificant as well. So, we fail to reject the hypothesis that there is no team-to-team difference in home-field advantage. In other words, the variance was only slightly above what we would expect if there was no such thing as team-specific home-field advantage.

Despite this result, it seems pretty clear that each test shows a positive but statistically insignificant effect of team-specific home-field advantage, which means that perhaps it does exist on some level. Over the last eleven years, there is one team that has by far the largest home-field advantage of any team in the league: the Colorado Rockies. They have a winning percentage that is 15.4 percent higher at home than on the road.
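The reported statistic of 34.1 is consistent with the standard chi-squared test for a variance, in which the statistic is the degrees of freedom times the ratio of observed to expected variance. The sketch below assumes that form with 30 teams (29 degrees of freedom); the exact procedure used is not spelled out above, so treat this as one plausible reconstruction. The 5 percent critical value is taken from standard chi-squared tables rather than computed.

```python
def variance_chi_squared(observed_var, expected_var, n_teams):
    """Chi-squared statistic for testing an observed variance against an expected one."""
    df = n_teams - 1
    return df * observed_var / expected_var, df


# Variances reported for 1998-2008, all 30 teams
stat, df = variance_chi_squared(observed_var=0.0195, expected_var=0.0166, n_teams=30)

# Upper 5 percent critical value for 29 degrees of freedom (standard tables)
CRITICAL_05_DF29 = 42.557

print(round(stat, 1))            # 34.1
print(df)                        # 29
print(stat > CRITICAL_05_DF29)   # False: not significant at the 5 percent level
```

Because the statistic falls short of the critical value, the observed spread across teams is not distinguishable from what equal home-field advantages would produce.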
This becomes clearer when we look at the following chart, in which I plotted the expected number of teams that would have approximately X percent of a home-field advantage over eleven years if there were no such thing as team-specific home-field advantage, as well as the number of teams that actually have approximately X percent of home-field advantage for 1998-2008. It shows just how anomalous the Rockies are:
Notice that one would probably expect to see a team around two percent even though every team would expect to be at about eight percent on average. What you would not expect is for a team to be up at 15.4 percent, as the Rockies are. The explanation for why the Rockies are an anomaly has been given countless times; the overwhelming likelihood is that, because visiting teams have difficulty getting accustomed to playing at that altitude, the Rockies gain an advantage over their opponents. Whether this reflects their own strength in adjusting to their altitude or their own weakness in adjusting to normal conditions is not clear, but it does seem like the Rockies are a different breed altogether, and they seem to break the model.

Consider the year-to-year correlation for 1998-2008 without the Rockies: that clocks in at 0.065, which is nowhere near statistically significant. The intraclass correlation of 0.068 is even more insignificant than with the Rockies; it looks more and more like home-field advantage is not team-specific for the other 29 teams. Looking at it from another angle makes this even clearer. Excluding the Rockies, the expected variance of home-field advantage as described above would be 0.0160. Instead, it was 0.0138, even less than we would expect if the outcomes were random (though insignificantly so). The Rockies truly seem to be the anomaly.

What this means is that to analyze home-field advantage, we should keep in mind that every team except for the Rockies has pretty much the same home-field advantage. Claims that crowds in certain stadiums create a larger home-field advantage, or that certain kinds of teams are more prone to have one, are likely to be false, or more politely, below the threshold of statistical measurability. It is probably true that if you get a groundball pitcher or a power hitter in a small stadium, you are likely to increase your home-field advantage some, but these effects are probably extremely minor.
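The "less than we would expect" comparison without the Rockies can be put on the same scale as the earlier statistic of 34.1. This sketch again assumes the statistic is degrees of freedom times observed over expected variance, now with 29 teams; that form is an assumption, not something stated in the text.

```python
def variance_chi_squared(observed_var, expected_var, n_teams):
    """Chi-squared statistic for an observed variance versus an expected one."""
    df = n_teams - 1
    return df * observed_var / expected_var, df


# Variances reported for 1998-2008 with the Rockies excluded (29 teams)
stat_no_rockies, df_no_rockies = variance_chi_squared(
    observed_var=0.0138, expected_var=0.0160, n_teams=29
)

print(round(stat_no_rockies, 2))  # 24.15
print(df_no_rockies)              # 28
```

A chi-squared variable has a mean equal to its degrees of freedom, so a value of about 24 against 28 degrees of freedom sits slightly below what pure chance would typically give, matching the description above.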
It takes a long time for the difference between two similar percentages to show any kind of consistent trend, and that means that we should be wary of after-the-fact explanations of why certain teams have home-field advantages. There has been a tendency historically for domed home teams to do well in domed stadiums, for example, but most similar explanations will not hold water. Although this result is perhaps somewhat shocking, it simplifies some of our analysis of home-field advantage for the rest of this series. We can now look for trends without worrying that our data is contaminated with large team-specific effects.

As we have now delved into the question of what home-field advantage is and who does or does not have it, the next step is to consider where it might be the strongest. Specifically, what kinds of games exhibit the largest home-field advantages? We will look at divisional matchups and both intra- and interleague matchups, and try to see if we can learn anything more about home-field advantage using those. In doing so, we can rest assured that we do not need to worry about certain teams (other than the Rockies) tricking us into reaching inaccurate conclusions, since it does not appear that the other 29 teams in the majors exhibit or enjoy any special home-field advantage.
Matt Swartz is an author of Baseball Prospectus.

Tim Kniker (42100) Interesting article. The one thing that I wonder about (it doesn't seem to be exactly addressed) on HFA where I think one does need to look at the individual teams (something that was close, but not quite on Part 1). When comparing the same time, does the HFA come from the given Home Team scoring more runs or allowing fewer runs, or is it a perfect, RA/G is a little lower at home, and RS/G is a little higher. I just think it would be interesting to see what happened with each individual team. Aug 18, 2009 10:33 AM

kantsipr (1382) I'd also be interested in how the magnitude varies with time spent at home, i.e. is the home field advantage stronger in the 8th game in a row at home than in the first. Aug 18, 2009 11:00 AM

I was curious about that too, and I've been looking into some of these effects. That will probably come later in the series. Thanks. Aug 18, 2009 15:29 PM

I75Titans (27541) This issue has been discussed regarding the Rockies for quite some time. It would be most interesting to see some statistics on whether there is some sort of Coors Field hangover effect when they leave the altitude for the flatlands. Does the offense get affected similarly in the first game on a road trip as it does in the 6th? Aug 19, 2009 10:52 AM

Scott D. Simon (1384) I'm looking at the rankings of HFA for 2004-2008. Here are the top eight teams: Aug 18, 2009 11:06 AM

There is a small effect, but the 2004-2008 data are a small sample that you can't overreact to. Over 1998-2008, the Rays still have the 3rd highest HFA, but the Astros are only 11th, the Twins are only 12th, the Mariners are only 13th, the Jays are only 14th, the Brewers are only 16th, and the Diamondbacks are 17th.
There is a small documented effect of domes (the Astrodome certainly had terrible lighting), but the effect is not quite as pronounced as the 2004-2008 data made it seem, so I did not include it beyond the small mention at the end of the second to last paragraph. Aug 18, 2009 15:41 PM

Darsox64 (10662) Where are the p-values? You gave the chi-squared test, but it's much easier to understand what you mean if you gave them with the correlations. Simple correlations cannot (as you implied) provide grounds for statistical insignificance. They can suggest low practical significance, but it's not the same thing. Aug 18, 2009 12:24 PM

p-values for simple correlations: Aug 18, 2009 15:46 PM

evo34 (33584) This is interesting, but I hope you can break down the avg. HFA by game number in a series and/or by day of the week. Travel is presumed to represent a significant portion of HFA in every sport, but no one has tested it. Baseball teams have many road games where they have already at least spent one day in that city. So it is ripe for an analysis of how much travel actually impacts next-day performance in the sport. Aug 18, 2009 13:38 PM

molnar (170) Outstanding article, Matt. I need to read this like six or seven more times. Aug 18, 2009 14:08 PM

jayman4 (4850) Being a Padres fan, and not a fan of the extremity of Petco on depressing hitting, I tried to do an analysis on the park effects and HFA. After looking at the data I came up with the correlation between high HFA and park neutrality (the Rockies being the exception). The extreme pitching and hitter parks (again, Colorado was an outlier) dampened HFA, while the more neutral parks generally had higher HFA. Aug 18, 2009 16:33 PM

When I first started studying HFA, I tried doing analysis like this, but the fact that the correlation is so low means that it may just be noise. The test for statistical significance essentially tests whether there is a less than 5% chance that you would falsely conclude that there is an effect.
The thing is that 1 out of every 20 types of tests you could run is going to come up as a false positive if there is no team-specific HFA. Aug 18, 2009 20:08 PM

sbnirish77 (17711) "The test for statistical significance essentially tests whether there is a less than 5% chance that you would falsely conclude that there is an effect." Aug 20, 2009 08:09 AM

R.A.Wagman (32721) When I play baseball, as a right-handed batter, I have always felt that I can hit better when my team has the 3rd-base dugout, as I can watch the pitcher better from the on-deck circle. I wonder if similar lefty-righty splits might not be more pronounced. Essentially, do LHB hit better on the road and RHB better at home? Aug 18, 2009 20:35 PM

BobbyRoberto (907) R.A.Wagman, Aug 18, 2009 21:18 PM

dpowell (1025) Matt, can you regress HFA_2008 on avgHFA(1997-2007) and report the coeff and p-value? (or you choose the time period for the avg HFA that you think is most "relevant") Aug 18, 2009 22:25 PM

HFA_2008 = .115 - .032*avgHFA(1997-2007). Aug 19, 2009 06:50 AM

schlicht (17769) Matt Aug 19, 2009 07:39 AM

morillos (3878) Not to beat the horse (not sure if it's dead) from the comments on the first article, but the conclusions of this article (altitude-loving Rockies aside, and indeed all Denver teams: the Nuggets and Broncos have the best long-term HFAs in their sports, too) seem to reinforce the point that team construction etc etc is only explaining a part of the HFA, since variances from the expected value don't correlate year-to-year. Which, to my mind, emphasizes the possibility that a good deal of HFA is simply structural. The data from the comments in the last article suggest at least half of it, in fact. In other words, 52-48 just based on batting last, 54-46 based on that plus other stuff. Aug 19, 2009 10:48 AM

ni2016 (39173) Matt, Aug 20, 2009 10:49 AM

This could very well be true. Home-field advantage jumped from 6.6% on average for 1997-2002 to 8.9% for 2003-2009.
So home teams went from winning about 53.3% to about 54.4% of games. That is weakly statistically significant and could represent a trend. In the first article of this series, I showed that the ratio of triples to doubles for home teams was much higher as road teams struggled to field there. It would not surprise me if quirkier stadiums and teams learning to take advantage of them were improving. Whether this leads to an era of team-specific home-field advantage that persists would depend on if teams adjusted or if only a few teams improved. For 2004-2008, the correlation was only 0.05, so there has not been much of a demonstration of team-specific home-field advantage yet, but naturally that could change. Aug 20, 2009 15:46 PM

Luke (47423) Matt, Aug 20, 2009 10:50 AM

This is interesting and makes sense in theory, but there is no negative correlation (I just checked) between closeness to .500 and HFA. In fact, it's insignificantly positive. This would probably be more true for sports other than baseball where the best teams win a lot more than 60% of their games. Aug 20, 2009 15:49 PM

Pet peave: The Santa Philly fans booed was so drunk he could barely walk straight, or at all. Philly has no "anti-Santa" agenda.
Pet peeve: Poor spelling. I probably deserve to have my rating decreased on this post. :)
I'm an actuary, I'm not so good with these letter things.
You probably could have begged the excuse of thinking so much about baseball, that you got the spelling confused with "Peavy"....but that's kinda a stretch.
Mountainhawk, don't worry I'm from Philly and I know the story about the drunk Santa understudy; I meant it sarcastically :)