Last week, we began our look into home-field advantage by looking at what home teams actually do better than road teams. It has been well documented throughout baseball history that the home team wins about 54 percent of ballgames, and last week we determined that the home team was better at pretty much everything. They struck out less, walked more, hit more home runs, got more hits on balls in play, made fewer errors, converted more double-play opportunities, stretched more extra-base hits into triples, hit more line drives, and they recorded more complete-game shutouts. The home team was able to take an advantage in nearly every aspect of the game. This week, we will carry that discussion of what home-field advantage helps into who it actually helps the most.
On June 4, Cole Hamels shut out the Dodgers, who at that time boasted the best record in the National League. More impressively, this feat came in Dodger Stadium. As mentioned last week, even though road teams scrape out 46 percent of ballgames, they only put together 41 percent of all complete-game shutouts. What was special about Cole Hamels? Was he a hundred feet tall? Was it something about the Phillies? The first clue might have been that, at that point, the Phillies had a 20-6 record on the road, but only a 12-14 record at home for 2009. The average difference between home winning percentage and away winning percentage is eight percent; after Hamels’ shutout, the Phillies sat at -30 percent!
This did not seem to surprise the Philadelphia media all that much. The Phillies only had a regular-season home-field advantage of 4.9 percent in 2008, and in the four years before that they had home-field “advantages” of 6.2, -3.7, 4.9, and -2.5 percent. It seemed that the Phillies played much worse than other teams at home relative to how they played on the road. Phillies manager Charlie Manuel had an explanation for the team’s 2009 performance at the ready: the fanfare surrounding the World Series celebrations was distracting them. There had been ceremony after ceremony through the first several weeks of the season, and Manuel supposed that this was keeping his players from concentrating at home. Last year, Jimmy Rollins said that the Phillies fans had intimidated the home nine, and Rollins even went so far as to call them “frontrunners,” immediately giving every cable sports show the hottest topic in the world to run with for a few days. Would the Phillies fans boo Rollins when he came back to town? Were the Phillies fans and their anti-Santa agenda too much? Others suggested that the Phillies had a fly-ball pitching staff, and that they were thus more vulnerable to Citizens Bank Park’s homer-friendly dimensions.
What complicated this speculation was that the Phillies' two percent home-field advantage for 2004-2008 was contradicted by their 7-0 playoff record at home in their World Series run in 2008. They only went 4-3 on the road in the playoffs, meaning that their home-field advantage for the playoffs was 43 percent, more than five times the league-average home-field advantage, and over twenty times their advantage over the previous five seasons. Was there something different about the playoffs?
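As a quick check on that arithmetic, the gaps can be recomputed directly from the win-loss records quoted in the article. This is only a sketch: the helper name `hfa` is hypothetical and introduced for illustration, and the eight percent league average and 1.98 percent five-year Phillies figure are simply the numbers reported in this piece.

```python
def hfa(home_w, home_l, road_w, road_l):
    """Home winning percentage minus road winning percentage."""
    return home_w / (home_w + home_l) - road_w / (road_w + road_l)

# Records quoted in the article.
playoff_gap = hfa(7, 0, 4, 3)         # 2008 playoffs: 7-0 at home, 4-3 on the road
early_2009_gap = hfa(12, 14, 20, 6)   # 2009 through June 4: 12-14 at home, 20-6 on the road

print(f"2008 playoffs:        {playoff_gap:+.1%}")
print(f"2009 through June 4:  {early_2009_gap:+.1%}")
print(f"Playoff gap vs. league average (8%):       {playoff_gap / 0.08:.1f}x")
print(f"Playoff gap vs. Phillies 2004-2008 (1.98%): {playoff_gap / 0.0198:.1f}x")
```

A seven-game home sample is tiny, so a gap this large says more about sample size than about the playoffs.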
Alternatively, perhaps there was nothing special about either the Phillies in the regular season or the Phillies in the playoffs. Consider the following possibility: perhaps no team has a larger home-field advantage than any other team. That sounds impossible, right? Look at the home-field advantage over the last five years for every major league team:
Team          HFA%
Phillies      1.98
Orioles       3.32
Tigers        3.95
Angels        3.95
Padres        4.32
Cubs          4.56
Giants        4.84
Marlins       5.31
Mets          5.43
Indians       6.68
D'backs       6.91
Royals        7.11
Cardinals     7.55
Nationals     7.99
White Sox     8.50
Athletics     8.52
Braves        9.14
Reds          9.14
Dodgers       9.88
Rangers       9.88
Pirates      10.73
Yankees      10.86
Twins        11.48
Astros       11.50
Mariners     12.16
Blue Jays    13.47
Rockies      13.94
Red Sox      14.32
Brewers      15.70
Rays         17.66
The Phillies had the smallest home-field advantage in the major leagues over that time span. There was a huge difference between the Phillies and the Rays, who had a winning percentage that was nearly 18 percentage points higher at home than on the road. However, we would not expect that every team had a home-field advantage of exactly eight percent, even if no team had any special home-field advantage; some teams would have some luck at home, or some luck on the road, and the numbers would change. So I checked the correlation between home-field advantage one year and the next for 2004-2008; the correlation was only 0.05. That is not statistically significant, not even close.
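To see what a year-to-year correlation looks like when no team has any special home-field advantage, here is a minimal simulation sketch. Everything in it is an assumption made for illustration, not the actual 2004-2008 data: every team is a .500 club that wins 54 percent of its home games and 46 percent of its road games, seasons are 81 home and 81 road games, and the function names and random seed are hypothetical.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate_season_hfa(rng, p_home=0.54, p_road=0.46, games=81):
    """One season's home minus road winning percentage for one simulated team."""
    home_wins = sum(rng.random() < p_home for _ in range(games))
    road_wins = sum(rng.random() < p_road for _ in range(games))
    return (home_wins - road_wins) / games

rng = random.Random(2009)  # arbitrary seed, for repeatability only
n_teams, n_seasons = 30, 5
hfa_by_team = [[simulate_season_hfa(rng) for _ in range(n_seasons)]
               for _ in range(n_teams)]

# Pair each team-season with the same team's following season.
this_year, next_year = [], []
for seasons in hfa_by_team:
    for s in range(n_seasons - 1):
        this_year.append(seasons[s])
        next_year.append(seasons[s + 1])

r = pearson(this_year, next_year)
spread = max(sum(t) / n_seasons for t in hfa_by_team) - min(sum(t) / n_seasons for t in hfa_by_team)
print(f"pairs: {len(this_year)}, year-to-year r: {r:+.3f}")
print(f"gap between largest and smallest five-year HFA: {spread:.1%}")
```

Because every simulated team has the same true advantage, the correlation should land near zero while the five-year figures still spread out noticeably from team to team.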
Instead of only running a simple year-to-year correlation, I ran an AR(1) intraclass correlation (with some help from Eric Seidman among others). Intraclass correlation is very similar to year-to-year correlation, but gives some extra credit to the correlation if a team is especially good at home in 2006 and 2008, but not in 2007. It looks at each team in general, rather than two consecutive seasons. The intraclass correlation was also only 0.05, which is not statistically significant either.
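The AR(1) version is not reproduced here. As a simpler stand-in that captures the same "look at each team in general" idea, this sketch computes a plain one-way intraclass correlation from a table of team-by-season values. The function name `icc_oneway` and the toy numbers are hypothetical, chosen only to show the two extremes.

```python
def icc_oneway(rows):
    """One-way random-effects ICC for equal-sized groups.

    rows: one list per team, each holding that team's seasonal HFA values.
    """
    n = len(rows)          # number of teams
    k = len(rows[0])       # seasons per team
    grand = sum(sum(r) for r in rows) / (n * k)
    means = [sum(r) / k for r in rows]
    # Between-team and within-team mean squares.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for r, m in zip(rows, means) for x in r) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Toy data (made up): each team repeats the same HFA every season.
persistent = [[2, 2, 2], [8, 8, 8], [14, 14, 14]]
# Toy data (made up): all teams have the same mean; only the seasons differ.
no_team_effect = [[2, 8, 14], [2, 8, 14], [2, 8, 14]]

print(f"fully persistent: {icc_oneway(persistent):+.2f}")
print(f"no team effect:   {icc_oneway(no_team_effect):+.2f}")
```

Values near 1 mean a team's seasons resemble each other far more than they resemble other teams' seasons; values near or below zero mean knowing the team tells you nothing, which is where the 0.05 reported above sits.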
From this, it seems very unlikely that any team has a significantly different home-field advantage than any other team, at least when looking at the last five years. Although there is clearly a distribution of home-field advantages that vary from team to team, that is exactly what should happen if no team has a larger home-field advantage than another. If this theory holds true, any team should be expected to have an eight percent home-field advantage next year on average, regardless of what their home-field advantage was this year. It will not be exactly eight percent, but would be just as likely to be above eight percent as below it.
The last five years made sense as an initial starting point for looking at home-field advantage, because team composition does not change as drastically over a five-year span as it does over an even longer span. However, it is worth checking whether this holds true over a larger time period to see if maybe the smaller sample size is blurring an effect. I gathered the home-field advantage numbers for every team during 1998-2008 (the eleven-year time period in which there were 30 teams), and I attempted to discover whether there was any persistence to home-field advantage using that data; the correlation stayed low and insignificant, though it did rise to 0.102. The intraclass correlation only went up to 0.104 too, which is slightly more noticeable but still short of conventional statistical significance. It seems pretty clear that if there is any persistence to home-field advantage, it must be a very small effect. Numerically, even if you see a team put up a home-field advantage of 18 percent one year, you probably would not even expect them to have a home-field advantage of nine percent the following year. As we will see below, that may even be too high as well.
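One way to put numbers on that last point is to treat the correlation as a regression-to-the-mean factor: next year's expected advantage is the league average plus the correlation times this year's deviation from average. That reading is an assumption (it requires the spread of home-field advantage to be similar from year to year), and the function name `projected_hfa` is hypothetical; the correlations plugged in are the ones reported in this article.

```python
LEAGUE_HFA = 0.08  # league-average home-field advantage cited in the article

def projected_hfa(observed, r, league=LEAGUE_HFA):
    """Regress an observed HFA toward the league average using correlation r."""
    return league + r * (observed - league)

reported = [
    ("2004-2008 year-to-year", 0.05),
    ("1998-2008 year-to-year", 0.102),
    ("1998-2008 intraclass", 0.104),
]
for label, r in reported:
    print(f"{label:24s} r={r:.3f} -> projected HFA after an 18% season: "
          f"{projected_hfa(0.18, r):.2%}")
```

Under this assumption the projection for a team coming off an 18 percent season lands in the neighborhood of nine percent with the eleven-year correlations and closer to 8.5 percent with the five-year one, so almost all of the ten-point surplus is expected to vanish.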
Although the correlations seem very low, I thought that it would be important to try some other angles to see if we can learn more about team-specific home-field advantage, if such a thing exists. The thesis that I am advancing here is that the variance that we observe in home-field advantage is exactly what we would expect if every team had the same skill at creating a home-field advantage. Therefore, it makes sense to run a chi-squared test to see if the variance is in fact what we would expect. A chi-squared test allows us to compare the variance that we would expect if every team had an equal home-field advantage against the variance in home-field advantage that was actually observed for the 1998-2008 period that we are considering.
To generate the expected variance, I found the winning percentage of each team over the last eleven years, and calculated the variance we would expect in home winning percentage minus away winning percentage if every team's home winning percentage were about four percent above its overall winning percentage and its away winning percentage about four percent below it. The expected variance would have been 0.0166 according to this estimate; the actual variance was 0.0195. The chi-squared statistic is therefore 34.1, which is statistically insignificant as well. So, we fail to reject the hypothesis that there is no team-to-team difference in home-field advantage. In other words, the variance was only slightly above what we would expect if there was no such thing as team-specific home-field advantage.
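The article does not spell out how the statistic was formed. A minimal sketch, assuming the standard variance-ratio form (degrees of freedom times observed variance over expected variance, with 30 teams giving 29 degrees of freedom), reproduces the reported 34.1. The function name is hypothetical, and the critical value is the usual upper five percent point from a chi-squared table.

```python
def variance_ratio_chi2(observed_var, expected_var, n_teams):
    """Chi-squared statistic for testing an observed variance against an expected one."""
    df = n_teams - 1
    return df * observed_var / expected_var, df

# Upper 5% critical value for 29 degrees of freedom, from a standard chi-squared table.
CRITICAL_05_DF29 = 42.557

# Variances reported in the article for 1998-2008, all 30 teams.
stat, df = variance_ratio_chi2(observed_var=0.0195, expected_var=0.0166, n_teams=30)
print(f"chi-squared = {stat:.1f} on {df} df; reject at 5%? {stat > CRITICAL_05_DF29}")
# prints: chi-squared = 34.1 on 29 df; reject at 5%? False
```

The statistic would need to exceed roughly 42.6 before the extra variance could be called significant at the five percent level, so 34.1 falls well short.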
Despite this result, it seems pretty clear that each test shows a positive but statistically insignificant effect of team-specific home-field advantage, which means that perhaps it does exist on some level. Over the last eleven years, there is one team that has had by far the largest home-field advantage of any team in the league: the Colorado Rockies. They have a winning percentage that is 15.4 percent higher at home than on the road. This becomes even clearer when we look at the following chart, in which I plotted the expected number of teams that would have approximately X percent of a home-field advantage over eleven years if there were no such thing as team-specific home-field advantage, as well as the number of teams that actually had approximately X percent of home-field advantage for 1998-2008. The chart shows just how anomalous the Rockies are:
Notice that one would probably expect to see a team around two percent even though every team would expect to be at about eight percent on average. What you would not expect is for a team to be up at 15.4 percent, as the Rockies are. The explanation for why the Rockies are an anomaly has been given countless times; the overwhelming likelihood is that, because of how difficult it is to adjust to playing at that altitude, the Rockies gain an advantage over their opponents. Whether this is their own strength in adjusting to their altitude, or their own weakness in adjusting to normal conditions, is not clear, but it does seem like the Rockies are a different breed altogether, and they seem to break the model. Consider the year-to-year correlation for 1998-2008 without the Rockies: it clocks in at .065, which is nowhere near statistically significant. The intraclass correlation of .068 is even further from significance than it was with the Rockies; it looks more and more like home-field advantage is not team-specific for the other 29 teams.
Looking at it from another angle makes this even clearer. While excluding the Rockies, the expected variance of home-field advantage as described above would be 0.0160. Instead, it was 0.0138, which is even less than we would expect if the outcomes were random (though insignificantly so). The Rockies truly seem to be the anomaly.
What this means is that to analyze home-field advantage, we should keep in mind that every team except for the Rockies has pretty much the same home-field advantage. Claims that crowds in certain stadiums create a larger home-field advantage, or that certain kinds of teams are more prone to having one, are likely to be false, or more politely, below the threshold of statistical measurability. It is probably true that if you get a ground-ball pitcher or a power hitter in a small stadium, you are likely to increase your home-field advantage some, but these effects are probably extremely minor. It takes a long time for the difference between two similar percentages to show any kind of consistent trend, and that means that we should be wary of after-the-fact explanations of why certain teams have home-field advantages. There has been a tendency historically for domed home teams to do well in domed stadiums, for example, but most similar explanations will not hold water.
Although this result is perhaps somewhat shocking, it simplifies some of our analysis of home-field advantage for the rest of this series. We can now look for trends without worrying that our data is contaminated with large team-specific effects. As we have now delved into the question of what home-field advantage is and who does or does not have it, the next step is to consider where it might be the strongest. Specifically, what kinds of games exhibit the largest home-field advantages? We will look at divisional matchups and both intra- and interleague matchups, and try to see if we can learn anything more about home-field advantage using those. In doing so, we can rest assured that we do not need to worry about certain teams (other than the Rockies) tricking us into reaching inaccurate conclusions, since it does not appear that the other 29 teams in the majors exhibit or enjoy any special home-field advantage.
Anecdotally, the Rockies seem to click right back into their offensive powerhouse mode when returning to Coors Field, but I have not seen any data to back that up either.
Blue Jays 13.47
Red Sox 14.32
I believe this list includes every team that plays in a covered stadium except for the Diamondbacks (who have the 11th-smallest HFA).
Given the conclusion that team-specific HFA is positive but statistically insignificant, why do all these dome teams show up over the last four years?
On the other hand, it may be that the effects are only significant for recent years if there has been a change in how teams utilize their domed stadiums to their advantage.
Thanks for pointing that out in general, though, since the point of the article is not that there is NO team-specific HFA, but that the effects are so minor that they can essentially be ignored.
Also, instead of using fussy stochastic models, you could have just run an ordinary regression with a couple of lagging variables. The lags would get around the lack of... credit for existence of correlations between season n and season n+2, but not with season n+1 (and if you don't get statistical significance on the lagging variable, the concern is probably unwarranted).
p=.30 with Rockies, p=.37 without Rockies
p=.08 with Rockies, p=.27 without Rockies.
I talked to a couple people who said that they thought the ICC was the way to go. I did try to use a regression, but the coefficients were predictably insignificant.
My analysis was a lot less rigorous and data fit reasonably well, but curious if your research showed anything along those lines.
My hypothesis is that the bias affects the hitters or pitchers in bad ways. In Petco, the Padres hitters may start getting in their heads to hit line drives and tempted to change their approach at home vs. the road. My guess is that is bad. Same for pitchers in hitter friendly parks.
Your argument makes sense, but the thing is that you could make a seemingly logical argument for teams in pitchers' parks having better HFA, for teams in hitters' parks having better HFA, for teams in neutral parks having worse HFA...one of these tests may end up coming up statistically significant. I'd be curious to hear about your methodology and what kind of p-stat you got, but given the low correlation, it may be that there is not actually a true effect.
Alternatively couldn't you have applied a test of significance for the inverse hypothesis (there is no effect)?
I realize a 95% confidence level is routinely used for statistical significance, but aren't there a boatload of situations that fit in between the 5% significance of the hypothesis and the 5% significance of the counter hypothesis?
And don't some of those situations favor one side or another - just not at a 95% confidence level?
According to wikipedia, 18 MLB teams use the 1st-base dugout at home. It's weird. I thought the home team traditionally sat in the 3rd-base dugout, but it looks like that's not true these days.
p-value = .939
(-.892,+.827) is the confidence interval for the coefficient.
It's obviously going to be very tough to run a regression on 30 variables, since the coefficient will bounce around a lot to match outlier values. For instance:
HFA_2008 = .088 + .307*avgHFA(2004-2007)
p-value = .250
(-.229,+.843) is the confidence interval for the coefficient.
Nothing is really going to come up significant, because the standard error is going to be so large that even 1.00 might not be significant, but it seems like this adds to the idea that team-specific HFA is minuscule.
This data set seems more suited to an analysis of variance rather than correlation analysis. ANOVA will answer the question I think you are asking: Do some teams have statistically significantly larger or smaller HFA than average?
In the long run, if the effects actually come down to how individual players respond to home vs away conditions, then the actual makeup of individual teams will end up as a determining factor.
I'd still like to see that tested via a Monte Carlo simulation. Two identical teams play a million games. How many does the home team win?
This is not to deny that other things affect HFA. I just want to know the magnitude of their effect against the structural baseline.
Great series, in any case.
While I agree with the broad brush principle that no one team (save, perhaps, the Rockies) will particularly skew a long-term sample, depending on the analysis, more recent years & teams may be skewed by, for example, the Red Sox in Theo Epstein's tenure.
As teams have used advanced statistics & become more aware of park effects and more able to parse out the drivers of those effects, they have been able to more accurately tailor their team for their particular stadium.
For example, from 2003 through YTD 2009 (i.e. Epstein's entire tenure as the Sox GM), the Red Sox have only once been below the 8% threshold, with Home vs. Away %'s of 20% / 21% / 7% / 12% / 16% / 21% / 14% from most to least recent.
To me, this raises both the question of potential skews, but more so, the question of whether - as statistics become more descriptive, accurate, and available to unearth the drivers & levers - the Home vs. Away gap has grown over time. I.e., if, in fact, the Sox' recent run of home outperformance is indicative of having learned more about how to use Fenway than before, are other teams following suit or doing this in parallel? (sorry if you dealt with this in the 1st part - didn't get a chance to see that)
Regardless, interesting stuff. Thanks!
I haven't run any numbers but it seems to me that there should be some effect on a team's HFA% based on their overall record.
By that I mean, if a team wins 100 games in a season they need to be winning a lot of their home games and away games. The opposite being true for teams that are losing a lot of games each year.
Would that mean that the distribution of HFA% against overall win percentage would be higher for teams whose records are around .500?
Further, do teams with better records have better HFA%?
Any correlation at all?