Last week, we began our look into home-field advantage by looking at what home teams actually do better than road teams. It has been well documented throughout baseball history that the home team wins about 54 percent of ballgames, and last week we determined that the home team was better at pretty much everything. They struck out less, walked more, hit more home runs, got more hits on balls in play, made fewer errors, converted more double-play opportunities, stretched more extra-base hits into triples, hit more line drives, and recorded more complete-game shutouts. The home team held an advantage in nearly every aspect of the game. This week, we will carry that discussion of what home-field advantage helps into who it actually helps the most.

On June 4, Cole Hamels shut out the Dodgers, who at that time boasted the best record in the National League. More impressively, this feat came in Dodger Stadium. As mentioned last week, even though road teams scrape out 46 percent of ballgames, they only put together 41 percent of all complete-game shutouts. What was special about Cole Hamels? Was he a hundred feet tall? Was it something about the Phillies? The first clue might have been that, at that point, the Phillies had a 20-6 record on the road, but only a 12-14 record at home for 2009. The average difference between home winning percentage and away winning percentage is eight percent; after Hamels’ shutout, the Phillies sat at -30 percent!

This did not seem to surprise the Philadelphia media all that much. The Phillies only had a regular-season home-field advantage of 4.9 percent in 2008, and in the four years before that they had home-field “advantages” of 6.2, -3.7, 4.9, and -2.5 percent. It seemed that the Phillies played much worse than other teams at home relative to how they played on the road. Phillies manager Charlie Manuel had an explanation for the team’s 2009 performance at the ready: the fanfare surrounding the World Series celebrations was distracting them. There had been ceremony after ceremony through the first several weeks of the season, and Manuel supposed that this was keeping his players from concentrating at home. Last year, Jimmy Rollins said that the Phillies fans had intimidated the home nine, and Rollins even went so far as to call them “frontrunners,” immediately giving every cable sports show the hottest topic in the world to run with for a few days. Would the Phillies fans boo Rollins when he came back to town? Were the Phillies fans and their anti-Santa agenda too much? Others suggested that the Phillies were a fly-ball pitching staff, and that they were thus more vulnerable to Citizens Bank Park’s homer-friendly dimensions.

What complicated this speculation was that the Phillies’ two percent home-field advantage for 2004-2008 was contradicted by their 7-0 playoff record at home in their World Series run in 2008. They only went 4-3 on the road in the playoffs, meaning that their home-field advantage for the playoffs was 43 percent, more than five times the league average home-field advantage, and over twenty times their advantage over the previous five seasons. Was there something different about the playoffs?
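The home-field advantage figures quoted so far are simply home winning percentage minus road winning percentage, expressed in percentage points. A minimal sketch of that arithmetic, using only the win-loss records quoted above (the function names are illustrative, not from any particular library):

```python
def win_pct(wins, losses):
    """Winning percentage as a fraction between 0 and 1."""
    return wins / (wins + losses)


def home_field_advantage(home_record, road_record):
    """Home winning percentage minus road winning percentage, in percentage points."""
    return 100 * (win_pct(*home_record) - win_pct(*road_record))


# Records quoted in the article, given as (wins, losses)
phillies_2009_after_shutout = home_field_advantage((12, 14), (20, 6))
phillies_2008_playoffs = home_field_advantage((7, 0), (4, 3))

print(f"2009 through June 4: {phillies_2009_after_shutout:.1f}")
print(f"2008 playoffs:       {phillies_2008_playoffs:.1f}")
```

Both results round to the figures in the text: roughly -30 percent for the early 2009 season and roughly 43 percent for the 2008 playoffs.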

Alternatively, perhaps there was nothing special about either the Phillies in the regular season or the Phillies in the playoffs. Consider the following possibility: perhaps no team has a larger home-field advantage than any other team. That sounds impossible, right? Look at the home-field advantage over the last five years for every major league team:

Team         HFA%
Phillies    1.98
Orioles     3.32
Tigers      3.95
Angels      3.95
Padres      4.32
Cubs        4.56
Giants      4.84
Marlins     5.31
Mets        5.43
Indians     6.68
D'backs     6.91
Royals      7.11
Cardinals   7.55
Nationals   7.99
White Sox   8.50
Athletics   8.52
Braves      9.14
Reds        9.14
Dodgers     9.88
Rangers     9.88
Pirates    10.73
Yankees    10.86
Twins      11.48
Astros     11.50
Mariners   12.16
Blue Jays  13.47
Rockies    13.94
Red Sox    14.32
Brewers    15.70
Rays       17.66

The Phillies had the smallest home-field advantage in the major leagues over that time span. There was a huge difference between the Phillies and the Rays, who had a winning percentage that was nearly 18 percentage points higher at home than on the road. However, we would not expect that every team had a home-field advantage of exactly eight percent, even if no team had any special home-field advantage; some teams would have some luck at home, or some luck on the road, and the numbers would change. So I checked the correlation between home-field advantage one year and the next for 2004-2008; the correlation was only 0.05. That is not statistically significant, not even close.
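For readers who want to try a year-to-year check of this kind, here is a minimal sketch. It assumes the correlation is a plain Pearson correlation over pooled (this season, next season) pairs for every team, which may differ in detail from the calculation used for the article. The three teams and their values are made up purely to show the mechanics and are not real data.

```python
from math import sqrt


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


def year_to_year_pairs(hfa_by_team):
    """Pair each team's HFA in one season with the same team's HFA in the next."""
    this_year, next_year = [], []
    for seasons in hfa_by_team.values():
        for current, following in zip(seasons, seasons[1:]):
            this_year.append(current)
            next_year.append(following)
    return this_year, next_year


# Hypothetical HFA values in percentage points, five seasons per team (NOT real data)
hfa_by_team = {
    "Team A": [10.0, 2.0, 12.0, 5.0, 9.0],
    "Team B": [4.0, 11.0, 6.0, 13.0, 3.0],
    "Team C": [8.0, 7.0, 15.0, 1.0, 10.0],
}

x, y = year_to_year_pairs(hfa_by_team)
r = pearson(x, y)
print(f"{len(x)} season pairs, r = {r:.3f}")
```

With five seasons per team, each team contributes four consecutive-season pairs, so 30 teams would give 120 pairs.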

Instead of only running a simple year-to-year correlation, I ran an AR(1) intraclass correlation (with some help from Eric Seidman among others). Intraclass correlation is very similar to year-to-year correlation, but gives some extra credit to the correlation if a team is especially good at home in 2006 and 2008, but not in 2007. It looks at each team in general, rather than two consecutive seasons. The intraclass correlation was also only 0.05, which is not statistically significant either.

From this, it seems very unlikely that any team has a significantly different home-field advantage than any other team, at least when looking at the last five years. Although there is clearly a distribution of home-field advantages that vary from team to team, that is exactly what should happen if no team has a larger home-field advantage than another. If this theory holds true, any team should be expected to have an eight percent home-field advantage next year on average, regardless of what their home-field advantage was this year. It will not be exactly eight percent, but would be just as likely to be above eight percent as below it.
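A quick simulation illustrates how much spread luck alone can produce. The sketch below assumes 30 identical .500 teams that each win 54 percent of their home games and 46 percent of their road games, with 81 games at each site per season for five seasons. Those settings are simplifying assumptions (real teams differ in quality and play one another), so treat this as an illustration rather than a reproduction of the analysis.

```python
import random


def simulate_hfa(rng, home_win_prob=0.54, road_win_prob=0.46,
                 games_per_site=81, seasons=5):
    """Simulate one team's multi-season HFA in percentage points."""
    total = games_per_site * seasons
    home_wins = sum(rng.random() < home_win_prob for _ in range(total))
    road_wins = sum(rng.random() < road_win_prob for _ in range(total))
    return 100 * (home_wins - road_wins) / total


rng = random.Random(2009)  # fixed seed so the run is repeatable
simulated = sorted(simulate_hfa(rng) for _ in range(30))

mean_hfa = sum(simulated) / len(simulated)
print(f"lowest {simulated[0]:.1f}, highest {simulated[-1]:.1f}, mean {mean_hfa:.1f}")
```

Every simulated team has exactly the same true home-field advantage of eight percentage points, so any gap between the lowest and highest values in the output is pure chance.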

The last five years made sense as an initial starting point for looking at home-field advantage, because team composition does not change as drastically over a five-year span as it does over an even longer span. However, it is worth checking whether this holds true over a larger time period to see if maybe the smaller sample size is blurring an effect. I gathered the home-field advantage numbers for every team during 1998-2008 (the eleven-year time period in which there were 30 teams), and I attempted to discover whether there was any persistence to home-field advantage using that data; the correlation stayed low and insignificant, though it did rise to 0.102. The intraclass correlation only went up to 0.104, which is slightly more noticeable but still falls short of conventional statistical significance. It seems pretty clear that if there is any persistence to home-field advantage, it must be a very small effect. Numerically, even if you see a team put up a home-field advantage of 18 percent one year, you probably would not even expect them to have a home-field advantage of nine percent the following year. As we will see below, that may even be too high as well.
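One way to see where a figure of about nine percent can come from is simple regression to the mean. The sketch below assumes a league-average home-field advantage of eight percent, equal spread from one year to the next, and that the year-to-year correlation can be used directly as the regression slope. These are illustrative assumptions, not necessarily the exact reasoning behind the number in the text.

```python
def expected_next_hfa(observed, league_mean=8.0, correlation=0.102):
    """Regress an observed HFA toward the league mean by the year-to-year correlation."""
    return league_mean + correlation * (observed - league_mean)


# Correlations quoted in the article for 1998-2008
for r in (0.102, 0.104):
    projected = expected_next_hfa(18.0, correlation=r)
    print(f"r = {r}: an 18 percent season projects to about {projected:.2f} percent")
```

Under these assumptions, only about a tenth of the ten-point gap between 18 percent and the league average carries over to the following season.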

Although the correlations seem very low, I thought that it would be important to try some other angles to see if we can learn more about team-specific home-field advantage, if such a thing exists. The thesis that I am generating here is that the variance that we observe in home-field advantage is exactly what we would expect if every team had the same skill at creating a home-field advantage. Therefore, it makes sense to run a chi-squared test to see if the variance is in fact what we would expect. A chi-squared test allows us to compare the variance that we would expect if every team had an equal home-field advantage against the variance in home-field advantage that was actually observed for the 1998-2008 period that we are considering.

To generate the expected variance, I found the winning percentage of each team over the last eleven years, and calculated the variance of their expected home winning percentage minus away winning percentage, under the assumption that their home winning percentage was about four percent above their overall winning percentage, and their away winning percentage was four percent below it. The expected variance would have been 0.0166 according to this estimate; the actual variance was 0.0195. The chi-squared statistic is therefore 34.1, which is statistically insignificant as well. So, we fail to reject the hypothesis that there is no team-to-team difference in home-field advantage. In other words, the variance was only slightly above what we would expect if there was no such thing as team-specific home-field advantage.
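The quoted statistic is consistent with the standard chi-squared test for a variance, in which the statistic is the degrees of freedom times the ratio of observed to expected variance. The sketch below assumes that form with 30 teams (29 degrees of freedom). The 5 percent critical value is taken from a standard chi-squared table and is not a number from the article.

```python
def variance_chi_squared(observed_variance, expected_variance, n_teams):
    """Chi-squared statistic for testing an observed variance against an expected one."""
    degrees_of_freedom = n_teams - 1
    statistic = degrees_of_freedom * observed_variance / expected_variance
    return statistic, degrees_of_freedom


# Variances quoted in the article for 1998-2008, all 30 teams
stat, df = variance_chi_squared(0.0195, 0.0166, n_teams=30)

# Upper 5 percent critical value for 29 degrees of freedom (standard table value)
CRITICAL_5PCT_DF29 = 42.557

print(f"chi-squared = {stat:.1f} on {df} degrees of freedom")
print("significant at 5 percent" if stat > CRITICAL_5PCT_DF29
      else "not significant at 5 percent")
```

The statistic rounds to the 34.1 reported above and falls short of the critical value, which matches the description of the result as statistically insignificant.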

Despite this result, each test shows a positive but statistically insignificant effect of team-specific home-field advantage, which means that perhaps it does exist on some level. Over the last eleven years, one team has had by far the largest home-field advantage of any team in the league: the Colorado Rockies. They have a winning percentage that is 15.4 percent higher at home than on the road. Just how anomalous the Rockies are becomes even clearer when we look at the following chart, in which I plotted the expected number of teams that would have approximately X percent of a home-field advantage over eleven years if there were no such thing as team-specific home-field advantage, as well as the number of teams that actually had approximately X percent of home-field advantage for 1998-2008:

[Chart: expected vs. observed number of teams at each level of home-field advantage, 1998-2008]

Notice that one would probably expect to see a team around two percent even though every team would expect to be at about eight percent on average. What you would not expect is for a team to be up at 15.4 percent, as the Rockies are. The explanation for why the Rockies are an anomaly has been given countless times; the overwhelming likelihood is that, because of the difficulty of getting accustomed to playing at that altitude, the Rockies gain an advantage over their opponents. Whether this reflects their own strength in adjusting to their altitude or their own weakness in adjusting to normal conditions is not clear, but it does seem like the Rockies are a different breed altogether, and they seem to break the model. Consider the year-to-year correlation for 1998-2008 without the Rockies: it clocks in at 0.065, which is nowhere near statistically significant. The intraclass correlation of 0.068 is even further from significance than it was with the Rockies included; it looks more and more like home-field advantage is not team-specific for the other 29 teams.

Looking at it from another angle makes this even clearer. While excluding the Rockies, the expected variance of home-field advantage as described above would be 0.0160. Instead, it was 0.0138, even less than we would expect if the outcomes were random (though insignificantly so). The Rockies truly seem to be the anomaly.

What this means is that to analyze home-field advantage, we should keep in mind that every team except for the Rockies has pretty much the same home-field advantage. Claims that the crowds in certain stadiums, or certain kinds of teams, are more prone to produce a home-field advantage are likely to be false, or more politely, below the threshold of statistical measurability. It is probably true that if you get a ground-ball pitcher or a power hitter in a small stadium, you are likely to increase your home-field advantage some, but these effects are probably extremely minor. It takes a long time for the difference between two similar percentages to show any kind of consistent trend, and that means that we should be wary of after-the-fact explanations of why certain teams have home-field advantages. There has been a tendency historically for teams that play their home games in domes to do well at home, for example, but most similar explanations will not hold water.

Although this result is perhaps somewhat shocking, it simplifies some of our analysis of home-field advantage for the rest of this series. We can now look for trends without worrying that our data is contaminated with large team-specific effects. As we have now delved into the question of what home-field advantage is and who does or does not have it, the next step is to consider where it might be the strongest. Specifically, what kinds of games exhibit the largest home-field advantages? We will look at divisional matchups and both intra- and interleague matchups, and try to see if we can learn anything more about home-field advantage using those. In doing so, we can rest assured that we do not need to worry about certain teams (other than the Rockies) tricking us into reaching inaccurate conclusions, since it does not appear that the other 29 teams in the majors exhibit or enjoy any special home-field advantage.

Pet peave: The Santa Philly fans booed was so drunk he could barely walk straight, or at all. Philly has no "anti-Santa" agenda.
Pet peeve: Poor spelling. I probably deserve to have my rating decreased on this post. :)
I'm an actuary, I'm not so good with these letter things.
You probably could have begged the excuse of thinking so much about baseball, that you got the spelling confused with "Peavy"....but that's kinda a stretch.
Mountainhawk, don't worry-- I'm from Philly and I know the story about the drunk Santa understudy; I meant it sarcastically :)
Interesting article. The one thing that I wonder about (it doesn't seem to be exactly addressed) on HFA where I think one does need to look at the individual teams (something that was close, but not quite on Part 1). When comparing the same time, does the HFA come from the given Home Team scoring more runs or allowing fewer runs, or is it a perfect, RA/G is a little lower at home, and RS/G is a little higher. I just think it would be interesting to see what happened with each individual team.
That's an interesting idea. I might add that in later in the series. Thanks, Tim :)
I'd also be interested in how the magnitude varies with time spent at home, i.e. is the home field advantage stronger in the 8th game in a row at home than in the first.
I was curious about that too, and I've been looking into some of these effects. That will probably come later in the series. Thanks.
This issue has been discussed regarding the Rockies for quite some time. It would be most interesting to see some statistics on whether there is some sort of Coors Field hang-over effect when they leave the altitude for the flatlands. Does the offense get affected similarly in the first game on a road trip as it does in the 6th?

Anecdotally, the Rockies seem to click right back into their offensive powerhouse mode when returning to Coors field, but I have not seen any data to back that up either.
I'm looking at the rankings of HFA for 2004-2008. Here are the top eight teams:

Twins 11.48
Astros 11.50
Mariners 12.16
Blue Jays 13.47
Rockies 13.94
Red Sox 14.32
Brewers 15.70
Rays 17.66

I believe this list includes every team that plays in a covered stadium except for the Diamondbacks (who have the 11th-smallest HFA).

Given the conclusion that team-specific HFA is positive but statistically insignificant, why do all these dome teams show up over the last four years?
There is a small effect, but the 2004-2008 data are a small sample that you can't overreact to. Over 1998-2008, the Rays still have the 3rd highest HFA, but the Astros are only 11th, the Twins are only 12th, the Mariners are only 13th, the Jays are only 14th, the Brewers are only 16th, and the Diamondbacks are 17th. There is a small documented effect of domes-- the Astrodome certainly had terrible lighting, but the effect is not quite as pronounced as the 2004-2008 data seemed, so I did not include it beyond the small mention at the end of the second to last paragraph.

On the other hand, it may be that the effects are only significant for recent years if there has been a change in how teams utilize their domed stadiums to their advantage.

Thanks for pointing that out in general, though, since the point of the article is not that there is NO team-specific HFA, but that the effects are so minor that they can essentially be ignored.
Where are the p-values? You gave the chi-squared test, but it's much easier to understand what you mean if you gave them with the correlations. Simple correlations cannot (as you implied) provide grounds for statistical insignificance. They can suggest low practical significance, but it's not the same thing.

Also, instead of using fussy stochastic models, you could have just run an ordinary regression with a couple of lagging variables. The lags would get around the lack of... credit for existence of correlations between season n and season n+2, but not with season n+1 (and if you don't get statistical significance on the lagging variable, the concern is probably unwarranted).
p-values for simple correlations:
p=.30 with Rockies, p=.37 without Rockies
p=.08 with Rockies, p=.27 without Rockies.

I talked to a couple people who said that they thought the ICC was the way to go. I did try to use a regression, but the coefficients were predictably insignificant.
This is interesting, but I hope you can break down the avg. HFA by game number in a series and/or by day of the week. Travel is presumed to represent a significant portion of HFA in every sport, but no one has tested it. Baseball teams have many road games where they have already at least spent one day in that city. So it is ripe for an analysis of how much travel actually impacts next-day performance in the sport.
I definitely will write this up-- should be two weeks from now. I have the results but I thought it was important to focus on the division/league/interleague stuff first. I will certainly get to this, though.
Outstanding article, Matt. I need to read this like six or seven more times.
Being a Padres fan, and not a fan of how extremely Petco depresses hitting, I tried to do an analysis on the park effects and HFA. After looking at the data I came up with a correlation between high HFA and park neutrality (the Rockies being the exception). The extreme pitcher and hitter parks (again, Colorado was an outlier) dampened HFA, while the more neutral parks generally had higher HFA.

My analysis was a lot less rigorous and data fit reasonably well, but curious if your research showed anything along those lines.

My hypothesis is that the bias affects the hitters or pitchers in bad ways. In Petco, the Padres hitters may start getting in their heads to hit line drives and tempted to change their approach at home vs. the road. My guess is that is bad. Same for pitchers in hitter friendly parks.
When I first started studying HFA, I tried doing analysis like this, but the fact that the correlation is so low means that it may just be noise. The test for statistical significance essentially tests whether there is a less than 5% chance that you would falsely conclude that there is an effect. The thing is that 1 out of every 20 types of tests you could run is going to come up as a false positive if there is no team specific HFA.

Your argument makes sense, but the thing is that you could make a seemingly logical argument for teams in pitchers' parks having better HFA, for teams in hitters' parks having better HFA, for teams in neutral parks having worse HFA, and so on, and one of these tests may end up coming up statistically significant. I'd be curious to hear about your methodology and what kind of p-value you got, but given the low correlation, it may be that there is not actually a true effect.
"The test for statistical significance essentially tests whether there is a less than 5% chance that you would falsely conclude that there is an effect."

Alternatively couldn't you have applied a test of significance for the inverse hypothesis (there is no effect)?

I realize a 95% confidence level is routinely used for statistical significance, but aren't there a boatload of situations that fit in between the 5% significance of the hypothesis and the 5% significance of the counter hypothesis?

And don't some of those situations favor one side or another - just not at a 95% confidence level?
When I play baseball, as a righthanded batter, I have always felt that I can hit better when my team has the 3rd-base dugout, as I can watch the pitcher better from the on-deck circle - I wonder if similar lefty-righty splits might not be more pronounced. Essentially, do LHB hit better on the road and RHB better at home?
According to wikipedia, 18 MLB teams use the 1st-base dugout at home. It's weird. I thought the home team traditionally sat in the 3rd-base dugout, but it looks like that's not true these days.

That is weird. I still would wonder if RHB would have an edge when in the 3B dugout and lefties an edge in the 1B dugout.
Matt, can you regress HFA_2008 on avgHFA(1997-2007) and report the coeff and p-value? (or you choose the time period for the avg HFA that you think is most "relevant")
HFA_2008 = .115 - .032*avgHFA(1997-2007).

p-value = .939

(-.892,+.827) is the confidence interval for the coefficient.

It's obviously going to be very tough to run a regression on 30 data points, since the coefficient will bounce around a lot to match outlier values. For instance:

HFA_2008 = .088 + .307*avgHFA(2004-2007)

p-value = .250

(-.229,+.843) is the confidence interval for the coefficient.

Nothing is really going to come up significant, because the standard error is going to be so large that even 1.00 might not be significant, but it seems like this adds to the idea that team-specific HFA is minuscule.

This data set seems more suited to an analysis of variance rather than correlation analysis. ANOVA will answer the question I think you are asking: Do some teams have statistically significantly larger or smaller HFA than average?

In the long run, if the effects actually come down to how individual players respond to home vs away conditions, then the actual makeup of individual teams will end up as a determining factor.

carl s
Not to beat the horse (not sure if it's dead) from the comments on the first article, but the conclusions of this article (altitude-loving Rockies aside, and indeed all Denver teams -- the Nuggets and Broncos have the best long-term HFAs in their sports, too) seem to reinforce the point that team construction etc etc is only explaining a part of the HFA, since variances from the expected value don't correlate year-to-year. Which, to my mind, emphasizes the possibility that a good deal of HFA is simply structural. The data from the comments in the last article suggest at least half of it, in fact. In other words, 52-48 just based on batting last, 54-46 based on that plus other stuff.

I'd still like to see that tested via a Monte Carlo simulation. Two identical teams play a million games. How many does the home team win?

This is not to deny that other things affect HFA. I just want to know the magnitude of their effect against the structural baseline.

Great series, in any case.

While I agree with the broad brush principle that no one team (save, perhaps, the Rockies) will particularly skew a long-term sample, depending on the analysis, more recent years & teams may be skewed by, for example, the Red Sox in Theo Epstein's tenure.

As teams have used advanced statistics & become more aware of park effects and more able to parse out the drivers of those effects, they have been able to more accurately tailor their team for their particular stadium.

For example, from 2003 through YTD 2009 (i.e. Epstein's entire tenure as the Sox GM), the Red Sox have only once been below the 8% threshold, with Home vs. Away %'s of 20% / 21% / 7% / 12% / 16% / 21% / 14% from most to least recent.

To me, this raises both the question of potential skews, but more so, the question of whether - as statistics become more descriptive, accurate, and available to unearth the drivers & levers - the Home vs. Away gap has grown over time. I.e., if, in fact, the Sox' recent run of home outperformance is indicative of having learned more about how to use Fenway than before, are other teams following suit or doing this in parallel? (sorry if you dealt with this in the 1st part - didn't get a chance to see that)

Regardless, interesting stuff. Thanks!
This could very well be true. Home field advantage jumped from 6.6% on average in 1997-2002 to 8.9% in 2003-2009. So home teams went from winning about 53.3% to about 54.4% of games. That is weakly statistically significant and could represent a trend. In the first article of this series, I showed that the ratio of triples to doubles for home teams was much higher as road teams struggled to field there. It would not surprise me if quirkier stadiums, and teams learning to take advantage of them, were increasing home field advantage. Whether this leads to an era of team-specific home field advantage that persists would depend on if teams adjusted or if only a few teams improved. For 2004-2008, the correlation was only 0.05, so there has not been much of a demonstration of team-specific home field advantage yet, but naturally that could change.

I haven't run any numbers but it seems to me that there should be some effect on a team's HFA% based on their overall record.

By that I mean, if a team wins 100 games in a season they need to be winning a lot of their home games and away games. The opposite being true for teams that are losing a lot of games each year.

Would that mean that the distribution of HFA% against overall win percentage would be higher for teams whose records are around .500?

Further, do teams with better records have better HFA%?

Any correlation at all?
This is interesting and makes sense in theory, but there is no negative correlation (I just checked) between closeness to .500 and HFA. In fact, it's insignificantly positive. This would probably be more true for sports other than baseball where the best teams win a lot more than 60% of their games.