
May 24, 2009
Prospectus Idol Entry
Baseball Prospectus Basics: Park Factors

The new Yankee Stadium has received a lot of press this spring for the large number of homeruns hit there so far. On April 21, 2009, Buster Olney wrote at ESPN (http://sports.espn.go.com/mlb/news/story?id=4080195):

"The New York Yankees might have a serious problem on their hands: Beautiful new Yankee Stadium appears to be a veritable wind tunnel that is rocketing balls over the fences...including 17 in the first three games in the Yankees' first home series against the Indians. That's an average of five home runs per game and, at this pace, there would be about 400 homers hit in the park this year, or an increase of about 250 percent. In the last year of old Yankee Stadium, in 2008, there were a total of 160 homers."

The first mistake in Olney's analysis is to take the homerun rate of five games and extrapolate it over a full season. The second is to compare that figure with how many were hit in the old Yankee Stadium last year, without considering that there might be different players on the field.

The accepted method of measuring park factors, for any statistic, is to compare the home totals of both the batters and pitchers to those compiled on the road, where playing in fifteen or more different parks minimizes the effect of any one park. The factors then allow us to estimate how these players would perform in a neutral environment.

As of this writing on May 20, the Yankees have played 19 games at home, which have seen a total of 71 homeruns: 37 by Yankees hitters and 34 by their opposition. They've played 21 games on the road, with 49 homeruns: 27 by Yankees hitters and 22 by their opposition. 71 homers at Yankee Stadium divided by 49 in the Yankees' road games gives a factor of 1.45, indicating that the new Yankee Stadium inflates homerun rates by 45%.
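The raw home/road comparison above is simple enough to sketch in a few lines of Python. This is only an illustration using the totals quoted in the text; the function name `count_factor` is a hypothetical label, not part of any established tool.

```python
def count_factor(home_events, road_events):
    """Ratio of events in a team's home games to events in its road games."""
    if road_events <= 0:
        raise ValueError("road_events must be positive")
    return home_events / road_events


# Totals through May 20, 2009, as quoted in the text.
home_hr = 37 + 34  # Yankees hitters + opponents, 19 home games
road_hr = 27 + 22  # Yankees hitters + opponents, 21 road games

hr_factor = count_factor(home_hr, road_hr)
print(f"{hr_factor:.2f}")  # 1.45
```

Counting both the home team's hitters and its opponents is what keeps the ratio from simply measuring how good the Yankees' lineup or pitching staff is.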
The Yankees have played two more games on the road than at home, so let's instead find the ratio of the home HR% (HR/(AB-SO)) of .064 to their road rate of .043, which is 1.48, slightly higher but basically the same.

Is 20 games, a quarter of a season, enough of a sample size to get a reliable factor? After two exhibitions and three regular season games, Olney calculates an increase of 250%. After 19 regular season home games, I calculate an increase of 45%. What is it likely to be by the end of the season?

From 1985 to 1991, a period of seven seasons, there were no changes in the National League in either ballparks or schedule. I ran a series of one year, two year and three year factors to find out how much each varied from the seven year 'true' value at each park. The chart below shows the standard deviation of the results for each category at each sample size. After one year all categories are fairly close to two decimal place accuracy, except homeruns, which take three years, and triples, which take even longer. If Yankee Stadium still has a homerun factor of 1.45 at the end of the year, with a SD of .149, that means there's about a 70% chance the 'true' value is between 1.30 and 1.60, and a 95% chance of it being between 1.15 and 1.75. After 19 games it is still possible that Yankee Stadium could turn out to be an average park.

        SDT   XBH    SI    DO    TR    HR    BB    SO
1 Yr   .039  .083  .044  .091  .292  .149  .069  .044
2 Yr   .023  .057  .025  .060  .207  .085  .054  .030
3 Yr   .018  .046  .020  .045  .161  .060  .041  .023

A stadium having a factor of 1.45 tells us that plays in that park will be increased by 45% over normal rates. We can use this number to normalize the performance of batters and pitchers to what they would have done in a 'neutral' park. Each team is scheduled to play half their games at home, the other half at the various road parks.
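The ranges quoted for a one year homerun factor of 1.45 come from adding and subtracting one and two standard deviations. A minimal sketch of that arithmetic follows. It assumes the errors are roughly normally distributed, so that one SD covers about 68% (rounded to 70% in the text) and two SDs about 95%; the helper name `band` is hypothetical.

```python
# One-year standard deviations copied from the chart in the text.
ONE_YEAR_SD = {
    "SDT": 0.039, "XBH": 0.083, "SI": 0.044, "DO": 0.091,
    "TR": 0.292, "HR": 0.149, "BB": 0.069, "SO": 0.044,
}


def band(factor, sd, n_sd):
    """Return (low, high) for factor +/- n_sd standard deviations."""
    return factor - n_sd * sd, factor + n_sd * sd


hr_factor = 1.45  # the factor assumed in the text for a full season
one_sd = band(hr_factor, ONE_YEAR_SD["HR"], 1)
two_sd = band(hr_factor, ONE_YEAR_SD["HR"], 2)

print(tuple(round(x, 2) for x in one_sd))  # (1.3, 1.6)
print(tuple(round(x, 2) for x in two_sd))  # (1.15, 1.75)
```

Swapping in the two year or three year SD for homeruns (.085 or .060) shows how quickly the band narrows as more seasons are added.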
If we assume that the road parks average out to 1.00, then the 'team' factor which is applied to the season's stats would be (home+road)/2, or in this case (1.45+1.00)/2, which is 1.22. Yankees hitters would be normalized by having their homerun percentage reduced by 22%, and the pitchers increased by 22%.

However, with interleague play and unbalanced schedules, we cannot assume that a team's road parks average 1.00. The Pirates play division games in Great American Ballpark, Miller Park, Wrigley Field and Minute Maid Park, all of which are among the easiest to homer in. The Rockies play division games in Petco Park, Dodger Stadium and AT&T Park, which are among the hardest. After the initial calculation of each park's factors, use those to normalize each team's road statistics and rerun to generate a new version of factors. A third time is even better, but more than that doesn't add any meaningful accuracy.

The chart shows that it takes at least three years to get a fairly accurate set of factors, but before that time has gone by a new stadium has likely been constructed, and so the road parks have changed. Assuming Yankee Stadium's HR factor remains higher than that of the park it replaced, the factor for Fenway Park will go down because Red Sox hitters can be expected to hit more homers on the road. In 1978, Fenway was the fourth easiest park in the AL to homer in, but by 1999 Fenway had dropped to ninth. Fenway hadn't changed; it was all the other parks that changed. Can we legitimately say "It used to be a hitter's park, but now it's a pitcher's park"?

It would make sense for each park's factors to remain constant as long as there had not been any changes in that park. To find each team's factors, multiply how many times they play in each park by each park's factors, then divide the sum by the total number of games.
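Both versions of the team factor described above can be sketched in Python: the simple half home, half road average and the schedule weighted average. The function names are hypothetical, and the game counts in the sample schedule are invented purely for illustration; only the home factor of 1.45 and the four road HR factors (taken from this article's summary table) come from the text.

```python
def simple_team_factor(home_factor, road_factor=1.00):
    """Half the games at home, half on the road."""
    return (home_factor + road_factor) / 2


def schedule_team_factor(schedule):
    """Games-weighted average of park factors.

    schedule: list of (games, park_factor) pairs, home park included.
    """
    total_games = sum(games for games, _ in schedule)
    if total_games == 0:
        raise ValueError("schedule contains no games")
    return sum(games * pf for games, pf in schedule) / total_games


print(f"{simple_team_factor(1.45):.3f}")  # 1.225, given as 1.22 in the text

# Hypothetical schedule: the game counts are made up for illustration.
schedule = [
    (81, 1.45),  # home games at the new Yankee Stadium
    (9, 1.02),   # Fenway Park
    (9, 1.13),   # Oriole Park at Camden Yards
    (9, 0.98),   # Tropicana Field
    (9, 1.10),   # SkyDome
    (45, 1.00),  # all other road games, assumed neutral
]
print(f"{schedule_team_factor(schedule):.3f}")
```

With a road slate that leans toward homer-friendly parks, the weighted figure comes out higher than the simple average, which is the adjustment the unbalanced schedule calls for.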
The team factor can change each year with a different mix of road parks for each team, while the factors for each park do not change as long as the park hasn't changed. When play by play data is available, team factors to adjust a season total are not needed. Instead, how each player performed in each ballpark can be normalized with that park's factors, and then summed into an adjusted season total.

1978 American League                  1999 American League
ParkID  Name                  HRpf    ParkID  Name                   HRpf
SEA02   Kingdome              1.55    DET04   Tiger Stadium          1.21
DET04   Tiger Stadium         1.21    BAL12   Camden Yards           1.13
TOR01   Exhibition Stadium    1.06    STP01   Tropicana Field        1.12
BOS07   Fenway Park           1.02    TOR02   SkyDome                1.10
MIN02   Metropolitan Stadium  1.00    ARL02   Ballpark at Arlington  1.10
CLE07   Cleveland Stadium     1.00    SEA02   Kingdome               1.09
ARL01   Arlington Stadium     0.94    NYC16   Yankee Stadium         1.07
OAK01   Oakland Coliseum      0.93    KAN06   Kauffman Stadium       1.05
NYC16   Yankee Stadium        0.92    BOS07   Fenway Park            1.02
ANA01   Anaheim Stadium       0.86    ANA01   Anaheim Stadium        1.01
MIL05   County Stadium        0.85    MIN03   Metrodome              0.98
CHI10   Comiskey Park         0.79    CHI12   Comiskey Park II       0.98
KAN06   Kauffman Stadium      0.77    OAK01   Oakland Coliseum       0.97
BAL11   Memorial Stadium      0.76    CLE08   Jacobs Field           0.95

In calculating long term park factors, I first made a list of ballpark 'versions'. Three Rivers Stadium opened in Pittsburgh in 1970, so that's version 1. In 1975, an inner wooden fence was constructed, about 6 feet shorter, creating version 2, which lasted until its closing after the 2000 season. Version 2 of Veterans Stadium in Philadelphia existed from 1972 to 2003. Three Rivers v2 and Veterans v2 both existed from 1975 to 2000. For those 26 seasons, compare the Pirates' and Phillies' stats in Pittsburgh with the same two teams' stats in Philadelphia. Repeat for every combination of ballpark versions, then compare the total home to road stats for the entire range of years.
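The first step of that long term method, finding the seasons in which two ballpark versions coexisted, can be sketched as below. This is only an illustration under stated assumptions: the dictionary layout and function name are hypothetical, and the 1970-1974 span for Three Rivers version 1 is inferred from the dates given above rather than stated outright.

```python
from itertools import combinations

# (park, version) -> (first season, last season)
PARK_VERSIONS = {
    ("Three Rivers Stadium", 1): (1970, 1974),  # inferred span
    ("Three Rivers Stadium", 2): (1975, 2000),
    ("Veterans Stadium", 2): (1972, 2003),
}


def shared_seasons(span_a, span_b):
    """Seasons in which both park versions were in use (empty if none)."""
    first = max(span_a[0], span_b[0])
    last = min(span_a[1], span_b[1])
    return range(first, last + 1)


for (key_a, span_a), (key_b, span_b) in combinations(PARK_VERSIONS.items(), 2):
    if key_a[0] == key_b[0]:
        continue  # two versions of the same park never coexist
    seasons = shared_seasons(span_a, span_b)
    if len(seasons) > 0:
        print(key_a, key_b, seasons[0], seasons[-1], len(seasons))
```

For Three Rivers v2 and Veterans v2 this yields 1975 through 2000, the 26 seasons cited in the text; the home and road totals for the two clubs would then be compared over exactly those years.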
I've spoken mainly of homeruns in this article, as that category is the one that varies the most between ballparks, ranging from 1.65 for the Polo Grounds (1954-1963) to 0.48 for the Astrodome (1977-1984). Other than the mile high Coors Field with its BABIP factor of 1.15, base hits range from Kansas City's Municipal Stadium at 1.08 to Milwaukee's County Stadium at 0.92. Candlestick Park in San Francisco had the highest SO factor at 1.11, while Coors Field is the hardest place to fan at 0.85. The bottom of the SO factor list is populated by the various incarnations of fields in Denver, Kansas City, Atlanta, Pittsburgh, Chicago and St. Louis, almost all of the major league cities away from the coasts and a thousand or more feet above sea level. The theory is that breaking pitches don't move as much at higher altitudes, where the air is thinner, resulting in higher contact rates, but that's another article.

In summary:
NAME                           ParkID  Ver  Since  Games   SDT   XBH    SI    DO    TR    HR    BB    SO
Angel Stadium of Anaheim       ANA01     4   1997    812  1.00  0.96  1.02  0.99  0.76  1.01  1.00  0.99
Rangers Ballpark in Arlington  ARL02     1   1994   1027  1.03  1.02  1.02  1.02  1.35  1.10  1.00  0.95
Turner Field                   ATL02     1   1997    810  1.01  0.94  1.03  0.95  1.03  0.96  1.00  0.99
Oriole Park at Camden Yards    BAL12     1   2002   1100  0.98  0.89  1.01  0.89  0.70  1.13  1.02  0.97
Fenway Park                    BOS07     7   1956   3965  1.07  1.15  1.03  1.27  1.01  1.02  1.00  0.98
Wrigley Field                  CHI11     7   1956   4006  1.02  0.98  1.02  1.01  0.98  1.19  1.02  0.99
U.S. Cellular Field            CHI12     2   2001    569  0.99  0.96  1.00  0.97  0.80  1.26  1.02  0.97
Great American Ballpark        CIN09     1   2003    406  0.97  0.99  0.97  1.01  0.50  1.24  0.97  0.99
Progressive Field              CLE08     1   1994   1008  1.01  1.02  1.00  1.05  0.78  0.95  1.03  1.00
Coors Field                    DEN02     2   2005    244  1.10  0.97  1.11  1.03  1.24  1.09  0.98  0.85
Comerica Park                  DET05     2   2003    324  1.00  0.93  1.02  0.86  1.56  0.87  0.95  0.94
Minute Maid Park               HOU03     1   2000    648  1.02  1.00  1.02  0.98  1.39  1.18  0.96  1.00
Kauffman Stadium               KAN06     4   2004   7323  1.04  1.08  1.01  1.11  1.21  0.83  1.04  0.92
Dodger Stadium                 LOS03     6   2001   7567  0.99  0.89  1.03  0.91  0.61  1.08  1.03  1.03
Land Shark Stadium             MIA01     2   1994   1017  1.00  0.99  1.01  0.95  1.36  0.92  1.06  1.05
Miller Park                    MIL06     1   2001    570  0.98  1.03  0.97  1.02  0.92  1.13  1.04  1.01
Hubert H. Humphrey Metrodome   MIN03     2   1983   1836  1.03  1.09  1.00  1.11  1.28  0.98  1.00  1.04
Shea Stadium                   NYC17     3   1985   1744  0.98  0.95  1.00  0.94  0.90  0.93  0.97  1.02
Yankee Stadium                 NYC16     7   1988   1420  0.99  0.94  1.01  0.95  0.73  1.07  0.96  0.99
Oakland Coliseum               OAK01     6   1996    885  0.96  1.01  0.96  0.98  0.89  0.97  0.97  0.96
Citizens Bank Park             PHI13     1   2004    324  1.01  0.96  1.03  0.97  0.96  1.23  0.89  0.97
Chase Field                    PHO01     1   1998    729  1.05  1.06  1.03  1.07  1.60  1.11  1.03  0.92
PNC Park                       PIT08     1   2001    565  1.03  1.01  1.03  1.08  0.77  0.89  0.95  0.92
PetCo Park                     SAN02     2   2006    162  0.94  0.86  0.99  0.77  1.07  0.90  1.00  1.08
AT&T Park                      SFO03     2   2004    325  1.05  0.98  1.05  1.00  1.24  0.87  0.96  0.94
Safeco Field                   SEA03     1   1999    650  0.96  0.96  0.97  0.94  0.76  0.93  1.09  1.07
Busch Stadium III              STL10     1   2006    161  1.01  0.90  1.05  0.91  0.82  0.82  0.97  0.90
Tropicana Field                STP01     2   2001    561  0.99  1.01  0.99  0.97  1.29  0.98  0.98  1.02
SkyDome                        TOR02     1   1989   1320  1.00  1.10  0.96  1.10  1.11  1.10  1.02  1.01
Robert F. Kennedy Stadium      WAS10     3   1971    324  0.97  0.94  0.99  0.90  0.98  0.77  0.88  1.01

80 comments have been left for this article.
 
I liked so much about this piece, as I've often thought about the true accuracy of park factors (the part about how park factors should not change when the park doesn't was especially good), but the opening part just really bugged me. I think it's become the norm to look for guys like Buster or Stark or Gammons writing something that people can rip, and it's kind of cheap. Buster was using the numbers to make a point about how many damn homers there have been. Just like you know that it's really not going to end up there, he knows that too.