August 3, 2009
Ahead in the Count: Runs Per Inning, and Why I Love the Long Ball
If you have ever tried to explain the concept of Pythagorean Record to a baseball novice, you have probably had to answer the following criticism: "That counts the extra runs at the end of a blowout as much as other runs, even though it does not matter whether you win 10-0 or 15-0." The answer that we give to that criticism is that teams that can take advantage of blowouts have better offenses, and those types of teams will be more likely to win close games in the future. That is the reason that we have a thousand run estimators that try to approximate how many runs a team will score on average, and why we evaluate players with statistics like VORP, measured in runs over replacement player. Runs are the building blocks of wins, and you win by scoring more runs than your opponent. We cringe when we hear offenses evaluated by batting average because we know that the goal of offenses is to score runs, not get hits.

The Inning

However, with all of these run estimators that sabermetricians have developed, we often forget the context in which runs are scored: by innings. Teams get to score as many runs as they can before their opponents record three outs; then they get to try again eight more times. That environment (how much you can score before three outs) is the environment to keep in mind when we talk about winning games.

Nearly a decade ago, Keith Woolner wrote about the link between runs per inning and runs per game, and how well you can predict the frequency of zero-run innings, one-run innings, two-run innings, etc. by looking at how many runs teams score per game. It is certainly true that the rate of scoring a certain number of runs in an inning and the average number of runs per game are related. In fact, teams that have more variance in their run-scoring per inning also have more variance in their run-scoring per game.
This is tricky to show because teams that score more runs also have more variance in the number of runs they score per game. That makes sense: they have a lot of eight-run games, 10-run games, and 15-run games, so they are bound to have a higher variance because they needed enough big innings to put up those run totals. Simply checking the variance of runs per game against how frequently those teams have big innings would obviously yield a positive correlation. Instead, I needed some way to neutralize the variance of runs per game. I initially tried dividing by the number of runs per game, but that statistic still had a positive correlation with runs per game. I tweaked things until I found a measure of the variance of runs per game that had essentially no correlation (0.0025) with runs per game. I call it "Adjusted Variance," or "AdjVar":

AdjVar = (Variance of Runs/Game) / ((Runs/Game)^1.30)

Looking at 1998-2008 data for each team (330 team seasons total), I found that this number was slightly positively correlated with the frequency of scoring zero runs in an inning (correlation = 0.097, two-sided p-stat = 0.075), highly correlated with the odds of scoring four or more runs in an inning (correlation = 0.258, two-sided p-stat = 0.000), and highly correlated with the odds of scoring five or more runs in an inning (correlation = 0.292, two-sided p-stat = 0.000). That much should not come as a surprise; we predicted that teams that have more variance in their runs-per-inning scoring would have more variance in their runs-per-game scoring.

Run-Scoring Variance and Pythagorean Record

The next step is to check if teams with more variance in their runs per game tend to underperform their Pythagorean records. In fact, this is true: the difference between actual wins and Pythagorean expected wins is negatively correlated with the AdjVar statistic above (correlation = -0.303, two-sided p-stat = 0.000).
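For readers who want to try the AdjVar calculation on their own game logs, here is a minimal sketch. It assumes population variance (the article does not say which variance was used) and the classic Pythagorean exponent of 2 for expected wins; the function names and the two game logs are hypothetical, made up only to show two teams with the same average and very different spreads.

```python
from statistics import mean, pvariance


def adj_var(runs_per_game, exponent=1.30):
    """Variance of runs per game divided by (mean runs per game) ** exponent.

    Population variance is an assumption; the source does not specify it.
    """
    return pvariance(runs_per_game) / mean(runs_per_game) ** exponent


def pythagorean_wins(runs_scored, runs_allowed, games, exponent=2.0):
    """Expected wins from the Pythagorean formula (exponent 2 is an assumption)."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return games * rs / (rs + ra)


# Hypothetical game logs: both teams average 5 runs per game.
steady = [4, 5, 6, 5, 4, 6, 5, 5]
streaky = [0, 1, 13, 2, 0, 12, 1, 11]

print("steady AdjVar: ", round(adj_var(steady), 3))
print("streaky AdjVar:", round(adj_var(streaky), 3))

# Actual wins minus Pythagorean wins is the quantity correlated with AdjVar.
actual_wins = 88  # hypothetical
print("W minus Pythagorean W:", round(actual_wins - pythagorean_wins(800, 700, 162), 1))
```

Dividing by the mean raised to 1.30 is what keeps a high-scoring team from looking volatile simply because it scores a lot.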
Teams that are more volatile in their rate of scoring runs are going to lose more often than other teams that score a similar number of runs but are not as volatile. Now we know that teams that have high variance in their run-scoring by inning have more variance in their run-scoring per game. We also know that teams that have more variance in their run-scoring by game are not as likely to win as teams that put up the same number of runs but without as much of a spread. The next step is to figure out if there is any way to predict which offenses will have less variance in their runs per inning.

Which Offenses Spread Their Runs Around Better

Three years ago, Sal Baxamusa looked at 2006 team-scoring data and used the Weibull Distribution to predict how often teams would score a certain number of runs. The Weibull Distribution does a pretty good job at predicting the number of times teams will put up certain run totals, but tends to underestimate how often teams are shut out. This is likely due to the fact that the talent level of pitchers is different, so analyzing how a team scores in general will not take this into account. You face Johan Santana sometimes, and you face Livan Hernandez at others, and Santana might shut you out more often than a model of hitting alone would predict.

Baxamusa demonstrated that slugging teams were shut out less often, and also were more likely to score at least three runs in a game than their season run total and the Weibull Distribution would predict. This was useful information, but given the difficulties with the Weibull Distribution and the small sample size of just thirty data points, he was unable to check this in much detail. By looking at runs per inning, we can look at a much larger sample: there were 477,884 half-innings from 1998-2008. Using this, we can check which types of offenses are more likely to spread their runs around and win more games as a result.
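The per-inning quantity used in the rest of the analysis is the share of half-innings in which a team scores at least X runs. A rough sketch of that tabulation follows; the function name and the sample list of half-inning run totals are hypothetical and are not drawn from the 1998-2008 data.

```python
def at_least_frequencies(inning_runs, max_x=5):
    """Return {x: share of half-innings with at least x runs} for x = 1..max_x."""
    n = len(inning_runs)
    return {
        x: sum(1 for runs in inning_runs if runs >= x) / n
        for x in range(1, max_x + 1)
    }


# Hypothetical run totals for 20 half-innings by one team (not real data).
sample = [0, 0, 1, 0, 2, 0, 0, 3, 0, 1, 0, 0, 0, 5, 0, 1, 0, 0, 2, 0]

for x, share in at_least_frequencies(sample).items():
    print(f"at least {x} run(s): {share:.2f}")
```

Computed for each team season, these shares are the values that get correlated with OBP and SLG.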
The correlations between the odds of scoring at least a given number of runs in an inning and a number of common offensive rate statistics reveal even more evidence of Baxamusa's suspicion that the teams that score with power are more likely to win than other teams who score similar numbers of runs. For reference, note that the average team from 1998-2008 scored in only 29 percent of the innings that they played, but they scored two or more 14 percent of the time, they scored three or more six percent of the time, they scored four or more three percent of the time, and they scored five or more one percent of the time.

Below I list the correlation between the frequencies of scoring at least a certain number of runs in an inning and on-base percentage and slugging percentage. Note that OBP and SLG each have a 0.887 correlation with runs per game. You will notice an interesting trend:

At least X Runs/Inning    OBP     SLG
1                         .822    .872
2                         .741    .723
3                         .603    .573
4                         .746    .716
5                         .667    .611

The trend that you probably noticed is that high-slugging teams are more likely to pick up at least a run in an inning, but high-OBP teams are more likely to have big innings. The reason that this is so important is that we have shown that being able to spread your runs around different innings is more valuable than scoring a lot of runs in one inning, in terms of wins and losses, since high variance in run scoring tends to be correlated with underperforming your team's Pythagorean Record. This means that all of our standard measures of run-scoring are overweighting the contribution of OBP towards winning and underestimating the contribution of SLG towards winning.

The connection can be highlighted even further by using regression analysis to predict the probability that a team scores at least X runs in an inning.
I regressed the probability of scoring at least one, two, three, four, and five runs in an inning on on-base percentage and slugging percentage and found the following formulas:

Prob(Scoring at least 1 run)  = -0.154 + 0.659*OBP + 0.526*SLG
Prob(Scoring at least 2 runs) = -0.224 + 0.686*OBP + 0.307*SLG
Prob(Scoring at least 3 runs) = -0.164 + 0.462*OBP + 0.171*SLG
Prob(Scoring at least 4 runs) = -0.090 + 0.235*OBP + 0.094*SLG
Prob(Scoring at least 5 runs) = -0.050 + 0.138*OBP + 0.039*SLG

The important thing to realize when looking at these formulas is that the coefficient on SLG gets smaller relative to the coefficient on OBP as you increase the number of runs per inning. Teams that string together a lot of baserunners are more likely to score by putting up big innings than teams that swing for the fences, who will spread their runs around better. The link remains strong when you look at similar statistics for scoring at least a certain number of runs in a game:

Prob(Scoring at least 1 run)  =  0.673 + 0.336*OBP + 0.387*SLG
Prob(Scoring at least 2 runs) =  0.238 + 0.815*OBP + 0.814*SLG
Prob(Scoring at least 3 runs) = -0.205 + 1.46 *OBP + 1.06 *SLG
Prob(Scoring at least 4 runs) = -0.584 + 2.00 *OBP + 1.21 *SLG
Prob(Scoring at least 5 runs) = -0.889 + 2.65 *OBP + 1.11 *SLG
Prob(Scoring at least 6 runs) = -0.973 + 2.51 *OBP + 1.14 *SLG
Prob(Scoring at least 7 runs) = -0.909 + 2.26 *OBP + 0.974*SLG

Conclusion

It is clear that power helps you score frequently, and on-base skill helps you pile on when you do score. In fact, a team's home runs per at-bat has a 0.15 correlation with the number of wins a team gets beyond what its Pythagorean record predicts. Teams that hit more home runs do better than their Pythagorean Record suggests. What this means is that power hitters are even more valuable than their VORP suggests. Power hitters not only change the scoreboard, but they change the scoreboard when it matters.
The next time somebody tells you that a team is falling short because they rely too much on the long ball, you can reply that they may not rely on it enough.
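As a quick check on the per-inning regression formulas, the sketch below plugs in a team's OBP and SLG and also reports how the SLG coefficient compares with the OBP coefficient at each run threshold. The intercepts are entered as negative values, which is an assumption: it is the reading that reproduces the averages quoted in the article (about 29 percent of innings with at least one run). The .330 OBP and .420 SLG inputs are hypothetical, chosen only as a plausible average-looking team.

```python
# (intercept, OBP coefficient, SLG coefficient) for "at least x runs in an inning".
# Negative intercepts are an assumption consistent with the quoted league averages.
INNING_MODELS = {
    1: (-0.154, 0.659, 0.526),
    2: (-0.224, 0.686, 0.307),
    3: (-0.164, 0.462, 0.171),
    4: (-0.090, 0.235, 0.094),
    5: (-0.050, 0.138, 0.039),
}


def prob_at_least(x, obp, slg):
    """Fitted probability of scoring at least x runs in an inning."""
    intercept, b_obp, b_slg = INNING_MODELS[x]
    return intercept + b_obp * obp + b_slg * slg


def slg_to_obp_ratio(x):
    """SLG coefficient divided by OBP coefficient for threshold x."""
    _, b_obp, b_slg = INNING_MODELS[x]
    return b_slg / b_obp


# Hypothetical team: .330 OBP, .420 SLG.
for x in INNING_MODELS:
    p = prob_at_least(x, 0.330, 0.420)
    print(f"at least {x}: prob={p:.3f}  SLG/OBP coefficient ratio={slg_to_obp_ratio(x):.2f}")
```

The ratio is far lower at five runs than at one, which is the pattern the article leans on: slugging matters most for getting on the board at all.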
Matt Swartz is an author of Baseball Prospectus.

46 comments have been left for this article.

jgibson (32845)
Matt, very interesting analysis. You were my favorite writer in Idol. Glad to see BP is keeping you around. Keep up the great work.
Aug 03, 2009 11:15 AM

Edwincnelson (33517)
Amazingly well written, and well researched article. I was discussing the reasoning behind the LaRoche-Kotchman trade from the Braves' point of view this morning, and this article certainly sheds some light on how the Braves may have looked at the trade.
Aug 03, 2009 11:20 AM

CRP13 (46873)
Excellent point. This article and your comment explained that trade better than anything I've heard by the talking heads. I'd be surprised if this type of analysis was used though; more likely it was just a money swap.
Aug 03, 2009 11:49 AM

Shaun P. (676)
This is really interesting stuff, Matt. I wonder if this could, in part, account for the Yanks usually finishing ahead of their Pythag record, especially during the later Torre years. IIRC, the Yanks were usually among the league leaders in home runs those years . . .
Aug 03, 2009 11:53 AM

It's possible that played some role, but I think that the commonly cited reason is that the late 90s Yankees had such a great bullpen that they did not let close games get away. I would imagine that their power hitting helped them get runs in those close games, too, but looking back they also seemed to have pretty great OBPs those years as well. Interesting application, though. Thanks.
Aug 03, 2009 12:01 PM

Pat Folz (6254)
Thank you for this. I'd been wondering about something along these lines for a while now, but was never able to organize it into a cohesive idea, much less figure out how to investigate it.
Aug 03, 2009 12:49 PM

thegeneral13 (32625)
Interesting stuff. I would like to see a little more exploration of the negative correlation between variance and performance vs. pythagorean record.
Intuitively, good teams should benefit from low variance while bad teams should benefit from high variance. If Team A scores 6 runs a game with zero variance and Team B scores 5 runs a game with zero variance, Team A will win every game. If you add variance to either team, it will hurt Team A's winning percentage and help Team B's. As variance becomes infinite Team A's winning percentage will approach 50%. I'm not sure if I understand why your findings would suggest that low variance is uniformly good. I realize this is relative to the pythag model, but if the pythag model is a good approximation in aggregate, then I would think that a team with a positive run differential should outperform if it has below average variance while a team with a negative run differential should outperform if it has above average variance. Maybe if you divided the sample into two cohorts (teams that outscored their opponents and teams that were outscored) the results would look different. Just a thought.
Aug 03, 2009 12:54 PM

I see your point, but it might help for me to explain a little bit more about the variance: specifically, it's not normally distributed. The distribution for all teams has a very large right tail, so more variance almost always means that you outscore your opponents by a lot in blowouts but does not really do much otherwise. So even below average teams prefer more variance. If it were normally distributed, your point would be dead on and I see what you're saying.
Aug 03, 2009 14:35 PM

thegeneral13 (32625)
Thanks for the reply, Matt.
Aug 03, 2009 15:43 PM

It's actually not quite a lognormal distribution; Baxamusa cited a math professor named Stephen Miller who showed it was Weibull. Obviously the two have similar shapes, but the issue is how they treat zeros.
Aug 03, 2009 18:37 PM

KaiserD2 (15467)
I have been studying these issues for decades, albeit without such statistical sophistication, and I have a simpler (although not contradictory) explanation for this finding.
Aug 03, 2009 14:58 PM

awayish (20768)
you've touched on an important and neglected topic here. good work
Aug 03, 2009 16:31 PM

Adam Madison (20269)
My only complaint about the article is that it seems to be missing a thesis. In the middle of reading it, I was asking myself, "What is he trying to prove?" That could be easily solved, and the article was great nonetheless, but it was definitely a bit disconcerting as a reader.
Aug 03, 2009 17:01 PM

Shawn Hoffman (14345)
Great stuff Matt. I'd love to see it peer-reviewed, but really good job.
Aug 03, 2009 18:45 PM

Interesting thoughts, thanks. I'm not sure I have enough data on hand to answer that very well. I checked the correlations year by year from 1998 to 2008, after reading your comment, for W-XW and AdjVar and got (all values negative): .12, .07, .10, .40, .42, .32, .35, .40, .43, .25, .50. Since run scoring was a little higher in 1998-2000 than 2001-2008, there does seem to be some tendency for less correlation in higher run scoring environments, and the correlation for 1998-2008 combined is more negative in the NL than in the AL (.34 vs .27), so that's maybe a little more evidence towards a lower correlation in higher run scoring environments. I'm not sure that this is all that conclusive because of the sample size (30 teams per year for the 11 correlations above), but maybe the higher average runs/game gives more room for variance on the low side.
Aug 03, 2009 19:10 PM

WaldoInSC (26415)
This is outstanding research. In time it will have far-reaching consequences refining Pythagorean accounting. Encore!
Aug 03, 2009 19:23 PM

Sky Kalkman (3454)
Great stuff. Something I'd been meaning to get to, but never did.
Aug 03, 2009 19:50 PM

Glad to help with that :) I checked those too, but wasn't very persuaded that the results helped much.
Aug 03, 2009 20:07 PM

Richard Bergstrom (36532)
I know that at BP, they often use third order wins to use the elements of run scoring and prevention to predict a team's success.
I don't know what formula they use, but I would be interested to see if they weight SLG higher or OBP lower.
Aug 04, 2009 00:32 AM

1st order record uses RS/RA and 2nd order record uses EqR for the team and its opponent; 3rd order record improves on 2nd order record by adjusting for difficulty of opponents. EqR is a run estimator in that it approximates the RS/RA of a team in a neutral setting over the course of the season. The point of the article was that run estimators like this will estimate RS/RA well, but that there is an additional factor to account for in terms of biased distributions of RS/RA.
Aug 04, 2009 06:00 AM

Richard Bergstrom (36532)
While I understand that the point of the article had to do with run distribution, there seemed to be a corollary that teams that get extra base hits/home runs tend to get shut out less in a single game than teams that get on base. The idea is that, on a given day, teams with better slugging don't need to chain as many events together to score runs.
Aug 04, 2009 07:50 AM

You are correct. The implication of the article is that SLG could refine W-L predictions a little bit better. If you had two equal quality bullpens in terms of Fair RA or something like that, but one was relatively more capable at reducing SLG and the other was relatively more capable at reducing OBP, I suppose that bullpen might fare a little bit better. Keep in mind that pitchers generally control K%/BB%/GB% and not much else. I guess the implication would be that a bullpen with higher BB% but higher K% and higher GB% might do a little better at reducing SLG? That's probably true but the magnitude of that effect is likely pretty small. Certainly an interesting point, though.
Aug 04, 2009 08:16 AM

Richard Bergstrom (36532)
Pitchers generally control K/BB/GB, but my understanding was that pitchers also appear to have an influence on home run rate too...
so the question is whether a HR rate allowed is a better indication of suppressing an opposing team's ability to score in any given inning than BB/9, for example.
Aug 04, 2009 10:18 AM

anderson721 (18704)
Okay, I'm puzzled. The Angels are a high OBP, average HR team, and they blow away pythag annually. Is it all bullpen and baserunning?
Aug 04, 2009 04:59 AM

It's mostly good bullpen. Although the OBP vs. SLG tendency is going to be true on the whole, it's not going to be the deciding factor for every team. The Angels have had a very good bullpen for several years, and that will definitely be a larger effect on Pythagorean Record. Additionally, the Angels' OBP and SLG ranks aren't as lopsided as you may think, but they definitely lean a little towards the OBP side.
Aug 04, 2009 06:04 AM

Scherer (34879)
Remember, too, that SLG is not really about HRs, or even power for that matter, as batting average is a significant component of SLG. Ichiro currently has a SLG similar to Curtis Granderson, but that doesn't mean he's a comparable HR hitter.
Aug 04, 2009 10:16 AM

Evan (47)
I really enjoyed the article, but what I enjoyed most was that link to Woolner's article from 2000. That's the sort of thing I'd like to see more of from BP. I want more statistical heavy lifting.
Aug 04, 2009 09:46 AM

jdseal (46813)
The research and the premise behind this article were outstanding. Great to see you on the team, Matt. I'm a professional statistician so I actually understood most of this, but I felt that, by usual BP standards, there was more jargon and less explanation, and I fear more of the readers were lost. It's such good work, I hated to see that happening.
Aug 04, 2009 16:37 PM

Hoff (37596)
Haha, you have two conflicting comments at the end here; some want more data, some suggest less jargon. Good luck finding the balance.
(I was taught stats by another penn econ phd, bob tayon, maybe you know him, following wasn't too hard)
Aug 05, 2009 08:47 AM

Wow, I can't believe you know Rob Tayon! He's a great friend of mine! I see that you call him Bob, which I guess means you've known him for a while, because I think he mostly goes by Rob now, at least among people he knew in grad school; we all called him Rob. That's very cool.
Aug 05, 2009 09:53 AM

Hoff (37596)
I should have said Professor Tayon. We didn't say either rob or bob, so it looks like I guessed wrong. Ahem.
Aug 05, 2009 10:47 AM

I'm not sure I understand what you're saying. Certainly the coefficients would change if I took one out or the other. But I'm not trying to develop a production function for runs, just developing a way to look at correlations of two variables at the same time.
Aug 05, 2009 11:53 AM

Hoff (37596)
I'm just wondering if your coefficients might be deceptively significant given the high correlation between the two. If by dropping one variable you found p stats changing dramatically there might be something wrong with the coefficients on the full regression.
Aug 05, 2009 14:19 PM

dpowell (1025)
This would work the other way. If they're highly correlated, then this should drive the standard errors for both variables up (towards insignificance).
Aug 05, 2009 14:48 PM

Thanks. Yeah, that's what happened when I just ran them again on obp and slg individually in single regressions. They became higher and more significant because their coefficients in single regressions now picked up both of their effects.
Aug 05, 2009 14:52 PM

jdseal (46813)
Wow, this discussion is so wonderfully geeky I can hardly contain myself. I wish more of my colleagues (who, professionally speaking, SHOULD be able to have these kinds of discussions) could follow all that. Anybody here interested in a job analyzing marketing and survey data?
Aug 08, 2009 08:40 AM

Richard Bergstrom (36532)
Ok so here's a question..
is there a breakeven point on OBP or on SLG where, if playing for one run and that one run will win the game and there is currently a runner on second base, and assuming that each other person in the lineup is a league-average hitter, it's better for a batter to bunt to advance a runner instead of trying to swing for extra bases?
Aug 05, 2009 17:38 PM

I'm curious: BP never pays attention to batting average, but I wonder what, if any, correlation there is between batting average and outperforming the pythagorean projection. Intuitively, it seems like teams that have a higher percentage of their OBP as hits, rather than walks, will also score single runs in a few more innings than expected.
BP doesn't discuss batting average much, but it does actually account for it in VORP and other statistics so we do count a single more than a walk, which you're right to point out we should. Pythagorean Record just uses run totals though, so I think your question would be if batting average is correlated with second order Pythagorean Record. I can't tell you about that but the correlation with regular Pythagorean Record is only 0.038, which is not statistically significant. As far as per inning scoring, it seems that AVG has a similar effect as OBP in that it is more important in putting together big innings than getting one or two runs across, but the effect is not as distinct as OBP's.