June 14, 2016
Prospectus Feature
What the #StroPoll Results Got Wrong
On Wednesday night, this image created a small Twitter sensation. Mind you, it was a small sensation. On a night that featured noteworthy pitching performances ranging from Yu Darvish’s injury to Jameson Taillon’s successful debut to James Shields to Snoop Dogg, there wasn’t room for a large sensation. But this screenshot during the Houston broadcast (in a game in which the Astros actually beat the Rangers in Arlington!) caused some of us to drop our slide rules in amazement:
The two most obvious reactions were: 1. Wow, people actually are starting to realize that batting average isn’t all it’s cracked up to be! 2. What in the world is a “sabermetrics?” (Rob Neyer hypothesized that it was shorthand for WAR, which probably wouldn’t have looked good on a Twitter poll anyway. If that’s the case, we should thank the Root Sports program director for sparing us the inevitable Edwin Starr references.)
Here’s a third reaction: They got it wrong. On-base percentage isn’t the “most important to a hitter’s value,” even among these options. Slugging percentage is.
I think there are two reasons for on-base percentage’s popularity. First, of course, is Moneyball. Michael Lewis demonstrated how there was a market inefficiency in valuing players with good on-base skills in 2002. The second reason is that it makes intuitive sense. You get on base, you mess with the pitcher’s windup and the fielders’ alignment, and good things can happen, scoring-wise.
To check, I looked at every team from 1913 through 2015 (the entire Retrosheet era, encompassing 2,214 team-seasons). I calculated the correlation coefficient between each team’s on-base percentage and its runs per game. And, it turns out, it’s pretty high: 0.890. That means, roughly, that you can explain nearly 80 percent of a team’s scoring by looking at its on-base percentage. (Square the correlation coefficient, r, and you get r², the percentage of variation explained by a linear model.) Slugging percentage is close behind, at 0.867. Batting average, unsurprisingly, is worse (0.812), while OPS, also unsurprisingly, is better (0.944). TAv would undoubtedly be better still.
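The step from r to r² is easy to check by hand. What follows is a minimal Python sketch, not the author's actual code: `pearson_r` is an illustrative helper showing how a correlation coefficient is computed, and the dictionary holds only the four coefficients quoted above.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Correlations with runs per game as reported in the article (1913-2015).
reported_r = {"OBP": 0.890, "SLG": 0.867, "AVG": 0.812, "OPS": 0.944}

# r squared: share of the variation in scoring explained by a linear fit.
r_squared = {stat: round(r ** 2, 3) for stat, r in reported_r.items()}
```

On-base percentage's 0.890 squares to about 0.79, which is the "nearly 80 percent" figure; slugging's 0.867 squares to about 0.75.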
But that difference doesn’t mean that OBP>SLG is an iron rule. Take 2015, for example. The correlation coefficient between on-base percentage and runs per game for the 30 teams last year was just 0.644, compared to 0.875 for slugging percentage. Slugging won in 2014 too, 0.857-0.797. And 2013, 0.896-0.894. And 2012, and 2011, and 2010, and 2009, and every single year starting in the Moneyball season of 2002. Slugging percentage, not on-base percentage, is on a 14-year run as the best predictor of offense.
And it turns out that the choice of endpoints matters. On-base percentage has a higher correlation coefficient to scoring than slugging percentage for the period 1913-2015. But slugging percentage explains scoring better in the period 1939-2015 and every subsequent span ending in the present. Slugging percentage, not on-base percentage, is most closely linked to run scoring in modern baseball.
Let me show that graphically. I calculated the correlation coefficient between slugging percentage and scoring (i.e., runs per game), minus the correlation coefficient between on-base percentage and scoring. A positive number means that slugging percentage did a better job of explaining scoring, and a negative number means that on-base percentage did better. I looked at three-year periods (to smooth out the data) from 1913 to 2015, so on the graph below, the label 1915 represents the years 1913-1915.
A few observations:
· The Deadball years were extreme outliers. There were dilution-of-talent issues in 1914 and 1915, when the Federal League operated. World War I thinned rosters and shortened the season in 1918 and 1919. And nobody hit home runs back then. The Giants led the majors with 39 home runs in 1917. Three Blue Jays matched or beat that number last year.
· Since World War II, slugging percentage has been, pretty clearly, the more important driver of offense. Beginning with 1946-1948, there have been 68 three-year spans, and in only 19 of them (28 percent) did on-base percentage do a better job of explaining run scoring than slugging percentage.
· The one notable exception: the years 1995-1997 through 2000-2002, during which on-base percentage ruled. Ol’ Billy Beane, he knew what he was doing.
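The span count in the second observation can be verified with a few lines of arithmetic. This is only a sanity check in Python, with variable names chosen for illustration; the 19 is taken from the article rather than computed.

```python
# Three-year spans are labeled by their final year: 1946-1948 -> 1948.
first_end_year = 1948
last_end_year = 2015          # the 2013-2015 span
spans = last_end_year - first_end_year + 1

obp_better = 19               # count reported in the article
share = obp_better / spans    # fraction of spans where OBP did better
```

That gives 68 spans and a share of roughly 0.28, in line with the figures quoted.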
Why is this? The graph isn’t random; there are somewhat distinct periods during which either on-base percentage or slugging percentage is better correlated to scoring. What’s going on in those periods?
To try to answer that question, I ran another set of correlations, comparing the slugging percentage minus on-base percentage correlations to various per-game measures: runs, hits, home runs, doubles, triples, etc. Nothing really correlates all that well. I tossed out the five clear outliers on the left side of the graph (1913-15, 1914-16, 1915-17, 1916-18, 1917-19), and the best correlations I got were still less than 0.40. Here’s runs per game, with a correlation coefficient of -0.35. The negative correlation means that the more runs scored per game, the more on-base percentage, rather than slugging percentage, correlates to scoring.
That makes sense, I suppose. When there are a lot of runs being scored (the 1930s, the Steroid Era), all you need to do is get guys on base, because the batters behind them stand a good chance of driving them in. When runs are harder to come by (Deadball II, or the current game), it’s harder to bring around a runner to score without the long ball. Again, this isn’t a really strong relationship, but you can kind of see it.
So respondents to the StroPoll: Good job realizing that batting average isn’t the most important single offensive statistic! You’ve moved beyond 20th-century thinking. Now it’s time to move beyond turn-of-the-millennium thinking as well. On-base percentage is a really useful statistic. But if you want to predict run scoring, look at slugging percentage first.
Rob Mains is an author of Baseball Prospectus. Follow @Cran_Boy
4 comments have been left for this article.

If you guys could get Pat Tabler and Buck Martinez to read this article it would make watching Jays games on TV a little less painful. Those guys are stuck in the 1970s.
bhacking, there is no team for which I appreciate the ability to listen to the radio broadcast while watching games on mlb.tv more than the Blue Jays. I really like your radio crew.