April 2, 2013
Testing the Dewan Rule
Can Spring Training Slugging Really Predict Breakouts?
It’s only natural to seek meaning in spring training statistics. By the time spring games roll around, we’re baseball-starved enough to believe anything. We’re also preparing for fantasy drafts, which means we’re always on the lookout for any info that could give us an edge. And contrary to the popular stathead saying, spring training stats aren’t actually meaningless—they’re just less meaningful, compared to a same-sized sample of big-league performance. Any change in a player’s performance should produce a corresponding (albeit small) change in our projection for that player. The more extreme that change in performance is, and the larger the sample, the more that projection shifts.
The most commonly cited method for assessing spring training statistics was proposed and popularized by John Dewan, the owner of Baseball Info Solutions. Dewan has devoted most of his analytical efforts to quantifying fielding, but he tackles other statistical topics in his “Stat of the Week” series at the website of publisher Acta Sports. Since at least 2005, Dewan has published an annual list of players whom he thinks stand a good chance to break out in the upcoming season, based on their spring training stats.
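To earn a place on Dewan’s list, a player must (in most years, at least; more on that later) meet the following criteria: at least 200 career big-league at-bats, at least 40 spring training at-bats, and a spring slugging percentage at least 200 points higher than his career slugging percentage.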
Conceptually speaking, this makes some sense: the more extreme a spring training performance, the more likely it is to be meaningful, despite the small sample. In recent years, the 200/40/.200 criteria, now known as “the Dewan Rule,” have become quite popular, with posts about potential breakout candidates popping up around the internet in anticipation of or in response to the appearance of Dewan’s list. However, before fantasy players can depend on the Dewan Rule to help them predict breakouts, they need to know whether it works. We’ve never come across any attempts to test its effectiveness, so we decided to do so ourselves.
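For readers who want to screen spring stat lines themselves, the 200/40/.200 thresholds can be expressed as a simple filter. What follows is a minimal Python sketch, not anything Dewan has published: the `Hitter` fields, the function name, and the three example players are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Hitter:
    # All field names here are hypothetical, chosen for illustration.
    name: str
    career_ab: int      # big-league at-bats before this spring
    career_slg: float   # career slugging percentage before this spring
    spring_ab: int      # at-bats in this spring training
    spring_slg: float   # slugging percentage in this spring training


def meets_dewan_rule(h, min_career_ab=200, min_spring_ab=40, min_gain_points=200):
    """Return True if the hitter clears all three thresholds."""
    # Compare in whole points of slugging to sidestep float rounding.
    gain_points = round((h.spring_slg - h.career_slg) * 1000)
    return (
        h.career_ab >= min_career_ab
        and h.spring_ab >= min_spring_ab
        and gain_points >= min_gain_points
    )


# Invented players, not real spring lines.
hitters = [
    Hitter("Player A", 850, 0.410, 52, 0.635),  # qualifies: +225 points
    Hitter("Player B", 120, 0.380, 48, 0.700),  # too few career at-bats
    Hitter("Player C", 900, 0.450, 44, 0.600),  # only +150 points
]
candidates = [h.name for h in hitters if meets_dewan_rule(h)]
print(candidates)
```

Keeping the thresholds as parameters makes it easy to try stricter versions of the rule, such as a larger slugging gain or more spring at-bats.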
Before we could determine how well the Dewan Rule works, we had to figure out what it claims to do. That was harder than it sounds. Here’s an excerpt from the introduction to the first incarnation of the Dewan Rule at Acta’s site, which was published on March 25, 2005.
Neither that introduction, nor the introductions to any of the lists published since, mentions when that “past research” took place or what it consisted of. It’s also important to note that both the criteria for inclusion in the list of players predicted to improve and the claims about the Rule’s predictive powers have changed in the nine seasons in which the list has appeared at Acta. The following table shows the number of career at-bats and spring at-bats specified as the minimum for inclusion in the group of candidates for improvement each year, as well as each year’s claim about the predictiveness of a 200-point spring increase in slugging percentage. (The years link to the Acta posts from each season; in recent seasons, Dewan’s lists have also been cross-posted at Bill James Online.)
*In 2008, the introduction said, “Our definition of ‘exceptionally well’ was slugging 100 points higher in spring training than their previous career slugging percentage,” but the list of players provided still consisted of those whose spring slugging percentages were at least 200 points higher than their career norms. That’s the only year in which slugging 100 points higher in spring training was mentioned.
Since the first year the list was published at Acta, the minimum career at-bat count required for inclusion has risen from 100 to 175 to 200, and the minimum spring at-bat count has fluctuated from 36 to 35 to 40. The share of players predicted to exceed their career slugging percentage has shifted among “better than/over 60 percent,” “about two-thirds,” and “about three-fourths,” with this season’s list claiming that exactly 60 percent of extreme spring training sluggers go on to record regular-season slugging percentages “significantly” above their career averages.
Over the last seven seasons, 226 players fit the Dewan Rule description, with 218 of them going on to get regular-season at-bats after their high-slugging spring trainings. (You can find the full sample here.) We excluded the eight who didn’t play in the regular season from the sample, so they didn’t count as successes or failures. Here are the results, counting anyone whose regular-season slugging percentage surpassed his previous career rate, by any margin, as a “success”:
The Dewan Rule met or exceeded a 60 percent success rate in only one of the seven seasons, and it never got to two-thirds, let alone three-fourths. On the whole, fewer than half of the players who exceeded their career slugging percentages by at least 200 points in spring training went on to top their career slugging percentages during the regular season. Over the last seven seasons, at least, a coin flip would have predicted individual power improvements as well as or better than the Dewan Rule.
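The per-player tally described above is easy to reproduce. Here is a rough Python sketch under stated assumptions: the tuple layout and function name are hypothetical, and the four rows are invented for illustration rather than drawn from the study sample.

```python
def success_rate(players):
    """players: list of (career_slg, season_slg, season_ab) tuples.

    Hitters with no regular-season at-bats are dropped, so they count as
    neither successes nor failures.
    """
    played = [(career, season) for career, season, ab in players if ab > 0]
    if not played:
        return None
    # Any margin above the prior career mark counts as a success.
    successes = sum(1 for career, season in played if season > career)
    return successes / len(played)


# Invented example rows, not the actual study sample.
sample = [
    (0.420, 0.455, 510),  # success: beat career mark
    (0.390, 0.371, 230),  # failure
    (0.445, 0.446, 35),   # success by a single point
    (0.400, 0.000, 0),    # never played: excluded
]
rate = success_rate(sample)
print(f"{rate:.1%}")
```

Note that the third row counts as a full success on the strength of one point of slugging in 35 at-bats, which is the equal-weighting quirk this kind of tally carries.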
Of course, this testing method assigns equal weight to each individual, regardless of playing time. Lance Niekro’s .176 SLG in 17 at-bats in 2007, for instance, counts as much against the Dewan Rule as Jose Bautista’s .617 SLG in 569 at-bats in 2010 counts in its favor. It’s possible that comparing the collective post-spring slugging percentages of the 218 players in the study to their collective career slugging percentages might yield better results for the Dewan Rule. So we checked that, too:
Again, there’s nothing positive to report. The players posted a lower collective slugging percentage after their high-SLG springs than they had in their big-league careers prior to that point.
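To see why the collective comparison can differ from the player-by-player one, consider the Niekro and Bautista lines quoted above. The sketch below is only an illustration: the function name is hypothetical, and it assumes that multiplying a rounded SLG by at-bats is a close enough stand-in for total bases.

```python
def pooled_slg(lines):
    """lines: list of (slg, at_bats). Returns total bases / total at-bats."""
    total_ab = sum(ab for _, ab in lines)
    # SLG * AB recovers (approximately, given rounding) each hitter's total bases.
    total_bases = sum(slg * ab for slg, ab in lines)
    return total_bases / total_ab


# The two regular-season lines quoted above: (SLG, at-bats).
lines = [(0.176, 17), (0.617, 569)]
simple_mean = sum(slg for slg, _ in lines) / len(lines)
pooled = pooled_slg(lines)
print(f"simple mean: {simple_mean:.3f}, pooled: {pooled:.3f}")
```

The simple mean treats the 17-at-bat line and the 569-at-bat line as equals and lands near .400, while the pooled figure sits close to the full-season line at roughly .604.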
It's hard to avoid the conclusion that the Dewan Rule, as currently constituted, doesn’t work. In fact, it doesn’t work even if we give it a helping hand. Last year, Russell Carleton reported that slugging percentage stabilizes at 320 at-bats. So what if we look only at the 130 Dewan Rule qualifiers who got at least 320 at-bats in the regular season following their hot spring trainings? Of course, there’s no way to know for sure before the season starts which players will get at least that much playing time, and there’s a survivor bias at work: the players who reach the 320 at-bat mark are more likely to have hit better (and have higher regular-season slugging percentages) than those who fall short. And even so, the 130 Dewan Rule candidates with the most staying power showed no improvement as a group:
Maybe the concept behind the Dewan Rule is sound, but the parameters are too tame. What if we tweak the criteria to pinpoint only the most extreme spring slugging performances, those at least 300 or 400 points above career average?
Still nothing we can hang our hats on. The sample sizes are even smaller (so small as to be meaningless, in the case of the >=.400 group; there’s about one of those players a year), and regardless, the results are uninspiring.
Okay, last try. This time, we’ll keep the spring slugging threshold the same (200 points above career average) but raise the minimum number of spring at-bats from 40 to 50, 60, and 70:
These results show the direction of effect we’d expect, at least, but we can’t conclude much based on samples this size. Making the Dewan Rule more selective doesn’t seem to increase its utility.
There’s one important factor we haven’t yet considered: the league-wide offensive environment. The Dewan Rule doesn’t consider it either; Dewan doesn’t claim that the Rule predicts improvements after adjusting for park or league, just that it predicts improvements. Still, league-wide slugging percentage has declined significantly over the sample we studied:
Over the past seven seasons, league-wide slugging was .414. Over the seven seasons before that, league-wide slugging was .425. So a batter who entered 2010, say, with a career record compiled in a higher-offense environment might appear to have declined due to the run environment, when really he improved relative to the rest of the league.
We can account for that. The Baseball Prospectus stat True Average adjusts for run environment so that league average is always .260. Before their high-slugging spring trainings, the Dewan Rule hitters had a collective .2725 TAv. Their collective TAv in the season following those springs was .2727. If you have to go to a fourth decimal place to see a difference, you know it’s not a meaningful one. Dewan Rule batters are essentially as productive after their high-SLG springs as they were before.
To be fair, Dewan doesn’t make any claims about overall productivity; he just predicts higher slugging percentages. (Although if the hitters the Rule identifies aren’t improving across the board, it’s hard to see how identifying them beforehand helps.) So we era-adjusted their slugging percentages, too. First we took the weighted difference between each player’s SLG and the league SLG (with pitchers excluded) over his career prior to the high-SLG spring training. Then we took the difference between his slugging percentage and what the league slugged in the regular season following that spring training. Finally, we took the difference between those two differences. Full results are here.
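The “difference of differences” is simple arithmetic. The sketch below shows the idea for a single invented hitter; the function name is hypothetical, the league figures are the overall .425 and .414 marks cited earlier (not the pitchers-excluded, weighted values used in the study), and it illustrates the concept rather than reproducing the actual calculation.

```python
def relative_slg_change(career_slg, league_slg_career, season_slg, league_slg_season):
    """Change in a hitter's SLG margin over the league, before vs. after."""
    before = career_slg - league_slg_career   # career margin over league
    after = season_slg - league_slg_season    # post-spring margin over league
    return after - before


# Invented hitter: .450 career SLG in a .425 league, then .445 in a .414 league.
raw_change = 0.445 - 0.450
adjusted_change = relative_slg_change(0.450, 0.425, 0.445, 0.414)
print(f"raw: {raw_change:+.3f}, era-adjusted: {adjusted_change:+.3f}")
```

Here the hitter’s raw slugging falls by five points, yet his margin over the league grows by six, so the adjustment flips an apparent decline into a small relative improvement.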
Even after adjusting for league slugging percentage (which, again, the Dewan Rule doesn’t specify as a necessary step), the results revealed nothing of use. Of the 218 Dewan Rule batters, 112 improved (51.4 percent). Before their hot springs, the group as a whole slugged 24 points higher than the league. In the seasons after their hot springs, the group slugged 22 points higher than the league.
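One way to gauge how far 112 improvers out of 218 sits from a coin flip is an exact binomial test. This is a sketch added for illustration, not a calculation from the original study; the function name is hypothetical, and it assumes each player’s outcome is independent of the others.

```python
from math import comb


def two_sided_coin_flip_p(successes, n):
    """Exact two-sided binomial p-value against a fair coin (p = 0.5)."""
    # With p = 0.5 the distribution is symmetric, so double the larger-side tail.
    k = max(successes, n - successes)
    upper_tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * upper_tail)


p_value = two_sided_coin_flip_p(112, 218)
print(f"p-value: {p_value:.2f}")
```

The resulting p-value is far above any conventional significance threshold, which is consistent with a 51.4 percent rate being indistinguishable from chance in a sample this size.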
Why doesn’t the Dewan Rule work? Really, you’d sort of expect it to: if you outslug your career stats, it stands to reason that you might be on the upward part of the aging curve, and expected to improve anyway. But compounding the sample-size issue, there are other potentially confounding considerations that may affect the utility of spring training statistics. For instance, spring training hitters face pitchers with talent levels ranging from low minors to major-league quality. Those pitchers are further away from mid-season form, and they may not always be utilizing their full repertoires when facing the batter.
In addition, the population of batters who get at least 40 at-bats may well lean toward players on the active roster bubble who are riding the bus to all the away games as they try to turn their past cups of coffee into Opening Day roster spots. Since those players already have 200 at-bats in the majors, they may be on their way out of the league—until they have a fluky spring, their manager makes too much of it, and they get big-league playing time, at which point they decline. Dewan’s goal of trying to find predictive value in spring training statistics is good, and there may be ways to achieve it. However, the Dewan Rule is not one of those ways.
So why has the Dewan Rule lasted so long, despite such lackluster results? Well, it claims to be based (and no doubt is based) on some sort of research, which allays a lot of skepticism. It’s likely that the Dewan Rule seemed to work for whatever sample Dewan originally studied, but doesn’t work as well out of sample (much like BP’s “Secret Sauce” for predicting postseason success, which we retired in 2010). And the Dewan Rule is a good source of material for writers in search of something interesting to say about spring training, which means it self-propagates and provides its own publicity.
Maybe more importantly, it’s easy to remember the Dewan Rule’s successes while forgetting its failures. Pick any sample of spring training hitters—even a randomly selected sample—and some of its members are bound to improve (just as some young pitchers are bound to get hurt or regress, which can make the Verducci Effect appear predictive despite its being debunked). Yes, you can use the Dewan Rule group as a starting point and refine it from there—maybe disregarding the old guys or focusing on hitters whom you thought would improve anyway—but you could do that with any same-sized group of hitters, spring training slugging aside, and you’d be just as likely to predict improvements correctly.
Of course, some hitters who have hot springs really are about to break out—most notably Bautista—but spring slugging percentage alone is not a reliable predictor of improvement. To divine any deeper significance, you’d have to combine it with scouting information, or knowledge about a meaningful change in approach—Bautista’s revamped swing, for example. Dewan’s list doesn’t do that, which means it’s time to retire the Rule.
Thanks to Ryan Lind for invaluable research assistance.