We’re about a month into the season, which means it’s time for Deserved Run Average (DRA) to go live on BP’s stats page.
DRA is our pitcher run estimator that credits pitchers for the runs they probably deserved to give up, rather than the runs “charged” to them by the official scorer. Because player responsibility is no longer an all-or-nothing proposition on each play, DRA better reflects the most likely (average) contribution made by pitchers to run prevention. DRA is indexed to RA9 (as opposed to ERA) and follows the distribution of RA9 over the course of the season (as opposed to other estimators, which tend to merely center themselves at the same place, and then artificially constrain the variance among pitchers).
As usual, we improved the architecture this offseason. The difference between pitchers you recognize tends to be minimal, but we’re always on the lookout for ways to address outliers who exhibit quirks the system did not previously anticipate. This year, we simplified things quite a bit by moving to a true multinomial structure and by condensing what had been 23 models into 10. Within a week or two, we’ll publish another article discussing the core functions behind DRA, as well as the R code behind them.
DRA has several new features this season, a few of which are particularly notable.
Pitch Differentiation Makes its DRA Debut
The first new feature is the incorporation of one of our tunneling-related metrics. DRA already incorporates, when appropriate, factors like temperature, framing, called strike probability, pitch classifications, and play-by-play adjustments for opponent and park. Last year, however, we began to dig down into the “tunneling” effect some pitchers use to disguise their pitches.
This offseason, we investigated how DRA might benefit from the effect of those new metrics. We found, when controlling for the platoon effect, that “plate distance”—the distance between successive pitches as they approach the plate—helps explain pitcher strikeout rate. Our strikeout model therefore now controls for a pitcher’s ability to conceal differences between pitches in this way. The effect of plate distance is numerically dwarfed by factors like whiff rate and strikeout pitch velocity, but struck us as potentially notable nonetheless.
Uncertainty, At Last
The second feature is far more important; indeed, it may be one of the most important features we have ever introduced at this website. As of today, and going forward, we are providing uncertainty estimates for DRA. This means we will now be able to quantify uncertainty not only for DRA, but for pitcher win values (PWARP) also. The uncertainties are calculated through our Bayesian Bootstrap procedure previously disclosed in the context of catcher framing.
In plain English, the “uncertainty” reflects the fact that each pitcher generates a collection of events, such as strikeouts, home runs, walks, and the like over the course of a season. Since every pitcher can cause any of these events to happen, and we are never entirely sure which combination of those events is the “best” reflection of the pitcher’s contribution, the uncertainty summarizes the different possible combinations which might reflect how the pitcher most likely performed.
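To make the idea concrete, here is a minimal sketch of a generic Bayesian bootstrap, assuming a simple setup in which each plate appearance has a single run value. It is not the procedure behind DRA, whose details are not given here. The function name `bayesian_bootstrap` and the run values are hypothetical. Each draw re-weights the same observed events with random weights, and the spread of the re-weighted averages serves as the uncertainty.

```python
import random
import statistics

def bayesian_bootstrap(values, n_draws=2000, seed=0):
    """Return posterior draws of the mean of `values` via the Bayesian bootstrap."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n_draws):
        # Dirichlet(1, ..., 1) weights: normalize independent Exponential(1) draws.
        raw = [rng.expovariate(1.0) for _ in values]
        total = sum(raw)
        draws.append(sum(w * v for w, v in zip(raw, values)) / total)
    return draws

# Hypothetical run values for one pitcher's plate appearances (illustrative only).
pa_run_values = [-0.27, -0.27, 0.45, -0.25, 1.40, -0.27, 0.30, -0.25, -0.27, 0.75]

draws = bayesian_bootstrap(pa_run_values)
estimate = statistics.mean(draws)   # central estimate across re-weightings
spread = statistics.stdev(draws)    # the "SD" that summarizes the uncertainty
print(f"mean run value per PA: {estimate:.3f}, SD: {spread:.3f}")
```

Each set of weights amounts to a different view of which events are most indicative of the pitcher, which is the intuition described above.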
Let’s start with DRA. Here are the top 2017 pitchers by innings pitched, along with their DRA and the uncertainty we have calculated around that DRA, which we describe with the “DRA SD” or standard deviation:
|Name||Team(s)||IP||DRA||DRA SD|
|Verlander, Justin||DET, HOU||206.0||3.85||0.23|
|Darvish, Yu||TEX, LAN||186.7||3.36||0.27|
How should you interpret this new information? The easiest way is to see a player’s most likely DRA as being the indicated DRA number, “plus or minus” the standard deviation. So, Chris Sale’s DRA for 2017 was 2.22 runs per nine innings, plus or minus .13, or, to do the math, between 2.09 and 2.35. In other words, he was indisputably damn good. (Fortunately, we have advanced statistics to make things like this clear.)
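The “plus or minus” rule of thumb is simple arithmetic, sketched below. The helper name `one_sd_range` is hypothetical. The floor at zero is an assumption drawn from the note later in this piece that the range cannot extend below zero.

```python
def one_sd_range(dra, dra_sd):
    """Return (low, high) for DRA plus or minus one SD, floored at zero."""
    low = max(0.0, dra - dra_sd)
    high = dra + dra_sd
    return round(low, 2), round(high, 2)

# Sale's 2017 figures as quoted in the text: DRA 2.22, DRA SD 0.13.
print(one_sd_range(2.22, 0.13))  # (2.09, 2.35)
```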
What is the relationship, then, between the DRA we assign for Sale (2.22) and the range provided by the DRA_SD on either side of it? It means that the most likely DRA for Sale is 2.22 (at least, to two decimal places). But there are a bunch of points surrounding 2.22 that are almost as likely as 2.22. Taken together, the interval spanning one standard deviation on either side of that DRA is the most likely range of runs per nine innings that Sale deserved to allow. That range covers most of the probability space: at least 70 percent of the plausible DRA values can be found there.
By contrast, early on during the season we would expect these uncertainty estimates to be somewhat wider. Indeed, they are about twice as wide, although still quite narrow, all things considered. Here are some of the top 2018 pitchers by innings pitched through Saturday, along with their DRA and DRA SD:
As of the end of April, this Kluber guy is pitching extremely well, but his results so far could, depending on which plate appearances you view as more indicative, reflect either an ace or, on the low end, more of a no. 2 starter or high-end no. 3. Patrick Corbin and Max Scherzer, on the other hand, have been aces in every sense of the term. All of these numbers will change, particularly for those pitchers in the 1.00s, but we can already say that while there is still plenty of uncertainty after one month of the season, it’s not at all uncertain that most of these pitchers have pitched very well.
You’ll note that while innings pitched (i.e., sample size) is a strong driver of player variation, it’s not the only factor. Different pitchers have different levels of volatility. Compare, for example, the difference in uncertainty between Corbin and Carlos Martinez or Stephen Strasburg. It’s useful to think about certain pitchers having much higher uncertainty around their DRA than other pitchers, in some cases more than three times as much! This reflects another benefit of incorporating uncertainty into player evaluation: you can appreciate not just the average outcome on paper for your roster, but the effect of adding “high-variance” players, too.
Of course, because pitcher Wins Above Replacement Player (PWARP) is based on DRA, this means that PWARP can now be assigned uncertainty intervals also. Instead of arguing about whether pitchers with similar win values are truly different, readers like Bill James can now verify whether those players are in fact pretty much the same. If the PWARP ranges for two players overlap, then you have at least a good statistical answer to that question.
Here is another table featuring the most productive two pitchers in each league in 2017, this time with pitcher win values and win value uncertainties added:
|Name||Team(s)||IP||DRA||DRA SD||PWARP||PWARP SD|
A quick glance shows two pitchers in each league with seemingly similar DRAs (and the AL pitchers perhaps benefiting from being in the tougher league). Currently, readers could guess that each set of two players are probably close to the same, but it would just be that: a guess. Not anymore. The SD provided for each DRA confirms that in fact both sets of players are comfortably in the same range relative to each other. But what does that mean for their win value, as this introduces yet another factor, the respective innings pitched?
Once again, that is now directly estimated. The ranges for Sale and Kluber overlap just slightly: Sale could be as low as 7.7 wins, with Kluber as high as 7.73. But it’s not much of an overlap and you would be justified saying Sale was almost certainly more valuable. There is no such ambiguity, however, with Scherzer and Kershaw. Between Scherzer’s superior performance and clear lead in innings, there really is no comparison. Kershaw was terrific, but Scherzer was clearly the most valuable in 2017.
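The overlap check can be written in a line or two. This sketch uses only the two endpoints quoted above, and the function name `overlap_width` is hypothetical.

```python
def overlap_width(higher_low, lower_high):
    """Width of the overlap between two ranges, or 0.0 if they do not overlap.

    higher_low: bottom of the range for the pitcher with the higher PWARP.
    lower_high: top of the range for the pitcher with the lower PWARP.
    """
    return max(0.0, lower_high - higher_low)

# Endpoints quoted in the text: Sale as low as 7.7, Kluber as high as 7.73.
width = overlap_width(higher_low=7.7, lower_high=7.73)
print(f"overlap: {width:.2f} wins")
```

A width of zero would mean the two ranges are fully separated, which is the situation described for Scherzer and Kershaw.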
Of course, people already make these comparisons when they fill out their Cy Young ballots. What’s new about DRA and PWARP this year is that these answers are now being directly estimated.
One feature that many fans enjoy is the ability to view “splits” of a statistic: the differing performance between pitchers playing at home or away, or versus left- or right-handed batters. In the past, incorporating splits into DRA has been a difficult thing for us to do, and we have declined to wade very far into it.
One of the many benefits of this year’s revision is that we can create virtually any split we want. For now, we’ve started with home-away and batter-handedness splits, which are available for both DRA and cFIP. We’re open to any suggestions for other splits people find helpful and for ways to help present them, in tables or otherwise.
DRA Runs Table
The DRA Runs table is our continuing effort to put the most important DRA-related information in one place. The DRA Runs table is available as a pull-down when you click on the “Stats/Tools” toolbar at the top of the BP home page. In the past, we’ve tried a few different formats, alternately hearing from readers that we were offering too much or not enough information.
We hope this year’s format is a good compromise. Much of the information is familiar: innings pitched, DRA, RA9, and difference between DRA and RA9. This season we’ve added the SD for both DRA and PWARP. We’ve also restored a category that many people missed last year, Framing Runs: the number of runs each pitcher has gained or lost from the combination of catchers, umpires, and batters they have faced over the course of the season. To keep things consistent, those are direct inputs from our framing model. Negative is always good (a run prevented), while positive (a run yielded) is unfavorable.
There are three more categories that have always been there, but which people have not used very much: NIP Runs, Hit Runs, and Out Runs. This is a shame, because I think they may be the key to understanding how DRA reaches the conclusions it does, and key to understanding how pitchers generally are and are not successful. All of our flagship models fit into one of those categories: Strikeouts, walks, and hit batters are “Not in Play” runs (“NIP”); singles, doubles, triples, home runs, and reaching on errors are considered “Hit Runs”; and finally “Out Runs” are the number of runs gained by the number of outs credited to a pitcher on balls put into play.
All of these numbers are relative to the average rate for each category, multiplied by the event run value and the number of batters faced. You’ll also notice that it’s extremely difficult to be meaningfully above average in all three categories. In other words, the very best pitchers excel in NIP Runs and Hit Runs but are often “below average” in Out Runs. By contrast, pitchers who are above average in Out Runs tend to not only be below average in general, but also do poorly in NIP Runs and Hit Runs. Think about why that is, and I think you will gain a better understanding of how it is that good pitchers succeed, and why unsuccessful pitchers do not. (Hint: the answer is not BABIP.)
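As a rough sketch of that arithmetic, the function below multiplies a rate difference by an event run value and by batters faced. All of the inputs are hypothetical and chosen only for illustration. In particular, the strikeout run value is an assumption, and the rates here are plain numbers rather than outputs of the DRA models.

```python
def runs_vs_average(pitcher_rate, league_rate, run_value, batters_faced):
    """Runs relative to average for one event type; negative means runs prevented."""
    return (pitcher_rate - league_rate) * run_value * batters_faced

# All inputs below are hypothetical and chosen only to show the arithmetic.
k_runs = runs_vs_average(
    pitcher_rate=0.30,   # pitcher strikes out 30% of batters faced
    league_rate=0.22,    # league-average strikeout rate
    run_value=-0.27,     # a strikeout lowers expected runs, so its value is negative
    batters_faced=800,
)
print(f"strikeout runs vs. average: {k_runs:.2f}")
```

The sign follows the table's convention: an above-average strikeout rate produces a negative number, meaning runs prevented.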
All of these updates for now only apply from 2008 to the present. We’ll be rolling them out to the minor leagues and previous major-league years in short order.
For now, these improvements also apply to pitchers only. With some luck, they will also extend to batters in time.
As always, we appreciate your feedback and suggestions.
 We have chosen standard deviation as the means of measuring the uncertainty here because readers are generally familiar with the concept and because it explains an easy-to-understand concept: the most likely DRA for the player lies within one standard deviation of either side of the actual estimated DRA. But sophisticated readers should remember that RA9, to which DRA is scaled, is more of a gamma than a normal distribution, so the tricks typically employed for normal distributions (symmetric intervals, defined quantiles of multiple standard deviations, and such) are not necessarily true here. The only safe way to get a defined interval of use for DRA (or RA9, really) is to take quantiles (or a highest density interval) from samples of that distribution.
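 For readers who want to see the difference, here is a minimal sketch that takes quantiles and a highest density interval from samples. The samples are drawn from a gamma distribution whose shape and scale are arbitrary assumptions, standing in for draws of a pitcher's DRA. The function names are hypothetical.

```python
import random
import statistics

rng = random.Random(42)
# Hypothetical gamma-distributed samples standing in for DRA draws
# (shape and scale chosen only for illustration, not fitted to anything).
samples = sorted(rng.gammavariate(4.0, 0.5) for _ in range(20000))

def quantile_interval(sorted_xs, mass=0.70):
    """Equal-tailed interval holding `mass` of the samples."""
    n = len(sorted_xs)
    tail = (1.0 - mass) / 2.0
    return sorted_xs[round(tail * n)], sorted_xs[round((1.0 - tail) * n) - 1]

def highest_density_interval(sorted_xs, mass=0.70):
    """Shortest interval holding `mass` of the samples."""
    n = len(sorted_xs)
    k = round(mass * n)  # number of samples the interval must span
    start = min(range(n - k + 1), key=lambda i: sorted_xs[i + k - 1] - sorted_xs[i])
    return sorted_xs[start], sorted_xs[start + k - 1]

mean = statistics.fmean(samples)
sd = statistics.stdev(samples)
qi = quantile_interval(samples)
hdi = highest_density_interval(samples)

print(f"mean +/- SD:       {mean - sd:.2f} to {mean + sd:.2f}")
print(f"70% quantiles:     {qi[0]:.2f} to {qi[1]:.2f}")
print(f"70% highest dens.: {hdi[0]:.2f} to {hdi[1]:.2f}")
```

 Because the distribution is skewed to the right, the three intervals do not coincide, and the highest density interval is the narrowest of the 70 percent intervals.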
 For pitchers with only a few innings and extraordinary results, sometimes the SD “extends” below zero under the rule of thumb I’ve provided. That obviously is not possible, so in that situation understand that the range of possibilities truncates close to zero. That said, the same principle nonetheless applies: at least 70 percent of the probability space will still fall within that truncated lower interval and the complete higher interval. As a practical matter, such pitchers have virtually nowhere to go but up, and nobody should read much into a performance reflecting three innings of work.
 Player variance in baseball sometimes gets caught up in discussions about so-called “stabilization” and “true talent.” We don’t find either of those concepts particularly helpful, although others may find them applicable in some way.