Justin Verlander has been through an interesting few years. How interesting, exactly?
Using Deserved Run Average (DRA), our new metric to describe pitcher performance here at Baseball Prospectus, we can track the trend. Because we want to evaluate Verlander across several seasons, we’ll also go one step further and use DRA–. DRA– is based on DRA, but is normalized to an average of 100 for each season, with lower being better. This allows you to compare pitchers across different seasons and different run-scoring environments.
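To make the idea of an index centered on 100 concrete, here is a minimal sketch. It assumes DRA– is simply a pitcher's DRA divided by the league-average DRA for that season, times 100; the published calculation may involve more than that. The function name `dra_minus` and the sample numbers are hypothetical.

```python
def dra_minus(dra: float, league_avg_dra: float) -> int:
    """Index a DRA against the league average: 100 is average, lower is better.

    Assumption: a plain ratio to the league average. This is an illustration,
    not the published DRA- formula.
    """
    if league_avg_dra <= 0:
        raise ValueError("league_avg_dra must be positive")
    return round(100 * dra / league_avg_dra)


# Hypothetical numbers: a 3.00 DRA in a season where the league averages 4.00.
print(dra_minus(3.00, 4.00))
# A pitcher who matches the league average lands on 100 by construction.
print(dra_minus(4.00, 4.00))
```

Because the index is a ratio to that season's average, a 75 means the same thing in a high-offense year as in a low-offense one, which is what makes comparisons across seasons possible.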
Now that we’ve got our scorecard, let’s look at Verlander’s recent seasons.
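| Year | IP | DRA– | DRA– Rank | DRA– Historical Rank |
| --- | --- | --- | --- | --- |
| 2011 | 251.3 | 44 | 1 | 2 |
| 2012 | 237.7 | 45 | 1 | 4 |
| 2013 | 218.3 | 89 | 24 | 1684 |
| 2014 | 206 | 101 | 55 | 3453 |
| 2015 | 133.3 | 62 | 4 | 95 |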
(All ranks refer to pitchers with at least 162 innings pitched, except for 2015, which ranks pitchers with at least 125 innings pitched).
What you see is that Verlander deservedly won the AL Cy Young award in 2011 and should have won it in 2012.[1] Both seasons were historic performances. Among qualified pitchers, Verlander’s 2011 season was the second-best by DRA– over the last six decades in baseball. His 2012 season was the fourth-best over that same period.[2] Along with the 1999–2000 seasons of Pedro Martinez, Verlander’s 2011–2012 stretch is the most dominant two-year run by a pitcher in modern major-league history.
Of course, things started to head south for Verlander in 2013, shortly after he signed a huge contract extension. In 2013, his DRA– of 89 was merely above-average. His 2014 season was even worse, plunging to average-level performance, 55th among qualified starters. The Tigers, a team that already appeared to be on the last legs of a championship stretch, looked to have made a mistake in extending Verlander.
And then there was 2015. Verlander started the year out hurt, so he pitched only 133 innings and did not even qualify for the ERA title. On the surface, Verlander’s performance was nothing special. Among starters with at least 125 innings, his ERA ranked 30th, with his FIP in agreement at 31st.
But then last week I received a rather interesting tweet:
Raise your hand if you thought Justin Verlander would have the lowest DRA of all AL starters! @bachlaw is the gap mostly catcher-related?
— Benjamin Drozdoff (@drozbaseball) October 4, 2015
That surprised me. Verlander ranking that highly in DRA? Above the AL Cy Young candidates and behind only the NL Big Three of Kershaw, Greinke, and Arrieta? Really? Well, let’s go back to the bottom line of the chart above. Verlander not only ranked fourth in baseball among pitchers with at least 125 IP, but also finished in the top 100 of all such pitchers since 1953. Holy smokes! How is this possible? As we’ll see, Verlander was terrific in 2015, and much better than his ERA made him look. We’ll talk more about it in a moment.
But Benjamin’s question raises a larger issue that is the driving force behind this article. I get tweets like that one a lot: “This player’s DRA surprises me; can you explain why DRA says this?” As those of you who follow me know, I consistently respond. But the bigger issue is the perception of some (not necessarily Benjamin) that DRA is a black box only a select few can understand. Admittedly, the adjustments DRA makes for pitcher externalities — catcher framing, team defense, temperature, and the like — can be complicated. But the truth is that those adjustments account for only a small amount of a pitcher’s DRA. The vast majority of DRA is driven by plain old linear weights allowed, and in sabermetrics it doesn’t get much simpler than that.
The bulk of DRA’s work is performed by a linear mixed model, which, stripped to its core, follows this overall formula:
lwts = externalities + participants
Here, “lwts” is an abbreviation for linear weights; we’ve already talked about the “externalities”; and “participants” are the people we care about on the field: pitchers, catchers, batters, etc.
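For readers who like to see the shape of such a model written out, here is one conventional way to express a linear mixed model of this kind. Treat it as a sketch under assumptions: the symbols, the choice of which participants get a term, and the normal distributions are standard textbook notation, not the published DRA specification.

```latex
\begin{aligned}
\text{lwts}_i &= \underbrace{\mathbf{x}_i^{\top}\boldsymbol{\beta}}_{\text{externalities}}
  + \underbrace{u^{\text{pit}}_{p(i)} + u^{\text{bat}}_{b(i)} + u^{\text{cat}}_{c(i)}}_{\text{participants}}
  + \varepsilon_i, \\
u^{\text{pit}}_{p} &\sim \mathcal{N}\!\left(0, \sigma_{\text{pit}}^{2}\right), \quad
u^{\text{bat}}_{b} \sim \mathcal{N}\!\left(0, \sigma_{\text{bat}}^{2}\right), \quad
u^{\text{cat}}_{c} \sim \mathcal{N}\!\left(0, \sigma_{\text{cat}}^{2}\right), \quad
\varepsilon_i \sim \mathcal{N}\!\left(0, \sigma^{2}\right).
\end{aligned}
```

Here $i$ indexes plate appearances, $\mathbf{x}_i$ holds the externalities for that plate appearance, and $p(i)$, $b(i)$, and $c(i)$ pick out the pitcher, batter, and catcher involved. Each participant's effect is drawn from a distribution centered on zero, which is what pulls estimates for small samples back toward the league average.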
Let’s talk about linear weights in particular. Most of you have heard of them: Linear weights are the average values of different events in baseball. So, home runs are consistently worth about 1.4 runs, single outs are worth about -.3 runs, and so on. If you’re not familiar with individual values, you still almost certainly know about pitcher statistics like FIP (which uses linear weights to measure home runs, strikeouts, walks, and hit batsmen) and batter statistics like weighted On-Base Average (wOBA) or True Average (TAv), our park-adjusted equivalent here at Baseball Prospectus. In fact, virtually everyone in the sabermetric community prefers wOBA or TAv to describe batter performance. So, ask yourself this: if wOBA / TAv are the standard means of evaluating batters, shouldn’t the fundamental measure of pitcher value be the extent to which they limit batter wOBA / TAv?
Of course it should. The reason people have avoided this so far is probably the fear that wOBA / TAv would unfairly punish a pitcher for batting average on balls in play (BABIP), defense, stadiums, and other factors largely outside a pitcher’s control. DRA, of course, does control for those “externalities,” meaning that a pitcher’s linear weights allowed, as adjusted by DRA, become a reasonable measure of pitcher performance. So, if you have one takeaway from this article, please let it be this: The vast majority of a pitcher’s DRA is explained by his average linear weights allowed to opposing batters. This means that a pitcher who tends to allow singles rather than extra-base hits will have a lower DRA and is a better pitcher. This fact is obvious to people who watch baseball games, and DRA finally allows that common knowledge to be reliably applied in a sabermetric setting.
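A quick sketch shows why the mix of hits matters. The run values below are illustrative: the home-run figure follows the value quoted earlier, an out is treated as costing the offense about 0.3 runs, and the remaining values are placeholder assumptions of mine. The two pitcher lines are invented so that both allow the same number of hits and walks over 100 batters.

```python
# Illustrative run values per event. The home-run value follows the figure
# quoted in the article; the out is treated as a cost of about 0.3 runs to the
# offense; the other values are placeholder assumptions.
RUN_VALUES = {
    "out": -0.3,
    "walk": 0.3,
    "single": 0.5,
    "double": 0.8,
    "triple": 1.1,
    "home_run": 1.4,
}


def lwts_per_pa(events: dict[str, int]) -> float:
    """Average linear-weights run value allowed per plate appearance."""
    total_pa = sum(events.values())
    if total_pa == 0:
        raise ValueError("no plate appearances")
    total_runs = sum(RUN_VALUES[event] * count for event, count in events.items())
    return total_runs / total_pa


# Two invented pitchers: same outs, walks, and total hits over 100 batters,
# but one allows mostly singles and the other allows more extra-base hits.
singles_pitcher = {"out": 70, "walk": 5, "single": 22, "double": 2, "home_run": 1}
slugging_pitcher = {"out": 70, "walk": 5, "single": 13, "double": 7, "home_run": 5}

print(f"singles pitcher:  {lwts_per_pa(singles_pitcher):+.3f} runs per PA")
print(f"slugging pitcher: {lwts_per_pa(slugging_pitcher):+.3f} runs per PA")
```

Both pitchers would show the same batting average against, yet the singles pitcher allows less run value per batter, which is the distinction linear weights are built to capture.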
You can verify this relationship for yourself. Whenever you want to understand a pitcher’s DRA, start by pulling up our pitcher DRA Sortable Stats page, click on the “Statistics Selection” link at the top, and add “TAv” (which is True Average) to the list on the right side. After you hit the button at the bottom saying “Submit Statistics Selection,” TAv will be added to the pitcher DRA table. If you’re feeling lazy, just click here, where I’ve done that for you and also trimmed out a few unnecessary columns.
If you do this, you’ll notice a trend. Let’s look at pitchers from 2015 with at least 125 innings, ranked in ascending order of DRA / DRA–:
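| Pitcher | IP | DRA | DRA– | TAv |
| --- | --- | --- | --- | --- |
|  | 232.7 | 2.16 | 50 | 0.194 |
|  | 222.7 | 2.17 | 51 | 0.191 |
|  | 229 | 2.31 | 54 | 0.187 |
| Justin Verlander | 133.3 | 2.65 | 62 | 0.231 |
|  | 228.7 | 2.77 | 65 | 0.220 |
|  | 208 | 2.77 | 65 | 0.217 |
|  | 232 | 2.78 | 65 | 0.209 |
|  | 220.3 | 2.89 | 67 | 0.227 |
| Jacob deGrom | 191 | 3.03 | 71 | 0.224 |
|  | 212 | 3.09 | 72 | 0.228 |
|  | 212.3 | 3.15 | 73 | 0.243 |
|  | 205.3 | 3.16 | 74 | 0.246 |
|  | 129.7 | 3.28 | 77 | 0.213 |
|  | 208 | 3.31 | 77 | 0.228 |
|  | 222 | 3.31 | 77 | 0.225 |
|  | 154 | 3.34 | 78 | 0.239 |
|  | 129.7 | 3.39 | 79 | 0.245 |
|  | 181 | 3.4 | 79 | 0.229 |
|  | 208.7 | 3.4 | 79 | 0.234 |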
You can see that while the numbers jump back and forth a bit, by and large a pitcher’s DRA corresponds with his True Average allowed (TAv). In fact, if you correlate DRA / DRA– with True Average, you get a correlation of .93.[3] Square that figure and you find that roughly 85 percent of the variability in pitcher DRA can be accounted for simply by looking at True Average. True Average allowed and DRA are not the same thing, but they both rely on linear weights, and so True Average will provide you with a very good idea of where DRA is coming from.
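If you want to try this kind of check yourself, the sketch below shows the arithmetic. It implements an innings-weighted Pearson correlation in plain Python and applies it to five rows copied from the 2015 table above. With only five rows the result will not match the full-league figure; the point is the method. The function name `weighted_pearson` is my own, and this is not the code used to produce the published correlation.

```python
import math


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y, with each observation weighted by w."""
    if not (len(x) == len(y) == len(w)) or len(x) < 2:
        raise ValueError("x, y, and w need the same length, at least 2")
    total = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# (IP, DRA, TAv) for five rows of the 2015 table; a small sample for illustration.
rows = [
    (232.7, 2.16, 0.194),
    (133.3, 2.65, 0.231),
    (191.0, 3.03, 0.224),
    (212.3, 3.15, 0.243),
    (181.0, 3.40, 0.229),
]
ip, dra, tav = zip(*rows)

r = weighted_pearson(tav, dra, ip)
print(f"innings-weighted r on this sample: {r:.2f}")

# Squaring a correlation gives the share of variance it accounts for.
print(f"share of variance for r = .93: {0.93 ** 2:.2f}")
```

Weighting by innings keeps a 130-inning season from counting as much as a 230-inning one, which matters when the sample mixes full-time starters with pitchers who missed time.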
Of course, DRA and True Average don’t always agree, which is because of that other 15 percent that makes up a pitcher’s final DRA. That 15 percent would be the externalities in the equation above, including temperature, catcher framing, opposing batter quality, and stadiums. Those factors won’t turn Clayton Kershaw into Jason Marquis, or Kyle Kendrick into Corey Kluber, but they do help you distinguish between pitchers who allow similar levels of batter productivity, but are sometimes doing so under very different conditions.
The effects of these externalities are published on our DRA Runs table. DRA Runs estimates, from 1998 through the present time, how each factor has impacted each pitcher who has taken the mound. DRA Runs does not contain every adjustment, but it contains those factors which most likely explain why a pitcher with a higher TAv allowed can still leapfrog a nearby pitcher to end up with a lower DRA.
Which (finally) brings us back to Justin Verlander. When you look at the chart of 2015 pitchers above, Verlander’s TAv allowed ranks 14th. This explains about half of the difference in ranking from his ERA (which put him 30th), but not all of it. Verlander suppressed batters well, but his batter productivity is more comparable to Chris Archer or Chris Sale (still pretty good company) than Zack Greinke or Max Scherzer.
And so, we head to the DRA Runs table to see what made the rest of the difference. There we see that Verlander dealt with poor framing and below-average defense, in all costing him about six runs more than he deserved. Compare that to David Price, who had essentially average sailing (one run undeserved), and Jacob deGrom, who got five free runs, and you get a sense of why Verlander’s DRA jumps up to the front of the pack.
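To get a feel for how much a handful of runs matters, you can spread them over a pitcher's innings. The sketch below is back-of-the-envelope arithmetic only: it is not how DRA itself converts adjustments to a per-nine scale, and it treats the innings figures from the table as plain decimals. The helper name `runs_per_nine` is hypothetical.

```python
def runs_per_nine(runs: float, innings: float) -> float:
    """Express a season run total as a rate per nine innings."""
    if innings <= 0:
        raise ValueError("innings must be positive")
    return 9 * runs / innings


# Run totals as described in the text; innings from the 2015 table.
# Positive means the pitcher was charged with runs he did not deserve.
verlander = runs_per_nine(6, 133.3)
degrom = runs_per_nine(-5, 191)

print(f"Verlander: {verlander:+.2f} runs per nine")
print(f"deGrom:    {degrom:+.2f} runs per nine")
print(f"Swing between them: {verlander - degrom:.2f} runs per nine")
```

Because Verlander threw comparatively few innings, six undeserved runs move his rate more than the same six runs would move a 220-inning starter's.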
So, to get back to Benjamin’s question, yes, Verlander’s 2015 DRA is in part due to poor catcher framing. But it is due much more to the fact that he held batters to a similar True Average as did Matt Harvey and Madison Bumgarner, while doing so under more challenging conditions. This doesn’t necessarily mean Justin Verlander is “back,” but it confirms that DRA, as usual, has a good reason for seeing him as having pitched better than his ERA suggests.
While I continue to welcome your questions, my greater hope is that this article will allow you to start answering these questions for yourselves. At the end of the day, it’s not just about getting the answer, but having confidence in the answer you reach.
[1] Our DRA Runs table estimates that David Price was gifted 7 runs from catcher framing and 5 runs from his mix of stadiums. Overall, Verlander deserved to give up 13 fewer runs than Price in 2012.
[2] Pedro 2000 is number 1 with a DRA– of 38. Maddux 1995 is essentially tied with Verlander 2011.
[3] .93 is the weighted Pearson correlation (using innings pitched) for all pitchers in 2015. The Spearman correlation (which tracks ranks) is .95.
However, DRA DOES NOT attempt to rank which pitchers were "actually better". It attributes all of the random variation to the pitcher. Over a small sample, if Clayton Kershaw and Jason Marquis have the same DRA (wOBA with other externalities accounted for), we should believe that Clayton Kershaw actually pitched better. Our guess would be that Kershaw was hurt by random variation, and that Marquis was helped. However, DRA does not try to make this step. This is fine, but then DRA should not claim that it describes what actually happened. DRA does not estimate true talent demonstrated. DRA is simply a wOBA allowed metric. It gives all of the random variation to the pitcher.
Remaining variance after controlling for externalities is assigned primarily to batters (who are much more relevant than pitchers to plate outcomes). Pitchers come in second, and the explainable remaining variance is then assigned to catcher and umpire on each play. The question of who gets assigned what is determined by a maximum likelihood function that tracks the context of each individual plate appearance.
Moreover, no one is "giving variance" to anybody. The random effects for players (including pitchers) are shrunk toward the grand mean for the league, which functions as a prior. Players who are assigned unique variance by the model, whether above or below the league mean, are much more likely to deserve it than they would a straight BABIP.
When we convert DRA to an RA/9 scale, pitchers certainly do hold the bag at the end of the day for otherwise-unexplained runs allowed, because that's how the rules work. Pitchers get charged for runs allowed. All we can do is try to make those adjustments as fair as possible. We do this in part by controlling for the externalities, and also by preserving the scale of their respective most likely responsibilities as we convert to runs allowed per 9 innings.