
Justin Verlander has been through an interesting few years. How interesting, exactly?

Using Deserved Run Average (DRA), our new metric to describe pitcher performance here at Baseball Prospectus, we can track the trend. Because we want to evaluate Verlander across several seasons, we’ll also go one step further and use DRA–. DRA– is based on DRA, but is normalized to an average of 100 for each season, with lower being better. This allows you to compare pitchers across different seasons and different run-scoring environments.
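As a rough illustration of that normalization, the sketch below assumes DRA– is simply a pitcher's DRA divided by the league-average DRA for the season, times 100. The function name and the league-average value of 4.29 are assumptions, back-solved from the 2015 figures shown later in this article; they are not published constants, and the real calculation may differ in its details.

```python
def dra_minus(dra: float, league_dra: float) -> float:
    """Scale DRA so that the league average equals 100 (lower is better)."""
    if league_dra <= 0:
        raise ValueError("league_dra must be positive")
    return 100.0 * dra / league_dra


# Hypothetical 2015 league-average DRA (an assumption, not a published figure).
LEAGUE_DRA_2015 = 4.29

# Verlander's 2015 DRA of 2.65 lands at roughly 62 on this scale.
verlander_2015 = round(dra_minus(2.65, LEAGUE_DRA_2015))
print(verlander_2015)
```

Because each season is divided by its own league average, a 62 in a high-scoring year and a 62 in a low-scoring year mean the same thing relative to the league.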

Now that we’ve got our scorecard, let’s look at Verlander’s recent seasons.

Year   IP      DRA–   DRA– Rank (season)   DRA– Historical Rank (1953-present)
2011   251.3   44     1                    2
2012   237.7   45     1                    4
2013   218.3   89     24                   1684
2014   206     101    55                   3453
2015   133.3   62     4                    95

(All ranks refer to pitchers with at least 162 innings pitched, except for 2015, which ranks pitchers with at least 125 innings pitched).

What you see is that Verlander deservedly won the AL Cy Young award in 2011 and should have won it in 2012.[1] Both seasons were historic performances. Among qualified pitchers, Verlander’s 2011 season was the second-best by DRA– over the last six decades in baseball. His 2012 season was the fourth-best over that same period.[2] It is, along with the 1999–2000 seasons of Pedro Martinez, the most dominant two-year run by a pitcher in modern major-league history.

Of course, things started to head south for Verlander in 2013, shortly after he signed a huge contract extension. In 2013, his DRA– of 89 was merely above-average. His 2014 season was even worse, plunging to average-level performance, 55th among qualified starters. The Tigers — a team that already appeared to be on the last legs of a championship stretch — looked to have made a mistake in extending Verlander.

And then there was 2015. Verlander started the year out hurt, so he pitched only 133 innings and did not even qualify for the ERA title. On the surface, Verlander’s performance was nothing special. Among starters with at least 125 innings, his ERA ranked 30th, with his FIP in agreement at 31st.

But then last week I received a rather interesting tweet:

That surprised me. Verlander ranking that highly in DRA? Above the AL Cy Young candidates and behind only the NL Big Three of Kershaw, Greinke, and Arrieta? Really? Well, let’s go back to the bottom line of the chart above. Verlander ranked not only fourth in baseball among pitchers with at least 125 IP, but also finished in the top 100 of all such pitchers since 1953. Holy smokes! How is this possible? Well, what we’ll see is that Verlander was terrific last year, and much better than his ERA made him look. We’ll talk more about it in a moment.

But Benjamin’s question raises a larger issue that is the driving force behind this article. I get tweets like that one a lot: “This player’s DRA surprises me; can you explain why DRA says this?” As those of you who follow me know, I consistently respond. But the bigger issue is the perception of some (not necessarily Benjamin) that DRA is a black box only a select few can understand. Admittedly, the adjustments DRA makes for pitcher externalities — catcher framing, team defense, temperature, and the like — can be complicated. But the truth is that those adjustments account for only a small amount of a pitcher’s DRA. The vast majority of DRA is driven by plain old linear weights allowed, and in sabermetrics it doesn’t get much simpler than that.

The bulk of DRA’s work is performed by a linear mixed model, which, stripped to its core, follows this overall formula:

lwts = externalities + participants

Here, “lwts” is an abbreviation for linear weights; we’ve already talked about the “externalities”; and “participants” are the people we care about on the field: pitchers, catchers, batters, etc.
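Written out a little more formally, a mixed model of this general shape might look like the following. This is a generic sketch, not the published DRA specification: the symbols (the covariate vector x, the coefficients β, the participant effects u, and the residual ε) are assumptions introduced here for illustration, and the actual model may include different participants and terms.

```latex
\begin{aligned}
\mathrm{lwts}_i &=
  \underbrace{\mathbf{x}_i^{\top}\boldsymbol{\beta}}_{\text{externalities}}
  + \underbrace{u_{\mathrm{pitcher}(i)} + u_{\mathrm{batter}(i)} + u_{\mathrm{catcher}(i)}}_{\text{participants}}
  + \varepsilon_i, \\
u_{g(i)} &\sim \mathcal{N}\!\left(0, \sigma_g^{2}\right)
  \ \text{for each participant group } g,
  \qquad \varepsilon_i \sim \mathcal{N}\!\left(0, \sigma^{2}\right)
\end{aligned}
```

Here i indexes plate appearances. Treating each participant as a zero-mean random effect is what pulls players with little data toward the league average, rather than taking their raw results at face value.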

Let’s talk about linear weights in particular. Most of you have heard of them: Linear weights are the average values of different events in baseball. So, home runs are consistently worth about 1.4 runs, single outs are worth about -.3 runs, and so on. If you’re not familiar with individual values, you still almost certainly know about pitcher statistics like FIP (which uses linear weights to measure home runs, strikeouts, walks, and hit batsmen) and batter statistics like weighted On Base Average (wOBA) or True Average (TAv), our park-adjusted equivalent here at Baseball Prospectus. In fact, virtually everyone in the sabermetric community prefers wOBA or TAv to describe batter performance. So, ask yourself this: if wOBA / TAv are the standard means of evaluating batters, shouldn’t the fundamental measure of pitcher value be the extent to which they limit batter wOBA / TAv?
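To make the idea concrete, here is a minimal sketch of averaging linear weights over the plate appearances a pitcher allows. Only the home-run and out values come from the round figures above; the other run values, the event names, and both sample pitchers are illustrative assumptions, not official weights or real stat lines.

```python
# Approximate run value of each event. "home_run" and "out" use the round
# figures quoted in the article; the rest are assumptions for illustration.
LINEAR_WEIGHTS = {
    "out": -0.3,
    "walk": 0.3,
    "single": 0.45,
    "double": 0.75,
    "triple": 1.05,
    "home_run": 1.4,
}


def avg_lwts_allowed(events: dict) -> float:
    """Average run value per plate appearance, given counts of each event."""
    total_pa = sum(events.values())
    if total_pa == 0:
        raise ValueError("no plate appearances")
    total_runs = sum(LINEAR_WEIGHTS[name] * count for name, count in events.items())
    return total_runs / total_pa


# Two hypothetical pitchers: 100 plate appearances and 20 hits allowed apiece,
# but one gives up mostly singles and the other more extra-base hits.
singles_pitcher = {"out": 72, "walk": 8, "single": 18, "double": 2, "triple": 0, "home_run": 0}
power_pitcher = {"out": 72, "walk": 8, "single": 10, "double": 6, "triple": 0, "home_run": 4}

print(avg_lwts_allowed(singles_pitcher), avg_lwts_allowed(power_pitcher))
```

Both hypothetical pitchers allow the same number of hits and walks, yet the one who gives up singles comes out with the lower (better) average, which is the distinction that hit totals alone miss.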

Of course it should. The reason people have avoided this so far is probably the fear that wOBA / TAv would unfairly punish a pitcher for batting average on balls in play (BABIP), defense, stadiums, and other factors largely outside a pitcher’s control. DRA, of course, does control for those “externalities,” so a pitcher’s linear weights allowed, as adjusted by DRA, become a reasonable measure of pitcher performance. So, if you have one takeaway from this article, please let it be this: The vast majority of a pitcher’s DRA is explained by his average linear weights allowed to opposing batters. This means that, all else being equal, a pitcher who tends to allow singles rather than extra-base hits will have a lower DRA and is a better pitcher. This fact is obvious to people who watch baseball games, and DRA finally allows that common knowledge to be reliably applied in a sabermetric setting.

You can verify this relationship for yourself. Whenever you want to understand a pitcher’s DRA, start by pulling up our pitcher DRA Sortable Stats page, click on the “Statistics Selection” link at the top, and add “TAv” (which is True Average) to the list on the right side. After you hit the button at the bottom saying “Submit Statistics Selection,” TAv will be added to the pitcher DRA table. If you’re feeling lazy, just click here, where I’ve done that for you and also trimmed out a few unnecessary columns.

If you do this, you’ll notice a trend. Let’s look at pitchers from 2015 with at least 125 innings, ranked in ascending order of DRA / DRA–:


NAME               IP      DRA    DRA–   TAv
Clayton Kershaw    232.7   2.16   50     0.194
Zack Greinke       222.7   2.17   51     0.191
Jake Arrieta       229     2.31   54     0.187
Justin Verlander   133.3   2.65   62     0.231
Max Scherzer       228.7   2.77   65     0.220
Sonny Gray         208     2.77   65     0.217
Dallas Keuchel     232     2.78   65     0.209
David Price        220.3   2.89   67     0.227
Jacob deGrom       191     3.03   71     0.224
Chris Archer       212     3.09   72     0.228
Cole Hamels        212.3   3.15   73     0.243
Shelby Miller      205.3   3.16   74     0.246
Jaime Garcia       129.7   3.28   77     0.213
Gerrit Cole        208     3.31   77     0.228
Corey Kluber       222     3.31   77     0.225
Masahiro Tanaka    154     3.34   78     0.239
Hisashi Iwakuma    129.7   3.39   79     0.245
Marco Estrada      181     3.40   79     0.229
Chris Sale         208.7   3.40   79     0.234

You can see that while the numbers jump back and forth a bit, by and large the DRA values for a pitcher correspond with their True Average allowed (TAv). In fact, if you correlate DRA / DRA– with True Average, you get a score of .93.[3] This means about 85 percent of the variability of a pitcher’s DRA can be accounted for simply by looking at their True Average. True Average allowed and DRA are not the same thing, but they both rely on linear weights, and so True Average will provide you with a very good idea of where DRA is coming from.
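If you want to run this kind of check yourself, the sketch below computes an innings-weighted Pearson correlation between DRA and TAv allowed. The function is a plain textbook implementation written for this example, not the code used for the article, and it is fed only the first ten rows of the table above, so it will not reproduce the league-wide .93 figure.

```python
import math


def weighted_pearson(x, y, w):
    """Pearson correlation of x and y, with each observation weighted by w."""
    total_w = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    cov = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    var_x = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    var_y = sum(wi * (yi - mean_y) ** 2 for wi, yi in zip(w, y))
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return cov / math.sqrt(var_x * var_y)


# (IP, DRA, TAv allowed) for the first ten pitchers in the 2015 table.
rows = [
    (232.7, 2.16, 0.194), (222.7, 2.17, 0.191), (229.0, 2.31, 0.187),
    (133.3, 2.65, 0.231), (228.7, 2.77, 0.220), (208.0, 2.77, 0.217),
    (232.0, 2.78, 0.209), (220.3, 2.89, 0.227), (191.0, 3.03, 0.224),
    (212.0, 3.09, 0.228),
]
ip = [r[0] for r in rows]
dra = [r[1] for r in rows]
tav = [r[2] for r in rows]

r_subset = weighted_pearson(dra, tav, ip)
print(f"subset r = {r_subset:.2f}, r squared = {r_subset ** 2:.2f}")

# Share of variance explained at the article's league-wide correlation of .93.
league_r_squared = 0.93 ** 2
```

Squaring the correlation is what turns .93 into a "share of variability" figure: .93 squared is a little over .86, in line with the roughly 85 percent quoted above.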

Of course, DRA and True Average don’t always agree, which is because of that other 15 percent that makes up a pitcher’s final DRA. That 15 percent would be the externalities in the equation above, including temperature, catcher framing, opposing batter quality, and stadiums. Those factors won’t turn Clayton Kershaw into Jason Marquis, or Kyle Kendrick into Corey Kluber, but they do help you distinguish between pitchers who allow similar levels of batter productivity, but are sometimes doing so under very different conditions.

The effects of these externalities are published on our DRA Runs table. DRA Runs estimates, from 1998 through the present time, how each factor has impacted each pitcher who has taken the mound. DRA Runs does not contain every adjustment, but it contains those factors which most likely explain why a pitcher with a higher TAv allowed can still leapfrog a nearby pitcher to end up with a lower DRA.

Which (finally) brings us back to Justin Verlander. When you look at the chart of 2015 pitchers above, Verlander’s TAv allowed ranks 14th. This explains about half of the difference in ranking from his ERA (which put him 30th), but not all of it. Verlander suppressed batters well, but his batter productivity is more comparable to Chris Archer or Chris Sale (still pretty good company) than Zack Greinke or Max Scherzer.
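That ranking is easy to confirm from the chart. The sketch below sorts the 19 pitchers listed above by TAv allowed and looks up Verlander's position; the variable names are simply choices made for this example.

```python
# TAv allowed for the 19 pitchers in the 2015 chart above.
tav_allowed = {
    "Clayton Kershaw": 0.194, "Zack Greinke": 0.191, "Jake Arrieta": 0.187,
    "Justin Verlander": 0.231, "Max Scherzer": 0.220, "Sonny Gray": 0.217,
    "Dallas Keuchel": 0.209, "David Price": 0.227, "Jacob deGrom": 0.224,
    "Chris Archer": 0.228, "Cole Hamels": 0.243, "Shelby Miller": 0.246,
    "Jaime Garcia": 0.213, "Gerrit Cole": 0.228, "Corey Kluber": 0.225,
    "Masahiro Tanaka": 0.239, "Hisashi Iwakuma": 0.245, "Marco Estrada": 0.229,
    "Chris Sale": 0.234,
}

# Lower TAv allowed is better, so sort ascending and find Verlander's place.
ranked = sorted(tav_allowed, key=tav_allowed.get)
verlander_rank = ranked.index("Justin Verlander") + 1
print(verlander_rank)  # 14
```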

And so, we head to the DRA Runs table to see what made the rest of the difference. There we see that Verlander dealt with poor framing and below-average defense, which together cost him about six runs more than he deserved. Compare that to David Price, who had essentially average sailing (one run undeserved), and Jacob deGrom (who got five free runs), and you get a sense of why Verlander’s DRA jumps up to the front of the pack.

So, to get back to Benjamin’s question, yes, Verlander’s 2015 DRA is in part due to poor catcher framing. But it is due much more to the fact that he held batters to a similar True Average as did Matt Harvey and Madison Bumgarner, while doing so under more challenging conditions. This doesn’t necessarily mean Justin Verlander is “back,” but it confirms that DRA, as usual, has a good reason for seeing him as having pitched better than his ERA suggests.

While I continue to welcome your questions, my greater hope is that this article will allow you to start answering these questions for yourselves. At the end of the day, it’s not just about getting the answer, but having confidence in the answer you reach.



[1] Our DRA Runs table estimates that David Price was gifted 7 runs from catcher framing and 5 runs from his mix of stadiums. Overall, Verlander deserved to give up 13 fewer runs than Price in 2012.

[2] Pedro 2000 is number 1 with a DRA– of 38. Maddux 1995 is essentially tied with Verlander 2011.

[3] .93 is the weighted Pearson correlation (using innings pitched) for all pitchers in 2015. The Spearman correlation (which tracks ranks) is .95.

brownsugar
10/12
Excellent summary of what SkyNet is doing under the hood, thanks.
consumerchad
10/12
Great article. Focusing on a single case study really helps to fill in the gaps I had in my understanding of DRA after reading the longer, more abstract explanations.
bobody
10/14
The problem we have here is the previous claim that DRA explains "what actually happened". What "actually happened" is that players demonstrated certain true talent levels throughout the season. If a pitcher over the season has, for example, a league average "wOBA prevention ability" (in modern MLB about a .313), we will expect him in a large enough sample with no externalities to have a .313 wOBA allowed. His actual observed BABIP will be affected by externalities (defense, batter quality, framing, weather, etc) AND by random variation (which can be significant due to the small sample). DRA attempts to account for the externalities. DRA ranks the pitchers based on wOBA allowed after accounting for the externalities. However, DRA DOES NOT attempt to rank which pitchers were "actually better". It attributes all of the random variation to the pitcher. Over a small sample, if Clayton Kershaw and Jason Marquis have the same DRA (wOBA with other externalities accounted for), we should believe that Clayton Kershaw actually pitched better. Our guess would be that Kershaw was hurt by random variation, and that Marquis was helped. However, DRA does not try to make this step. This is fine, but then DRA should not claim that it describes what actually happened. DRA does not estimate true talent demonstrated. DRA is simply a wOBA allowed metric. It gives all of the random variation to the pitcher.
bobody
10/14
In the first paragraph where I said BABIP i meant wOBA
bachlaw
10/15
I'm sorry, but this is not accurate. Remaining variance after controlling for externalities is assigned primarily to batters (who are much more relevant than pitchers to plate outcomes). Pitchers come in second, and the explainable remaining variance is then assigned to catcher and umpire on each play. The question of who gets assigned what is determined by a maximum likelihood function that tracks the context of each individual plate appearance. Moreover, no one is "giving variance" to anybody. The random effects for players (including pitchers) are shrunk toward the grand mean for the league, which functions as a prior. Players who are assigned unique variance by the model, whether above or below the league mean, are much more likely to deserve it than they would a straight BABIP. When we convert DRA to an RA/9 scale, pitchers certainly do hold the bag at the end of the day for otherwise-unexplained runs allowed, because that's how the rules work. Pitchers get charged for runs allowed. All we can do is try to make those adjustments as fair as possible. We do this in part by controlling for the externalities, and also by preserving the scale of their respective most likely responsibilities as we convert to runs allowed per 9 innings.
Pernellius
2/17
I love this article, it really helps me understand DRA/DRA- better. I'm also coming to understand that it's not a predictive stat. Is cFIP the closest to DRA as a predictive stat, and thus was chosen for PECOTA?