First, he presented some research, came to a conclusion, published, then did some more work, refined his techniques, came to a different conclusion, and published that as well. I’m over-simplifying here, but the point is that in a world in which medical researchers bury results they don’t like and the Oxford English Dictionary Word of the Year (post-truth, if you don’t want to click the link) reflects how “objective facts are less influential in shaping public opinion than appeals to emotion and personal belief,” it’s reassuring that our little corner still values the scientific method and full disclosure.
Second, his findings are interesting! His article’s title lays out his research: “What does it mean when a pitcher has a few really bad starts that mess with his ERA?”
We’re all familiar with this kind of thinking. Last July 3, Jon Lester started for the Cubs at Citi Field against the Mets. He allowed a home run to Curtis Granderson in the first inning, but things really fell apart for him in the second frame: home run, strikeout, double, home run, walk, double, single, single, wild pitch, single, single. He was pulled, having allowed eight runs, all earned, in one-and-a-third innings.
Lester, of course, had a fine season in 2016: 2.44 ERA (second in the National League), 3.45 FIP (seventh), 3.10 DRA (fifth), 5.3 PWARP (fifth). Relevant to MGL’s title, if you remove just that one start from Lester’s season, his seasonal ERA drops all the way to 2.10, which would have enabled him to edge out teammate Kyle Hendricks’ 2.13 to lead the league.
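To make that arithmetic concrete, here is a minimal sketch of the with-and-without calculation. The season totals (202.2 innings, 55 earned runs) are my assumption, chosen because they reproduce the 2.44 ERA quoted above; the July 3 line (eight earned runs in one-and-a-third innings) comes from the text. The helper names are mine and purely illustrative.

```python
def innings_to_outs(ip: str) -> int:
    """Convert box-score innings notation ('202.2' means 202 and 2/3) to outs."""
    whole, _, frac = ip.partition(".")
    return int(whole) * 3 + int(frac or 0)


def era(earned_runs: int, outs: int) -> float:
    """Earned runs per nine innings, i.e. per 27 outs."""
    return 27 * earned_runs / outs


# Assumed season totals (not given in the article); they match the 2.44 ERA.
season_er, season_outs = 55, innings_to_outs("202.2")
# The July 3 start at Citi Field, as described above.
bad_er, bad_outs = 8, innings_to_outs("1.1")

full = era(season_er, season_outs)
without = era(season_er - bad_er, season_outs - bad_outs)
print(f"with the start: {full:.2f}, without it: {without:.2f}")
```

Working in outs rather than decimal innings avoids the trap of treating "202.2" as 202.2 innings when it actually means 202 and two-thirds.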
Of course, you can’t just ignore one bad game, just as you can’t ignore Mike Trout’s August 7 game in Seattle, when he wore a Golden Sombrero at the hands of James Paxton. A season is the aggregate of its good and bad games.
MGL identified every pitcher from 1977 to 2016 with at least 100 innings who had at least four starts in which they pitched five or fewer innings and gave up six or more runs. He then compared all such pitchers who gave up an average of five or more runs per nine innings (RA9) with all pitchers who pitched at least 100 innings with an RA9 of at least 5.00 who didn’t have four or more such starts.
In other words, he was looking at a lot of starters who had disappointing seasons (i.e., RA9 of 5.00 or more), some of whom had four or more terrible starts, some of whom didn’t. Last year, pitchers with 100 innings and an RA9 of at least 5.00 ranged from Josh Tomlin (5.02), Taijuan Walker (5.02), and Michael Pineda (5.03) to Shelby Miller (6.42), Adam Morgan (6.45), and Tyler Duffey (6.97).
This gave him two buckets: pitchers with an RA9 of at least 5.00 who had at least four starts in which they were truly terrible, and pitchers with an RA9 of at least 5.00 who didn’t have that many blowups. For each bucket, he compared the pitchers’ performance the following year with their projected performance, to see whether pitchers whose season was sabotaged by a few bad starts would outperform their more consistently unimpressive peers.
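For anyone who wants to try the sorting at home, the bucketing can be sketched roughly as follows. This is my own illustration, not MGL's code: the `Start` record, the function names, and the two sample seasons are invented, and I'm simplifying by counting only starts toward the 100-inning minimum.

```python
from dataclasses import dataclass


@dataclass
class Start:
    outs: int  # outs recorded; 5 innings = 15 outs
    runs: int  # all runs allowed, earned or not


def ra9(starts):
    """Runs allowed per nine innings (27 outs)."""
    return 27 * sum(s.runs for s in starts) / sum(s.outs for s in starts)


def is_blowup(start, max_innings=5, min_runs=6):
    """Five or fewer innings and six or more runs, per the definition above."""
    return start.outs <= 3 * max_innings and start.runs >= min_runs


def bucket(starts, min_innings=100, min_ra9=5.0, min_blowups=4):
    """Return 'blowup', 'consistent', or None if the season is outside the sample."""
    if sum(s.outs for s in starts) < 3 * min_innings or ra9(starts) < min_ra9:
        return None
    blowups = sum(is_blowup(s) for s in starts)
    return "blowup" if blowups >= min_blowups else "consistent"


# Two invented seasons with identical totals: 132 innings, 88 runs, 6.00 RA9.
spiky = [Start(18, 3)] * 20 + [Start(9, 7)] * 4
steady = [Start(18, 4)] * 22

print(bucket(spiky), round(ra9(spiky), 2))
print(bucket(steady), round(ra9(steady), 2))
```

The two made-up pitchers finish with the same RA9, so a projection system fed only season totals would treat them identically, which is exactly the blind spot the study probes.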
Since the projection system he used was based on season stats, it wouldn’t know that Jimmy Nelson (5.42 RA9 in 2016) had eight games in which he allowed six or more runs in five or fewer innings while Robbie Ray (also 5.42 RA9) didn’t. Or, in other words, that Nelson’s RA9 and ERA were 3.41 and 3.10, respectively, in 24 starts but 13.89 and 11.01 in the other eight.
The somewhat surprising result (to me at least) was that while the pitchers who were victimized by a handful of awful games performed more or less in line with expectations, those who were just consistently subpar did slightly better than expected. So, in the example above, Nelson’s blowup-fueled 5.42 RA9 was a more accurate indicator of his true talent than Ray’s more consistent 5.42 RA9. As MGL initially concluded:
The next time you read that, “So-and-so pitcher has bad numbers but this was only because of a few really bad outings,” remember that there is no evidence that an ERA or RA which includes a “few bad outings” should be treated any differently than a similar ERA or RA without that qualification, at least as far as projections are concerned.
As they say in infomercials, though, wait, there’s more.
MGL subsequently considered whether his definition of a blowup was too liberal. He re-ran his experiment, changing his definition of a bad start from six or more runs in five or fewer innings to eight or more runs in five or fewer innings. That’s a far more exclusive club. Last year, there were 481 starts in which the starter allowed six or more runs in five or fewer innings, just under 10 percent of all starts. There were only 99 in which the starter allowed eight or more runs in five or fewer innings, about two percent of starts. That’s really bad.
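Those percentages are easy to check. The sketch below assumes 4,856 total starts in 2016 (2,428 games with two starters apiece); that denominator is my assumption, not a figure from MGL's post, while the 481 and 99 counts are the ones cited above.

```python
# Assumed denominator: 2,428 games played in 2016, two starting pitchers each.
TOTAL_STARTS_2016 = 2 * 2428


def share_of_starts(count: int, total: int = TOTAL_STARTS_2016) -> float:
    """Percentage of all starts represented by `count`."""
    return 100 * count / total


six_plus = share_of_starts(481)    # 6+ runs in 5 or fewer innings
eight_plus = share_of_starts(99)   # 8+ runs in 5 or fewer innings

print(f"six-run blowups:   {six_plus:.1f}% of starts")
print(f"eight-run blowups: {eight_plus:.1f}% of starts")
print(f"ratio: {481 / 99:.1f} to 1")
```

Under that assumption, the stricter definition keeps roughly one of every five blowups that qualified under the original one.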
The starters in 2016 who met the revised criteria of two or more starts of five or fewer innings and eight or more runs allowed, minimum 100 innings, are Edinson Volquez with four such starts; James Shields and Josh Tomlin with three each; and Jorge De La Rosa, Jon Gray, Zack Greinke, Jason Hammel, Jeff Locke, Jimmy Nelson, Aaron Nola, Drew Smyly, and Steven Wright with two.
Skipping to the conclusion, MGL found “for starters whose runs allowed are inflated due to two or three really bad starts, if we simply use overall season RA or ERA for our projections we will understate their subsequent season’s RA or ERA by maybe 0.2 or 0.3 runs per nine innings.” [Italics mine]
Put another way, when looking at pitchers whose seasonal averages get trashed by two or three truly terrible starts, we probably should consider those starts to be partial outliers, and the pitcher’s true talent to be somewhat better.
Now, to this point, I’ve largely given you a book review of MGL’s post. Let me make it a little more interesting by noting its relevance to the 2017 season. Here is an alphabetical list of pitchers whose 2016 figures were hurt by two or three starts of five or fewer innings and eight or more runs allowed. I added their PECOTA projections for ERA and WHIP in 2017:
Per MGL’s research, there’s a reasonable likelihood that these pitchers will modestly beat their projections in 2017 in aggregate. The key words in that sentence are modestly and in aggregate. It would be foolhardy to wait until the middle rounds of your fantasy draft and then snatch up De La Rosa, Locke, and Shields. (Important caveat: Ignore the last sentence if you’re an owner in any of my leagues. Go for them!) PECOTA is likely a little too pessimistic about these pitchers as a group, because their 2016 numbers were hurt so severely by two or three disaster starts.
Jon Gray’s seasonal ERA rose by more than two full runs on May 19, when he allowed nine runs in 3.1 innings in St. Louis. MGL’s research suggests that while that one game put a hurt on his 2016 numbers, it may not be as relevant in the upcoming season as the 2.61 ERA he compiled over his next 13 starts.