Earlier this month, I wrote an article that was based on a post that Mitchel Lichtman (aka MGL), sabermetrician and co-author of *The Book*, wrote on his blog, MGL on Baseball. MGL explored the question—I’m quoting from his title here—“What does it mean when a pitcher has a few really bad starts that mess with his ERA?”

If you already read my initial post, or MGL’s, you can skip to the next paragraph. If not, MGL’s conclusion was, briefly, that if a starting pitcher has two or three really terrible starts—*really terrible* defined here as eight or more runs allowed in five or fewer innings—that contribute to a season-long runs against average (RA9) of 5.00 or higher, we’re likely to overshoot our projection of the pitcher’s ERA for the subsequent season by 0.2 or 0.3.

In other words, two or three disaster starts have such a strong impact on a pitcher’s seasonal ERA (e.g., Jon Gray, 4.61 ERA overall, 3.86 excluding blowups) that projection systems like PECOTA take too pessimistic a view of the pitcher for the upcoming season. It’s not like we can ignore bad outings, but maybe we’re letting their large impact on the pitcher’s full-season output color our perception.

Commenter matthew_kenerly responded:

> I can’t help but wonder if you’ll do this with relief pitchers as well, because the first player I thought of was Santiago Casilla.

Well, Matthew (or matthew_), I hadn’t thought of looking at relievers as well, but that’s a great idea! So here goes.

Defining a really terrible outing for a reliever is difficult. Ignoring the entire issue of inherited runners, relievers generally don’t stick around long enough to allow a lot of runs. Plus, the runs may not be the reliever’s fault. Take a right-handed reliever who’s called on with two out and two on in the seventh inning. Assume he gets the last out, and the first out of the eighth inning as well.

Then he walks a guy, and the next batter hits a grounder to third that the third baseman airmails into the dugout. Second and third, one out. The manager has our pitcher walk the bases full intentionally, then calls on a lefty reliever to face the batting team’s lefty slugger, who sends the ball into the right-field stands for a homer. Our righty reliever is charged with three runs in two-thirds of an inning—that’s an RA9 of 40.50—and really didn’t pitch badly.
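That 40.50 figure is just the standard rate arithmetic, scaled to nine innings. A quick sketch (the function name is mine):

```python
def ra9(runs_allowed: float, innings_pitched: float) -> float:
    """Runs allowed per nine innings pitched."""
    return runs_allowed * 9 / innings_pitched

# Our hypothetical righty: 3 runs charged in two-thirds of an inning.
print(round(ra9(3, 2 / 3), 2))  # 40.5
```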

I decided that in order to charge a reliever with a poor appearance, he’ll need to have allowed at least four runs to score while he’s on the mound. And he’ll have to have pitched no more than an inning. To keep with the topic of how *a few* bad outings can affect a pitcher’s outlook rather than *one*, we’ll look for pitchers with two or more such outings in a season. Finally, since I want to compare actual results to projections, I’m going to look at the 2003-2015 seasons, because we have both PECOTA projections and actual results for 2004-2016.
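The appearance- and season-level filters above can be sketched as simple predicates (the names are mine, not from the study):

```python
def is_blowup(runs_allowed: int, innings_pitched: float) -> bool:
    """A blowup relief appearance as defined in the article:
    four or more runs allowed in one inning or fewer."""
    return runs_allowed >= 4 and innings_pitched <= 1.0

def season_qualifies(appearances: list[tuple[int, float]]) -> bool:
    """A season qualifies with two or more blowup appearances.
    Each appearance is (runs allowed, innings pitched)."""
    return sum(is_blowup(r, ip) for r, ip in appearances) >= 2

# Two meltdowns of four-plus runs in an inning or less: qualifies.
season = [(0, 1.0), (4, 2 / 3), (1, 1.0), (5, 1.0)]
print(season_qualifies(season))  # True
```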

I also decided to set a lower limit on innings pitched in relief at 10. Before somebody screams “selection bias,” yes, I’ll buy that, but there’s a built-in selection bias in the study of starters as well. For a starter to make two or three horrible starts as defined by MGL, he’s got to be good enough for the manager to call on him again. There were 11 pitchers who had two or three blowups in 2016, and their innings ranged from 111 (Aaron Nola) to 181.2 (James Shields). Six were ERA qualifiers, and two were within six innings of 162 as well. The analysis of starters is skewed toward more durable performers, and I wanted mine to be as well.

From 2003 to 2015, there were 457 relievers (an average of 35 per year) who had two or more appearances in which they allowed four or more runs in one or fewer innings. Of them, 183 (14 per season) pitched at least 10 relief innings and allowed an RA9 of 5.00 or greater. (As MGL did, I scaled RA9 to an MLB average of 4.00, i.e., the limit in a given year was 125 percent of the overall average.)
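That league-scaling works out to a simple proportional threshold, since 5.00 divided by the 4.00 baseline is 125 percent. A sketch (the 4.00 baseline and 5.00 cutoff are from the article; the 4.40 example league average is my assumption):

```python
def ra9_threshold(league_ra9: float, baseline: float = 4.00,
                  cutoff: float = 5.00) -> float:
    """Scale the 5.00 cutoff so it stays at 125 percent of the
    actual league-average RA9 in a given year (5.00 / 4.00 = 1.25)."""
    return cutoff / baseline * league_ra9

# In a year when the league averaged 4.40 runs per nine innings,
# the qualifying bar rises proportionally.
print(round(ra9_threshold(4.40), 2))  # 5.5
```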

My total sample size of 183, or 14 per season, wasn’t bad. But it kept getting whittled down as the analysis progressed. Of the 183, only 127 pitched the next season. That’s 30 percent attrition! Having a couple of bad games doesn’t do much for reliever job security, it seems. And of the remaining 127, there were a handful for which there aren’t PECOTA projections and a larger handful who pitched fewer than 10 innings in the following season. That cut the sample size down to 102. Still, that’s an average of nearly eight pitchers per season for 13 seasons. It’ll do.

So how did they do, these 102 relievers who had two terrible (four-plus runs, one or fewer innings) appearances and an RA9 of 5.00+ in one season, in the following season, relative to expectations? Here are the results:

MGL found that pitchers who had a few disaster starts had an actual RA9 the following season that was about 0.28 lower than his projection. In this case, using PECOTA projections for relievers, we’ve got an ERA difference of 0.33—very similar to the result for starters.

Now, before you run out and draft relievers who had two or more appearances with four or more runs allowed in one or fewer innings last season, a bunch of caveats apply. First, note that only 54 percent of the pitchers in the sample beat their ERA projections. Those who did did so nicely, but the chance that a given pitcher will beat his projection is almost a coin flip—55 out of 102. (And for WHIP, it was a coin flip, with only 51 of 102 topping PECOTA’s projection.)
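The coin-flip framing is just the beat rate over the sample. A quick check with the counts from above (the function is a sketch of mine):

```python
def beat_rate(beat: int, total: int) -> float:
    """Share of pitchers who outperformed their projection."""
    return beat / total

# ERA: 55 of 102 beat PECOTA; WHIP: 51 of 102.
print(round(beat_rate(55, 102) * 100))  # 54
print(round(beat_rate(51, 102) * 100))  # 50
```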

Second, beating the ERA projection by a third of a run isn’t a huge difference when your starting point is, on average, 4.87. And given that this is an aggregate improvement, applying a 0.33 improvement to any one pitcher is missing the point of this exercise, which is looking at all relievers as a group. Third, this sample isn’t all that robust. MGL went back to 1977, so he had 40 years. Since I wanted to have PECOTA projections, I had 13 years. Fourth, relievers are, as the attrition data shows, pretty volatile. They’re hard to predict in the first place, given that our projections are virtually always based on fewer than 100 innings.

With those caveats in mind, for any fantasy players looking to potentially buy cheap in fantasy drafts, here is a list of relievers who had two or more appearances of four or more runs in one or fewer innings, 10 or more innings, and an RA9 25 percent or more higher than the league average in 2016, along with their 2017 PECOTA projections for ERA and WHIP:

Boy, does *caveat emptor* ever apply here.

Oh, and Santiago Casilla, whom matthew_kenerly cited in his comment prompting this research? Not only does he not qualify for this study, because his RA9 for 2016 was only 3.57; he also didn’t have any games in which he allowed four runs, and only two in which he allowed as many as two. He wasn’t as blowup-prone as Giants fans might recall.

Otherwise we don't know that PECOTA isn't under-projecting ALL relievers.

Trust me, it's easy to have too high (bad) a projection for bad pitchers. The reason is that teams are more likely to bring back the pitchers whom THEY know (but WE don't) are better than the ones whom they don't bring back.

So, if 30% drop out of any sample after a bad year, it is likely that those 30% would have performed worse than the 70% who did not drop out even if they had the same projection and even if they had the same overall stats in year 1.

Again, that is because the teams "know" things that the projection models don't. That's especially true if the projection models don't use minor league stats. The pitchers that live to see another year likely have better minor league stats than the pitchers who don't even if they have the same MLB stats.
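That survivorship mechanism is easy to see in a toy simulation (the group size matches the study’s 183; the talent distribution and 70/30 split are my assumptions, not the comment’s):

```python
import random

random.seed(7)

# Toy survivorship model: 183 relievers with identical year-1 stats
# and identical projections, but a true-talent RA9 that teams can
# observe and the projection model cannot.
true_ra9 = sorted(random.gauss(5.0, 0.5) for _ in range(183))

survivors = true_ra9[:128]  # teams keep the best ~70 percent
released = true_ra9[128:]   # the worst ~30 percent drop out

def mean(xs):
    return sum(xs) / len(xs)

# Survivors are better than the full-group average the projection
# was built on, so as a group they "beat" it in year 2.
print(round(mean(survivors), 2), round(mean(released), 2))
```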

To be honest, PECOTA historically has not had the best projections, so I would not be terribly surprised if they under-projected all their relievers who had bad seasons.

There are three kinds of projections. One is "projections projections." These are exactly what we expect a player to do (mean, median, whatever) IF he plays in the major leagues (assuming they are MLB projections) or if we forced him to play in MLB.

The second kind are projections of players who end up playing in the major leagues (whether they played in the majors before or not).

There's actually a third kind: projections of players who end up playing AND amass a minimum level of playing time (any minimum).

All three of these require different projections (with increasing optimism). All of them but the first one are a fudge. And while the first one is the most "accurate," it will also fare the worst in any kind of testing.

I'll let the readers figure out what this means and why it's all true!