An Intro to Evaluating and Predicting Pitching Performance
The crack of the bat as it makes contact with the ball at a live baseball game has to be one of the more nostalgic “American audible events.” There’s just something about it that makes us all remember summer afternoons at the ballpark: the smell of hot dogs, beer and peanuts – and lobster rolls and tacos? – lingering in the air and a good friend sitting at our side. More recently we have learned, however, that it also represents a significant sabermetric transition. Specifically, we are talking about the transition of responsibility for run prevention from the pitcher to factors beyond his control, such as the defense behind him. Let’s take a look at what the heck I’m talking about here, and delve into the world of pitcher evaluation through the somewhat recent phenomenon that is Fielding or Defense Independent Pitching.
Defense Independent Pitching theory originated with a posting by Voros McCracken, who postulated that pitchers are not really able to control whether balls put in play against them are converted into outs or fall for hits. He based this opinion on the idea of the Defensive Responsibility Spectrum (DRS), explained brilliantly here by Tom Tango. Essentially, DRS divides responsibility for “all those things that the defense is responsible for” between the pitcher and his fielders, approximated as follows:
Responsibility
Almost all Pitcher: HBP, Balk, Pickoff, K, BB, HR
Mix: WP, SB, CS, 2B, 3B, 1B, Batting Outs, PB
Almost all Fielders: Other Running the Bases Outs
McCracken basically concluded that the best way to evaluate and predict pitcher performance was to analyze the data in the first row, namely that for which the pitcher bears just about all of the responsibility, as opposed to any of the more contextually dependent phenomena in the rows below.
This bold thought seemed counterintuitive, and sparked a great deal of response, debate, and ground-breaking analysis after McCracken expanded on his theory on this website in early 2001. Tom Tippett produced an incredibly in-depth follow-up on McCracken’s piece using 90 years of accumulated baseball data, and found that while McCracken’s initial conclusion may have gone a bit too far – something McCracken had already begun to take into account himself – this new way of thinking was dead on.
Mitchel Lichtman then followed Tippett’s analysis with one of his own, in which he made the breakthrough of tying the types of batted balls a pitcher allows to his expected rate of balls in play falling for hits, or Batting Average on Balls in Play (BABIP). Lichtman concluded that although pitchers cannot really control their BABIP against, they do have a significant amount of control over the types of batted balls they allow, and this in turn lets us make intelligent predictions of their future BABIP. In plain English, Lichtman is basically the man who put numbers behind the idea that there are “ground-ball pitchers,” i.e. pitchers who induce a greater than average percentage of ground balls on balls in play, and “fly-ball pitchers.” Pitchers also maintain some control over a hitter’s ability to stroke line drives and pop the ball up, but the correlations are not as strong. As is intuitive, good pitchers generally allow slightly fewer line drives (balls that are hit pretty hard) and slightly more infield pop flies (weak contact) than the average pitcher.
David Appelman posted a great piece on the topic of BABIP against for pitchers over at FanGraphs which allows us to generate the following estimation table on the rates of different types of batted balls being turned into outs:
Hit Type        Expected Out %
Fly Balls       85%
Ground Balls    76%
Line Drives     27%
Looking at this you may be wondering – why are ground ball pitchers all the rage today if fly balls are actually converted into outs at a higher rate than are ground balls? This has to do with the fact that one of the worst things a pitcher can do is to give up a home run. In general, home runs seem to occur on about 10% or 11% of all fly balls allowed, somewhat independently of which pitcher is allowing them. As such, it follows that a pitcher who allows more fly balls will allow more home runs, even if he’s an otherwise outstanding pitcher – think Johan Santana or Roy Oswalt. While the ground ball pitcher may give up a few more singles – as well as a few more unearned runs – the fly ball pitcher is going to get hit with the dinger more frequently, thereby limiting his effectiveness somewhat if his other peripheral statistics (i.e. the defense independent numbers we looked at above) do not remain strong.
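To make the trade-off concrete, here is a minimal sketch that applies the approximate out rates from the table and the roughly 10% to 11% home-run-per-fly-ball figure to two pitchers. The two batted-ball mixes are hypothetical, the 10.5% midpoint is an assumption, and the sketch glosses over whether the fly-ball counts include home runs. The function names are illustrative only.

```python
# Approximate out rates by batted-ball type, taken from the table above.
OUT_RATE = {"fly": 0.85, "ground": 0.76, "line": 0.27}

# Assumed midpoint of the "10% or 11%" home-run-per-fly-ball figure.
HR_PER_FLY = 0.105


def expected_hit_rate(fly, ground, line):
    """Expected share of batted balls that fall for hits, given counts by type."""
    total = fly + ground + line
    if total <= 0:
        raise ValueError("need at least one batted ball")
    expected_outs = (
        fly * OUT_RATE["fly"]
        + ground * OUT_RATE["ground"]
        + line * OUT_RATE["line"]
    )
    return 1 - expected_outs / total


def expected_home_runs(fly):
    """Expected home runs allowed for a given number of fly balls."""
    return fly * HR_PER_FLY


# Hypothetical batted-ball mixes: same total, same line drives.
ground_ball_pitcher = {"fly": 150, "ground": 250, "line": 100}
fly_ball_pitcher = {"fly": 250, "ground": 150, "line": 100}

for label, mix in [("ground-ball", ground_ball_pitcher), ("fly-ball", fly_ball_pitcher)]:
    hit_rate = expected_hit_rate(**mix)
    homers = expected_home_runs(mix["fly"])
    print(f"{label}: expected hit rate {hit_rate:.3f}, expected HR {homers:.1f}")
```

Under these assumptions the ground-ball pitcher gives up hits a little more often, while the fly-ball pitcher gives up noticeably more home runs, which is the trade-off described above.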
So what do we do with all of this information now that we have it? Let’s review the two main tenets of the conversation so far:
- The best indicators of a pitcher’s future performance are defense-independent rate stats, including K%, BB%, and to some extent HR% allowed.
- A closer look at specific batted ball data can tell us even more about what to expect from a pitcher in terms of his BABIP against.
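As a quick illustration of the first tenet, the three rate stats can be sketched as simple ratios. This assumes each is measured per batter faced, which the article does not spell out; the function name and the season line are hypothetical.

```python
def rate_stats(strikeouts, walks, home_runs, batters_faced):
    """Return K%, BB% and HR% as fractions of batters faced (assumed denominator)."""
    if batters_faced <= 0:
        raise ValueError("batters_faced must be positive")
    return {
        "K%": strikeouts / batters_faced,
        "BB%": walks / batters_faced,
        "HR%": home_runs / batters_faced,
    }


# Hypothetical season line, not a real pitcher.
rates = rate_stats(strikeouts=180, walks=50, home_runs=20, batters_faced=800)
for name, value in rates.items():
    print(f"{name}: {value:.1%}")
```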
Let’s go ahead now and take a look at one of the actual statistics used to express the concept of Defense Independent Pitching and then apply it to a real example that might even help a few fantasy teams along the way.
If you clicked the link with Tom Tango earlier in the piece you probably ran into his statistic Fielding Independent Pitching (FIP). It takes into account exactly what we’ve been talking about so far, and does so in a simple, easy to understand format. While some of the newer, slightly different, statistics in this area take batted ball data into account, FIP purely covers the stuff that the pitcher alone is responsible for, and is calculated based on the following formula:
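In its commonly published form, which this sketch assumes is the one intended here, FIP is (13 × HR + 3 × BB − 2 × K) / IP + NC, where IP is innings pitched. The function name, the default of 3.2 for NC, and the season line below are illustrative assumptions; some published versions also count hit batsmen alongside walks, which this sketch leaves out.

```python
def fip(home_runs, walks, strikeouts, innings_pitched, nc=3.2):
    """Fielding Independent Pitching, assuming the commonly published formula.

    innings_pitched must be a true decimal (6 1/3 innings is 6.333...,
    not the box-score shorthand 6.1). nc is the normalizing constant.
    """
    if innings_pitched <= 0:
        raise ValueError("innings_pitched must be positive")
    return (13 * home_runs + 3 * walks - 2 * strikeouts) / innings_pitched + nc


# Hypothetical season line, not a real pitcher.
example = fip(home_runs=20, walks=50, strikeouts=180, innings_pitched=200)
print(f"FIP: {example:.2f}")
```

The weights show why home runs matter so much here: one extra home run moves the number as much as about four extra walks, and strikeouts are the only term that pulls it down.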
Now, I’ve added in “NC,” which we’ll call our normalizing constant, to help make the number that FIP generates match what would be a projected ERA – or about what a pitcher’s ERA should be based on his own performance. This number is generally adjusted for current league conditions, but right now sits at around 3.2. Dave Studeman went ahead and refined this statistic a little bit by normalizing the home run rate, in an attempt to eliminate fluky FIP numbers that result from extreme good or bad luck on home runs, an example of which can be found here. For the purposes of our discussion, however, we’ll just use standard FIP, the simplest format and the one most readily available to all.
Consider the following two players based on their standard fantasy league statistics:
Pitcher      ERA    WHIP   K/9    W
Pitcher A    7.78   1.68   7.56   2
Pitcher B    2.65   1.35   6.00   4
That’s not particularly close, right? But hold on; take a look at their FIP, noting that these numbers represent RA approximations, not ERA as we look to distinguish simply between pitcher dependent and independent data:
Pitcher      FIP
Pitcher A    4.31
Pitcher B    4.85
How can this be and what does it mean? What we are looking at here is a situation where Pitcher A has actually performed better than Pitcher B in the defense-independent categories we’ve been looking at, namely BB%, K% and HR% allowed. Now, I’ve manipulated this example to be sure that we aren’t looking at quirky home run data or major discrepancies in park effects, so that’s not the explanation. The difference between these two pitchers’ fantasy numbers so far comes down to a combination of luck and defensive support. There’s not much we can do about the luck factor, but in terms of fantasy baseball, defensive support is going to be an ongoing concern, so let’s unveil the names in order to take that into account:
Pitcher A    Ricky Nolasco
Pitcher B    Matt Cain
No matter which way you look at it, the Marlins defense has been significantly worse than that of the Giants, so we’ll take that into consideration by noting that pitchers playing in front of a good defensive team will generally outperform their FIP as compared to those in front of poor defensive teams. Even accounting for this, FIP suggests Nolasco may be pitching better than Cain right now, so we aren’t necessarily looking for an improvement in Nolasco’s own performance, but rather an improvement in his luck on balls in play, which will in turn result in improved fantasy numbers. This is obviously a somewhat simplified analysis, but it remains insightful. I’d bet most Nolasco owners out there right now would be thrilled if you dangled a little Matt Cain Sampler in front of their noses. This is a good example of how using Defense Independent Pitching Statistics can help predict future performance, so keep an eye on pitchers whose FIP and ERA just don’t line up, as they may be due for a regression soon.
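The “FIP and ERA just don’t line up” check can be sketched as a simple gap calculation using the two pitchers’ numbers from the tables above. The 1.0-run threshold and the function names are assumptions for illustration, and since the FIP figures here are RA approximations, the gap is only a rough signal.

```python
# ERA and FIP figures from the tables above.
PITCHERS = {
    "Ricky Nolasco": {"era": 7.78, "fip": 4.31},
    "Matt Cain": {"era": 2.65, "fip": 4.85},
}


def era_fip_gap(era, fip):
    """Positive gap: ERA is worse than FIP suggests; negative: better."""
    return era - fip


def outlook(gap, threshold=1.0):
    """Label the gap; the 1.0-run threshold is an arbitrary illustration."""
    if gap > threshold:
        return "ERA likely to fall"
    if gap < -threshold:
        return "ERA likely to rise"
    return "roughly in line"


for name, line in PITCHERS.items():
    gap = era_fip_gap(line["era"], line["fip"])
    print(f"{name}: ERA - FIP = {gap:+.2f} ({outlook(gap)})")
```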
Defense Independent Pitching is an incredible research field full of brilliant minds and big equations, and incredible advances are being made every day. Don’t be overwhelmed, however, as the concept remains accessible to all: when the pitcher turns his head after the crack of the bat to watch where the ball is going, he’s just about as much of a spectator as you are.
"Defense Independent Pitching theory originated with a posting by Voros McCracken postulating that pitchers are not really able to control whether balls put in play against them are converted into outs or result in hits for the opposition."
I don't know about you but I can't say that in one breath...and if I was new to this concept, it'd take me multiple reads to understand what you are saying.
But, this kind of article just won't cut it as the field gets narrowed... so consider this a save.
1. People are "who;" things are "that." You can't say "the pitcher that has an ERA..."
2. I.e. means "in other words." If you mean "for example," use e.g.
This applies to most BP authors as well as Byron and Brittany.
Obviously the Nolasco and Cain examples were intentionally chosen as extreme examples. However, a first-time reader may look at a pitcher's ERA of 7.78 and a FIP of 4.31 and think to themselves that maybe the disparity isn't the result of luck or bad defense, but merely the failure of the statistic to properly measure pitcher performance.
So, I thought the article would be helped by an analysis of the statistic's ability to (in the aggregate) accurately describe and predict a pitcher's ERA. This could be as simple as average, standard deviation, etc. sort of statistics.
The history lesson behind the stat was interesting, however, and his explanation of the stat was fine. Byron’s Idol entry essay on Travis Hafner was far more polished (although unenlightening), but I am glad he has toned down his cockiness. There isn't enough here, however, to put Byron in the upper half of the competition.
Of no relevance to the quality of this article, FIP has its uses (which Byron didn't go into adequately), but it puts the Home Run rate on equal footing with walks and strikeouts. It is important, but is affected too strongly by normal variance. If as Brian Cartwright maintains you need three years' worth of data from an entire team to get an accurate home run rate in a park factor, it must be impossible to get an accurate home run rate from a single pitcher. I propose someone look at SlgA - BABIP ... sort of an improved version of isolated power, as an alternative to HR9 or GB%, etc. as a balance to K & BB.
Sadly enough I disagree with Kevin on one point: links are good in basics articles. I have no intention of reading most of them but they give me the option to go in depth if I choose. I found the writing accessible. I am not a writing instructor or editor so maybe it is bad. All I know is I finished the article this time.
My problem was alphabet soup. It is a basics article - I don't care how a statistic came to be, just what it is. By the time you got to FIP you didn't have enough words left to explain the formula which left the statistic just another number with a big 'trust me' factor.
And, are you criticizing our criticisms by sarcastically referring to us as editor/statisticians? What's this contest about, anyway? Isn't this our chance to play editor/statistician?
Another piece I just read claimed it was either .001 or .000.
Which is it?
Overall, I would have liked less background and more about FIP.
This is an introduction, and the best introduction not only inspires but encourages further reading on the subject; those outside links do that. I wish I could vote twice.
In all seriousness, if you're going to tout your writing ability as a strength, you need to do better than this.
If a contestant is bringing something new or unique to the table, he better be one heck of a writer to compensate.
This was a rehash of old analysis, and frankly you can find this on any number of free blogs, like fangraphs. BP can do better.
I, for one, strongly encourage the links to external analysis, especially defining analysis that created a field of study. Nobody's forcing the reader to follow them, but they're there if they want to. And the recurring criticism of BP as being too insular is not without merit; this kind of article would help fix that.
ObQuibble: If HR per fly ball is roughly the same for everyone, then shouldn't fly ball rate be the core defense-independent stat, not HR rate? I understand that HR rate is much more widely available, but it would be nice to note that using it is a convenience, not a necessity.