An Intro to Evaluating and Predicting Pitching Performance

The crack of the bat as it makes contact with the ball at a live baseball game has to be one of the more nostalgic “American audible events.” There’s just something about it that makes us all remember summer afternoons at the ballpark: the smell of hot dogs, beer and peanuts – and lobster rolls and tacos? – lingering in the air and a good friend sitting at our side. More recently we have learned, however, that it also represents a significant sabermetric transition. Specifically, we are talking about the transition of responsibility for run prevention from the pitcher to factors beyond his control, such as the defense behind him. Let’s take a look at what the heck I’m talking about here, and delve into the world of pitcher evaluation through the somewhat recent phenomenon that is Fielding or Defense Independent Pitching.

Defense Independent Pitching theory originated with a posting by Voros McCracken postulating that pitchers are not really able to control whether balls put in play against them are converted into outs or result in hits for the opposition. He based this opinion on the idea of the Defensive Responsibility Spectrum (DRS), explained brilliantly here by Tom Tango. Essentially, what DRS is able to do is assign responsibility to “all those things that the defense is responsible for,” approximated as follows:

Almost all Pitcher:  HBP, Balk, Pickoff, K, BB, HR
Mix: WP, SB, CS, 2B, 3B, 1B, Batting Outs, PB
Almost all Fielders: Other Running the Bases Outs

McCracken basically concluded that the best way to evaluate and predict pitcher performance was to analyze the data in the first row, namely that for which the pitcher bears just about all of the responsibility, as opposed to any of the more contextually dependent phenomena in the rows below.

This bold thought seemed counterintuitive, and sparked a great deal of response, debate, and ground-breaking analysis after McCracken expanded on his theory on this website in early 2001. Tom Tippett produced an incredibly in-depth follow up on McCracken’s piece using 90 years of accumulated baseball data, and found while McCracken’s initial conclusion may have gone a bit too far – something McCracken had already begun to take into account himself – this new way of thinking was dead on.

Mitchel Lichtman then followed Tippett’s analysis with one of his own, where he made the breakthrough of linking the types of batted balls a pitcher allows to the expected percentage of his balls in play that fall for hits, or Batting Average on Balls in Play (BABIP). Lichtman concluded that although pitchers cannot really control their BABIP against, they do have a significant amount of control over the types of batted balls they allow, and this in turn allows us to make intelligent predictions of their future BABIP performance. In plain English, Lichtman is basically the man who put numbers behind the idea that there are “ground-ball pitchers,” i.e. pitchers who induce a greater than average percentage of ground balls on balls in play, and “fly-ball pitchers.” Pitchers do maintain some control over how often hitters stroke line drives and pop the ball up as well, but the correlations are not as strong. As is intuitive, good pitchers generally allow slightly fewer line drives (balls that are hit pretty hard) and slightly more infield pop flies (contact is not strong) than your average pitcher.

David Appelman posted a great piece on the topic of BABIP against for pitchers over at FanGraphs which allows us to generate the following estimation table on the rates of different types of batted balls being turned into outs:

Hit Type     Expected Out %
Fly Balls        85%
Ground Balls     76%
Line Drives      27%
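
As a rough sketch of how a table like this can be put to work, the Python below weights each expected out rate by a pitcher's share of batted balls to estimate how often his balls in play become outs, and from that an expected BABIP. It takes the table at face value, and both batted-ball mixes are hypothetical numbers invented for illustration; the function name is likewise an assumption, not something from Appelman's piece.

```python
# Expected out rates by batted-ball type, copied from the table above.
EXPECTED_OUT_RATE = {"fly": 0.85, "ground": 0.76, "line": 0.27}


def expected_babip(mix):
    """Estimate BABIP from a batted-ball mix whose shares sum to 1."""
    if abs(sum(mix.values()) - 1.0) > 1e-9:
        raise ValueError("batted-ball shares must sum to 1")
    # Weighted average of the out rates, then flip outs into hits.
    out_rate = sum(share * EXPECTED_OUT_RATE[kind] for kind, share in mix.items())
    return 1.0 - out_rate


# Hypothetical mixes (not real pitchers), same line-drive share for both.
ground_baller = {"fly": 0.30, "ground": 0.52, "line": 0.18}
fly_baller = {"fly": 0.45, "ground": 0.37, "line": 0.18}

print(f"ground-ball pitcher: {expected_babip(ground_baller):.3f}")
print(f"fly-ball pitcher:    {expected_babip(fly_baller):.3f}")
```

With these made-up mixes the ground-ball pitcher comes out with the higher expected BABIP, which is simply the 85% versus 76% gap in the table showing through.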

Looking at this you may be wondering – why are ground ball pitchers all the rage today if fly balls are actually converted into outs at a higher rate than are ground balls? This has to do with the fact that one of the worst things a pitcher can do is to give up a home run. In general, home runs seem to occur on about 10% or 11% of all fly balls allowed, somewhat independently of which pitcher is allowing them. As such, it follows that a pitcher who allows more fly balls will allow more home runs, even if he’s an otherwise outstanding pitcher – think Johan Santana or Roy Oswalt. While the ground ball pitcher may give up a few more singles – as well as a few more unearned runs – the fly ball pitcher is going to get hit with the dinger more frequently, thereby limiting his effectiveness somewhat if his other peripheral statistics (i.e. the defense independent numbers we looked at above) do not remain strong.
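
To make that trade-off concrete, here is a minimal sketch that applies a flat home-run-per-fly-ball rate to two pitchers. The 10.5% rate is just the midpoint of the 10% to 11% range mentioned above, and the fly ball totals are hypothetical, chosen only to show the arithmetic.

```python
# Assumed league-wide home runs per fly ball (midpoint of the 10%-11% range).
HR_PER_FLY_BALL = 0.105


def expected_home_runs(fly_balls, hr_per_fly_ball=HR_PER_FLY_BALL):
    """Expected home runs if every fly ball leaves the park at the same rate."""
    if fly_balls < 0:
        raise ValueError("fly_balls cannot be negative")
    return fly_balls * hr_per_fly_ball


# Hypothetical season: 600 batted balls each, 30% vs. 45% hit in the air.
ground_baller_fly_balls = 180
fly_baller_fly_balls = 270

extra = expected_home_runs(fly_baller_fly_balls) - expected_home_runs(ground_baller_fly_balls)
print(f"extra home runs expected for the fly-ball pitcher: {extra:.1f}")
```

Under these assumptions the fly-ball pitcher is expected to allow roughly nine or ten more home runs over the same number of batted balls.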

So what do we do with all of this information now that we have it? Let’s review the two main tenets of the conversation so far:

  • The best indicators of a pitcher’s future performance are defense-independent rate stats including: K%, BB%, and to some extent HR% allowed.
  • A closer look at specific batted ball data can tell us even more about what to expect from a pitcher in terms of his BABIP against.

Let’s go ahead now and take a look at one of the actual statistics used to express the concept of Defense Independent Pitching and then apply it to a real example that might even help a few fantasy teams along the way.

If you clicked the link with Tom Tango earlier in the piece you probably ran into his statistic Fielding Independent Pitching (FIP). It takes into account exactly what we’ve been talking about so far, and does so in a simple, easy to understand format. While some of the newer, slightly different, statistics in this area take batted ball data into account, FIP purely covers the stuff that the pitcher alone is responsible for, and is calculated based on the following formula:

FIP = NC + (HR*13 + (BB + HBP - IBB)*3 - K*2) / IP

Now, I’ve added in “NC,” which we’ll call our normalizing constant, to help make the number that FIP generates match what would be a projected ERA – or about what a pitcher’s ERA should be based on his peripheral performance. This number is adjusted for current league conditions, but right now sits at around 3.2. Dave Studeman went ahead and refined this statistic a little bit by normalizing the home run rate, in an attempt to eliminate fluky FIP numbers that result from extreme home run luck, good or bad; an example can be found here. For the purposes of our discussion, however, we’ll just use the simplest and most readily available version, standard FIP.
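
For readers who would rather see the formula as code, here is a minimal Python sketch of standard FIP. It reads the walk term as BB + HBP - IBB (unintentional walks plus hit batsmen), defaults the normalizing constant to the 3.2 mentioned above, and uses a hypothetical stat line; the function and parameter names are my own.

```python
def fip(hr, bb, hbp, ibb, k, ip, nc=3.2):
    """Standard FIP: nc + (13*HR + 3*(BB + HBP - IBB) - 2*K) / IP.

    ip is true innings pitched as a decimal (6 1/3 innings is 6.333...,
    not the box-score shorthand 6.1). nc is the league normalizing
    constant, assumed here to be about 3.2.
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return nc + (13 * hr + 3 * (bb + hbp - ibb) - 2 * k) / ip


# Hypothetical season line, not a real pitcher.
season_fip = fip(hr=20, bb=50, hbp=5, ibb=3, k=180, ip=200.0)
print(f"FIP: {season_fip:.2f}")
```

The weights do the talking: a home run costs 13, a free pass costs 3, and a strikeout earns back 2, all per inning pitched, so this made-up line lands at about 3.48.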

Consider the following two players based on their standard fantasy league statistics:

Pitcher      ERA   WHIP    K/9   W
Pitcher A   7.78   1.68   7.56   2
Pitcher B   2.65   1.35   6.00   4

That’s not particularly close, right? But hold on; take a look at their FIP, noting that these numbers represent RA approximations, not ERA as we look to distinguish simply between pitcher dependent and independent data:

Pitcher       FIP
Pitcher A    4.31
Pitcher B    4.85

How can this be and what does it mean? What we are looking at here is a situation where Pitcher A has actually performed better than Pitcher B in the defense-independent categories we’ve been looking at, namely BB%, K% and HR% allowed. Now, I’ve manipulated this example to be sure that we aren’t looking at quirky home run data or major discrepancies in park effects, so that’s not the explanation. The difference between these two pitchers’ fantasy numbers so far comes down to a combination of luck and defensive support. There’s not much we can do about the luck factor, but in terms of fantasy baseball, defensive support is going to be an ongoing concern, so let’s unveil the names in order to take that into account:

Pitcher A   Ricky Nolasco
Pitcher B   Matt Cain

No matter which way you look at it, the Marlins defense has been significantly worse than that of the Giants, so we’ll take that into consideration by noting that pitchers playing in front of a good defensive team will generally outperform their FIP as compared to those in front of poor defensive teams. Even accounting for this, FIP suggests Nolasco may be pitching better than Cain right now, so we aren’t necessarily looking for an improvement in Nolasco’s own performance, but rather an improvement in his luck on balls in play which will in turn result in improved fantasy numbers. This is obviously a somewhat simplified analysis, but it remains insightful. I’d bet most Nolasco owners out there right now would be thrilled if you dangled a little Matt Cain Sampler in front of their nose. This is a good example of how using Defense Independent Pitching Statistics can help predict future performance, so keep an eye on pitchers whose FIP and ERA just don’t line up, as they may be due for a regression soon.

Defense Independent Pitching is an incredible research field full of brilliant minds and big equations, and incredible advances are being made every day. Don’t be overwhelmed, however, as the concept remains accessible to all: when the pitcher turns his head after the crack of the bat to watch where the ball is going, he’s just about as much of a spectator as you are.

Liked it, didn't love it. I learned something, but at times it just got to be too much with all of the links where I felt I had to read about 20,000 more words to keep up with all of it. The last part helped you a lot with the actual cases and what might be gleaned from data like this.
I'll quibble with the choice here, using a non-BP stat, but let's face it, there's a lot of good work out there. I'm not familiar with FIP (I've seen it, but not followed it) so I'm coming at this like a lot of people out there will. This is my shot at understanding it. He does a solid job of it, though I'd have liked to have seen more of the second half where he explained using good examples rather than the first half, which was more a history. I think he could have lost some of that and done more explanation and made this stronger. Good, not great.
I'd credit Byron for being willing to expand the Basics series beyond BP metrics, because I'm glad he was willing to not cater to us by dealing with something from among our own menagerie in the expanding bowl of alphabet soup one might spoon something out of. I was also happy to see somebody touch on Tom Tippett's work, which was an important part of the dynamic when this stuff was truly breaking new ground. The problem is the writing, which is at times very sloppy; the conclusion was anything but crisp, and I could do without the throwaway references to fantasy concerns--those aren't really germane to the topic at hand. Solid, but it could have been better.
I liked the way the idea was presented and the process was quite methodical and easy to follow. It was a nice overview of McCracken's initial work and the subsequent research and you did a good job at anticipating some readers' questions. Yet, try not to rely too much on external sources for a new reader. The fantasy league stuff tidbit might have worked better if you began the article with a comment about fantasy leagues, instead of just tossing it in at the end there... That being said, I liked the opening paragraph but most of the writing style was weak and awkward. Example: "Defense Independent Pitching theory originated with a posting by Voros McCracken postulating that pitchers are not really able to control whether balls put in play against them are converted into outs or result in hits for the opposition." I don't know about you but I can't say that in one breath...and if I was new to this concept, it'd take me multiple reads to understand what you are saying.
I'm now giving Byron a thumbs up. His writing in this piece is very clunky, and especially in the first half, the tone feels like a research essay or summary... but I thought he covered a lot of ground pretty well. Remember that if people have qualms about the writing of an article, they'll also question how precise the data was and how logical the analysis was. However, his initial entry had a much better writing style, and I like how his mind works, so I don't want to lose him just yet... But, this kind of article just won't cut it as the field gets narrowed... so consider this a save.
The intro is lame and generic and the writing is really clunky, but it introduced me to a concept I didn't know gently enough. I just wish it didn't read so much like a second-year university essay.
As long as you're evaluating writing, it's germane to note violations of grammar, usage and syntax. I know that many of you baseball fans won't care, but writers should. To wit: 1. People are "who;" things are "that." You can't say "the pitcher that has an ERA..." 2. I.e. means "in other words." If you mean "for example," use e.g. This applies to most BP authors as well as Byron and Brittany.
About halfway through, I felt I was reading a history lesson about I don't think I ever would have gotten to the analysis if this were my first exposure to the concept.
I'm quite familiar with FIP, but still had trouble following the flow of this article. There's a high degree of difficulty here, though, as you're basically covering both BABIP theory and the construction of a DIPS-based metric. Unfortunately, both pieces were half answered, with only a half explanation of DIPS and a link to the derivation of FIP.
I would have liked a bit of analysis regarding how well FIP does predicting ERA. (Maybe it's in one of the links, but hey, I'm trying to catch up here.) Obviously the Nolasco and Cain examples were intentionally chosen as extreme examples. However, a first-time reader may look at a pitcher's ERA of 7.78 and a FIP of 4.31 and think to themselves that maybe the disparity isn't the result of luck or bad defense, but merely the failure of the statistic to properly measure pitcher performance. So, I thought the article would be helped by an analysis of the statistic's ability to (in the aggregate) accurately describe and predict a pitcher's ERA. This could be as simple as average, standard deviation, etc. sort of statistics.
Nice opening five words, then his opening paragraph took a long turn to nowhere before turning into a big fat blob. It turned out those opening words had nothing to do with his subject. The article is about a pitching stat not a hitting stat and Byron's attempt to relate them was a silly mess. The history lesson behind the stat was interesting, however, and his explanation of the stat was fine. Byron’s Idol entry essay on Travis Hafner was far more polished (although unenlightening), but I am glad he has toned down his cockiness. There isn't enough here, however, to put Byron in the upper half of the competition. Of no relevance to the quality of this article, FIP has its uses (which Byron didn't go into adequately), but it puts the Home Run rate on equal footing with walks and strikeouts. It is important, but is affected too strongly by normal variance. If as Brian Cartwright maintains you need three years' worth of data from an entire team to get an accurate home run rate in a park factor, it must be impossible to get an accurate home run rate from a single pitcher. I propose someone look at SlgA - BABIP ... sort of an improved version of isolated power, as an alternative to HR9 or GB%, etc. as a balance to K & BB.
I'm starting to wonder, this being my second article, if I'm part of a silent majority or an outcast. This thought is based on the comments. Either I'm the only non-editor/statistician on the site or you have a serious echo chamber problem with the comments. It will be interesting to see how the voting compares to the comments. Sadly enough I disagree with Kevin on one point: links are good in basics articles. I have no intention of reading most of them but they give me the option to go in depth if I choose. I found the writing accessible. I am not a writing instructor or editor so maybe it is bad. All I know is I finished the article this time. My problem was alphabet soup. It is a basics article - I don't care how a statistic came to be, just what it is. By the time you got to FIP you didn't have enough words left to explain the formula which left the statistic just another number with a big 'trust me' factor.
I don't see what you are talking about, Patrick. The opinions expressed here look quite varied to my eyes. And, are you criticizing our criticisms by sarcastically referring to us as editor/statisticians? What's this contest about, anyway? Isn't this our chance to play editor/statistician?
I enjoyed reading this, but had a question about a fact quoted - that pitcher's year to year line drive percentages are positively correlated. Another piece I just read claimed it was either .001 or .000. Which is it?
Oh, and in the Cain vs. Nolasco comparison, it would have been nice to have the actual walks/9 and HR rate in the table, or whatever elements Nolasco is superior on that makes him have a lower FIP. In fact, his WHIP is much worse, but I guess that's the point (fantasy scoring can deceive). He must walk almost nobody compared to Cain, but give up many more hits.
Too much introduction before getting to FIP. It should not have taken over half of the article to introduce the statistic you set out to explain. I agree with Mr. Goldstein about there being too many links. I am also not certain that the two tenets you set for review are the two takeaways of the piece to that point. Sentence structure could also improve. Overall, I would have liked less background and more about FIP.
This was terrific. Using Tango and MGL's material was, I think, a wonderful decision. Tango is, I think, at the forefront of making advanced sabermetrics widely accessible. This is an introduction, and the best introduction not only inspires but encourages further reading on the subject, those outside links do that. I wish I could vote twice.
This article took a good while to read. That being said, I enjoyed it immensely. Keep up the good work! Thumbs up all the way!
The content is fine, but the structure and prose style need some work. The author should have started with the Nolasco vs. Cain example, and used it to set up the utility of FIP. Hell, I'm an academic, and even I thought there was too much literature review in this submission.
After reading Byron's initial entry, which I thought mostly well-written and entertaining, I expected a LOT more than what is presented here. As a fellow holder of that always marketable B.A. in English, this reminds me of too many of my own lesser efforts, lo those many years ago, when my love of beer and inherent procrastinating tendencies would lead me to submit 10-page essays written in the early morning hours just prior to class. In all seriousness, if you're going to tout your writing ability as a strength, you need to do better than this.
Another article on Voros' theory? Yikes!
Personally, I enjoyed the history lesson as I think it rounded out the basics well.
I subscribe to BP for unique insight, great baseball writing, and some of the best analysis on the web. If a contestant is bringing something new or unique to the table, he better be one heck of a writer to compensate. This was a rehash of old analysis, and frankly you can find this on any number of free blogs, like fangraphs. BP can do better.
Ignore my last comment. I'm an idiot, and hadn't read the rules for this round.
I'm more positive about this one than most of the comments here. Yes, the prose could have been better (and snappier). Yes, the organization was less than perfect. But the article did its job well, introducing a concept and making a point. I, for one, strongly encourage the links to external analysis, especially defining analysis that created a field of study. Nobody's forcing the reader to follow them, but they're there if they want to. And the recurring criticism of BP as being too insular is not without merit; this kind of article would help fix that. ObQuibble: If HR per fly ball is roughly the same for everyone, then shouldn't fly ball rate be the core defense-independent stat, not HR rate? I understand that HR rate is much more widely available, but it would be nice to note that using it is a convenience, not a necessity.
I really liked this one - you could follow the logic without clicking the links but if it whetted your appetite the information was all there.