I was reading Yahoo!’s RotoArcade blog this weekend and noticed a riff about James Shields, HR/FB, and DIPS theory. I thought that this would provide a terrific jumping-off point for a discussion of these topics and of how randomness plays into how we evaluate players.
The Shields case also raises a debate about the worthiness of xFIP — a peripherally-suggested ERA that installs a league-average HR/FB rate.
Scott Pianowski (the post’s author) talks about xFIP, but as you likely know, BP’s SIERA attempts to do the same thing (only a little more accurately since there’s more interaction between terms). I’m here today to write on behalf of xFIP, SIERA, and DIPS theory. In his post, Pianowski questions the utility of these things on the basis of HR/FB rate.
We know avoiding home runs is very important for pitchers, and we also know that a heavy ground-ball rate is the best way to get that done. But do pitchers have control over what percentage of their fly balls go over the fence? It's a ticklish subject. There are plenty of SABR-friendly analysts who don't feel pitchers can control this aspect of the game, but you'll occasionally find someone who doesn't go along with that line of thinking…
While analysts will often talk about how a pitcher has been “unlucky” because of his HR/FB or his BABIP, things aren’t always so clear-cut. For the vast majority of pitchers, the analysis of saying “Pitcher A has a high HR/FB and therefore has been unlucky” will bring us extremely close to the truth of the matter. Occasionally, though, we’ll find a pitcher who doesn’t seem to be following the “rules,” and fantasy owners and analysts alike will have trouble figuring out what to make of the guy.
The truth of the matter is, pitchers do have control over BABIP and HR/FB, just to a much smaller extent than they do over things like strikeouts and walks. Because of this, after two months of a baseball season, we can look at a pitcher’s strikeout and walk rates and say something meaningful about him, because these stats stabilize a lot quicker than BABIP and HR/FB do; that is, there’s much less random variation in them. The fact remains that pitchers do have some measure of control over BABIP and HR/FB, and when we run into a pitcher like Shields, for whom the quick-and-dirty rules don’t seem to apply, this is an important thing to remember.
So what do we do? Pianowski suggests, “To get a fair gage of where he's at, I'd say we need to judge him against his own established career norms, not the league-wide average.” This might or might not prove to be a better route than assuming he’ll post a league-average HR/FB, but in any case, it’s not the optimal route.
To bring us closer to the optimal answer, we need to regress Shields’s HR/FB to the mean. As I said before, pitchers have control over just about everything to some degree or another. For HR/FB, I’ve found that it takes roughly 600 fly balls for HR/FB to stabilize* (this is using Retrosheet data; if we use MLBAM or BIS, this number will change a bit). That is, after 600 fly balls, the pitcher’s observed HR/FB tells us exactly as much about his true HR/FB skill as merely relying upon the league-average HR/FB would. With any fewer, the league average is the more accurate guide.
*I’ll talk more about how I arrived at this in my article next Monday. If you’re really interested before then, ask in the comments or e-mail me.
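For readers who like to see the idea in code, here is a minimal Python sketch of what a stabilization point implies. It assumes the common n / (n + k) shrinkage form, which is an illustration and not necessarily the method behind the 600 figure; `observed_weight` and `stabilization_point` are hypothetical names chosen for this example.

```python
def observed_weight(fly_balls, stabilization_point=600):
    """Share of the estimate given to the pitcher's own HR/FB.

    Assumption: the common n / (n + k) shrinkage form, where k is the
    sample size at which the observed and league rates count equally.
    """
    return fly_balls / (fly_balls + stabilization_point)


# Below 600 fly balls the league average carries more weight;
# above 600 the pitcher's own rate does.
for n in (300, 600, 1200):
    print(n, round(observed_weight(n), 3))
```

Under that assumption, a pitcher sitting exactly at 600 fly balls gets an even split between his own rate and the league's, which matches the "tells us exactly as much" description above.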
It will take an extreme fly-ball pitcher about 2.66 seasons, and an extreme ground-ball pitcher about 4.75 seasons, to allow 600 fly balls. In the case of Shields, he has allowed 890 fly balls since his debut in 2005 (there’s no 2011 Retrosheet data, so I’ve assumed a career average rate) with a HR/FB of 14.9 percent. The league average since then has been 12.9 percent.
So for Shields, Pianowski would be right that we’d be better off trusting Shields’s career rate over the league average, since he has more than 600 flies. But that assumes, first, that every year is worth the same (i.e., that 2005 tells us as much about Shields as 2010 does, which is neither a trivial nor a particularly correct assumption) and, second, that these are our only two options.
But there is a third option: combining our observed Shields performance with a mean performance. For the sake of simplicity, if I ignore seasonal weighting, aging, park effects, etc. and use the league average as our mean, based upon the 890 flies that Shields has induced in his career, we would include 69 percent Shields and 31 percent league average in our estimation of his true HR/FB talent, leaving us with a final result of a 14.2 percent HR/FB. That’s enough above the league average that we can assume SIERA or xFIP will be a bit too optimistic for him.
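The blend itself is just a weighted average, and a short Python sketch makes the arithmetic easy to check. It uses the rounded figures quoted above and takes the 69/31 split as given rather than deriving it; `blend` and the variable names are hypothetical, chosen only for this example.

```python
def blend(observed_rate, league_rate, observed_share):
    """Weighted average of a pitcher's own rate and the league rate."""
    return observed_share * observed_rate + (1 - observed_share) * league_rate


# Rounded inputs quoted in the text.
shields_hr_fb = 0.149   # Shields's career HR/FB
league_hr_fb = 0.129    # league-average HR/FB over the same span
observed_share = 0.69   # weight on Shields; the rest goes to the league

estimate = blend(shields_hr_fb, league_hr_fb, observed_share)
print(f"Estimated true HR/FB: {estimate:.1%}")
```

With these rounded inputs the estimate comes out at roughly 14.3 percent, a touch above the 14.2 quoted above; the small gap may simply reflect rounding in the published inputs. Either way, the estimate sits between the league average and Shields's career rate, and closer to the latter.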
If we run this analysis for most pitchers (especially ones who have only been around a couple seasons), the final result will be very close to the league average, and SIERA will be fine to use for practical purposes. For a very small minority (like Shields, Brett Myers, Matt Cain, and a select few others), we may be a little off using SIERA (or FIP, xFIP, or any other ERA estimator).
Just because it’s off for a few pitchers by a not-trivial-but-still-relatively-small amount doesn’t mean that xFIP and SIERA lack “worthiness” (and it certainly doesn’t make ERA a better choice). As I said in my opening article, everything needs to be taken within the proper context. I wrote this article today because once we have a greater understanding of what it is we’re looking at, we can make better decisions about how to use the data in front of us.
As a quick endnote, just in case it wasn’t clear, I wanted to make it known that I’m not picking on RotoArcade in any way. In my live chat the other day, one question that I didn’t get around to answering (OK, avoided for fear of leaving someone out) was, “What are your favorite fantasy sites to read?” RotoArcade definitely would have been on this list. I like the writing, and I’m friends with a lot of the guys over there. This particular article merely gave me a reason to talk about this sort of thing without having to resort to a straw man argument, and if you read the entire post, you’ll see that Pianowski is framing a debate more than taking any one side, so consider this article a response to the debate presented.