Resident Fantasy Genius: HR/FB, SIERA, and Luck

June 6, 2011

I was reading Yahoo!’s RotoArcade blog this weekend and noticed a riff about James Shields, HR/FB, and DIPS theory. I thought it would provide a terrific jumping-off point for a discussion of these topics and of how randomness figures into the way we evaluate players.

The Shields case also raises a debate about the worthiness of xFIP — a peripherally-suggested ERA that installs a league-average HR/FB rate.

Scott Pianowski (the post’s author) talks about xFIP, but as you likely know, BP’s SIERA attempts to do the same thing (only a little more accurately since there’s more interaction between terms). I’m here today to write on behalf of xFIP, SIERA, and DIPS theory. In his post, Pianowski questions the utility of these things on the basis of HR/FB rate.

We know avoiding home runs is very important for pitchers, and we also know that a heavy ground-ball rate is the best way to get that done. But do pitchers have control over what percentage of their fly balls go over the fence? It's a ticklish subject. There are plenty of SABR-friendly analysts who don't feel pitchers can control this aspect of the game, but you'll occasionally find someone who doesn't go along with that line of thinking…

While analysts will often talk about how a pitcher has been “unlucky” because of his HR/FB or his BABIP, things aren’t always so clear-cut. For the vast majority of pitchers, the analysis of saying “Pitcher A has a high HR/FB and therefore has been unlucky” will bring us extremely close to the truth of the matter. Occasionally, though, we’ll find a pitcher who doesn’t seem to be following the “rules,” and fantasy owners and analysts alike will have trouble figuring out what to make of the guy.

The truth of the matter is that pitchers do have control over BABIP and HR/FB, just to a much smaller extent than they do over things like strikeouts and walks. Because of this, after two months of a baseball season, we can look at a pitcher’s strikeout and walk rates and say something meaningful about him, because these rates stabilize a lot quicker than BABIP and HR/FB do (that is, there’s much less random variation in them). The fact remains that pitchers do have some measure of control over BABIP and HR/FB, and when we run into a pitcher like Shields, where the quick-and-dirty rules don’t seem to apply, this is an important thing to remember.

So what do we do? Pianowski suggests, “To get a fair gage of where he's at, I'd say we need to judge him against his own established career norms, not the league-wide average.” This might or might not prove to be a better route than assuming he’ll post a league-average HR/FB, but in any case, it’s not the optimal route.

To bring us closer to the optimal answer, we need to regress Shields’s HR/FB to the mean. As I said before, pitchers have control over just about everything to some degree or another. For HR/FB, I’ve found that it takes roughly 600 fly balls for HR/FB to stabilize* (this is using Retrosheet data; with MLBAM or BIS data, the number will change a bit). That is, after 600 fly balls, the pitcher’s observed HR/FB tells us exactly as much about his true HR/FB skill as merely relying upon league-average HR/FB would. With any fewer, the league average is the more accurate guide.

*I’ll talk more about how I arrived at this in my article next Monday. If you’re really interested before then, ask in the comments or e-mail me.
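One common way to express this kind of stabilization point is a reliability weight of n / (n + k), where n is the number of fly balls observed and k is the stabilization point. The sketch below assumes that form, which the article does not spell out, and uses the 600-fly-ball figure from the text; the function name is purely illustrative. The weights quoted elsewhere in this piece may fold in other adjustments, so this simple form is not guaranteed to reproduce every one of them.

```python
def observed_weight(fly_balls, stabilization_point=600):
    """Share of the estimate given to the pitcher's own HR/FB.

    Assumes the simple n / (n + k) shrinkage form; k = 600 is the
    stabilization point quoted in the article (Retrosheet data).
    """
    if fly_balls < 0 or stabilization_point <= 0:
        raise ValueError("fly_balls must be >= 0 and stabilization_point > 0")
    return fly_balls / (fly_balls + stabilization_point)


if __name__ == "__main__":
    # At exactly k fly balls, observed and league average count equally.
    for n in (150, 600, 1200):
        print(n, round(observed_weight(n), 3))
```

Under this form, the pitcher's own rate and the league average get equal weight at exactly 600 fly balls, which matches the definition of "stabilize" used above.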

It will take an extreme fly-ball pitcher about 2.66 seasons, and an extreme ground-ball pitcher about 4.75 seasons, to allow 600 fly balls. In the case of Shields, he has allowed 890 fly balls since his debut in 2005 (there’s no 2011 Retrosheet data, so I’ve assumed a career average rate) with a HR/FB of 14.9 percent. The league average since then has been 12.9 percent.

So for Shields, Pianowski would be right that we’d be better off trusting Shields’s career rate over the league average since he has more than 600 flies, but that’s first assuming that every year is worth the same (i.e. that 2005 tells us as much about Shields as 2010 does—not a trivial or particularly correct assumption to make), and second assuming that these are our only two options.

But there is a third option: combining our observed Shields performance with a mean performance. For the sake of simplicity, if I ignore seasonal weighting, aging, park effects, etc. and use the league average as our mean, based upon the 890 flies that Shields has induced in his career, we would include 69 percent Shields and 31 percent league average in our estimation of his true HR/FB talent, leaving us with a final result of a 14.2 percent HR/FB. That’s enough above the league average that we can assume SIERA or xFIP will be a bit too optimistic for him.
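The blend itself is just a weighted average. Here is a minimal sketch that takes the 69/31 split, the 14.9 percent career rate, and the 12.9 percent league average directly from the text rather than deriving them; the function name is hypothetical.

```python
def blended_rate(observed, mean, weight_on_observed):
    """Weighted average of a pitcher's observed rate and the mean rate."""
    if not 0.0 <= weight_on_observed <= 1.0:
        raise ValueError("weight_on_observed must be between 0 and 1")
    return weight_on_observed * observed + (1.0 - weight_on_observed) * mean


# Inputs as quoted in the article (rounded): 14.9% career, 12.9% league, 69% weight.
shields = blended_rate(observed=0.149, mean=0.129, weight_on_observed=0.69)

if __name__ == "__main__":
    print(f"Estimated true HR/FB: {shields:.1%}")
```

With these rounded inputs the result is about 14.3 percent, a tenth of a point from the 14.2 percent above; the gap presumably comes from rounding in the quoted figures.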

If we run this analysis for most pitchers (especially ones who have only been around a couple seasons), the final result will be very close to the league average, and SIERA will be fine to use for practical purposes. For a very small minority (like Shields, Brett Myers, Matt Cain, and a select few others), we may be a little off using SIERA (or FIP, xFIP, or any other ERA estimator).

Just because it’s off for a few pitchers by a not-trivial-but-still-relatively-small amount doesn’t mean that xFIP and SIERA lack “worthiness” (and it certainly doesn’t make ERA a better choice). As I said in my opening article, everything needs to be taken within the proper context. I wrote this article today because once we have a greater understanding of what it is we’re looking at, we can make better decisions about how to use the data in front of us.

As a quick endnote, just in case it wasn’t clear, I wanted to make it known that I’m not picking on RotoArcade in any way. In my live chat the other day, one question that I didn’t get around to answering (OK, avoided for fear of leaving someone out) was, “What are your favorite fantasy sites to read?” RotoArcade definitely would have been on this list. I like the writing, and I’m friends with a lot of the guys over there. This particular article merely gave me a reason to talk about this sort of thing without having to resort to a straw man argument, and if you read the entire post, you’ll see that Pianowski is more framing a debate than taking any one side, so consider this article a response to the debate presented.

As will always be the case, if you have any questions (related to this article or not), feel free to ask away in the comments or get in touch with me via e-mail, Facebook, or Twitter.

Derek Carty

Pronk4848

6/06

I want to make sure I am reading this right. Are you saying Shields's SIERA and FIP estimations are 69% MORE optimistic for him than they would be for a league-average HR/FB rate pitcher?

derekcarty

6/06

Well, we'd need to figure out the exact percentage separately (69% seems much too high - it'd likely be under 10%), but I am saying that Shields's SIERA/FIP/xFIP/estimator of choice will be a bit more optimistic than it should be. And I'm saying that it will be less correct for Shields than it would be for a player with the same inputs but whose HR/FB expectation is league average.

brianjamesoak

6/06

You know, the thing about SIERA is that it underestimates the value of a high groundball rate. For example, here is a quick and dirty list of buckets for all pitchers with over 150 innings over the last three years and how much higher SIERA is for each:

GB%      ERA     SIERA   Difference (SIERA - ERA)
55-60    3.64    3.95    0.31
50-55    3.83    4.05    0.22
45-50    4.09    4.20    0.11
40-45    3.94    4.04    0.10
35-40    4.23    4.25    0.02

First, I am not sure why SIERA is consistantly higher than ERA. I would guess maybe it was buiult off data when the league as a whole had stronger hitters, but I'm not sure I buy that. Usually league changes in hitting correspond with a change in pitcher rate stats.

But also, SIERA clearly pegs pitchers with low groundball rates much more accurately. Those with high GB rates can expect to beat their SIERA by a third of a run.

brianjamesoak

6/06

Sorry about the typos. Didn't have my coffee this morning.

derekcarty

6/06

Thanks, Brian. ERA estimators are far from perfect, but they are definitely going to be better than using ERA on the whole. Your numbers reinforce my point about knowing what you're looking at, so thanks for bringing them up. I didn't build SIERA, so I'm probably not the best guy to defend it. I'm not sure how exactly it has handled the change in run environment over the past couple years either. I do know that Colin Wyers is working on an improved ERA estimator for BP.

cwyers

6/06

Hmm, this is interesting - I don't know what to make of it. For a quick-and-dirty little study, I ran a correlation between (ERA-SIERA) and GB_PERCENT for pitchers with at least 120 IP, and I get a correlation of only -0.05. And the scatterplot looks about like you'd expect given that correlation (a great big blob, basically) - there's no indication that there's some non-linear relationship that the correlation isn't picking up very well.

My first inclination is that what you're seeing is an artifact of small sample sizes within the bins, rather than some defect within SIERA itself - that said, I'll try and recreate your exact study tonight and see if it doesn't shake out differently there.

As for SIERA constantly being higher than ERA - the intercept is fixed, rather than changing to reconcile to same-season ERA.

cwyers

6/06

I got pretty close (but not exact) to what you were doing on the 2008-2010 data I have; I don't know that the differences are significant.

But if I do the exact same thing but on 2000-2007 data, I no longer see the pattern you describe. Again, my gut tells me that you found a fluke.

surfdent48

6/06

With the 600 fly ball criterion, I would place more emphasis on the consistency pattern over the years. James Shields has a HR/9 rate of: 1.3, 1.17, 1.19, 1.50, 1.23 and a HR/FB % rate of 13, 11, 11, 13.8, 15 over the past 5 years. For me this would lower the "luck" factor greatly for him.
Chris Volstad, however, with just under 500 fly balls, has had wildly varying rates of HR/9: .32, 1.64, .87, 1.49 and HR/FB % of 3.9, 17.5, 8.8, 15.9. So the "luck" factor would seem to be very high for him. His xFIP is a full 2 runs lower than his ERA now, so luck is a big issue for him now. Comments?

derekcarty

6/06

Recent performance is definitely going to be more indicative than past performance, but I'm not a big fan of pattern analysis because a lot of times we'll think we've found a pattern when in reality it's simple random variation.

I don't think that because Volstad has had wide swings in his HR/FB that he is any more prone to luck than a guy whose HR/FB has been more consistent. Sure, luck has impacted his past performance more, but as long as we believe that what we were seeing really was just luck, that's not going to impact what kind of luck he has in the future.

What we do know about Volstad is that he has a somewhat small sample of major league performance in terms of HR/FB, and that when trying to assess his true HR/FB talent (assuming we have no minor league data to work with), we'd use something like 45% him and 55% league average.

Also, I should note that I rarely use HR/9. I think it's better to use OF FB% and HR/OF. Since HR/9 is influenced by the two, we can't tell why a pitcher is allowing more HRs by looking at it. Maybe it's a more legitimate change in OF FB% (which is much more stable than HR/OF and can be altered by a change in repertoire or approach by a pitcher) or a change in HR/FB which is more likely to be mere random variation. Because there could be two very different reasons for changes in it, I don't think it's very reliable.

markpadden

6/07

The main problem with using SIERA (or xFIP) for fantasy purposes is that ballparks have a large influence on HR/FB, and these ERA estimators totally ignore it. E.g., as far as I know, SIERA/xFIP use the same expected HR/FB for Padres and Rockies pitchers. Ditto for defense. Assuming an MLB-average HR/FB rate and MLB-average defense is appropriate for evaluating true skill of pitchers in between seasons. It's not, however, a good practice for predicting second-half fantasy stats, which I believe more people care about.

In other words, please fill this void and create a metric that assumes the pitcher will remain in his current park/defense environment for the rest of the season.

derekcarty

6/08

I think the effect of ballparks on HR/FB is often overstated. If a park has a 1.25 HR park factor (which would qualify as one of the most extreme parks in the majors) and we assume league average HR/FB is 11%, a pitcher for that team would have a park-adjusted HR/FB of just 12.4%. Sure, it's significant, but it's not huge. For an extreme fly ball pitcher, this would translate to about 0.21 points of ERA. For an extreme groundballer, 0.14 points of ERA. That's significant, but keep in mind that this is also at the extremes. If the park only affects HR/FB by 10%, the run impact is under 0.10 points of ERA (and about 0.05 for a GB pitcher).
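The 12.4% figure can be reproduced if we assume that half of a pitcher's fly balls come at home and that his road parks are neutral on average. That assumption is a guess at the arithmetic, not something stated above, and the function and parameter names are illustrative only.

```python
def park_adjusted_hr_fb(league_hr_fb, home_park_factor, home_share=0.5):
    """Expected HR/FB for a league-average pitcher in a given home park.

    Assumes road parks average out to a neutral factor of 1.0 and that
    home_share of the pitcher's fly balls are allowed at home.
    """
    blended_factor = home_share * home_park_factor + (1.0 - home_share) * 1.0
    return league_hr_fb * blended_factor


if __name__ == "__main__":
    # Extreme park (1.25 factor) and a milder one (1.10), league HR/FB of 11%.
    for factor in (1.25, 1.10):
        print(factor, f"{park_adjusted_hr_fb(0.11, factor):.2%}")
```

Under these assumptions the 1.25 park works out to 12.375%, and a 10% park effect to 11.55%. The conversion from HR/FB to points of ERA depends on fly-ball volume and run values that are not given here, so this sketch does not attempt it.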

Sure, this needs to be a consideration when evaluating a player's fantasy prospects fully, but we need to consider what exactly we want an ERA estimator to tell us. For me, I think an ERA estimator is most useful to get a snapshot of how the player is performing in a given year given neutral context. I want it to tell me about the player's skills and nothing else. Then, if it seems the player himself is performing differently than in the past, I can see if anything is different with his stuff or his approach.

I mean, if we want to know how a player will perform going forward, an ERA estimator is not the best thing to use. If we want to know how the player will perform given the context and setting he finds himself in, why not look at a projection? I mean, PECOTA is being updated in-season now, so why not go with that? It's going to be much more accurate than an ERA estimator - even one using park factors, defense, etc. - so why try to make the ERA estimator a dumbed-down version of the projection system by including park, defense, etc.? I'd rather it tell me something a little different in an attempt to help identify a potentially legitimate shift in the pitcher's skills.

Sorry for the long response. These are the kind of things I love thinking about.

markpadden

6/08

I hear you. Obviously, a high-quality rest-of-season projection is what we are all looking for. But until we are there, people continue to generate lists of xFIP to ERA, or SIERA to ERA differences mid-season for fantasy purposes. My argument is that these are not good tools to use to evaluate what a pitcher "should" have done thus far this season. We agree that xFIP and SIERA are not great raw predictors of ROS stats, but this presently is not stopping people from using them exactly as such over and over.

I'm all in favor of quality ROS projections, but I believe there is still a need for a single-season ERA predictor metric -- a SIERA-type stat that assumes the player's environment is not changing. This would satisfy those of us who want to get a simple look at how a pitcher is truly performing in the context of his current team/stadium, without the baggage of an opaquely-weighted multi-season projection.
