Resident Fantasy Genius: How Do We Measure Award Candidates?

November 29, 2011

Last Monday, I penned an article about the NL Cy Young and AL Rookie of the Year awards voting that drew a lot of feedback from readers who disagreed with what I wrote. Today, I wanted to take a step back and look at awards voting in a much broader (but hopefully clearer) sense.

By the time the end-of-year awards are given out, everyone has their own ideas about who should win each award, and these ideas are often based on very different concepts of what the award is all about and what makes a player worthy of it. This is important to remember, and it sometimes gets lost in the confusion of arguing the merits of individual players.

For some people, the awards are all about what actually happened. It doesn’t matter if Justin Verlander’s .236 BABIP is unsustainable or that, when viewed through the lens of FIP, his season wasn’t all that spectacular (since the dawn of the Cy Young era, 84 pitchers have posted a better FIP+ in at least 250 innings—and many logged far more—but just four have won the MVP). It doesn’t matter that Jose Bautista was hitting his homers in the friendly Rogers Centre while Jacoby Ellsbury was hitting in the deceptively unfriendly Fenway. What matters for these people is what actually happened on the field. Some people in this camp might account for things like park, defense, offensive support, etc., but the actual results are the centerpiece.

Others like to take a different tack and feel that we should reward a player for the things he’s most responsible for and ignore the things that he has little control over. In the case of pitchers, this means refusing to credit a guy like Verlander for his .236 BABIP. When trying to strip out “luck,” ERA estimators are often the first stop since they do a good job of combining the stats a pitcher can control the most, though things like park factors, quality of opposition, and other contextual factors are important to consider as well. And of course, ERA estimators are only shorthand and are far from perfect. They incorporate 100 percent regression for certain stats, like BABIP, while they don’t incorporate any regression for others, like strikeouts and walks. This leads to a very simplified look at the pitching dynamic, since pitchers obviously don’t have perfect control over strikeouts and zero control over hits.
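As a rough illustration of that last point, here is a minimal sketch of the standard public FIP formula. The function name, the two stat lines, and the constant of 3.10 are assumptions made up for the example (the real constant is recalculated each season so that league FIP lines up with league ERA). Hits on balls in play are never an input, while strikeouts and walks are taken entirely at face value.

```python
def fip(line, constant=3.10):
    """Fielding Independent Pitching from the standard public formula.

    `line` is a dict of counting stats. `constant` is an assumed
    placeholder; in practice it is set per season so that league-average
    FIP equals league-average ERA.
    """
    if line["ip"] <= 0:
        raise ValueError("innings pitched must be positive")
    numerator = 13 * line["hr"] + 3 * (line["bb"] + line["hbp"]) - 2 * line["k"]
    return numerator / line["ip"] + constant


# Hypothetical stat lines (not real pitchers): identical strikeouts,
# walks, hit batters, and homers, but very different hit totals.
base = {"hr": 20, "bb": 50, "hbp": 5, "k": 220, "ip": 230.0}
low_babip = dict(base, hits=170)
high_babip = dict(base, hits=210)

# FIP never reads the "hits" field, so both pitchers get the same number.
print(fip(low_babip), fip(high_babip))
```

Because the hit totals are ignored completely, the two pitchers are indistinguishable here, which is exactly the simplification described above.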

To combat this, others might choose to apply the proper regression to each player’s entire set of stats, put them back together, and then see who comes out on top. While this might be a good next step for those in our second philosophical camp, it can create problems of its own. Going back to Verlander again, we see that he had an 8.8 percent HR/FB this season. However, if we were to regress that rate, we would end up regressing almost the entire way back to the league average of 11 percent or so. That could be a problem, since Verlander’s career HR/FB is well below league average at 7.8 percent. Similarly, with a pitcher like Clayton Kershaw, who has posted outstanding HR/FB rates for a shorter period of time (in this case, three seasons), judging how much is real and how much is noise becomes even more difficult.
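To make the HR/FB problem concrete, here is a minimal sketch of one common way to regress a rate toward the league mean: pad the pitcher's observed sample with a fixed number of league-average fly balls. The function name, the fly ball counts, and the stabilizing constant of 1,000 fly balls are hypothetical values picked only for illustration, not measured figures; the rates echo the ones quoted above.

```python
def regress_rate(observed_rate, sample_size, league_rate, stabilizer):
    """Shrink an observed rate toward the league rate.

    The observed rate is weighted by sample_size / (sample_size + stabilizer),
    so small samples are pulled almost all the way to the league rate.
    """
    if sample_size < 0 or stabilizer <= 0:
        raise ValueError("sample_size must be >= 0 and stabilizer > 0")
    weight = sample_size / (sample_size + stabilizer)
    return weight * observed_rate + (1 - weight) * league_rate


LEAGUE_HR_FB = 0.11   # "11 percent or so", as quoted in the text
STABILIZER = 1000     # hypothetical number of league-average fly balls to add

# Hypothetical fly ball counts: one season versus a multi-year track record.
one_year = regress_rate(0.088, 250, LEAGUE_HR_FB, STABILIZER)
career = regress_rate(0.078, 1500, LEAGUE_HR_FB, STABILIZER)

print(f"one season: {one_year:.3f}, career: {career:.3f}")
```

With a single season the estimate lands close to league average; with a longer track record it stays nearer the pitcher's own rate. Which sample to feed in is precisely the judgment call the awards voter faces.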

Maybe pitchers in general don’t deviate much from league average, but certain ones do, and those that do are usually found at the extreme ends of the spectrum, where awards voting takes place. And if we can’t say for certain whether a pitcher is showing a legitimate skill, is it fair to discredit him for it? We won’t know for several more years whether Kershaw’s HR/FB is more skill or luck, but voting takes place now, creating a bit of a dilemma for those who want to reward pitchers based on what they can control.

For stats like HR/FB and BABIP, if all we have is one year of data, regressing heavily would be correct. But when we’re dealing with pitchers who have an established track record, we reach an impasse. History tells us that Verlander is unlikely to be a league-average pitcher in terms of HR/FB, but do we really want to incorporate past data into our Cy Young/MVP deliberations? Are we looking more for the player who had the greatest season, or the player whose great season is most likely to be repeatable? Here we begin straying into projection territory, and that may well answer a different question entirely. This camp might contend that the MVP should go to the best player in the league, in which case a projection might be the right direction to go (scouting data would also be important here), but this would come with the realization that the best player might not be the one who had the best season, either on the surface or peripherally.

There are a lot of different ways to approach the awards voting issue but no truly clear-cut solution. I’m sure everyone has their own opinions as to which way is best (and I know I haven’t touched on nearly every approach), and there’s no way to say, absolutely, that one way is superior to another. They all present problems that must be reconciled. Before we argue over the worthiness of specific awards candidates, it’s important that we make sure we’re on the same page as far as what question we’re looking to answer. Most of us (myself included) are guilty of failing to do this occasionally, which creates more questions and more confusion than it does answers. I don’t presume to have the perfect solution, but recognizing the problem is important in itself.

Derek Carty

tombores99

11/29

Advance apology for the shameless plug - seriously, I have no shame - but I published a related piece yesterday at BDD, and your article from last week covered a large chunk of the debate. This was especially true in the comments section, where several readers brought up specific points that had been banging around in my head over the last couple weeks. It was crazy, like having my own words read back to me, and a tribute to the depth of conversation that takes place in the halls of BP.

So... thanks for spurring an excellent discourse, and for having the gumption to follow up and address a difficult topic at the center of the discussion.

Well done, Derek!

http://www.baseballdailydigest.com/2011/11/27/true-value-part-i-failure-to-communicate/

rawagman

11/29

Put me firmly in the "best season", not "best player having a good season" camp. I save the latter for the Hall of Fame debates.

hotstatrat

11/29

. . . and all-star game selections

TheRedsMan

11/29

I think this fundamentally mis-frames the argument. It is not "what actually happened" vs. "what the player was actually responsible for". It's both. Yes, ERA is an account of actual runs scored, but those runs are scored against a team of players, not a pitcher.

Is FIP a record of runs allowed/prevented? Nope. Is it a measure of "what actually happened on the field"? You bet it is. It's just presented on a friendly scale.

Those strikeouts, walks and homers all actually happened. And furthermore, they tend to be a more accurate measure of the pitcher's actual performance than his ERA.

We should not conflate the issue of imprecise/inaccurate measurements of performance with a debate between performance vs. true-talent. The former is ERA vs. FIP. The latter is FIP vs. xFIP.

TheRedsMan

11/29

And put me in the best performance box as well -- so long as you measure the aspects of performance for which said player is actually responsible.

derekcarty

11/29

I think we're just talking semantics here. By "what actually happened," I simply meant the final results of everything -- runs, wins, whatever. Obviously strikeouts and walks and home runs actually happened, but FIP ignores other things that actually happened like hits and runs. So maybe it's more accurate to say that the first camp is more prone to take in everything that actually happened as opposed to picking and choosing. There are certainly different shades to every perspective, but I think there's a distinct camp where the runs are paramount and a distinct camp where the runs are mostly ignored in favor of things that are more "stable."

wilsonc

11/29

One way of looking at it is that, when trying to isolate a pitcher's performance from his team's (not getting into true talent, but rather looking at actual performance), there are really two different ways to look at it: you can start with the big picture and then credit or debit the pitcher based on components that are more dependent on his team than his own performance, or we can start from the base components and try to build a model based on what those components would be worth in a neutral environment.

I think sometimes this difference in perspective leads us to underestimate the limitations of our current tools.

FIP gives us broad-stroke insight, but it treats batters faced as independent entities, whereas in reality pitchers have a significant impact on creating the context for the batters they face. That makes it a bit of a hybrid between a performance metric and a true-talent estimator. It suggests performance by capturing the biggest factors, but ignores finer details.

A runs-based metric solves the core issues that FIP has, because sequencing is implicitly included, as are all aspects of a pitcher's performance. The problem is that it also introduces noise: defense, bullpen support, etc. While these are elements that can be measured, it's not easy to properly divide the credit or debit among the different pitchers on a team.

Though the objective is the same with both methods, one school prefers to err on the side of assuming team responsibility for uncertain elements, whereas the other assumes pitcher responsibility for uncertainties. Sometimes I think we spend too much energy debating which approach is "better", and not enough time looking at the degree of inaccuracy involved in whichever method is being discussed.

TheRedsMan

11/29

I don't think it's semantics. The primary division is not due to a desire for "stability". Rather it is "accountability". That the things which are more directly a result of the player himself tend to be more stable is a side effect.

The FIP crowd doesn't abandon runs scored in award voting because they are variable. It abandons them because they are a function of the performance of the entire team, not the player to which the award vote is being given. Even if you could isolate all the randomness, you'd still have the reality that a pitcher's ERA is more like a hitter's RBI than his OPS.

hotstatrat

11/29

And, there is the argument over whether a pitcher's overall contributions to winning games should count as much as a position player's in the MVP award voting. Personally, I am in the camp that says the awards are whatever we want them to be and that we shouldn't get hung up on what they are called. Since for the Cy Young Award, there is no corresponding Honus Wagner Award for the best position player, we can make the MVP award just that - or lean towards that unless the pitcher just blows away all position players in value.
