September 28, 2010
Racing for the Cy
It’s hard to believe that the regular season is almost over, as it feels like yesterday that Ubaldo Jimenez had his sub-1.00 ERA and Roy Halladay tossed his perfect game. The year has flown by, and as we enter the final week of the regular season it is impossible to avoid discussions of potential award winners. Today, we will focus on the Cy Young Award, as this was the “Year of the Pitcher” after all, and in addition to a couple of top-notch candidates in both leagues, there is a bevy of pitchers whose numbers may have merited more serious inclusion in another year. Unfortunately, discussions of the award tend to veer off in different directions because the award itself is perceived as ambiguous. Before getting into any conversation about who will or who should win the award, we have to ask: who is supposed to get it?
Realistically, the award is used to honor the best pitcher in his particular league in a given season. Now, how do we define “best”? Cue the convolution. For the most part, voters tend to use wins and earned run average. The former doesn’t tell us anything of tangible value, while the latter is merely a useful tool in shaping the resume of a pitcher, not the end-all tool. Using a statistic like wins above replacement isn’t entirely valid either, as it includes the batting and fielding components; does the difference in the batting lines of Adam Wainwright and Roy Halladay have anything to do with their pitching attributes? Yes, Wainwright might stay in games more because he can handle the bat, but his opponents’ TAv is what is of interest, not his own.
Beyond WARP or similar all-encompassing metrics, is it better to gauge value using actual run-prevention measures or the stats that hold more predictive value like SIERA? Again, this is integral to determining who is the best pitcher, but nowhere near the vicinity of clear-cut. If we go too far in either direction we run the risk of caring too much, or too little, about the defenses behind these hurlers, as well as the luck associated with their lines. Luck has varying degrees, but right now we just do not know with real certainty how to make the differentiation between the different forms. For instance, Francisco Liriano has terrific peripherals but his ERA is four-tenths of a run higher than his SIERA. His numbers are inflated by a high batting average on balls in play, but how do we properly account for the number?
We don’t know if he is being penalized by bloops and ducksnorts on solid pitches, or if the opponents are just taking advantage of his mistake pitches at a higher rate. While both scenarios result in added hits, the former scenario is probably less likely to persist. Completely eliminating the results of the inflation doesn’t seem right, nor does including everything. Lastly, too many people get caught up in who will as opposed to who should win the award. Why spend time trying to get into the minds of voters? Is it really interesting to figure out which pitchers will have their wins overvalued? Personally, I don’t think so, and for that reason I decided against running a regression in order to determine which three or four variables have the strongest relationship to an award win.
Discussing the best pitchers should involve discussing the best pitchers. If the “will” and “should” happen to be one and the same, wonderful, but they aren’t always intertwined. To that end, I chose seven statistics that summarize the performance of a pitcher, and ran the pitchers in each league through a simple ranking system. Each pitcher was ranked in every category and the ranks were combined, so the lowest score would provide us with the best pitcher in each league, as he would have been closer to the league lead across the categories than anyone else. The only pitchers I excluded were those with fewer than 150 innings, as I wanted to restrict the sample to starting pitchers who have pitched for the entire season.
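For anyone who wants to try something similar at home, here is a minimal sketch of a rank-sum system of this kind. It is not the exact spreadsheet behind the tables: the function name `rank_sum`, the three stand-in categories (innings pitched, ERA, and a support-neutral winning percentage), and every number in `sample` are invented for illustration, and the sketch makes no attempt to handle ties.

```python
from typing import Dict, List, Tuple

Stats = Dict[str, float]


def rank_sum(
    pitchers: Dict[str, Stats],
    categories: Dict[str, bool],
    min_ip: float = 150.0,
) -> List[Tuple[str, int]]:
    """Rank pitchers in each category and add up the ranks (lowest total is best).

    `categories` maps a stat name to True when a higher value is better
    (innings, for example) and False when a lower value is better (ERA).
    Ties are not handled specially; this is only a sketch.
    """
    # Drop anyone under the innings threshold, as the article does.
    eligible = {name: s for name, s in pitchers.items() if s["IP"] >= min_ip}

    totals = {name: 0 for name in eligible}
    for stat, higher_is_better in categories.items():
        # Best pitcher in this category comes first and gets rank 1.
        ordered = sorted(
            eligible, key=lambda n: eligible[n][stat], reverse=higher_is_better
        )
        for rank, name in enumerate(ordered, start=1):
            totals[name] += rank

    # Lowest combined rank first.
    return sorted(totals.items(), key=lambda item: item[1])


# Hypothetical pitchers and made-up numbers, purely for illustration.
categories = {"IP": True, "ERA": False, "SNWP": True}
sample = {
    "Pitcher A": {"IP": 240.0, "ERA": 2.40, "SNWP": 0.650},
    "Pitcher B": {"IP": 230.0, "ERA": 2.30, "SNWP": 0.640},
    "Pitcher C": {"IP": 120.0, "ERA": 1.90, "SNWP": 0.700},  # under 150 IP, excluded
    "Pitcher D": {"IP": 200.0, "ERA": 3.50, "SNWP": 0.550},
}

for name, score in rank_sum(sample, categories):
    print(name, score)
```

In this toy data, Pitcher C has the best rate stats but never enters the ranking because of the innings cutoff, which is the same trade-off the 150-inning threshold makes in the real lists.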
The statistics of interest were innings pitched, ERA, FRA, SIERA, WHIP, SNLVAR, and SNWP. Yes, there are other stats that could have been included, but I felt this septuplet got the job done. Quantity doesn’t imply quality, but if I had the choice between two pitchers with identical SNWPs, with one having thrown 40 more innings, my selection would be very easy. The triumvirate of run-prevention marks gives a broader scope of skill than just ERA, and for those unfamiliar, FRA is essentially ERA with an adjustment for inherited/bequeathed runners. This way, pitchers aren’t under- or over-credited based on the skills of the succeeding relievers. Lastly, the support-neutral metrics paint a picture of skill independent of the run support received while adjusting for the strength of the opposition. Here are the results for the National League:
And the same table for the American League:
Starting with the senior circuit, the numbers confirm what everyone likely expected: it is Halladay, Wainwright, and then everyone else. Halladay and Wainwright are tied for first, with each pitcher ranked near the top in all seven categories. Interestingly, Wainwright has a smaller deviation in his ranks; he finished second, third, or fourth, while Halladay finished anywhere from first through fifth in the categories. Without any type of context, this looks to be a wash. In cases like that, qualitative tiebreakers are fair game, and Halladay performed this well for a division champion in dire need of consistent value with all of its injuries, while also throwing a perfect game. Either one of these pitchers deserves the award, but Halladay seems to deserve it just a bit more.
One of the most interesting tidbits on the National League list is the third-place finisher: Roy Oswalt. Raise your hand if you thought he would be third. OK, you in the back, stop lying. Oswalt has been sensational since joining the Phillies, and while the differences among him, Tim Hudson, and Josh Johnson are relatively small, it’s great to see a strong rebound after 2009’s injury-plagued mediocrity. Johnson fell victim to the injury bug yet still secured the ERA title; had he remained healthy, he might have been able to give Halladay and Wainwright a run for their money for the Cy Young.
The rest of the list isn’t really that unexpected. Cole Hamels and Matt Cain have been brilliant. Jimenez has stayed fantastic for the entire season. Mat Latos is already an ace, regardless of innings restrictions. And if not for the divorce of their owners and their pitiful season, the Dodgers’ Clayton Kershaw would be garnering much more attention for his efforts. At the end of the day, my vote for the best pitcher in the National League this year would go to Halladay, but I wouldn’t exactly lose any sleep if Wainwright took home the hardware a year after he was arguably the best pitcher in the league and lost by a hair. Incidentally, Halladay and Wainwright represent a meeting of both sides, as they top everyone else in both basic and advanced metrics. Everyone agrees these two have been the best.
The American League results bring up one of the most debated questions of the last few weeks: Felix or CC? A quick look at the data not only indicates that Felix Hernandez has been, far and away, the best pitcher in the league, but that CC Sabathia isn’t second on the list. He isn’t third on the list. Heck, he isn’t even fourth. Now my system here wasn’t formulated by NASA engineers, but it is a decent proxy for success, and what is undeniably clear is that Sabathia ranks first in only one category: wins. He leads in a category that is about as overhyped as the movie Donnie Darko, and that’s it. Now, his numbers are not poor, but they are nowhere near as great as Hernandez’s, and giving him the award would be reminiscent of Bartolo Colon beating Johan Santana in 2005. In fact, check this out:
By all accounts, Colon’s award was a cheap one, given that Santana had a superior season in every conceivable way aside from wins. I would argue that the gap between Santana and Colon is smaller than the one between Hernandez and Sabathia, and SNLVAR agrees. I haven’t weighed in on this subject in a while, either here or on Twitter, but I have to say that, hyperbole aside, the vote for this year’s AL Cy Young award has the potential to be one of those monumental moments in baseball. If Hernandez wins the award, which he clearly deserves without any debate, it will mark a true change in the way people think about value in baseball, which is, for lack of a better description, incredibly cool.
When I first started writing, my goal wasn’t to be smarter than anyone else or to make fun of the mainstream media. My goal was to help anyone interested understand the best ways to assess value. Wins, very clearly, are nowhere near the top of that list. And while I still think it looks good when someone wins 20 games or sports a 19-7 record, I understand that the perceived value is much greater than the actual value. The only quantitative advantage Sabathia has on Hernandez is in the wins department. Everything else is qualitative, such as his playing for a contender.
A win by Hernandez with a .500 or worse record (I actually hope he pitches another eight-inning, one-run gem of a loss so he finishes 12-13) would show that more and more people are coming to terms with the fact that receiving run support from an offense in no way makes a pitcher better or worse than another. Hernandez has had a much better season than Sabathia and should be honored as such. For those who argue that, if not Sabathia, it should be David Price, well, then I would agree, if the contest were to name the fourth- or fifth-best pitcher in the league this year.
Unlike the NL, the American League race represents a parting of the ways, as the advanced and basic stats disagree. Fortunately, everything other than wins is so overwhelmingly in favor of Hernandez that to ignore his qualifications would be akin to admitting sheer ignorance. I am not expecting every writer in the country to suddenly start using SNWP or FRA in their stories, but I have yet to read or hear any argument favoring Sabathia that does not hinge on his “being a winner” or other such garbage. It has become so tough for me to understand why certain people will force that puzzle piece to fit, just to try to keep the “wins” stat afloat when every other piece of evidence points them in a much different direction.
At the end of the day, the best pitchers in the National League are Halladay and Wainwright. I would vote for Halladay first, but Wainwright is tremendous as well. In the American League, I would vote for Hernandez (shocker!), and my second-place vote would go to Jered Weaver who, ironically, is excellent behind a 13-11 record. Sabathia would maybe go fifth on the list. This isn’t a Halladay-Wainwright situation at all. We will have to wait and see how things shake out, but if Hernandez does not take home junior circuit hardware, we’re going to have one sad Seidman on our hands.