June 2, 2010
Checking the Numbers
Earlier in the year we introduced SIERA, an ERA estimator designed to predict pitcher ERA more accurately from inputs that are more stable and more significant, both clinically and statistically. With the season two months in, we are monitoring the returns more closely, and in doing so observed that the Arizona Diamondbacks have a vast disconnect between their actual ERA and what it should be given what are referred to as controllable and stable skills. Interestingly enough, this was a bugaboo for the desert dwellers last season, as several pitchers finished the year with a similar statistical archetype: very solid K/BB ratios, more grounders than flyballs, poor ERA.
When you are as immersed in a field as we are, seeing the first two components usually translates into “good numbers,” but the third doesn’t always agree. This issue plagued the Diamondbacks last season, and while they were unlikely to play meaningful games into October this year or last, the problem has once again surfaced. While some may wonder whether defense is playing a role, the Diamondbacks’ defensive efficiency in 2008 and 2009 was .686 and .687. They have fallen to .673 this year with mostly the same cast of defenders, and defense certainly does not explain the 18.7% HR/FB, dwarfing the 12.3% in 2008 and the 13.0% of 2009. With the sample sizes increasing with each start, we decided to place three different Snakes starters under the microscope to explain how their SIERA does or doesn’t match up with their ERA, and to dispel some common misconceptions about estimators.
Dan Haren’s ugly ERA for the first two months of 2010 illustrates exactly why statistics like SIERA were designed. His repeatable skill-based stats (strikeout, walk, and ground ball rates) have held steady, while his luck-based stats, HR/FB and BABIP (especially with runners on base), have shot through the roof. While some pitchers are better than others at keeping batters from hitting fly balls to the outfield in general, there is no distinct group of pitchers that specializes in keeping 300-foot fly balls from going 400 feet. On top of that, even though Haren has allowed fewer line drives and fewer balls to reach the outfield in the air (both indicators that batters have primarily made weak contact), his BABIP has shot up to .335, and is an even more unlucky .400 with men on base.
His bad luck is easier to pin down because his ugliest run totals came in games in which he performed rather well. On April 10, Haren recorded nine punchouts and issued two walks against 32 Pirates hitters. He allowed just five outfield fly balls, but two landed on the wrong side of the wall. Ten days later, he struck out eight Cardinals, while walking just two, but three of his four outfield flies left the park. On May 11, the Dodgers whiffed 10 times in their 31 plate appearances against the ace, walking just once. Of the other 20 hitters, 10 recorded hits on balls in play, leading to four earned runs and only 6 1/3 innings.
Haren’s only legitimately bad start, meaning he pitched poorly and the box score backed that assertion up, was May 16 against the Braves, when he struck out two hitters and walked three, allowing two home runs out of 10 outfield flies, adding up to six runs in just 4 1/3 innings. Last Thursday’s performance against the Rockies was even more typical of his season, as four of his 15 outfield flies landed on the wrong side of the Coors Field walls. All in all, the Diamondbacks are just 5-6 in his starts, and might well have another two wins if not for some bad luck in his games.
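The home-run tallies cited for these starts are easy to total. The short Python sketch below is only an illustration: it uses the four starts for which the text gives both home runs and outfield flies, the function and variable names are our own, and the rate is per outfield fly as described here, which may not use the same denominator as the HR/FB figures quoted earlier.

```python
# Home runs and outfield fly balls for the four Haren starts described in the text.
# Each entry: (label, home_runs, outfield_flies). All names are illustrative only.
STARTS = [
    ("April 10 vs. Pirates", 2, 5),
    ("April 20 vs. Cardinals", 3, 4),
    ("May 16 vs. Braves", 2, 10),
    ("Last Thursday vs. Rockies", 4, 15),
]


def hr_per_outfield_fly(starts):
    """Return (total_hr, total_flies, rate) summed over the given starts."""
    total_hr = sum(hr for _, hr, _ in starts)
    total_flies = sum(flies for _, _, flies in starts)
    return total_hr, total_flies, total_hr / total_flies


total_hr, total_flies, rate = hr_per_outfield_fly(STARTS)
print(f"{total_hr} HR on {total_flies} outfield flies ({rate:.1%})")
```

Across those four starts, 11 of 34 outfield flies left the park, roughly one in three.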
A quick glance at Edwin Jackson would yield the same claim of “bad luck,” where we play a little “sabermetric pepper,” and note the bad BABIP, the bad HR/FB, the solid walk, strikeout, and ground ball numbers, and quickly conclude that they all indicate Jackson is likely to turn it around and be an above average pitcher. That’s all true. Jackson is likely to rebound, because skills like strikeouts and walks are more likely to repeat than balls in play falling for hits. That’s why SIERA is an “ERA predictor.” However, nobody would dispute that Haren is on a different plane than Jackson, and while the former is likely to finish the season with a sub-4.00 ERA as he does every year, the latter’s definition of “rebound” might not be as substantial.
Jackson was chosen for this study because he reeks of confirmation bias and represents the type of pitcher for whom extra research is needed to avoid inaccurate assessments. Jackson has always been one of those loads-of-potential pitchers who had never turned the corner. And then the first half of last year happened. Instead of asking if that was fluky, most assumed he had finally turned the corner. After all, the same people had expected Jackson to perform this way for several years. In reality, it is perfectly valid to think of his first half with the Tigers as the outlier.
Jackson’s BABIP is so high primarily due to two consecutive terrible starts: April 27 and May 2. In those two games, he struck out only two hitters and walked three, while giving up 18 runs in 6 1/3 innings. His BABIP in those games was .541 (20 for 37), as compared to .280 otherwise. From a PITCHf/x standpoint, Jackson doesn’t appear to be doing too much differently. His fastball and slider usage has dropped about 3-5 percent, with increases in curveballs and changeups making up the difference. This isn’t anywhere close to the realm of extreme. On top of that, his velocities have remained intact, and the only number not in line with years past is fastball movement.
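For readers who want to check the two-start figure, here is a minimal Python sketch. It assumes the hits and balls in play have already been counted (20 and 37, as stated above); the helper names are hypothetical, and the full-season formula in the comment is the standard BABIP definition rather than anything specific to this article.

```python
# Standard definition, for reference: BABIP = (H - HR) / (AB - K - HR + SF).
# Here the numerator and denominator are taken as already counted.

def babip(hits_in_play, balls_in_play):
    """Batting average on balls in play from pre-counted totals."""
    if balls_in_play <= 0:
        raise ValueError("balls_in_play must be positive")
    return hits_in_play / balls_in_play


def fmt_avg(x):
    """Format a rate in baseball style, e.g. 0.5405 -> '.541'."""
    return f"{x:.3f}".lstrip("0")


# Jackson's two starts on April 27 and May 2: 20 hits on 37 balls in play.
two_start_babip = babip(20, 37)
print(fmt_avg(two_start_babip))
```

The season-long figure cannot be rebuilt the same way from the text alone, since the ball-in-play counts for his other starts are not given.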
While his pitches moved 4.9 inches horizontally and 10.3 inches vertically last year, the movement components are currently 4.0 and 9.4. The overall effect of less vertical and horizontal movement could certainly be the difference between a pitch on the outside corner versus one hovering over the plate. This might not be the definitive reason for Jackson’s struggles (after all, he wasn’t that great in 2007 or 2008 with similar movement numbers), but little else sticks out from this granular standpoint. With nothing else sticking out to explain the two incredibly poor starts, and little difference in the movement of his pitches, chances are his SIERA will rise as his ERA falls towards it, giving him a smaller gap to close than it may seem.
Aside from a brief stint early in 2008 and a couple of starts in 2007 and 2009, Ian Kennedy had not gotten any real major-league experience before this season. In 2010, he certainly has had some good luck (.247 BABIP) that has had a larger effect than some of his bad luck (16.4% HR/FB), but Kennedy has simply been a solid pitcher this year, finally living up to some of the hype that surrounded him at USC and in the Yankees organization. When Haren’s luck turns around and Jackson’s numbers look a little bit less like they were torpedoed by two ugly starts a month ago, Kennedy will fit snugly into the Diamondbacks’ otherwise solid rotation, striking out nearly three times as many hitters as he walks, keeping the damage from his fly ball tendencies to a minimum.
But Kennedy is not assured of keeping his pretty numbers all year long. While it makes sense that he should, given his propensity for K/BB glory, his BABIP is very low, well below even what one might expect from aces like Roy Halladay, CC Sabathia, or Zack Greinke. Those might be the extreme pitchers with the skills to lower their ERA about 0.15 below their team’s average, while Kennedy fits better in the category of someone perceived as a prospect who didn’t pan out. This is another important concept of estimators: just because an estimate is close to a pitcher’s ERA does not mean the pitcher will sustain both marks all year long. With a pitcher like Haren, it is acceptable to expect a big turnaround due to both his track record and very solid controllable rates this year. With one like Jackson, it’s just as acceptable to expect a less substantial turnaround, with the ERA perhaps settling around 4.75 barring something unforeseen.
And with someone like Kennedy, it makes sense to expect nothing big to change, but that should not be treated as a guarantee. For all we know, his numbers could be 4.31 ERA/4.52 SIERA a month from now, with little disconnect between the two but a downturn in performance across both. The Diamondbacks are unlikely to make any headway in the National League West even with these three meeting expectations, but the lack of agreement across the metrics right now does not help their cause in any way.
Eric Seidman is an author of Baseball Prospectus.