Do teams like the Orioles that excel in one-run games do so out of skill, or have they just gotten lucky?
A few weeks ago, the topic for the BP Lineup Card was "Unanswered questions for the second half." I noted that at the time, the Cardinals were several games behind both the Pirates and the Reds in the NL Central standings, despite the fact that they had a better Pythagorean record than either. In theory, the Cardinals should have been atop the NL Central.
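For readers unfamiliar with the concept, the Pythagorean record estimates a team's expected winning percentage from its runs scored and allowed. A minimal sketch using Bill James's classic exponent of 2 (refinements such as Pythagenpat use a variable exponent closer to 1.83; the function name and sample run totals here are illustrative, not from the article):

```python
def pythagorean_win_pct(runs_scored: float, runs_allowed: float,
                        exponent: float = 2.0) -> float:
    """Bill James's Pythagorean expectation: estimated winning percentage
    from runs scored and allowed. Exponent 2 is the classic value;
    later refinements use an exponent near 1.83 or a variable one."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    return rs / (rs + ra)

# A team scoring 750 runs while allowing 700 projects to roughly
# a .534 winning percentage, regardless of its actual record.
pct = pythagorean_win_pct(750, 700)
```

A team whose actual record lags well behind this expectation, like the Cardinals in the example above, has usually underperformed in close games rather than in run differential.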
When Eric Seidman and I introduced SIERA last winter, we ran a number of tests to determine whether our theoretical model of run prevention led to a superior estimate of pitchers’ skill levels. While SIERA held a solid advantage over some ERA estimators at predicting future ERA and a razor-thin lead over xFIP, we ran the tests again after 2010 to ensure that the edge held going forward. Although the regression formula did not incorporate future ERAs and should not have been biased, it is still important to test against the following year to see how well SIERA held up.
Examining past MVP and Cy Young winners and the differences between their winning seasons and non-winning seasons.
With the Most Valuable Player and Cy Young awards announced in the last two weeks, we saw a first-time MVP in each league, a first-time American League Cy Young winner and a National League Cy Young winner who had won the American League Cy Young Award seven years prior. Winning consecutive MVP or Cy Young awards is a rarity, though we have seen recent repeats by Albert Pujols and Tim Lincecum. In the last 18 years (1993-2010, which encompasses the last two rounds of expansion), we have seen just six of 36 MVP awards go to the previous year’s winner, and just nine Cy Young Awards to the previous recipient. But the best hitter or best pitcher in the league is usually not a different person every year.
Look at which direction some hitters with high batting averages on balls in play are likely headed in 2011.
Last week, I discussed several pitchers who were pitching well in front of or well behind their peripherals using SIERA. This week, I will discuss several hitters who have particularly high BABIPs, and how much of that performance is skill versus luck.
As long as there are official scorers, we'll never get a completely factual recording of a game.
Former Astros third baseman Morgan Ensberg wrote his take on baseball stats recently. It’s actually a pretty interesting read. I don’t want to dwell on it, but there is one little comment in particular I’d like to talk about:
"The term 'Garbage in, Garbage out' is the most accurate description I can give. If the sample used is garbage, then the answers won’t be accurate. Sabermetrics requires accurate information or organizations may misinterpret the data."
With SIERA on our stat menu, here's an explanation of why it predicts pitcher performance so well.
It sometimes seems as if the main reason people are wary of Defense Independent Pitching Statistics as a way to measure pitching performance is a reluctance to accept the theory that pitchers do not control their hit rate on balls in play (BABIP). The claim is counterintuitive, and it is not even entirely true. Skeptical fans should be reassured by the knowledge that defense-neutral ERA estimators usually land much closer to next year's ERA than the previous year's ERA does, but many still cannot get past the assumption, built into most estimators, that pitchers have no control over the outcome of balls in play. These estimators work by isolating the effect on scoring of strikeouts, walks, and home runs, and they predict ERA well precisely because they can state the effect of each of those outcomes explicitly.
The estimators are tested to show the strength of Baseball Prospectus' new pitching metric.
Over the last three days the ERA estimator SIERA has been introduced, complete with explanations of its origins and the derivation of its formula. Now comes one of the most important aspects of building a new metric: making sure it works and testing it against its peers. Any estimator should be both realistic in its modeling and accurate in its predictive ability. The two do not automatically go hand in hand, however: a regression on sock height and induced foul balls caught by middle-aged men holding babies could, in principle, predict park-adjusted ERA better than anything else. Such a model might test well, but it is absurd and illogical; those two variables should not be assumed to have any impact whatsoever on run prevention. That regression would mean nothing in spite of its hypothetical test results, and, conversely, the most fundamentally sound statistic may test poorly.
Realistic modeling rests on a combination of truths and assumptions, as we've discussed before: the truth that walk and strikeout rates are stable, and the assumption that HR/FB is composed more of luck than skill. Based on yesterday's post, it seems safe to say that our modeling is sound, as each variable has a plausible impact on what it seeks to measure. The question then becomes how to test the results against the other estimators currently on the market. For the purposes of this study, we used root mean square error (RMSE), a simple but effective method that reports the average error between an actual and a predicted array of data. RMSE is easy to calculate in Excel: take the difference between each actual and predicted value, square it, average the squares, and take the square root. When comparing predictors, the lower the RMSE, the better.
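The RMSE calculation described above can be sketched in a few lines. The function name and the sample ERA figures below are illustrative, not drawn from the study:

```python
from math import sqrt

def rmse(actual, predicted):
    """Root mean square error: square each actual-minus-predicted delta,
    average the squares, then take the square root. Lower is better."""
    squared_deltas = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return sqrt(sum(squared_deltas) / len(squared_deltas))

# Hypothetical park-adjusted ERAs versus one estimator's predictions:
actual_era = [3.50, 4.20, 2.95, 5.10]
predicted_era = [3.70, 4.00, 3.10, 4.60]
error = rmse(actual_era, predicted_era)  # roughly 0.30 runs of average miss
```

Running the same comparison for each estimator over the same set of pitchers, and ranking by RMSE, is exactly the head-to-head test described here.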
Unveiling a new statistic that provides a clearer picture of pitcher performance.
Baseball fans who have no use for advanced metrics can realize the flaws in evaluating pitchers by their won-lost records, but may struggle to understand the inherent flaws in the more commonly used earned run average. Henry Chadwick invented ERA in the 19th century to measure the effect of defense on pitching performance, but not until Voros McCracken explained the concept of Defense Independent Pitching Statistics (DIPS) did our understanding of the relationship between pitching and defense take a big step forward. McCracken explained that pitchers controlled the rates of whiffing, walking, and getting walloped with home runs, showing that the correlation between these statistics in consecutive years was strong. Though he inferred an ability for hurlers to control these numbers, another finding suggested little persistence in their Batting Average on Balls in Play (BABIP), leading to the conclusion that ERAs were dependent on defense (or luck), and therefore very volatile.
Armed with this information, sabermetricians began to develop methods of estimating ERA by controlling for the factors that can muddy the proverbial waters. These estimators enable the evaluation of pitching performance based on what pitchers actually control, making it easier to track their abilities accurately. Watching trends in the skills pitchers do control can help us better grasp whether shifts in ERA are the result of changes from the individual or from external factors. Since then, many competing estimators have emerged, each with accompanying strengths and weaknesses. Perhaps the most popular ERA estimator is Fielding Independent Pitching (FIP), which uses the following straightforward formula: FIP = (13*HR + 3*BB - 2*K)/IP + 3.20, where the 3.20 is a constant dependent on the league and year, used to place the output on the ERA scale.
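A minimal sketch of FIP as conventionally computed, weighting the three outcomes a pitcher controls per inning pitched and adding a representative constant of 3.20 (the constant varies by league and year; the function name and sample line below are illustrative):

```python
def fip(hr: int, bb: int, k: int, ip: float, constant: float = 3.20) -> float:
    """Fielding Independent Pitching: weights home runs, walks, and
    strikeouts per inning pitched, then adds a league/year constant
    to place the result on the ERA scale."""
    return (13 * hr + 3 * bb - 2 * k) / ip + constant

# A pitcher allowing 20 HR and 50 BB with 200 K over 210 innings
# comes out to roughly a 3.25 FIP with this constant.
value = fip(hr=20, bb=50, k=200, ip=210)
```

Because the weights apply only to outcomes the defense cannot influence, two pitchers with identical HR, BB, and K rates get identical FIPs no matter how their balls in play fell.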