What weight can be given to batted ball data? How much can we use it to refine our knowledge of pitcher skills?
This question really implicates two related subquestions. First, how reliable is the underlying raw data? In other words, how accurate are scorers' codings of batted balls? Second, if we had perfectly consistent information about batted ball type, which types can pitchers consistently induce (or avoid), and which are the result of contingent factors, or luck?
On the first question, most agree that batted ball coding is not, at the moment, particularly accurate or precise. Even Sportvision's HITf/x, which promises greater accuracy, will not be able to measure ball spin, which will limit its effectiveness. But what about the second question?
Matthew Carruth argues that pitchers' batted ball types (particularly line drives) reflect repeatable skills.
They have developed metrics in an attempt to reflect this belief (and it must necessarily be a belief, since the low quality of the raw data makes it so hard to verify), and they try to control for the imprecision of that data by regressing the components heavily to the mean. Is this approach sound? You can find further treatment in this podcast with Carruth and Matt Klaassen. [Please note that this is an area about which many people hold strong views, but that is no excuse for being unkind, which I simply will not tolerate.]
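For readers who have not seen regression to the mean written out, here is a minimal sketch of the general idea. It is not the method behind any particular published metric. The function name, the sample rates, and above all the regression constant `k` are assumptions made up for illustration; choosing a defensible `k` for each batted ball type is exactly where the hard work lies.

```python
def regress_to_mean(observed_rate, n_batted_balls, league_rate, k):
    """Shrink an observed rate toward the league rate.

    k is the number of league-average batted balls blended into the
    pitcher's sample. A larger k means heavier regression. The value
    of k is an assumption supplied by the caller, not an established
    constant.
    """
    if n_batted_balls < 0 or k < 0 or n_batted_balls + k == 0:
        raise ValueError("need non-negative counts and a non-empty blend")
    weighted_sum = observed_rate * n_batted_balls + league_rate * k
    return weighted_sum / (n_batted_balls + k)


# Hypothetical numbers, chosen only to show the arithmetic.
pitcher_ld_rate = 0.15   # observed line drive rate
league_ld_rate = 0.19    # league line drive rate
batted_balls = 400       # batted balls allowed by the pitcher
k = 1200                 # assumed regression constant (heavy regression)

estimate = regress_to_mean(pitcher_ld_rate, batted_balls, league_ld_rate, k)
print(f"regressed line drive rate: {estimate:.3f}")
```

With these made-up inputs, the pitcher's .150 line drive rate is pulled three quarters of the way back to the league's .190, because the assumed 1200 league-average batted balls outweigh his 400 observed ones three to one. The noisier you believe the scorers' codings to be, the larger the `k` you would want, and the less any one pitcher's observed rate ends up mattering.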
Others, like Tom M. Tango, have shown that batted ball data can offer only marginal gains.
Are there specific questions that batted ball metrics help us to understand that non-batted ball data do not? Vice versa?
Beyond performing the rote mechanics of statistics, what does it mean to think statistically?
This non-baseball article has many suggestions, including a reminder to look outside the data and to aggressively avoid having an agenda. Preconceived ideas are antithetical to new learning.
If all this is too heavy for you, Rogue's Baseball Index offers baseball definitions of film terms.
A particular favorite: "Independence Day: The day on which a player is released by, or traded away from, the Kansas City Royals. Mark Teahen, for example, celebrated Independence Day on November 6, 2009."