November 6, 2009
Checking the Numbers
Ever since Billy Beane wrote Moneyball (right, Mr. Morgan?) in order to prove that the true path to success involved only seeking the services of high-OBP employees, analysts of several varieties have worked diligently to discover market inefficiencies worth exploiting. One of the areas that has risen to prominence recently, likely due to the increased availability of the data, focuses on plate discipline from both sides of the matchup, whether exhibited by hitters or induced by pitchers.
Data providers such as Baseball Info Solutions record information based on the strike zone in a plate appearance, determining the percentages of swings and contact on balls both in and out of the zone, as well as the rate of pitches thrown or observed that fell in the zone itself. This type of granular information affords analysts the opportunity to track tendencies such as which hitters chase more pitches out of the zone or which pitchers induce these chases more often. However, the numbers remain a tad ambiguous given that their application is largely contingent upon conventional wisdom; higher rates of out-of-zone swings are bad, mmmkay? This isn't always the case, though, and the rarely discussed inverse of taking too many pitches inside the zone could also be considered poor in process.
Luckily, with the ever-growing PITCHf/x dataset, we can apply a method known as signal detection theory to gauge discipline at the plate. You might remember signal detection theory from such articles as "Is Walk the Opposite of Strikeout?" or "The Return of the Fisheye". The technique is commonly used in epidemiological studies, as well as in cognitive psychology and engineering. It hinges on the idea of a perfect test, one that codes all positive results as true positives and all negative results as true negatives. Unfortunately, such tests do not exist, and two kinds of errors surface: false negatives (being told you are healthy when you really aren't) and false positives (hearing some bad news in error). The first article mentioned above, written by Russell Carleton, applied this technique to Retrosheet data in order to measure plate discipline in a results-based fashion. Since Retrosheet lacks data for pitch location, the study was restricted to the actual results: swings and misses, balls put in play, and called pitches.
Cue the wonderful dataset that is PITCHf/x. Essentially, the goal here is to apply the signal detection theory to PITCHf/x by coding the processes in and out of the strike zone, as opposed to just the end results. In that regard, a pitch thrown in the strike zone at which the batter swung becomes a true positive. A pitch in the zone that is taken is a Type II error, or a false negative. Moving outside of the zone, swings are Type I errors, or false positives; taken pitches become true negatives. With the pitches classified in this fashion, we basically treat every major league hitter as if he is his own epidemiological study. Then, a series of calculations (to be discussed in further detail in the coming paragraphs) will explain which hitters are more prone to mistakes, as well as whether or not they are biased more towards freely swinging or taking pitches. The former statistic is known as sensitivity, while the latter is called the response bias.
Ideally, sensitivity will be high, as higher numbers correspond to fewer mistakes. The goal for response bias is to get as close to 1.0 as possible, since that mark signifies balance. Below 1.0, the hitter's discipline is biased towards keeping the bat on his shoulder; above that threshold, it is biased towards swinging. As an example, over 2008-09, Luis Castillo posted a sensitivity rating slightly above the major league average, but with a very low .252 response bias that suggests his ability to make fewer mistakes in the box relied heavily upon a seeming refusal to swing. Because he rarely swung, he received some extra ball calls, but it came at the expense of many more called strikes. Hunter Pence had a sensitivity rating almost identical to Castillo's, but with a response bias of .961, extremely close to 1.0, indicating that Pence has been more balanced in making errors and perhaps is not as easily exploitable as Castillo. In fact, Pence will actually make fewer mistakes than Castillo, because he is truly optimizing his balance. He is not costing himself anything extra in either direction.
Coding everything into my PITCHf/x database involved the assumption that an appropriate range for the horizontal portion of the strike zone started at -0.9 and went all the way to 0.9 (in feet from the center of the plate). Some studies automatically set the horizontal parameters to range from -1 to 1, while others move closer to the 0.85 that the rulebook seems to dictate, so this seems to be a happy medium. For the vertical parameters, I am using the sz_top and sz_bot fields in the dataset, which are top and bottom coordinates set by the system operator prior to each pitch. No slack was given in any direction in order to utilize a strict definition of the strike zone relative to each hitter. If a pitch fell within the zone and the result involved a swing (swinging strike, foul, ball put in play, etc.), a true positive response was coded. This process was repeated for the two types of errors and the true negative response. Two alterations were made, however: 3-0 takes in the zone were removed, since that is almost an automatic take, and foul balls with two strikes were deemed true positive responses regardless of location, since hitters with two strikes are going to widen their zone in order to "protect" at the plate.
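For anyone who wants to replicate the coding step, here is a minimal Python sketch of the rules just described. It is an illustration rather than the actual database query: the function name, the result labels, and the px/pz argument names for horizontal and vertical pitch location are assumptions, while sz_top, sz_bot, and the 0.9 half-width come from the description above.

```python
from typing import Optional

HALF_WIDTH = 0.9  # horizontal edge of the zone, in feet from the center of the plate

# Hypothetical result labels; a real dataset will use its own descriptions.
SWING_RESULTS = {"swinging_strike", "foul", "in_play"}


def classify_pitch(px: float, pz: float, sz_top: float, sz_bot: float,
                   result: str, balls: int, strikes: int) -> Optional[str]:
    """Return the signal detection category for one pitch, or None if excluded."""
    swung = result in SWING_RESULTS
    in_zone = -HALF_WIDTH <= px <= HALF_WIDTH and sz_bot <= pz <= sz_top

    # Alteration 1: two-strike fouls count as true positives wherever they are.
    if result == "foul" and strikes == 2:
        return "true_positive"

    if in_zone:
        if swung:
            return "true_positive"
        # Alteration 2: 3-0 takes in the zone are dropped from the sample.
        if balls == 3 and strikes == 0:
            return None
        return "false_negative"  # Type II error: took a strike

    # Out of the zone: a swing is a Type I error, a take is correct.
    return "false_positive" if swung else "true_negative"
```

Summing the four labels per hitter gives the counts that feed the rate calculations.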
Once each of the four categories was summed, I exported the results to Excel to add in the necessary calculations. First, we need to calculate the true positive rate and the false alarm rate. These are fairly simple compared to the others: the true positive rate is merely true positives divided by the sum of true positives and false negatives, which gives the percentage of pitches in the strike zone featuring a swing of some sort. The false alarm rate is false positives divided by the sum of false positives and true negatives, which gives the percentage of pitches out of the zone that drew a swing.
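The two rates are simple enough to script instead of computing in a spreadsheet. The sketch below assumes the four counts have already been tallied for a hitter; the function names and the sample counts are made up for illustration and do not belong to any real player.

```python
def true_positive_rate(true_positives: int, false_negatives: int) -> float:
    """Share of in-zone pitches at which the hitter swung."""
    return true_positives / (true_positives + false_negatives)


def false_alarm_rate(false_positives: int, true_negatives: int) -> float:
    """Share of out-of-zone pitches at which the hitter swung."""
    return false_positives / (false_positives + true_negatives)


# Made-up counts: 1,000 pitches in the zone and 1,000 out of it.
tpr = true_positive_rate(true_positives=692, false_negatives=308)
far = false_alarm_rate(false_positives=200, true_negatives=800)
print(f"true positive rate {tpr:.3f}, false alarm rate {far:.3f}")
```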
The next step involves finding the z-score for each of these rates, which is the point on a normally distributed curve, with a mean of zero and a standard deviation of one, below which that share of the area falls. In Excel, the NORMINV function comes into play. Chipper Jones has had a true positive rate of .692, so we would plug in =NORMINV(0.692,0,1), where the 0 and 1 correspond to the aforementioned mean and standard deviation. This step gets repeated for the false alarm rate, and the sensitivity rating itself is merely the true positive z-score minus the false alarm z-score. In the case of the Braves third baseman, that works out to 1.504, the third-highest sensitivity rating among batters to see 2000+ pitches over the last two seasons. Here are the top and bottom ten sensitivity ratings:
Player               Pitches  Sens.    Player               Pitches  Sens.
Daric Barton            2739  1.567    Ronny Cedeno            2105  0.881
Chris Iannetta          3071  1.527    Alexi Casilla           2504  0.874
Chipper Jones           3959  1.504    Victor Martinez         3740  0.873
Carlos Quentin          3352  1.467    Kendry Morales          2741  0.864
Carlos Pena             4481  1.456    Ryan Braun              4779  0.848
Milton Bradley          3703  1.451    Yuniesky Betancourt     3428  0.844
Joey Votto              4191  1.447    Michael Cuddyer         3436  0.838
Brad Hawpe              4702  1.445    Erick Aybar             3172  0.787
Lance Berkman           4469  1.445    Shane Victorino         4626  0.779
Akinori Iwamura         3805  1.426    Garret Anderson         3728  0.772
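For those working outside of Excel, the sensitivity calculation behind these ratings can be sketched in Python, where NormalDist().inv_cdf from the standard library plays the role of NORMINV with a mean of 0 and a standard deviation of 1. The function name is an assumption, and the 0.200 false alarm rate below is a placeholder rather than Chipper Jones' actual figure; only his .692 true positive rate comes from the text.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def sensitivity(true_positive_rate: float, false_alarm_rate: float) -> float:
    """True positive z-score minus false alarm z-score."""
    z_true_positive = STANDARD_NORMAL.inv_cdf(true_positive_rate)
    z_false_alarm = STANDARD_NORMAL.inv_cdf(false_alarm_rate)
    return z_true_positive - z_false_alarm


# Same role as =NORMINV(0.692,0,1) in Excel.
z_chipper = STANDARD_NORMAL.inv_cdf(0.692)
print(f"z-score for a .692 true positive rate: {z_chipper:.3f}")

# Placeholder false alarm rate, for illustration only.
print(f"sensitivity: {sensitivity(0.692, 0.200):.3f}")
```

Note that inv_cdf only accepts rates strictly between 0 and 1, so a hitter with no swings (or nothing but swings) in a category would need special handling.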
Remember, with sensitivity, higher numbers are better, and it should come as no surprise that some of these players fell into their respective bins. Players with good eyes like Chipper, Milton Bradley, and Berkman are not going to let too many zone pitches pass them by, nor will they fish out of the zone too often. On the flip side, Garret Anderson and Shane Victorino had the lowest success rates in this area, basically taking pitches when it would be more optimal to swing, and vice versa. What happens when we throw in their response biases, however, since not everyone obtains their sensitivities in the same fashion?
Player               Pitches  Sens.  Bias     Player               Pitches  Sens.  Bias
Daric Barton            2739  1.567  0.281    Ronny Cedeno            2105  0.881  1.058
Chris Iannetta          3071  1.527  0.436    Alexi Casilla           2504  0.874  0.679
Chipper Jones           3959  1.504  0.471    Victor Martinez         3740  0.873  0.662
Carlos Quentin          3352  1.467  1.127    Kendry Morales          2741  0.864  0.885
Carlos Pena             4481  1.456  0.772    Ryan Braun              4779  0.848  0.929
Milton Bradley          3703  1.451  0.594    Yuniesky Betancourt     3428  0.844  1.008
Joey Votto              4191  1.447  0.836    Michael Cuddyer         3436  0.838  0.764
Brad Hawpe              4702  1.445  0.821    Erick Aybar             3172  0.787  1.047
Lance Berkman           4469  1.445  0.717    Shane Victorino         4626  0.779  0.774
Akinori Iwamura         3805  1.426  0.442    Garret Anderson         3728  0.772  0.938
Here is where the dichotomy of strategies emerges, as Barton scored so high by rarely swinging, while Quentin was much more balanced, with a response bias closer to 1.0 than anyone else in his group. Interestingly enough, the hitters with lower sensitivity scores were much more balanced in terms of response bias, suggesting that they didn't necessarily discriminate in making errors; they just flat-out made errors.
Calculating the response bias itself is a bit more tedious than the sensitivity. To determine which way the hitters sway, we need to compute the phi statistic for the true positive z-score and for the false alarm z-score, the same two values produced by NORMINV. This is achieved by raising 'e' to the negative of the squared z-score for each, with the result divided by the square root of two times pi. In Excel, EXP(2.4) calculates 'e' raised to the power of 2.4. Once both phi statistics are calculated, dividing the false alarm phi by the true positive phi produces the response bias. Here are the leaders and trailers:
Player                Bias   Dev. from 1.0    Player                Bias   Dev. from 1.0
Aaron Rowand          0.999  0.001            Denard Span           0.363  0.637
Mike Jacobs           0.996  0.004            Miguel Olivo          1.661  0.661
Yuniesky Betancourt   1.008  0.008            Bobby Abreu           0.321  0.679
Rajai Davis           0.990  0.010            Nick Johnson          0.315  0.685
Carlos Gonzalez       0.986  0.014            Daric Barton          0.281  0.719
Brandon Moss          0.983  0.017            Marco Scutaro         0.267  0.733
Miguel Cabrera        0.982  0.018            Luis Castillo         0.252  0.748
Ross Gload            1.020  0.020            Josh Hamilton         1.927  0.927
Ty Wigginton          1.024  0.024            Vladimir Guerrero     2.151  1.151
Marlon Byrd           1.025  0.025            Pablo Sandoval        2.483  1.483
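The response bias calculation can also be sketched in Python. The function names are assumptions, and the phi formula follows the description in the text literally: 'e' raised to the negative squared z-score, divided by the square root of two times pi. Be aware that the textbook standard normal density halves the squared z-score in the exponent, so this sketch may not match other signal detection references. As a check, Chipper Jones' false alarm z-score is backed out of his published .692 true positive rate and 1.504 sensitivity, and the result comes to about 0.47, in line with the 0.471 listed for him.

```python
import math
from statistics import NormalDist


def phi(z: float) -> float:
    """Phi statistic as described in the text: e^(-z^2) / sqrt(2 * pi).

    The textbook standard normal density uses e^(-z^2 / 2) instead.
    """
    return math.exp(-(z ** 2)) / math.sqrt(2 * math.pi)


def response_bias(z_true_positive: float, z_false_alarm: float) -> float:
    """False alarm phi divided by true positive phi."""
    return phi(z_false_alarm) / phi(z_true_positive)


# Chipper Jones: .692 true positive rate and 1.504 sensitivity, both from the text.
z_tp = NormalDist().inv_cdf(0.692)
z_fa = z_tp - 1.504  # sensitivity = z_tp - z_fa, rearranged
chipper_bias = response_bias(z_tp, z_fa)
print(f"response bias: {chipper_bias:.3f}")
```

A bias below 1.0 falls out whenever the false alarm z-score sits farther from zero than the true positive z-score, which is the profile of a hitter who rarely chases but also takes a fair number of strikes.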
Sorted by the absolute deviation from 1.0, we essentially see a top ten filled with guys upon whom no real disciplinary reputation has been bestowed. Jacobs and Betancourt are known more as free-swinging, low-OBP players, but this group comprises the most balanced hitters over the last two years in terms of where their discipline, or lack thereof, is derived. Rowand, for instance, splits his mistakes almost evenly between taking pitches in the zone and chasing pitches out of it. Across the table we start to see some familiar names, a cast of characters consisting of reputed patient hitters and their exact opposites. Does it surprise you that Bobby Abreu and Nick Johnson's performances are so imbalanced? It makes sense given how often they take pitches, but perhaps this idea works similarly to the idea of a break-even stolen-base success rate, in that these two on-base luminaries take too often, while the likes of Hamilton, Guerrero, and Sandoval swing far too often.
One important caveat here is that discipline does not always translate into positive results, as our most sensitive hitter, Daric Barton, is a borderline replacement-level hitter. Likewise, the trailers in sensitivity have a handful of All-Star appearances between them and would produce a fairly formidable lineup if placed on the same team.
This is but a granular approach to determining which hitters are more or less likely to make mistakes at the plate, under the assumption that out-of-zone swings and in-zone takes are mistakes, with alterations for 3-0 takes and two-strike fouls. This assumption does not operate without faults, as the strike zone as defined here is very strict and robotic, when in reality it more closely resembles a LOESS-like, or ovular, shape. While enhancements such as a more accurate definition of the zone and perhaps some more restrictions or constraints can improve this method, the method itself remains valid and worth investing time into as a means of digging deep into discipline. Knowing about a hitter's tendencies in terms of making mistakes and the direction in which he leans can only aid pre-formed scouting reports.
The methodology and data above incorporated all data from 2008-09, meaning that there was no segregation by pitch type. Moving forward, I plan on looking at specific pitches as a means of determining, say, the levels of sensitivity and bias inherent in hitters as a curveball comes their way, which can certainly help pitchers to understand who will take a "get me over" and who will flail at a dirt-grab offering. Additionally, we will revisit Dan Fox's fish-eye method and compare the results to hammer home which hitters could optimize their approach by swinging more or offering less.