November 6, 2009
Checking the Numbers: Detecting Discipline
Ever since Billy Beane wrote Moneyball (right, Mr. Morgan?) in order to prove that the true path to success involved only seeking the services of high-OBP employees, analysts of several varieties have worked diligently to discover market inefficiencies worth exploiting. One of the areas that has risen to prominence recently, likely due to the increased availability of the data, focuses on plate discipline on both sides of the spectrum: as shown by hitters, or as induced by pitchers. Data providers such as Baseball Info Solutions record information based on the strike zone in a plate appearance, determining the percentages of swings and contact on balls both in and out of the zone, as well as the rate of pitches thrown or observed that fell in the zone itself. This type of granular information affords analysts the opportunity to track tendencies such as which hitters chase more pitches out of the zone or which pitchers induce these chases more often. However, the numbers remain a tad ambiguous given that their application is largely contingent upon conventional wisdom; higher rates of out-of-zone swings are bad, mmmkay? This isn't always the case, though, and the rarely discussed inverse, taking too many pitches inside the zone, could also be considered poor process.

Luckily, with the ever-growing PITCHf/x dataset, we can apply a method known as signal detection theory to gauge discipline at the plate. You might remember signal detection theory from such articles as "Is Walk the Opposite of Strikeout?" or "The Return of the Fisheye". The technique is commonly used in epidemiological studies, as well as in cognitive psychology and engineering. It hinges on the idea of a perfect test, one that codes all positive results as true positives and all negative results as true negatives. Unfortunately, such tests do not exist, and both false negatives (being told you are healthy when you really aren't) and false positives (hearing some bad news in error) surface.
The first linked article above, written by Russell Carleton, applied this technique to Retrosheet data in order to measure plate discipline in a results-based fashion. Since Retrosheet lacks data for pitch location, the study was restricted to the actual results: swings and misses, balls put in play, and called pitches. Cue the wonderful dataset that is PITCHf/x. Essentially, the goal here is to apply signal detection theory to PITCHf/x by coding the processes in and out of the strike zone, as opposed to just the end results. In that regard, a pitch thrown in the strike zone at which the batter swung becomes a true positive. A pitch in the zone that is taken is a Type II error, or a false negative. Moving outside of the zone, swings are Type I errors, or false positives; taken pitches become true negatives.

With the pitches classified in this fashion, we basically treat every major league hitter as if he is his own epidemiological study. Then, a series of calculations (to be discussed in further detail in the coming paragraphs) will explain which hitters are more prone to mistakes, as well as whether they are biased more towards freely swinging or taking pitches. The former statistic is known as sensitivity, while the latter is called the response bias. Ideally, sensitivity will be high, as higher numbers correspond to fewer mistakes. The goal for response bias is to get as close to 1.0 as possible, since that mark indicates balance. Below 1.0, the hitter's level of success in being disciplined is biased towards keeping the bat on his shoulder, with the opposite true for numbers above that threshold. As an example, over 2008-09, Luis Castillo posted a sensitivity rate slightly above the major league average, but with a very low .252 response bias that suggests his ability to make fewer mistakes in the box relied heavily upon a seeming refusal to swing.
Because he rarely swung, he received some extra ball calls, but it came at the expense of many more called strikes. Hunter Pence had an almost identical sensitivity rating to Castillo's, but with a response bias of .961, extremely close to 1.0, indicating that Pence has been more balanced in making errors and perhaps is not as easily exploitable as Castillo. In fact, Pence will actually make fewer mistakes than Castillo, because he is truly optimizing his balance. He is not costing himself anything extra in either direction.

Coding everything into my PITCHf/x database involved the assumption that an appropriate range for the horizontal portion of the strike zone started at -0.9 and went all the way to 0.9. Some studies automatically set the horizontal parameters to range from -1 to 1, while others move closer to the 0.85 that the rulebook seems to dictate, so this seems to be a happy medium. For the vertical parameters, I am using the sz_top and sz_bot fields in the dataset, which are top and bottom coordinates set by the system operator prior to each pitch. No slack was given in any direction, in order to use a strict definition of the strike zone relative to each hitter. If a pitch fell within the zone and the result involved a swing (swinging strike, foul, ball put in play, etc.), a true positive response was coded. This process was repeated for the two types of error and the true negative response. Two alterations were made, however: 3-0 takes in the zone were removed, since that is almost an automatic take, and foul balls with two strikes were deemed true positive responses regardless of location since, with two strikes, hitters are going to widen their zone in order to "protect" at the plate.

Once each of the four categories was summed, I exported the results to Excel to add in the necessary calculations. First, we need to calculate the true positive rate and the false alarm rate.
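As a rough sketch of the coding step and the two rates just mentioned, the Python below classifies individual pitches and tallies the four categories. It is an illustration under stated assumptions, not the database code used for this article: the `px` and `pz` location fields, the count fields, and the outcome labels in `SWINGS` are hypothetical names chosen for the example, while `sz_top`, `sz_bot`, the -0.9 to 0.9 horizontal range, and the two alterations come from the description above.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Hypothetical outcome labels; a real PITCHf/x table words these differently.
SWINGS = {"swinging_strike", "foul", "in_play"}


@dataclass
class Pitch:
    px: float      # horizontal location in feet, 0 = middle of the plate (assumed name)
    pz: float      # height in feet as the pitch crosses the plate (assumed name)
    sz_top: float  # top of this hitter's zone, set by the operator
    sz_bot: float  # bottom of this hitter's zone, set by the operator
    balls: int
    strikes: int
    result: str


def classify(p: Pitch) -> Optional[str]:
    """Return 'TP', 'FN', 'FP', 'TN', or None when the pitch is excluded."""
    in_zone = -0.9 <= p.px <= 0.9 and p.sz_bot <= p.pz <= p.sz_top
    swing = p.result in SWINGS
    # Alteration 1: a two-strike foul is a true positive wherever it was thrown.
    if p.result == "foul" and p.strikes == 2:
        return "TP"
    # Alteration 2: 3-0 takes in the zone are dropped from the sample.
    if in_zone and not swing and p.balls == 3 and p.strikes == 0:
        return None
    if in_zone:
        return "TP" if swing else "FN"
    return "FP" if swing else "TN"


def rates(pitches):
    """True positive rate and false alarm rate from a list of pitches."""
    counts = Counter(c for c in map(classify, pitches) if c is not None)
    tp_rate = counts["TP"] / (counts["TP"] + counts["FN"])
    fa_rate = counts["FP"] / (counts["FP"] + counts["TN"])
    return tp_rate, fa_rate


# Seven made-up pitches, purely to exercise each branch.
sample = [
    Pitch(0.0, 2.5, 3.5, 1.5, 0, 0, "in_play"),          # zone swing
    Pitch(0.2, 2.0, 3.5, 1.5, 1, 1, "called_strike"),    # zone take
    Pitch(1.4, 2.5, 3.5, 1.5, 0, 1, "swinging_strike"),  # chase
    Pitch(-1.6, 1.0, 3.5, 1.5, 2, 0, "ball"),            # take out of the zone
    Pitch(1.2, 3.9, 3.5, 1.5, 1, 2, "foul"),             # two-strike foul
    Pitch(0.1, 2.4, 3.5, 1.5, 3, 0, "called_strike"),    # 3-0 zone take, excluded
    Pitch(0.0, 0.5, 3.5, 1.5, 0, 2, "ball"),             # take out of the zone
]
tp_rate, fa_rate = rates(sample)
```

The sketch does not guard against a hitter with no pitches in (or out of) the zone, which would divide by zero; with a 2000-pitch minimum that case should not arise.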
The two rates are fairly simple compared to the other calculations: the true positive rate is merely true positives divided by the sum of true positives and false negatives, giving the percentage of pitches in the strike zone that drew a swing of some sort. The false alarm rate is the number of swings on pitches out of the zone divided by all pitches out of the zone. The next step involves finding the z-distribution value (z-score) for each of these rates, that is, the point on a normally distributed curve with a mean of zero and a standard deviation of one below which that share of the area falls. In Excel, the NORMINV function comes into play. Chipper Jones has had a true positive rate of .692, so we would plug in =NORMINV(0.692,0,1), where the 0 and 1 correspond to the aforementioned mean and standard deviation. This step gets repeated for the false alarm rate, and the sensitivity rate itself is merely the True Positive Z-Distribution - False Alarm Z-Distribution. In the case of the Braves third baseman, 1.504 spits out, the third-highest sensitivity rating amongst batters to see 2000+ pitches over the last two seasons. Here are the top and bottom ten sensitivity ratings:

Player             Pitches  Sens.     Player                Pitches  Sens.
Daric Barton          2739  1.567     Ronny Cedeno             2105  0.881
Chris Iannetta        3071  1.527     Alexi Casilla            2504  0.874
Chipper Jones         3959  1.504     Victor Martinez          3740  0.873
Carlos Quentin        3352  1.467     Kendry Morales           2741  0.864
Carlos Pena           4481  1.456     Ryan Braun               4779  0.848
Milton Bradley        3703  1.451     Yuniesky Betancourt      3428  0.844
Joey Votto            4191  1.447     Michael Cuddyer          3436  0.838
Brad Hawpe            4702  1.445     Erick Aybar              3172  0.787
Lance Berkman         4469  1.445     Shane Victorino          4626  0.779
Akinori Iwamura       3805  1.426     Garret Anderson          3728  0.772

Remember, with sensitivity, higher numbers are better, and it should come as no surprise that some of these players fell into their respective bins. Players with good eyes like Chipper, Milton Bradley, and Berkman are not going to let too many zone pitches pass them by, nor will they fish out of the zone too often.
On the flip side, Garret Anderson and Shane Victorino had the lowest success rates in this area, basically taking pitches when it would be more optimal to swing, and vice versa. What happens when we throw in their response biases, however, since not everyone obtains their sensitivities in the same fashion?

Player             Pitches  Sens.  Bias      Player                Pitches  Sens.  Bias
Daric Barton          2739  1.567  0.281     Ronny Cedeno             2105  0.881  1.058
Chris Iannetta        3071  1.527  0.436     Alexi Casilla            2504  0.874  0.679
Chipper Jones         3959  1.504  0.471     Victor Martinez          3740  0.873  0.662
Carlos Quentin        3352  1.467  1.127     Kendry Morales           2741  0.864  0.885
Carlos Pena           4481  1.456  0.772     Ryan Braun               4779  0.848  0.929
Milton Bradley        3703  1.451  0.594     Yuniesky Betancourt      3428  0.844  1.008
Joey Votto            4191  1.447  0.836     Michael Cuddyer          3436  0.838  0.764
Brad Hawpe            4702  1.445  0.821     Erick Aybar              3172  0.787  1.047
Lance Berkman         4469  1.445  0.717     Shane Victorino          4626  0.779  0.774
Akinori Iwamura       3805  1.426  0.442     Garret Anderson          3728  0.772  0.938

Here is where the dichotomy of strategies emerges, as Barton scored so high by rarely swinging, while Quentin was much more balanced, with a response bias much closer to 1.0 than anyone else in his group. Interestingly enough, the hitters with lower sensitivity scores were much more balanced in terms of response bias, suggesting that they didn't necessarily discriminate in making errors; they just flat-out made errors.

Calculating the response bias itself is a bit more tedious than the sensitivity. To determine which way the hitters sway, we need to compute the phi statistics for the true positive z-distribution and that of the false alarm. This is achieved by raising 'e' to the negative of the squared z-distribution value for each, with the result divided by the square root of two times pi. In Excel, EXP(-2.4) calculates 'e' raised to the power of -2.4. Once the phi statistics are both calculated, dividing the false alarm phi by the true positive phi produces the response bias.
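Both calculations, sensitivity and response bias, can be sketched in a few lines of Python using only the standard library. This is a hedged illustration rather than the spreadsheet behind the tables: the function names are made up for the example, `NormalDist.inv_cdf` stands in for Excel's NORMINV, and the phi statistic follows the wording above literally. Chipper Jones's false alarm rate is not listed in the article, so his false alarm z value is backed out from the published .692 true positive rate and 1.504 sensitivity.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_score(rate: float) -> float:
    """Counterpart of Excel's =NORMINV(rate, 0, 1)."""
    return STD_NORMAL.inv_cdf(rate)


def sensitivity(tp_rate: float, fa_rate: float) -> float:
    """True positive z value minus false alarm z value."""
    return z_score(tp_rate) - z_score(fa_rate)


def phi_as_described(z: float) -> float:
    """e to the negative squared z, over sqrt(2*pi), as worded in the text.

    Note: the textbook standard normal density uses exp(-z**2 / 2) instead.
    """
    return exp(-z ** 2) / sqrt(2 * pi)


def response_bias(z_tp: float, z_fa: float) -> float:
    """False alarm phi divided by true positive phi."""
    return phi_as_described(z_fa) / phi_as_described(z_tp)


# Chipper Jones, using only the two figures published above.
z_tp = z_score(0.692)
z_fa = z_tp - 1.504              # backed out, not a published number
fa_rate = STD_NORMAL.cdf(z_fa)   # implied false alarm rate, also backed out
bias = response_bias(z_tp, z_fa)
```

With these inputs `bias` evaluates to roughly 0.47, in line with the 0.471 listed for Jones in the table. Swapping in the textbook density, with its exponent of negative z squared over two, would give a ratio of roughly 0.69 instead, which is worth keeping in mind when comparing these figures with other signal detection write-ups.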
Here are the leaders and trailers:

Player               Bias   Dev. from 1.0     Player              Bias   Dev. from 1.0
Aaron Rowand         0.999  0.001             Denard Span         0.363  0.637
Mike Jacobs          0.996  0.004             Miguel Olivo        1.661  0.661
Yuniesky Betancourt  1.008  0.008             Bobby Abreu         0.321  0.679
Rajai Davis          0.990  0.010             Nick Johnson        0.315  0.685
Carlos Gonzalez      0.986  0.014             Daric Barton        0.281  0.719
Brandon Moss         0.983  0.017             Marco Scutaro       0.267  0.733
Miguel Cabrera       0.982  0.018             Luis Castillo       0.252  0.748
Ross Gload           1.020  0.020             Josh Hamilton       1.927  0.927
Ty Wigginton         1.024  0.024             Vladimir Guerrero   2.151  1.151
Marlon Byrd          1.025  0.025             Pablo Sandoval      2.483  1.483

Sorted by the absolute deviation from 1.0, we essentially see a top ten filled with guys upon whom no real disciplinary reputation has been bestowed. Jacobs and Betancourt are known more as free-swinging, low-OBP players, but this group comprises the most balanced hitters over the last two years in terms of where their discipline, or lack thereof, is derived. Rowand is equally likely to take a pitch down the middle as he is to swing at one in the dirt. Across the table we start to see some familiar names, a cast of characters consisting of reputed patient hitters and their exact opposites. Does it surprise you that Bobby Abreu and Nick Johnson's performances are so imbalanced? It makes sense given how often they take pitches, but perhaps this idea works similarly to the idea of a break-even stolen-base success rate, in that these two on-base luminaries take too often, while the likes of Hamilton, Guerrero, and Sandoval swing far too often.

One important caveat here is that discipline does not always translate into positive results, as our most sensitive hitter, Daric Barton, is a borderline replacement-level hitter. Likewise, the trailers in sensitivity have a handful of All-Star appearances between them and would produce a fairly formidable lineup if placed on the same team.
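For anyone rebuilding the leaders-and-trailers table, the deviation column is simply the absolute distance of each response bias from 1.0, and the ordering is a sort on that distance. The short Python sketch below shows the idea with six of the bias values from the table; the variable names are made up for the example.

```python
# Response biases copied from the leaders-and-trailers table.
biases = {
    "Pablo Sandoval": 2.483,
    "Luis Castillo": 0.252,
    "Aaron Rowand": 0.999,
    "Bobby Abreu": 0.321,
    "Yuniesky Betancourt": 1.008,
    "Josh Hamilton": 1.927,
}

# Deviation from perfect balance, rounded to match the table's precision.
deviations = {name: round(abs(bias - 1.0), 3) for name, bias in biases.items()}

# Most balanced hitters first, most lopsided last.
ranked = sorted(deviations.items(), key=lambda item: item[1])

for name, dev in ranked:
    print(f"{name:<20} {biases[name]:.3f}  {dev:.3f}")
```

Because the distance is absolute, a patient hitter such as Castillo and a free swinger such as Hamilton can land near each other in the ranking even though they lean in opposite directions; the sign of the deviation is what separates the two groups.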
This is but a granular approach to determining which hitters are more or less likely to make mistakes at the plate, under the assumption that out-of-zone swings and in-zone takes are mistakes, with alterations for 3-0 takes and two-strike fouls. This assumption does not operate without faults, as the strike zone as defined here is very strict and robotic, when in reality it more closely represents a LOESS-like, or ovular, shape. While enhancements such as a more accurate definition of the zone, and perhaps some more restrictions or constraints, can improve this method, the method itself remains valid and worth investing time into as a means of digging deep into discipline. Knowing about a hitter's tendencies in terms of making mistakes, and the direction in which he leans, can only aid preformed scouting reports.

The methodology and data above incorporated all data from 2008-09, meaning that there was no segregation of pitches. Moving forward, I plan on looking at specific pitches as a means of determining, say, the levels of sensitivity and bias inherent in hitters as a curveball comes their way, which can certainly help pitchers to understand who will take on a "get me over" and who will flail at a dirt-grab offering. Additionally, we will revisit Dan Fox's fisheye method and compare the results to hammer home which hitters could improve their approach by swinging more or offering less.
Eric Seidman is an author of Baseball Prospectus.

11 comments have been left for this article.

MGL (2121)
Great stuff Eric! I think it is pretty critical that you break the numbers up into counts (and even game situations, like base runners, outs, score, etc.), or adjust for those things, although obviously you are going to run into serious sample size issues then.
Nov 06, 2009 12:47 PM

Very, very true. The ideal, it seems, would be to extend this to contact, breaking the test outcomes into their own test. So, a test solely on swings, where the desired result is contact on a swing. As the sample size of PITCHf/x grows, the counts can definitely be broken down deeper, and I'm really curious about the curveballs, changeups, sliders, etc., since that could conceivably help pitchers realize they can throw a curveball out of the zone more often to a certain hitter, given his propensity to chase them.
Nov 06, 2009 13:03 PM

Christopher Taylor (37350)
I've thought about this issue before (I've used SDT in my PhD thesis), and when I thought it through, operationalizing the Type I/II error in this way may not actually capture what this analysis wants to. Take a couple of extreme examples: the young Vlad (heck, even the older Vlad), who can hit any ball tossed near (or bounced before) the plate, or the anecdote (was this about Giambi?) that there was a zone one ball high and one ball wide where he'd swing and miss (but next to this zone was an area where he would crush the ball).
Nov 07, 2009 07:41 AM

Not a problem. I'm actually more concerned with the contact aspect than redefining the strike zone. It seems the goal would be to treat swings as their own test: true pos = in zone contact, true neg = out of zone contact, false pos = in zone whiff, false neg = out of zone whiff. Then to run through the same process for swings and factor that into the results found through this overall process.
Nov 07, 2009 08:32 AM

R.A.Wagman (32721)
Eric - I don't have the mathematical chops to comment on the calculations you have run, but giving you the benefit of the doubt on that end, your analysis is a great first step to real usable knowledge. Again.
Nov 07, 2009 09:16 AM

ZamBR1985 (23370)
Fantastic article. If/when my brain is able to process this information, I'll be able to comment on some specifics.
Nov 07, 2009 10:18 AM

Ira (1386)
I'm guessing that this data, properly filtered, and maybe even more broken down, is available to advance scouts for teams. If it's not, can you imagine how much that could help a team? They could look at the next day's lineup and go beyond the normal advance scouting of "Player X is a first-ball hitter" and say "Player X has an 80% chance of swinging at a first-pitch fastball inside the strike zone, and a 50% chance of swinging at one outside the strike zone. Player Y has a 30% chance to swing at a first-pitch fastball in the strike zone and a 5% chance for one outside the zone."
Nov 09, 2009 12:23 PM

Brian Cartwright (4519)
You used Luis Castillo as an example, and this metric charges Castillo with errors for taking called strikes. However, Castillo has just about the best contact rate on swings, both at avoiding swings and misses and at putting the ball in play. I'm sure in his mind, he can afford to take a strike or even two, because with virtually every swing making contact he knows he won't strike out, even with two strikes. Therefore, the 'cost' of a called strike is very low for Castillo.
Nov 10, 2009 09:52 AM

Seems like a good method to check whether players have or haven't made adjustments or improvements over time, and where those improvements came from (with your future expansion into pitch type). I think that if you translate this into an "aging" curve, you could identify why certain prospects blossom into stars and why some fail. I'm sure there are other reasons besides the ones you are examining here, but I would expect at least some of them to show up.
Gordon, exactly! Right now we just have 2008 and 2009, but something in my queue is a comparison. Did Luis Castillo have a response bias of 0.5 in 2008 and 0.1 in 2009, etc.? Did Player X see the drastic improvement in results that his improved signal detection process indicated?