Checking the Numbers: Detecting Discipline

November 6, 2009

Ever since Billy Beane wrote Moneyball (right, Mr. Morgan?) in order to prove that the true path to success involved only seeking the services of high-OBP employees, analysts of several varieties have worked diligently to discover market inefficiencies worth exploiting. One of the areas that has risen to prominence recently, likely due to the increased availability of the data, focuses on plate discipline on both sides of the spectrum: as shown by hitters, or as induced by pitchers.

Data providers such as Baseball Info Solutions record information based on the strike zone in a plate appearance, determining the percentages of swings and contact on balls both in and out of the zone, as well as the rate of pitches thrown or observed that fell in the zone itself. This type of granular information affords analysts the opportunity to track tendencies such as which hitters chase more pitches out of the zone or which pitchers induce these chases more often. However, the numbers remain a tad ambiguous given that their application is largely contingent upon conventional wisdom; higher rates of out-of-zone swings are bad, mmmkay? This isn’t always the case, though, and the rarely discussed inverse of taking too many pitches inside the zone could also be considered poor in process.

Luckily, with the ever-growing PITCHf/x dataset, we can apply a method known as signal detection theory to gauge discipline at the plate. You might remember signal detection theory from such articles as “Is Walk the Opposite of Strikeout?” or “The Return of the Fisheye”. The technique is commonly used in epidemiological studies, as well as in cognitive psychology and engineering. It hinges on the idea of a perfect test, one that codes all positive results as true positives and all negative results as true negatives. Unfortunately, such tests do not exist: false negatives (being told you are healthy when you really aren’t) and false positives (hearing some bad news in error) inevitably surface. The first article named above, written by Russell Carleton, applied this technique to Retrosheet data in order to measure plate discipline in a results-based fashion. Since Retrosheet lacks data for pitch location, the study was restricted to the actual results: swings and misses, balls put in play, and called pitches.

Cue the wonderful dataset that is PITCHf/x. Essentially, the goal here is to apply signal detection theory to PITCHf/x by coding the processes in and out of the strike zone, as opposed to just the end results. In that regard, a pitch thrown in the strike zone at which the batter swung becomes a true positive. A pitch in the zone that is taken is a Type II error, or a false negative. Moving outside of the zone, swings are Type I errors, or false positives; taken pitches become true negatives. With the pitches classified in this fashion, we basically treat every major league hitter as if he is his own epidemiological study. Then, a series of calculations (to be discussed in further detail in the coming paragraphs) will explain which hitters are more prone to mistakes, as well as whether they are biased more towards freely swinging or taking pitches. The former statistic is known as sensitivity, while the latter is called the response bias.

Ideally, sensitivity will be high, as higher numbers correspond to fewer mistakes. The goal for response bias is to get as close to 1.0 as possible, since that mark signifies balance. Below 1.0, the hitter’s discipline is biased towards keeping the bat on his shoulder, with the opposite true for numbers above that threshold. As an example, over 2008-09, Luis Castillo posted a sensitivity rating slightly above the major league average, but with a very low .252 response bias that suggests his ability to make fewer mistakes in the box relied heavily upon a seeming refusal to swing. Because he rarely swung, he received some extra ball calls, but they came at the expense of many more called strikes. Hunter Pence had a sensitivity rating almost identical to Castillo’s, but with a response bias of .961, extremely close to 1.0, indicating that Pence has been more balanced in making errors and perhaps is not as easily exploitable as Castillo. In fact, Pence will actually make fewer mistakes than Castillo, because he is truly optimizing his balance. He is not costing himself anything extra in either direction.

Coding everything into my PITCHf/x database involved the assumption that an appropriate range for the horizontal portion of the strike zone runs from -0.9 to 0.9. Some studies automatically set the horizontal parameters to range from -1 to 1, while others move closer to the 0.85 that the rulebook seems to dictate, so this seems to be a happy medium. For the vertical parameters, I am using the sz_top and sz_bot fields in the dataset, which are top and bottom coordinates set by the system operator prior to each pitch. No slack was given in any direction, in order to utilize a strict definition of the strike zone relative to each hitter. If a pitch fell within the zone and the result involved a swing (swinging strike, foul, ball put in play, etc.), a true positive response was coded. This process was repeated for the two types of error and the true negative response. Two alterations were made, however: 3-0 takes in the zone were removed, since that is almost an automatic take, and foul balls with two strikes were deemed true positive responses regardless of location since, with two strikes, hitters are going to widen their zone in order to “protect” at the plate.
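The coding rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the database query actually used: the function name, the px/pz argument names for horizontal and vertical location, the treatment of the zone edges as inclusive, and the reading of 0.9 as feet are all assumptions. Only sz_top, sz_bot, the 0.9 half-width, and the two alterations come from the description above.

```python
HALF_WIDTH = 0.9  # horizontal half-width of the zone from the article (assumed to be feet)


def classify_pitch(px, pz, sz_top, sz_bot, swung, balls=0, strikes=0, foul=False):
    """Code one pitch as 'TP', 'FN', 'FP', 'TN', or None if it is excluded.

    px, pz      : horizontal / vertical location (hypothetical argument names)
    sz_top/bot  : operator-set top and bottom of the zone for this hitter
    swung       : True for any swing (whiff, foul, ball in play)
    """
    # Alteration 2: two-strike fouls are true positives regardless of location.
    if foul and strikes == 2:
        return "TP"

    # Strict zone, no slack; counting the edges as "in" is an assumption.
    in_zone = abs(px) <= HALF_WIDTH and sz_bot <= pz <= sz_top

    if in_zone:
        if swung:
            return "TP"          # swing at a strike
        if balls == 3 and strikes == 0:
            return None          # Alteration 1: drop 3-0 takes in the zone
        return "FN"              # Type II error: took a strike
    return "FP" if swung else "TN"  # Type I error: chased / correct take
```

Summing the labels per hitter (for instance with collections.Counter, skipping the None values) gives the four totals that feed the rate calculations.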

Once each of the four categories was summed, I exported the results to Excel to add in the necessary calculations. First, we need to calculate the true positive rate and the false alarm rate. These are fairly simple compared to the others: the true positive rate is merely true positives divided by the sum of true positives and false negatives, which gives the percentage of pitches in the strike zone featuring a swing of some sort. The false alarm rate is the number of swings on pitches out of the zone divided by all pitches out of the zone, that is, false positives divided by the sum of false positives and true negatives.
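As a quick illustration of these two rates, here is a minimal Python sketch. The function names and the pitch counts are made up for the example; the counts are chosen only so that the first rate lands on .692.

```python
def true_positive_rate(tp, fn):
    """Share of in-zone pitches that drew a swing: TP / (TP + FN)."""
    return tp / (tp + fn)


def false_alarm_rate(fp, tn):
    """Share of out-of-zone pitches that drew a swing: FP / (FP + TN)."""
    return fp / (fp + tn)


# Hypothetical counts for one hitter, not taken from the PITCHf/x data.
tp, fn, fp, tn = 692, 308, 200, 800

hit_rate = true_positive_rate(tp, fn)   # 692 of 1000 in-zone pitches swung at
fa_rate = false_alarm_rate(fp, tn)      # 200 of 1000 out-of-zone pitches chased
```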

The next step involves finding the z-score for each of these rates, which is the point on a normally distributed curve with a mean of zero and a standard deviation of one below which that share of the area falls. In Excel, the NORMINV function comes into play. Chipper Jones has had a true positive rate of .692, so we would plug in =NORMINV(0.692,0,1), where the 0 and 1 correspond to the aforementioned mean and standard deviation. This step gets repeated for the false alarm rate, and the sensitivity rating itself is merely the true positive z-score minus the false alarm z-score. In the case of the Braves third baseman, 1.504 spits out, the third-highest sensitivity rating amongst batters to see 2000+ pitches over the last two seasons. Here are the top and bottom ten sensitivity ratings:


Player          Pitches   Sens.    Player            Pitches   Sens.
Daric Barton      2739    1.567    Ronny Cedeno        2105    0.881
Chris Iannetta    3071    1.527    Alexi Casilla       2504    0.874
Chipper Jones     3959    1.504    Victor Martinez     3740    0.873
Carlos Quentin    3352    1.467    Kendry Morales      2741    0.864
Carlos Pena       4481    1.456    Ryan Braun          4779    0.848
Milton Bradley    3703    1.451    Yuniesky Betancourt 3428    0.844
Joey Votto        4191    1.447    Michael Cuddyer     3436    0.838
Brad Hawpe        4702    1.445    Erick Aybar         3172    0.787
Lance Berkman     4469    1.445    Shane Victorino     4626    0.779
Akinori Iwamura   3805    1.426    Garret Anderson     3728    0.772
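For readers without Excel, the sensitivity calculation described before the table can be sketched in Python, where the standard library's NormalDist.inv_cdf plays the role of NORMINV. The function names are illustrative. Chipper Jones's .692 true positive rate comes from the article, but his false alarm rate is not published here, so the .158 used below is an assumption back-solved from his 1.504 rating.

```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist(mu=0.0, sigma=1.0)


def z_value(rate):
    """Equivalent of Excel's =NORMINV(rate, 0, 1); rate must lie strictly between 0 and 1."""
    return STANDARD_NORMAL.inv_cdf(rate)


def sensitivity(tp_rate, fa_rate):
    """Sensitivity = z(true positive rate) - z(false alarm rate)."""
    return z_value(tp_rate) - z_value(fa_rate)


z_tp = z_value(0.692)                    # roughly 0.50
chipper = sensitivity(0.692, 0.158)      # 0.158 is an assumed false alarm rate
print(round(z_tp, 3), round(chipper, 2))
```

A rate of exactly 0 or 1 has no finite z-value, and inv_cdf raises an error for it, which is one practical reason to restrict the list to hitters with a large sample of pitches.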

Remember, with sensitivity, higher numbers are better, and it should come as no surprise that some of these players fell into their respective bins. Players with good eyes like Chipper, Milton Bradley, and Berkman are not going to let too many zone pitches pass them by, nor will they fish out of the zone too often. On the flip side, Garret Anderson and Shane Victorino had the lowest success rates in this area, basically taking pitches when it would be more optimal to swing, and vice versa. What happens when we throw in their response biases, however, since not everyone obtains their sensitivities in the same fashion?


Player          Pitches  Sens.   Bias   Player            Pitches  Sens.   Bias
Daric Barton      2739   1.567  0.281   Ronny Cedeno        2105   0.881  1.058
Chris Iannetta    3071   1.527  0.436   Alexi Casilla       2504   0.874  0.679
Chipper Jones     3959   1.504  0.471   Victor Martinez     3740   0.873  0.662
Carlos Quentin    3352   1.467  1.127   Kendry Morales      2741   0.864  0.885
Carlos Pena       4481   1.456  0.772   Ryan Braun          4779   0.848  0.929
Milton Bradley    3703   1.451  0.594   Yuniesky Betancourt 3428   0.844  1.008
Joey Votto        4191   1.447  0.836   Michael Cuddyer     3436   0.838  0.764
Brad Hawpe        4702   1.445  0.821   Erick Aybar         3172   0.787  1.047
Lance Berkman     4469   1.445  0.717   Shane Victorino     4626   0.779  0.774
Akinori Iwamura   3805   1.426  0.442   Garret Anderson     3728   0.772  0.938

Here is where the dichotomy of strategies emerges, as Barton scored so high by rarely swinging, while Quentin was much more balanced, with a response bias closer to 1.0 than anyone else in his group. Interestingly enough, the hitters with lower sensitivity scores were much more balanced in terms of response bias, suggesting that they didn’t necessarily discriminate in making errors; they just flat-out made errors.

Calculating the response bias itself is a bit more tedious than the sensitivity. To determine which way the hitters sway, we need to compute the phi statistic for the true positive z-score and for the false alarm z-score. This is achieved by raising ‘e’ to the negative of the squared z-score for each, with the result divided by the square root of two times pi. In Excel, EXP(2.4) calculates ‘e’ raised to the power of 2.4. Once both phi statistics are calculated, dividing the false alarm phi by the true positive phi produces the response bias. Here are the leaders and trailers:


Player               Bias   Dev-1.0     Player              Bias   Dev-1.0
Aaron Rowand        0.999    0.001      Denard Span        0.363    0.637
Mike Jacobs         0.996    0.004      Miguel Olivo       1.661    0.661
Yuniesky Betancourt 1.008    0.008      Bobby Abreu        0.321    0.679
Rajai Davis         0.990    0.010      Nick Johnson       0.315    0.685
Carlos Gonzalez     0.986    0.014      Daric Barton       0.281    0.719
Brandon Moss        0.983    0.017      Marco Scutaro      0.267    0.733
Miguel Cabrera      0.982    0.018      Luis Castillo      0.252    0.748
Ross Gload          1.020    0.020      Josh Hamilton      1.927    0.927
Ty Wigginton        1.024    0.024      Vladimir Guerrero  2.151    1.151
Marlon Byrd         1.025    0.025      Pablo Sandoval     2.483    1.483
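The response bias and the Dev-1.0 column can be sketched in Python as follows. The sketch follows the verbal recipe literally: 'e' raised to the negative squared z-value, divided by the square root of two times pi. The textbook standard normal density puts a further one-half in the exponent; the literal reading is kept here because, when Chipper Jones's false alarm z-value is back-solved from his .692 true positive rate and 1.504 sensitivity, it lands on the 0.471 bias listed for him. The function names and the back-solving step are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def phi_as_described(z):
    """exp(-z^2) / sqrt(2*pi), as worded in the article (textbook density uses -z^2 / 2)."""
    return math.exp(-z * z) / math.sqrt(2 * math.pi)


def response_bias(z_tp, z_fa):
    """False alarm phi divided by true positive phi; below 1.0 leans toward taking."""
    return phi_as_described(z_fa) / phi_as_described(z_tp)


def deviation_from_balance(bias):
    """The Dev-1.0 column: absolute distance of the bias from 1.0."""
    return abs(bias - 1.0)


# Chipper Jones: .692 true positive rate and 1.504 sensitivity are from the article;
# the false alarm z-value is back-solved from them (an assumption of this sketch).
z_tp = NormalDist().inv_cdf(0.692)
z_fa = z_tp - 1.504
bias = response_bias(z_tp, z_fa)
print(round(bias, 2), round(deviation_from_balance(bias), 2))
```

Sorting hitters by deviation_from_balance in ascending order reproduces the ordering logic of the table, with the most balanced hitters first.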

Sorted by the absolute deviation from 1.0, we essentially see a top ten filled with guys upon whom no real disciplinary reputation has been bestowed. Jacobs and Betancourt are known more as free-swinging, low-OBP players, but this group comprises the most balanced hitters over the last two years in terms of where their discipline, or lack thereof, is derived. Rowand is equally likely to take a pitch down the middle as he is to swing at one in the dirt. Across the table we start to see some familiar names, a cast of characters consisting of reputed patient hitters and their exact opposites. Does it surprise you that Bobby Abreu and Nick Johnson’s performances are so imbalanced? It makes sense given how often they take pitches, but perhaps this idea works similarly to the idea of a break-even stolen-base success rate, in that these two on-base luminaries take too often, while the likes of Hamilton, Guerrero, and Sandoval swing far too often.

One important caveat here is that discipline does not always translate into positive results, as our most sensitive hitter, Daric Barton, is a borderline replacement-level hitter. Likewise, the trailers in sensitivity have a handful of All-Star appearances between them and would produce a fairly formidable lineup if placed on the same team.

This is but a granular approach to determining which hitters are more or less likely to make mistakes at the plate, under the assumption that out-of-zone swings and in-zone takes are mistakes, with alterations for 3-0 takes and two-strike fouls. This assumption does not operate without faults, as the strike zone as defined here is very strict and robotic, when in reality it more closely resembles a LOESS-like, or oval, shape. While enhancements such as a more accurate definition of the zone and perhaps some more restrictions or constraints can improve this method, the method itself remains valid and worth investing time into as a means of digging deep into discipline. Knowing about a hitter’s tendencies in terms of making mistakes and the direction in which he leans can only aid pre-formed scouting reports.

The methodology and data above incorporated all pitches from 2008-09, meaning that there was no segregation by pitch type. Moving forward, I plan on looking at specific pitches as a means of determining, say, the levels of sensitivity and bias inherent in hitters as a curveball comes their way, which can certainly help pitchers to understand who will take a “get me over” and who will flail at a dirt-grab offering. Additionally, we will revisit Dan Fox’s fish-eye method and compare the results to hammer home which hitters could optimize their approach by swinging more or offering less.

Eric Seidman

Gordon

11/06

Seems like a good method to check whether players have or haven't made adjustments or improvements over time, and where those improvements came from (with your future expansion into pitch type). I think that if you translate this into an "aging" curve, you could identify why certain prospects blossom into stars and why some fail. I'm sure there are other reasons besides the ones you are examining here, but I would expect this to explain at least some of them.

EJSeidman

11/06

Gordon, exactly! Right now we just have 2008 and 2009, but something in my queue is a comparison. Did Luis Castillo have a response bias of 0.5 in 2008 and 0.1 in 2009, etc.? Did Player X see the drastic improvement in results that his improved signal-detection process indicated?

lichtman

11/06

Great stuff Eric! I think it is pretty critical that you break the numbers up into counts (and even game situations, like base runners, outs, score, etc.), or adjust for those things, although obviously you are going to run into serious sample size issues then.

Also, a batter's tendencies have a lot to do with his success when making contact or his ability to make contact. I realize that you are trying to look at these tendencies independent of a player's overall hitting success, his success when making contact, or his ability to make contact, but readers need to be careful about concluding anything about whether a batter is optimizing his approach without taking into consideration some measure of success. For example, a player like Vladdy can afford to swing at pitches out of the zone because he is so good at it (making contact and hitting the ball hard when he does). A player like Castillo cannot, because, for example, even if he were able to make contact on pitches out of the zone, he would not hit them very hard.

EJSeidman

11/06

Very, very true. The ideal, it seems, would be to extend this to contact, breaking the test outcomes into their own test: a test solely on swings, where the desired result is contact on a swing. As the PITCHf/x sample grows, the counts can definitely be broken down further, and I'm really curious about the curveballs, changeups, sliders, etc., since that could conceivably help pitchers realize they can throw a curveball out of the zone more often to a certain hitter, given his propensity to chase them.

taylorcp

11/07

I've thought about this issue before (I've used SDT in my PhD thesis), and when I thought it through, operationalizing the type I/II errors in this way may not actually capture what this analysis wants to. Take a couple of extreme examples: the young Vlad (heck, even the older Vlad), who can hit any ball tossed near the plate (or bounced before it), or the anecdote (was this about Giambi?) that there was a zone one ball high and one ball wide where he'd swing and miss, but next to this zone was an area where he would crush the ball.

What I'm getting at is that letting a pitch go in the strike zone is not always a type II error, nor is swinging at a pitch outside the zone always a type I error. Not that this analysis isn't great, but I think that more than a correction for the oval shape of the zone needs to be done. The data need to be combined with something like "hit tracker" to establish a "zone of hittable pitches" for each hitter.
Simply, if a player can't hit the high fastball in the zone for anything but an IF pop-up, there's no point in him doing anything but taking the pitch, and calling that an error is a tad misleading.

Ooops. Sorry, I ought to have read the comments before posting... as I've repeated some of the content of previous posts.

EJSeidman

11/07

Not a problem. I'm actually more concerned with the contact aspect than redefining the strike zone. It seems the goal would be to treat swings as their own test: true pos = in zone contact, true neg = out of zone contact, false pos = in zone whiff, false neg = out of zone whiff. Then to run through the same process for swings and factor that into the results found through this overall process.

That way, we get an idea of sensitivity towards making contact; players with lower swing sensitivities--those more prone to in zone and out of zone whiffs--are the ones with suboptimal strategies IF they have a response bias in the method proposed in this article that is skewed towards swinging too often. It would tell us that they swing too often AND those swings don't amount to anything.

On the flipside, those with high sensitivity marks for swings and a response bias above 1.0 in THIS study wouldn't be penalized as much. Vladdy would fall into this category. His response bias is through the roof with an average or so sensitivity but his strategy is still sound since he connects on most of his swings.

rawagman

11/07

Eric - I don't have the mathematical chops to comment on the calculations you have run, but giving you the benefit of the doubt on that end, your analysis is a great first step to real usable knowledge. Again.
Well played.
On a somewhat related front to those of other commenters about how certain hitters have "hot spots" that may be out of the strike zone, did anybody else notice this postseason how Hideki Matsui was seemingly always crushing low breaking balls?

ZamBR1985

11/07

Fantastic article. If/when my brain is able to process this information, I'll be able to comment on some specifics.

irablum

11/09

I'm guessing that this data, properly filtered, and maybe even more broken down, is available to advance scouts for teams. If it's not, can you imagine how much that could help a team? They could look at the next day's lineup and go beyond the normal advance scouting of "Player X is a first-ball hitter" and say "Player X has an 80% chance of swinging at a first-pitch fastball inside the strike zone, and a 50% chance of swinging at one outside the strike zone. Player Y has a 30% chance to swing at a first-pitch fastball in the strike zone and a 5% chance for one outside the zone."

blcartwright

11/10

You used Luis Castillo as an example, and this metric charges Castillo with errors for taking called strikes. However, Castillo has just about the best contact rate on swings, both at avoiding swings and misses and at putting the ball in play. I'm sure in his mind, he can afford to take a strike or even two, because with virtually every swing making contact he knows he won't strike out, even with two strikes. Therefore, the 'cost' of a called strike is very low for Castillo.

Guys like Chipper and Berkman swing and miss much more often, and thus can't afford to give the pitcher a strike. They have to swing more often to maximize the probability of putting the ball in play.

Try looking at results passing thru two strike counts - Castillo should have a much lower SO% starting at two strikes than do the lower contact guys.

Some players also vary their approach based on the count. My eyes tell me that Freddy Sanchez is reasonably disciplined with 0 or 1 strikes - he takes pitches out of the zone at a fairly normal rate, and probably swings at a higher than normal rate on those in zone. But, with 2 strikes he will swing at anything within 3 feet of the plate.

EJSeidman

11/10

Brian, the article this week discusses contact. Castillo is well above average in contact in and out of the zone, but even that isn't enough since we need to know what TYPE of contact he's making. So sit tight!
