Last Friday, I discussed plate discipline at length, noting that this commonly cited facet of performance extends beyond its synonym of patience and into the realm of making fewer response mistakes in a given trip to the dish. I introduced signal detection theory as a means of more accurately measuring which hitters produce the correct responses most often, since having good plate discipline must also cover the optimization of in-zone pitches and not merely how often a hitter chases.

As a brief refresher, here is what I did last week: using PITCHf/x data, I coded swings on pitches thrown in the strike zone as true positive responses; takes on pitches out of the zone as true negatives; takes on zone offerings as false negatives; and swings on pitches out of the zone as false positives. Two metrics were introduced as well: sensitivity and response bias. The former measures how mistake-prone a hitter is, with higher figures translating to a lower rate of erring in decision. The latter serves as a barometer of where a hitter’s mistakes are biased, with the ideal output hovering right around the 1.0 mark. Below that threshold, hitters are more likely to take in-zone pitches, while figures in excess are biased towards, well, think Vladimir Guerrero. An important aspect of discipline not discussed last week was the actual results of these decisions. Without factoring in the likelihood of contact and/or positive results, we really cannot determine which hitters are stepping into the box with suboptimal strategies.
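The coding scheme from last week can be sketched in a few lines of Python. This is a minimal illustration rather than the query actually run against the PITCHf/x database: the function names and the (in_zone, swung) pair layout are assumptions made for the example, and the sample pitches are invented.

```python
from collections import Counter

def classify(in_zone, swung):
    """Map one pitch to its signal detection outcome."""
    if in_zone:
        return "true_positive" if swung else "false_negative"
    return "false_positive" if swung else "true_negative"

def response_rates(pitches):
    """Return (counts, hit_rate, false_alarm_rate) for (in_zone, swung) pairs.

    hit_rate is the share of in-zone pitches swung at; false_alarm_rate is
    the share of out-of-zone pitches swung at.
    """
    counts = Counter(classify(z, s) for z, s in pitches)
    zone = counts["true_positive"] + counts["false_negative"]
    ooz = counts["false_positive"] + counts["true_negative"]
    hit_rate = counts["true_positive"] / zone if zone else None
    fa_rate = counts["false_positive"] / ooz if ooz else None
    return counts, hit_rate, fa_rate

# Invented sample (not real data): three pitches in the zone, four out of it.
sample = [(True, True), (True, True), (True, False),
          (False, True), (False, False), (False, False), (False, False)]
counts, hit_rate, fa_rate = response_rates(sample)
```

The two rates are the raw ingredients that sensitivity and response bias are built from in signal detection work.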

Think of it this way: Luis Castillo was found to be league average in terms of his sensitivity rating, but he was heavily biased towards keeping the bat on his shoulder. Due to the rather extreme bias, he was the recipient of extra ball calls, all of which came at the expense of many more called strikes. By comparison, Hunter Pence had a near-identical sensitivity rate with a response bias practically sharing a room with the 1.0 neutrality, meaning that he was as likely to take a pitch in the zone as he was to chase one down in the dirt, which made him less exploitable and, to an extent, less prone to mistakes. But the logical assumption given this information, that Castillo utilized a suboptimal strategy, was incomplete: a good chance remained that he rarely swung because he understood his limitations and knew that earning freebies might hold the highest probability for him to reach first base. The question then becomes: How often did Castillo make contact, and what type of contact was made?

Baseball Info Solutions publishes plate discipline data that keeps track of swings and contact made both in and out of the zone, but I decided to use my PITCHf/x database in order to stay consistent within this entire methodology. Using the same parameters used to code swings and takes in and out of the zone, three columns were added (zone, swing, and contact) in order to make querying easier. Each column was binary in nature, where 1=yes and 0=no, making the PITCHf/x versions of these contact rates more easily calculable.
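With the three binary columns in place, the contact rates reduce to a filter and a mean. The sketch below assumes each pitch is a dictionary keyed by the column names mentioned above; the helper name and the sample rows are invented for illustration and are not drawn from the actual database.

```python
def contact_rate(rows, in_zone):
    """Contact per swing for pitches in the zone (in_zone=1) or out of it (0)."""
    swings = [r for r in rows if r["swing"] == 1 and r["zone"] == in_zone]
    if not swings:
        return None  # no swings of this type, so the rate is undefined
    return sum(r["contact"] for r in swings) / len(swings)

# Invented rows; each flag is 1=yes, 0=no, as described above.
rows = [
    {"zone": 1, "swing": 1, "contact": 1},
    {"zone": 1, "swing": 1, "contact": 0},
    {"zone": 1, "swing": 0, "contact": 0},
    {"zone": 0, "swing": 1, "contact": 1},
    {"zone": 0, "swing": 0, "contact": 0},
]
zone_contact = contact_rate(rows, in_zone=1)  # one contact on two zone swings
ooz_contact = contact_rate(rows, in_zone=0)   # one contact on one chase
```

Takes are excluded from the denominator, so the figure measures contact per swing rather than contact per pitch.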

Looking at anyone who saw 2000 or more pitches over 2008-09, the league-average rates were 87.9 percent for zone contact and 68.2 percent for out-of-zone contact. In this same span, Luis Castillo made contact on 95.9 percent of the pitches in the strike zone at which he offered, while getting the bat on the ball 87.9 percent of the time on out-of-zone pitches. Castillo made contact at a much better rate than the league. Again, it would seem that all signs indicate his strategy is not optimal. He doesn’t swing all that much, but clearly boasts an ability to make contact whenever he swings. Why not swing more often?

Well, these numbers strictly counted the number of times contact was made, but contact is a rather ambiguous term. Contact literally refers to connecting the bat with the ball, and there are more ways than one to accomplish this feat. Aside from the differences in grounders, liners, and fly balls, and the various subsets of each of those (sharp liners, frozen ropes), there is the simpler separation of balls put in play and those fouled off. If Castillo is making a ton of contact, but a hefty portion of that can be attributed to foul balls and spoiling pitches, his contact rates may be misleading. Sure, fouling off pitches in many cases is a better result than a whiff, but such a situation makes it less clear whether or not Castillo could have done better for himself by offering a bit more often. The simple solution here is to add a fourth binary column recording whether or not a foul occurred on a swing. On a pitch that the batter fouled off in the zone, all of the zone, swing, contact, and foul columns would display a ‘1.’ Looking at overall foul-ball rates to avoid any type of sample-size issue potentially inherent in a zone/out-of-zone breakdown, the league average came out to 48.1 percent. Slightly less than half of all contacted balls amongst these 2000+ pitch batters were of the foul-ball variety.
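The foul-ball rate follows the same pattern once the fourth column exists: fouls divided by all contacted swings, with zone and out-of-zone pitches pooled. As before, this is only a sketch; the dictionary layout, the helper name, and the sample rows are assumptions for illustration.

```python
def foul_rate(rows):
    """Share of contacted swings that were fouled off, zone and non-zone pooled."""
    contacted = [r for r in rows if r["swing"] == 1 and r["contact"] == 1]
    if not contacted:
        return None  # no contact at all, so the rate is undefined
    return sum(r["foul"] for r in contacted) / len(contacted)

# Invented rows; each flag is 1=yes, 0=no.
rows = [
    {"zone": 1, "swing": 1, "contact": 1, "foul": 1},  # fouled off in the zone
    {"zone": 1, "swing": 1, "contact": 1, "foul": 0},  # put in play
    {"zone": 0, "swing": 1, "contact": 1, "foul": 1},  # chased and fouled off
    {"zone": 0, "swing": 1, "contact": 0, "foul": 0},  # swing and miss
    {"zone": 1, "swing": 0, "contact": 0, "foul": 0},  # taken in the zone
    {"zone": 1, "swing": 1, "contact": 1, "foul": 0},  # put in play
]
rate = foul_rate(rows)  # two fouls on four contacted swings
```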

Castillo clocks in at 39.2 percent. Now it’s getting really interesting, as he swings less often than anyone in the game (his 31.7 percent swing rate was substantially lower than that of Bobby Abreu, who was second with 34.0 percent), and Castillo makes more contact than anyone in the game on those rare swings. Only five hitters (Cristian Guzman, Jeff Keppinger, Pedro Feliz, Placido Polanco, and Vernon Wells) produced lower foul-ball rates on said contact. While this confluence of characteristics should once again suggest that Castillo wields a suboptimal batting strategy, the aforementioned “Fewer Fouls Than Castillo” quintet does not exactly consist of perennial All-Stars or players reputed to boast solid discipline. This suggests that yet another step is needed to truly gauge strategy optimization: what happens on the non-foul contacted balls?

Unfortunately, our journey in this regard has to be put on hold at this juncture because there simply is not any reliable and freely available data on the percentage of balls in play that were hit sharply or weakly. The MLB Gameday application attempts to break things down in this fashion, occasionally coding balls as sharply hit or weak, but the lack of consistency across parks and the fact that not every event is coded in this fashion ultimately precludes its use in this arena. This exercise reflects why HITf/x data, whenever it becomes available, is going to be immensely important, as the velocity off the bat when contact is made is the final, missing piece to determining disciplinary optimization.

Before looking at a few more selected players under this lens, I did want to point out that there was an ever-so-slight miscalculation in the response bias last week: in computing the phi statistics (the normal densities) for true positives and false alarms, the squared z-scores were multiplied by negative one in the exponent. The correct version is to multiply by negative one-half. It does not change a whole heck of a lot, but I felt compelled to point this out. The sensitivity ratings were still valid, but here are the top and bottom ten revised response biases, where the top group has the least absolute deviation from 1.0 with the inverse true of the bottom group.

Player              Bias     Dev     Player             Bias     Dev
Aaron Rowand        1.000   0.000    Pablo Sandoval     1.576   0.576
Mike Jacobs         0.998   0.002    Luis Castillo      0.502   0.498
Yuniesky Betancourt 1.004   0.004    Marco Scutaro      0.516   0.484
Rajai Davis         0.995   0.005    Daric Barton       0.530   0.470
Carlos Gonzalez     0.993   0.007    Vladimir Guerrero  1.467   0.467
Brandon Moss        0.991   0.009    Nick Johnson       0.561   0.439
Miguel Cabrera      0.991   0.009    Bobby Abreu        0.566   0.434
Ross Gload          1.010   0.010    Denard Span        0.602   0.398
Ty Wigginton        1.012   0.012    Josh Willingham    0.609   0.391
Marlon Byrd         1.012   0.012    Josh Hamilton      1.388   0.388
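
The correction described above the table can be sketched as follows. Several things here are assumptions rather than the method actually used: sensitivity is shown as the textbook d' (which may or may not be the exact formula behind the published figures), the response bias ratio is ordered so that values below 1.0 indicate a lean toward taking (textbook signal detection usually writes beta as the reciprocal, with values above 1.0 marking conservative responders), and the function names and sample rates are invented.

```python
from math import exp
from statistics import NormalDist

def z(p):
    """Inverse standard normal CDF (z-score) of a rate strictly between 0 and 1."""
    return NormalDist().inv_cdf(p)

def sensitivity(hit_rate, fa_rate):
    """Textbook d': z(hit rate) minus z(false-alarm rate)."""
    return z(hit_rate) - z(fa_rate)

def response_bias(hit_rate, fa_rate, factor=-0.5):
    """Ratio of normal densities at the two z-scores.

    factor=-0.5 is the corrected exponent; factor=-1.0 reproduces the
    miscalculation. ASSUMPTION: the ratio is ordered so that values below
    1.0 mean a lean toward taking, matching the reading used in the text.
    """
    return exp(factor * (z(fa_rate) ** 2 - z(hit_rate) ** 2))

def deviation(bias):
    """Absolute distance from the neutral 1.0 mark (the Dev column)."""
    return abs(bias - 1.0)

# Invented rates for a patient hitter: swings at half the strikes, chases 12%.
corrected = response_bias(0.50, 0.12)
miscalculated = response_bias(0.50, 0.12, factor=-1.0)
```

In this sketch the miscalculated value is simply the square of the corrected one, so hitters near 1.0 barely move while more extreme values shift further.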

Again, this tells us where they sway in terms of making mistakes. Aaron Rowand is literally neutral in this regard, meaning that without discussing his actual level of making mistakes, he is perfectly balanced when mistakes are made. Pablo Sandoval and Luis Castillo are interesting to compare given their positions on opposite sides of the response bias spectrum: Sandoval is more likely to err on the side of swinging too frequently than Castillo is to continue playing the role of Nicholas Noswing in the off-Broadway hit, Johnny Get That Bat Off of Your Shoulder.

The largest deviations here also tend to suggest that hitters are more likely to err on the side of caution than to let loose. In fact, the average response bias across the qualifying players is 0.845, indicating a rather substantial league-wide bias towards taking called strikes. In any event, there are a few players I was asked about when discussing this type of methodology who seem worthy of further analysis in this forum. Keep in mind the following league averages: Sensitivity (1.144), Zone Contact (87.9 percent), OOZ Contact (68.2 percent), Fouls (48.1 percent), and Response Bias (0.844, even though the deviation from 1.0 is of more interest).

  • Vladimir Guerrero:
    His 1.122 sensitivity is right in line with the league, as is his 88.1 percent rate of zone contact. His 73.1 percent out-of-zone contact rate exceeds the league average, however; so does his 49.5 percent rate of foul balls. Combined with a response bias heavily skewed towards swinging at pitches out of the zone, Guerrero appears to be Castillo’s free-swinging analog. He swings much more often and contacts a higher raw number of pitches, but much of that comes in the form of foul balls. If his rate of well-hit balls exceeded the average, our assessment would shift. Had this study been conducted a decade ago, different results would likely have surfaced, as balls Guerrero can now only hope to foul off may have gone for extra-base hits or solidly hit liners.

  • Carlos Quentin:
    The White Sox corner outfielder boasts the best pairing of high sensitivity and neutral response bias, with the fourth-highest sensitivity and a 1.062 response bias. He is essentially league average on zone contact while below average in terms of out-of-zone contact. He also fouls off a higher percentage of balls than the league. This seems like the perfect recipe for a player of average discipline, not a league leader.

  • Jack Cust:
    All signs here point towards an optimal strategy being used for a rather suboptimal player. Cust is well below average in zone contact (73.5 percent) and out-of-zone contact (49.5 percent), with one of the highest foul-ball rates at 55.6 percent. His response bias of 0.614 indicates a preference for not swinging, and with good reason, as he makes much less contact than the league, and a higher-than-average percentage of said contact comes in the form of foul balls. The rest of his contact may be balls hit over the wall or hit sharply around the diamond, but these numbers tend to agree with his strategy.

Again, we need some sort of measure of what type of contact was made before really moving forward, but I hope this provides the framework for future studies of plate discipline. We need to look past the idea that discipline equals not chasing bad pitches, and get more in tune with the idea that disciplined hitters understand their abilities and limitations: they know which types of pitches they can make decent contact with and do not let these pitches go by, while spitting on pitches they do not feel capable of handling very well. A player can utilize a completely optimal strategy while batting and still be a suboptimal hitter, which is more on the front office than on the abilities of the individual player, another important distinction to make. Knowing which way batters are biased in terms of making mistakes, as well as their individual proneness to making mistakes in the first place, is incredibly valuable, but the missing ingredient of what happens when contact is made is of equal value on its own, since it can help redefine which pitches and events should be classified as errors in signal detection studies.