Last week I wrote about how thinking about the zone in a probabilistic way could inform a better approach to plate discipline. In brief, I wrote that the zone as a discrete box above the plate does not exist. In its place, we can judge each pitch according to the probability that it will be called a strike, building into our estimate some of the factors which we know change the geometry of the zone. Examining plate discipline in this fashion proved illuminating, not to mention predictive of walk rates.

There’s a further step we can take with this probabilistic zone, which is to bring in linear weight information in order to judge the actions of each batter. In so doing, we can get an idea of the value of a hitter’s decision-making translated into the fundamental currency of baseball, runs.

Plate discipline is fundamentally about knowing when to take a pitch. To a first approximation, taking a pitch is good when the probability of the pitch being a ball exceeds the probability that the pitch will be a strike. In this case, the count will be advanced in a way that is favorable to the batter. However, the situation is a little more complicated than that, because the value of balls and strikes depends to some extent on the position in the count. For example, a take becomes more valuable at 3-0, because even if the pitch isn’t likely to be a called ball, if it is, the hitter gets a walk (a high-value play). In this case, it might be more valuable to take a pitch that is slightly more likely than even to be a strike, because the reward (a walk) dramatically exceeds the penalty (a slightly less advantageous count).

To integrate probability and linear weights we need a concept called the expected value. The expected value of a decision to take is simply the probability of a strike multiplied by the value of that strike, added to the probability of a ball multiplied by the value of that ball. Value is ascertained by linear weights computed by Harry Pavlidis; probability of a ball/strike is determined by a machine learning algorithm I’ve used and discussed before.
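
As a rough illustration of the calculation described above, here is a minimal Python sketch. The function names and the run values in it are hypothetical placeholders chosen only to show the arithmetic; they are not Harry Pavlidis's linear weights, and the strike probability would in practice come from a fitted model rather than being typed in by hand.

```python
def expected_take_value(p_strike, strike_value, ball_value):
    """Expected run value of taking a pitch.

    p_strike     -- modelled probability the pitch is called a strike
    strike_value -- run value of a called strike in this count (usually negative)
    ball_value   -- run value of a called ball in this count (usually positive)
    """
    if not 0.0 <= p_strike <= 1.0:
        raise ValueError("p_strike must be between 0 and 1")
    # Probability-weighted sum of the two possible outcomes of a take.
    return p_strike * strike_value + (1.0 - p_strike) * ball_value


def break_even_strike_prob(strike_value, ball_value):
    """Strike probability at which a take is worth exactly zero runs.

    Assumes strike_value < 0 < ball_value. Below this probability the
    expected value of a take is positive.
    """
    return ball_value / (ball_value - strike_value)


# Hypothetical run values for illustration only (NOT real linear weights).
# A ball at 3-0 is a walk, so its assumed value is much larger.
EXAMPLE_VALUES = {
    "1-1": {"strike": -0.05, "ball": 0.04},
    "3-0": {"strike": -0.06, "ball": 0.12},
}

for count, v in EXAMPLE_VALUES.items():
    threshold = break_even_strike_prob(v["strike"], v["ball"])
    print(count, "break-even strike probability:", round(threshold, 3))
```

Under these assumed values, the break-even strike probability at 3-0 sits above one half, which is the point made earlier: when a ball is worth much more than a strike costs, a take can still have positive expected value on a pitch that is more likely than not to be called a strike.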

Using this framework, I computed the expected value of each taken pitch for all hitters in the league with more than 1500 pitches in 2014. These were my league leaders, on a rate basis:

I’m showing here the value (in runs per taken pitch [second column]), along with several of the statistics I introduced last week (third to fifth columns) and the walk rate (last column). There’s a strong overlap between this list and the leaderboard for the plate discipline statistics I introduced last week. Unsurprisingly, hitters with good responsiveness and low baseline swing rates get more value from their take decisions. Also unsurprising are the corresponding walk rates, which range from just below average (Pablo Sandoval) to outstanding (Giancarlo Stanton).

Here are the worst in the league, on a rate basis, at gaining run value from their take decisions:

Ben Revere doesn’t actually get zero value from his decisions not to swing, but it’s sufficiently small that it disappears via rounding. His take decisions are far and away the worst at gaining run value in the league. There’s at least one surprise on this list, and that’s ever-confusing Matt Carpenter. Carpenter walks plenty, has good plate discipline statistics, and yet gains little value, at least on average, by taking pitches. His success despite that apparent weakness suggests he might be getting the better of the pitcher in a way the model can’t fully understand.

Now let’s shift gears, and look at the league leaders on a cumulative basis. This leaderboard will not necessarily be identical to the rate leaderboards I showed before, because hitters differ significantly in their propensity to swing. So an aggressive hitter like Josh Hamilton might get reasonable value from taking pitches on the rare occasions he decides to take, but not gain much cumulative value because of the paucity of his takes.
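
The difference between the rate and cumulative views can be sketched in a few lines of Python. This is only an illustration under assumed inputs: the record layout (`batter`, `swung`, `take_value`) and the function name are hypothetical, and `take_value` stands in for the expected run value of taking that pitch, however it was computed.

```python
from collections import defaultdict


def summarize_takes(pitches, min_pitches=1500):
    """Summarize take value per batter on a rate and a cumulative basis.

    pitches -- iterable of dicts with hypothetical keys:
               'batter' (str), 'swung' (bool), 'take_value' (float)
    Only batters who saw more than min_pitches pitches are kept.
    """
    seen = defaultdict(int)      # pitches seen per batter
    takes = defaultdict(int)     # pitches taken per batter
    total = defaultdict(float)   # summed expected value of taken pitches

    for p in pitches:
        seen[p["batter"]] += 1
        if not p["swung"]:
            takes[p["batter"]] += 1
            total[p["batter"]] += p["take_value"]

    summary = {}
    for batter, n_seen in seen.items():
        # Skip small samples and batters with no takes (rate undefined).
        if n_seen <= min_pitches or takes[batter] == 0:
            continue
        summary[batter] = {
            "takes": takes[batter],
            "value_per_take": total[batter] / takes[batter],
            "cumulative_value": total[batter],
        }
    return summary
```

A free swinger who takes rarely but well can post a high `value_per_take` and still trail a patient hitter in `cumulative_value`, simply because the patient hitter accumulates many more takes.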

Indeed, we see Josh Hamilton drop off, and low swing rate batters like Carlos Santana rise to the top. That Hamilton falls off the list suggests that he might benefit by curtailing his perpetually swinging tendencies. By the same logic, we can applaud Santana and Ryan Howard, who harness their patience to the fullest extent and derive dozens of runs’ worth of value as a result.

On the other side were my league laggards:

Just like before, Ben Revere is far, far at the bottom of the take value list, with less than a single run gained from his decisions to hold. These players lose some 30 runs of value in comparison to the most patient hitters I enumerated above. In this list in particular, as well as the corresponding previous list of lowest run value per take, I see a stereotype taking shape: that of a light-hitting defensive specialist (embodied perhaps by Jackie Bradley).

This stereotype brings an important point to the fore. There’s a strong correspondence between the cumulative (and rate) value statistics and the mean strike probability for each hitter. (Just like it sounds, the mean strike probability describes the average probability that a hitter will see a strike, as reckoned by my model.) Ben Revere has one of the highest average strike probabilities, because pitchers don’t much fear his singles-slinging bat.

As a consequence, Revere probably shouldn’t take many pitches. They are more likely to be strikes than not, and so the average Ben Revere take is of little value. On the flip side, look at newly rich Giancarlo Stanton: With one of the lowest average strike probabilities, he has to take pitches quite often. If he didn’t, he’d be constantly swinging at balls in the dirt, six feet off the ground, or three feet off the plate. When Stanton quite wisely chooses not to swing, the metric I’ve introduced rewards him handsomely. But there’s something unfair here, in that if Ben Revere were seeing all of the same pitches Stanton got, he might be deriving lots of value from his take decisions as well. Indeed, if you look at the modelled patterns of plate discipline of Stanton and Revere (at a 1-1 count), they are quite similar.

Despite the gross similarity in their approaches, Stanton and Revere achieve dramatically different outcomes. This disparity lies at the heart of the matter. Even though plate discipline is a conceptually separable skill in hitting, it nevertheless interacts with all of the other skills of hitting in interesting ways. Intellectually, plate discipline is a very different ability from bat speed, but in practice, all of a hitter’s skills are wrapped up together in one package. A player with high ISO is rewarded by seeing pitches that are less likely to be strikes, which in turn affects his ability to get walks. Raw strength, in addition to driving the ball, buys walks.

Would that we could feed a constant and controlled diet of pitches to each hitter, so we might be able to figure out the true value of Revere’s plate discipline. In practice, this value is confounded with Revere’s contact ability and his power. For this reason, it isn’t entirely appropriate to divvy up the worth of a hitter into individual components like plate discipline. Next time, I’ll try to take a more global view of hitting, by analyzing the total value of a hitter’s decision-making along with his skill when he decides to swing. This objective can be accomplished by examining the marginal value of a hitter’s decision to swing or not swing, relative to the opposite decision he could have made.

In the meantime, I am including a spreadsheet with all of the statistics I wrote about today and last week, for your viewing and sabermetric pleasure. These statistics aren’t meant to be the final word on the subject, as there are various aspects of their calculation that could be improved upon (the model for determining strike probability, for example, could incorporate additional factors). Rather, I hope these can be useful as starting points for future investigations.


This is a free article. If you enjoyed it, consider subscribing to Baseball Prospectus. Subscriptions support ongoing public baseball research and analysis in an increasingly proprietary environment.

I am surprised not to see Adam Jones at the bottom of the list.
Jones seems to be another one of those guys (like Sandoval, who I discussed last time) who swings too much, but also swings much more at stuff in the zone than outside it. Also like Sandoval, he sees many fewer strikes than average, so the high baseline rate of swinging is to his detriment.
This is fantastic work Robert, I love it. Looking forward to the next installment.

Related to this comment, I have an article at The Hardball Times on Monday that shows a particular type of pitch sequence that Adam Jones has trouble handling. I think he's at the bottom (or top, depending how you look at it) of the list.

Thank you, Jon, and that sounds great, I am looking forward to reading it.
I hope Javier Baez is a subscriber because I'm going to forward this article his direction.
I'm not really seeing the value of a metric like this. The "value" of a take is certainly not the run average value of the result. It is the difference between that and the run value of a swing (at that pitch on that count). The latter is hard to compute.

As you see in the chart the average value of a take goes up the more you swing at everything and only take pitches which are clearly balls. As far as cumulative value you can also see from the chart that those with the highest total value are the best power hitters simply because they get so many pitches on the border of and outside the zone.

So I'm not seeing how either rate or cumulative value tells us anything at all about a batter.
"The "value" of a take is certainly not the run average value of the result. It is the difference between that and the run value of a swing (at that pitch on that count)."

That (the marginal value of a take, relative to a swing) is the goal; this article is a step towards that goal.
Could Matt Carpenter's issue be that he takes too many pitches? In 2014, no qualified batter in MLB swung less than Carpenter did, by far. His O-Swing%, swings at balls out of the zone, was 19.3% (lowest in MLB behind Crisp, Dunn, and Santana). With some exception, that's good. But his Z-Swing%, swings at balls in the zone, was 49.4%, lowest in MLB behind Dozier, Hardy, and Prado. (Obviously, Carpenter is lowest in overall Swing% as well, at 33.1%. The next on the list is Brett Gardner at 37.0%! Carpenter is quite the outlier.)

Not surprisingly, we see Martin Prado on your list of laggards as well. And if you lower the threshold to 400 PA, Sam Fuld comes in with a Z-Swing% that is the 7th lowest and Swing% that is the 16th lowest. It turns out that guys at both extremes are hurting themselves with their discipline, not just those who chase too many pitches.

I find it interesting that Carpenter, Prado, and Fuld are all guys who make a ton of contact but don't hit for much power, and thus don't especially benefit from ensuring the highest quality contact. This is especially true in contrast to other low Swing% guys with a lot of power like Jayson Werth (3rd lowest Swing%), Mike Trout (9th), and Adam Dunn (21st).

Perhaps with guys like Carpenter, the value gap between a ball taken and a ball in play is smaller, meaning they get comparatively greater ROI from discipline in terms of walk-value. And because these guys are presumably good bat control guys who can get a ball in play with two strikes if need be, they aren't afraid of falling behind in the count. So perhaps the per-pitch model unfairly penalizes this type of player who is actually maximizing the value of his skill set at the plate-appearance level.