
Believe it or not, most of our writers didn't enter the world sporting an @baseballprospectus.com address; with a few exceptions, they started out somewhere else. In an effort to up your reading pleasure while tipping our caps to some of the most illuminating work being done elsewhere on the internet, we'll be yielding the stage once a week to the best and brightest baseball writers, researchers and thinkers from outside of the BP umbrella. If you'd like to nominate a guest contributor (including yourself), please drop us a line.

Matt Lentzner has carved out a (very) small niche in the baseball analysis world by examining the intersection of physics and biomechanics. He has presented at the PITCHf/x conference in each of the last two years and has written articles for The Hardball Times, as well as a previous article for Baseball Prospectus. When he’s not writing, Matt works on his physics-based baseball simulator, which is so awesome and all-encompassing that it will likely never actually be finished, though it does provide the inspiration for most of his articles and presentations. In real life, he’s an IT Director at a small financial consulting company in the Silicon Valley and also runs a physical training gym in his backyard on the weekends.

 

Challenge the strike zone with all types of pitches all the time.—Dr. Mike Marshall, Pitching Coach

The single most important thing for a hitter is to get a good pitch to hit.—Rogers Hornsby (as told by Ted Williams)

It ain't nuthin' until I call it.—Bill Klem, HOF Umpire

I realize it’s a little excessive to start an article off with three quotes, but I have my reasons for engaging in such behavior. What I am trying to illustrate is that there are three actors involved in any pitch: the pitcher himself, the batter, and lest we forget, the umpire—the arbiter of strikes and balls.

Of specific interest is the quote from Dr. Marshall. Notice that he doesn’t say to throw strikes; he says to challenge the strike zone. That is, the pitched ball should be close enough that the batter cannot immediately dismiss the pitch as a ball.

On the other hand, Rogers Hornsby makes the case that the batter should be waiting for the pitcher to make a mistake or be forced to throw a hittable pitch. The pitcher is trying to avoid this while still throwing pitches that are close to the zone, but not too close to the heart of the plate. Klem reminds us that there is a human involved in calling these pitches who can’t be dismissed either.

Plate discipline measurements have up until now only considered the binary condition of the pitch—either it was in the rulebook strike zone or it was not. While that is the simplest and most obvious way to classify pitches, it ignores the reality of how baseball is played. 

 

 

The main factor that makes the binary approach too simple is the umpire’s limited perception. There is a zone near the center of the plate that is almost always called a strike. There is a zone far from the plate where pitches are almost always called balls. But there is also a zone where pitches are sometimes called balls and sometimes called strikes. The delineation is not as sharp as the binary model assumes. We have a gradient. 

 

This graphic is based on a study originally done by the Greta Garbo of Sabermetrics, John Walsh. The original study is here. It’s an oldie but a goodie. If you haven’t read it already, I encourage you to do so. (And John, if you read this, just know that we all miss you.)

In the study, John looked at the middle section of the strike zone and determined the chances of the pitch being called a ball or strike. Notice the trapezoidal shape of the graphs. We have a relatively flat zone that corresponds to the interior of the strike zone where nearly all pitches are guaranteed to be called strikes. The line slopes down to an effectively zero chance of a called strike. However, in between those two areas is a gray area.

How often do pitchers throw a pitch in the gray area? More often than you’d think—about 50 percent of the time.

So, instead of strikes and balls, we have three conditions in a ternary model:

  • True Strikes: Pitches that are >95 percent called as strikes
  • True Balls: Pitches that are >95 percent called as balls
  • Borderline: The rest
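As a minimal sketch of the three-way split, assuming you already have an estimated called-strike rate for a pitch location (the function name and the 0-to-1 input are illustrative and do not come from the original study):

```python
def ternary_label(called_strike_rate: float) -> str:
    """Label a pitch location from its called-strike rate (0.0 to 1.0)."""
    if not 0.0 <= called_strike_rate <= 1.0:
        raise ValueError("called_strike_rate must be between 0 and 1")
    # Called a strike more than 95 percent of the time.
    if called_strike_rate > 0.95:
        return "True Strike"
    # Called a ball more than 95 percent of the time,
    # i.e. called a strike less than 5 percent of the time.
    if called_strike_rate < 0.05:
        return "True Ball"
    # Everything in between is the gray area.
    return "Borderline"


if __name__ == "__main__":
    for rate in (0.99, 0.50, 0.01):
        print(rate, ternary_label(rate))
```

A rate of exactly 95 percent falls in Borderline here, because the definitions above use a strict "greater than."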

 

This amazing graphic courtesy of David Allen, called a heat map, shows the per pitch value of pitches thrown to different locations. Red is best for the pitcher and blue is best for the batter. Dr. Marshall was right. You can see how pitches too far from the strike zone are a boon for the batter. A pitcher must challenge the strike zone.

Notice the donut shape of the pitcher’s best zone. Close, but not too close is the best result. Notice how the red zones creep past the nominal strike zone. Note that the strike zone pictured is based on how umpires actually call games. The rulebook zone is considerably narrower and taller.

Here’s a better picture of what’s really going on regarding a batter’s approach to hitting.

 

I owe a huge thanks to David Allen, who not only shared his excellent graphics with me but also determined the exact dimensions of the ternary zones. Note that the horizontal strike zone is not the same for right-handed and left-handed batters. This is an effect of umpiring.

“sz_bot” and “sz_top” refer to the vertical strike zone boundaries set by the PITCHf/x operator based on the height and stance of the batter.

Horizontal Boundaries for RHBs

  • -1.38ft = inside boundary of true ball
  • -0.58ft = inside boundary of true strike
  • 0.54ft = outside boundary of true strike
  • 1.3ft = outside boundary of true ball

Horizontal Boundaries for LHBs

  • -1.50ft = outside boundary of true ball
  • -0.82ft = outside boundary of true strike
  • 0.38ft = inside boundary of true strike
  • 1.14ft = inside boundary of true ball

Vertical Boundaries

  • 0.22ft below sz_bot = bottom boundary of true ball
  • 0.48ft above sz_bot = bottom boundary of true strike
  • 0.42ft below sz_top = top boundary of true strike
  • 0.46ft above sz_top = top boundary of true ball
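For anyone who wants to apply these boundaries to pitch data, here is a rough sketch. The numbers are the ones listed above; everything else is an assumption on my part: the function and argument names, the idea that the horizontal values use the same sign convention as the PITCHf/x "px" field, and the rule for combining the horizontal and vertical checks (the boundaries above do not say how corners are handled).

```python
# Horizontal limits in feet, ordered left to right:
# (true-ball edge, true-strike edge, true-strike edge, true-ball edge).
# Assumed to share the sign convention of the PITCHf/x "px" field.
HORIZONTAL = {
    "R": (-1.38, -0.58, 0.54, 1.30),  # right-handed batters
    "L": (-1.50, -0.82, 0.38, 1.14),  # left-handed batters
}


def classify_pitch(px, pz, sz_bot, sz_top, stand):
    """Return 'True Strike', 'Borderline', or 'True Ball' for one pitch."""
    ball_lo, strike_lo, strike_hi, ball_hi = HORIZONTAL[stand]

    # Vertical limits are offsets from the operator-set sz_bot and sz_top.
    v_ball_lo, v_strike_lo = sz_bot - 0.22, sz_bot + 0.48
    v_strike_hi, v_ball_hi = sz_top - 0.42, sz_top + 0.46

    # Assumption: past a true-ball boundary on either axis is a True Ball.
    if px < ball_lo or px > ball_hi or pz < v_ball_lo or pz > v_ball_hi:
        return "True Ball"
    # Assumption: a True Strike is inside the true-strike limits on both axes.
    if strike_lo <= px <= strike_hi and v_strike_lo <= pz <= v_strike_hi:
        return "True Strike"
    return "Borderline"


if __name__ == "__main__":
    # Hypothetical batter with sz_bot = 1.5 ft and sz_top = 3.5 ft.
    print(classify_pitch(0.0, 2.5, 1.5, 3.5, "R"))   # middle of the zone
    print(classify_pitch(0.8, 2.5, 1.5, 3.5, "R"))   # just off the outside edge
    print(classify_pitch(2.0, 2.5, 1.5, 3.5, "R"))   # well outside
```

Note that the same pitch can land in different buckets for right-handed and left-handed batters, since the horizontal limits differ.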

If this is making your eyes glaze over, here are the CliffsNotes:

How do the pitch types break down? Let’s take a look.

 

This actually surprised me. Pitchers work the Borderline about 50 percent of the time, and only 20 percent of the pitches they throw would be considered “mistake” or “get-me-over” pitches. The fact that almost one in three pitches are True Balls is also surprising to me. The fingerprints of being a pitcher as opposed to simply a thrower are all over this pie chart.

As we will see, most batters, especially power hitters, tend to murder True Strikes. Not surprisingly, batters across the board do poorly against True Balls when they swing at them—which is not very often. Borderline pitches are a pitcher’s best friend. In many counts, the batter is compelled to swing at them with predictably mediocre results. The alternative is risking a strikeout, so it’s a sort of a damned-if-you-do, damned-if-you-don’t situation. Of course, the pitcher is loving it.

Here’s the average performance of batters against the differing pitch types.

 

Measure      Overall   True Strikes   Borderline   True Balls
Swing %      45%       70%            50%          20%
Contact %    83%       90%            83%          63%
BAcon        .327      .357           .314         .292
SLGcon       .522      .610           .483         .417

“Swing %” is simply the percentage of time that batters swing at this pitch type. “Contact %” is the chance that the batter will make some kind of contact with the pitch—not necessarily put it in play. “BAcon” is the batting average on the pitches contacted and put in play. “SLGcon” is the “slugging percentage” of pitches contacted and put in play.

BAcon and SLGcon are probably unfamiliar measurements. Both will look very high because we are removing the strikeouts from the equation. BAcon is even higher than BABIP (Batting average on balls in play), since BAcon includes home runs while BABIP does not. The “Overall” column is there so you can get your bearings on what is an average value for those measures.  
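To make the four definitions concrete, here is one way they could be computed from pitch-level records. This is a simplified sketch, not the code behind the table: the Pitch fields are hypothetical, and I am assuming every ball put in play counts in the BAcon and SLGcon denominators, which ignores wrinkles such as sacrifices and errors.

```python
from dataclasses import dataclass


@dataclass
class Pitch:
    swung: bool     # batter offered at the pitch
    contact: bool   # bat touched the ball (fouls count)
    in_play: bool   # ball was put in play (home runs count)
    bases: int = 0  # total bases on the play: 0 for an out, 1-4 for a hit


def _rate(numerator, denominator):
    # Avoid dividing by zero when a bucket is empty.
    return numerator / denominator if denominator else 0.0


def zone_measures(pitches):
    """Swing %, Contact %, BAcon and SLGcon for one bucket of pitches."""
    swings = [p for p in pitches if p.swung]
    contacted = [p for p in swings if p.contact]
    in_play = [p for p in contacted if p.in_play]
    hits = sum(1 for p in in_play if p.bases > 0)
    total_bases = sum(p.bases for p in in_play)
    return {
        "swing_pct": _rate(len(swings), len(pitches)),
        "contact_pct": _rate(len(contacted), len(swings)),
        "bacon": _rate(hits, len(in_play)),
        "slgcon": _rate(total_bases, len(in_play)),
    }


if __name__ == "__main__":
    sample = [
        Pitch(True, True, True, 4),     # home run
        Pitch(True, True, True, 0),     # ball in play, out
        Pitch(True, True, False),       # foul ball
        Pitch(True, False, False),      # swing and miss
        Pitch(False, False, False),     # taken pitch
    ]
    print(zone_measures(sample))
```

As a rough consistency check on the table, weighting the three swing rates by the approximate pitch mix of 20/50/30 (True Strikes, Borderline, True Balls) gives 0.20 × 70 + 0.50 × 50 + 0.30 × 20 = 45 percent, which matches the Overall column.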

I hope by now you agree that the binary (strikes and balls) model doesn’t work that well. Within the strike zone we have good and bad pitches. Outside of the strike zone we have good and bad pitches. Setting up your buckets that way significantly muddies the water and dilutes any conclusions you can make. By breaking the pitches up into three groups, you can characterize them much better and get better answers.

Let’s move on. There is a ton of analysis possible with this approach—several articles’ worth, at least. Let’s look at a little of that just to get an idea of ways to look at hitters and how they differ.

Here’s a scatter plot of swing% against True Strikes (x-axis) vs. take% (opposite of swing%) against True Balls (y-axis) for all batters with a minimum of 1500 PAs over 2008, 2009, 2010, and the first half of 2011. 

 

The first thing you can see here is that swing% on True Strikes and take% on True Balls are not very closely correlated. You can sort of see a weak linear effect going from the upper left quadrant to the lower right quadrant, maybe.

One thing to keep in mind is that batters who reside in the lower left quadrant have the worst plate discipline. They swing at the most True Balls and watch the most True Strikes go by, hardly a recipe for success. I would expect a strong selection bias that would remove most of these players from the major-league hitting pool. I doodled a little on that same scatter plot to show this.

 

The upper left quadrant contains the players who just take a lot of pitches. They are biased toward not swinging at True Balls, so they miss quite a few True Strikes in the process. The lower right quadrant batters are those guys who like to swing the bat. They are biased toward not missing out on hitting the snot out of a True Strike, so they end up chasing more True Balls. The upper right quadrant hitters are the elite. They aggressively attack True Strikes yet are not often fooled by True Balls.
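The quadrant descriptions can be sketched as a small function. The dividing lines are my assumption, since the article does not state where the quadrants split: the defaults below borrow the average rates from the earlier table (70 percent swing at True Strikes, and 80 percent take on True Balls, since batters swing at 20 percent of them). The function name and labels are illustrative.

```python
def discipline_quadrant(true_strike_swing_pct, true_ball_take_pct,
                        swing_split=0.70, take_split=0.80):
    """Place a batter in one of the four quadrants of the scatter plot.

    swing_split and take_split are hypothetical dividing lines.
    """
    attacks = true_strike_swing_pct >= swing_split  # right half of the plot
    lays_off = true_ball_take_pct >= take_split     # upper half of the plot
    if attacks and lays_off:
        return "upper right: elite"
    if lays_off:
        return "upper left: takes a lot of pitches"
    if attacks:
        return "lower right: likes to swing"
    return "lower left: worst plate discipline"


if __name__ == "__main__":
    # Made-up rates, one per quadrant.
    for swing, take in [(0.80, 0.90), (0.60, 0.90), (0.80, 0.70), (0.60, 0.70)]:
        print(swing, take, discipline_quadrant(swing, take))
```

The same function could be pointed at pitchers by reading the labels in reverse, since what is good for a batter is bad for a pitcher.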

Let’s take a look at some of the outliers to get an idea of who these guys are.

 

The identities of a lot of these players are no surprise. Vladimir Guerrero is one of the most aggressive hitters in baseball? Stop the presses! But I was surprised by Chris Iannetta. I’m more of an American League guy myself, so I didn’t know much about him, but if we’re talking elite plate discipline, then he’s the guy. Brett Gardner has a freakish ability to take pitches. Garrett Anderson leads in the hopelessly confused category.

Regarding Garrett, it’s too bad we don’t have PITCHf/x data going back far enough to see what he did in his prime. It is certainly possible that his loss of plate discipline led to his decline as a hitter.  It’s also possible that he was a unique player who could be good with substandard plate discipline.

It would be interesting to track hitters to see if a loss of plate discipline predicts a future decline in hitting ability. Likewise, positive changes could be signs of improving skill and future productivity.

My last comment would be that plate discipline is not everything. I included Jose Bautista and Albert Pujols to give some perspective on that. Both have good plate discipline, but not elite. Iannetta is a good hitter (especially for a catcher), but he is not an elite hitter. We have good and bad hitters who are aggressive and good and bad hitters who are passive. I would say that you can be successful by attacking True Strikes or passing on True Balls, but you won’t be successful doing neither.

Let’s see how that graph looks with pitchers.

 

I kept the scale the same as the batters graph so you could see right away that there is less variability among pitchers than hitters. There also seems to be even less correlation between the two factors. Also keep in mind that the best pitchers are in the lower left quadrant now. What’s good for a batter is bad for a pitcher, and vice versa.

Here’s what that looks like with my notes added.

 

So, whom are we looking at?

 

What’s interesting about this graph is that although the variability is lower, the relationship to pitcher quality is much stronger. The elite “anti-discipline” pitchers are also among the elite pitchers overall. The pitchers on the other end of the spectrum are universally terrible.

Carlos Marmol is an interesting and extreme case. We should call him “The Paralyzer.” Nobody induces looking strikes like he does—it’s not even close. It must be some combination of ridiculous movement and terrible control; batters are just hoping they get walked or end up in a fastball count.

Another way I looked at the data is by profiling. You tend to see similar types of hitters and pitchers appearing. For example, with hitters we have the contact-hitting type compared with the power hitters.

Contact Hitters

Luis Castillo
  • True Strike SLGcon: .374
  • Borderline SLGcon: .306
  • True Ball SLGcon: .314

Alex Cora
  • True Strike SLGcon: .300
  • Borderline SLGcon: .338
  • True Ball SLGcon: .298

Power Hitters

Adam Dunn
  • True Strike SLGcon: .974
  • Borderline SLGcon: .616
  • True Ball SLGcon: .547

Carlos Pena
  • True Strike SLGcon: .874
  • Borderline SLGcon: .593
  • True Ball SLGcon: .382

I find this fascinating. For these two contact hitters, their results on balls in play are largely unchanged regardless of where the ball was pitched. Note that this doesn’t mean that it doesn’t matter at all where the ball was pitched. Castillo, for example, has much higher contact rates versus true strikes as opposed to true balls, even though the results of hit balls are not much different.

These two power hitters are exactly the opposite. They destroy True Strikes, and the drop-off as pitch quality degrades is pretty severe. Carlos Pena does not hit True Balls much better than the slap hitters do. Much worse, actually, since he is a lot more likely to miss them completely.

So we have two interesting profiles here. Contact hitters get their value from putting the ball in play and not striking out.  Power hitters get most of their value from hitting the crap out of “mistake” pitches. I don’t pretend that this is earthshaking news to anyone, but now we can see it in the numbers. It’s been quantified.
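One simple way to put a number on the two profiles is the drop in SLGcon from True Strikes to True Balls. The sketch below uses the four hitters' figures quoted above; the dictionary layout and the function name are just illustrative choices, and the drop-off itself is my shorthand rather than an established measure.

```python
# (True Strike, Borderline, True Ball) SLGcon for each hitter, as quoted above.
HITTERS = {
    "Luis Castillo": (0.374, 0.306, 0.314),
    "Alex Cora": (0.300, 0.338, 0.298),
    "Adam Dunn": (0.974, 0.616, 0.547),
    "Carlos Pena": (0.874, 0.593, 0.382),
}


def slgcon_dropoff(true_strike, true_ball):
    """SLGcon lost when moving from True Strikes to True Balls."""
    return round(true_strike - true_ball, 3)


if __name__ == "__main__":
    for name, (true_strike, _borderline, true_ball) in HITTERS.items():
        print(f"{name}: {slgcon_dropoff(true_strike, true_ball):.3f}")
```

Castillo and Cora lose .060 and .002 points of SLGcon, while Dunn and Pena lose .427 and .492, which is the contact-versus-power split in a single column.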

Here are some pitcher comparisons that are interesting.

Control Pitchers

Mark Buehrle
  • True Strike SLGcon: .600
  • Borderline SLGcon: .438
  • True Ball SLGcon: .368

James Shields
  • True Strike SLGcon: .652
  • Borderline SLGcon: .519
  • True Ball SLGcon: .329

Dominance Pitchers

Roy Halladay
  • True Strike SLGcon: .519
  • Borderline SLGcon: .434
  • True Ball SLGcon: .411

Felix Hernandez
  • True Strike SLGcon: .492
  • Borderline SLGcon: .414
  • True Ball SLGcon: .440

This somewhat parallels the hitter profiles in almost an opposite way. The control pitchers have to be careful where they pitch the ball. True Strikes will get hammered. They get their value by not throwing True Strike mistake pitches and getting batters to chase True Balls.

The dominance pitchers have a much more “you can’t touch this” approach. Even on True Strikes, batters don’t do particularly well. They can attack the strike zone aggressively. The control artists need more guile and finesse. Also not a shocking discovery, but there it is, measured and quantified.

With that, I will wrap this article up. But as I mentioned earlier, we have just scratched the surface. How does this approach pan out by counts? How does it change over a career? Can we use this analysis to predict future success with prospects based on minor-league numbers or rookie year performances? I’m sure you can think of a few more avenues of exploration I didn’t mention.

Fear not. I intend to explore this area more fully in future articles, and I hope I’ve inspired others to do the same. 

davidpom50
9/30
Fascinating article. Great stuff, Matt!
Richie
9/30
This is very good stuff. Thank you!
davidpom50
9/30
I would love to get access to a spreadsheet of the swing rates and such of all the hitters involved.
lentzner
9/30
Dave, I'll check with BP and see if there's some way they could host this.
ScottBehson
9/30
This was a truly awesome article. Put it next to Mike Fast's in the "Best of 2011 collection"
bornyank1
9/30
Not to mention Matt's first article for us, which you should go back and check out if you haven't already. It's linked in his bio above.
crperry13
9/30
Great research, and outstanding presentation. Not a knock on anybody else, but there's a lot to be learned here about how to clearly present a complex idea to non-experts, and I hope others take note.

I have a couple thoughts:

First, it would be really fun to see graphics on TV showing "Mistake Pitch", "Sucker Pitch", and "Pitcher's Pitch", regardless of the pitch outcome. Big splash graphics reading SUCKER PITCH flashing with video game intensity. Awesome...

Second, I think the case of Chris Iannetta and the presentation of the hitter plate discipline graph above is interesting. The distinction between this graph and what actually makes a good hitter is important, and I don't think the graph actually shows true plate discipline.

Iannetta has a 75% contact rate, which isn't very good, and his contact rate on pitches "in the zone" (by the old definition) is a low 81%. In addition, he oddly only swings at pitches in the "zone" 70% of the time (compared to Vlad's 80.5%, for fun). His ability to recognize True Strikes plus what I just mentioned tells me that Iannetta is just awful at dealing with or recognizing Borderline Pitches (Borderline pitches aren't shown by the chart, which is why it could be misleading when valuing a player's plate discipline).

So, Iannetta has outstanding plate discipline when the pitch is either A) right down the middle, or B) way outside, but struggles to do anything with borderline stuff, which is probably what he's thrown most of the time. All this explains why scouts loved him enough to give him top prospect status, and also why he hasn't lived up to their expectations in six seasons in the majors.

Very interesting stuff, and like you said, I think there is a lot you can do with this information.
yadenr
9/30
Great read. I especially like how it all seemed very intuitive, but there were some real surprises like the Hitter Discipline Pickouts.
rrvwmr
9/30
Great article. I'd be interested to see if Carlos Pena's year to year changes in productivity have been correlated to these measures. My gut tells me that the hitters in the upper right quadrant have amazing pitch recognition ability, but are probably taking the wrong approach at the plate. Where can I get the True Strike Swing % and True Ball Take % by player?
ofMontreal
10/02
I agree that this is outstanding stuff here. And like CRP, you've grabbed the other bizarre result in Pena. I watch Pena all the time and he swings and misses at pitches right down the middle an enormous amount of the time. To the point where it's shocking to me when he hits a HR. On the other hand, he has great skill at fouling off Borderline pitches. Which is why he ends up where he is on the chart. That and the ability to just not swing half the time. Anyway, my point being that somewhere up there on the chart needs to be a sign saying This Way Leads to Madness.
rrvwmr
10/03
Agreed, I'm most curious if the other upper right players fit this "madness" description as well. Obviously, not swinging at true balls is a good thing. The fact that something is a true strike though doesn't mean you should swing at it. I think that is where these players are erring. They aren't being selective enough on balls in the zone.
sam19041
9/30
Wow. This is literally the most exciting baseball article I've read in many years. My hat is off to you. Just wow.

Selfishly I would like to see player by player data. Any chance you could make it available?
bravejason
9/30
Outstanding article. It is definitely a nominee for best article of the year.

I would like to echo an earlier poster's comment about the presentation. What makes this article "work" (aside from the fact that it is new, interesting, and original research) is that the information is presented in a way that is readily understandable. There is no amount of words or numbers that could have told the story better than the graphics.

One critique: the true strike % vs true ball % graphs need vertical and horizontal axis labels. On the very first of those graphs it is not possible to tell which axis is for balls and which one is for strikes.
lentzner
10/01
Thanks for all the nice comments guys.
sam19041
10/02
So Matt, what do you say? Can you share the data? It would be fascinating to see, for example, if Adam Dunn's approach changed in 2011 -- or if David Wright's decline correlated with a change in approach (and stadium).
tnt9357
10/01
This article could be like the Voros McCracken one that became a touchstone of research for the following five years. Very nicely done.

The chart I'd love to see (and I also commend the presentation) is a pie chart of just first pitches.

Also, I guess we have to now call them the Two And A Half True Outcomes? :-)
revelaire
10/02
And speaking of Voros, in addition to BAcon (or, AVG on contact, depending on your view with respect to whether it would be good to have a statistic pronounced like the cured breakfast meat), I'd really like to see BABIP for each of the three areas. I suspect this might be yet another way where a pitcher (at least one with decent command/control) can exert influence over BABIP.
crperry13
10/02
Bacon makes everything better.
pmatthews
10/01
Truly outstanding stuff. One of the best articles I've read all year.
smallflowers
10/02
Give this guy a job!
JeffZimmerman
10/05
Matt - One complaint. A player's and umpire's zone changes for each count. Is there a way to run the data for just 0-0 counts? All players would at least get one sample per PA. The pitches down the middle that are taken on a 3-0 count would be removed. Hacks at 0-2 counts also. I think count could make a huge difference with some of the data. Even if count doesn't change the outcome, at least it would be ruled out as a factor.