December 6, 2013
Explaining Mistake Splits
Most of our writers didn't enter the world sporting an @baseballprospectus.com address; with a few exceptions, they started out somewhere else. In an effort to up your reading pleasure while tipping our caps to some of the most illuminating work being done elsewhere on the internet, we'll be yielding the stage once a week to the best and brightest baseball writers, researchers, and thinkers from outside of the BP umbrella. If you'd like to nominate a guest contributor (including yourself), please drop us a line.
Evan Petty is a 22-year-old lifelong student of the game who’s studying Magazine Journalism and Applied Statistics at Syracuse University. Raised on the North Shore of Massachusetts, Evan remains an ardent J.D. Drew defender.
As a player, Evan was a catcher who was more Molina than Mauer. He moved to Singapore in high school and became the first Massachusetts native to win a baseball championship on two continents and be named to an All-Southeast Asia Slow-Pitch Softball Team in the same season.
At some point in a given baseball game, there’s a good chance that the broadcaster will mention a “mistake pitch.” A two-seamer might run over the heart of the plate, or an errant fastball might miss up in the zone. Perhaps a slider doesn’t bite and ends up in the upper deck.
“The pitcher made a mistake and he didn’t miss it!”
The fear of making a mistake is what makes pitching one of the more stressful jobs in sports. It’s what makes pitching inside so terrifying. And it’s what keeps fans on the edge of their seats even when the opposing pitcher appears unhittable.
Every pitcher makes mistakes. Some stress making as few as possible and others aren’t as fastidious. Everyone knows mistakes happen, but nobody knows when. There will always be uncertainty about the quality of every pitch, a value I’m surprised analytics ignores. Mistake analysis aims to fuse the two.
In May of 2013, a friend and I watched the Red Sox and chatted about Will Middlebrooks. For a player who seemingly had so much talent, his production seemed to come and go. We rationalized that Middlebrooks is a streaky hitter—not exactly Earth-shattering. But the more I thought about it, the more I realized that most of Middlebrooks’ production came on similar types of pitches. He’s not a hitter who turns a tough pitch around. But pitchers know not to leave one out over the middle to him.
And therein lies the question: Do Middlebrooks’ ups and downs have more to do with what he does at the plate or the pitches he gets? And are some hitters more dependent on the pitch they get than others?
I started watching the game with these questions on my mind, looking for differences in styles. It became evident that considering the specific pitch when examining a batter’s production is important. People are quick to attribute results to what the batter does, but perhaps the pitches he sees have something to do with these results.
I began developing a split that works just like any other split in baseball by dividing a player’s total production into different categorical variables. Instead of analyzing how much of a player’s production comes at home versus on the road, or during the day versus night, mistake splits divide a player’s total production into whether it came against a mistake pitch or not. I started watching tape, taking data and spitting out thoughts in hopes of finding anything interesting. Here’s where I try to explain some of my thinking, discoveries, and further questions derived from my analysis of mistakes.
How to Record Mistake Splits
The two italicized words are key in defining a mistake pitch. The intended location potentially differentiates two identical pitches. Grooving a 3-0 fastball down the middle isn’t a mistake if it’s what the pitcher meant to do. Missing there 0-2 almost certainly is. The definition refers to location only, which rules out scouting or decision-making. Practically, some consider a pitch a mistake to Miguel Cabrera but not to Dustin Ackley. This notion stems from Cabrera’s ability to consistently hit certain non-mistakes that Ackley doesn’t.
“Clearly” refers to the grader’s binary opinion and not the degree of mistake. The assessment is only a yes or no and doesn’t take the degree of mistake into account; “clearly” doesn’t mean the mistake must be extreme. It’s not the graders’ responsibility to differentiate mistake from mistake; it’s to distinguish mistake from non-mistake.
It’s important to note that a pitch can also miss badly without necessarily being a mistake. If a pitcher misses enough on the inside corner with a pitch that he wanted outside, it may not be a mistake. Identifying a mistake involves knowledge of the situation, of the definition of a mistake, and of the game of baseball.
Yes, mistake splits are subjective. They aren’t the first subjective figure in sports and they won’t be the last. What matters is that subjective statistics can still be precise.
Accuracy vs. precision is probably best explained with the diagram below.
Accuracy refers to closeness about a fixed or universal point. Precision refers to closeness about the other points. This applies to data, too.
The working definition of a mistake isn’t completely accurate, and different graders surely conflict in spots. Still, it’s already very accurate and will continue to improve: in a couple of comparisons, graders assessed pitches with about 90 percent overlap. But unless there’s perfect science, perfect accuracy can’t exist. Mistake data are precise, which helps validate them to a large extent. So long as the criteria remain the same through an entire data set, the set will be precise. The key is consistency.
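The "about 90 percent overlap" between graders can be checked with a simple agreement rate. Here is a minimal sketch, assuming each grader's calls are stored as a list of yes/no values in the same pitch order; the function name and the ten sample grades are made up for illustration and are not the data from those comparisons.

```python
def percent_agreement(grades_a, grades_b):
    """Share of pitches on which two graders gave the same yes/no grade."""
    if len(grades_a) != len(grades_b):
        raise ValueError("both graders must grade the same pitches")
    if not grades_a:
        raise ValueError("need at least one graded pitch")
    matches = sum(a == b for a, b in zip(grades_a, grades_b))
    return matches / len(grades_a)


# Hypothetical grades for ten at-bat-ending pitches (True = mistake).
grader_1 = [True, False, False, True, False, False, True, False, False, False]
grader_2 = [True, False, False, True, False, True, True, False, False, False]

# The graders differ on one pitch out of ten.
agreement = percent_agreement(grader_1, grader_2)
```

A raw agreement rate like this says nothing about which grader is right. It only measures how consistently the same criteria are being applied, which is the point about precision.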
Consider an umpire calling balls and strikes. Guidelines determine strikes versus balls, similar to defining mistakes versus non-mistakes. All umpires have slightly different strike zones, even though the separation shrinks every year. As long as pitches are graded consistently, there’s validity. And there’s no reason why grading accuracy won’t improve with repetition, like any other subjective data.
Walks are the most obvious omission from mistake analysis. Mistake splits include only at-bats that end with a swing—strikeouts or balls in play. Walks count only when the bases are loaded—when they’re considered mistakes. Batters hit by pitch and wild pitches follow the same guidelines.
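The inclusion rules above can be written down as a small filter. This is only a sketch of the stated guidelines: the record layout, the outcome labels, and the function name are assumptions for illustration, not part of any actual scoring tool.

```python
from dataclasses import dataclass


@dataclass
class PlateAppearance:
    outcome: str          # hypothetical labels: "single", "strikeout", "walk", ...
    bases_loaded: bool
    graded_mistake: bool  # the grader's yes/no call on the final pitch


FREE_PASSES = {"walk", "hit_by_pitch"}


def classify(pa):
    """Return "mistake", "non-mistake", or None when the PA is left out."""
    if pa.outcome in FREE_PASSES:
        # Walks and hit batters count only with the bases loaded,
        # and in that case they are always treated as mistakes.
        return "mistake" if pa.bases_loaded else None
    # Strikeouts and balls in play are split by the grader's call.
    return "mistake" if pa.graded_mistake else "non-mistake"


ordinary_walk = classify(PlateAppearance("walk", False, False))
loaded_walk = classify(PlateAppearance("walk", True, False))
strikeout = classify(PlateAppearance("strikeout", False, False))
```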
Other than walks and hit batters, mistake splits ignore baserunning, which mostly affects pitchers’ splits. Mistake splits don’t reflect a third of an inning pitched when a runner is caught stealing. This explains why some pitchers’ splits don’t reflect their exact innings pitched total. Pitchers also fail to receive credit for an out when a batter reaches on an error. The batter gets credited with an 0-1 in such a situation, but the pitcher doesn’t, which means that mistake splits potentially inflate a pitcher’s BAA slightly.
Other oddities include inherited runners. The pitcher who allows an inherited runner to reach base gets hit with the earned run, as per usual. If an inherited runner reaches base on a mistake and scores on a non-mistake, the run is a mistake earned run charged to the pitcher who put him on.
With the exception of Earned Production (which is defined in the glossary below), stats recorded are common baseball metrics. For a batter, hits, outs, and strikeouts are split into either mistake or non-mistake. Splits show the percentage of a batter’s production that comes against a mistake versus a non-mistake as well as his production rate against each. Below is partial data for Matt Holliday of the St. Louis Cardinals through 231 at-bats of the 2013 MLB season.
His total numbers are split based on whether they came off a mistake or a non-mistake, and the split also shows the percentage of his home runs, RBIs, and strikeouts that coincide with a mistake or non-mistake pitch. The “rate” numbers at the far right indicate how many at-bats Holliday averages between each hit, home run, RBI and strikeout. For instance, Holliday averages a home run every 11 at-bats that end on a mistake pitch. He averages a homer for every 35.2 at-bats that end on a non-mistake.
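The percentage and rate columns are simple arithmetic on the split counts. In the sketch below, the helper names are hypothetical, and the at-bat and home run counts are illustrative numbers chosen only because they reproduce the two home run rates quoted above; they are not read from Holliday's table.

```python
def at_bats_per_event(at_bats, events):
    """Average at-bats between events; None when the event never happened."""
    return at_bats / events if events else None


def share(part, total):
    """Fraction of the total that falls in one split."""
    return part / total if total else 0.0


# Illustrative counts only (assumed, not taken from the actual table).
mistake = {"ab": 55, "hr": 5}
non_mistake = {"ab": 176, "hr": 5}

mistake_hr_rate = at_bats_per_event(mistake["ab"], mistake["hr"])
non_mistake_hr_rate = at_bats_per_event(non_mistake["ab"], non_mistake["hr"])

total_hr = mistake["hr"] + non_mistake["hr"]
mistake_hr_share = share(mistake["hr"], total_hr)
```

With these counts, the rates come out to one home run per 11 at-bats on mistakes and one per 35.2 on non-mistakes. Half of the home runs come on less than a quarter of the at-bats.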
Pitchers’ numbers work the same way. Lance Lynn’s splits through 85 innings of the 2013 MLB year are below.
Each out Lynn records marks a third of an inning. The split identifies whether hits, earned runs, and home runs came off of mistakes or non-mistakes and shows how his production and production allowed break down. Strikeout rate and home run allowed rate indicate how many batters on average Lynn faces between each strikeout or homer allowed.
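Because each out counts as a third of an inning, a pitcher's innings in either split follow directly from the outs he records there. A minimal sketch, assuming a hypothetical helper name and made-up out totals (not Lynn's actual line):

```python
def innings_from_outs(outs):
    """Format a count of outs in box-score style, e.g. 7 outs -> "2.1"."""
    whole, remainder = divmod(outs, 3)
    return f"{whole}.{remainder}"


# Hypothetical out totals for one pitcher, split by the grade of the pitch.
outs_on_mistakes = 40
outs_on_non_mistakes = 215

mistake_innings = innings_from_outs(outs_on_mistakes)
non_mistake_innings = innings_from_outs(outs_on_non_mistakes)
total_innings = innings_from_outs(outs_on_mistakes + outs_on_non_mistakes)
```

Adding the outs before converting avoids the trap of summing box-score innings such as 13.1 and 71.2 as if they were decimals.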
Here is some more sample data, so you can get a feel for what exactly mistake splits measure. You can see each player’s effectiveness against both a mistake and non-mistake as well as how their total production is divided between mistake pitches and non-mistake pitches.
The column labeled “Quest” indicates how many grades were questionable. Below is an example of a mistake box score, which displays one full game. The column furthest to the right indicates the inning in which the questionable grade came. The top row for each player represents his mistake splits, and the bottom, his non-mistake splits.
What Mistake Splits Measure
Consider the classic power hitter who strikes out often and hits a lot of home runs. Against this type, pitchers work meticulously to avoid mistakes because these batters prioritize driving a pitch over just putting it in play. Often described as “all or nothing” hitters, mistake hitters crank pitches misplaced in their wheelhouse but prove to be pretty easy outs when the pitcher hits his spot due to limited plate coverage. Think of guys like Adam Dunn, Jay Bruce, and Mike Napoli.
Some batters aren’t as dangerous against mistakes but distribute production across a wider range of pitches. While Adam Dunn’s production might be heavily clustered among hittable pitches down the middle and up, Dustin Pedroia’s production comes off of a flatter distribution of pitches across the entire strike zone. His production isn’t as reliant on pitch location.
Mistake data still applies to pitchers but, at least initially, it’s more applicable to hitters. Pitchers typically either make more mistakes with stuff that’s harder to hit or seldom make mistakes with more hittable stuff. The latter would be a Bruce Chen, Bronson Arroyo type, while the former might describe a prototypical reliever who has nasty stuff but makes too many mistakes over the middle.
Distinguishing players’ styles is the first conceptual application of mistake splits. It also highlights the most important concept of this paper: that different batters depend on what the pitcher does to varying degrees.
One of the most frequent questions in baseball has to be why a player is over- or underachieving for a stretch of time. Why is Jose Iglesias hitting .440 since being called up? Ray Lankford had a 143 OPS+ last year, so why is it under 100 this year? The baseball world has identified a million explanations for why production strays from expectation. But the pitches themselves might be more overlooked than anything else.
The final section entertains where this trend of quantifying and rating each pitch might go. Mistake splits do a so-so job of explaining stretches of unusually high or low production through pitch quality, but they’re also very basic. For now, noting that a player’s awesome production comes with an abnormally high number of mistake pitches, or that another’s struggles coincide with a stretch of tough pitching, has to do. A batter’s fortune depends a lot more on what the pitcher does than we often credit. And mistake splits provide access to information that might be otherwise ignored. This application remains conceptual because of how underdeveloped it is (covered below in the “What it Means—The Future” section); the practical application has a lower ceiling but is more valuable at this time.
Mistake splits naturally fall into a five-part model that gives a manager practical, game-to-game application. The splits identify a hitter’s effectiveness against a mistake and a non-mistake and the pitcher’s effectiveness when he throws a mistake and a non-mistake, as well as mistake frequency. Those five variables are synthesized to project a result. The projection already makes mistake splits more valuable than the established numbers managers and other forecasters still use to assess matchups. Like any other platoon, inserting a mistake hitter into as many situations as possible against a pitcher who makes many mistakes gives him an opportunity to maximize efficiency. Of course, many other variables combine to forecast a matchup: handedness, career numbers, hot zones, ability to hit specific pitches. Mistake splits are just another tool.
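The article does not spell out how the five variables are synthesized, so the following is only one possible sketch. It assumes that the hitter's and pitcher's rates within each bucket can be blended with a plain average, and that the pitcher's mistake frequency is the weight between buckets. The function name, the averaging rule, and every number are assumptions for illustration.

```python
def project_average(hitter_vs_mistake, hitter_vs_non_mistake,
                    pitcher_on_mistake, pitcher_on_non_mistake,
                    mistake_frequency):
    """Blend five mistake-split inputs into one projected batting average."""
    if not 0.0 <= mistake_frequency <= 1.0:
        raise ValueError("mistake_frequency must be between 0 and 1")
    # Assumption: within each bucket, average the hitter's rate
    # with the rate the pitcher allows.
    on_mistake = (hitter_vs_mistake + pitcher_on_mistake) / 2
    on_non_mistake = (hitter_vs_non_mistake + pitcher_on_non_mistake) / 2
    # Weight the two buckets by how often the at-bat ends on a mistake.
    return (mistake_frequency * on_mistake
            + (1 - mistake_frequency) * on_non_mistake)


# Hypothetical matchup: a mistake hitter against a pitcher whose
# at-bats end on a mistake a quarter of the time.
projection = project_average(0.400, 0.200, 0.360, 0.220, 0.25)
```

A real projection would need a better-justified way to combine hitter and pitcher rates, plus the other matchup variables mentioned above.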
The same knowledge allows for projection based on a change in mistake rate. This applies to a postseason atmosphere where mistakes may be scarce. The notion that non-mistake hitters tend to thrive in the playoffs while mistake hitters suffer has yet to be confirmed or rejected. But it’s an interesting angle in an era very much dominated by “Moneyball” ideals. Nothing productive would come from opposing the spirit of Moneyball, but looking at another side of the game with the same type of methodology can be productive. That’s mistake analysis’ focus.
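To see why a drop in mistake rate would matter more to some hitters than others, consider a hitter's expected average as a weighted mix of his two splits. The sketch below uses two invented hitters and invented mistake rates. It illustrates the arithmetic behind the untested playoff notion and is not evidence for it.

```python
def expected_average(vs_mistake, vs_non_mistake, mistake_rate):
    """Expected batting average when a given share of at-bats end on mistakes."""
    return mistake_rate * vs_mistake + (1 - mistake_rate) * vs_non_mistake


# Hypothetical hitters: (average vs. mistakes, average vs. non-mistakes).
mistake_hitter = (0.450, 0.180)
balanced_hitter = (0.330, 0.250)

# Hypothetical mistake rates for the regular season and the postseason.
regular, postseason = 0.25, 0.15

mistake_hitter_drop = (expected_average(*mistake_hitter, regular)
                       - expected_average(*mistake_hitter, postseason))
balanced_hitter_drop = (expected_average(*balanced_hitter, regular)
                        - expected_average(*balanced_hitter, postseason))
```

The drop equals the change in mistake rate times the gap between a hitter's two splits, so the hitter with the wider gap loses more when mistakes get scarce.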
The pitch a batter hits heavily influences the play’s result. Another interest is figuring out the extent to which a batter controls the pitch he hits. The pitcher surely has a big hand in this, but does the batter? Why do some batters see more mistakes than others? While the pitcher controls the ball’s location, the batter determines which pitch he swings at. Both randomness and batting approach dictate the batter’s determination.
Although it’s unusual, batters experience extreme fluctuations in opportunity. Of course, extreme patterns get less common as the sample size grows. Many times, the batter does not have much, if any, control over the pitch that ends an at-bat, but sometimes he does, thanks to his batting approach.
An “approach” at the plate encompasses many aspects of hitting. The most prominent aspects are skills like patience, pitch recognition, and anticipation. But the best word might be discipline.
Improving mistake rate with patience works much as on-base percentage generates offensive production. Whenever a batter gets on base, he doesn’t make an out, and therefore extends the inning. The longer a lineup extends an inning, the more expected runs it scores. Similarly, patient batters see more mistake pitches. Free-swingers tend to end at-bats on non-mistake pitches more often than patient batters with knowledge of the strike zone.
Disciplined hitters don’t only hit more mistakes due to patience. A sense of what to swing at and when limits bad swings against a “pitcher’s pitch.” Pitchers make many mistakes that go unnoticed throughout an outing, because mistake splits record only pitches that end at-bats. A batter’s mistake frequency increases if he hits a higher rate of the mistakes available to him. The more refined a batter’s pitch recognition, the higher his mistake rate will be. Conversely, a batter’s mistake rate suffers when he chases pitches. Numbers like O-Swing Percentage and Z-Swing Percentage help quantify a batter’s discipline.
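O-Swing Percentage is the share of pitches outside the strike zone that a batter swings at, and Z-Swing Percentage is the same share for pitches inside the zone. A minimal sketch of both, assuming each pitch is recorded as a hypothetical (in_zone, swung) pair:

```python
def swing_rates(pitches):
    """Return (o_swing, z_swing) from an iterable of (in_zone, swung) pairs."""
    zone_pitches = zone_swings = 0
    outside_pitches = outside_swings = 0
    for in_zone, swung in pitches:
        if in_zone:
            zone_pitches += 1
            zone_swings += int(swung)
        else:
            outside_pitches += 1
            outside_swings += int(swung)
    # A rate is undefined (None) if the batter saw no pitches of that kind.
    o_swing = outside_swings / outside_pitches if outside_pitches else None
    z_swing = zone_swings / zone_pitches if zone_pitches else None
    return o_swing, z_swing


# Eight made-up pitches: four in the zone (three swings),
# four outside it (one swing).
sample = [(True, True), (True, False), (True, True), (True, True),
          (False, False), (False, True), (False, False), (False, False)]
o_swing, z_swing = swing_rates(sample)
```

A low O-Swing paired with a high Z-Swing is the profile of a hitter who lays off the pitcher's pitch and attacks hittable ones.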
If a batter’s approach dictates the pitches he sees, which dictates his results, the next question attempts to lengthen the chain. What dictates approach? And does it fluctuate? The answers piece together an explanation of how streaks manifest.
What it Means—Streakiness
An unproven but believable assumption is that confidence relates directly to recent production. In other words, confidence is higher when a player is hitting well, and lower when he isn’t. The relationship between production and confidence, explained by approach, helps outline the way in which a streak develops.
Production impacts confidence, which impacts approach, which impacts production. This hypothesis doesn’t disclose the degree of each variable’s impact on the others, but it outlines the structure. With it, we expect a hitter’s production to increase with confidence, and his confidence to increase with production. But does a player hit well when he’s confident or is he confident when he hits well? Even more likely is that confidence and production spiral together to build an element of streakiness.
Many smart people have spent a lot of valuable time scientifically testing whether streakiness exists. And both sides have compelling mathematical arguments. Streakiness is often explained as a stretch of time when a batter experiences unnaturally low or high levels of production. This is true, but under this definition, “streaks” also manifest due to variation alone. Just because a batter has eight hits in his last 17 at-bats doesn’t necessarily mean he’s hot. The true notion of streakiness suggests that something going on actually makes him a better hitter in those 17 at-bats. Being “hot” means a batter’s expected production is influenced by something that’s already happened or is happening. Whatever that may be spirals back and forth in a cycle with production to create a streak. The “cycle” in the example of production, confidence, and approach looks like the following:
The production cycle suggests, generally, that production leads to even more production while struggles lead to more struggles. This assumes that the player’s confidence is volatile to an extent.
Something has to kick-start this cycle, though. Something that initially affects production, which will then affect confidence, which will affect approach, which will cycle back to affect production. There are a ton of nuances of baseball that could suffice. Facing a familiar pitcher or having a couple weak grounders find a hole might get a hitter going. Getting a couple mistakes to hit might also do the trick.
Replace the initial “production” with “mistakes,” and an intriguing causal relationship applicable to mistake splits exists. Mistakes lead to production, then production seems to lead back to mistakes, at least to a degree. The quality of pitches a batter has to hit might well touch off this spiral. According to the relationship, a good pitch to hit leads to success, and success leads to more good pitches to hit because of production’s impact on confidence and confidence’s impact on approach. Hitting bad pitches leads to struggles, which lead to hitting more bad pitches due to poor approach.
The quality of pitches that a batter has to hit doesn’t completely dictate streakiness. But it does seem to potentially explain a small portion of what is an extremely complex phenomenon.
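The earlier point that eight hits in 17 at-bats can show up through variation alone is easy to check with a binomial calculation. The sketch below assumes a hypothetical hitter whose true average is .280 and treats every at-bat as independent. Both are simplifying assumptions, and the second is exactly what the streakiness debate questions.

```python
from math import comb


def prob_at_least(hits, at_bats, true_average):
    """Binomial probability of at least `hits` hits in `at_bats` at-bats."""
    return sum(
        comb(at_bats, k) * true_average**k * (1 - true_average)**(at_bats - k)
        for k in range(hits, at_bats + 1)
    )


# Chance that a steady .280 hitter gets 8 or more hits in a 17 at-bat window.
p_hot_stretch = prob_at_least(8, 17, 0.280)
```

Under these assumptions the probability is roughly 7 percent for any single 17 at-bat window. A full season contains many such windows, so a hitter whose ability never changes should still post several stretches like this.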
What it Means—The Future
Mistake splits quantify a pitch in the simplest form by categorizing it as either a mistake or a non-mistake. Baseball people have categorized pitches throughout history, starting with names. Fastball, slider, curveball and changeup are nominal data. One fastball is different from another fastball, but there’s no way to distinguish them by name alone. Velocity and movement are interval data, but “98 MPH” and “98 MPH” aren’t necessarily practically equivalent despite being physically the same. Not all 98 MPH fastballs are created equal, for a variety of reasons. The end game of mistake analysis looks to data that give free rein to distinguishing one pitch from another, where a “98” is entirely equivalent to another “98” and “97” is definitely better than “96.”
Mistake analysis primitively displays the start of this process. Categorizing a pitch as either a mistake or a non-mistake is one of the few ways to assess the pitch itself, and the only way that sorts production accordingly. The next step is to distinguish good from good and bad from bad. To quote myself above:
“It’s not the graders’ responsibility to differentiate mistake from mistake; it’s to distinguish mistake from non-mistake.”
The long-term goal differentiates mistake from mistake. This task gets harder because of all the variables involved: exact location, trajectory, and nastiness, among others. Future graders of mistake analysis can consider all relevant variables to better quantify each pitch. They will also certainly not be humans. Once a more in-depth means of quantifying a pitch is set, a computer could calculate and store the quality of every pitch thrown.
The process is long, and the end lies somewhere far in the future, but it must start somewhere. Mistake analysis is a movement more than anything else—the tip of the iceberg. Only time will tell how big the iceberg actually is.
Earned Production (EP)
Theoretically, a player’s EP always lands somewhere between 0 and 1.00 because of the literal definition of “mistake.” The working definition of “mistake” reads that the batter “clearly gets a better opportunity to produce.” So greater production off of mistakes is an assumption. Hypothetically, a player with an EP > 1.00 can exist, however.
In the EP columns, the Earned Production is the number that coincides with the non-mistake split. Under the EP column, the number that coincides with the mistake split represents the same proportion, but for mistakes. It takes the proportion of mistake production and divides it by the percentage of mistake at-bats. Theoretically, this number should always be greater than 1.00 for the same reason EP of non-mistakes is always between 0 and 1.
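Going by the description above, the EP-column number for a split is that split's share of production divided by its share of at-bats. The sketch below applies that ratio to both splits. Using the same formula for the non-mistake side is an inference from "the same proportion," hits stand in for "production" only as an assumption, and the counts are invented.

```python
def ep_ratio(split_production, split_at_bats, total_production, total_at_bats):
    """Share of production in a split divided by its share of at-bats."""
    production_share = split_production / total_production
    at_bat_share = split_at_bats / total_at_bats
    return production_share / at_bat_share


# Hypothetical batter: 20 hits in 50 mistake at-bats,
# 40 hits in 150 non-mistake at-bats.
total_hits, total_at_bats = 60, 200

mistake_ratio = ep_ratio(20, 50, total_hits, total_at_bats)
earned_production = ep_ratio(40, 150, total_hits, total_at_bats)

# Weighting each ratio by its share of at-bats always gives exactly 1.
weighted = (50 / 200) * mistake_ratio + (150 / 200) * earned_production
```

Because the at-bat-weighted average of the two ratios is 1 by construction, a mistake-split number above 1.00 forces the non-mistake EP below 1.00, and the reverse.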