April 19, 2016
Baseball Therapy
The One About Exit Velocity

It’s 2016 and Statcast is everyone’s favorite new toy. It’s not exactly a new toy, of course. Bits and pieces of the system were rolled out in 2014, and last year there were plenty of chances for the data to make themselves known on game broadcasts. Baseball fans have begun to absorb a new set of numbers as they watch the game. Unlike some of the “advanced” stats that have come before Statcast, these are numbers that a lot of people had actively wondered about, but had very little ability to measure. How fast was he running on that play? That looked like a long way to run to make that catch, but how long was it?

One of the shiniest new toys coming out of Statcast has been “exit velocity” off the bat. For years, it’s been easy to measure how fast the pitcher throws the ball toward the batter, but there was never a way to know how fast the batter returns the favor. It always seemed that some guys hit the ball harder than others, but other than judging “the crack of the bat” there was no way to apply a little methodological rigor to the subject. But now there is. We can look up a leaderboard on exit velocity and see who’s been hitting the ball the hardest (and perhaps getting unlucky by hitting it right at someone) and who’s been giving up the hardest-hit balls.

The idea behind Statcast was that it would allow front offices and fans alike to evaluate players in new ways. With hitters, it might provide data on performance that was more divorced from luck. A hitter might go 0-for-4 in a game, but hit three screamers right at the shortstop that, an inch or two to the left, would have shot into the left-center gap for two-run doubles. Exit velocity might help us pick out a few diamonds in the rough whose result stats don’t look great, but who clearly are doing something right. But before we get too excited about the possibilities of exit velocity, perhaps we need to ask a few more fundamental questions.
Calculating a player’s average exit velocity (or a pitcher’s exit velocity allowed) is easy enough. But is it meaningful? It’s not only 2016, but it’s April of 2016, a time of the year when there are inevitably some small sample size wonders, and people start wondering when we reach that magical point in the season when a number becomes something more than just a weird fluke. This is where reliability analysis comes in. We have a pretty good idea of what exit velocity can do for a hitter. How long until we can believe his exit velocity?

Warning! Gory Mathematical Details Ahead!

I pulled data from the 2015 Statcast database. I found all batted balls that were put into play (i.e., no foul balls, although home runs were welcome in the sample). I only used balls on which exit velocity was recorded. I required that each hitter had 300 such balls, and I lined them all up chronologically. Given that some stats require 500 or 600 plate appearances before they become “reliable enough,” I wasn’t sure if I’d have the sample size to sustain these analyses, but I figured I’d give it a try.

Because exit velocity is a continuous variable, to conduct a reliability analysis we’ll use Cronbach’s alpha statistic. For the uninitiated, let’s say that I have 100 plate appearances for each hitter in my sample (the first 100 of the season with a recorded exit velocity). What Cronbach’s formula does is take 50 random exit velocities, average them, and compare that average to the average of the other 50 exit velocities. Then it selects another 50 at random and compares those to the 50 that were not chosen. And it does this over and over again until every possible combination of 50 has been selected. What’s produced is essentially a giant correlation. The idea is that if you take a sample of 50 PA and compare it to another sample of 50 PA drawn from the same basic timeframe and under the same circumstances, they should produce (within some margin for error) the same basic results.
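For readers who want to poke at the idea, here is a minimal sketch of Cronbach’s alpha in Python. It is not the code behind this article. In practice alpha is usually computed from variances rather than by literally enumerating every split, and that variance form is what appears below. The function names, the simulated hitters, and every number in `simulate` (league-average velocity, talent spread, ball-to-ball noise) are made-up assumptions for illustration, not Statcast values.

```python
import random
import statistics


def cronbach_alpha(rows):
    """Cronbach's alpha for a table with one row per hitter and one
    column per batted ball (the i-th recorded exit velocity)."""
    k = len(rows[0])  # number of batted balls ("items") per hitter
    if k < 2 or any(len(r) != k for r in rows):
        raise ValueError("need at least 2 columns and equal-length rows")
    # Variance of each column (the i-th batted ball) across hitters.
    item_vars = [statistics.variance(col) for col in zip(*rows)]
    # Variance across hitters of each hitter's summed exit velocity.
    total_var = statistics.variance([sum(r) for r in rows])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


def simulate(n_hitters=150, n_balls=40, talent_sd=2.5, noise_sd=13.0, seed=0):
    """Hypothetical data: each hitter has a fixed 'true' exit velocity,
    and every batted ball is that talent plus random noise.
    All parameter values here are illustrative assumptions."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_hitters):
        talent = rng.gauss(89.0, talent_sd)
        rows.append([rng.gauss(talent, noise_sd) for _ in range(n_balls)])
    return rows


if __name__ == "__main__":
    # Alpha for progressively larger samples of batted balls per hitter.
    for n in (10, 20, 40, 80):
        print(n, round(cronbach_alpha(simulate(n_balls=n)), 2))
```

With real data, each row would be one hitter’s first N batted balls in chronological order, and the question becomes how large N has to be before alpha clears whatever threshold you care about.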
If Cronbach’s alpha is above .70, it’s generally considered to be good reliability. There’s nothing magical about the number .70. It’s an arbitrary line in the sand, although one that has at least some rational basis. At a correlation of .70, you have an R-squared of .49, which is (yes, I know, a little bit less than) half. At that point, we are attributing about half of the variance to the hitter himself. Anything north of .70 means we’ve got even more than half.

I used 20-PA intervals as the sampling frame. That is, I started with everyone’s first 20 PA and had Cronbach’s alpha split them 10-and-10 and thus got the reliability number for 10 PA. The results (n = 154, for the curious):
I think it’s worth adding in a caveat about these sorts of analyses that people don’t often heed. The idea that a number has become “reliable” is not the same thing as saying that the player is now that number and that going forward this is what we should expect out of him. Reliability in this sense is a retrospective number. I can look back on the first few weeks of the season, look at the Statcast leaderboard, and feel pretty good that those 40 balls in play represent a good estimate of what a player’s actual “talent” for exit velocity was during that time, which is now past. It’s not a bad assumption that he might continue at that talent level during the next 40 balls in play or the next 100 or the next 200, but it is an assumption.

Still, that’s a pretty low number. Exit velocity is pretty quick to stabilize for hitters.

So… what about pitchers? Last year Rob Arthur found that the variation between pitchers was much less than the variation between hitters and calculated that a pitcher’s “contribution” to the exit velocity of a specific ground ball was about one-fifth that of the hitter. Well, let’s see what our reliability analysis yields. (Same method as above, this time with balls binned by pitchers, rather than hitters; n = 133)
Maybe “collect more data” isn’t always a great idea. There’s a tendency to view baseball players as their season stats. If a hitter puts up a .300 average during a season, we tend to look back on that season and assume that he was a .300 hitter all along, from April to September. What if he was really a .280 hitter in the first half, then at the All-Star break, he made an adjustment and was really a .320 hitter in the second half? What if it was even more nuanced than that? If we could somehow know his true talent and could graph it over the course of a season, we could see that it wandered hither and yon.

I’d suggest that we’re seeing something similar for pitchers, and the time that it takes for that true talent to wander around is a lot less than you might think. We know that pitchers do get better and worse as they develop and age, but maybe those developments are less linear and more rapid than we thought. It is possible (and according to these numbers, common!) that while a pitcher might have had a good April, by June he could be a different pitcher.

What’s strange is that we’re not seeing that these numbers are unreliable. In that case, we might say that exit velocity allowed is all chance, sorta like BABIP. In fact, in small doses, exit velocity is quite reliable.

A Crack in the DIPS Code?

This might be a little breakthrough in understanding the mystery of DIPS. Maybe the problem was that we were conceptualizing the problem wrong. We assumed that pitchers should be the same throughout a year and that more data were better. If performance wasn’t correlating, we assumed it must be a function of luck, rather than rapid but real fluctuation in talent level. These findings suggest we’d do better looking into how a pitcher changes within a season, maybe within a month, if we really want to understand him.
Russell A. Carleton is an author of Baseball Prospectus. Follow @pizzacutter4
17 comments have been left for this article.

For pitchers, I wonder if the dip from 10 to 20 BIPs is due to the identity of the batters. For a starter the 10 BIP number could all be coming from the same game; average BIPs per start is about 18. If you increase the sample to get 20 BIPs per half you're usually going to be covering two different teams.
That makes sense.
This question provokes my question then:
Why not randomize the 10 BIP also? If I followed the article correctly, you took the first 10 BIP of the season and all randomized 5x5 combinations of it. But could you take a random set of 10 BIP from the original set of 300 PA, and then do the 5x5?
I ran them chronologically because that's how we get them in real time. I'm guessing that upfront, we have the issue that's been identified, but as you go further into the season, it smooths out.