It’s 2016 and Statcast is everyone’s favorite new toy. It’s not exactly a new toy, of course. Bits and pieces of the system were rolled out in 2014, and last year there were plenty of chances for the data to make themselves known on game broadcasts. Baseball fans have begun to absorb a new set of numbers as they watch the game. Unlike some of the “advanced” stats that have come before Statcast, these are numbers that a lot of people had actively wondered about but had very little ability to measure. How fast was he running on that play? That looked like a long way to run to make that catch, but how long was it?

One of the shiniest new toys coming out of Statcast has been “exit velocity” off the bat. For years, it’s been easy to measure how fast the pitcher throws the ball toward the batter, but there was never a way to know how fast the batter returned the favor. It always seemed that some guys hit the ball harder than others, but short of judging “the crack of the bat,” there was no way to apply a little methodological rigor to the subject. But now there is. We can look up a leaderboard on exit velocity and see who’s been hitting the ball the hardest (and perhaps getting unlucky by hitting it right at someone) and who’s been *giving up* the hardest-hit balls.

The idea behind Statcast was that it would allow front offices and fans alike to evaluate players in new ways. With hitters, it might provide data on performance that was more divorced from luck. A hitter might go 0-for-4 in a game, but hit three screamers right at the shortstop that – an inch or two to the left – would have shot into the left-center gap for two-run doubles. Exit velocity might help us pick out a few diamonds in the rough whose result stats don’t look great, but who clearly are doing something right.

But before we get too excited about the possibilities of exit velocity, perhaps we need to ask a few more fundamental questions. Calculating a player’s average exit velocity (or a pitcher’s exit velocity allowed) is easy enough. But is it meaningful? It’s not only 2016, it’s April of 2016, the time of year when there are inevitably some small-sample-size wonders, and people start wondering when we’ll reach that magical point in the season when a number becomes something more than just a weird fluke. This is where reliability analysis comes in. We have a pretty good idea of what exit velocity can do for a hitter. How long until we can believe his exit velocity?

**Warning! Gory Mathematical Details Ahead!**

I pulled data from the 2015 Statcast database. I found all batted balls that were hit into play (i.e., no foul balls, although home runs were welcome in the sample) and used only those on which an exit velocity was recorded. I required that each hitter have 300 such balls, and I lined them all up chronologically. Given that some stats require 500 or 600 plate appearances before they become “reliable enough,” I wasn’t sure whether I’d have the sample size to sustain these analyses, but I figured I’d give it a try.

Because exit velocity is a continuous variable, we’ll use Cronbach’s *alpha* statistic to conduct a reliability analysis. For the uninitiated, let’s say that I have 100 plate appearances for each hitter in my sample (the first 100 of the season with a recorded exit velocity). What Cronbach’s formula effectively does is take 50 of those exit velocities at random, average them, and compare that average to the average of the other 50. Then it selects another 50 at random and compares those to the 50 that were not chosen. It does this over and over again until every possible combination of 50 has been selected. What’s produced is essentially a giant correlation.

The idea is that if you take a sample of 50 PA and compare it to another sample of 50 PA drawn from the same basic timeframe and under the same circumstances, they should produce (within some margin for error) the same basic results. If Cronbach’s *alpha* is above .70, it’s generally considered good reliability. There’s nothing magical about the number .70. It’s an arbitrary line in the sand, although one with at least some rational basis. At a correlation of .70, you have an R-squared of .49, which is (yes, I know, a shade less than) half. At that point, roughly half of the variance is accounted for by the hitter himself. Anything north of .70 means we’ve got even more than half.
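To make that concrete, here’s a minimal sketch (mine, not the article’s, and using simulated rather than real Statcast numbers) of the closed-form Cronbach’s alpha computation, which is equivalent to averaging over all of those half-vs-half splits. Each hitter is a row and each batted ball in the frame is a column:

```python
import numpy as np

def cronbach_alpha(scores: np.ndarray) -> float:
    """Cronbach's alpha for a (hitters x batted balls) matrix.

    Equivalent to averaging the reliability over every possible
    half-vs-half split, but computed directly from variances.
    """
    k = scores.shape[1]                    # balls per hitter in the frame
    item_var = scores.var(axis=0, ddof=1)  # variance of each ball-slot across hitters
    total_var = scores.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_var.sum() / total_var)

# Illustrative fake data: each hitter has a true exit-velocity talent
# (spread of ~3 mph between hitters) plus large swing-to-swing noise (~12 mph).
rng = np.random.default_rng(42)
talent = rng.normal(89, 3, size=150)                     # 150 hitters
balls = talent[:, None] + rng.normal(0, 12, (150, 40))   # 40 balls each
print(round(cronbach_alpha(balls), 2))
```

The talent and noise figures are made up for illustration; the point is only the mechanics of the statistic, not the specific value it spits out.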

I used sampling frames in 20 PA intervals. That is, I started with everyone’s first 20 PAs and had Cronbach’s formula split them 10-and-10, and thus got the reliability number for 10 PAs.
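The splitting step itself can be sketched directly. The function below is my own illustration (hypothetical code and made-up variance figures, not the article’s): take each player’s frame of 2N balls, repeatedly split it into two random halves of N, average each half per player, and correlate the half-averages across players.

```python
import numpy as np

def split_half_reliability(frame: np.ndarray, n_splits: int = 500, seed: int = 0) -> float:
    """Mean correlation between two random halves of each player's frame.

    `frame` is a (players x 2N balls) matrix; the half-vs-half correlation
    estimates the reliability of an N-ball average (the 10-and-10 split
    of a 20-PA frame described above).
    """
    rng = np.random.default_rng(seed)
    n_players, k = frame.shape
    rs = []
    for _ in range(n_splits):
        order = rng.permutation(k)
        half_a = frame[:, order[: k // 2]].mean(axis=1)
        half_b = frame[:, order[k // 2 :]].mean(axis=1)
        rs.append(np.corrcoef(half_a, half_b)[0, 1])
    return float(np.mean(rs))

# Fake data again: between-hitter talent spread ~3 mph, per-ball noise ~12 mph.
rng = np.random.default_rng(7)
talent = rng.normal(89, 3, size=150)
frame20 = talent[:, None] + rng.normal(0, 12, (150, 20))
print(round(split_half_reliability(frame20), 2))  # reliability of a 10-ball average
```

These made-up figures happen to produce numbers in the same general neighborhood as the real results, but that’s a property of the parameters I chose, not a derivation of them.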

The results (n = 154, for the curious):

| Balls in Play | Reliability of Exit Velocity |
| --- | --- |
| 10 | .350 |
| 20 | .527 |
| 30 | .635 |
| 40 | .679 |
| 50 | .732 |

At 40-something balls in play, we can get an average exit velocity for a hitter that is fairly reliable. To put that in context, the batting stats with the lowest reliability thresholds are things like swing rate and strikeout rate, which become reliable around 50 or 60 PA. Exit velocity requires a ball in play, so 40 balls in play might take a few more PAs than that, but the point where it’s “not a small sample size any more” comes very quickly in the season.
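Incidentally, the way reliability grows with the sampling frame follows the Spearman-Brown prophecy formula: lengthening a frame by a factor of n takes reliability r to nr / (1 + (n - 1)r). Projecting forward from the observed .350 at 10 balls in play (my arithmetic, not the article’s) tracks the rest of the table closely:

```python
def spearman_brown(r: float, n: float) -> float:
    """Projected reliability when the sampling frame is n times as long."""
    return n * r / (1 + (n - 1) * r)

# Project from the observed reliability of .350 at 10 balls in play.
for bip in (20, 30, 40, 50):
    print(bip, round(spearman_brown(0.350, bip / 10), 3))
# 20 -> 0.519 (observed .527), 30 -> 0.618 (.635),
# 40 -> 0.683 (.679), 50 -> 0.729 (.732)
```

The close agreement is what you’d expect if a hitter’s exit-velocity talent is stable over the sample, which makes the pitcher results further down all the stranger.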

I think it’s worth adding a *caveat* about these sorts of analyses that people don’t often heed. The idea that a number has become “reliable” is **not** the same thing as saying that the player now *is that number* and that going forward this is what we should expect out of him. Reliability in this sense is a retrospective number. I can look back on the first few weeks of the season, look at the Statcast leaderboard, and feel pretty good that those 40 balls in play represent a good estimate of what a player’s actual “talent” for exit velocity was *during that time which is now past.* It’s not a bad assumption that he might continue at that talent level during the next 40 balls in play, or the next 100, or the next 200, but it is an assumption.

Still, that’s a pretty low number. Exit velocity is pretty quick to stabilize for hitters.

So… what about pitchers?

Last year, Rob Arthur found that the variation between pitchers was much smaller than the variation between hitters, and calculated that a pitcher’s “contribution” to the exit velocity of a specific ground ball was about one-fifth that of the hitter. Well, let’s see what our reliability analysis yields. (Same method as above, this time with balls binned by pitcher rather than hitter; n = 133.)

| Balls in Play | Reliability of Exit Velocity Allowed |
| --- | --- |
| 10 | .390 |
| 20 | .195 |
| 30 | .405 |
| 40 | .758 |

There’s a blip at the start (the numbers for 10 and 20 are not reversed; they are correct), but exit velocity allowed quickly reaches “good enough” reliability at 40ish balls in play as well. Normally, I’d stop there, but I think you might want to see the rest of the chart.

| Balls in Play | Reliability of Exit Velocity Allowed |
| --- | --- |
| 50 | .772 |
| 60 | .745 |
| 70 | .728 |
| 80 | .719 |
| 90 | .720 |
| 100 | .701 |
| 110 | .698 |
| 120 | .688 |
| 130 | .691 |
| 140 | .697 |
| 150 | .718 |

As you add more data to the sampling frame, the reliability of exit velocity allowed doesn’t get better for pitchers. It actually starts trending… downward. More data makes for a less reliable estimate of true talent. When I saw this, I looked at what happened for batters as you add more data and found the more traditional pattern: reliability went up and up and up. So what’s going on with pitchers?

Maybe “collect more data” isn’t always a great idea. There’s a tendency to view baseball players as their season stats. If a hitter puts up a .300 average during a season, we tend to look back on that season and assume that he was a .300 hitter all along, from April to September. What if he was *really* a .280 hitter in the first half, then made an adjustment at the All-Star break and was *really* a .320 hitter in the second half? What if it was even more nuanced than that? If we could somehow know his true talent and graph it over the course of a season, we might see that it wandered hither and yon.

I’d suggest that we’re seeing something similar for pitchers, and that the time it takes for that true talent to wander around is a lot less than you might think. We know that pitchers get better and worse as they develop and age, but maybe those developments are less linear and more rapid than we thought. It is possible (and according to these numbers, common!) that a pitcher who had a good April could be a different pitcher by June. What’s strange is that we’re not seeing that these numbers are *un*reliable; if we were, we might say that exit velocity allowed is all chance, sorta like BABIP. In fact, in small doses, exit velocity allowed is quite reliable.

**A Crack in the DIPS Code?**

The implications of this are kinda big. For example, we have long assumed that pitcher BABIP was essentially random and that pitchers had essentially no control over it. If we take exit velocity as a proxy for BABIP, perhaps the problem isn’t that the pitcher has no control. Of course, there’s going to be luck in everything in baseball, but the issue might be that what we’ve really been seeing all along is that pitchers can be six different animals over the course of a season when it comes to their ability to prevent hard-hit balls. It’s no wonder that shows up as randomness when we look at reliability.

This might be a little breakthrough in understanding the mystery of DIPS. Maybe we were conceptualizing the problem wrong. We assumed that pitchers should be the same throughout a year and that more data were better, so if performance wasn’t correlating, it must have been a function of luck rather than rapid, but real, fluctuation in talent level. These findings suggest we’d do better to look into how a pitcher changes within a season, maybe within a month, if we really want to understand him.