May 11, 2012
The Stats Go Marching In
All About Velocity
Cooling off the radar guns
Everyone likes looking at radar guns.
I love finesse pitchers and junkball artists, probably because they look the most human to me. Nevertheless, my eyes gravitate toward the velocity reading after Aroldis Chapman throws his heater. I also look up to see if Justin Verlander has just hit triple digits on his hundredth pitch in the eighth inning, as he seems to gain velocity as the game progresses.
Slow pitches can be even more compelling than speedy ones. I always look to the broadcast graphics to learn the speed (or lack thereof) of Vicente Padilla’s blooper (it goes as low as 53 mph, while his fastball can leave his hand at 96), Barry Zito’s roundhouse curve, and all of Jamie Moyer’s arsenal (he’s topped 80 mph 11 times this year at the time of this writing, with no offering passing 81).
Since the advent of PITCHf/x, we haven’t been limited to the radar readings shown on TV, as every pitch’s velocity is captured and archived for posterity. Thus we can dig into the big database, which now has close to three million pitch speeds recorded for major-league games, and answer a lot of questions (like, for instance, how slow Padilla’s blooper is and how many times Moyer has cracked 80 mph).
PITCHf/x data accuracy is very high. However, over the years we have seen that the excellent system devised by Sportvision may encounter some calibration issues, sometimes isolated, sometimes prolonged (as happened at Minute Maid Park in 2010). Great analyses can be performed on raw PITCHf/x data, but you can’t jump at an umpire for calling a strike on a pitch that missed the outside corner according to the PitchTrax on TV, because what you see on the graphics might just be an artifact of the system being calibrated a bit to the inside.
Several attempts have been made to reduce park-to-park and game-to-game inconsistencies in pitch location data. There is great stuff in these pages by Mike Fast, and yours truly also devoted some effort to the issue. While reaching perfection is impossible, both Mike’s and my algorithms produced correction values to improve the already good PITCHf/x location data.
There is no use in repeating the work you can read in the articles linked above. Suffice it to say that, while we used different tools, we applied a very similar framework. The idea was to look at all the pitchers appearing in a particular game and compare their locations in the game to the ones in previous games. Every time I write an article using pitch location data, I correct them using my algorithm before performing any kind of analysis.
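The shared framework can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the algorithm either of us actually used: it assumes each pitcher and pitch type has a baseline built from previous games, and it estimates a game's offset as the average deviation from those baselines. The names `game_offset` and `baselines`, and every number below, are hypothetical.

```python
from statistics import mean

def game_offset(pitches, baselines):
    """Estimate a game's measurement offset (illustrative sketch).

    pitches   -- list of (pitcher, pitch_type, value) recorded in the game;
                 value can be a location coordinate or a speed
    baselines -- dict mapping (pitcher, pitch_type) to the average value
                 from that pitcher's previous games (hypothetical input)
    """
    deviations = [
        value - baselines[(pitcher, pitch_type)]
        for pitcher, pitch_type, value in pitches
        if (pitcher, pitch_type) in baselines  # skip pitches with no history
    ]
    # With no usable history, assume no offset.
    return mean(deviations) if deviations else 0.0

# Made-up numbers, for illustration only.
baselines = {("A", "FF"): 93.0, ("B", "CU"): 78.0}
pitches = [
    ("A", "FF", 92.5),
    ("A", "FF", 93.0),
    ("B", "CU", 77.5),
    ("C", "FF", 95.0),  # no baseline for pitcher C, so this pitch is ignored
]
offset = game_offset(pitches, baselines)
```

A plain average treats every pitch equally; a real implementation would likely need to weight pitchers by how much history they have and guard against genuine changes in a pitcher's stuff.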
I also have four years of correction values for pitch speed, calculated in the same way as the location coordinates. However, I have not used them so far.
Why not? Because there is a lot more going on with velocity, and that’s what I’m going to explore in this article.
Washington’s gas policy
One obvious question Ben Lindbergh had was whether what he was observing was real rather than an artifact of the PITCHf/x system in Washington being generous with speed recordings.
My automated corrections say that the games played in the Capital in April 2012 have velocities depressed by 0.15 mph once pitchers and pitch types are accounted for (the games played by the Nationals on the road have a similar mark, at -0.14 mph).
This does not mean that the “radar gun” in Washington is cold (or at least below average). While we can be pretty sure that PITCHf/x calibration is the main culprit when something is off with location data, several other factors enter the equation for velocity. In fact, all but four ballparks appear to have cold guns this season, and the -0.15 value for Washington is actually above league average for the month of April 2012.
That’s nothing surprising. Ben noted in his article that the speeds posted by the Nationals’ starters were all the more remarkable because they were limited to the month of April, when velocity is at its minimum across the league.
Here are the corrections the algorithm I use for locations would apply to speeds, grouped by month.
They work like this: in order to arrive at the adjusted speed, you have to subtract the correction value from the recorded speed. Thus, a negative correction value means the gun is cold, and you should add some speed to yield a truer velocity.
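As a quick worked example of that rule, here is a minimal sketch. The function name `adjusted_speed` and the 95.0 mph reading are hypothetical; the -0.15 value is the April 2012 correction for Washington quoted earlier.

```python
def adjusted_speed(recorded_mph, correction_mph):
    """Subtract the correction value from the recorded speed."""
    return recorded_mph - correction_mph

# A negative correction (cold gun) pushes the adjusted speed upward.
recorded = 95.0      # hypothetical radar reading, in mph
correction = -0.15   # Washington, April 2012 (from the text)
adjusted = adjusted_speed(recorded, correction)  # roughly 95.15 mph
```

The sign convention is the easy part to get backwards: subtracting a negative number adds speed, which is what a cold gun calls for.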
It’s obviously not a coincidence that I have added a column indicating the average game time temperature to the above table. In fact, it appears (unsurprisingly) that the velocity measurements are biased toward higher values when the temperature is higher.
This allows us to state a first obvious thing: when we apply to velocity the same corrections devised for location, what we capture is not only the bias in the PITCHf/x calibration, but also the effect of weather.
If we were able to isolate the two components, we would be able to present the real velocity of the pitch, as delivered under the particular weather conditions. However, if we go back to Ben’s use of PITCHf/x data, we note that:
So, for something like Ben’s question, the location-like corrections can be used—unless there is something else going on.
All things considered
Ball/strike count and the presence of runners on the bases (as a proxy for throwing from the set position) were initially tested in the model but later discarded when they showed no significant effect on the location calibrations. Throwing them back into the model gives the following results:
Pitchers deliver the ball over 0.7 miles per hour faster (relative to an 0-0 pitch) when they need one more strike to eliminate the batter, throw around 0.2 mph slower when grooving a 3-0 pitch, and throw roughly 0.2 mph harder on every other count.
In the figures above, the type of pitch is already accounted for, so the fact that fastballs are prevalent in some counts should not be an issue. I’m not sure why the first pitch of the at-bat should be the slowest one (except for the 3-0 pitch), but the difference is not huge. The remaining numbers make sense.
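To make the count effects concrete, here is a small sketch that encodes the approximate values quoted above and uses them to put a pitch on a count-neutral (0-0) footing. The function names are hypothetical, and the rounded effects (0.7, -0.2, 0.2) are approximations taken from the prose, not the exact model coefficients.

```python
def count_effect(balls, strikes):
    """Approximate speed effect (mph) of the count, relative to 0-0."""
    if (balls, strikes) == (0, 0):
        return 0.0   # baseline: first pitch of the at-bat
    if strikes == 2:
        return 0.7   # one strike away from retiring the batter
    if (balls, strikes) == (3, 0):
        return -0.2  # grooving a 3-0 pitch
    return 0.2       # every other count

def count_neutral_speed(recorded_mph, balls, strikes):
    """Remove the count effect to compare pitches thrown in different counts."""
    return recorded_mph - count_effect(balls, strikes)

# Hypothetical 94.7 mph fastball on 1-2: about 94.0 mph on a 0-0 basis.
neutral = count_neutral_speed(94.7, 1, 2)
```

This treats the count effect as a simple additive shift, which is an assumption of the sketch; the prose does not say whether the underlying model allows the effect to vary by pitcher or pitch type.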
I expected to see that pitchers throw harder in bases-empty situations because they can pitch from the windup; however, that does not seem to be the case, as they apparently reach for a little extra gas (+0.2 mph) when they are in a tight spot. Maybe the extra effort pitchers make with men on is enough to make up the difference between the full windup delivery and the set position, and then some.
I did not include temperature in the model but, as I did a couple of weeks ago when evaluating the factors that influence run scoring, I used month of the year and time of the day. Here is what emerges from the model. I thought it would be interesting to add a column reporting the results from my previous article on run scoring.
While the two are not perfectly related, runs come a little scarcer when pitch speed increases. However, while the month table, like the one we showed at the beginning of the article, indicates a strong (and expected) relationship between temperature and pitch speed, the one with times of the day illustrates something different, as there is an increase in speed in the coolest parts of the day.
Note that the identity of pitchers is already considered, so the fact that flame-throwing relievers are more likely to be on the mound at later hours is also an already neutralized factor in the model.
There’s one final element incorporated into the model that was considered in the article on run scoring as well: pitcher fatigue. Unless your name is Justin Verlander, you are likely to see a decrease in your speed as the game progresses and your arm tires.
Here is the effect of fatigue on velocity when all the factors previously mentioned are considered.
Not surprisingly, fresh pitchers throw the ball harder.
Back to adjustments
Here are the five ballparks where the “radar gun” has been the most generous toward pitchers in the last couple of years.
And now, the coldest guns in the majors.
Actually, despite all of our efforts, the numbers above do not capture the offset of the PITCHf/x system alone. While month and hour of the day encapsulate a great deal of the atmospheric conditions, the cities around the States (and the one in Canada as well) have their own peculiar climate.
Seeing New York ballparks so close would suggest something in the air of the Big Apple was missed by the model. On the other hand, having Wrigley Field and The Cell so distant would not suggest such a conclusion for Chicago.
So what do we make of these adjustments? Let’s go back to Ben Lindbergh’s question once again.
Sometimes simpler is better. The raw adjustments I presented at the beginning of the article suggest that the reported velocities are slightly down for the Nationals’ arms; on the other hand, the finer adjustments say the ballpark in Washington has a hot gun. The former do not give us any info on why we should adjust the speed marks upward; they just provide a number that represents weather plus ballpark plus unknown factors. The latter show separate effects for the components, but one would have to take care of all of them when trying to answer a simple question such as, “Are the Nationals starters throwing hard?”
The bottom line is that the choice between the simpler and the more complex adjustments is not univocal: it depends on the question we are trying to answer. Next time Ben asks me about velocities, I’ll know that my simple corrections won’t be completely useless.