
The last decade has seen much discussion and evolution in sabermetric thought around the relative abilities of batters, pitchers, fielders, and Lady Luck to control the outcome of batted balls. Data collected by Sportvision and MLBAM sheds new light on this question, but before we tackle that data, let’s review some of the history of how we came to our current state of knowledge.

When Voros McCracken published his Defense-Independent Pitching Statistics in 2001, his findings were considered extremely controversial. Since that time, however, the sabermetric community has largely adopted his conclusions, with some refinements and caveats.

McCracken refined his approach a year later and summarized his conclusions as follows:

1. The amount that MLB pitchers differ with regards to allowing hits on balls in the field of play is much less than had been previously assumed. Good pitchers are good pitchers due to their ability to prevent walks and homers and get strikeouts in some sort of combination of those three.

2. The differences that do exist between pitchers in this regard are small enough so that if you completely ignore them, you still get a very good picture of the pitcher’s overall abilities to prevent runs and contribute to winning baseball games.

3. That said, the small differences do appear to be statistically significant if generally not very relevant.

The following year, Tom Tippett published an extensive study that modified some of McCracken’s conclusions. Tippett’s summary of his work mostly reflects the current state of knowledge on the topic:

1. Pitchers have more influence over in-play hit rates than McCracken suggested. In fact, some pitchers (like Charlie Hough and Jamie Moyer) owe much of their careers to the ability to excel in this respect.

2. Their influence over in-play hit rates is weaker than their influence over walk and strikeout rates. The most successful pitchers in history have saved only a few hits per season on balls in play, when compared with the league or team average. That seems less impressive than it really is, because the league average is such a high standard. Compared to a replacement-level pitcher, the savings are much greater.

3. The low correlation coefficients for in-play batting average suggest that there's a lot more room for random variation in these outcomes than in the defense-independent outcomes. I believe this follows quite naturally from the physics of the game. When a round bat meets a round ball at upwards of 90 miles per hour, and when that ball has laces and some sort of spin, miniscule differences in the nature of that impact can make the difference between a hit and an out. In other words, there's quite a bit of luck involved.

4. Year-to-year variations in IPAvg-versus-team can occur if the quality of a pitcher's teammates varies from year to year, even if that pitcher's performance is fairly consistent.

5. The fact that there's room for random variation doesn't necessarily mean a pitcher doesn't have any influence over the outcomes. It just means that his year-to-year performances can vary randomly around value other than zero, a value that reflects his skills.

6. Unusually good or bad in-play hit rates aren't likely to be repeated the next year. This has significant implications for projections of future performance.

7. Even if a pitcher has less influence on in-play averages than on walks and strikeouts, that doesn't necessarily mean that in-play outcomes are less important. Nearly three quarters of all plate appearances result in a ball being put in play. Because these plays are much more frequent, small differences in these in-play hit rates can have a bigger impact on scoring than larger differences in walk and strikeout rates.

In 2005, John Burnson found that pitchers did not have much impact on their rate of home runs allowed other than the extent to which they allowed outfield flies in general. (Dave Studeman created the xFIP statistic based upon this concept, normalizing not only a pitcher's BABIP rate but also his rate of home runs allowed per outfield fly ball.)

In 2005 and 2006, respectively, J.C. Bradbury and David Gassko found that pitchers had no consistency from year to year in their rate of line drives allowed. They confirmed the finding that pitchers had little year-to-year consistency in the rate of home runs allowed on outfield flies, and they also observed some statistically-significant year-to-year correlation in pitchers’ popup rates.

Having done this research, it becomes obvious why Voros’ original postulate works so well. While pitchers exhibit great control over the types of balls in play they allow, they show little overall control on the two batted ball types that impact BABIP the most—infield flies (where there is some year-to-year correlation) and line drives (where there is none). More so, as infield flies occur relatively rarely (constituting only slightly more than 4% of all balls in play), they will not have enough of an overall impact for any strong year-to-year relationship in year-to-year BABIP. You can make sense of a pitcher’s season just by looking at his home run, strikeout, and walk rates. But you’ll get a better and more detailed picture by using batted ball data.

At this point the devolution of the pitcher’s control over batted balls in sabermetric understanding was basically complete. What mattered on balls in play was whether a pitcher allowed ground balls or fly balls; the rest of his batted-ball performance was unpredictable from year to year. Many analysts thus concluded that strikeouts, walks, and ground ball rate (and perhaps popup rate) were all that mattered for major-league pitchers. In this view, batted ball results beyond getting ground balls (and popups) were due either to the performance of the batter, the pitcher’s fielders and park, or to unrepeatable luck.

Other analysts, including this author, believed that the nature of the physics of the game indicated that, though the current statistics did not show it, the pitcher must have significant control not just over the vertical angle at which the ball came off the bat but also over whether the batter’s contact itself was weak or solid. In fact, a conversation to that effect with Tom Tippett at the 2008 Sportvision PITCHf/x Summit has stayed in my mind ever since. I hope that this study will illuminate the question of whether major-league pitchers have a varied and persistent skill in eliciting weak contact.

At that same 2008 PITCHf/x Summit, Peter Jensen presented a proposal for measuring the initial speed of batted balls using the PITCHf/x camera footage. Over the following off-season, Sportvision developed the HITf/x system to do just that, and the following summer, Sportvision released the HITf/x data from April 2009 for public study.

Earlier this year, I examined the April 2009 HITf/x data to learn whether pitchers had a persistent skill around quality of contact. I found that batters seemed to have a greater degree of control over how hard the ball was hit but that pitchers also had a significant degree of control over batted ball speed. However, the one-month sample size restricted the ability to draw firmer quantitative conclusions, and I did not publish my findings at that time.

This summer, Sportvision graciously provided me with the full season of 2008 HITf/x data, allowing me to study the question on a larger sample of just over 124,000 batted balls.

The HITf/x data measures the speed and direction of each batted ball throughout its trajectory in the PITCHf/x camera frames, which cover roughly the area between home plate and the pitcher’s mound. The reported speed is the average speed over this distance, which will be slightly lower than the initial speed off the bat due to the drag force. In addition, the speeds of ground balls that bounce very near home plate may be difficult to measure prior to the first bounce. Nonetheless, I believe that the initial speeds reported in the data are accurate and consistent enough for this type of evaluation.

To measure the quality of contact, I calculated the initial speed of batted balls in the plane of the playing field. Popups or balls pounded sharply into the ground may leave the bat at a high speed, but they are not usually difficult to field. Balls that travel quickly toward the outfield fence provide a much greater challenge to the fielders.
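As a rough sketch of that calculation: if a batted ball's total speed and vertical launch angle are known, the component in the plane of the field is the speed times the cosine of that angle. The function name and the sample speeds and angles below are illustrative assumptions, not values taken from the HITf/x data.

```python
import math

def horizontal_speed(speed_mph: float, vertical_angle_deg: float) -> float:
    """Component of batted-ball speed in the plane of the playing field."""
    return speed_mph * math.cos(math.radians(vertical_angle_deg))

# Hypothetical batted balls: (description, total speed in mph, vertical angle in degrees)
examples = [
    ("line drive", 95.0, 12.0),
    ("high popup", 80.0, 75.0),
    ("ball pounded into the ground", 90.0, -50.0),
]

for label, speed, angle in examples:
    print(f"{label}: {horizontal_speed(speed, angle):.1f} mph horizontal")
```

The popup and the chopper both leave the bat quickly, yet most of that speed is vertical, so their horizontal component comes out well below the line drive's.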

How does the horizontal component of the speed of the ball off the bat relate to the chances that a ball will fall for a hit?

 

A batted ball with a horizontal speed off the bat (hSOB) of less than 60 mph had only about a 10 percent chance of turning into a hit. These batted balls were typically infield popups or weak ground balls. At horizontal speeds above 50 or 60 mph, the harder the ball was hit, the better the chance the batter reached safely. When the hSOB was 100 mph or more, the chance of getting a hit exceeded 60 percent.

We will revisit later how quality of contact and other factors affect batting average on balls in play, but let’s return to the question of who controls the quality of contact.

I randomly split the batted balls from the 2008 HITf/x data into two halves and compared the average hSOB between halves for each pitcher and batter with at least 300 total batted balls.
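A split-half comparison of this kind can be sketched in a few lines. The version below is a simplified illustration on synthetic data, not the actual HITf/x analysis: it shuffles each player's batted balls and cuts the list in two, and the player count, talent spread, and per-ball scatter are all assumed values chosen only to make the example run.

```python
import random
from statistics import mean

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

def split_half_correlation(balls_by_player, min_balls=300, seed=0):
    """balls_by_player maps a player name to a list of hSOB values in mph."""
    rng = random.Random(seed)
    first_half, second_half = [], []
    for speeds in balls_by_player.values():
        if len(speeds) < min_balls:
            continue  # skip players below the batted-ball threshold
        shuffled = speeds[:]
        rng.shuffle(shuffled)
        half = len(shuffled) // 2
        first_half.append(mean(shuffled[:half]))
        second_half.append(mean(shuffled[half:]))
    return pearson_r(first_half, second_half)

# Synthetic players (assumed numbers): true averages spread around 70.9 mph,
# with large ball-to-ball scatter around each player's true average.
data_rng = random.Random(42)
players = {}
for i in range(150):
    true_avg = data_rng.gauss(70.9, 3.0)
    players[f"player_{i}"] = [data_rng.gauss(true_avg, 22.0) for _ in range(400)]

r = split_half_correlation(players)
print(f"split-half r = {r:.2f}")
```

The closer r is to 1, the more a player's average in one half of the sample tells you about his average in the other half.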

 

Batters have a good deal of correlation between halves of the sample, with a correlation coefficient of r=0.76 with an average of 201 batted balls in each half. That means that we would add 63 batted balls (or about one month’s worth) at league average to the observed average speed for each batter in order to estimate his true skill.
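One common rule that reproduces the 63 figure is to add n(1 - r)/r league-average batted balls, where n is the number of batted balls per half and r is the split-half correlation. The sketch below assumes that rule and uses the league average of 70.9 mph shown in the table that follows; the function names are made up for illustration.

```python
LEAGUE_AVG_HSOB = 70.9  # league-average hSOB in mph

def regression_ballast(r: float, balls_per_half: float) -> float:
    """League-average batted balls to add, given split-half correlation r.

    Assumes the n * (1 - r) / r rule, which matches the figure quoted above.
    """
    return balls_per_half * (1.0 - r) / r

def regressed_hsob(observed: float, batted_balls: int, ballast: float,
                   league_avg: float = LEAGUE_AVG_HSOB) -> float:
    """Weighted average of the observed hSOB and the league average."""
    return (observed * batted_balls + league_avg * ballast) / (batted_balls + ballast)

ballast = round(regression_ballast(0.76, 201))
print(ballast)                                        # 63
print(round(regressed_hsob(79.5, 418, ballast), 1))   # 78.4 (Ryan Howard's row)
```

The fewer batted balls a hitter has, the more the 63 league-average batted balls pull his estimate toward 70.9 mph.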

Here are the batters (excluding pitchers) with the highest and lowest average hSOB in 2008, after applying the regression toward the league average:

| Batter | Batted Balls | Observed hSOB (mph) | Regressed hSOB (mph) |
|---|---|---|---|
| Ryan Howard | 418 | 79.5 | 78.4 |
| Manny Ramirez | 410 | 79.3 | 78.2 |
| Joe Mauer | 450 | 78.3 | 77.4 |
| Chipper Jones | 352 | 77.9 | 76.9 |
| Miguel Cabrera | 441 | 77.5 | 76.7 |
| Shin-Soo Choo | 220 | 78.3 | 76.6 |
| Derrek Lee | 485 | 77.3 | 76.6 |
| Andre Ethier | 450 | 77.3 | 76.5 |
| Joey Votto | 407 | 77.1 | 76.2 |
| Jorge Posada | 126 | 78.9 | 76.2 |
| League Average | | 70.9 | |
| Jeff Mathis | 183 | 63.0 | 65.0 |
| Augie Ojeda | 190 | 63.0 | 65.0 |
| Tony Pena | 155 | 62.6 | 65.0 |
| Luis Castillo | 222 | 63.1 | 64.9 |
| Juan Uribe | 261 | 63.2 | 64.7 |
| Reggie Willits | 68 | 58.4 | 64.4 |
| Willy Taveras | 360 | 60.5 | 62.1 |
| Carlos Gomez | 400 | 60.6 | 62.0 |
| Alexi Casilla | 328 | 60.0 | 61.5 |
| Joey Gathright | 212 | 54.9 | 58.5 |


Here is the same data for pitchers who allowed at least 300 batted balls in 2008.

 

Pitchers have fairly good correlation between halves of the sample, though not as good as batters. The correlation coefficient is r=0.48 with an average of 251 batted balls in each half. That means that we would add 269 batted balls (or about three months’ worth for a starter) at league average to the observed average speed for each pitcher in order to estimate his true skill.

One thing that stands out is that the spread of values among pitchers is not as big as the spread among batters. For players with at least 300 batted balls, the standard deviation in average hSOB for batters was 3.2 mph, and for pitchers it was 1.8 mph.

Here are the pitchers with the lowest and highest average hSOB allowed in 2008, after applying the regression toward the league average:

 

| Pitcher | Batted Balls | Observed hSOB (mph) | Regressed hSOB (mph) |
|---|---|---|---|
| Mariano Rivera | 155 | 60.6 | 67.1 |
| Carlos Marmol | 175 | 63.7 | 68.1 |
| Carlos Zambrano | 570 | 67.2 | 68.4 |
| Craig Breslow | 125 | 63.2 | 68.4 |
| Sean Marshall | 186 | 64.9 | 68.5 |
| Craig Hansen | 130 | 63.6 | 68.5 |
| Jake Peavy | 414 | 67.0 | 68.6 |
| Luis Vizcaino | 123 | 63.5 | 68.6 |
| C.C. Sabathia | 624 | 67.8 | 68.7 |
| Chris Young | 252 | 66.4 | 68.7 |
| League Average | | 70.9 | |
| Jeremy Bonderman | 206 | 75.9 | 73.1 |
| Josh Rupe | 257 | 75.3 | 73.1 |
| Mike Mussina | 570 | 74.1 | 73.1 |
| Miguel Batista | 385 | 74.6 | 73.1 |
| Kyle Kendrick | 530 | 74.4 | 73.2 |
| Kevin Correia | 380 | 74.9 | 73.2 |
| John Lackey | 481 | 74.5 | 73.2 |
| Daniel Cabrera | 528 | 74.7 | 73.4 |
| Carlos Silva | 548 | 75.0 | 73.6 |
| Livan Hernandez | 662 | 75.1 | 73.9 |


To look further into the question of who controls the speed of the ball off the bat, I performed a multivariate regression comparing the hSOB of each of the 102,000 batted balls in the sample to the regressed average hSOB for the batter and pitcher involved, where the batter and pitcher each had at least 100 batted balls. The best prediction for the horizontal speed of the ball off the bat comes from weighting the pitcher’s regressed average hSOB by 1.83 and the batter’s regressed average hSOB by 1.20.
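A regression of this shape can be sketched with ordinary least squares on two predictors. The code below is an illustration on synthetic data, not the actual fit: the intercept, the centering on the league average, the 22 mph per-ball scatter, and the 20,000-row sample are all assumptions, and the weights of 1.83 and 1.20 are planted in the fake data so the fit has something to recover.

```python
import random

def solve_linear_system(a, b):
    """Solve a·x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, ys):
    """Least-squares fit of y = b0 + b1*x1 + b2*x2 via the normal equations."""
    design = [[1.0] + list(row) for row in rows]
    k = len(design[0])
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * y for d, y in zip(design, ys)) for i in range(k)]
    return solve_linear_system(xtx, xty)

LEAGUE_AVG = 70.9
rng = random.Random(7)
rows, ys = [], []
for _ in range(20000):
    # Predictors: each player's regressed average hSOB, centered on the league average.
    pitcher_dev = rng.gauss(0.0, 1.08)
    batter_dev = rng.gauss(0.0, 2.76)
    expected = LEAGUE_AVG + 1.83 * pitcher_dev + 1.20 * batter_dev
    rows.append((pitcher_dev, batter_dev))
    ys.append(rng.gauss(expected, 22.0))  # single batted balls are very noisy

intercept, pitcher_weight, batter_weight = fit_ols(rows, ys)
print(f"intercept={intercept:.1f}, pitcher weight={pitcher_weight:.2f}, "
      f"batter weight={batter_weight:.2f}")
```

Because any single batted ball scatters widely around its expected speed, the fitted weights on synthetic data land near, not exactly on, the planted values.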

However, the spread (standard deviation) of the batters’ regressed average hSOB of 2.76 mph is wider than the spread of the pitchers’ regressed average hSOB of 1.08 mph. Thus, we can estimate that the batter’s average hSOB has about (2.76*1.20) / (1.83*1.08) = 1.7 times as much influence on the resulting hSOB of the batted ball as does the pitcher’s average hSOB.

To put it another way, the pitcher’s average quality of contact is more predictive of the quality of contact on a given batted ball than is the batter’s average quality of contact. However, the average quality of contact varies much less among pitchers than it does among batters in major-league baseball. As a result, the identity of the batter is more important in determining the resulting quality of contact than the identity of the pitcher, at least to the extent that we can determine it with these statistical techniques.

I also performed a similar regression comparing the hSOB of the 40,000 batted balls in the sample to the observed average hSOB for the batter and pitcher involved, where the batter and pitcher each had at least 300 batted balls. The results are similar. For that sample, the best prediction for the horizontal speed of the ball off the bat comes from weighting the pitcher’s observed average hSOB by 1.04 and the batter’s observed average hSOB by 0.99. The spread of the batters’ observed average hSOB of 3.16 mph is wider than the spread of the pitchers’ observed average hSOB of 1.77 mph. Thus, we can estimate that the batter’s average hSOB has about (3.16*0.99) / (1.04*1.77) = 1.7 times as much influence on the resulting hSOB of the batted ball as does the pitcher’s average hSOB.
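Both influence estimates follow the same arithmetic: multiply each side's regression weight by the spread of that side's averages, then take the ratio. A short check of the two calculations, with a helper name invented for illustration:

```python
def influence_ratio(batter_weight: float, batter_sd: float,
                    pitcher_weight: float, pitcher_sd: float) -> float:
    """Batter influence relative to pitcher influence on a batted ball's hSOB."""
    return (batter_weight * batter_sd) / (pitcher_weight * pitcher_sd)

# Sample with at least 100 batted balls per batter and pitcher
print(round(influence_ratio(1.20, 2.76, 1.83, 1.08), 1))  # 1.7

# Sample with at least 300 batted balls per batter and pitcher
print(round(influence_ratio(0.99, 3.16, 1.04, 1.77), 1))  # 1.7
```

The weights differ a good deal between the two samples, but weight times spread comes out nearly the same, which is why both ratios round to 1.7.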

I tried the same regression using pitcher strikeout rate per plate appearance as an additional independent variable, but it had virtually no additional explanatory power in the model (p-value of 0.47).

It is probably possible to build a more sophisticated model to predict batted ball speed based upon batter and pitcher characteristics. However, this simple model suggests that the batter has about 1.7 times as much influence on the quality of contact as does the pitcher. A major-league pitcher does not only control whether he gets ground balls or fly balls; he also has a significant degree of control over how hard the ball is hit, though the batter has somewhat more control over the quality of contact than the pitcher. I consider this an extremely significant finding.

Given what we know about DIPS and the unreliability of pitcher BABIP, this conclusion may surprise some. However, let me quickly clarify two points.

First, I have not excluded home runs from the analysis to this point. Removing home runs was a construct, and an illuminating one, that McCracken chose to make DIPS work. However, if we wish to discuss quality of contact, it would be arbitrary and incorrect to remove many of the hardest-hit balls from the sample. We have access to data that was not available a decade ago; thus, we can look at the quality of contact more directly. This analysis is independent of the fielders by virtue of looking at the batted ball speed rather than by segregating by batted ball outcome.

Second, batter and pitcher split-half hSOB correlations are basically unchanged if home runs are excluded from the analysis.

It is possible to conduct a similar analysis with an eye toward better understanding BABIP. The causes of batted ball results are complex and interdependent, but in the second part of this study, I will sketch out some preliminary findings on that topic.

Thanks to Sportvision and MLBAM for providing the HITf/x data for the study. Thanks to Colin Wyers for his input and feedback. Thanks also to Brian Mills and Dave Studeman for their assistance.

jrfukudome
11/16
Awesome.
sam19041
11/16
Love it! Way to go, KS (Mike)!!!

Btw, doesn't your IP minimum skew results? Only "successful" pitchers will have the chance to pitch that many innings. Those who allow harder hit balls might be weeded out (though maybe that's an accurate take on MLB).
sam19041
11/16
Maybe 300 batted balls is a low enough threshold that it doesn't matter...
mikefast
11/16
Thanks, Sharky! As far as batter/pitcher control over the result is concerned, I looked at that both at a 100 batted ball and 300 batted ball threshold and found similar results.

As for those who hit weaker balls (as batters) or allow harder-hit balls (as pitchers), I wouldn't be surprised if they get weeded out earlier, perhaps very quickly for the many fringe-MLB quality players who only get a brief chance to establish themselves. Tom Tippett's study of BABIP indicated as much. There is, of course, a selective sampling issue in that future playing time is allocated partly based upon the past outcomes for that player as opposed to their actual skill (we learn their skill partly from their outcomes).

I'd probably need multiple seasons of HITf/x data, or minor league HITf/x data, before I could tease out that effect better than was done in Tippett's study, for example.
meanwhoogean
11/16
Great read. Thanks for info. Love this stuff.
metty5
11/16
Mike, fantastic work.
piraino
11/16
Wow, tremendous work. Are any foul balls included in this analysis?
mikefast
11/16
Thanks. Foul balls are not included. I wish that Sportvision and/or TrackMan would track data on foul balls, too. I believe that data has almost as much analytical value as the data for fair balls.
hotstatrat
11/16
This may be irrelevant to the study, but how did Edwin Encarnacion hit 26 homers in 2008 with the ball averaging only about 65 mph off of his bat?
mikefast
11/16
The short answer is lots of fly balls (which produced the home runs) and popups (which cut down on his average hSOB).
brownsugar
11/16
Great work, Mike.

And Kudos to Sportvision for supplying the data. I understand they have a business to run, so I completely understand why Hit f/x is not publicly released. Hopefully a model like this could work where they provide back-data from previous years to the public.

Better to get to the party late than never get there at all. Even if Andrew Friedman already ate all the chips.
bmmillsy
11/16
Mike,

Really cool stuff. Thinking a bit more from our contact before, I think the next step here would be to look at within-pitcher variation in hSOB as well. Across pitcher variability gives more of an idea of the spread in talent, while within pitcher variation (with significant regression to the mean) might give more insight to ability to control the outcome. I think that you do address this with the split-half correlations, but I'd love to see Observed and Regressed standard deviations for the players in the tables as well to get an idea of the spread as you do at the aggregate level.

Really awesome stuff.
mikefast
11/16
Thanks, Brian. I looked at within-pitcher variation in hSOB, and I found something I didn't understand. The standard deviation in hSOB tends to go down as average hSOB goes up. That was true somewhat for batters but especially for pitchers. (SD on the order of 20-25 mph).

The only thing I could think was that there's practically/physically an upper limit on hSOB around 100-110 mph that is closer to the mean than is the lower limit at 0. Also, the distributions are typically not normal (peak above average with a long lower tail), so I don't know how well standard deviation describes the distribution in that case.
jorens
11/16
Can you explain hSOB a little more? Is that a HitF/X measurement or a number derived from normal SOB? Did you use it because you considered it to be a better representation of reaction time?
mikefast
11/16
HITf/x measures the speed of the ball and its direction. From that, it is easy to calculate the various components of the batted ball speed.

For example, take a fly ball that is in the air for four seconds before it is caught at the 375-ft sign against the left-center field wall. Ignoring the effect of drag that would have slowed the ball slightly, it traveled 375 feet horizontally in four seconds, for a speed of 375/4 = 93.75 ft/sec (equivalent to 63.9 mph).

HITf/x doesn't measure the whole flight of the ball, just the initial portion, but the idea is the same.

Take another example, a popup that is skied over the infield and caught half way down the third base line after seven seconds in the air. The popup may have come off the bat going really fast, maybe 70-80 mph, but most of that speed was vertical. The horizontal component of the speed was much less. Again ignoring air resistance effects, the ball went only 45 feet horizontally in 7 seconds, and 45/7 = 6.4 ft/sec = 4.4 mph.

The horizontal component of the speed tells you more about how solidly the ball was hit than the total speed (including the vertical component). It also tells you more about how difficult the ball was to field because it tells you how quickly the ball got to or past the fielder (how long they had to react, as you said.)
studes
11/16
Interesting, Mike. From a physics perspective, does it make intuitive sense that the pitcher would have more influence on the speed of a batted ball, since he initiates the pitch and the batter reacts? I wonder what the correlation between pitch speed and hSOB is?

Also, the next question that would occur to me is whether the batter/pitcher interaction has an impact on Batting Average. That is, batting average may not follow the line graph published above on a batter/pitcher specific basis. I assume that's something you'll touch on in the next piece?
mikefast
11/16
Dave, it makes intuitive sense to me that both the pitcher and the batter have some influence over the quality of contact. I'm not sure it's possible to intuit accurately who would have more influence. The pitcher controls the location of the pitch and which way it's moving, which limits the possibilities that the batter has, but the batter is the one who actually swings the bat and determines how the bat contacts the ball. I don't know any way other than observation to determine which one is more important.

Correlation between pitch speed and hSOB is not strong, at least not at major league game velocities. Pitch types and locations make a bigger difference than pitch speed itself. That's not to say that fastball speed has no effect, but it's a lesser effect, and it's not trivial to disentangle from pitch movement and from selective sampling effects (i.e., pitchers that throw slower are in MLB because they are above average at other things).

I'm not planning to directly address your last question in the next piece. It's something I've previously investigated from the April 2009 HITf/x data, but I don't intend to publish the results from the batter-pitcher model I developed from that.
kantsipr
11/16
It might be interesting to divide the pitcher's influence into fastball or average pitch velocity and everything else. For good contact, the faster it comes in, the faster it goes out.
mikefast
11/16
It's true that if the ball strikes the bat in exactly the same way, the faster it came in, the faster it will go out. A two-mph increase in incoming pitch speed will result in a little less than a one-mph increase in outgoing batted ball speed.

However, it also seems to be true that the faster the pitch comes in, the harder it is for the batter to square up the bat on the ball.

These two effects seem to roughly cancel out in the MLB population of batters and pitchers, though the latter effect may be somewhat more important.
a-nathan
11/18
Actually a 2 mph increase in pitch speed will result in an ~0.4 mph increase in batted ball speed for a ball hit squarely on the sweet spot. Bat speed matters much more in determining batted ball speed.
myshkin
11/17
Mike, great stuff. You mention that pitch types and location make a big difference; what is the nature of this variation?
mikefast
11/17
I don't think I can say any more than I said about that at this point. Sportvision gave me the data under NDA, and I don't want to go beyond the bounds of what I told them I would write about without getting permission from them.

I'll just say that it's not inconsistent with what you can find about the effect of pitch types and location on BABIP from the public PITCHf/x data.
padresprof
11/16
FYI: For some time the Rays have been using the measurement of the ball's speed off the bat as the metric for a player's batting ability. It appears that MLB is finally discovering that baseball is an exercise in biophysics. When will the Rays lose their extra 2%?
Oleoay
11/16
When I read this, I think about the "straight, flat, 100 mph fastball" idea and the idea you suggest that the batter controls quality of contact while the pitcher controls how hard the ball is hit.

Is it fair to say that, the slower a pitcher throws, the more control they have over "how hard the ball is hit" ala Moyer/Hough/knuckleballers? Can that argument be extended to the importance of having a quality changeup or offspeed pitch?
mikefast
11/16
No, that's not what I'm saying. How hard the pitcher throws has very little to do with how hard the ball is hit, at least in MLB. (It may have a little bit to do with it, but to the extent that it does, it appears that the harder the pitch, the slower the resulting batted ball.)

The pitcher and batter BOTH control quality of contact. The batter has a little bit more control over that than the pitcher, but the pitcher has a lot more control than people have thought since the acceptance of DIPS.

The pitcher presumably controls the quality of contact by deceiving the batter as to where and when he should swing.

Mo Rivera is one of the best, probably THE best, in baseball at this, and he throws hard. But he locates extremely well, and this makes it difficult for the batter to make solid contact with the ball.
lentzner
11/16
I suspect that the harder a pitcher throws, the more a batter has to cut back on his swing to keep up with the pitch. For example, if you had a pitcher who pitches 4 mph faster than average and that made the batter's swing 4 mph slower to compensate, the net loss in batted ball speed would be 2 mph.

That would be a way you could have slower batted ball speeds with a hard throwing pitcher.

Deception (which pitch speed influences) would also cause the same effect. The longer a batter needs to see a pitch to id it, the less time he has to gear up his swing.
a-nathan
11/18
Interesting observation. In some measurements we did a number of years ago, we found that elite slow-pitch softball players were routinely having bat speed in excess of 90 mph. If someone were to swing at an MLB pitch with that bat speed and made good contact, the ball would travel a very long distance--much farther than we probably have ever seen in MLB. My own benchmark for swing speed in MLB is about 70 mph, which is a lot less than achieved by these softball players. So, I am completely agreeing with what Matt is saying.
lentzner
11/16
Mike,

So glad you were finally able to do this analysis.

Awesome work, as usual.

Matt
jparks77
11/16
Fantastic work, Mike.
juiced
11/16
Terrific article, thank you.
hmckay
11/16
Mike, I note that the lower left hand corner of your hitters scatter has a number of hitters that often attempt to bunt for hits. These are balls in play, but they are not attempting to hit them hard. What proportion of batted balls are bunts? Is it non-trivial? Secondly, is this analysis also pointing back towards the importance of line drive percentages, which seem to have been assumed away due to their inconsistency year-to-year? I note that Chipper Jones, Ethier, Cabrera and Pujols and Joe Mauer are at the top of the sample in terms of hit speed, and they hit a lot of line drives - some of which go out.
mikefast
11/16
I did not remove bunts from this analysis, though in retrospect that would have been a good idea.

Bunts make up about 2 percent of batted balls in MLB, and a large portion of those are by pitchers, but for a few batters it's much more significant.

Taveras, for instance, had 12 percent of his batted balls as bunts in 2008, and Bonifacio 11 percent.

bobbygrace
11/17
I excitedly explained this article to my wife, and she said, "Baseball is the nerdiest sport ever." So true!
bobbygrace
11/17
Also, I wanted to thank you for the general introduction to DIPS. It was a great refresher and the best such general introduction I can recall seeing on Baseball Prospectus.

It would be wonderful if BP had a page giving an introduction to advanced metrics. My dad loves baseball and is not shy of numbers, but he didn't end up using the gift subscription I gave to him very often because the advanced metrics daunted him. I tried to send him introductory-type articles when they appeared, but really most articles on the site assume a significant level of familiarity with advanced metrics. That's okay and necessary -- you can't explain the genesis of BABIP every time you want to talk about it, and there are the glossary definitions -- but I would bet that my dad's not the only newcomer to the site who felt like he'd just never be able to "get it" and gave up.

I would also bet that for many other readers who do stick with the site, even among those of us who visit the site regularly, advanced metrics still feel like a second language, which we speak with varying degrees of facility and in which we still don't feel completely fluent. A section of the site devoted to a general introduction to advanced metrics would be great for those of us who never systematically studied advanced metrics (and would like not to have to leave this site to do so -- call me lazy!).

Thank you again for a terrific article. It's exciting new stuff, and a great encapsulation of the foundational concepts of DIPS.
bobbygrace
11/17
P.S. Regarding the first paragraph of the above: I've noticed, Mike, that you give careful attention to the introductory material in many of your other pieces. Thank you for continuing to lay the groundwork so methodically and clearly when you present your findings.
mikefast
11/17
Thanks, Bobby.

It's always helpful for me to review the past literature on a topic when I am studying it, and I also think I owe the reader and the previous researchers a mention of the work that I am building on.
mcesare
11/17
Is there any analysis that can be done to look at how pitchers vary the pitch speed from pitch to pitch and its role on hSOB? I'm thinking that the data might show better pitchers are changing speeds more and also thinking about the frequently stated hypotheses that an off-speed pitch in the range of 10 mph slower than a fastball is an ideal difference in speed?
mikefast
11/17
Yes, the HITf/x data should be illuminating in that regard. We already know something about the most effective speed range for changeups (See Dave Allen's piece here.), but we don't know nearly as much as we could about the how and why.

We also don't know much about whether it is helpful for a pitcher to vary the speed on his fastballs by 2-3 mph. That turns out to be a very difficult question to study properly because speed changes are related to differences between four-seam and two-seam fastballs (and in some cases pitchers use cutters as fastballs, too). Those pitch types tend to be used in different locations, different ball-strike counts, etc., which complicates the analysis.
pjbenedict
11/18
Astounding. If you're not hired away by a team in the near/immediate future, baseball is still more broken than we hope. This piece and your catcher piece are the most interesting new baseball research I've read in quite some time, with the most significant application for valuing players effectively.
rrvwmr
11/18
Next article...What are the correlations of hSOB and BABIP to these 4 factors: horizontal pitch location, vertical pitch movement, change in speed from the previous pitch, and change in location from the previous pitch. I'd do this myself, but this data is located at Fort Knox, or possibly "in the computer" next to Selig's pillow. I'd assume these all have a statistically significant correlation to hSOB/BABIP and are probably a fairly decent indicator of pitcher success when comparing pitchers w/ similar fastball velocity and walk rate.
mikefast
11/18
My next article is actually on how hSOB and vertical launch angle interact to affect BABIP. It's important to understand that before moving on to determining what the pitcher and batter are doing that determine hSOB and VLA.

I agree, though, that the kind of thing you mention is where we are headed with this, though I don't know that a linear regression is the best tool for the job. I prefer to develop a physical model for what is happening, if I can. Linear regression can play a supporting role in that process, but ultimately, we want to know why and how the batter and pitcher do what they do.