In the first part of this study, I used detailed batted ball speed information from HITf/x to examine the degree of skill that batters and pitchers had in quality of contact made or allowed. Here, I will look deeper into the question of why some batted balls fall for hits and others do not.
The Defense Independent Pitching Statistics (DIPS) developed by Voros McCracken and subsequent research by others had led much of the sabermetric community to conclude that pitchers have little control over the quality of batted ball contact that they allow beyond the ability to influence the vertical launch angle of the ball. In other words, pitchers primarily control their batted ball results by getting ground balls or fly balls, but not by controlling how hard the ball is hit.
However, by using detailed HITf/x data provided by Sportvision and MLBAM from the 2008 season, I found that a major-league pitcher does not only control whether he gets ground balls or fly balls; he also has a significant degree of control over how hard the ball is hit. I used the batted ball speed in the plane of the playing field as the measure for quality of contact. I found that the best prediction for the speed of any given batted ball was influenced about 1.7 times as much by the batter’s typical batted ball speed as by the pitcher’s typical allowed batted ball speed, largely because there is wider variety in average batted ball speed among major-league batters than among major-league pitchers.
How does this finding square with the previous studies on DIPS and pitcher control over batted ball results? There are two things that primarily influence whether a batted ball will become a hit. One is how hard the ball is hit, and the other is the direction in which it is hit, both vertically and horizontally. Batted ball classifications do a decent job of capturing information about the vertical launch angle of the ball. The success of batted-ball-based pitching metrics is due to the importance of vertical launch angle in determining whether batted balls result in hits. However, batted ball classifications do not capture very much information about the speed of the batted ball.
In the previous article, I reported on the split-half correlation in average horizontal speed off the bat (hSOB) for batters and pitchers in the 2008 season. I included home run balls in that analysis. If home runs are removed, the split-half correlations do not change much. The correlation coefficient is r=0.77 for batters, with an average of 196 balls in play in each half of the sample, and r=0.50 for pitchers, with an average of 244 balls in play in each half of the sample.
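A split-half correlation of this kind can be sketched in a few lines of Python. This is only a minimal illustration: the function names, the random assignment of each batted ball to one half, and the dictionary layout (player mapped to a list of hSOB values) are assumptions made for the example, not a description of the exact procedure applied to the HITf/x data.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def split_half_r(speeds_by_player, seed=0):
    """Randomly split each player's batted balls in two, then correlate
    the players' average speed in one half with their average in the other."""
    rng = random.Random(seed)
    first, second = [], []
    for speeds in speeds_by_player.values():
        shuffled = list(speeds)
        rng.shuffle(shuffled)
        mid = len(shuffled) // 2
        if mid == 0:
            continue  # need at least two batted balls to split
        a, b = shuffled[:mid], shuffled[mid:]
        first.append(sum(a) / len(a))
        second.append(sum(b) / len(b))
    return pearson_r(first, second)
```

A higher value means that a player's average in one half of his batted balls tells us more about his average in the other half, which is the sense in which the correlation measures a repeatable skill.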
How well does average hSOB correlate to a player’s batting average on balls in the field of play (BABIP)? It performs about equally well for batters and pitchers, with a correlation coefficient of r=0.41 for batters and r=0.45 for pitchers, for batters and pitchers with at least 300 balls in play in 2008.
How hard the ball is hit is clearly an important determining factor for BABIP, both for batters and pitchers, but it is not the only factor. What else could affect BABIP? Vertical launch angle is certainly important, and we will discuss it further in a moment. Let’s explore some other possible factors first.
In 2002, McCracken observed that a pitcher’s strikeout rate could be used to help predict his future BABIP, and J.C. Bradbury confirmed that finding in 2005. Bradbury offered a plausible explanation: “The fear of strikeouts possibly induces hitters to take weaker protective swings to stay alive, and thus yields softer hits that are more likely to result in outs.”
In fact, we do find that pitchers who strike out more batters also yield softer batted balls. A pitcher’s strikeout rate and his average allowed hSOB on balls in play had a correlation coefficient of r=0.44 for the 135 pitchers who allowed at least 300 balls in play in 2008.
It is unclear from a single season of data whether pitcher strikeout rate is predictive of BABIP allowed other than through its correlation with the speed of the batted ball. A linear regression between 2008 strikeout rate and BABIP allowed did not find a significant relationship, either with or without hSOB included.
One other potential factor in a pitcher’s BABIP allowed is the quality of the fielders behind him. Logically, it makes sense that his team’s fielding would show up in his BABIP allowed. Statistically, it is harder to show, particularly in recent years. The relationship between a pitcher’s BABIP allowed and the BABIP allowed by the rest of the pitchers on the same team in a given season is fairly weak, with a correlation coefficient of r=0.18 for pitcher-seasons with at least 300 balls in play from 1993-2011.
Also, some of the correlation between the pitcher’s BABIP and team BABIP is likely due to park effects rather than fielding performance.
Moreover, some of the credit for team BABIP goes to the other pitchers on the team rather than to the fielders. From the 2008 HITf/x data, we find that team BABIP correlates fairly well with the average team hSOB allowed.
The harder the ball was hit, on average, off of the team’s pitchers, the fewer balls the fielders turned into outs. (The BABIP-allowed numbers are based upon the batted balls in the HITf/x data set. There are some balls missing from the data, which means that the BABIP allowed will not exactly match the full-season totals.)
Despite having a better BABIP allowed, the Chicago White Sox may have actually had a worse fielding performance in 2008 than the Texas Rangers. Though the White Sox turned many more balls into outs, the balls they had to field were hit two miles per hour slower on average.
Splitting the credit for BABIP between pitchers and fielders is not simple, and it is a topic for further research. However, approaches that consider all balls in play to be of equal fielding difficulty assign a large chunk of the pitcher’s performance to the fielders.
We have seen that hit ball speed has a significant effect on the batter’s chance of getting a hit. Another significant factor is the vertical launch angle (VLA) of the batted ball.
BABIP peaks at a VLA of about 12 degrees, with about 80 percent of the balls struck at that angle falling safely in the field of play. Balls launched at zero degrees (flat initial trajectory) or at 25 degrees (low fly ball) fall in only half as often, though as launch angles approach 30 degrees, nearly 20 percent of batted balls leave the field as home runs.
In the first part of this study, I showed a graph of batting average as a function of batted ball speed in the plane of the playing field. It is also interesting to look at the batted ball speed in the plane inclined by 12 degrees above the horizontal, the launch plane for which the ball is most likely to become a hit.
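One way to picture the speed in the 12-degree plane is as the component of the ball's velocity along a direction tilted 12 degrees above horizontal. The sketch below uses that reading, speed times the cosine of the difference between the launch angle and 12 degrees. That projection formula, the function names, and the 375-foot fence distance in the usage note are assumptions for illustration; the article does not spell out the exact calculation.

```python
from math import cos, radians, tan

PLANE_ANGLE_DEG = 12.0  # launch angle at which BABIP peaks, per the article

def speed_in_plane(speed_mph, vla_deg, plane_deg=PLANE_ANGLE_DEG):
    """Component of batted ball speed along a direction inclined plane_deg
    above horizontal (assumed projection: speed * cos(VLA - plane angle))."""
    return speed_mph * cos(radians(vla_deg - plane_deg))

def aim_height_ft(distance_ft, plane_deg=PLANE_ANGLE_DEG):
    """Height of the point distance_ft away along the inclined direction."""
    return distance_ft * tan(radians(plane_deg))
```

Under this reading, a 70 mph ball launched at 12 degrees keeps all 70 mph, while a 70 mph ball launched at 80 degrees keeps well under 30 mph. A hypothetical fence 375 feet away corresponds to a point about 80 feet above it, since `aim_height_ft(375)` is a little under 80.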
Let’s examine four regions of the preceding graph in more detail, each corresponding to roughly a quarter of all batted balls.
First, there is the region with an initial speed off the bat of less than 60 mph in the 12-degree plane. The overall batting average in this region is a paltry .095. Few of these balls make it out of the infield. The chances of a hit rise to about 20 percent if the batter hits the ball hard enough to get it away from the catcher but not so hard that it gets quickly to an infielder. This region also includes a large number of popups that almost always turn into outs.
Let’s look at the batted ball fielding locations from MLBAM Gameday data for the balls hit at less than 60 mph in the 12-degree plane during the 2008 season.
Almost all of these balls stay on the infield, or in foul territory, or barely make it through to the edge of the outfield. Hits are most likely to occur when the third baseman fields a shallow dribbler. None of these balls is well-hit, so we would consider that the pitcher did his job for batted balls in this region. (Note that these are fielding locations, so a few balls may roll after landing until an outfielder picks them up. Also, the data may have a handful of errors that I have not scrubbed.)
Next, here are results for the balls that were hit between 60 and 80 mph in the plane 12 degrees above horizontal.
Balls hit harder than 60 mph in the 12-degree plane start to make it out of the infield in appreciable numbers, mostly in the air but also to a lesser extent on the ground. On the ground, most of these balls are turned into outs, but about 10 percent of them sneak between the infielders and another 6 percent result in infield hits.
In the air, most of the balls make it over the infielders, but balls hit with a launch angle higher than about 30 degrees are easy catches for the outfielders. Roughly a quarter of the air balls in this group fall in for hits. A few balls scoot down the foul lines for extra bases, but mostly these balls are not hit hard enough to find the gaps between the outfielders.
Hitting the ball harder between 60 and 80 mph does not result in much increase in batting average because the extra ground balls that make it between infielders are offset by extra air balls carrying to the outfielders. Pitchers who give up batted balls in this group have mostly done a good job and have about an 80 percent chance of getting an out. However, their chances of success are very sensitive to the vertical launch angle, with a window of VLA between 10 and 26 degrees where a ball has a 70 percent chance of falling between the infielders and outfielders for a hit.
Here are the locations of batted balls hit between 80 and 90 mph in the 12-degree plane.
Once the batted ball speed in the 12-degree plane exceeds 80 mph, over two thirds of the balls make it out of the infield, and batting average rises as the ball is hit harder. Ground balls start to find their way between infielders with greater frequency, with 28 percent making it to the outfield and another five percent resulting in infield hits. Air balls start to find the gaps between outfielders and down the lines, and some sneak over the fences in the corners. Most air balls, however, are still not hit hard enough to land over the heads of the outfielders.
Finally, here are the locations of batted balls hit harder than 90 mph in the 12-degree plane.
Nearly half of ground balls hit harder than 90 mph result in hits. Infielders are still able to scoop up the hard grounders hit close to them, but many make it through to the outfield. The majority of the balls hit this hard, though, are in the air. Many of these air balls find the gaps or sail over the outfielders’ heads, and 18 percent of them are home runs. Of those air balls that stay in the park, 64 percent result in hits.
Here is a summary of the results in the four regions of hSOB in the 12-degree plane.
| hSOB-12 deg. (mph) | Ground Balls | % GB Infield Hits | % GB Outfield Hits | Air Balls | % Air Ball Hits | % Air Ball Home Runs |
| <60 | 19160 | .081 | .023 | 12860 | .082 | .000 |
| 60-80 | 11728 | .056 | .105 | 20393 | .255 | .000 |
| 80-90 | 9984 | .049 | .276 | 16228 | .364 | .045 |
| >90 | 12221 | .036 | .443 | 21490 | .708 | .178 |
Of the 40,364 hits tracked by HITf/x in the 2008 season, 52 percent of them were air balls that were hit harder than 80 mph in the 12-degree plane (including the 11 percent of hits that were home runs). Another 23 percent of the hits were ground-ball smashes, also hit harder than 80 mph in the 12-degree plane. But the remaining 25 percent of the hits were of the relatively cheap (or lucky) variety: 15 percent were bloops over the infield, six percent were seeing-eye grounders that were hit softer than 80 mph but managed to find a hole, and four percent were bleeders, ground balls that weren’t hit hard enough to make it to an infielder for an out.
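The headline shares can be checked roughly against the summary table. The sketch below multiplies each ball count by its listed rate and divides by the 40,364 total hits. It assumes that the air-ball hit rate column includes home runs (consistent with 64 percent of in-park air balls over 90 mph being hits); the names are illustrative, and the results are only as precise as the rounded rates in the table.

```python
# Per region: ground balls, GB infield-hit rate, GB outfield-hit rate,
# air balls, air-ball hit rate (assumed to include HR), air-ball HR rate.
REGIONS = {
    "<60":   (19160, 0.081, 0.023, 12860, 0.082, 0.000),
    "60-80": (11728, 0.056, 0.105, 20393, 0.255, 0.000),
    "80-90": (9984,  0.049, 0.276, 16228, 0.364, 0.045),
    ">90":   (12221, 0.036, 0.443, 21490, 0.708, 0.178),
}
TOTAL_HITS = 40364
HARD = ("80-90", ">90")  # regions above 80 mph in the 12-degree plane

def hit_shares(regions=REGIONS, total_hits=TOTAL_HITS):
    """Share of all hits that were hard air balls, home runs,
    hard ground balls, or 'cheap' hits under 80 mph."""
    hits = {"hard_air": 0.0, "home_runs": 0.0, "hard_ground": 0.0, "cheap": 0.0}
    for name, (gb, gb_if, gb_of, air, air_hit, air_hr) in regions.items():
        gb_hits = gb * (gb_if + gb_of)
        air_hits = air * air_hit
        if name in HARD:
            hits["hard_air"] += air_hits
            hits["hard_ground"] += gb_hits
            hits["home_runs"] += air * air_hr
        else:
            hits["cheap"] += gb_hits + air_hits
    return {k: v / total_hits for k, v in hits.items()}
```

With the table's figures this gives roughly 52 percent hard air balls (home runs, a subset, are about 11 percent), 23 percent hard ground balls, and 25 percent cheap hits.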
How does the BABIP picture look if we focus only on the 75 percent of hits that were well-struck? Though batters hit .158 on any ball with a speed less than 80 mph in the 12-degree plane, they hit .497 on balls hit harder than 80 mph, .464 if we exclude home runs. If we assume that pitchers did their job any time a ball was hit less than 80 mph, and that a hit on such a ball was a matter of luck or poor fielding, we can credit the pitcher with .842 of an out for every such batted ball. If we combine that with the actual results on balls in play that were hit harder than 80 mph, we can calculate an adjusted BABIP for pitchers.
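As a minimal sketch of that calculation: each weakly hit ball is charged at the league rate of .158 hits (that is, credited with .842 of an out), and hard-hit balls in play count at their actual results. The function and parameter names are hypothetical, and the pitcher in the usage note is made up.

```python
WEAK_HIT_RATE = 0.158  # league average on balls under 80 mph, from the article

def adjusted_babip(weak_bip, hard_bip, hard_hits_in_play,
                   weak_hit_rate=WEAK_HIT_RATE):
    """Charge the league hit rate on weakly hit balls and the pitcher's
    actual hits allowed on hard-hit balls in play (home runs excluded)."""
    total = weak_bip + hard_bip
    if total == 0:
        raise ValueError("no balls in play")
    return (weak_bip * weak_hit_rate + hard_hits_in_play) / total
```

For a hypothetical pitcher with 300 weakly hit balls, 200 hard-hit balls in play, and 90 hits on the hard-hit balls, `adjusted_babip(300, 200, 90)` comes to about .275, whatever actually happened on his 300 weak balls.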
Such an adjusted BABIP metric has a better split-half correlation for pitchers with at least 300 balls in play in 2008, with a correlation coefficient of r=0.26, compared to r=0.13 for normal BABIP. In fact, if we further adjust the BABIP by also crediting the pitcher with .536 of an out for every ball in play hit harder than 80 mph, we can improve the split-half correlation coefficient to r=0.34. These pitchers had an average of 244 balls in each half of the sample, which means that we would regress their BABIP toward league average by adding in 464 balls in play at league-average BABIP. If we do that, here are the pitchers in 2008 with the best and worst regressed adjusted BABIP allowed.
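|  | Balls in Play | Weak BIP Pct. | Actual BABIP | Regr. Adj. BABIP |
|  | 151 | 77% | .225 | .282 |
|  | 606 | 62% | .300 | .285 |
|  | 399 | 62% | .283 | .288 |
|  | 552 | 60% | .268 | .289 |
|  | 512 | 61% | .270 | .289 |
|  | 232 | 64% | .310 | .289 |
|  | 258 | 63% | .287 | .290 |
|  | 613 | 59% | .284 | .290 |
|  | 127 | 69% | .244 | .290 |
|  | 549 | 60% | .299 | .290 |
| League Average |  | 54% | .300 | .300 |
|  | 431 | 48% | .332 | .308 |
|  | 513 | 48% | .288 | .309 |
|  | 128 | 40% | .305 | .309 |
|  | 249 | 45% | .293 | .309 |
|  | 335 | 45% | .334 | .310 |
|  | 507 | 47% | .323 | .310 |
|  | 538 | 47% | .301 | .311 |
|  | 554 | 47% | .314 | .312 |
|  | 531 | 46% | .343 | .312 |
|  | 638 | 43% | .345 | .318 |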
Once again, these numbers are based upon balls in play from 2008 for which we have HITf/x data. Ideally, we would use HITf/x data from multiple years to make this calculation more accurate, and Mariano Rivera might look even more outstanding than he does here.
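The regressed figures in the table can be approximated with a short sketch. It assumes the fully adjusted BABIP is the weak-contact share times .158 plus the remaining share times .464, then blended with 464 balls in play at a .300 league average. The `regression_constant` helper uses the common rule of adding n(1-r)/r league-average trials; that rule is an assumption about where the 464 comes from, and with the rounded r=0.34 it gives about 474 rather than exactly 464.

```python
WEAK_HIT_RATE = 0.158   # hit rate on balls under 80 mph, from the article
HARD_HIT_RATE = 0.464   # hit rate on balls in play over 80 mph, HR excluded
LEAGUE_BABIP = 0.300
REGRESSION_BIP = 464    # league-average balls in play added, per the article

def regression_constant(n, r):
    """League-average balls in play to add, given a split-half
    correlation r measured on halves of n balls each (assumed rule)."""
    return n * (1 - r) / r

def regressed_adjusted_babip(bip, weak_pct, k=REGRESSION_BIP,
                             league=LEAGUE_BABIP):
    """Fully adjusted BABIP from the weak-contact share, regressed toward
    the league average by adding k league-average balls in play."""
    adjusted = weak_pct * WEAK_HIT_RATE + (1 - weak_pct) * HARD_HIT_RATE
    return (bip * adjusted + k * league) / (bip + k)
```

Under these assumptions, 151 balls in play at 77 percent weak contact gives about .282, and 638 balls in play at 43 percent gives about .319, within rounding of the first and last rows of the table.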
Of course, one could develop an expected BABIP metric based upon something more sophisticated than a twofold division by batted ball speed in the 12-degree plane. One could also make a similar metric for batters.
What about line drives? Do pitchers control the amount of line drives they allow? As we discussed in the first article of this study, Bradbury and Gassko found from standard batted ball data that pitchers do not control their line drive rates. One season of HITf/x data is probably a better indicator than the standard batted ball data, but it is still not a big enough sample to come to a firm conclusion on the question. The number of line drives allowed by any given pitcher in one season is small and becomes smaller when split in half for correlation. However, the HITf/x data does offer some instructive insights on the nature of line drives.
First, we can divide line drives roughly into two categories. One category is balls hit hard in the air on a low trajectory; this makes up about three quarters of line drives. The other category is balls blooped over the heads of the infielders. If these balls were hit harder, they would turn into outs because the outfielders would catch them. This category makes up about one quarter of line drives.
The split-half correlation for pitcher line-drive rates, as best as it can be measured with this sample of data, is higher for the hard-hit category of line drives (r=0.19) than it is for the bloop-hit category of line drives (r=0.06). There is also some suggestion that a pitcher who gives up more of one type of line drive allows fewer of the other type. These conclusions beg for confirmation from a multiyear sample, however.
Another interesting way to look at the pitcher line drive rate question is to look at the batted ball characteristics of pitchers based upon how many line drives they allow. For example, we can divide the pitchers from 2008 into three groups based upon their regressed rate of hard-hit line drives allowed. Then, we can compare the average distribution of vertical launch angles for the batted balls allowed by these groups. (I defined hard-hit line drives as balls with a VLA between seven and 20 degrees and a speed of greater than 80 mph in the 12-degree plane.)
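The hard-hit line drive definition translates directly into a small classifier. In this sketch the thresholds come from the definition above, while the projection used for speed in the 12-degree plane (speed times the cosine of the launch angle minus 12 degrees), the inclusive angle bounds, and the function names are assumptions for illustration.

```python
from math import cos, radians

def is_hard_hit_line_drive(speed_mph, vla_deg):
    """VLA between 7 and 20 degrees and more than 80 mph in the 12-degree plane."""
    speed_12 = speed_mph * cos(radians(vla_deg - 12.0))  # assumed projection
    return 7.0 <= vla_deg <= 20.0 and speed_12 > 80.0

def hard_hit_ld_rate(batted_balls):
    """Share of (speed_mph, vla_deg) batted balls that are hard-hit line drives."""
    balls = list(batted_balls)
    if not balls:
        return 0.0
    return sum(is_hard_hit_line_drive(s, v) for s, v in balls) / len(balls)
```

A pitcher's raw rate from `hard_hit_ld_rate` would still need to be regressed toward the league rate before grouping pitchers, a step this sketch leaves out.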
The group that allowed the most hard-hit line drives, unsurprisingly, allowed the most batted balls with a vertical launch angle around +15 degrees, which is near the center of typical line drive launch angles. Since we defined the three groups based on line-drive rates, the difference in that VLA range follows by definition. However, what is somewhat more interesting is that the pitchers who allowed more hard-hit line drives allowed fewer groundballs.
A different way to present the same information is to look at the difference in the number of batted balls in each VLA bin between the group of pitchers with a high line drive rate and the group with a low line drive rate.
The pitchers who gave up the most hard-hit line drives allowed about two percent more of their batted balls in the VLA range of 10-15 degrees, about two percent more in the 15-20 degree bin, and almost one percent more in the 5-10 degree bin, as compared to the pitchers who gave up the fewest hard-hit line drives. In other words, the group of pitchers who allowed the most line drives had a line drive rate about five percent higher than the group who allowed the fewest line drives.
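A per-bin comparison like this one can be sketched as follows. The five-degree bin width matches the bins quoted above; the function names and the choice to key each bin by its lower edge are illustrative assumptions.

```python
from collections import Counter
from math import floor

def vla_distribution(vlas, bin_width=5):
    """Fraction of batted balls in each VLA bin, keyed by the bin's lower edge."""
    vlas = list(vlas)
    counts = Counter(floor(v / bin_width) * bin_width for v in vlas)
    return {edge: n / len(vlas) for edge, n in counts.items()}

def distribution_difference(high_group, low_group, bin_width=5):
    """Per-bin difference (high minus low) in the share of batted balls."""
    high = vla_distribution(high_group, bin_width)
    low = vla_distribution(low_group, bin_width)
    return {edge: high.get(edge, 0.0) - low.get(edge, 0.0)
            for edge in sorted(set(high) | set(low))}
```

Because each group's shares sum to one, the differences across all bins sum to zero: any surplus in the line-drive bins has to be offset by a deficit somewhere else in the distribution.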
The pitchers who allowed the most line drives also allowed fewer ground balls. Between half and one percent less of their total batted balls fell in each of the VLA bins between -30 and -5 degrees, as compared to the pitchers who allowed the fewest line drives.
I frankly am unsure what to make of this observation. Do “line-drive” pitchers really allow fewer groundballs? Does this mean that groundball pitchers allow fewer line drives?
To attempt to shed more light on the question, I split the batted balls randomly in half for all pitchers with at least 300 balls in play and grouped the pitchers by their rate of hard-hit line drives allowed in one half of the sample. Then, I looked at the VLA distribution for the pitcher groups in the other half of the sample.
The in-sample curve looks very similar to the previous graph, as expected. It is based upon the same data and a similar group of pitchers, but from half the sample instead of the full year. The out-of-sample curve shows how the pitcher groups performed relative to each other in the other half of the sample where I did not control for line-drive rate. The pitchers who allowed many hard-hit line drives in one half of the sample still allowed more line drives in the other half of the sample, though only about a fifth as many. They also allowed fewer groundballs. This seems to be a persistent feature.
What happens if we select our pitcher groups based upon rate of ground balls in the VLA range of -30 to 0 degrees (labeled GB* in the following graph), roughly the range where the line-drive pitchers show a dip in batted balls allowed?
These pitchers are preferentially groundballers, and they allow fewer line drives and fly balls in the half of the sample I used to select them. In the other half of the sample, they retain most of their groundball tendencies and practically all of their tendency to avoid fly balls. However, they do allow more line drives on low trajectories (with VLA between five and 15 degrees).
These results are in keeping with Brian Cartwright’s finding, based upon HITf/x data collected from April-June 2009 and recently published in The Hardball Times Baseball Annual 2012:
Due to the lower vertical angle of all balls off the pitches from groundball pitchers, balls hit in the air are more likely to be hits. This is the result of two things: air balls off groundball pitchers are more likely to be line drives, and balls categorized as “outfield flies” are more likely to be hits. All because of the lower vertical angle.
There is an alternative way to think about these batted balls that does not focus on the likely outcomes, which are based upon the dimensions of a baseball field and the typical positions of fielders. Instead, we can think about what constitutes square and solid contact with the baseball, independent of what this means about where the ball may land. Though it is more productive for a batter to hit slightly under the center of the ball, resulting in a line drive over the infielders, perhaps a batter hitting the ball squarely or just slightly over the center of the ball is also indicative of a pitch that was easy to contact solidly.
There is some evidence that batters use a slight upswing in order to have a better chance of contacting pitches that are descending due to gravity as they cross the plate. Dr. Robert Adair theorized in The Physics of Baseball that the optimum angle for a batter swing optimized for contact was about eight to 10 degrees above horizontal. Matt Lentzner and I found that the optimum batter performance occurred on pitches that were descending at about a seven-degree angle.
Perhaps the most solid contact by the batter is thus centered on a vertical launch angle of seven degrees? If so, we might get a better idea of whether a pitcher’s offerings are easy to square up if we include a larger region of solid contact centered on a VLA of seven degrees.
In order to investigate this hypothesis, I grouped the pitchers based upon their rate of solid contact allowed, defined as batted balls with a VLA between -10 and +25 degrees and an hSOB of at least 75 mph, in one half of the sample.
The solid contact rate does have some persistence from one split half of the sample to the other, similar to the persistence for line drive rate as typically defined.
We can firmly conclude that pitchers do have some persistent line-drive skill, or solid-contact skill. That skill appears to be related to the overall “shape” of their batted ball distribution.
The reasons that balls fall for hits are complex. We saw in the previous article that pitchers have significant control over how hard batted balls are hit. We saw here that the speed of the batted ball interacts with the vertical launch angle to determine whether the ball is likely to fall for a hit.
About a quarter of hits are of the blooper and bleeder variety. The other three quarters of base hits are hit harder, smashed between fielders or blasted over the heads of outfielders. Pitchers do not seem to have much control over whether the bloopers and bleeders turn into hits. However, they do have control over how many bloopers and hard smashes they allow. Further research is needed to determine why some pitchers allow more hard-hit balls than others do.
HITf/x data has revealed limitations of the current DIPS metrics based upon batted ball categories. HITf/x-based metrics offer a great deal of promise for improvement over DIPS-based metrics in pitcher performance evaluation. However, it would be interesting to know if current public batted ball data, such as that from Gameday, could be adapted to capture additional BABIP effects, based on the learning from HITf/x.
Further research is also needed to better understand and quantify what pitchers and batters are doing to create weakly hit or solidly hit batted balls. There are many potential avenues of investigation. Certainly, factors such as pitch location, type, and speed could be important. Batter stances, swing planes, and swing speeds could also be important factors, though they are more difficult to measure. Ultimately, a model of the ball-bat collision utilizing data about the incoming pitch trajectory and outgoing batted-ball trajectory could be the most useful explanatory tool.
Thanks to Sportvision and MLBAM for providing the HITf/x data for the study. Thanks to Colin Wyers for his input and feedback.
"It is also interesting to look at the batted ball speed in the plane inclined by 12 degrees above the horizontal, the launch plane for which the ball is most likely to become a hit."
In the data, you show an .800 BA on those balls. But in the hSOB graph for 12-degree balls, BA only reaches .800 over 100 mph. So I don't get what we're looking at here.
Take three batted balls, all leaving the bat at 70 mph. A high 70-mph popup has most of its speed going vertical, and will have a very low speed in the 12-degree plane, maybe 10 mph or so. A 70-mph line drive just out of the reach of the shortstop will have a speed in the 12-degree plane of about 70 mph. A ball pounded into the dirt in front of the plate at about 70 mph will have a low speed in the 12-degree plane, maybe also around 10 mph, depending on how steep of an angle it was hit at.
The tweak to the 12-degree plane is just an acknowledgment that it helps to get the ball over the head of the infielders, so a little bit of vertical speed is a plus (but not so much vertical speed that it's a popup or a can of corn to the outfield).
If you think of it simply as the measurement I made in the previous article, which was in the horizontal (0-degree) plane, you will still get the basic idea.
With the hSOB, it's how fast the ball travels toward the base of the outfield fence. With the 12-degree SOB, it's how fast the ball travels toward a point about 80 feet in the air above the outfield fence.
http://www.hardballtimes.com/main/article/thetruthaboutthegrounder/
This is exactly what people who have played with batted ball data (such as ground balls and line drives) have found, but you're looking at the underlying physical data instead of "observed" batted ball data.
Have you shown that your approach is better than the batted ball approach? I'm not doubting it, just wondering.
For any individual batted ball, the angle off the bat matters a lot, but for large samples this effect is very muted. There is still some effect based on the typical vertical launch angle, but don't give it equal credit with the speed of the ball.
Pitchers do have a linedrive skill (which is mainly where launch angle comes into play), and it's important to consider this for additional accuracy, but I don't feel like I understand the origin of that skill well enough yet to build it into a metric. I'm still working on that.
However, I did show how my approach worked for pitcher BABIP based upon dividing every batted ball into either hard-hit or weakly-hit. I did not compare to a batted-ball BABIP predictor. Which one did you have in mind? Something based upon Graham's tRA? Something based upon Brian's findings in the THT Annual?
Good stuff!!!
That's a very good suggestion, Alan.
I guess I love all the data and analysis, but I'm not sure what the takeaway is.
For most batted balls, more speed is a good thing. (That's not true for bloopers, which is a big reason BABIP is inconsistent for pitchers.) So if the pitcher can fool the batter and make him swing defensively or not get good wood on the ball, he's going to win out more often than not.
Launch angle matters a lot right around 12 degrees, where you're dropping it over the infielders' heads. For any individual batted ball, the launch angle matters more than speed. But there doesn't seem to be a skill for pitchers to be able to control launch angles within narrow windows. There is a lot I have yet to learn about this, but my impression is that the basic idea that has been advanced before is correct: that even if a pitcher gets lots of groundballs, he'll allow some hits in the line-drive range of angles, and that even if a pitcher gets lots of popups and harmless flies, he'll allow some hits in the line-drive range of angles. It's not true to conclude from that that pitchers have no control over their line drive fraction, but it's probably why they have limited control.
Even though Mariano Rivera gets way more weakly-hit balls than anyone else, and he also gives up fewer line drives than most other pitchers, he still gives up a fair amount of line drives.
Part of this has to be that a batter can hit a line drive even if he mistimes the pitch or misjudges its trajectory. If the timing error and spatial error work in opposite directions, the batter can still get solid wood on the ball.
Am I right in understanding that the best launch angle for contact is 12 degrees and the best launch angle for home runs is 30 degrees?
One (poorly framed) question I had: Are there some pitch types more likely to result in a 12 degree launch angle?
Your question is a good one. What I know about that from this data, I have not gotten permission from Sportvision to share. However, if you look at the BABIP data by pitch type, you can see that fastballs have the highest BABIP allowed.
For example, to make the best contact with a 12-6 curveball, you would launch at a higher angle. However, that presupposes that you will actually make contact like that.
For a submariner pitcher, you'd want a flat, to maybe negative launch angle.
So, if the pitch angle is coming in at say +15 to 45 degrees, your launch angle should be say from 0 to +30 degrees. Something like that.
Is that about right Mike (the general idea anyway, not necessarily the numbers)?
You could, of course, do a really high eephus lob that came basically straight down on the plate, but most pitches have to be thrown in that range I mentioned in order to make it across the plate at typical MLB game speeds.
The reason I ask is it makes sense to me from a softball perspective that the greater the arc of a thrown pitch, the easier it is to hit it at a high launch angle, which would suggest a higher chance of a pop up or home run. It makes sense that a hanging curveball is easier to hit for power than a good curveball. I just wonder what the difference is between a good and bad curve.
BABIP on low curveballs is .295 and on high curveballs is .301. HR/contact on low curveballs is 2.2% and on high curveballs is 3.8%. But the difference is similar to that between low pitches of all types (2.4%) and high pitches of all types (4.1%).
I did look into difference on performance on curveballs based upon how fast they were descending when they crossed the plate in some research that Matt Lentzner and I did for the 2010 PITCHf/x Summit. I'll have to dig that up and see if I found anything, but if it were dramatic, I think I'd remember.
Thanks!
Do you mean why Felix Hernandez isn't listed among the pitchers with best or worst adjusted BABIP allowed? His adjusted BABIP was right at the league average of .300 for 2008.
Or are you asking about something else?
Thanks, sorry I was going too fast, you basically answered what I didn't think through ...that he isn't listed due to being league average and doesn't show up. Of course he was playing in 2008, I'm looking forward to reading thoroughly, this evening.
Apologies!
2) LD rate is problematic because it includes a lot of bloopers. If you remove the bloopers and focus on the hard-hit line drives, LD rate has some persistence. Some of it is still luck, or perhaps due to other characteristics we don't fully understand (e.g., ground ball rate).
3) I don't know. The data is available here in the sortable stats for pitchers if you want to check.
4) I think I've linked to this or similar work before, but Jeremy Greenhouse answered that question here. John Walsh, Dave Allen, Harry Pavlidis, and others have probably also put out similar data.