Checking the Numbers: Much Ado About Liners

June 25, 2009

Line drives are the hardest-hit balls put in play, resulting in base knocks much more frequently than their fly-ball and ground-ball counterparts. In terms of exact figures, liners convert into hits 73 percent of the time, roughly three times as often as balls beaten into the ground and almost five times as often as those lofted in the air. When trying to estimate BABIP, using the expected values of success (73 percent for line drives, 24 percent for grounders, and 15 percent for fly balls) proves to be a more accurate methodology than the popular formula developed by Dave Studeman that adds .12 to a player’s line-drive percentage.
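As a rough illustration of the two estimators, here is a minimal Python sketch. The hit rates and the .12 constant come from the paragraph above; the function names and the sample batted-ball counts are hypothetical, and the counts are assumed to cover balls in play only (home runs excluded).

```python
# Expected hit rates by batted-ball type, as quoted in the article.
HIT_RATES = {"LD": 0.73, "GB": 0.24, "FB": 0.15}


def expected_babip(ld, gb, fb):
    """Estimate BABIP by weighting each ball-in-play count by its hit rate."""
    total = ld + gb + fb
    if total == 0:
        raise ValueError("need at least one ball in play")
    expected_hits = (HIT_RATES["LD"] * ld
                     + HIT_RATES["GB"] * gb
                     + HIT_RATES["FB"] * fb)
    return expected_hits / total


def ld_plus_12(ld, gb, fb):
    """The simpler estimate: line-drive percentage plus .12."""
    total = ld + gb + fb
    if total == 0:
        raise ValueError("need at least one ball in play")
    return ld / total + 0.12


# Hypothetical hitter (made-up counts): 80 liners, 180 grounders, 140 flies.
print(expected_babip(80, 180, 140))
print(ld_plus_12(80, 180, 140))
```

The expected-value version responds to the mix of grounders and fly balls, whereas the LD% + .12 version treats every non-liner the same.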

Using the expected values helps to differentiate between, say, the bloop popup that falls in between fielders for a double and the scorched liner that a third baseman snares on a dive, dealing with the probability of success instead of the actual results. Clearly, balls like the latter will add to hit totals at a higher rate than balls like the former, something that should be accounted for in any sort of estimation or expectation formula. Several issues arise when discussing line drives, however, involving correlation to overall performance, classification and park factors, and plate discipline with regard to the batter/pitcher matchup; this last point was brought up in the comments section of our look at Pedro Feliz and other substantial jumps in on-base percentage over the years.

Relative to overall performance, it stands to reason that those belting liners at higher clips will post better triple-slash stats. After all, if a hitter hovers around the normal batting average and slugging marks of .733 and 1.013 on liners, while scorching the ball more frequently than other players, he is expected to experience more success on balls in play, potentially translating into better all-around performance. But are line-drive rates stable for players, staying relatively consistent from year to year? And, piggybacking off that question, are success rates on liners consistent as well?

I have discussed intra-class correlations in this space several times before; they work much like a year-to-year correlation, but incorporate multiple seasons for each player as opposed to just two. Running the ICC first for overall line-drive rates from 2003-08 produced a correlation of 0.33, suggesting that the rate at which batters hit frozen ropes is moderately stable from year to year. When batting averages on line drives were placed under the ICC lens, a measly correlation of 0.14 surfaced, indicating that success rates on liners are not particularly consistent from one year to the next. The 0.043 standard deviation implies that roughly two-thirds of the players queried fell between .687 and .773, and that 95 percent would fall between LD-BAs of .644 and .816.
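For readers who want to see the mechanics, here is a minimal Python sketch of one common ICC variant, the one-way random-effects form for a balanced design (every player has the same number of seasons). The article does not say which ICC variant was run, so this choice is an assumption, as are the function names. The second helper reproduces the standard-deviation bands quoted above, which presume a roughly normal spread around a .730 mean.

```python
def icc_oneway(seasons_by_player):
    """One-way random-effects ICC for a balanced design.

    seasons_by_player: list of equal-length lists, one per player,
    each holding that player's rate in successive seasons.
    """
    n = len(seasons_by_player)
    k = len(seasons_by_player[0]) if n else 0
    if n < 2 or k < 2 or any(len(row) != k for row in seasons_by_player):
        raise ValueError("need >= 2 players, each with the same >= 2 seasons")
    grand = sum(sum(row) for row in seasons_by_player) / (n * k)
    means = [sum(row) / k for row in seasons_by_player]
    # Mean square between players and mean square within players.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2
              for row, m in zip(seasons_by_player, means)
              for x in row) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)


def sd_band(mean, sd, z):
    """Return (low, high) for a band of z standard deviations around mean."""
    return mean - z * sd, mean + z * sd


print(sd_band(0.730, 0.043, 1))  # the "two-thirds" band
print(sd_band(0.730, 0.043, 2))  # the "95 percent" band
```

An ICC near 1 means a player's seasons look alike relative to the differences between players; a value near 0 means one season tells you little about the next.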

What happens when players deviate drastically from the mean? Many very astute analysts are quick to bifurcate levels of luck when discussing BABIP, but getting a little more granular can help explain the reasoning behind such shifts. Given the standard deviation of the data set, players soaring well past or falling well below their means in a specific ball-in-play area, instead of the overall mark, could justifiably be considered lucky or unlucky in that specific area. In other words, instead of commenting on a player’s level of luck relative to his overall BABIP, dig deeper. Let me reiterate that I am in no way suggesting that success, or the lack thereof, on balls in play will always be the root cause of a player’s struggles; I am putting it out there as a potential cause worthy of investigation. For instance, take a look at Jimmy Rollins of the Phillies, who since 2003 has boasted LD-BAs ranging from .723 to .766, equaling or besting the league average. Through Wednesday, Rollins had nosedived toward a .600 mark, well below the league average as well as his own established range.

Since 2003, among those with at least 75 line drives, only three players (Jason Kendall in 2007, Scott Podsednik in 2004, and Placido Polanco in 2004) have posted LD-BAs less than or equal to Rollins’ current mark. It would seem that Rollins is certainly bound for some kind of regression over the remaining three months, but you can now see how few contemporaries have actually ended up this low in their success frequencies on liners. Another cause for concern with Rollins involves his .188/.214 BA/SLG on grounders in a .242/.263 league. From 2003-07, Rollins bested the league with a .252/.284 line, but plummeted last season to just .194/.211. Considering the virtually identical numbers from a year ago, the idea that he has been unlucky on grounders doesn’t hold as much water.

Falling below the league averages and personal levels of performance on balls-in-play statistics is certainly noteworthy from an anecdotal standpoint, but don’t fool yourself into thinking that if Rollins’ LD-BA were adjusted to the .730 range, all of his issues would magically subside. In fact, if his .600 LD-BA rose to .730, and a few of the hits were credited as doubles, with such a small number of raw line drives this “early” in the season, Rollins would only see his line rise from .214/.257/.332 to something in the vicinity of .234/.281/.359, essentially the slash-line equivalent of putting lipstick on a pig. With regard to the line drives themselves, Rollins may very well be belting the ball right at fielders, but he also could be hitting weak or soft liners: balls classified as line drives based on their observed trajectory, but not necessarily limiting the reaction time of the fielder as much as your standard frozen rope. This serves as the perfect segue into a discussion of the classification issues with line drives and balls in play.
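The adjustment described above is simple arithmetic, and a short Python sketch shows why the slash line moves so little. Everything here is an assumption for illustration: the function name, the simplified OBP (times on base divided by plate appearances), and the sample counts, which are round made-up numbers and not Rollins’ actual totals.

```python
def adjust_for_ld_ba(ab, hits, total_bases, times_on_base, pa,
                     line_drives, ld_ba_actual, ld_ba_target,
                     share_doubles=0.25):
    """Re-credit line-drive hits at a target LD-BA; return (AVG, OBP, SLG).

    share_doubles is the assumed fraction of the added hits that go for
    doubles; the rest are treated as singles.
    """
    extra_hits = line_drives * (ld_ba_target - ld_ba_actual)
    # A single adds one base, a double adds two.
    extra_bases = extra_hits * (1 + share_doubles)
    avg = (hits + extra_hits) / ab
    obp = (times_on_base + extra_hits) / pa  # simplified denominator
    slg = (total_bases + extra_bases) / ab
    return avg, obp, slg


# Hypothetical hitter: 300 AB, 66 H, 100 TB, 80 times on base in 320 PA,
# with 50 line drives hit so far.
print(adjust_for_ld_ba(300, 66, 100, 80, 320, 50, 0.600, 0.730))
```

With only 50 liners, a 130-point jump in LD-BA adds 6.5 hits, which is why even a full correction barely dents the overall line.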

The balls put in play are classified by trained scorers, either present at the game or working from video. The one commonality between the various entities offering the data is the fact that classifications are assigned by humans. Though batted-ball classifications are accurate on the whole, to err is human, and such designations are not without potential flaws. For instance, what do you do with “fliners”? Or scorched one-hoppers that seem to be on a line before briefly landing on the infield dirt en route to the outfield? Cases could easily be made for both sides of the coin in each of these scenarios, an aspect that might not prove problematic except for the substantially different success rates and expected values. Calling a line drive a fly ball may seem relatively harmless, but it can hinder BABIP estimators since liners go for hits 73 percent of the time, while fly balls do so at just a 15 percent clip. Additionally, even if we eventually agree that one-hoppers are grounders, they cannot possibly carry the same expected value as dribblers or weakly hit grounders. Each type of ball put in play consists of a few different, more in-depth, categories. To not differentiate would be akin to grouping together the bloop double that falls in and the scorched liner that also produces a two-bagger. Forgive the sidetrack, but this is why the most accurate linear-weights systems, at least from a theoretical standpoint, would incorporate the type of ball put in play in addition to the end result.

Another potential flaw deals with the different scorers across the country, though the stadiums themselves may share some of the blame. Brian Cartwright penned a brilliant piece a while ago in which he researched park factors on line drives. In its simplest form, Cartwright investigated the frequency of liners per all non-grounders at home and on the road via matched pairs: take all line drives and fly balls hit by the Astros and Pirates in each of their home parks, repeat for every matchup, and sum both sides, which yields the number of line drives for the Astros and their opponents in Houston as well as in all other parks. The startling conclusion was that a ball was 18 percent less likely to be classified as a line drive in Houston than on the road. This does not imply that Astros hitters themselves hit fewer line drives, but rather that balls in play were coded as liners less often at Minute Maid Park than the same teams’ balls in play were elsewhere. At the Ballpark in Arlington, balls in play were 18 percent more likely to be classified as line drives.
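A simplified Python sketch of the matched-pair idea follows. It is not Cartwright’s actual code or exact method: it assumes non-grounders are just line drives plus fly balls, pools the counts across matchups, and compares the liner share at home with the share on the road. The function names and the sample counts are hypothetical.

```python
def pool(matchups):
    """Sum (line_drives, fly_balls) tuples across all matchups."""
    ld = sum(m[0] for m in matchups)
    fb = sum(m[1] for m in matchups)
    return ld, fb


def ld_share(ld, fb):
    """Line drives as a fraction of all non-grounders."""
    if ld + fb == 0:
        raise ValueError("no non-grounders in sample")
    return ld / (ld + fb)


def ld_park_factor(home_matchups, road_matchups):
    """Ratio of liner share in the home park to liner share in road parks.

    Each list holds (line_drives, fly_balls) for the team and its
    opponents combined, one tuple per opponent.
    """
    return ld_share(*pool(home_matchups)) / ld_share(*pool(road_matchups))


# Made-up counts for one team against two opponents.
home = [(200, 800), (210, 790)]   # games in the home park
road = [(250, 750), (250, 750)]   # games between the same teams elsewhere
print(ld_park_factor(home, road))
```

A factor below 1 means the same pairs of teams were credited with fewer liners per non-grounder in the home park; on its own it cannot say whether the scorer or the stadium is responsible.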

These park factors could be attributed to those scoring the games, but we cannot ignore that certain ballparks may depress certain statistics. Weather, air pressure, field dimensions, and several other factors independent of the scoring may cause balls that would otherwise rope their way into the outfield to loft a bit more, looking more like a fly ball. These are not guaranteed reasons, but rather just ideas possibly capable of explaining the practically irrefutable park factors researched by Cartwright. With the advent of the Hit-f/x system, which I plan on exploring more over the course of this season and absorbing at the Pitch-f/x Summit in July, coding errors like this could eventually subside, as the spray angle, launch angle, and speed off of the bat could indicate the type of ball put in play.

Line drives and their assorted BIP cousins may also be contingent upon plate discipline. In the comments thread of our look at historical OBP spikes, commenter Sarah Gelles put forth the following idea:

I’m curious whether the BABIP improvement could also possibly stem from the BB% improvement; perhaps if the hitter is laying off of more bad pitches they’re more likely to hit the pitches they do swing at hard. This, combined with the already-mentioned idea that pitchers might give the batter more pitches to hit when they realized he wasn’t swinging at the bad stuff as much, could explain the sustained BABIP increase. Maybe you could look at the LD%? If the contact was harder because the ball was in the zone more, some GBs and FBs might become LDs, which would show up.

This very valid proposition proves quite difficult to investigate, especially given the unreliability of ball-in-play classifications dating before 2003 and the still relatively recent implementation of the Pitch-f/x data set. Unfortunately, for the purposes of quantifying aspects of the above suggestion, the 16 players who experienced a sharp spike in OBP and sustained the new rate in the following year, as shown last week, all did so prior to 1995, so no valid strike-zone or plate-discipline data is freely available, and the rates at which players hit liners in these seasons are not worth guessing at. Regardless, the theory holds plenty of validity, as the batter-pitcher matchup is one of constant adjustments. If a player has exhibited a strong propensity for swinging out of the zone, it would behoove the pitcher to deliver his offerings out of the zone. If our hypothetical hitter shows signs of improvement in terms of spitting at pitches “juuuuust a bit outside,” then the pitcher must adjust and fire in the zone more often. Whereas contact made on the poorly placed pitches may result in a weak grounder to the left side, contact made on pitches in the strike zone, perhaps in a wheelhouse, could definitely lead to harder-hit liners.

The three major points of contention presented here can be summarized by saying that line-drive rates, though moderately stable, do not always translate to better performance marks given the instability of LD-BA; classification errors are more costly than meets the eye considering the substantial discrepancy between the expected values of liners and those of fly balls and grounders, as well as the park factors present; and other strategic aspects evident in the game of baseball can lead to fluctuating line-drive rates. Those who hit the ball harder, limiting the reaction time of defenders, will be more likely to succeed, but not all hard-hit balls are line drives, and not all line drives are hard-hit balls.

A more uniform approach to classifying batted balls, such as the utilization of data in the Hit-f/x system, will help to reduce errors and increase the accuracy of expectation formulas, separating balls in play by how hard they were hit, as opposed to preselected buckets. Using the new batted-ball data this way can also aid in the eradication of luck-based analyses deeming Player X terribly lucky on the heels of a vastly above-average rate in certain areas. While someone like Jimmy Rollins may in fact be unlucky so far on his line drives, most of his liners may be hit rather weakly for all we know, cutting back on the likelihood that he should be near the league-average rate. The batted-ball data provided by Hit-f/x will allow analysts to further partition the LD/GB/FB buckets, determining expected values for weak vs. scorched liners, hard-hit vs. soft grounders, and other BIP contests on the card. Even more exciting is that this data may be available retroactively based on video archives used for Pitch-f/x purposes. The MLBAM systems are tremendous advances in the world of sabermetrics, and will provide various new roads of exploration in addition to scheduling some much-needed road work on those trodden on with great frequency. The Hit-f/x data set will take some time to get used to, just like its pitching sibling, but it will afford us the information necessary to conduct some incredibly important and in-depth analyses.
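To make the partitioning idea concrete, here is a minimal Python sketch that splits each LD/GB/FB bucket into hard and soft sub-buckets by speed off the bat and computes a hit rate for each. The 90 mph cutoff, the record layout, the function names, and the sample balls are all assumptions for illustration; they are not Hit-f/x definitions or real data.

```python
from collections import defaultdict


def sub_bucket(ball_type, speed_mph, hard_cutoff=90.0):
    """Label a ball as e.g. 'hard LD' or 'soft GB' by speed off the bat."""
    strength = "hard" if speed_mph >= hard_cutoff else "soft"
    return f"{strength} {ball_type}"


def hit_rates(balls, hard_cutoff=90.0):
    """Return {sub_bucket: hits / balls} for (type, speed_mph, is_hit) records."""
    tally = defaultdict(lambda: [0, 0])  # sub-bucket -> [hits, total]
    for ball_type, speed_mph, is_hit in balls:
        key = sub_bucket(ball_type, speed_mph, hard_cutoff)
        tally[key][0] += int(is_hit)
        tally[key][1] += 1
    return {key: hits / total for key, (hits, total) in tally.items()}


# A handful of made-up balls in play: (type, speed off bat in mph, hit?).
sample = [
    ("LD", 101.0, True), ("LD", 97.5, True), ("LD", 78.0, False),
    ("LD", 74.0, True), ("GB", 95.0, True), ("GB", 93.0, False),
    ("GB", 62.0, False), ("GB", 58.0, False),
]
for bucket, rate in sorted(hit_rates(sample).items()):
    print(bucket, rate)
```

With enough real data, the per-bucket rates could replace the single 73/24/15 percent figures in an expectation formula, and a two-way split could give way to finer speed and angle bins.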

Eric Seidman

llewdor

6/25

We need HitF/X - then we can quantitatively determine what's a fly ball and what's a line drive.

EJSeidman

6/25

Not just that - as I mentioned above, we can not only differentiate once and for all between liners and flyballs, we can further differentiate between various types of grounders, flyballs, and liners. A weakly hit dribbler is going to have a lower expected value than a scorched one-hopper, but as is, everything is lumped together.

tide182

6/25

well we can start to do a little bit more of this:

http://www.hardballtimes.com/main/article/an-early-look-at-hitf-x/

it really is pretty awesome and can help give us some more objective structure to the simple yet simultaneously complicated question of: "is so and so a good hitter?" and isn't that really - along with its sister question, "is so and so a good pitcher?" - really what it's all about?

EJSeidman

6/26

Byron, yeah, I have the data in my database but it is only for April at the present moment. I don't want to use it until more is available, but the spray and launch angles as well as the bat speed can certainly clue us into which pitchers allow the slowest speed off the bat, which specific pitches do so, etc.

blcartwright

6/25

Well, the GameDay text descriptions of each play do include phrases such as 'grounder' 'soft grounder' or 'sharp grounder', which can help, but of course Eric and I might disagree on whether a ball was hit sharply or not. If we are told it was hit at 95 mph with a down angle of 5 degrees, then there's no longer any human interpretation needed to classify it.

EJSeidman

6/25

Exactly, and Brian and I discussed a specific example this weekend, in which one of the LaRoches hit a blooper that Gameday recorded as a line drive. I want to eradicate any type of misinterpretation. While I do like the human aspect of the game in certain areas, Brian and I, and I'm sure many of you, can agree this is not one of those areas.

BurrRutledge

6/26

I've always liked the "soft liner" play-by-play description. "So-and-so hits a soft liner to left that drops in for a single." I think that differs from a blooper by not so much, but a world of difference in how the listener may perceive the hitter afterwards.

illgamesh

6/26

All bloopers that go for extra bases end up in the Gameday system as line drives. I don't know why this is, but it's very true.

Oleoay

6/26

If an outfielder doesn't get to a batted ball, is it more likely to be termed a line drive... and if they are able to get to it, is it more likely to be termed a fly ball?

I guess I'm picturing the difference between a ball hit over an outfielder's head as a line drive, but if the outfielder had been positioned a few feet deeper, that same ball might be interpreted as a flyball.

Am I making any sense at what I am getting at?

blcartwright

6/26

Great question Richard, and something I had planned to look for. I also have the HITf/x data for April downloaded, but not yet plugged in to my database. I would look for balls that are hit with the same speed and launch angle, group by if they are caught or go for hits, and then see if there is any significant difference in the LD% for each bucket.

EJSeidman

6/26

Yep, on my to-do list as well. And when the samples are larger, success rates by bat speed, vertical launch, horizontal launch, etc. I think your question touches on the illusions of fielding. For instance, Shane Victorino runs 100 mph for a ball that Carlos Beltran easily glides to. Both made the play but the perception is that Victorino made the better play because he sprinted. In reality, Beltran's superior range meant he could simply get to it easier. In your example, the not getting to the ball makes it look like a liner; after all, how does a non-Adam Dunn outfielder not get to a flyball?

Oleoay

6/26

While you guys look at that stuff (and since you're in the data anyway), it might be interesting to see if the weight or type of bat has any kind of correlation with launch. For example, if two people have about the same average bat speed, but one person is able to launch the ball farther more often, it might have to do with the weight of the bat they use.

fireorlime

6/30

It could, but (I'm honestly posing this question here) isn't there more to it than bat speed, contact point, and launch angle? When the bat actually makes contact with the ball won't the strength of the batter play a huge role?

If you had two identical bat speeds hitting the same pitch with the same launch angle, the velocity of the ball leaving the bat and its eventual trajectory is still subject to other variables si? I wonder what those other variables might be...

Perhaps the path and length of the swing prior to contact matter?

What about where exactly the ball makes contact on the bat?

fireorlime

6/30

I think the most understated part of HIT f/x is the objective data it will provide to measure the fielder proficiency.

Speaking of, will there be anything available that pinpoints the starting position of each defender on every pitch? That would be cool.

blcartwright

6/26

On the Adam LaRoche play I told Eric about, he nubbed a looper over the third baseman's head for a single. The description read "singles on a line drive to third base" - makes it sound like LaRoche knocked him over with a smash. I would have called it a 'soft fly ball'.

Oleoay

6/26

Is a line drive allowed to bounce once?

A line drive to the infield can turn into a grounder by the time it gets to the outfield. Let's say the infield is playing in or the first/thirdbasemen are playing close to the bag. A normal line drive out to their normal position would turn into a groundball single by the time it got to the outfield.

EJSeidman

6/26

Richard, questions like this are why we need a uniform system. It isn't as if there are a set of rules out there. I might answer No to your question while others might answer Yes. I hate that. Give me the robotic data output on BIP!

Oleoay

6/26

Not only a uniform system, but a calibrated one.

I remember when Dan published pitch velocity for when a pitch left a pitcher's hand, then again when it crossed home plate and there was a wide variance in the drop in velocity from the two data points between ballparks. After some investigation, he thought calibration might've been the biggest issue, though air density/humidity might've also played a factor.

sbnirish77

6/27

This same sort of interpretive vagueness is what pervades many of the fielding metrics.
