January 4, 2005
Blocking the Uppercut
Tackling the Jeromy Burnitz Problem
How many times have you heard an announcer say about a player, "He uses an uppercut swing to get more out of his power?" Or, conversely, "He tries to keep the ball on the ground to get the most out of his speed?" But then, you look at the player's numbers, and the first guy has a .230 batting average to go with all his homers, and the second guy leads the world in singles and little else. These are what I like to call, respectively, the Jeromy Burnitz Problem and the Luis Castillo Problem. Both have swings that deviate from the standard line drive swing. They use extreme swings, intended to hit flyballs or grounders, respectively, in order to get the most out of their natural abilities. What I wanted to know was, are they actually succeeding?
To answer this question, I first had to set up a model of projected player performance that could determine what a player's numbers would look like after changes in his GB/FB/LD rates. The model was inspired by the one John Burnson used in the 2004 Baseball Forecaster to predict batting average. It is a linear regression of the form: Outcome Type = (Batted Ball Type)*(a + b*(Power) + c*(Speed)). Using 1990-92 data from Retrosheet--the greatest thing in the world--we can obtain all outcomes for groundballs, flyballs, line drives, and pop-ups, as well as bunts. The outcomes for which I derived equations were singles, doubles, triples, home runs, one-out plays, double plays, and triple plays. Of course, there is some overlap, as hits sometimes still lead to outs, but that's not a big problem.
The next step was to create values for raw power and raw speed. Neither one is close to perfect, but both work well enough. For power I used HR/(.092*FB+.026*LD). This is a measure of how many home runs the player hit compared to how many a league average player would be expected to hit given the same number of flyballs and line drives. A value such as 1.22 would then mean his power was about 22% above league average. The equation for raw speed is of the form (3B/(2B+3B))/.11. I chose this measure instead of one involving stolen bases because stolen bases involve baserunning skills, which are a separate talent from raw speed. The average player gets triples on about 11% of non-HR extra-base hits, generally due to odd bounces and average speed. This equation, like the other, measures the player's raw speed relative to league average.
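As a minimal sketch, the two raw measures described above can be written as small Python functions. The function names and the sample batting line are hypothetical; only the formulas and the constants (.092, .026, .11) come from the text.

```python
def raw_power(hr, fb, ld):
    """Home runs hit, relative to the home runs a league-average hitter
    would be expected to hit on the same flyballs and line drives."""
    expected_hr = 0.092 * fb + 0.026 * ld
    return hr / expected_hr


def raw_speed(doubles, triples):
    """Share of non-HR extra-base hits that were triples, relative to
    the league-average share of about 11%."""
    return (triples / (doubles + triples)) / 0.11


# Hypothetical batting line, for illustration only.
print(round(raw_power(hr=30, fb=150, ld=90), 2))   # above 1 means above-average power
print(round(raw_speed(doubles=27, triples=3), 2))  # below 1 means below-average speed
```

Note that raw_speed is undefined for a player with no doubles or triples, so a playing-time minimum is needed before the measure is meaningful.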
At this point, I chose to normalize both values, so that the constant value in the regression equation would be the league average, and the result would then be adjusted up or down based on above or below-average power and speed. Power was normalized with 1 and .688, while speed was normalized with 1.084 and .775, to account for skewness. Again, these values are not perfect, but they more than adequately represent a player's raw power and speed in terms of projecting the results of balls put into play.
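The text gives two numbers for each normalization without naming them. One plausible reading, assumed in the sketch below, is that they are a center and a spread, so that the normalized score is a z-score; the later description of Pwr scores in standard deviations is consistent with that reading. The constant and function names are hypothetical.

```python
# Assumption: the first number is the center and the second is the spread.
POWER_CENTER, POWER_SPREAD = 1.0, 0.688
SPEED_CENTER, SPEED_SPREAD = 1.084, 0.775


def normalize(raw, center, spread):
    """Express a raw score as the number of spreads above or below the center."""
    return (raw - center) / spread


# A raw power of 1.22 (22% above league average), as in the example above.
pwr = normalize(1.22, POWER_CENTER, POWER_SPREAD)
# A raw speed equal to the center normalizes to zero.
spd = normalize(1.084, SPEED_CENTER, SPEED_SPREAD)
print(round(pwr, 2), round(spd, 2))
```

Under this reading, a player with Pwr = 0 and Spd = 0 gets exactly the constant term of each regression equation, which matches the stated goal of making the constant the league average.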
Next, using players with a minimum of 250 balls after contact (BAC), I was able to calculate various expected values:
x1B = GB * (.216 + .008*Pwr + .002*Spd) + FB * (.038 - .002*Pwr) + LD * (.522 - .008*Pwr + .003*Spd) + Pop * (.034 - .003*Pwr - .003*Spd) + Bunt * (.273 + .034*Spd)
x2B = GB * (.020 + .001*Pwr - .001*Spd) + FB * (.057 - .004*Pwr - .005*Spd) + LD * (.158 + .012*Pwr - .015*Spd) + Pop * (.010 - .001*Pwr + .002*Spd)
x3B = GB * (.001 + .002*Spd) + FB * (.013 + .010*Spd) + LD * (.019 + .001*Pwr + .012*Spd)
xHR = FB * (.088 + .057*Pwr + .001*Spd) + LD * (.027 + .023*Pwr - .001*Spd)
x1P = GB * (.678 - .018*Pwr + .004*Spd) + FB * (.798 - .050*Pwr - .008*Spd) + LD * (.273 - .025*Pwr + .001*Spd) + Pop * (.952 + .005*Pwr + .002*Spd) + Bunt * (.681 - .032*Spd)
x2P = GB * (.057 + .005*Pwr - .006*Spd) + .004*FB + LD * (.014 - .002*Pwr + .001*Spd) + .001*Pop + .003*Bunt
*Note: All coefficients are significant at the 99.99% level except for LD*Spd in x1P, which has a p-value of .03.
Triple plays are so infrequent that we cannot actually expect them as an outcome. Double plays are calculated separately from one-out plays for the same reason that singles are calculated differently from doubles: one double play is more damaging to a team than two plays on which one out is made, just as one double is not equal to two singles.
So now, the next step is to determine what the true value of each outcome is, along with strikeouts and total walks (including HBP). For this, I ran all team data for the same three seasons, and attempted to determine the number of runs created by each possible outcome, in the style of Jim Furtado's XR model. This equation, using only coefficients significant at the 99% level, came out as follows:
XR = -.114*SO + .330*TBB + .365*1B + .712*2B + 1.104*3B + 1.494*HR - .037*1P - .394*2P
With all of that done, we can get to the meat and potatoes. We can input the expected results from various batted ball types into the XR equation. Once we simplify it, we come up with the following:
XR = .330*TBB - .114*SO
This is the number of runs we can expect from each possible outcome of an at-bat. So, we can tackle the Jeromy Burnitz Problem head-on by looking at this part of the equation: FB * (.170 + .084*Pwr + .010*Spd) + LD * (.349 + .043*Pwr - .001*Spd). It is clear that the initial value of a line drive (.349) is far greater than that of a flyball (.170). While a line drive leads to fewer home runs, it also leads to substantially fewer outs, and many more singles, doubles, and triples. But it is also evident that power has a stronger effect on flyballs than it does on line drives. So, is there a point at which a player can have enough power to make a flyball more valuable than a line drive? The answer, it would seem, is no.
Even going beyond the raw power of any player playing from 1990-92 (Cecil Fielder led the league with 3.91 standard deviations above average), the value of a flyball is exceeded by that of a line drive. The crossover point, for average speed, is a Pwr score of 4.40. To understand how high that is, Barry Bonds posted a Pwr score of 3.68 in 2001, though he may have been hurt a little by his enormous ballpark. In any case, it is highly unlikely that we will see even a handful of players powerful enough to make flyballs worth more than line drives.
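The crossover can be checked, at least approximately, by substituting the flyball and line drive terms of the expected-outcome equations into the XR weights and solving for the Pwr score at which the two run values are equal. The sketch below does that; the data layout and function names are hypothetical, and because the published coefficients are rounded to three decimals, the results should land near the figures quoted in the text rather than match them digit for digit.

```python
# (constant, Pwr, Spd) coefficients for flyballs and line drives, copied from
# the expected-outcome equations. A 0 marks a term the equation does not include.
COEF = {
    "FB": {"1B": (.038, -.002, 0), "2B": (.057, -.004, -.005), "3B": (.013, 0, .010),
           "HR": (.088, .057, .001), "1P": (.798, -.050, -.008), "2P": (.004, 0, 0)},
    "LD": {"1B": (.522, -.008, .003), "2B": (.158, .012, -.015), "3B": (.019, .001, .012),
           "HR": (.027, .023, -.001), "1P": (.273, -.025, .001), "2P": (.014, -.002, .001)},
}
# Run weights for each outcome, copied from the XR equation.
XR_WEIGHT = {"1B": .365, "2B": .712, "3B": 1.104, "HR": 1.494, "1P": -.037, "2P": -.394}


def run_value_terms(ball_type):
    """Collapse the outcome equations into (constant, Pwr, Spd) run-value terms."""
    terms = [0.0, 0.0, 0.0]
    for outcome, coefs in COEF[ball_type].items():
        for i, coef in enumerate(coefs):
            terms[i] += XR_WEIGHT[outcome] * coef
    return tuple(terms)


def run_value(ball_type, pwr, spd):
    """Expected runs from one ball of the given type for a given hitter."""
    const, per_pwr, per_spd = run_value_terms(ball_type)
    return const + per_pwr * pwr + per_spd * spd


def crossover_power(spd=0.0):
    """Pwr score at which a flyball and a line drive are worth the same."""
    fb_const, fb_pwr, fb_spd = run_value_terms("FB")
    ld_const, ld_pwr, ld_spd = run_value_terms("LD")
    return ((ld_const - fb_const) + (ld_spd - fb_spd) * spd) / (fb_pwr - ld_pwr)


print("FB terms:", [round(t, 3) for t in run_value_terms("FB")])
print("LD terms:", [round(t, 3) for t in run_value_terms("LD")])
print("Crossover Pwr at average speed:", round(crossover_power(), 2))
```

At a Pwr of 3.91, the highest mark cited for 1990-92, run_value("LD", 3.91, 0) still comes out ahead of run_value("FB", 3.91, 0), which is the point the paragraph above makes.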
In terms of the Castillo Problem, the case is similarly straightforward. Here is the same chart, with relative value given an increase in speed:
How useless are groundballs? Well, given average bunting ability and appropriate use, even bunts contribute more value than groundballs. Lest there be concern that the heavy weighting of double plays would hurt Castillo--or any leadoff man--the value of a groundball only slightly improves when we remove them, and is still less valuable than a bunt for any player with a Spd above -.50. The fact is that the line drive is the best of all possible outcomes, almost completely regardless of the player. Teams would be well served by looking to turn their players into line drive hitters, rather than extreme groundball or flyball hitters.
A further look at line drive hitters reveals even more value. While flyball hitters have a positive correlation with both SO (.304) and BB (.242), and groundball hitters have a negative correlation with them (-.253 and -.215), line drive hitters have a very slight positive correlation with walks (.048) and a negative correlation with strikeouts (-.167).
Logically, this makes some sense. Hitters with an uppercut swing have their bat in the hitting zone for less time, resulting in more swings and misses. This will lead to deeper counts, which in turn lead to more walks and strikeouts. The walks could also be a result of the fact that these hitters are generally power threats, and thus more likely to be pitched around.
By contrast, slap hitters with a flat swing keep their bats in the zone and create more contact, though it may be weak, and will thus strike out less. Since they will generally put the ball in play earlier in the count, they will also walk less. Line drive hitters, it seems, strike a balance right in the middle. This is even stronger evidence that players with extreme swings are failing to maximize their value.
Now, let's get back to the specific cases of Burnitz and Castillo. Let's look first at Burnitz's totals for the 2001 season, his last with Milwaukee:
BAC   PA   SO  TBB   GB   FB   LD  Pop    Pwr    Spd     XR
-----------------------------------------------------------
407  651  150   94  145  132   79   51   2.04  -0.10  100.0
That's a pretty good season. But what if we increase his LD% from 19% to 25%, and take them away from his FB total?
Adjustment      BAC   PA   SO  TBB   GB   FB   LD  Pop    Pwr    Spd     XR
---------------------------------------------------------------------------
Actual          407  651  150   94  145  132   79   51   2.04  -0.10  100.0
Only FB and LD  407  651  150   94  145  108  103   51   2.04  -0.10  102.3
Twenty-four balls hit on a slightly reduced trajectory, and Burnitz adds about 2.3 runs to his total. That's about a quarter of a win right there. Now let's try adjusting the GB and pop-up rates to reflect this change:
Adjustment      BAC   PA   SO  TBB   GB   FB   LD  Pop    Pwr    Spd     XR
---------------------------------------------------------------------------
Actual          407  651  150   94  145  132   79   51   2.04  -0.10  100.0
Only FB and LD  407  651  150   94  145  108  103   51   2.04  -0.10  102.3
All BAC         407  651  150   94  163  108  103   35   2.04  -0.10  103.5
Another run and change. Now, one final adjustment would be to adjust SO and BB rates. An accurate prediction of SO% can be made using the original FB, GB, LD and SO rates in conjunction with the adjusted FB rate (this regression was done by studying a player's changes in these rates from one season to the next). Walk rate, it seems, is more or less unaffected by changes in batted ball rates. I would expect this is a result of there being no change in pitches taken, while there would be a change in the contact rate of pitches offered at. At any rate, here are the complete adjustments:
Adjustment      BAC   PA   SO  TBB   GB   FB   LD  Pop    Pwr    Spd     XR
---------------------------------------------------------------------------
Actual          407  651  150   94  145  132   79   51   2.04  -0.10  100.0
Only FB and LD  407  651  150   94  145  108  103   51   2.04  -0.10  102.3
All BAC         407  651  150   94  163  108  103   35   2.04  -0.10  103.5
All Outcomes    421  651  136   94  168  112  106   36   2.04  -0.10  108.1
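For readers who want to re-run these scenarios, here is a minimal sketch that feeds each row of the table through the expected-outcome equations and the XR weights. The data layout and function name are hypothetical, and bunts are omitted because the table does not list them. Since the published coefficients are rounded, the absolute totals may differ slightly from the XR column, but the differences between rows should track the table closely.

```python
# (constant, Pwr, Spd) coefficients per outcome and batted-ball type, copied
# from the expected-outcome equations. A 0 marks a term the equation omits.
COEF = {
    "1B": {"GB": (.216, .008, .002), "FB": (.038, -.002, 0),
           "LD": (.522, -.008, .003), "Pop": (.034, -.003, -.003)},
    "2B": {"GB": (.020, .001, -.001), "FB": (.057, -.004, -.005),
           "LD": (.158, .012, -.015), "Pop": (.010, -.001, .002)},
    "3B": {"GB": (.001, 0, .002), "FB": (.013, 0, .010), "LD": (.019, .001, .012)},
    "HR": {"FB": (.088, .057, .001), "LD": (.027, .023, -.001)},
    "1P": {"GB": (.678, -.018, .004), "FB": (.798, -.050, -.008),
           "LD": (.273, -.025, .001), "Pop": (.952, .005, .002)},
    "2P": {"GB": (.057, .005, -.006), "FB": (.004, 0, 0),
           "LD": (.014, -.002, .001), "Pop": (.001, 0, 0)},
}
# Run weights for each outcome, copied from the XR equation.
XR_WEIGHT = {"1B": .365, "2B": .712, "3B": 1.104, "HR": 1.494, "1P": -.037, "2P": -.394}


def expected_xr(so, tbb, balls, pwr, spd):
    """Expected runs from strikeouts, walks, and a batted-ball profile."""
    xr = -.114 * so + .330 * tbb
    for outcome, by_type in COEF.items():
        expected = sum(balls[ball_type] * (a + b * pwr + c * spd)
                       for ball_type, (a, b, c) in by_type.items())
        xr += XR_WEIGHT[outcome] * expected
    return xr


# Burnitz's 2001 rows, taken from the table above.
SCENARIOS = {
    "Actual": dict(so=150, tbb=94, balls={"GB": 145, "FB": 132, "LD": 79, "Pop": 51}),
    "Only FB and LD": dict(so=150, tbb=94, balls={"GB": 145, "FB": 108, "LD": 103, "Pop": 51}),
    "All BAC": dict(so=150, tbb=94, balls={"GB": 163, "FB": 108, "LD": 103, "Pop": 35}),
    "All Outcomes": dict(so=136, tbb=94, balls={"GB": 168, "FB": 112, "LD": 106, "Pop": 36}),
}

baseline = expected_xr(pwr=2.04, spd=-0.10, **SCENARIOS["Actual"])
for name, line in SCENARIOS.items():
    xr = expected_xr(pwr=2.04, spd=-0.10, **line)
    print(f"{name:<16}{xr:7.1f}{xr - baseline:+7.1f}")
```

Swapping in another hitter only requires a new set of rows plus that hitter's Pwr and Spd scores.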
Simply by altering his swing enough to hit 24 more line drives and 24 fewer flyballs, Burnitz would have improved across the board, and produced 8.1 more runs over the course of a season. A look at Castillo's 2004 stats reveals a similar improvement:
Adjustment      BAC   PA   SO  TBB   GB   FB   LD  Pop    Pwr    Spd     XR
---------------------------------------------------------------------------
Actual          506  649   68   75  316   76  103   11  -1.22   2.92   72.4
Only FB and LD  506  649   68   75  292   76  127   11  -1.22   2.92   78.3
All BAC         506  649   68   75  292   77  127   10  -1.22   2.92   78.2
All Outcomes    511  649   63   75  295   77  128   11  -1.22   2.92   79.4
While the effect on strikeouts and walks is much smaller than it was for Burnitz, the run increase is close. Now, this is missing Castillo's bunts, because I cannot find them, but the effect of decreasing groundballs and increasing line drives is undeniable. Furthermore, while stolen bases did not prove significant to XR, they are useful when they work. Getting Castillo on base more--which line drives would do--means more stolen bases, which in turn mean more runs. So, again, we see evidence that extreme swings hurt more than they help.
This data makes a very strong case for a specific, and realistic, improvement in the development of hitters. Rather than mistakenly attempting to tailor a hitter's swing to his attributes, as with Burnitz and Castillo, an organization would be best served by teaching young hitters a steady line drive swing. The adjustments portrayed here are slight, but each player would win his team about one more game per season just by using a slightly less extreme swing. Spread that improvement around an entire system, and a team could win at least a few more games each season by producing line drive hitters instead of flyball or groundball hitters. Assuming that the ultimate goal is to produce the hitter likely to contribute the most possible to victory, organizations would do best to get over the fan's fascination with the home run, and discover the winner's fascination with the line drive.
The information used here was obtained free of charge from and is copyrighted by Retrosheet. Interested parties may contact Retrosheet at 20 Sunset Rd., Newark, DE 19711.
Seth Samuels is a student at Wesleyan University who has contributed to Baseball HQ. He can be reached here.