July 10, 2012
Does the Hit and Run Help?
Believe it or not, most of our writers didn't enter the world sporting an @baseballprospectus.com address; with a few exceptions, they started out somewhere else. In an effort to up your reading pleasure while tipping our caps to some of the most illuminating work being done elsewhere on the internet, we'll be yielding the stage once a week to the best and brightest baseball writers, researchers and thinkers from outside of the BP umbrella. If you'd like to nominate a guest contributor (including yourself), please drop us a line.
Pete Palmer is the co-author of The Hidden Game of Baseball with John Thorn and co-editor of the Barnes and Noble ESPN Baseball Encyclopedia with Gary Gillette. Pete introduced on-base average as an official statistic for the American League in 1979 and invented on-base plus slugging, now universally used as a good measure of batting strength. A member of SABR since 1973, his baseball data is used by the SABR Encyclopedia, MLB.com, Retrosheet, ESPN, and Baseball-Reference.com. He was selected by SABR to be in the inaugural group of nine given the Henry Chadwick award in 2010. Pete is also the editor of Who’s Who in Baseball, soon to be celebrating its 100th anniversary. His latest book, Basic Ball: New Approaches for Determining the Greatest Baseball, Football, and Basketball Players of All-Time, was released late last year.
Major League Baseball has been keeping records for the runner going on the pitch back to 2004, so we now have eight solid years of reasonably accurate data. I decided to study this information in order to evaluate the hit and run play. One problem is that the runner is nearly always also going with the pitch on a straight steal, so I deemed any case where the batter did not swing to be a stolen base attempt.
The most common situation for a hit and run attempt was of course a runner on first only, where it was tried about eight percent of the time. It was tried about four percent of the time with runners on first and third and less than one percent of the time in all other situations. These figures do not include 3-2 counts. On a 3-2 count with two outs, the play is automatic, virtually one hundred percent, as there is nothing to lose. On a 3-2 count with fewer than two outs, the runner goes about 55 percent of the time. The analysis below looks at just the runner on first situation, with the 3-2 count cases broken out. It covers all games from 2004 through 2011.
Overall, the hit and run play dramatically increased the probability of advancing an extra base on a hit and of avoiding a ground-ball double play. The batter himself got more hits but fewer homers and walks, so that part was neutral. There was an increase in line-drive and other double plays. Batting performance was somewhat reduced if the play resulted in a foul, as the batter had an extra strike that might have been a ball had he not been forced to swing. On a missed swing, the runner's stolen base success rate was lower than average. This is because the typical runner on a hit and run play is not as good a base stealer as the runner in a normal steal.
First, let's look at runner advances. The table below shows the percent of time each advance occurred.
Looking at the run probability matrix, 1st and 2nd/0 out is 1.51, 1st and 3rd/0 out is 1.76, and 1h/0 is 1.86. This means a single is worth about 1.69 runs with a hit and run and 1.57 without. With one out it is 1.08 compared to .90. For two outs it is .49 and .46. This is less of a difference because being on 3rd with two out is not that much better than being on 2nd (.35 runs to .32). For doubles, the values are 2.07 and 2.01 with none out, 1.56 and 1.44 with one out, and 1.20 and .95 with two out. Taking each case and multiplying by the fraction of cases with zero, one, or two outs, you get an increase of .13 runs per single and .12 runs per double.
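As a rough illustration, the weighting step described above can be sketched in Python. The run values are the ones quoted in the text; the out shares in `EXAMPLE_SHARES` are hypothetical placeholders, since the actual fractions of plays with zero, one, or two outs are not given here, so the printed averages will only approximate the .13 and .12 figures.

```python
# Run values quoted in the text, by number of outs:
# (with hit and run, without hit and run).
SINGLE = {0: (1.69, 1.57), 1: (1.08, 0.90), 2: (0.49, 0.46)}
DOUBLE = {0: (2.07, 2.01), 1: (1.56, 1.44), 2: (1.20, 0.95)}


def gain_by_outs(values):
    """Extra runs credited to the hit and run at each out count."""
    return {outs: round(with_hr - without, 2)
            for outs, (with_hr, without) in values.items()}


def weighted_gain(values, out_shares):
    """Average the per-out gains, weighted by the share of plays at each out count."""
    gains = gain_by_outs(values)
    total = sum(out_shares.values())
    return sum(gains[outs] * out_shares[outs] for outs in gains) / total


# HYPOTHETICAL out shares, for illustration only (not from the article).
EXAMPLE_SHARES = {0: 0.4, 1: 0.4, 2: 0.2}

print("single, gain by outs:", gain_by_outs(SINGLE))
print("double, gain by outs:", gain_by_outs(DOUBLE))
print("single, weighted gain:", round(weighted_gain(SINGLE, EXAMPLE_SHARES), 3))
print("double, weighted gain:", round(weighted_gain(DOUBLE, EXAMPLE_SHARES), 3))
```

Replacing `EXAMPLE_SHARES` with the observed distribution of outs would reproduce the weighting used for the per-single and per-double figures.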
Next, we will check out stolen bases and double plays.
Stolen base attempts are less successful on a hit and run play because the runner is not as apt to be one of the top stealers. A runner on 2nd with none out is worth 1.14 runs, while bases empty and one out is .27, a loss of .87 runs for each extra caught stealing due to the 15 percent drop in success rate. With one out, a runner on 2nd is .67 runs, and bases empty and 2 out is .11, a net loss of .56 runs. There aren't enough cases of a straight steal on 3-2 to measure. If the batter takes the pitch for a ball, the stolen base is wiped out, and if he swings and misses, it counts as a hit and run, so the stats are distorted.
Total double plays are reduced by about 30 percent. A runner on first with one out is .51 runs, while bases empty and two out is .11, so saving a DP with none out gains .40 runs. With one out, a double play takes the scoring potential from .22 to zero, so avoiding it gains .22 runs. Double plays are rarer on 3-2 because the batter is more likely to strike out.
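The stolen base and double play arithmetic above comes down to differences between run expectancy states. A minimal sketch follows, using only the values quoted in the text; the state labels and the `swing` helper are illustrative names, not part of the original study.

```python
# Run expectancy values quoted in the text, keyed by (bases, outs).
RUN_EXP = {
    ("runner on 2nd", 0): 1.14,
    ("bases empty", 1): 0.27,
    ("runner on 2nd", 1): 0.67,
    ("bases empty", 2): 0.11,
    ("runner on 1st", 1): 0.51,
    ("runner on 1st", 2): 0.22,
    ("inning over", 3): 0.0,
}


def swing(worse_state, better_state):
    """Runs separating the better outcome from the worse one."""
    return round(RUN_EXP[better_state] - RUN_EXP[worse_state], 2)


# Cost of being caught stealing instead of safe at second.
cs_cost_none_out = swing(("bases empty", 1), ("runner on 2nd", 0))
cs_cost_one_out = swing(("bases empty", 2), ("runner on 2nd", 1))

# Gain from a force out at first only, instead of a double play.
dp_saved_none_out = swing(("bases empty", 2), ("runner on 1st", 1))
dp_saved_one_out = swing(("inning over", 3), ("runner on 1st", 2))

print("caught stealing cost, none out:", cs_cost_none_out)
print("caught stealing cost, one out:", cs_cost_one_out)
print("double play saved, none out:", dp_saved_none_out)
print("double play saved, one out:", dp_saved_one_out)
```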
Almost a third of hit and run attempts result in fouls. The batter's outcomes in these cases are somewhat weaker due to the fact the batter has to swing, unless there are three balls and the pitch is outside of the strike zone. These pitches could have been taken if the hit and run play had not been on. The count is very important in predicting the outcome of an at-bat. I did an article for "Optimal Strategies in Sports" back in 1977, based on just the results of the World Series in 1974 and 1975. It showed that the batter produces twice as many runs as measured by linear weights when the first pitch is a ball rather than a strike. Not only that, but the eighth-place hitter (or the ninth when using the DH) with a 1-0 count is more productive than the third man with an 0-1 count. Now we have pitch-by-pitch data for all games back to 1988, but the results are the same.
I took the performance in cases where there was a foul with the runner going, compared to the same batter in the same game with a runner on first who was not going. This is necessary because the players who are called upon to hit and run are usually better-than-average hitters. I had to weight each game situation equally, that is, number of outs, runner going or not, 3-2 count or not. Fouls are more apt to occur on a 3-2 count, and the total number of walks per plate appearance is higher in the foul case. Using linear weights, the typical batter is about 20 runs short per 500 plate appearances because of the extra fouls.
Overall output for the batter on a hit and run play is about the same as without the hit and run. Batting average is higher because the batter is trying to make contact rather than hit for power, and therefore home runs are down. Walks are lower because the batter is not waiting the pitcher out. The table below shows performance for players who batted in hit and run cases with and without the play actually being on. The "with" data is for runner on 1st, zero or one out, and not a 3-2 count. The "without" data is for all at-bats involving the same players.
The table below shows a summary of the analysis for the non-3-2-count cases. The "Runs Per" figure is the difference in runs between hitting away or playing the hit and run.
So the results are actually very close to neutral. There is a loss of about one hundredth of a run per attempt with none out, no change with one out, and a loss of one hundredth of a run with two outs. There were a total of about 20,000 plays in the eight years, 2,500 per year or about 80 per team. The difference per team is less than one run per year.
In 1973, Dick Cramer and I came up with using on-base average times slugging to relate batter performance to runs. Dick did it using a simulation of individual batting seasons, while I used team data. We published in the SABR Research Journal, using the name Batter Run Average. The problem for players, using Babe Ruth as an example, is that the simulation assumes eight other Babes in the lineup, which enhances the production. I came up with using on-base plus slugging, as it is closer to what a good player would add to an average lineup. This was first published as "Production" in The Hidden Game of Baseball with John Thorn in 1984. It took a while to catch on, but thanks to Total Baseball and the ESPN Baseball Encyclopedia, it is pretty commonly used today.
Of course, it is really nothing new. Branch Rickey had a similar equation in his 1954 Life Magazine article. Earnshaw Cook's DX from the early 1960s and Bill James' runs created are both basically on-base times slugging times at-bats. Both of these have enhanced versions in which other stats are brought in to increase their accuracy slightly.
While working as a consultant to the American League, I introduced OBA as an official stat in 1979. Unfortunately, the baseball guide did not pick it up until 1982, when the NL added it. The definition got changed from mine by adding sacrifice flies as outs. I never liked that, for two reasons. First, a sacrifice fly is not a time at bat, and second, sac flies for Babe Ruth, Lou Gehrig, Ty Cobb, and others are unknown. That is because for 1908-30 and 1939, the first two incarnations of the sac fly rule, bunts and flies were not separated in the official stats. It has only been in the current period from 1954 that each was counted.
In recent years, there has been talk that on-base average is really more important and that the number should be more like 1.8 times OBA plus slugging. This is true. However, when calculating expected team runs from batting stats, the error difference is only about two tenths of a run out of a total error of 23 runs over a wide range of multipliers. From 2000 to date, the range is 1.2 to 2.0 with the optimum around 1.8, while from 1960 to date it is 1.0 to 1.8 with the best near 1.5. I believe the simplicity of using OPS directly is more valuable than trying to complicate the method for a small gain in accuracy.
My enhanced version of OPS, which is much harder to calculate, is OBA/league plus slugging/league minus one. This normalizes it to the league average (not including pitcher batting) and makes a park adjustment easy. To get the adjusted value, just divide by the batter park factor, which is runs scored by both teams at home over runs scored by both teams away, plus one, all divided by two. The normalized value also corresponds directly to run production relative to the league. That is, a player with both OBA and slugging 10 percent above the league average will produce runs at a rate 20 percent above average. The normalized version also has the advantage of weighting OBA more, since 33 points in OBA is equivalent to about 40 points in slugging.
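The normalized formula and the park adjustment can be written out as a short Python sketch. The function names are illustrative, and the sample league averages (.330 OBA, .400 slugging) and run totals are assumed round numbers chosen to match the 10-percent example, not figures from any particular season. The park factor here also assumes the home and away run totals cover the same number of games.

```python
def normalized_ops(oba, slg, league_oba, league_slg):
    """OBA/league plus slugging/league minus one; 1.0 is league average."""
    return oba / league_oba + slg / league_slg - 1.0


def batter_park_factor(home_runs_both_teams, away_runs_both_teams):
    """(home/away + 1) / 2, using runs scored by both teams.

    Assumes the home and away totals cover the same number of games.
    """
    return (home_runs_both_teams / away_runs_both_teams + 1.0) / 2.0


def park_adjusted(normalized_value, park_factor):
    """Divide the normalized value by the batter park factor."""
    return normalized_value / park_factor


# Hypothetical player: OBA and slugging both 10 percent above the league.
value = normalized_ops(0.363, 0.440, league_oba=0.330, league_slg=0.400)
# Hypothetical park: 840 runs in home games, 800 in away games.
factor = batter_park_factor(840, 800)

print("normalized:", round(value, 3))
print("park factor:", round(factor, 3))
print("park adjusted:", round(park_adjusted(value, factor), 3))
```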
Back in the 1960s, I developed the factor of 10 runs per win. That is, if a team outscores its opponents by 10 runs, it should win one more game, or 82 games out of 162. Later, Bill James came up with his Pythagorean theorem, namely, winning percentage is equal to runs scored squared divided by the sum of runs scored squared and runs allowed squared. I always liked mine better because it was much easier to calculate and just as accurate.
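For a side-by-side look, both estimates can be sketched in a few lines of Python. The function names and the sample run totals are hypothetical; the formulas follow the descriptions above, with a .500 team taken as the baseline for the 10-runs-per-win version.

```python
def wins_ten_runs(runs_scored, runs_allowed, games=162, runs_per_win=10.0):
    """Half the schedule plus one win for every 10 runs of run differential."""
    return games / 2 + (runs_scored - runs_allowed) / runs_per_win


def wins_pythagorean(runs_scored, runs_allowed, games=162, exponent=2.0):
    """Winning percentage = RS^2 / (RS^2 + RA^2), scaled to a full schedule."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return games * scored / (scored + allowed)


# Hypothetical team: 800 runs scored, 700 allowed.
print("10 runs per win:", round(wins_ten_runs(800, 700), 1))
print("Pythagorean:", round(wins_pythagorean(800, 700), 1))
```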
For example, from 2000 through 2011, of 360 teams, 157 calculate to the same number of wins and only nine differ by more than two. The standard deviation for the wins error in prediction is 4.01 for mine and 4.05 for Bill's. Using 1960-date, 1342 teams, the errors are 4.01 and 4.02. Of the 12 teams with a different prediction of six or more, we each were closer to the actual value on five, with two ties.
A slightly more accurate form of Bill's equation uses the power of 1.83 instead of 2, while I have a similar improvement by using 10 times the square root of runs scored per inning by both teams. This reduces the prediction error to 3.95 for me and 3.96 for Bill, hardly worth the trouble. Since Bill was tied to the Pythagoras method, it took him 20 years to figure out how to relate runs created to win shares.
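The two refinements can be sketched the same way. This assumes "runs scored per inning by both teams" means the team's runs scored plus runs allowed, divided by its innings; the function names and sample totals are hypothetical.

```python
import math


def runs_per_win(runs_scored, runs_allowed, innings):
    """10 times the square root of runs per inning by both teams."""
    return 10.0 * math.sqrt((runs_scored + runs_allowed) / innings)


def wins_runs_per_win(runs_scored, runs_allowed, innings, games=162):
    """Run differential converted to wins with the scoring-level factor."""
    factor = runs_per_win(runs_scored, runs_allowed, innings)
    return games / 2 + (runs_scored - runs_allowed) / factor


def wins_pythagorean(runs_scored, runs_allowed, games=162, exponent=1.83):
    """Pythagorean estimate with the 1.83 exponent in place of 2."""
    scored = runs_scored ** exponent
    allowed = runs_allowed ** exponent
    return games * scored / (scored + allowed)


# Hypothetical team: 800 scored, 700 allowed, 1,458 innings (162 games of 9).
print("runs per win:", round(runs_per_win(800, 700, 1458), 2))
print("wins, runs-per-win method:", round(wins_runs_per_win(800, 700, 1458), 1))
print("wins, Pythagorean 1.83:", round(wins_pythagorean(800, 700), 1))
```

When the two teams combine for one run per inning, the factor comes out to exactly 10, which matches the simpler rule.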
When evaluating pitchers, outs per balls in play (O/BIP) is fairly constant for all pitchers. Balls in play does not include home runs or strikeouts, which contribute a great deal to the difference between pitchers. You can think of batted balls as being a spectrum from home run to extra base hit to single to out to strikeout. When a good pitcher keeps the ball in the park by reducing a homer to a double, his O/BIP goes down, and when he converts an out into a strikeout, it goes down again. Changing a double into a single has no effect at all. Neither does avoiding a walk. The only time his O/BIP goes up is when a single is converted to an out. Thus the effect of being a good pitcher on O/BIP is muted, and the differences, although still significant, are smaller than the effect of homers and strikeouts.
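To make the direction of each effect concrete, here is a small Python sketch that recomputes O/BIP after moving one event along the spectrum described above. The `PitchingLine` class and the baseline totals are hypothetical; balls in play are counted as batted-ball outs plus singles plus extra base hits other than homers, per the definition in the text.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class PitchingLine:
    outs_in_play: int      # batted-ball outs, strikeouts excluded
    singles: int
    extra_base_hits: int   # doubles and triples, homers excluded
    home_runs: int
    strikeouts: int
    walks: int

    def o_bip(self) -> float:
        """Outs per ball in play; homers, strikeouts, and walks are left out."""
        balls_in_play = self.outs_in_play + self.singles + self.extra_base_hits
        return self.outs_in_play / balls_in_play


# HYPOTHETICAL baseline season, for illustration only.
BASE = PitchingLine(outs_in_play=400, singles=120, extra_base_hits=40,
                    home_runs=20, strikeouts=150, walks=50)

CHANGES = {
    "homer becomes double": replace(BASE, home_runs=19, extra_base_hits=41),
    "out becomes strikeout": replace(BASE, outs_in_play=399, strikeouts=151),
    "double becomes single": replace(BASE, extra_base_hits=39, singles=121),
    "walk avoided": replace(BASE, walks=49),
    "single becomes out": replace(BASE, singles=119, outs_in_play=401),
}

for label, line in CHANGES.items():
    print(f"{label}: {line.o_bip() - BASE.o_bip():+.4f}")
```

Only the last change raises O/BIP; the first two lower it and the middle two leave it where it was.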
This is data on starting pitchers from 2000 through 2011, separated by ERA for each season. There was only a difference of 17 points in O/BIP for an ERA of 3.25 compared to 4.75, but the difference in OBA was more than twice as big, and SLG doubles it again. The pitcher himself is not trying to minimize O/BIP, but rather trying to minimize OBA and SLG and therefore ERA. O/BIP involves some factors that go up with good pitching and some that go down, while the other measures are in one direction only. Thus the appearance of a small difference in O/BIP is really artificial.