“Baserunning arrogance is just like pitching arrogance or hitting arrogance. You are a force, and you have to instill that you are a force to the opposition. You have to have utter confidence.”
Lou Brock

For those of you who watched The Science Channel’s program “Baseball’s Secret Formula” earlier this week, you’ll recall that near the end of the show the focus turned to fielding. The narrator described the analysis and creation of fielding metrics as the “last bastion” for sabermetrics while highlighting the fine work of Baseball Info Solutions and John Dewan. For those of you who haven’t caught the program yet, you have one more chance: July 15th at 5pm EST.

While I wouldn’t disagree with that notion, and have written as much in the past, there are other areas of the game that perhaps can yield smaller insights but have not been quantified to the degree that we might like. As George Will put it in a recent column in the context of baseball, “the rage to quantify–to reduce reality to measurable units–is an impulse in modern societies.”

This week, we’ll satisfy some element of that impulse on the subject of baserunning.

A Look Back and a Step Forward

As some readers may recall, both James Click and I have done some work previously on quantifying the benefits and costs of baserunning. If you want to read all about how the methodology I developed works, you can take a look at a series of articles from last summer, or a short summary and how it applies to an individual team in a column I wrote earlier this season.

The key point, however, is that the methodology I employed focuses only on the following scenarios:

  • Runner on first, second not occupied, and the batter singles
  • Runner on second, third not occupied, and the batter singles
  • Runner on first, second not occupied, and the batter doubles

In other words, we only credited baserunners with Incremental Runs (IR) when they were on base and a batter following them in the lineup got a hit. As you can well imagine, this view of a player’s contribution on the bases leaves a little something to be desired. In particular, it ignores the times when the batter did not get a hit and instead hit a ground ball or fly ball that still allowed the runner to use his speed to try to advance. Several other things were not factored in either: stolen bases, caught stealings, and pickoffs, as well as the avoidance of grounding into double plays and getting thrown out while attempting to stretch a hit. Today, we’ll take a small step towards rectifying those inadequacies by quantifying runner advancement on ground balls in the infield.

Winning the Ground War

Baseball is often a game played 90 feet at a time. Advancing a runner into scoring position, particularly to third base with less than two outs, can be the difference between winning and losing. We’ve all seen excellent baserunners like Juan Pierre and Carlos Beltran advance from second to third on a grounder to shortstop, beat a force attempt, or leg out a fielder’s choice at third. Clearly, these runners are helping their team by taking those 90 feet when other runners would simply remain anchored to their base, or perhaps foolishly attempt to advance, only to get thrown out. But before we can properly credit runners with such plays we need to create a baseline or framework for how often runners really advance on plays like this so we can form some expectations.

In order to do so, we took a look at the advancement frequencies in the following situations:

  • Runner on first only with less than two outs, ground ball or bunt is hit to an infielder where a hit or an error is not credited
  • Runner on second only with less than two outs, ground ball or bunt is hit to an infielder where a hit or an error is not credited
  • Runner on third only with less than two outs, ground ball or bunt is hit to an infielder where a hit or an error is not credited

By using only these three scenarios (note that they do not overlap with the scenarios mentioned above), we’re able to isolate the effects of the runner: there are no other baserunners to impede him from taking an extra base, or to alter the behavior of the fielders in ways that would keep us from properly crediting him. For those wondering, these three scenarios account for just over 80% of the ground balls or bunt grounders to the infield that are not hits or errors, so this allows us to catch most runner advancement effects.

It turns out that we also need to break even these scenarios down a bit. For example, there is a vast difference between how often a runner advances from second to third on a ground ball to the third baseman (45%) and how often he does so when the ball is hit to the first baseman (95%). As a result, we would want to give more credit to the runner for the former than for the latter, since in the latter case the runner was clearly more likely to make it regardless of his baserunning ability. By controlling for the position of the fielder who fields the ball, we are also implicitly controlling for the handedness of the batter.

Although the effect is smaller in most instances, we also break the data down by the number of outs, since an infielder clearly might be more prone to simply take an out in some situations rather than attempt to nab a speedy runner trying to advance. However, we don’t control for the score, which you can imagine will also play a role in decisions made by fielders.

The result is a matrix that records each possible advancement in our three scenarios by the position of the fielder and the number of outs. That matrix consists of 36 rows: 3 scenarios times 6 infield spots times 2 out states. In the interests of space, the following table shows just the typical advancement with a runner on second and nobody out; the OA column is the share of plays on which the runner was thrown out advancing.

Advancement When a Non-Hit Grounder is Fielded in the Infield
Base  Outs   Pos      To3rd    Scores      Stay      OA
x2x      0     P     69.91%     0.00%    18.61%  11.48%
x2x      0     C     79.11%     0.89%    12.44%   8.44%
x2x      0    1B     95.15%     0.00%     1.80%   3.05%
x2x      0    2B     97.13%     0.00%     1.60%   1.28%
x2x      0    3B     45.83%     0.09%    52.95%   1.22%
x2x      0    SS     43.61%     0.00%    48.24%   8.15%

Because the sample sizes for any single season across some of these 36 scenarios are not very large (for example, there were only 49 instances of a runner on second with nobody out and a grounder hit to the catcher in 2005), the percentages here are calculated from aggregating all of the data from 2000-2005, which includes 53,500 ground balls.
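For readers who want to build a similar baseline, here is a minimal sketch of the tallying step. The play records, the outcome labels, and the function name `advancement_matrix` are all made up for illustration; the real matrix came from 2000-2005 play-by-play data that is not reproduced here.

```python
from collections import Counter, defaultdict

# Hypothetical play records: (base state, outs, fielder position, outcome).
# Outcome labels mirror the table columns: "To3rd", "Scores", "Stay", "OA".
plays = [
    ("x2x", 0, "3B", "Stay"),
    ("x2x", 0, "3B", "To3rd"),
    ("x2x", 0, "3B", "Stay"),
    ("x2x", 0, "3B", "OA"),
    ("x2x", 0, "1B", "To3rd"),
    ("x2x", 0, "1B", "To3rd"),
]

def advancement_matrix(records):
    """Return {(base, outs, pos): {outcome: share of plays}}."""
    counts = defaultdict(Counter)
    for base, outs, pos, outcome in records:
        counts[(base, outs, pos)][outcome] += 1
    matrix = {}
    for key, tally in counts.items():
        total = sum(tally.values())
        matrix[key] = {outcome: n / total for outcome, n in tally.items()}
    return matrix

matrix = advancement_matrix(plays)
print(matrix[("x2x", 0, "3B")])
```

Pooling several seasons, as we did, simply means passing all of their records in together, so the thinly populated cells get a larger denominator.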

Now, using this matrix as a baseline, we can do two things. First, we can assign run expectancy outcomes to each of the advancements in the table above. For those of you unfamiliar with the concept, a Run Expectancy table simply tells us how many runs a team will score on average given each of the 24 possible (8 base states times 3 out states) base/out situations. The Run Expectancy Matrix for 2000-2005 (using a non-weighted average for each cell) is as follows:

Base/Out     0      1      2
xxx      0.530   0.287   0.112
1xx      0.921   0.552   0.242
x2x      1.160   0.704   0.335
xx3      1.441   0.978   0.371
12x      1.523   0.935   0.445
1x3      1.844   1.214   0.510
x23      2.030   1.430   0.599
123      2.364   1.579   0.789

Using the table, we can create derivative run expectancy values for each scenario and advancement. For example, in the first row of the scenario table above, where the runner advances to third with nobody out, we’ll credit the runner with changing the base state from a runner on second to a runner on third with one out in the inning (0.978) minus the run expectancy had the runner stayed put (0.704, for a runner on second with one out). It turns out that the runner making it to third is therefore worth .274, or about a quarter of a run. On the other hand if the runner is thrown out at third we’ll ding him for reducing the run expectancy versus simply staying put; in this case equivalent to having a runner on second and one out (-0.704). Note that we’re not assuming the defense would have turned a double play.
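The arithmetic in the example above can be sketched in a few lines. This is an illustration rather than the code behind the published numbers: the function name is invented, only the three base states needed for a lone runner on second are copied from the Run Expectancy Matrix, and the "Scores" formula (one run plus the bases-empty expectancy, minus the stay-put baseline) is an assumption inferred from the description.

```python
# Run expectancy for 2000-2005, copied from the table above: RE[base][outs].
RE = {
    "xxx": (0.530, 0.287, 0.112),
    "x2x": (1.160, 0.704, 0.335),
    "xx3": (1.441, 0.978, 0.371),
}

def run_values_runner_on_second(outs_before):
    """Run values for a lone runner on second when the batter is retired.

    outs_before must be 0 or 1, matching the scenarios studied here.
    """
    outs_after = outs_before + 1   # the batter makes an out on the grounder
    stay = RE["x2x"][outs_after]   # baseline: the runner holds at second
    return {
        "To3rd": RE["xx3"][outs_after] - stay,
        "Scores": 1 + RE["xxx"][outs_after] - stay,  # assumed formula
        "Stay": 0.0,
        "OA": -stay,               # runner erased, charged the full baseline
    }

values = run_values_runner_on_second(0)
for outcome, value in values.items():
    print(f"{outcome:<7}{value:8.3f}")
```

With the three-decimal cells shown here, the results match the figures quoted in the text to three places; any difference in the fourth decimal would presumably come from rounding in the published matrix.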

It should also be noted that the derivatives calculated don’t penalize a player for getting forced at second, nor for simply staying put when unforced. From a pure change in run expectancy standpoint, there are of course arguments to be made for including both. The thought was not to penalize a runner for a situation which he did not create (the force) or for at least avoiding an out (staying put) since he wasn’t the guy who hit the ball. Our sense of justice tells us that the batter and not the runner should rightly be dinged in both instances. Had both of these been included, the totals you’ll see near the end of the article would certainly be lower.

The matrix can then be augmented with the run values associated with each scenario, which of course are the same for each row in the above table since the starting base and out situation are the same.

Base  Outs  Pos   To3rd    RunVal  Scores  RunVal    Stay  RunVal     OA    RunVal
x2x      0    P  69.91%    0.2740   0.00%  0.5828  18.61%   0.000  11.48%  -0.7043
x2x      0    C  79.11%    0.2740   0.89%  0.5828  12.44%   0.000   8.44%  -0.7043
x2x      0   1B  95.15%    0.2740   0.00%  0.5828   1.80%   0.000   3.05%  -0.7043
x2x      0   2B  97.13%    0.2740   0.00%  0.5828   1.60%   0.000   1.28%  -0.7043
x2x      0   3B  45.83%    0.2740   0.09%  0.5828  52.95%   0.000   1.22%  -0.7043
x2x      0   SS  43.61%    0.2740   0.00%  0.5828  48.24%   0.000   8.15%  -0.7043

Second, we can create an expected number of runs contributed in each scenario by multiplying the frequency by the run value, and then summing across each outcome. So in the scenario of the runner on second with nobody out and the ball hit to the catcher we would calculate the following:

RunExp = (.7911 * .274) + (.009 * .5828) + (.1244 * 0) + (.0844 * -.7043)

The calculation above yields an expected run value of .1625. This value then is what we would expect a baserunner to contribute each time he’s on second with nobody out and a ground ball is fielded by the catcher. The table of scenarios with the run values filled in follows:

Base Outs  Pos  To3rd    RunVal  Scores  RunVal    Stay  RunVal     OA   RunVal   RunExp
x2x    0    P  69.91%    0.2740   0.00%  0.5828  18.61%   0.000  11.48% -0.7043   0.1107
x2x    0    C  79.11%    0.2740   0.89%  0.5828  12.44%   0.000   8.44% -0.7043   0.1625
x2x    0   1B  95.15%    0.2740   0.00%  0.5828   1.80%   0.000   3.05% -0.7043   0.2392
x2x    0   2B  97.13%    0.2740   0.00%  0.5828   1.60%   0.000   1.28% -0.7043   0.2571
x2x    0   3B  45.83%    0.2740   0.09%  0.5828  52.95%   0.000   1.22% -0.7043   0.1175
x2x    0   SS  43.61%    0.2740   0.00%  0.5828  48.24%   0.000   8.15% -0.7043   0.0621
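As a check on the RunExp column, the same frequency-times-value sum can be run for all six fielders. The sketch below uses the percentages from the advancement table and the run values from the table above; the dictionary layout and the name `expected_runs` are illustrative choices of mine, not part of the original method.

```python
# Outcome frequencies with a runner on second and nobody out,
# taken from the advancement table (percentages written as fractions).
FREQ = {
    "P":  {"To3rd": 0.6991, "Scores": 0.0000, "Stay": 0.1861, "OA": 0.1148},
    "C":  {"To3rd": 0.7911, "Scores": 0.0089, "Stay": 0.1244, "OA": 0.0844},
    "1B": {"To3rd": 0.9515, "Scores": 0.0000, "Stay": 0.0180, "OA": 0.0305},
    "2B": {"To3rd": 0.9713, "Scores": 0.0000, "Stay": 0.0160, "OA": 0.0128},
    "3B": {"To3rd": 0.4583, "Scores": 0.0009, "Stay": 0.5295, "OA": 0.0122},
    "SS": {"To3rd": 0.4361, "Scores": 0.0000, "Stay": 0.4824, "OA": 0.0815},
}

# Run value of each outcome; identical for every row because the starting
# base/out state is the same.
RUN_VAL = {"To3rd": 0.2740, "Scores": 0.5828, "Stay": 0.0, "OA": -0.7043}

def expected_runs(freqs, run_vals):
    """Sum of frequency times run value across the four outcomes."""
    return sum(freqs[outcome] * run_vals[outcome] for outcome in freqs)

for pos, freqs in FREQ.items():
    print(f"{pos:>2}  {expected_runs(freqs, RUN_VAL):.4f}")
```

Rounded to four decimals, each line reproduces the corresponding RunExp entry, including .1625 for the catcher.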

At its core, this methodology compares each player to the aggregate behavior of all players. That means it’s possible for the behavior of the aggregate to result in a negative run value. For example, with a runner on second and one out when a ball is hit to the pitcher, runners (ostensibly running on contact) were thrown out trying to advance 13.6% of the time. That relatively high percentage drives the expected run value down to -0.0295, so on average we would expect a runner to contribute negative runs in that situation. A runner who is able to advance safely to third will therefore be credited not only with the run value of making it to third (.0363), but also with the magnitude of the negative expected run value, for a total of 0.0658 runs.
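Put as arithmetic, the credit on any single play is the run value of what the runner actually did minus the expected run value for that situation. A tiny sketch using the two figures from the example above (the function name is hypothetical):

```python
def runs_above_expected(actual_run_value, expected_run_value):
    """Credit for one play: actual run value minus the baseline expectation."""
    return actual_run_value - expected_run_value

# Runner on second, one out, grounder to the pitcher, runner reaches third.
credit = runs_above_expected(0.0363, -0.0295)
print(f"{credit:.4f}")
```

Because the baseline is negative here, subtracting it adds to the runner’s credit, which is where the 0.0658 comes from.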

Applying the Method

Now that we have the framework set up, we can total all of the opportunities individual players had in these scenarios, along with both the actual run value we attribute to them for their baserunning exploits and the run value we would expect. The difference between the total and the expected therefore equates to the runs each player contributed in these scenarios above and beyond what would have been expected. So may I have a drum roll please…

The top and bottom ten for 2005 in this new metric I’m christening (at least until I can think of a better name) “Equivalent Ground Advancement Runs” or EqGAR are:

Top 10
Name                   Opp Total GAR  Ex EqGAR     EqGAR
Juan Pierre             54     16.67      9.15      7.52
Willy Taveras           40     12.33      6.83      5.50
Chone Figgins           53     13.89      8.99      4.90
Jason Ellison           33     12.14      7.75      4.40
Brady Clark             65     15.60     11.23      4.38
Cory Sullivan           28     10.44      6.33      4.11
Jimmy Rollins           51     14.24     10.34      3.89
Craig Counsell          47     14.17     10.39      3.77
Cristian Guzman         45     13.56     10.49      3.07
Jose Reyes              52     12.19      9.39      2.80

Bottom 10
Name                   Opp Total GAR  Ex EqGAR     EqGAR
Joe Randa               26      0.99      4.53     -3.54
Emil Brown              31      1.49      4.95     -3.46
Jason Varitek           30      0.56      3.74     -3.18
Morgan Ensberg          22      0.67      3.80     -3.13
Cliff Floyd             21      0.15      3.22     -3.07
Robinson Cano           24      2.26      5.10     -2.84
David Ortiz             22     -0.24      2.54     -2.77
Rafael Palmeiro         22      0.96      3.52     -2.56
Mike Lowell             21      1.82      4.32     -2.50
Travis Hafner           29      1.50      3.94     -2.44

As you can see from this list, there is clearly some correlation between players we anecdotally know to be good baserunners and those who do well by this metric. You should also take notice that there are very few players who score better than +3 runs, so most players are somewhere in the middle.

On the other side of the coin, most of those who do poorly are players we might have guessed at, although I found it a bit surprising that Joe Randa and Robinson Cano make the list. You can also see from the list that by subtracting the expected EqGAR from the total, we can rightly rank players like Brady Clark, who scores very well in total GAR but also had more opportunities.

However, because more opportunities means a greater EqGAR, much in the same way that more runners on base means more RBIs, we can create a ratio of total to expected ground advancement runs in order to create a rate stat useful for comparing players. The top and bottom ten in “GA Rate” with 20 or more opportunities are as follows:

Top 10
Name                  Opps     EqGAR   GA Rate
Juan Pierre             54      7.52      1.82
Willy Taveras           40      5.50      1.80
Jim Edmonds             26      2.41      1.72
So Taguchi              30      2.75      1.67
Cory Sullivan           28      4.11      1.65
Jamey Carroll           23      2.58      1.60
Jason Ellison           33      4.40      1.57
Chone Figgins           53      4.90      1.55
Nick Punto              20      1.71      1.52
Kevin Mench             25      2.06      1.50

Bottom 10
Name                  Opps     EqGAR   GA Rate
David Ortiz             22     -2.77     -0.09
Cliff Floyd             21     -3.07      0.05
Jason Varitek           30     -3.18      0.15
Morgan Ensberg          22     -3.13      0.18
Carlos Delgado          21     -2.11      0.20
Joe Randa               26     -3.54      0.22
Oscar Robles            20     -1.64      0.25
Rafael Palmeiro         22     -2.56      0.27
Emil Brown              31     -3.46      0.30
Melvin Mora             24     -2.28      0.31

Kevin Mench? Well, in his 25 opportunities he did advance an amazing 15 times, and was never thrown out advancing. On the other hand, David Ortiz scored very poorly by being thrown out at the plate with nobody out and generally not advancing in another 17 of his 22 opportunities.
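For anyone who wants to reproduce the two summary numbers, the difference and the ratio can be sketched as follows. The totals and expected values are copied from the 2005 tables above; the function names `eqgar` and `ga_rate` are simply labels I’ve chosen for this illustration.

```python
# (Total GAR, expected GAR) for three 2005 players, from the tables above.
players = {
    "Juan Pierre": (16.67, 9.15),
    "Chone Figgins": (13.89, 8.99),
    "Joe Randa": (0.99, 4.53),
}

def eqgar(total, expected):
    """Runs above or below expectation: total minus expected."""
    return total - expected

def ga_rate(total, expected):
    """Rate stat: total ground advancement runs divided by expected."""
    return total / expected

for name, (total, expected) in players.items():
    print(f"{name:<14}{eqgar(total, expected):7.2f}{ga_rate(total, expected):7.2f}")
```

Since the inputs here are already rounded to two decimals, a recomputed value can occasionally differ from the published one in the last digit.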

Since we calculated EqGAR back to 2000, let’s take a quick look at the seasonal leaders and trailers in those six seasons.

Season Leader
Year    Name                  Opps     EqGAR
2005    Juan Pierre             54      7.52
2004    Aaron Miles             53      5.95
2003    Kenny Lofton            42      5.16
2002    Adam Kennedy            38      5.28
2001    Luis Castillo           49      4.44
2000    Johnny Damon            76      5.08

Season Trailer
Year    Name                  Opps     EqGAR
2005    Joe Randa               26     -3.54
2004    Chipper Jones           22     -3.93
2003    Moises Alou             28     -4.02
2002    Mo Vaughn               26     -3.76
2001    Paul Lo Duca            38     -4.10
2000    Edgar Martinez          24     -3.57

As you can see, the leaders are usually a little over +5 runs per season (with the exception of Pierre’s 2005), while the trailers are around -4 runs. That’s a span of nine to ten runs of difference between the best and worst. Interestingly, the span is about the same as that in the Incremental Runs framework, meaning that an elite baserunner could add about 10 runs or about one win to his team by advancing on hits and grounders in the infield.

Finally, let’s take a look at the cumulative leaders and trailers in the 2000-2005 time period:

Top 10
Name                  Opps     EqGAR
Juan Pierre            279     20.64
Adam Kennedy           206     14.89
Tony Womack            201     14.27
Kenny Lofton           210     12.21
Ray Durham             196     11.65
Rafael Furcal          224     10.85
Fernando Vina          229     10.14
Craig Counsell         184     10.11
Jimmy Rollins          212      9.98
Mike Matheny           183      9.16

Bottom 10
Name                  Opps     EqGAR
Paul Lo Duca           154    -11.75
Rafael Palmeiro        129    -11.18
Paul Konerko           142     -9.91
J.T. Snow              138     -9.62
Luis Gonzalez          133     -9.13
Carlos Delgado         162     -9.02
Manny Ramirez          132     -8.83
Chipper Jones          175     -8.41
Jorge Posada           158     -8.21
Edgar Martinez         113     -8.03

Ahead of the Tag

One can imagine that through the collection and aggregation of metrics like this we could start to form a better picture of the contribution runners make on the bases. Look for refinements to EqGAR and the development of other metrics as we move in that direction.