Baserunning is the neglected stepsister of offense. It tends not to be well correlated with metrics that evaluate hitting performance, and indeed is more often associated with defense. The common intuition that teams that play good defense tend to run the bases well to boot is corroborated by the data, which makes sense, given that speed plays an important role in both pursuits. The top ten baserunners in baseball by BRR are all considered above-average defenders, and each plays—or is capable of playing—an up-the-middle position. At the same time, only Carl Crawford among them is considered a great hitter, and a couple of them (Emilio Bonifacio, I’m looking at you) are downright dreadful. In other words, baserunning is one of those dusty backwaters of analysis that continually confounds expectations.

A Baserunning Koan

What is great about baserunning is that it can be so deeply strange. For example, here’s a puzzle. In our baserunning metric, teams’ collective efforts on the bases are a zero-sum game. That is to say, over the last fifty-plus seasons, teams break even in terms of runs generated in each of the metric’s component categories. When you add up all the runs produced against average (via run expectancy methodology) on grounders by all teams since 1954, you get a whopping 0.01 runs. Over 1,408 team-seasons, in other words, it was basically all a wash. That makes sense, since the comparison is against average performance—the sum of performances compared to average should net out to zero. Do the same for runs gained on balls hit in the air, and the number comes out to 39.59 in total (about 0.03 runs per team-season), an effective rounding error. For out advancement runs, the sum is just 0.21. So far, so good.

But what about stolen base runs? This is where things get interesting. In Dan Fox’s original article introducing EqSBR, the stolen base component of BRR, there was no adjustment made against average. The method was all run expectancy, no re-centering. The result is that the average team since 1954 has come out at -9.2 stolen base runs per season. The very best teams—the 2007 Philadelphia Phillies being the leader—have accrued fewer than 15 runs per season. The worst teams, by contrast, have cracked the -30 run barrier. The 1978 Athletics, led by the brutal-in-almost-every-way Mike Edwards, were worse than the best teams were good. The following histograms show how EqSBR is thus different in kind from its BRR-component brethren:

So why is this? What sense does it make? One plausible reading is that teams are, and have almost always been, clumsy on the basepaths. It has long been a sabermetric criticism of managers that they run too often and in suboptimal situations. After all, Torii Hunter still gets caught stealing at third base with no outs down by two runs. This is the idea behind not normalizing stolen base runs: it allows for some external criticism of player and team decisions. But is it necessarily the right way to approach the analysis? Another way to ask the question would be to wonder whether we might not be better off re-centering the stolen base runs so that the average equaled zero. According to this view, stolen base runs—expressed positively or negatively—would always be a comparison to how the league did as a whole. We’d lose the objective guidepost, but what if the guidepost were never objective to begin with?
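Mechanically, that re-centering is just a mean subtraction. Here is a minimal sketch; the team abbreviations and run values are invented for illustration, not real EqSBR figures:

```python
# Re-center raw stolen base runs so the league average equals zero.
# Team values here are hypothetical, for illustration only.
raw_sb_runs = {"PHI": 12.4, "OAK": -31.0, "NYA": -6.1, "BOS": -9.5}

league_avg = sum(raw_sb_runs.values()) / len(raw_sb_runs)

centered = {team: runs - league_avg for team, runs in raw_sb_runs.items()}

# After centering, the league as a whole nets out to zero, and each
# team's number is a comparison to how the league did collectively,
# not a comparison to a fixed run-expectancy baseline.
assert abs(sum(centered.values())) < 1e-9
```

Note what the transformation preserves and what it destroys: the spread between the best and worst teams is unchanged, but the information that the league as a whole lost runs on the bases disappears into the new zero point.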

Over the Columns and Through the Rows . . .

First, let’s understand the run expectancy methodology. The first step is to create an empirical run expectancy table. This is a concept undoubtedly familiar to many readers, so I’ll just explain it briefly here. A run expectancy table tells us how many runs were scored, on average, after a certain situation came about. For example, if there were a runner on first and no outs, the run expectancy might be something like 0.9 runs. We can calculate this data by going through play-by-play databases and figuring out what happened after every instance in which a runner was on first with no outs, and then averaging the results. This takes a lot of saying “please” to Colin Wyers, and also patience (because even today’s computers take time to generate results for broad queries like this), but the results are worth it.
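As a sketch of the mechanics, here is how an empirical run expectancy table falls out of play-by-play data. The ten "plays" below are an invented toy sample standing in for a real database query; the base-state strings ("1--" for a runner on first, and so on) are just one common convention:

```python
from collections import defaultdict

# Build an empirical run expectancy table from play-by-play records.
# Each record is (base_state, outs, runs_scored_in_rest_of_inning).
# These records are a tiny hypothetical sample; a real table would be
# built from millions of plays.
plays = [
    ("1--", 0, 1), ("1--", 0, 0), ("1--", 0, 2), ("1--", 0, 1),
    ("---", 0, 0), ("---", 0, 1), ("---", 0, 0), ("---", 0, 0),
    ("-2-", 1, 1), ("-2-", 1, 0),
]

totals = defaultdict(lambda: [0.0, 0])  # (bases, outs) -> [run sum, count]
for bases, outs, runs in plays:
    totals[(bases, outs)][0] += runs
    totals[(bases, outs)][1] += 1

run_expectancy = {state: run_sum / n for state, (run_sum, n) in totals.items()}

# Runner on first, no outs: average runs scored for the rest of the inning.
print(run_expectancy[("1--", 0)])  # → 1.0 with this toy sample
```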

For the question we want to answer, we need to restrict our focus to those situations in which there was a stolen base attempt. We would have to look at all attempts to steal second base, broken down both by number of outs and possible baserunner states (note that we wouldn’t have, say, runners on second and third, since then no one could steal second). In those attempts that were successful, we would calculate the number of runs that scored in the rest of the inning. In those that failed, we’d do the same. Each of those numbers would be compared against the run expectancy before the stolen base attempt was made. Sounds pretty good, right?
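Putting those pieces together, the run value of a single attempt is just the difference between the run expectancy after the attempt and the run expectancy before it. A minimal sketch, using hypothetical round-number RE values rather than a table built from real data:

```python
# Value a steal-of-second attempt by comparing run expectancy after the
# attempt to run expectancy before it. The RE values below are hypothetical
# round numbers for illustration, not figures from a real empirical table.
RE = {
    ("1--", 0): 0.90,  # runner on first, no outs (before the attempt)
    ("-2-", 0): 1.10,  # runner on second, no outs (successful steal)
    ("---", 1): 0.28,  # bases empty, one out (caught stealing)
}

def steal_run_value(success):
    """Run value of an attempt to steal second from first base, no outs."""
    before = RE[("1--", 0)]
    after = RE[("-2-", 0)] if success else RE[("---", 1)]
    return round(after - before, 3)

print(steal_run_value(True))   # → 0.2 (runs gained by a successful steal)
print(steal_run_value(False))  # → -0.62 (runs lost on a caught stealing)
```

The asymmetry visible even in these made-up numbers (a failure costs roughly three times what a success gains) is why break-even stolen base percentages are so high, and why league totals can sit well below zero.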

“Aha!” Says the Man With the Database-Fu

Unfortunately, two problems appear with this methodology, one of which is easier to fix than the other. The first is that the run expectancy from before the stolen base attempt was made is going to include lots of information about what happened in subsequent stolen base attempts. In other words, the fact that a runner might steal a base when on first with no outs affects the expected number of runs scored when a runner is on first with no outs. But what we really want to know is the run expectancy when compared against the situation where the runner didn’t attempt to steal at all.

“Aha!” says the man with the database-fu, “we can solve that!” We’ll just run a new query and look only at those situations in which there was no steal attempt. That sounds great, except that there’s something fishy about those situations where no steal was attempted. Like, why wasn’t a steal attempted? Maybe the situation didn’t call for it, or the pitcher was quick to the plate, or the catcher had a good arm. But it’s also possible that the runner on first was a crummy baserunner. If that’s true, then the run expectancy will be artificially low in that circumstance, because good baserunners are more likely to score than bad ones. (I hope this at least is an uncontroversial statement, but I’m growing increasingly unsure.)

The second problem has to do with the source of the data itself—play-by-play records. Caught stealing events include two types of outcomes: cases where the runner ran and was thrown out by the catcher and cases where a hit-and-run was called and went awry. In the latter case, there is a caught stealing recorded (which would thus show up in our run expectancy calculations) but in an important sense there was no stolen base attempt. The effect of this phenomenon is definitely in the negative direction, meaning that teams have artificially low baserunning totals when expressed in total runs and not normalized. And although the effect is likely spread across teams more or less equally, it affects the league as a whole each year. Even once recognized, this is a very difficult problem to quantify because the data to do so simply isn’t available.

Question of the Day

Given these limitations, which is the superior approach to expressing stolen base runs? Do the potential systematic errors counsel in favor of centering everything on zero? Wasn’t that Torii Hunter caught stealing just a disaster?

Thanks to Colin Wyers for research assistance.

I'm confused: why should we try to normalize SBR so that average = zero when the evidence from Run Expectancy suggests that, on net, teams' actual base stealing attempts hurt more than they help?

Redefining what "zero" means would not change the spread between the best teams and the worst teams, but it would obscure the fact that in general, teams do not do a good job of determining when they should be running.
I think Tommy is suggesting there is a) a selection bias in the way our run expectancy matrix is calculated, and it may be undervaluing stolen bases and b) bad data in there since we are including busted hit-and-runs as caught stealing.
Perhaps, but at least on the subject of counting busted H&R as CS, why is it even meaningful to make a distinction? Whether the extra out occurs as a result of a straight steal attempt or as part of a larger tactical move (i.e., the hit & run), it is still an out, and it still leaves the batting team worse off.

The stolen base attempt is nice because it is something that is fairly easily countable, but it is really of a piece with an entire range of baserunning tactics: stealing, H&R, attempting (or declining to attempt) to take an extra base, advancing on ground ball/fly ball outs, etc.
Even if we wanted to lump stolen base attempts and hit-and-run plays together, the problem is that we are including the negative result of a busted hit-and-run but not the positive result of extra bases advanced when the hit-and-run succeeded. We're putting those positive runs in a different bucket because we don't have a record that they belonged with the hit-and-run plays.
I recently did a study to try to find out the likelihood of success for teams who simply shunned the running game as much as possible. I looked at all teams who finished last in their league in SB attempts since 1995. Over the first few years of that study, the trailer in SBs was actually more likely to make the playoffs than not. However, last year's Giants team was the first since 2003 to do so.

Of course, the connection between the two is not necessarily evident, but it is nonetheless interesting. Watching the Jays regularly, I saw that their failure to ever run took away a concern for the opposing team, making them much easier to defend. In short, I believe that the threat of running must be present at all times, while actually running must be done judiciously.
Seems to me that there is another variable that might be worth considering.

It seems to me that the stolen base increases run expectancy MORE if this happens when the bottom of the order will be batting, rather than the strongest hitters (who are more likely to drive the guy home regardless of whether he is on first or second).

To me, this is yet another reason why the traditional batting order that features your best base-stealer at leadoff is sub-optimal.
I've been arguing this with my fellow Reds fans for years. The value of a runner being able to take an extra base is greatest when the likelihood of the batter advancing him is lowest. Though perhaps there's an additional factor of volume of opportunity: a hundred slight boosts may be more valuable than fifty larger ones.

That said, it seems that SB in particular are more valuable at the bottom of the lineup ahead of guys with low SLG, both because there will be many fewer extra-base hits (making it much less likely that a baserunner would be able to score from first) and because the cost of making an out is lower.
While I agree with your general observation (i.e. that the value of the SB gained is higher when it occurs with the bottom of the lineup batting), I have to take exception with that last point you threw in there ("because the cost of making an out is lower.")

I'd say the cost of the baserunner making an out is far higher with the bottom of the order batting, because by definition it delays the stronger top and middle of the lineup from getting to the plate again. If anything, the strategy you've been advocating to your fellow Reds fans should be most accurately described as higher-risk, higher-reward.
Simply from a game theory perspective, doesn't it make sense that the choices base-runners and coaches make generally reach a break-even equilibrium? It seems that the decision making process would naturally lead one to take an increasing amount of risk up to the point at which it is no longer beneficial to do so. Given a "wisdom of the crowds" sort of effect, you'd see some players or teams routinely come out ahead or behind, but on balance, we're pretty good at finding that break-even point.

A smart team may consciously be able to stop while it's ahead. But absent that, I think the instinctive decision making process will naturally lead to a break-even equilibrium.
I think part of the "wisdom of the crowds" idea in game theory is that the crowds communicate with each other about the optimal outcome. However, in MLB I would assume that's untrue.
Actually it's the exact opposite. What makes the wisdom of the crowd so impressive is that it is not coordinated at all, but that on average, they tend to reach the best answer. It's certainly not a consensus.
I was seeking further details on this comment in the opening paragraph:

"The common intuition that teams that play good defense tend to run the bases well to boot is corroborated by the data, which makes sense, given that speed plays an important role in both pursuits."

The only example offered was ten premier players who were good in both areas. I think that this connection is very interesting and so I'd really like to know what studies (if any) the writer was referencing with that comment. So if anybody knows about the baserunning/defense skills correlation, please share it. (I've e-mailed BP about this but, as usual, no reply.)